Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: clarifying Ranger product name
Info
titleUPDATE: Feb 5, 2015

The component referred to as "XA Secure (Apache Argus)" finally settled down with its finalized open-source name and is now referred to as Apache Ranger.  I've had some very positive hands-on experiences with Ranger since this blog posting was written and I am very enthusiastic about its place in the Hadoop security stack.  Check out the Apache project or Hortworks product page for more information on Ranger.

There are plenty of folks who will tell you "Hadoop isn't secure!", but that's not the whole story.  Yes, as Merv Adrian points out, Hadoop in its most bare-bone mode does have a number of concerns.  Fortunately, like with the rest of Hadoop, software layers upon software layers come together in concert to provide various levels of security shields to address specific requirements.

Hadoop started out with no heavy thought to security.  This is because there was a problem to be solved and the "users" of the earliest incarnations of Hadoop all worked together – and all trusted each other.  Fortunately for all of us, especially those of us who made career bets on this technology, Hadoop acceptance and adoption has been growing by leaps and bounds which only makes security that much more important.  Some of the earliest thoughts around Hadoop security was to simply "wall it off".  Yep, wrap it up with network security and only let a few, trusted, folks in.  Of course, then you needed to keep letting a few more folks in and from that approach of network isolation came the every ever present edge node (aka gateway server, ingestion node, etc) that almost every cluster employs today.  But wait... I'm getting ahead of myself.

...

The Apache Knox gateway provides a software layer intended to perform this perimeter security function.  It has a pluggable provider based mechanism to integrate customer AAA mechanisms.  Not all operations are fully supported yet with Knox to have it completely replace the need for the traditional edge node, but the project's roadmap addresses missing functionality.  It's REST API extends the reach to different types of clients and eliminates the need to SSH to a fully configured "Hadoop client" to interact with the Hadoop cluster (or several as Knox can front multiple clusters).

What are people doing in this space?  Early adopters have already deployed Knox, but the majority of clusters still rely heavily on traditional edge nodes.  The interest is clearly present in almost all customers that I work and I expect a rapid adoption of this technology.

Where do I go for more info?  http://www.dummies.com/how-to/content/edge-nodes-in-hadoop-clusters.html and http://knox.apache.org/

...

Data encryption can be broken into two primary scenarios; encryption in-transit and encryption at-rest.  Wire encryption options exist in Hadoop to aid with the in-transit needs that might be present.  HDP has many There are multiple options available to protect data as it moves through Hadoop over RPC, HTTP, Data Transfer Protocol (DTP), and JDBC. 

For encryption at-rest, there are some open source activities underway, but HDP Hadoop does not inherently have a baseline encryption solution for the data that is persisted within HDFS.  There are several 3rd party solutions available (including Hortonworks partners) that specifically target this requirement.  Custom development could also be undertaken, but the absolute easiest mechanism to obtain encryption at-rest is to tackle this at an OS or hardware level. Mercy will need to determine what, if any, of the available encryption options provide the right trade-off of simplicity and security.  This decision will surely be impacted by what other security components and practices are employed

What are people doing in this space?  My awareness is that few Hadoop administrators have enabled encryption, at-rest or in-transit, at this time.

Where do I go for more info?  http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_reference/content/reference_chap-wire-encryption.html

Summary

As you can see, there are many avenues to explore to ensure you create the best security posture for your particular needs.  Remember, the vast majority of these options are mutually exclusive allowing for multiple approaches to security.  This is surely one area there still is work to be done and definitely in pulling together the disparate pieces into tools that are easy to adopt by enterprises.