...

Hadoop started out with no heavy thought given to security.  There was a problem to be solved, and the "users" of the earliest incarnations of Hadoop all worked together and all trusted each other.  Fortunately for all of us, especially those of us who made career bets on this technology, Hadoop acceptance and adoption have been growing by leaps and bounds, which only makes security that much more important.  One of the earliest thoughts around Hadoop security was to simply "wall it off".  Yep, wrap it up with network security and only let a few trusted folks in.  Of course, then you needed to keep letting a few more folks in, and from that approach of network isolation came the ever-present edge node (aka gateway server, ingestion node, etc.) that almost every cluster employs today.  But wait... I'm getting ahead of myself.

...

The Apache Knox gateway provides a software layer intended to perform this perimeter security function.  It has a pluggable, provider-based mechanism for integrating a customer's existing AAA (authentication, authorization, and accounting) mechanisms.  Knox does not yet support enough operations to completely replace the traditional edge node, but the project's roadmap addresses the missing functionality.  Its REST API extends the reach to different types of clients and eliminates the need to SSH to a fully configured "Hadoop client" to interact with the Hadoop cluster (or several clusters, as Knox can front more than one).
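To make the REST idea a bit more concrete, here is a minimal sketch of how a client might address WebHDFS through Knox instead of logging in to an edge node.  It assumes the commonly documented Knox URL layout of gateway path, then topology (cluster) name, then service path.  The host name, port, topology name, and credentials below are all hypothetical placeholders, and HTTP Basic authentication is only an assumption, since the actual scheme depends on which authentication provider the gateway is configured with.

```python
import base64
import urllib.parse
import urllib.request


def knox_webhdfs_url(host, port, gateway_path, topology, hdfs_path, op):
    """Build a WebHDFS URL that is routed through a Knox gateway.

    Assumed layout: https://host:port/<gateway_path>/<topology>/webhdfs/v1/<path>?op=<OP>
    """
    quoted_path = urllib.parse.quote(hdfs_path.lstrip("/"))
    query = urllib.parse.urlencode({"op": op})
    return (
        f"https://{host}:{port}/{gateway_path}/{topology}"
        f"/webhdfs/v1/{quoted_path}?{query}"
    )


def knox_request(url, user, password):
    """Prepare (but do not send) a request carrying HTTP Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical gateway host, topology name, and account; substitute your own.
url = knox_webhdfs_url("knox.example.com", 8443, "gateway", "default", "/tmp", "LISTSTATUS")
request = knox_request(url, "guest", "guest-password")

# Sending is left commented out because it needs a reachable gateway:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The point of the sketch is that the client only needs HTTPS and a URL.  Because the topology name is part of the path, the same gateway can front several clusters simply by exposing several topologies.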

What are people doing in this space?  Early adopters have already deployed Knox, but the majority of clusters still rely heavily on traditional edge nodes.  The interest is clearly present in almost all of the customers that I work with, and I expect rapid adoption of this technology.

Where do I go for more info?  For edge nodes, see http://www.dummies.com/how-to/content/edge-nodes-in-hadoop-clusters.html and, for Knox, see http://knox.apache.org/

...