
These corrections were made on 9/2/2015 to this blog posting.

So... time to eat some crow.  I had a customer who was automating the user onboarding process for his Hadoop cluster and wanted to know if he could use a Linux account besides hdfs to create an HDFS user home directory and set the appropriate permissions (see simple hadoop cluster user provisioning process (simple = w/o pam or kerberos)).  I told him he was out of luck and that was just the way it was going to be.

...


[root@sandbox ~]# useradd nonadminuser
[root@sandbox ~]# su nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -mkdir /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -chgrp nonadminuser /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -ls /user

   ... rm'd some lines ...

drwxr-xr-x   - nonadminuser   nonadminuser           0 2014-08-14 00:01 /user/nonadminuser

   ... rm'd some lines ...
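
Since the original question was about automating onboarding, here is a minimal sketch of how the steps above could be wrapped into one function. It assumes the script runs as a Linux user that HDFS already treats as a superuser, and that useradd creates a group with the same name as the user (as the listing above suggests). The names run and onboard_hdfs_user are my own placeholders, not anything that ships with Hadoop, and the sketch defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: onboard one user by creating a Linux account and an HDFS home directory.
# Assumption: the caller is a Linux user that HDFS treats as a superuser.
# DRY_RUN=1 (the default here) prints each command instead of executing it.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# onboard_hdfs_user is a hypothetical helper name, not part of Hadoop.
onboard_hdfs_user() {
  local user="$1"
  run useradd "$user"                              # local Linux account
  run hdfs dfs -mkdir -p "/user/$user"             # HDFS home directory
  run hdfs dfs -chown "$user:$user" "/user/$user"  # hand ownership to the new user
  run hdfs dfs -ls -d "/user/$user"                # show the resulting entry
}
```

Calling onboard_hdfs_user nonadminuser prints the four commands for review; prefixing the call with DRY_RUN=0 would actually execute them.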

If you want to have a process that doesn't involve switching to any other user (or, more importantly, want to have other Linux users with superuser rights), then please read on.

Thinking about it a bit later, I realized I never actually ran this one down.  Navigating through the Hadoop site got me to http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#The_Super-User, which told me what I've been espousing all along: the user that starts up the NameNode (NN) is the superuser.  Then I saw it, the phrase that let me know I was wrong in my reply...

In addition, the administrator may identify a distinguished group using a configuration parameter. If set, members of this group are also super-users.

...

No joy, but that is as expected.  The instructions at http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#Configuration_Parameters let me know I need to make sure there is a dfs.permissions.superusergroup key-value pair (KVP) in hdfs-site.xml.  This parameter can be found in Ambari at Services > HDFS > Configs > Advanced > dfs.permissions.superusergroup.  For my Hortonworks Sandbox this value is set to hdfs.  This also aligns with the fact that unless you do a -chgrp, your newly created items have the group set to hdfs on this little pseudo-cluster.  Contrary to what you might expect (i.e. that the owning group becomes the value of this setting), I did find out later that even with a different superusergroup identified, the owning group stayed as hdfs.
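
To make that concrete, here is a rough sketch of checking the configured group and adding a Linux user to it. It assumes the group is hdfs (the value on my Sandbox) unless you pass another one, and that group membership is resolved on the NameNode host, so the usermod would need to happen there. The names run and grant_hdfs_superuser are placeholders of mine, not Hadoop commands; hdfs getconf, usermod, and hdfs groups are the real tools being called. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: give a Linux user HDFS superuser rights through the configured group.
# Assumption: group membership is resolved on the NameNode host, so run this there.
# DRY_RUN=1 (the default here) prints each command instead of executing it.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# grant_hdfs_superuser is a hypothetical helper name, not part of Hadoop.
# The group defaults to "hdfs", the value reported for the Hortonworks Sandbox.
grant_hdfs_superuser() {
  local user="$1"
  local group="${2:-hdfs}"
  run hdfs getconf -confKey dfs.permissions.superusergroup  # show the configured group
  run usermod -a -G "$group" "$user"                        # append a supplementary group
  run hdfs groups "$user"                                   # groups as HDFS resolves them
}
```

Running grant_hdfs_superuser nonadminuser prints the three commands so you can compare the getconf value against the group you are about to use before running anything for real with DRY_RUN=0.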

...