These corrections were made on 9/2/2015 to this blog posting.

So... time to eat some crow.  I had a customer who was automating the user onboarding process for his Hadoop cluster and wanted to know if he could use a Linux account other than hdfs to create an HDFS user home directory and set the appropriate permissions (see "Creating a New HDFS User" in my Hadoop Cheat Sheet and "simple hadoop cluster user provisioning process (simple = w/o pam or kerberos)").  I told him he was out of luck and that was just the way it was going to be.

I eventually realized that my "you must switch to hdfs to create the home directory and change the owner" advice is actually wrong.  You can just switch to the newly created user and keep on keeping on.

[root@sandbox ~]# useradd nonadminuser
[root@sandbox ~]# su nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -mkdir /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -chgrp nonadminuser /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -ls /user

   ... rm'd some lines ...

drwxr-xr-x   - nonadminuser   nonadminuser           0 2014-08-14 00:01 /user/nonadminuser

   ... rm'd some lines ...
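
For an automated onboarding process, the two HDFS commands in the transcript above can be generated per user. This is only a minimal sketch: hdfs_home_commands is a hypothetical helper name, it assumes plain usernames with no spaces or shell metacharacters, and it only prints the commands rather than running them.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Hadoop): print the two HDFS commands
# from the transcript above for the username given as $1.
# Assumes a simple username without spaces or shell metacharacters.
hdfs_home_commands() {
  local user="$1"
  printf 'hdfs dfs -mkdir /user/%s\n' "$user"
  printf 'hdfs dfs -chgrp %s /user/%s\n' "$user" "$user"
}

# Dry run: show what would be executed for the example account.
hdfs_home_commands nonadminuser
```

To actually run the commands as the new user, one option is su nonadminuser -c "$(hdfs_home_commands nonadminuser)", which assumes the account already exists and that the hdfs client is on its PATH.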

If you want a process that doesn't involve switching to any other user (or, more importantly, want other Linux users to have superuser rights), then please read on.

Thinking about it a bit later, I realized I had never actually run this one down.  Navigating through the Hadoop site got me to http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#The_Super-User which told me what I've been espousing all along: the user that starts up the NameNode (NN) is the superuser.  Then I saw it – the phrase that let me know I was wrong in my reply...

In addition, the administrator may identify a distinguished group using a configuration parameter. If set, members of this group are also super-users.

...

No joy, but that is as expected.  The instructions at http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#Configuration_Parameters told me I needed to make sure a dfs.permissions.superusergroup key-value pair (KVP) exists in hdfs-site.xml.  This parameter can be found in Ambari at Services > HDFS > Configs > Advanced > dfs.permissions.superusergroup.  For my Hortonworks Sandbox this value is set to hdfs.  This also aligns with the fact that unless you do a -chgrp, your newly created items have the group set to hdfs on this little pseudo-cluster.  You might expect the owning group of new items to follow this setting, but I did find out later that even with a different superusergroup identified, the owning group stayed as hdfs.
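
If you would rather check the setting from the command line than click through Ambari, something like the sketch below should do it. The function names superuser_group and is_in_group are mine, not Hadoop's. The membership check only asks the local OS, so it assumes you run it on the NameNode host with the default shell-based group mapping; as I understand it, the NameNode is what actually resolves a user's groups.

```shell
#!/usr/bin/env bash

# Print the configured HDFS superuser group.
# `hdfs getconf -confKey` reads the value from the client-side configuration.
superuser_group() {
  hdfs getconf -confKey dfs.permissions.superusergroup
}

# Succeed (exit status 0) when the local OS lists user $1 as a member of group $2.
# Assumption: local group membership matches what the NameNode sees.
is_in_group() {
  local user="$1" group="$2"
  id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}

# Hypothetical usage; nonadminuser is the example account from this post:
#   if is_in_group nonadminuser "$(superuser_group)"; then echo "superuser"; fi
```

Taking the group as an argument keeps is_in_group usable even on a box without the hdfs client installed.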

...