
These corrections were made on 9/2/2015 to this blog posting.

So... time to eat some crow.  I had a customer who was automating the user onboarding process for his Hadoop cluster and wanted to know if he could use a Linux account besides hdfs to create an HDFS user home directory and set the appropriate permissions (see "Creating a New HDFS User" in my Hadoop Cheat Sheet, or my simple hadoop cluster user provisioning process post, where simple = w/o pam or kerberos).  I told him he was out of luck and that was just the way it was going to be.

...

To make matters worse, my "you must switch to hdfs to create the home directory and change the owner" advice is actually wrong.  You can just switch to the newly created user and keep on keeping on.

```bash
[root@sandbox ~]# useradd nonadminuser
[root@sandbox ~]# su nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -mkdir /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -chgrp nonadminuser /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -ls /user

   ... rm'd some lines ...

drwxr-xr-x   - nonadminuser   nonadminuser           0 2014-08-14 00:01 /user/nonadminuser

   ... rm'd some lines ...
```
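
Since the whole point here is automation, it may help to check the result in a script rather than by eye. Here is a minimal sketch that pulls the owner and group for one path out of `hdfs dfs -ls` output. The function name `ls_owner_group` is made up for this example, and it assumes the column layout shown in the listing above and paths without spaces.

```bash
#!/usr/bin/env bash
# Sketch only: ls_owner_group is a hypothetical helper, not part of Hadoop.

# Read `hdfs dfs -ls` output on stdin and print "owner group" for the entry
# whose last column equals the given path.
# Assumed columns: perms, replication, owner, group, size, date, time, path.
ls_owner_group() {
  local path="$1"
  awk -v p="$path" '$NF == p { print $3, $4 }'
}

# Demo using the listing line from this post, so no cluster is needed.
ls_owner_group /user/nonadminuser <<'EOF'
drwxr-xr-x   - nonadminuser   nonadminuser           0 2014-08-14 00:01 /user/nonadminuser
EOF
# prints: nonadminuser nonadminuser
```

On a real node the input would come from the command itself, e.g. `hdfs dfs -ls /user | ls_owner_group /user/nonadminuser`.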

...

Thinking about it a bit later, I realized I actually never ran this one down.  Navigating through the Hadoop site got me to http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#The_Super-User, which told me what I've been espousing all along: the user that starts up the NameNode (NN) is the superuser.  Then I saw it – the phrase that let me know I was wrong in my reply...

In addition, the administrator may identify a distinguished group using a configuration parameter. If set, members of this group are also super-users.

...

No joy, but that is as expected.  The instructions at http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#Configuration_Parameters let me know I need to make sure there is a dfs.permissions.superusergroup KVP created for hdfs-site.xml.  This parameter can be found in Ambari at Services > HDFS > Configs > Advanced > dfs.permissions.superusergroup.  For my Hortonworks Sandbox this value is set to hdfs.  This also aligns with the fact that unless you do a -chgrp, your newly created items have the group set to hdfs on this little pseudo-cluster.  Contrary to what you would expect (i.e. the group becomes the value for this setting), I did find out later that even with a different superusergroup identified, the owning group stayed as hdfs.
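
If you would rather check this value from a shell than click through Ambari, something like the following might do. It is only a sketch: `get_hadoop_property` is a made-up helper, the text match assumes each `<name>` line is immediately followed by its `<value>` line, and `/etc/hadoop/conf/hdfs-site.xml` is an assumed location for the client config. On a node with the Hadoop client installed, `hdfs getconf -confKey dfs.permissions.superusergroup` should report the effective value without any parsing.

```bash
#!/usr/bin/env bash
# Sketch only: get_hadoop_property is a hypothetical helper, not part of Hadoop.

# Print the value of one property from a Hadoop-style XML config file.
# This is a plain text match, not a real XML parser: it expects <value> on the
# same line as <name> or on the line right after it.
get_hadoop_property() {
  local file="$1" key="$2"
  grep -A1 -F "<name>${key}</name>" "$file" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
    | head -n 1
}

# Demo against a throwaway file so the sketch runs without a cluster.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hdfs</value>
  </property>
</configuration>
EOF
get_hadoop_property "$sample" dfs.permissions.superusergroup   # prints: hdfs
rm -f "$sample"

# Against a real config (assumed path):
#   get_hadoop_property /etc/hadoop/conf/hdfs-site.xml dfs.permissions.superusergroup
```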

...

After I changed the "superuser" group to be animals, I could then make the changes that I wanted to earlier.
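
Before pointing the finger at HDFS when a chown is refused, it is worth confirming that the account really is in the configured group. Here is a small sketch, assuming the bat user and animals group from this walkthrough; `has_group` is a hypothetical helper. Keep in mind that the NameNode does its own group lookup, so `id` should be run on the NameNode host, or use `hdfs groups bat` to ask for the NameNode's view.

```bash
#!/usr/bin/env bash
# Sketch only: has_group is a hypothetical helper, not part of Hadoop.

# Return 0 if the first argument appears among the remaining arguments.
# Usage: has_group GROUP name1 name2 ...
has_group() {
  local wanted="$1" g
  shift
  for g in "$@"; do
    [ "$g" = "$wanted" ] && return 0
  done
  return 1
}

# Demo with literal lists, so no real accounts are needed.
has_group animals bat animals && echo "bat is in animals"
has_group animals user1 || echo "user1 is not in animals"

# On the NameNode host the list would come from the OS (left unquoted on
# purpose so each group name becomes its own argument):
#   has_group animals $(id -Gn bat)
```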

...

This can now also be done as a "real" user if set up appropriately.  If bat had appropriate sudo rights, then I could have done the following without starting out as root.

```bash
[hdfs@sandbox root]$ exit
exit
[root@sandbox ~]# useradd user2
[root@sandbox ~]# su bat
[bat@sandbox root]$ hdfs dfs -mkdir /user/user2
[bat@sandbox root]$ hdfs dfs -ls /user

   ... rm'd some lines ...  NOTICE THAT THE GROUP STILL DEFAULTS TO hdfs, NOT animals

drwxr-xr-x   - user1          user1          0 2014-08-13 23:49 /user/user1
drwxr-xr-x   - bat            hdfs           0 2014-08-13 23:55 /user/user2
[bat@sandbox root]$ hdfs dfs -chown user2 /user/user2
[bat@sandbox root]$ hdfs dfs -chgrp user2 /user/user2
[bat@sandbox root]$ hdfs dfs -ls /user

   ... rm'd some lines ...

drwxr-xr-x   - user1          user1          0 2014-08-13 23:49 /user/user1
drwxr-xr-x   - user2          user2          0 2014-08-13 23:55 /user/user2
```
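
For anyone scripting the onboarding, the steps above could be wrapped up roughly like this. Treat it as a sketch under a few assumptions: the caller is a member of the configured superuser group and has sudo rights for useradd, useradd creates a group with the same name as the user (as it did on my Sandbox), and `provision_hdfs_user`, `run` and the `DRY_RUN` switch are names invented for this example. It also folds the separate -chown and -chgrp into a single `-chown user:group` call.

```bash
#!/usr/bin/env bash
# Sketch only: provision_hdfs_user, run and DRY_RUN are hypothetical names.

# Default to printing the commands; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise run it as given.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the Linux account, then the HDFS home directory, then hand the
# directory over to the new user and the same-named group.
provision_hdfs_user() {
  local user="$1"
  local home="/user/${user}"
  run sudo useradd "$user"
  run hdfs dfs -mkdir "$home"                     # owner starts out as the caller
  run hdfs dfs -chown "${user}:${user}" "$home"   # owner and group in one call
}

provision_hdfs_user user2
```

With the default dry-run setting this only prints the three commands it would run, which makes it easy to eyeball before letting it loose on a cluster.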

...