So... time to eat some crow.  I had a customer who is automating the user onboarding process for his Hadoop cluster and wanted to know if he could use a Linux account besides hdfs to create an HDFS user home directory and set the appropriate permissions (see "Creating a New HDFS User" in my Hadoop Cheat Sheet).  I told him he was out of luck and that was just the way it was going to be.

To make matters worse, my "you must switch to hdfs to create the home directory and change the owner" advice is actually wrong.  You can just switch to the newly created user and keep on keeping on.

[root@sandbox ~]# useradd nonadminuser
[root@sandbox ~]# su nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -mkdir /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -chgrp nonadminuser /user/nonadminuser
[nonadminuser@sandbox root]$ hdfs dfs -ls /user

   ... rm'd some lines ...

drwxr-xr-x   - nonadminuser   nonadminuser           0 2014-08-14 00:01 /user/nonadminuser

   ... rm'd some lines ...
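
Since the original question was about automating onboarding, here is a minimal sketch of how the session above might be scripted.  The names run, onboard_user and DRY_RUN are hypothetical (they are mine, not part of Hadoop), and it assumes, as on my sandbox, that the new user is allowed to create a directory under /user.  By default it only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Sketch only: replays the useradd / mkdir / chgrp steps for one user.
# run, onboard_user and DRY_RUN are hypothetical names, not Hadoop features.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the Linux account, then do the HDFS work as that account.
# Assumes the new user may write under /user, as in the sandbox session.
onboard_user() {
  local user="$1"
  local home="/user/${user}"
  run useradd "${user}"
  run su "${user}" -c "hdfs dfs -mkdir ${home}"
  run su "${user}" -c "hdfs dfs -chgrp ${user} ${home}"
}
```

Calling onboard_user nonadminuser prints the three commands (the dry-run output drops the quoting around the -c argument); running it as root with DRY_RUN=0 would actually execute them.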

If you want to have a process that doesn't involve switching to any other user, then please read on.

Thinking about it a bit later, I realized I actually never ran this one down.  Navigating through the Hadoop site got me to http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#The_Super-User which told me what I've been espousing all along: the user that starts up the NameNode (NN) is the superuser.  Then I saw it, the phrase that let me know I was wrong in my reply...

"In addition, the administrator may identify a distinguished group using a configuration parameter. If set, members of this group are also super-users."

Doh!  I was definitely wrong in my thinking and reply to my customer.  Hey, only the second time this month, but we have half a month to go!!

...

No joy, but that is as expected.  The instructions at http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#Configuration_Parameters let me know I need to make sure there is a dfs.permissions.superusergroup KVP created for hdfs-site.xml.  This parameter can be found in Ambari at Services > HDFS > Configs > Advanced > dfs.permissions.superusergroup.  For my Hortonworks Sandbox this value is set to hdfs.  This also aligns with the fact that unless you do a -chgrp, your newly created items have the owning group set to hdfs on this little pseudo-cluster.  I did find out later that even with a different superusergroup identified, the owning group stayed as hdfs.
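
If you would rather check the value from the command line than click through Ambari, something like the sketch below should do it.  It leans on hdfs getconf -confKey to read the effective setting; the function name show_superusergroup is hypothetical, and the fallback message is only there so the snippet behaves on a box without the Hadoop client installed.

```shell
#!/usr/bin/env bash
# Sketch only: report the group that HDFS treats as super-users.
# show_superusergroup is a hypothetical helper name.
show_superusergroup() {
  local key="dfs.permissions.superusergroup"
  if command -v hdfs >/dev/null 2>&1; then
    # Ask the Hadoop client for the effective configuration value.
    hdfs getconf -confKey "${key}"
  else
    echo "hdfs not on PATH; cannot read ${key}"
  fi
}
```

On my sandbox I would expect this to come back with hdfs, matching what Ambari shows.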

[cat@sandbox root]$ exit
exit
[root@sandbox ~]# su turtle
[turtle@sandbox root]$ hdfs dfs -put /etc/group groups.txt
[turtle@sandbox root]$ hdfs dfs -ls 
Found 1 items
-rw-r--r--   1 turtle hdfs       1033 2014-08-13 23:12 groups.txt

...