These instructions are for "simple" Hadoop clusters that have no sophisticated PAM and/or Kerberos integrations. They are ideal for the HDP Sandbox, or for other such "simple" setups that rely on "local" users, like the one described in Building a Virtualized 5-Node HDP 2.0 Cluster (all within a Mac).
For all command examples, replace $theNEWusername with the actual name of the user being created.
On the box(es) users will SSH into to use the CLI tools (this does NOT have to be a dedicated machine; on the Sandbox, for example, there is only one machine), log in as root and run the following commands to create a local account and set its password.
useradd -m -s /bin/bash $theNEWusername
passwd $theNEWusername
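For instance, if the new user were named jdoe (a placeholder for illustration, not a name from this walkthrough), the commands would read:

useradd -m -s /bin/bash jdoe
passwd jdoe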
Then create an HDFS home directory for the new user.
su - hdfs
hdfs dfs -mkdir /user/$theNEWusername
hdfs dfs -chown $theNEWusername /user/$theNEWusername
hdfs dfs -chmod -R 755 /user/$theNEWusername
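If you provision users often, the account and HDFS steps can be rolled into a small helper script. The following is a minimal sketch (the script name and structure are my own, not part of HDP), assuming it is run as root on the edge node with the HDFS client installed:

#!/bin/bash
# provision_user.sh -- hypothetical helper; creates a local account plus its HDFS home
# Usage: ./provision_user.sh <username>
NEWUSER="$1"
useradd -m -s /bin/bash "$NEWUSER"
passwd "$NEWUSER"
# run the HDFS commands as the hdfs superuser
su - hdfs -c "hdfs dfs -mkdir /user/$NEWUSER"
su - hdfs -c "hdfs dfs -chown $NEWUSER /user/$NEWUSER"
su - hdfs -c "hdfs dfs -chmod -R 755 /user/$NEWUSER"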
On the remaining cluster nodes (if any), the new user simply needs to exist. There is no need to set a password, as these CLI users will not log into any of those hosts directly.
useradd $theNEWusername
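With many worker nodes, this step can be scripted from the edge node. A minimal sketch, assuming root can SSH to each host and that node2 through node4 are placeholder hostnames for your actual cluster nodes:

# node2..node4 are hypothetical hostnames; substitute your own
for host in node2 node3 node4; do
    ssh root@$host "useradd $theNEWusername"
done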
To validate, users can SSH into the edge node with their new credentials and run the following commands to verify that they can manipulate content in HDFS. Note: whereas in Linux a user can use "~" to reference their home directory, the FS Shell treats relative references (i.e., paths with nothing before the initial file or folder name) as the equivalent of "~/", meaning everything is resolved against the user's home folder in HDFS.
hdfs dfs -put /etc/group groupList.txt
hdfs dfs -ls /user/$theNEWusername
hdfs dfs -cat groupList.txt
hdfs dfs -rm -skipTrash /user/$theNEWusername/groupList.txt
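As a quick illustration of the relative-path behavior noted above, the following two commands refer to the same file and produce identical output (run them before the final -rm above, since that command deletes the file):

hdfs dfs -cat groupList.txt
hdfs dfs -cat /user/$theNEWusername/groupList.txt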