If you find yourself needing to set up Hortonworks Data Platform (HDP) with Ambari in an environment where users and groups need to be pre-provisioned instead of simply created during the install process, then don't fret, as Ambari has you covered.  This write-up piggybacks on the HDP Documentation site and uses HDP 2.1.2 along with Ambari 1.5.1 as the baseline to build against.  It also builds a 4-node cluster (2 worker nodes, 1 master node, and 1 node to run Knox on), all running on CentOS 6.5.

...

Warning

A note in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-users_2x.html states, "All new service user accounts, and any existing user accounts used as service users, must have a UID >= 1000."  Unfortunately, as described here, CentOS and RHEL begin their user numbering at 500.
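To work within that constraint, you can pre-create service accounts with an explicit UID above the cutoff and audit for any that slipped into the legacy range.  This is just a sketch; the "ryohadoop"/"ryohdfs" names and the 1001 IDs below are made up for illustration:

```shell
# Pre-create a service account with an explicit UID/GID >= 1000
# ("ryohadoop", "ryohdfs", and 1001 are hypothetical; requires root):
#   groupadd -g 1001 ryohadoop
#   useradd  -u 1001 -g ryohadoop ryohdfs

# Audit helper: list accounts stuck in the legacy 500-999 range
# of a passwd-format file.
check_uids() {
  awk -F: '$3 >= 500 && $3 < 1000 { print $1 }' "$1"
}

# Demo against a sample file instead of the real /etc/passwd:
printf 'ok:x:1001:1001::/home/ok:/bin/bash\nbad:x:500:500::/home/bad:/bin/bash\n' > /tmp/passwd.sample
check_uids /tmp/passwd.sample   # prints: bad
```

Run the audit against /etc/passwd on each node once the accounts are in place.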

...

Warning

A note in http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html states (in regards to installing/running Ambari), "It is possible to use a non-root SSH account, if that account can execute sudo without entering a password."

On that note, there are great write-ups out there, like the one found here, but I cheated since this is a dev-only setup (virtualized, even, within my Mac) and every "real" environment will have a sysadmin who knows how to do this best for their setup.  I followed this thread and just did the following to grant everyone the ability to run password-less sudo commands.
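The exact commands are elided above, but the general shape of this kind of dev-only shortcut is a single sudoers entry (shown here for the wheel group; a real environment would scope this far more tightly):

```
# Added via "visudo" (or a file under /etc/sudoers.d/); dev-only shortcut.
# Lets members of the wheel group run any command via sudo with no password:
%wheel ALL=(ALL) NOPASSWD: ALL
```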

...

Now you can return to the notes in building a virtualized 5-node HDP 2.0 cluster (all within a mac) (i.e. keep on chugging through the install docs) after the SSH Setup section and work all the way through Rinse, Lather, and Repeat... as long as you take into account the change to the hosts we are trying to build and set them up as identified below.  We can call the newly created VirtualBox appliance 4N-HDP212-template (with a file name of 4N-HDP212-template.ova).

...

Phew... that's done!  Now we (finally) get to start installing Ambari.  For the most part, just follow the Install Cluster via Ambari ramblings back in building a virtualized 5-node HDP 2.0 cluster (all within a mac).  Of course, this is HDP 2.1, not 2.0, but the process is very similar.  I'll call out the major differences below.

...

I also found that I had a warning from each of the four nodes that the ntpd service was not running.  I thought I took care of this earlier, but either way I just followed the instructions on this back in building a virtualized 5-node HDP 2.0 cluster (all within a mac) and the warnings cleared up.
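For reference, clearing that warning on CentOS 6 boils down to starting ntpd and enabling it at boot on each of the four nodes (requires root):

```shell
service ntpd start        # start the time daemon now
chkconfig ntpd on         # keep it enabled across reboots
chkconfig --list ntpd     # sanity check: runlevels 2-5 should read "on"
```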

Unlike the other cluster install instructions, for this setup we want all services checked on the Choose Services page and then you can take some creative liberty on the Assign Masters page.  Here's a snapshot of my selections.

...

Item to Resolve / Action Taken

  • Item to Resolve: The Misc tab had a "Proxy group for Hive, WebHCat, Oozie and Falcon" field that I wasn't expecting.
    Action Taken: I simply left it as "users".
  • Item to Resolve: The Misc tab had no place to identify the Ganglia group of "ryonobody" that I previously created and used as the primary group for the "ryonobody" user.
    Action Taken: Knowing there is a user and a group both named "nobody" on the base OS install (and considering the bolt-on nature of Ganglia to HDP), I left the user as "nobody".
  • Item to Resolve: The Misc tab had no place to identify the RRDTool "ryorrdcahed" user that I previously created.
    Action Taken: Reading the notes again, I decided (maybe realized?) this is another bolt-on service for HDP and didn't worry about the user I previously created.
  • Item to Resolve: The Misc tab had no place to identify the "ryoapache" user that is associated with Ganglia.
    Action Taken: Same as the prior action.
  • Item to Resolve: The Misc tab had no place to identify the "ryopostgres" user that Ambari itself uses.
    Action Taken: No worries, but this could have been resolved during the CLI setup of Ambari as mentioned earlier.
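Before accepting those defaults, a quick sanity check confirms the base OS really does ship a "nobody" user and group:

```shell
id nobody            # shows the uid/gid for the base-OS "nobody" user
getent group nobody  # shows the matching group entry
```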

Thankfully... the Install, Start and Test screen cleared with all green bars.  Now, as shown at the end of building a virtualized 5-node HDP 2.0 cluster (all within a mac), we need to make sure the ambari-server and ambari-agent services are started upon reboot.  After taking care of that, restarting the virtual machines, and starting all Hadoop services from Ambari, all heck broke loose.  As you do in situations like this, I worked the logs and finally got some answers.
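For reference, on CentOS 6 wiring those services to start on boot boils down to (requires root; ambari-server on the Ambari host, ambari-agent on every node):

```shell
chkconfig ambari-server on   # on the node running Ambari Server
chkconfig ambari-agent on    # on all four nodes
service ambari-server start  # or wait for the next reboot
service ambari-agent start
```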

...