building a virtualized 5-node HDP 2.0 cluster (all within a mac)
This blog posting's content was originally on the Build a Virtualized 5-Node Hadoop 2.0 Cluster wiki page, but it just made sense to refactor it into a blog posting based on the short shelf-life it has due to changes in the ever-evolving HDP stack.
This write-up is designed to capture the steps required to stand up a 5-node HDP2 (Hortonworks Data Platform) Hadoop 2.0/YARN cluster (with 2 master nodes & 3 worker nodes) running on CentOS 6.4 – all executing within VirtualBox. Of course, you'll need a beefy host machine to run all five of these guest machines. To help me out, the good folks at Hortonworks outfitted me with a MacBook Pro that has a 2.3 GHz i7 processor, 16GB of RAM, and a 500GB SSD. Yes... I know... I'm lucky!
HDP 2.0.9.0 was utilized for this tutorial. Documentation for the most current version of HDP is available at http://docs.hortonworks.com/.
When we are done, we will have a cluster like the visualization below.
This isn't a 15-minute effort for the faint of heart. In keeping with the Earl of Wisdom to the right, which suggests that you bring some level of sys admin, networking, debugging, root-cause analysis, and just plain "hunting" skills before you even try this, this write-up will NOT present a series of mindless screenshots.
Additionally, I CAN GUARANTEE that there will be some bumps along the way that will require you to do some digging on your own. You can raise concerns via the comments section at the bottom of the page and I'll try to help you unjam any problems you can't work through.
That said, let's get started!!
First Master Node
After you get VirtualBox set up, we need to get a base install of CentOS going that we can use as a baseline. We can then tweak it for the other nodes. As we'll be doing a "Minimal Install", we only need the Disk1 ISO which you can get here.
Minimal Installation using DHCP
As mentioned above, some of this is an "explore as you go" exercise and I surely don't want to give you screenshots of every single step of setting up CentOS to run within a VirtualBox VM. Fortunately, there are good examples out there in the wild and some quick google searching yields some good links. For this exercise, I suggest taking a look at the write-up at http://webees.me/installing-centos-6-4-in-virtualbox-and-setting-up-host-only-networking/. It is not perfect (and the author, like me, leaves some things out), but it should get you going. The page at http://extr3metech.wordpress.com/2012/10/25/centos-6-3-installation-in-virtual-box-with-screenshots/ could also be reviewed prior to starting this step.
Here are some specific items that need to be addressed during the setup (again, above and beyond the basics identified in the earlier tutorials) that are focused on our need to build a multi-node Hadoop cluster.
- In the VirtualBox UI:
- Name the host 5N-HDP2-M1. More on the naming convention later.
- Set the memory to 2GB and create a 20GB (dynamically allocated) hard drive.
- For the Network options, set Adapter 1 to use NAT and set Adapter 2 to use the Host-only Adapter identified as vboxnet0 which will allow the host OS to access the CentOS VM. We'll move this second adapter from DHCP to static later.
- During the CentOS installation setup:
- When it is time to select the Hostname, use "m1.hdp2" (again, more on naming convention later).
- On that same screen, click the Configure Network button to edit each of the two listed "System ethX" network connections and select the Connect automatically checkbox for both.
- For simplicity's sake (i.e. we're going to end up with a lot of passwords!) just use "hadoop" for the root password.
- Stick with the default "Minimal installation" type.
If all goes well, then you'll end up with a CLI-only Linux box that you can now log into.
```
CentOS release 6.4 (Final)
Kernel 2.6.32-358.el6.x86_64 on an x86_64

m1 login: root
Password:
[root@m1 ~]# _
```
At this point, we're using DHCP, so run `ifconfig` to see what IP address is being used (my eth1 connection reported 192.168.56.102, but yours may vary). Then use a terminal program from your host to make sure you can log into the newly created virtual machine. And once there, ping an external server to verify it can reach back out.
```
hw10653:~ lmartin$ ssh root@192.168.56.102
root@192.168.56.102's password:
Last login: Thu Jan 16 15:34:32 2014 from 192.168.56.1
[root@m1 ~]#
[root@m1 ~]# ping www.google.com
PING www.google.com (74.125.227.113) 56(84) bytes of data.
64 bytes from dfw06s16-in-f17.1e100.net (74.125.227.113): icmp_seq=1 ttl=63 time=46.0 ms
64 bytes from dfw06s16-in-f17.1e100.net (74.125.227.113): icmp_seq=2 ttl=63 time=43.2 ms
^C
--- www.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1531ms
rtt min/avg/max/mdev = 43.230/44.615/46.001/1.401 ms
[root@m1 ~]#
```
Configure Static IP
Edit `/etc/sysconfig/network-scripts/ifcfg-eth1` to make the following changes:

- Make sure `ONBOOT` is set to `yes` (it will be if you selected the Connect automatically checkbox during installation as mentioned above)
- Change `NM_CONTROLLED` from `yes` to `no`
- Change `BOOTPROTO` from `dhcp` to `none`
- Add the following lines:
  - `IPADDR=192.168.56.41`
  - `NETMASK=255.255.255.0`
- Remove the lines that start with `UUID` and `HWADDR`

Edit `/etc/sysconfig/network-scripts/ifcfg-eth0` and remove the line starting with `UUID`. Run `rm /etc/udev/rules.d/70-persistent-net.rules` to flush out the existing network settings and then issue a `shutdown -r now` to reboot the machine.
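After those edits, the eth1 config should look roughly like the following. This is a sketch only; the exact set of boilerplate lines (`DEVICE`, `TYPE`, etc.) in your generated file may differ slightly.

```
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
IPADDR=192.168.56.41
NETMASK=255.255.255.0
```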
After the box is restarted, try to ping 192.168.56.41 (the virtual machine "m1" that we just built), ssh to it, and once there ping back to the host (vboxnet0 identifies it as 192.168.56.1) to verify it is routable in & out. Also, verify that you can still ping addresses in the wild.
```
hw10653:~ lmartin$ ping 192.168.56.41
PING 192.168.56.41 (192.168.56.41): 56 data bytes
64 bytes from 192.168.56.41: icmp_seq=0 ttl=64 time=0.297 ms
64 bytes from 192.168.56.41: icmp_seq=1 ttl=64 time=0.345 ms
^C
--- 192.168.56.41 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.297/0.321/0.345/0.024 ms
hw10653:~ lmartin$
hw10653:~ lmartin$ ssh root@192.168.56.41
root@192.168.56.41's password:
Last login: Sat Jan 18 03:28:27 2014 from 192.168.56.1
[root@m1 ~]#
[root@m1 ~]# ping 192.168.56.1
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
64 bytes from 192.168.56.1: icmp_seq=1 ttl=64 time=0.201 ms
64 bytes from 192.168.56.1: icmp_seq=2 ttl=64 time=0.230 ms
^C
--- 192.168.56.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1428ms
rtt min/avg/max/mdev = 0.201/0.215/0.230/0.020 ms
[root@m1 ~]#
[root@m1 ~]# ping www.google.com
PING www.google.com (74.125.227.115) 56(84) bytes of data.
64 bytes from dfw06s16-in-f19.1e100.net (74.125.227.115): icmp_seq=1 ttl=63 time=48.0 ms
64 bytes from dfw06s16-in-f19.1e100.net (74.125.227.115): icmp_seq=2 ttl=63 time=105 ms
^C
--- www.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1481ms
rtt min/avg/max/mdev = 48.053/76.825/105.598/28.773 ms
[root@m1 ~]#
```
Hadoop Installation Prep
Before we create the additional nodes using this one as the base to start from, there are some additional steps we should take so we won't have to do them for each VM. Later in this document we will utilize the Ambari documentation to set up our cluster, but there are some items within that documentation that are useful at this time.
SSH Setup
First, run `yum install openssh-clients` and then use `ssh localhost` to verify it is working.
Next, follow the instructions at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_using_Ambari_book/content/ambari-chap1-5-2.html to allow password-less SSH connections.
```
[root@m1 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
23:4a:e9:9f:c9:ea:91:1f:9b:bf:98:e2:16:af:0c:24 root@m1.hdp2
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|        .        |
|E . o . S        |
| o o.o .  .      |
|  . =o.          |
|   oo=.O         |
|  +*+@.o.        |
+-----------------+
[root@m1 ~]# cd .ssh
[root@m1 .ssh]# ls -l
total 12
-rw-------. 1 root root 1675 Jan 18 04:15 id_rsa
-rw-r--r--. 1 root root  394 Jan 18 04:15 id_rsa.pub
-rw-r--r--. 1 root root  391 Jan 18 04:09 known_hosts
[root@m1 .ssh]#
[root@m1 .ssh]# cat id_rsa.pub >> authorized_keys
[root@m1 .ssh]# ls -l
total 16
-rw-r--r--. 1 root root  394 Jan 18 04:17 authorized_keys
-rw-------. 1 root root 1675 Jan 18 04:15 id_rsa
-rw-r--r--. 1 root root  394 Jan 18 04:15 id_rsa.pub
-rw-r--r--. 1 root root  391 Jan 18 04:09 known_hosts
[root@m1 .ssh]#
[root@m1 .ssh]# chmod 600 authorized_keys
[root@m1 .ssh]# ls -l
total 16
-rw-------. 1 root root  394 Jan 18 04:17 authorized_keys
-rw-------. 1 root root 1675 Jan 18 04:15 id_rsa
-rw-r--r--. 1 root root  394 Jan 18 04:15 id_rsa.pub
-rw-r--r--. 1 root root  391 Jan 18 04:09 known_hosts
[root@m1 .ssh]#
[root@m1 .ssh]# ssh localhost
Last login: Sat Jan 18 04:09:59 2014 from localhost
[root@m1 ~]#
```
Lastly, create a file named `config` in `/root/.ssh` that contains `StrictHostKeyChecking no` as the only line.
```
[root@m1 ~]# cd /root/.ssh
[root@m1 .ssh]# echo 'StrictHostKeyChecking no' >> config
[root@m1 .ssh]# cat config
StrictHostKeyChecking no
[root@m1 .ssh]#
```
Disable Key Security Options
Security-Enhanced Linux (SELinux) is a "good thing" and I'm not recommending you go without it in your "real" environments, but for our purposes let's just disable it. To do this, simply edit `/etc/selinux/config` and change the `SELINUX` value from `enforcing` to `disabled`.
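If you'd rather script that edit than open an editor, it can be sketched with `sed` as below. The helper name is my own invention; on a live box you'd point it at `/etc/selinux/config` and could also run `setenforce 0` to leave enforcing mode immediately, without waiting for a reboot.

```shell
# disable_selinux FILE: flip SELINUX=enforcing to disabled in the given
# config file (normally /etc/selinux/config). Name is illustrative only.
disable_selinux() {
  sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' "$1"
}

# On the VM (as root):
#   disable_selinux /etc/selinux/config
#   setenforce 0   # takes effect now; the file edit covers future reboots
```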
The Linux firewall, iptables, is also another "good thing", but let's make our environment easier to install into by disabling it. Run the following commands.
```
[root@m1 ~]# chkconfig iptables off
[root@m1 ~]# chkconfig ip6tables off
```
Reboot and make sure the following output is seen to confirm that iptables is not running at system startup.
```
[root@m1 ~]# service iptables status
iptables: Firewall is not running.
[root@m1 ~]#
```
Setup NTP Service
The clocks on all nodes in the cluster need to be in sync and the NTP service will take care of this for us. Execute `yum install ntp` to install the service. Then type the following commands.
```
[root@m1 ~]# chkconfig ntpd on
[root@m1 ~]# service ntpd start
Starting ntpd:                                             [  OK  ]
[root@m1 ~]#
```
Flush Networking Rules
Perform the following as the final step (as a final flush of networking rules) in our preparation of a base VM to help create the other nodes.
```
[root@m1 ~]# cd /etc/udev/rules.d
[root@m1 rules.d]# rm 70-persistent-net.rules
rm: remove regular file `70-persistent-net.rules'? y
[root@m1 rules.d]#
```
Pre-Configure Hosts
When finished we will have the following 5 hosts in our cluster.
| VirtualBox Name | OS Hostname | IP Address |
| --- | --- | --- |
| 5N-HDP2-M1 | m1.hdp2 | 192.168.56.41 |
| 5N-HDP2-M2 | m2.hdp2 | 192.168.56.42 |
| 5N-HDP2-W1 | w1.hdp2 | 192.168.56.51 |
| 5N-HDP2-W2 | w2.hdp2 | 192.168.56.52 |
| 5N-HDP2-W3 | w3.hdp2 | 192.168.56.53 |
Update Hosts File
Since we will not be relying on DNS for name resolution, we need to add the lines below (as described in http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_using_Ambari_book/content/ambari-chap1-5-5.html) to `/etc/hosts`.
```
192.168.56.41  m1.hdp2
192.168.56.42  m2.hdp2
192.168.56.51  w1.hdp2
192.168.56.52  w2.hdp2
192.168.56.53  w3.hdp2
```
Obviously, we've only created the first one of these servers, but we'll build out the rest of them shortly.
Set the Hostname
Run the following commands and make sure the output is as shown.
```
[root@m1 etc]# hostname
m1.hdp2
[root@m1 etc]# hostname -f
m1.hdp2
[root@m1 etc]#
```
Create Node Appliance
Now that we have a solid baseline, let's use VirtualBox to create an "appliance" that can be used as an import when standing up the subsequent nodes. Select Export Appliance... from the File pulldown and then highlight the 5N-HDP2-M1 VM before clicking on Continue. Change the filename of the export to be 5N-HDP2-template.ova (stick with the default OVF 1.0 format) and click on Continue (make note of the directory this file will be created in). In the Appliance settings screen, change the Name to be 5N-HDP2-template and click on the Export button.
Remaining Nodes
For the second master node and all three of the worker nodes, repeat the steps in this section.
NOTE: All examples will show the VirtualBox Name, OS Hostname, and IP Address of the second master node. For the worker nodes, replace them with the appropriate values as detailed in the Pre-Configure Hosts section earlier in this document.
Import Template Appliance
From the VirtualBox File pulldown, select Import Appliance... and select the 5N-HDP2-template.ova previously created and click the Continue button. On the Appliance settings screen edit the Name field to represent the appropriate VirtualBox Name (example: 5N-HDP2-M2). Select the Reinitialize the MAC address of all network cards checkbox and click on Import.
Modify Node-Specific Settings
Make sure the 5N-HDP2-M1 VM is not running before starting up each newly imported machine, to avoid two machines binding to the same static IP address (192.168.56.41).
Power up the new VM, log into it locally (root password is still 'hadoop'), and then make the following changes to configure the appropriate hostname and IP address for each additional node.
- Modify `HWADDR` in `/etc/sysconfig/network-scripts/ifcfg-eth0` to match the MAC Address set for the Adapter 1 interface as shown in the example below.
- Modify `IPADDR` in `/etc/sysconfig/network-scripts/ifcfg-eth1` to be the appropriate IP address (example: 192.168.56.42).
- Modify `HOSTNAME` in `/etc/sysconfig/network` to be the correct fully-qualified hostname (example: m2.hdp2).
- Set the hostname as described in the earlier 'Set the Hostname' section (example: run `hostname m2.hdp2` and verify by typing `hostname -f` to see it displayed).
- Delete the `/etc/udev/rules.d/70-persistent-net.rules` file.
- Execute `shutdown -r now` to reboot the VM.
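If you'd rather not hand-edit each clone, the file edits above can be sketched as a small shell function. The function name and the optional dry-run root argument are my own inventions, and the MAC address shown is a placeholder; run it as root on the freshly imported VM, then set the hostname and reboot.

```shell
# reconfigure_node HOSTNAME IP MAC [ROOT]
# Applies the per-node file edits from the list above. ROOT (default empty,
# i.e. the real filesystem) lets you point at a copied tree for a dry run.
reconfigure_node() {
  host="$1"; ip="$2"; mac="$3"; root="${4:-}"
  sed -i "s/^HWADDR=.*/HWADDR=$mac/"      "$root/etc/sysconfig/network-scripts/ifcfg-eth0"
  sed -i "s/^IPADDR=.*/IPADDR=$ip/"       "$root/etc/sysconfig/network-scripts/ifcfg-eth1"
  sed -i "s/^HOSTNAME=.*/HOSTNAME=$host/" "$root/etc/sysconfig/network"
  rm -f "$root/etc/udev/rules.d/70-persistent-net.rules"
}

# Example for the second master node (substitute the real Adapter 1 MAC):
#   reconfigure_node m2.hdp2 192.168.56.42 08:00:27:XX:XX:XX
#   hostname m2.hdp2
#   shutdown -r now
```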
Ensure Connectivity Between Nodes
The 5N-HDP2-M1 VM can now be brought back up for connectivity testing as there will not be a collision on the 192.168.56.41 address.
Make an ssh connection to the new cluster node from the host OS and once there ping the first master node & any other operational cluster nodes as well as an internet address. Then verify the correct hostname is assigned to the newly created node.
```
hw10653:~ lmartin$ ssh root@192.168.56.42
root@192.168.56.42's password:
Last login: Sat Jan 18 22:21:24 2014 from 192.168.56.1
[root@m2 ~]#
[root@m2 ~]# ping 192.168.56.1
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
64 bytes from 192.168.56.1: icmp_seq=1 ttl=64 time=0.176 ms
64 bytes from 192.168.56.1: icmp_seq=2 ttl=64 time=0.249 ms
^C
--- 192.168.56.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1440ms
rtt min/avg/max/mdev = 0.176/0.212/0.249/0.039 ms
[root@m2 ~]#
[root@m2 ~]# ping m1.hdp2
PING m1.hdp2 (192.168.56.41) 56(84) bytes of data.
64 bytes from m1.hdp2 (192.168.56.41): icmp_seq=1 ttl=64 time=1.16 ms
64 bytes from m1.hdp2 (192.168.56.41): icmp_seq=2 ttl=64 time=0.394 ms
^C
--- m1.hdp2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1431ms
rtt min/avg/max/mdev = 0.394/0.777/1.160/0.383 ms
[root@m2 ~]#
[root@m2 ~]# ping www.google.com
PING www.google.com (173.194.115.49) 56(84) bytes of data.
64 bytes from dfw06s40-in-f17.1e100.net (173.194.115.49): icmp_seq=1 ttl=63 time=54.8 ms
64 bytes from dfw06s40-in-f17.1e100.net (173.194.115.49): icmp_seq=2 ttl=63 time=144 ms
^C
--- www.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1517ms
rtt min/avg/max/mdev = 54.879/99.851/144.824/44.973 ms
[root@m2 ~]#
[root@m2 ~]# hostname -f
m2.hdp2
[root@m2 ~]#
```
Now verify the password-less SSH connectivity is working between nodes.
```
hw10653:~ lmartin$ ssh root@192.168.56.42
root@192.168.56.42's password:
Last login: Sat Jan 18 22:21:34 2014 from 192.168.56.1
[root@m2 ~]#
[root@m2 ~]# ssh m1.hdp2
Last login: Sat Jan 18 22:09:13 2014 from m2.hdp2
[root@m1 ~]#
[root@m1 ~]# ssh m2.hdp2
Last login: Sat Jan 18 22:32:01 2014 from 192.168.56.1
[root@m2 ~]#
```
Rinse, Lather, and Repeat...
Go back to the beginning of this Remaining Nodes section and repeat all of the steps for each cluster node that still needs to be created, configured, and tested.
Install Cluster via Ambari
Now it is time to get down to the business at hand: setting up a Hadoop cluster. We'll make it easy by using the Ambari installer. The instructions are at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_using_Ambari_book/content/ambari-chap1-1.html. Fortunately, we did much of the "prep" work already.
Perform all the CLI operations on m1.hdp2 as this is the box where we will be running Ambari. After you run `yum install wget`, you can pick up at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_using_Ambari_book/content/ambari-chap2-1.html and drop the bits some place like /tmp. Just keep rolling; the instructions are rock-solid. For our simple case, take all the defaults and skip the optional steps.
Don't stop ambari-server as shown in http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_using_Ambari_book/content/ambari-chap2-3.html as we're ready to start up the Ambari UI which should now be available at http://192.168.56.41:8080. As http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_using_Ambari_book/content/ambari-chap3-1_2x.html states, the username/password is admin/admin.
When asked, name the cluster 'hdp2' and continue to follow the documentation. On the Install Options screen, paste the contents of `/root/.ssh/id_rsa`, including the "BEGIN" and "END" lines, into the SSH Private Key textbox so the screen looks like the following.
After you click on the Register and Confirm button you will see the following Warning screen.
Just click OK. After all hosts have been registered, uncheck Nagios, Ganglia, and HBase on the Choose Services page. Just clear the Limited Functionality Warning message that surfaces.
The Assign Masters screen defaulted to deploying the NameNode, SNameNode, and ResourceManager as visualized at the beginning of this document. If yours looks different than below, then adjust to fit this approach.
Configure the Assign Slaves and Clients screen to look like the following.
To clear the error messages on the Customize Services screen, select the Hive tab and type in "hadoop" (twice) for the Default Password field. Then do the exact same for the Oozie tab. Take the default options on everything else and press OK. Then press Deploy on the Review page (again, I hope you're following along in the Hortonworks docs). Now let the Install, Start and Test screen do its magic.
BTW, there is a LOT to be done so it will take some time. In fact, some time-outs may occur and surface as failures, as the screenshot below shows happened to me.
Hit the Retry button! I was lucky; I only had to do that once before getting a screen full of green bars, but it did take a looooooooooooooong time to finish and I could hear the fans on my MacBook Pro spinning up quite loudly.
Lastly, we want to enable the base Ambari daemons to kick up at machine start up time so do the following.

- On m1.hdp2 (i.e. where Ambari itself runs), type `chkconfig ambari-server on`.
- On all machines, including m1.hdp2, type `chkconfig ambari-agent on`.
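Since password-less SSH is already in place between the nodes, those two steps can be sketched as a loop run from m1.hdp2. The helper name is mine, and the hostnames come from the Pre-Configure Hosts table earlier in this write-up.

```shell
# enable_ambari_autostart: turn the agent on across all five nodes and
# the server on this box. Run on m1.hdp2 as root.
enable_ambari_autostart() {
  for node in m1.hdp2 m2.hdp2 w1.hdp2 w2.hdp2 w3.hdp2; do
    ssh "root@$node" 'chkconfig ambari-agent on'
  done
  chkconfig ambari-server on   # the Ambari server only runs on m1.hdp2
}

# Usage (as root on m1.hdp2):
#   enable_ambari_autostart
```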
There you have it... as http://192.168.56.41:8080 shows, you have a fully functional cluster with 2 master nodes and 3 worker nodes!!