manually installing hue (on my virtualized 5-node cluster)

I needed to manually install Hue on the little cluster I previously documented in Build a Virtualized 5-Node Hadoop 2.0 Cluster, so I thought I'd document it as I went just in case it worked (and in case there were any tweaks from the documentation).  The Hortonworks doc site instructions I used are at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.0/bk_installing_manually_book/content/rpm-chap-hue.html.

One of the first things you get asked to do is to make sure Python 2.6 is installed.  I ran into the issue below, which suggested I couldn't get this rolling.

[root@m1 ~]# yum install python26
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.dattobackup.com
 * extras: centos.someimage.com
 * updates: mirror.beyondhosting.net
Setting up Install Process
No package python26 available.
Error: Nothing to do

I was pretty sure Ambari had already laid this down, so I did a quick double-check of the installed version on all 5 of my nodes to verify I was in good shape.

[root@m1 ~]# which python
/usr/bin/python
[root@m1 ~]# python -V
Python 2.6.6
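
To repeat that check on every node without logging in to each one, a small loop over ssh can do it.  This is only a sketch: it assumes root can ssh to each node, the function names is_python26 and check_nodes are made up for this example, and the host names you pass in are your own (only m1.hdp2 and m2.hdp2 appear in this post).

```shell
#!/bin/bash
# Succeeds when the given "python -V" output reports a 2.6.x interpreter.
is_python26() {
  case "$1" in
    "Python 2.6"|"Python 2.6."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Runs "python -V" on each host over ssh and reports the result.
# Python 2 prints its version on stderr, hence the 2>&1.
check_nodes() {
  local host version
  for host in "$@"; do
    version=$(ssh "$host" 'python -V' 2>&1)
    if is_python26 "$version"; then
      echo "$host: OK ($version)"
    else
      echo "$host: CHECK ME ($version)"
    fi
  done
}

# Example (host names are assumptions; substitute your own):
# check_nodes m1.hdp2 m2.hdp2
```

Any host that prints CHECK ME either has a different Python on its path or could not be reached, so look at the text in the parentheses.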

When you get to the Configure HDP page you'll be reminded that if you are using Ambari (like me) you should NOT edit the conf files directly.  I used vi to check the existing files in /etc/hadoop/conf to see what needed to be done.  The single property for hdfs-site.xml was already in place as described.  For core-site.xml, the properties starting with hadoop.proxyuser.hcat were already present as shown below.

The next screenshot shows that I changed them as described in the documentation.  The properties starting with hadoop.proxyuser.hue were not present (no surprise!) so I added them as described (and shown below).
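
Once Ambari has pushed the change out, a quick grep can confirm the two Hue proxy-user properties landed in the conf file.  This is a rough sketch, not part of the Hortonworks instructions: the function names are invented for this example, and it assumes each property name sits inside a single <name>...</name> element with no extra whitespace.

```shell
#!/bin/bash
# Succeeds when the given Hadoop XML conf file contains a <name> element
# for the given property.
has_property() {
  local file=$1 name=$2
  grep -Fq "<name>${name}</name>" "$file"
}

# Reports which of the two Hue proxy-user properties are present.
check_hue_proxyuser() {
  local file=$1 prop
  for prop in hadoop.proxyuser.hue.hosts hadoop.proxyuser.hue.groups; do
    if has_property "$file" "$prop"; then
      echo "found:   $prop"
    else
      echo "missing: $prop"
    fi
  done
}

# Example:
# check_hue_proxyuser /etc/hadoop/conf/core-site.xml
```

This only reads the file, so it is safe to run on an Ambari-managed cluster where the conf files should not be edited by hand.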

I then used Ambari to add the ...hue.hosts and ...hue.groups custom properties for the webhcat-site.xml and oozie-site.xml conf files.  That took us to the Install Hue instructions, which I decided to run on my first master node and which completed without problems.  When you get to Configure Web Server, steps 1-3 don't really require any action (remember, we're building a sandbox within a machine, not a production-ready cluster).  Step 4 was a tiny bit confusing, so I'm dumping my screen in case it helps.

[root@m1 conf]# cd /usr/lib/hue/build/env/bin
[root@m1 bin]# ./easy_install pyOpenSSL
Searching for pyOpenSSL
Best match: pyOpenSSL 0.13
Processing pyOpenSSL-0.13-py2.6-linux-x86_64.egg
pyOpenSSL 0.13 is already the active version in easy-install.pth

Using /usr/lib/hue/build/env/lib/python2.6/site-packages/pyOpenSSL-0.13-py2.6-linux-x86_64.egg
Processing dependencies for pyOpenSSL
Finished processing dependencies for pyOpenSSL
[root@m1 bin]# vi /etc/hue/conf
[root@m1 bin]# vi /etc/hue/conf/hue.ini

  ... MAKE THE CHANGES IN STEP 4-B ((I ALSO MADE A COPY OF THE .INI FILE FOR COMPARISON)) ...

[root@m1 bin]# diff /etc/hue/conf/hue.ini.orig /etc/hue/conf/hue.ini
70c70
<   ## ssl_certificate=
---
>   ## ssl_certificate=$PATH_To_CERTIFICATE
73c73
<   ## ssl_private_key=
---
>   ## ssl_private_key=$PATH_To_KEY
[root@m1 bin]# openssl genrsa 1024 > host.key
Generating RSA private key, 1024 bit long modulus
.........................................................................++++++
...................................................................++++++
e is 65537 (0x10001)
[root@m1 bin]# openssl req -new -x509 -nodes -sha1 -key host.key > host.cert
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:US
State or Province Name (full name) []:Georgia
Locality Name (eg, city) [Default City]:Alpharetta
Organization Name (eg, company) [Default Company Ltd]:Hortonworks
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:m1.hdp2
Email Address []:lmartin@hortonworks.com
[root@m1 bin]# 
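
If you would rather skip the interactive prompts, the same two openssl steps can be run non-interactively.  Treat this as a sketch under my own assumptions rather than what the Hortonworks doc prescribes: it swaps in a 2048-bit key and SHA-256 in place of the 1024-bit key and SHA-1 in the transcript above, sets only the Common Name, picks a 365-day lifetime, and the function name make_self_signed is made up for this example.

```shell
#!/bin/bash
# Creates host.key and host.cert in the given directory for the given
# common name, without the interactive Distinguished Name prompts.
make_self_signed() {
  local dir=$1 cn=$2
  openssl genrsa -out "$dir/host.key" 2048 2>/dev/null &&
  openssl req -new -x509 -nodes -sha256 -days 365 \
    -key "$dir/host.key" -subj "/CN=$cn" -out "$dir/host.cert"
}

# Example:
# make_self_signed /usr/lib/hue/build/env/bin m1.hdp2
# openssl x509 -in /usr/lib/hue/build/env/bin/host.cert -noout -subject -dates
```

The second example line just prints the subject and validity dates so you can confirm the certificate carries the host name you expected.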

For sections 4.2 through 4.6 it looks like there was at least one problem (namely hadoop_hdfs_home) so I've dumped my screen again.  The following is aligned with the 5-node cluster I did previously.

[root@m1 bin]# cd /etc/hue/conf
[root@m1 conf]# vi hue.ini

  ... MAKE THE CHANGES IN STEP 4.2 - 4.6 ((I ALSO MADE A COPY OF THE .INI FILE FOR COMPARISON)) ...

[root@m1 conf]# diff hue.ini.orig hue.ini
70c70
<   ## ssl_certificate=
---
>   ## ssl_certificate=$PATH_To_CERTIFICATE
73c73
<   ## ssl_private_key=
---
>   ## ssl_private_key=$PATH_To_KEY
238c238
<       fs_defaultfs=hdfs://localhost:8020
---
>       fs_defaultfs=hdfs://m1.hdp2:8020
243c243
<       webhdfs_url=http://localhost:50070/webhdfs/v1/
---
>       webhdfs_url=http://m1.hdp2:50070/webhdfs/v1/
251c251
<       ## hadoop_hdfs_home=/usr/lib/hadoop/lib
---
>       ## hadoop_hdfs_home=/usr/lib/hadoop-hdfs
298c298
<       resourcemanager_host=localhost
---
>       resourcemanager_host=m2.hdp2
319c319
<       resourcemanager_api_url=http://localhost:8088
---
>       resourcemanager_api_url=http://m2.hdp2:8088
322c322
<       proxy_api_url=http://localhost:8088
---
>       proxy_api_url=http://m2.hdp2:8088
325c325
<       history_server_api_url=http://localhost:19888
---
>       history_server_api_url=http://m2.hdp2:19888
328c328
<       node_manager_api_url=http://localhost:8042
---
>       node_manager_api_url=http://m2.hdp2:8042
338c338
<   oozie_url=http://localhost:11000/oozie
---
>   oozie_url=http://m2.hdp2:11000/oozie
377c377
<   ## beeswax_server_host=<FQDN of Beeswax Server>
---
>   ## beeswax_server_host=m2.hdp2
529c529
<   templeton_url="http://localhost:50111/templeton/v1/"
---
>   templeton_url="http://m2.hdp2:50111/templeton/v1/"
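
With this many host names to change, it is easy to miss one.  A small helper (the name remaining_localhost is invented here) can list the uncommented lines that still mention localhost.  It is only a sketch, and a hit is not necessarily a mistake, since some settings may legitimately point at the local machine.

```shell
#!/bin/bash
# Prints every non-comment line of the given file that still mentions
# localhost, prefixed with its line number. Prints nothing (and returns
# non-zero) when no such line is left.
remaining_localhost() {
  grep -n 'localhost' "$1" | grep -Ev '^[0-9]+:[[:space:]]*#'
}

# Example:
# remaining_localhost /etc/hue/conf/hue.ini
```

Lines whose first non-blank character is # are skipped, which covers the ## style comments hue.ini uses.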

The Start Hue directions yielded the following output.

[root@m1 conf]# /etc/init.d/hue start
Detecting versions of components...
HUE_VERSION=2.3.0-101
HDP=2.0.6
Hadoop=2.2.0
HCatalog=0.12.0
Pig=0.12.0
Hive=0.12.0
Oozie=4.0.0
Ambari-server=1.4.3
HBase=0.96.1
Starting hue:                                              [  OK  ]

The instructions then go to Validate Configuration, but since we stopped everything with Ambari earlier, this is a great time to start up all the services before going to the Hue URL, which for me is http://192.168.56.41:8000.

For reasons that would take longer to explain than I want to go into in this posting, when replacing 'YourHostName' in http://YourHostName:8000 to pull up Hue, be sure to use a host name (or just the IP address) that every node within the cluster can use to reach the node Hue is running on.  Buy me a Dr Pepper and I'll tell you all about it.

If you left the default authentication configuration as it was, like I did, you will get this reminder when Hue comes up for the first time.

To keep my life easy, I just use hue and hue for the username and password.  I also ran a dir listing on HDFS before I logged in and after as shown below.  Notice that /user/hue was created after I logged in (group is hue as well).

[root@m1 ~]# su hdfs
[hdfs@m1 root]$ hadoop fs -ls /user
Found 5 items
drwxrwx---   - ambari-qa hdfs          0 2014-04-08 19:24 /user/ambari-qa
drwxr-xr-x   - hcat      hdfs          0 2014-01-20 00:23 /user/hcat
drwx------   - hdfs      hdfs          0 2014-03-20 23:00 /user/hdfs
drwx------   - hive      hdfs          0 2014-01-20 00:23 /user/hive
drwxrwxr-x   - oozie     hdfs          0 2014-01-20 00:25 /user/oozie
[hdfs@m1 root]$ 
[hdfs@m1 root]$ hadoop fs -ls /user
Found 6 items
drwxrwx---   - ambari-qa hdfs          0 2014-04-08 19:24 /user/ambari-qa
drwxr-xr-x   - hcat      hdfs          0 2014-01-20 00:23 /user/hcat
drwx------   - hdfs      hdfs          0 2014-03-20 23:00 /user/hdfs
drwx------   - hive      hdfs          0 2014-01-20 00:23 /user/hive
drwxr-xr-x   - hue       hue           0 2014-04-08 19:29 /user/hue
drwxrwxr-x   - oozie     hdfs          0 2014-01-20 00:25 /user/oozie

My Hue UI came up fine without any misconfiguration detected, so I decided to run through some of my prior blog postings to check things out.  I selected how do i load a fixed-width formatted file into hive? (with a little help from pig) since it exercises Pig and Hive pretty quickly.

For some reason, I could not get away with the simple way of registering the piggybank jar file shown in that quick tutorial.  I actually had to load it into HDFS (I put it at /user/hue/jars/piggybank.jar) and then register it as shown below, which is explained in more detail in the comments section of create and share a pig udf (anyone can do it).

REGISTER /user/hue/jars/piggybank.jar;  --that is an HDFS path
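
Getting the jar into HDFS in the first place looks roughly like the sketch below.  The assumptions are mine: the local location of piggybank.jar is hypothetical (use wherever yours lives), the function name upload_jar and the HADOOP_CMD variable are invented for this example, and you need to run it as a user who can write under /user/hue.

```shell
#!/bin/bash
# HADOOP_CMD can be overridden (for a dry run, set it to "echo").
HADOOP_CMD=${HADOOP_CMD:-hadoop}

# Copies a local jar into an HDFS directory, creating the directory first.
upload_jar() {
  local jar=$1 dest=$2
  $HADOOP_CMD fs -mkdir -p "$dest" &&
  $HADOOP_CMD fs -put "$jar" "$dest/"
}

# Example (the local jar path is an assumption):
# upload_jar ./piggybank.jar /user/hue/jars
```

Setting HADOOP_CMD=echo before the call prints the two hadoop fs commands without running them, which is handy for checking the paths first.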

I got into trouble when I ran convert-emp and Hue's Pig interface complained, telling me to "Please initialize HIVE_HOME".  You may not run into this problem yourself, as the fix (which Hortonworks Support helped me with, as seen in Case_00004924.pdf) was simply to add the Hive Client to all nodes within the cluster (this will be fixed in HDP 2.1).  As the ticket said, that would be painful if I had to do it for tons of nodes, especially with the version of Ambari I'm using, which does not yet allow you to do operations like this on many machines at a time.  That said, I just needed to add it to three workers via the Ambari feature shown below.

Truthfully, on my little virtualized cluster this takes a few minutes for each host.  It will be nice when stuff like this can happen in parallel.  Hey... just another reason to add "Clients" to all nodes in the cluster!

All in all, a bit more arduous than it ought to be, but now you have Hue running in your very own virtualized cluster!!