...

The Apache Pig project's User Defined Functions page gives a pretty good overview of how to create a UDF.  In fact, I stole my simple UDF from there.  For Pig UDFs, the obligatory "Hello World" program is actually a "Convert to Upper Case" function.  For this effort, I'm using the Hortonworks Sandbox (version 2.0).  Once you have that setup operational, follow along and we'll get your first UDF created and placed on HDFS where others can easily share it.

...

Code Block
language: bash
hw10653:~ lmartin$ ssh root@127.0.0.1 -p 2222
root@127.0.0.1's password: 
Last login: Fri Mar 28 21:34:47 2014 from 10.0.2.2
[root@sandbox ~]# su hue
[hue@sandbox root]$ cd ~
[hue@sandbox ~]$ mkdir exampleudf
[hue@sandbox ~]$ cd exampleudf/
[hue@sandbox exampleudf]$ vi UPPER.java

...

As you are starting to see, the goal is to create a SIMPLE User-Defined Function.  This will give you a strawman that you can build your own slick new function on top of.  That, or pay some decent Java Hadoop programmer to do it for you; heck, I'm not allergic to a little moonlighting.  (wink)
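If you would rather not type the class into vi, here is a minimal sketch that writes UPPER.java into the current directory with a heredoc.  The Java source follows the UPPER example from the Pig UDF documentation; the package name `exampleudf` is my assumption, chosen to match the directory created earlier, so adjust it if your layout differs.

```shell
#!/usr/bin/env bash
# Writes UPPER.java into the current directory (run it from ~/exampleudf).
# The package name "exampleudf" is an assumption that matches the directory name.
cat > UPPER.java <<'EOF'
package exampleudf;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Eval function that upper-cases its first argument.
public class UPPER extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        // Pass empty or null input through as null instead of failing the job.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            throw new IOException("Caught exception processing input row", e);
        }
    }
}
EOF
```

The null check up front matters more than it looks: Pig hands a UDF whatever is in the data, and returning null for a bad row is usually friendlier than killing the whole job.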

Then just compile the class and jar it up (your JDK and Pig version numbers might vary slightly).  If you have trouble compiling/jarring it, or don't even want to try, then just download exampleudf.jar directly and load it into the directory described further down in the post.
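A rough sketch of the compile-and-jar step, wrapped in a function so it fails politely when something is missing.  The `PIG_JAR` default is a hypothetical location, not a path taken from the Sandbox; point it at whichever Pig jar your install provides.  It also assumes UPPER.java declares `package exampleudf;` and lives in a directory of the same name.

```shell
#!/usr/bin/env bash
# PIG_JAR is an assumed location; override it with the Pig jar on your machine.
PIG_JAR="${PIG_JAR:-/usr/lib/pig/pig.jar}"

build_udf_jar() {
  # $1 = directory holding UPPER.java, e.g. ~/exampleudf
  local src_dir="$1"
  if [ ! -f "$PIG_JAR" ]; then
    echo "Pig jar not found at $PIG_JAR" >&2
    return 1
  fi
  if [ ! -f "$src_dir/UPPER.java" ]; then
    echo "no UPPER.java in $src_dir" >&2
    return 1
  fi
  # Compile against the Pig classes; UPPER.class lands next to the source.
  javac -cp "$PIG_JAR" "$src_dir/UPPER.java" || return 1
  # Jar from the parent directory so the entry is exampleudf/UPPER.class,
  # which is the path the package declaration requires.
  ( cd "$src_dir/.." && jar -cf exampleudf.jar "$(basename "$src_dir")" )
}
```

Calling `build_udf_jar ~/exampleudf` should leave exampleudf.jar in your home directory; `jar -tf exampleudf.jar` is a quick way to confirm the class is inside.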

...

Info

For this compiled UDF library to be accessible to everyone, the jar file needs its HDFS permissions set to allow read access for all users.
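One way to set that, sketched as a small helper.  The function name and the path you pass in are placeholders of mine; `hadoop fs -chmod` and `hadoop fs -ls` are the standard Hadoop shell commands.

```shell
#!/usr/bin/env bash
share_udf_jar() {
  # $1 = HDFS path of the jar (whatever location you chose for it)
  local hdfs_jar="$1"
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop client not found on PATH" >&2
    return 1
  fi
  # 644 = owner read/write, group and others read-only.
  hadoop fs -chmod 644 "$hdfs_jar" || return 1
  # List the file so the new permission string is visible.
  hadoop fs -ls "$hdfs_jar"
}
```

Keep in mind that other users also need execute permission on every parent directory of the jar, otherwise they cannot reach the file even though it is readable.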

Now, create a file (example: typingText.txt) with some random text and get it into HDFS as shown below.
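If you just want something to feed the UDF, this sketch writes a couple of lines of made-up sample text and, when a Hadoop client is available, copies the file into your HDFS home directory.  The sample sentences are my own filler, not the contents of the original file.

```shell
#!/usr/bin/env bash
# Write a small sample file; the text is illustrative filler.
printf '%s\n' \
  'Now is the time for all good men to come to the aid of their country.' \
  'The quick brown fox jumps over the lazy dog.' > typingText.txt

# A relative HDFS path resolves to the current user's HDFS home directory.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -put typingText.txt typingText.txt
  hadoop fs -cat typingText.txt
else
  echo "hadoop client not found; typingText.txt was only created locally" >&2
fi
```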

...
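For reference, a script along the lines of test-UPPER.pig could look like the sketch below, written out with a heredoc.  The HDFS location in the REGISTER line is a hypothetical path, and `exampleudf.UPPER` assumes the class was compiled into a package named `exampleudf`; swap in your own values.

```shell
#!/usr/bin/env bash
# Writes a small Pig script into the current directory.
# The jar path below is a placeholder; use wherever you put exampleudf.jar.
cat > test-UPPER.pig <<'EOF'
REGISTER hdfs:///user/hue/exampleudf.jar;

-- Each line of the file becomes a single chararray field.
typing_line = LOAD 'typingText.txt' AS (row:chararray);

-- Call the UDF by its fully qualified class name.
upper_typing_line = FOREACH typing_line GENERATE exampleudf.UPPER(row);

DUMP upper_typing_line;
EOF
```

DUMP prints each result as a one-field tuple, which is why the output lines are wrapped in parentheses.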

Code Block
language: bash
[hue@sandbox ~]$ pig test-UPPER.pig
2014-03-29 01:20:40,579 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.0.2.0.6.0-76 (rexported) compiled Oct 17 2013, 20:44:07
2014-03-29 01:20:40,580 [main] INFO  org.apache.pig.Main - Logging error messages to: /usr/lib/hue/pig_1396081240577.log

... LOTS of lines removed ...

2014-03-29 01:21:12,501 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-03-29 01:21:12,502 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR COUNTRY.)
( NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR COUNTRY)
(. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR COUNTR)
(Y. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR COUNT)
(RY. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR COUN)
(TRY. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR COU)
(NTRY. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR CO)
(UNTRY. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR C)
(OUNTRY. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR )
(COUNTRY. NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THEIR)

It worked!  You did it!!  Everything has been CAPITALIZED!!!  Awesome!  Congratulations!!!!