This is the first installment of a three-part series showing alternative Hadoop & Big Data tools applied to Open Georgia Analysis.  The data we are working against looks like the following, which is included from the Format & Sample Data for Open Georgia wiki page.

...

Include Page: lestermartin:Format & Sample Data for Open Georgia

...

In this first installment, let's jump right in where Hadoop began: MapReduce.  After you visit Preparing Open Georgia Test Data and get some test data loaded into HDFS, clone my GitHub repo as referenced in GitHub > lestermartin > hadoop-exploration.  Once you have the code up in your favorite IDE (mine is IntelliJ on my MBPro), home in on the lestermartin.hadoop.exploration.opengeorgia package (details on the major MapReduce stereotypes are in that last link).  You can then build the jar file with Maven, or just grab hadoop-exploration-0.0.1-SNAPSHOT.jar.
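
If you'd rather build from a terminal than from the IDE, the clone-and-build step might look roughly like the sketch below. A few things here are assumptions on my part rather than anything confirmed above: the REPO_URL is inferred from the repo name, the pom.xml is assumed to sit at the root of the repo, and the run helper and DRY_RUN switch are hypothetical conveniences so you can preview the commands before executing them.

```shell
#!/bin/sh
# Sketch only: clone the repo and build the jar with Maven.
set -eu

# ASSUMPTION: URL inferred from "GitHub > lestermartin > hadoop-exploration".
REPO_URL="${REPO_URL:-https://github.com/lestermartin/hadoop-exploration.git}"
# 1 = only print each command; 0 = actually execute it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

build_jar() {
  run git clone "$REPO_URL"
  # ASSUMPTION: pom.xml lives at the top level of the cloned repo.
  run mvn -f hadoop-exploration/pom.xml clean package
}

build_jar
```

Run it once as-is to see what it would do, then again with DRY_RUN=0 to really clone and build.  With a standard Maven layout the jar would typically land under the project's target folder.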

As with all three editions of this blog posting series, let's use the Hortonworks Sandbox to run everything.  Make sure the hue user has a folder to hold your jar, and then put the jar there.

...