Hadoop & Big Data
My landing page for "all things Hadoop", Big Data, and related technologies. The content is rather unstructured right now, but I'll get there. Take a look at David Streever's Hadoop space.
I try to post regularly about Big Data technologies on my Professional Blog on this site; just look for content with the "hadoop", "spark" and/or "big_data" label as shown below. Feel free to offer up thoughts on what my Upcoming Blog Posts should be about.
NOTE: The remainder of this document has no real structure. It is mostly a parking lot of ideas and links that I will SOMEDAY come back to and organize. Thanks, Lester Martin.
Best Practices for 3rd Party JARs
Figure out what the best practice is. There are some notes at http://stackoverflow.com/questions/16825821/parsing-json-input-in-hadoop-java to get this topic going.
Generic Convert Uncompressed Text File to Snappy Encoded Sequence File
Based on thoughts from http://blog.cloudera.com/blog/2011/09/snappy-and-hadoop/ and http://stackoverflow.com/questions/5377118/how-to-convert-txt-file-to-hadoops-sequence-file-format, write a simple utility that converts text files to sequence files and compresses them with Snappy. Or... am I overthinking this, and is there a far easier way to do it?
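One candidate for the "far easier way" is a map-only MapReduce job that reads with TextInputFormat and writes with SequenceFileOutputFormat, using the stock identity Mapper so no custom map code is needed. What follows is only a minimal sketch, assuming the Hadoop 2.x "mapreduce" API and a cluster where the native Snappy libraries are installed; the class name TextToSnappySequenceFile, the buildJob helper and the job name are made up for illustration. Keys end up as the byte offset of each line (LongWritable) and values as the line itself (Text).

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical utility class; the name is an assumption, not an existing tool.
public class TextToSnappySequenceFile {

    // Builds (but does not submit) the conversion job.
    static Job buildJob(Configuration conf, String in, String out) throws IOException {
        Job job = Job.getInstance(conf, "text-to-snappy-seqfile");
        job.setJarByClass(TextToSnappySequenceFile.class);

        // Read plain text: key = byte offset of the line, value = the line.
        job.setInputFormatClass(TextInputFormat.class);

        // The base Mapper passes records through unchanged; zero reducers
        // makes this a map-only job that writes mapper output directly.
        job.setMapperClass(Mapper.class);
        job.setNumReduceTasks(0);

        // Write the same key/value pairs out as a sequence file.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // Block-level Snappy compression (assumes native Snappy is available).
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
        SequenceFileOutputFormat.setOutputCompressionType(
                job, SequenceFile.CompressionType.BLOCK);

        FileInputFormat.addInputPath(job, new Path(in));
        FileOutputFormat.setOutputPath(job, new Path(out));
        return job;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: TextToSnappySequenceFile <input> <output>");
            System.exit(2);
        }
        Job job = buildJob(new Configuration(), args[0], args[1]);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar (my-utils.jar here is a placeholder name), this would be launched along the lines of "hadoop jar my-utils.jar TextToSnappySequenceFile /path/in /path/out". BLOCK compression is chosen over RECORD because it compresses many records together, which usually gives better ratios for line-oriented text; that choice is a judgment call, not something the linked posts mandate.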
Hadoop in the Small
Of course... I want, no NEED, to build a Hadoop cluster with Raspberry Pi devices, as seen at the following URLs:
- http://www.raspberrypi.org/phpBB3/viewtopic.php?f=41&t=37190
- http://raspberrypicloud.wordpress.com/2013/04/25/getting-hadoop-to-run-on-the-raspberry-pi/
- http://itbleedingedge.blogspot.com/2013/02/for-my-holiday-project-this-year-i.html#!/2013/02/for-my-holiday-project-this-year-i.html
- http://blog.ittoby.com/2013/08/starting-small-set-up-hadoop-compute.html
Maybe I could do it with Java on the BeagleBoard? Or maybe just write a very straightforward post like http://java.dzone.com/articles/getting-hadoop-and-running.
Bureau of Labor Statistics Example
How about a project using the BLS OES (Occupational Employment Statistics) datasets?
Integration with MongoDB
Investigate the MongoDB Connector for Hadoop, as called out at http://www.mongodb.com/press/integration-hadoop-and-mongodb-big-data%E2%80%99s-two-most-popular-technologies-gets-significant.