My landing page for "all things Hadoop", Big Data, and related technologies.  The content is rather unstructured right now, but I'll get there.  Take a look at David Streever's Hadoop space.

NOTE: The remainder of this document has no real structure; it is a parking lot of ideas and links that I will SOMEDAY come back to and organize.  Thanks, Lester Martin.

Best Practices for 3rd Party JARs

Figure out what the best practice is.  Some notes at to get this topic going.

Generic Convert Uncompressed Text File to Snappy Encoded Sequence File

Based on thoughts from and write a simple utility that converts text files to sequence files and compresses them with Snappy.  Or... am I overthinking this and there is a far easier way to do this?
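A minimal sketch of what such a utility might look like, assuming the Hadoop 2.x or later client libraries are on the classpath and that Snappy support is available to the JVM (older Hadoop releases need the native library). The class name TextToSnappySequenceFile and the choice of line number as key and line text as value are illustrative assumptions, not anything settled above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical utility: copies a text file line by line into a
// block-compressed SequenceFile using the Snappy codec.
public class TextToSnappySequenceFile {

    // Returns the number of records (lines) written.
    public static long convert(Configuration conf, Path input, Path output) throws IOException {
        FileSystem fs = input.getFileSystem(conf);
        // Instantiating through ReflectionUtils hands the codec its Configuration.
        CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, conf);

        LongWritable key = new LongWritable();
        Text value = new Text();
        long lineNumber = 0;

        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(fs.open(input), StandardCharsets.UTF_8));
             SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                 SequenceFile.Writer.file(output),
                 SequenceFile.Writer.keyClass(LongWritable.class),
                 SequenceFile.Writer.valueClass(Text.class),
                 SequenceFile.Writer.compression(CompressionType.BLOCK, codec))) {
            String line;
            while ((line = reader.readLine()) != null) {
                key.set(lineNumber++);   // assumption: key is the zero-based line number
                value.set(line);         // value is the raw line text
                writer.append(key, value);
            }
        }
        return lineNumber;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: TextToSnappySequenceFile <input> <output>");
            System.exit(1);
        }
        long records = convert(new Configuration(), new Path(args[0]), new Path(args[1]));
        System.out.println("Wrote " + records + " records");
    }
}
```

This reads the file in a single process, so it only suits modest inputs. For large files the "far easier way" may well be a map-only MapReduce job that writes through SequenceFileOutputFormat with Snappy compression enabled, which would need no custom writer code at all.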

Hadoop in the Small

Of course... I want, no NEED, to build a Hadoop cluster with Raspberry Pi devices, as seen at the following URLs:

Maybe I could do it with Java on the BeagleBoard?  Or maybe just write a very straightforward post like

Bureau of Labor Statistics Example

How about a project using the BLS OES datasets?

Integration with MongoDB

Investigate the MongoDB Connector for Hadoop, as called out at