Hadoop & Big Data

My landing page for "all things Hadoop", Big Data, and related technologies.  The content is rather unstructured right now, but I'll get there.  Take a look at David Streever's Hadoop space.

 

I try to post regularly to my Professional Blog on this site about Big Data technologies; just look for content with the "hadoop", "spark", and/or "big_data" label.  Feel free to suggest topics for my Upcoming Blog Posts.


NOTE: The remainder of this document has no real structure; it is mostly a parking lot of ideas and links that I will SOMEDAY come back to and organize.  Thanks, Lester Martin.

Best Practices for 3rd Party JARs

Figure out what the best practice is.  Some notes at http://stackoverflow.com/questions/16825821/parsing-json-input-in-hadoop-java to get this topic going.

Generic Utility to Convert Uncompressed Text Files to Snappy-Compressed Sequence Files

Based on thoughts from http://blog.cloudera.com/blog/2011/09/snappy-and-hadoop/ and http://stackoverflow.com/questions/5377118/how-to-convert-txt-file-to-hadoops-sequence-file-format, write a simple utility that converts text files to sequence files and compresses them with Snappy.  Or... am I overthinking this, and is there a far easier way to do it?

Hadoop in the Small

Of course... I want, no, NEED, to build a Hadoop cluster with Raspberry Pi devices as seen in the following URLs:

Maybe I could do it with Java on the BeagleBoard?  Or maybe just write a very straightforward post like http://java.dzone.com/articles/getting-hadoop-and-running.

Bureau of Labor Statistics Example

How about a project using the BLS OES datasets?

Integration with MongoDB

Investigate the MongoDB Connector for Hadoop, as called out at http://www.mongodb.com/press/integration-hadoop-and-mongodb-big-data%E2%80%99s-two-most-popular-technologies-gets-significant.

Related content

Links & Cheat Sheets for Hadoop & Big Data
got a hadoop question? (then ask lester!)
need an overview of hadoop? (i need some reviewers)
Hadoop Distribution Components & Versions
Build a Virtualized 5-Node Hadoop 2.0 Cluster
hadoop security (it's all about layering)