Ideas for upcoming blog posts.

  • Hive's export/import operations (note to self; tracking in O.F.)
  • Test drive of Hive 14's CRUD operations (note to self; tracking in O.F.)
  • Local DataNode disk balancing options (note to self; tracking in O.F.)
  • Typical data ingestion workflow (note to self; tracking in O.F.)
    • Sqoop'ing some data
    • Transformation/enrichment with Pig
    • Accessing it from Hive
    • Pulling it all together with Oozie
  • Best practices for the location, naming, and structure of code & config across all components of the workflow
    • Maybe a redo of this workflow using Cascading?
  • Recap of the Summit preso, if only to provide links to the deck and recording; loosely based on http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/
  • Using snapshots with archiving solution
  • Support Hadoop vendors like you'd support PBS
  • Change hostnames & IPs of all hosts in a HDP cluster
  • HBase via JDBC (using Phoenix)
  • HBase via JDBC (part deux; just using Hive)
  • Pig schema reuse (and why it doesn't work all that well)
  • Connecting to SparkSQL, possibly as suggested here.

If you have some things you'd like to see, please share them in the comments.