General Links
Hortonworks Product Documentation
Hadoop Distribution Components & Versions
Any (and all) of Lester's blog posts & wiki pages with 'hadoop' label
HDP Developer: Apache Pig and Hive
Course Labs
Sqoop
User Guide
Flume
User Guide
Pig
Pig Latin Basics
Built In Functions
Hive
Project Wiki
stinger.next to the rescue (but you do have stinger.NOW tuning options available, well, "now")
INSERT/UPDATE/DELETE, ACID & Transactions
ACID and Transactions in Hive
Alan Gates' 2015 Hadoop Summit "Adding Insert, Update, and Delete to Hive" talk
Video
Slides
ORC
Project Page
Hive's Wiki Page
Oozie
Documentation Home
HDP Developer: Storm and Trident Fundamentals
Course Labs
Storm
Concepts
JavaDoc
DRPC Server
Storm on YARN via Slider (HDP 2.3)
Fault-tolerant Nimbus
Guaranteeing Message Processing
Kafka
HDP Tutorial: Transporting Real-Time Event Stream with Apache Kafka
Some insight into why it is so fast (i.e., using memory as a cache for writes to disk) is available in this SlideShare preso
HDP Developer: Java
MapReduce Design Patterns book
HDP Developer: Custom YARN Application
Course Labs
Writing YARN Applications
YARN JavaDoc (Hadoop 2.4.x, since the current rev uses HDP 2.1.x)
Client API
AppMaster API
Slider
Apache Site
2015 Hadoop Summit Presentation: Authoring and Hosting Applications on YARN using Slider
YouTube
SlideShare
HDP Operations: Install and Manage with Apache Hadoop
HDFS
FS Commands
Heterogeneous Storage
Storage Types and Storage Policies
Use Case from eBay's presentation at 2015 Hadoop Summit
Video
Slides
High Availability
HDFS HA w/QJM
Snapshots
MapReduce
Combiner and Partitioner SlideShare deck
Solr
HDP Product Documentation page
Hardware and General Configuration
a robust set of hadoop master nodes (it is hard to swing it with two machines)
hadoop worker node configuration recommendations (1-2-10)
HDP Operations: Apache HBase Advanced Management
Reference Guide
HBase Shell Commands
To introduce "Polyglot Persistence" – what the world needs now is another nosql preso (like i need a hole in my head)
A pretty solid walk-through of the architecture and major moving parts is presented here, but do overlook the source and the eventual plug for the non-ASF MapR-DB
Range Prefix Scans