...
Actually, we'll need at least three classes, since we also need a class that defines and kicks off the MapReduce job. For our use case, we'll go back to square one and implement the quintessential "Word Count" example. For anyone not familiar with it, we want to count how many times each word occurs in a body of text, such as Shakespeare's Sonnets. Now that we have some test data, let's get into the code.
Writing The Code
Info |
---|
These source files can be retrieved from GitHub at https://github.com/lestermartin/hadoop-exploration/tree/master/src/main/csharp/wordcount |
Mapper
The Hadoop SDK gives us a class called `MapperBase` that delivers the input data to us one line at a time, along with a `MapperContext` object that we can use to emit back our 0..n KVPs for the Mapper contract. The code below shows a simple implementation of this activity.
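Setting the SDK types aside for a moment, the contract itself (one input line in, zero or more key-value pairs out) can be sketched in plain C#. This is only an illustration under stated assumptions: `WordCountMapSketch` and `MapLine` are hypothetical names, the sketch does not use `MapperBase` or `MapperContext`, and lower-casing the words is a choice made here rather than something the SDK requires.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the mapper contract; it does not use the
// Hadoop SDK's MapperBase or MapperContext types.
public static class WordCountMapSketch
{
    // Takes one line of input and yields zero or more (word, "1") pairs.
    public static IEnumerable<KeyValuePair<string, string>> MapLine(string inputLine)
    {
        // A null separator array means "split on whitespace";
        // a blank line therefore yields no pairs at all.
        string[] words = inputLine.Split(
            (char[])null, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // Assumption: lower-case so "Love" and "love" share one key.
            yield return new KeyValuePair<string, string>(
                word.ToLowerInvariant(), "1");
        }
    }
}
```

Passing in the line "To be or not to be" yields six pairs, with the keys "to" and "be" each appearing twice; the mapper makes no attempt to total them. Punctuation is not stripped in this sketch, so "love," and "love" would come out as different keys.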
...