This wiki page is associated with the Open Georgia Analysis effort. It covers the Java code found on GitHub > lestermartin > hadoop-exploration.
...
Tip |
---|
This content has moved to https://github.com/lestermartin/hadoop-exploration/tree/master/src/main/java/lestermartin/hadoop/exploration/opengeorgia |
...
This code solves the Simple Open Georgia Use Case using the following classes.
The Mapper
The TitleMapper class first takes each row of CSV data (see Format & Sample Data for Open Georgia for more details) that it is passed during invocation of the map() method and constructs a SalaryReport object using the crude & primitive parsing logic of SalaryReportBuilder.
The mapper then skips the record if it doesn't meet the basic Simple Open Georgia Use Case criteria. If the record gets past this initial filtering, the mapper emits a key-value pair (KVP) of the job title and the salary value that goes along with it.
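The parse, filter, and emit steps described above can be sketched in plain Java without any Hadoop dependencies. This is only an illustration and not the actual TitleMapper code: the Report class, the three-column "title,salary,year" row layout, and the year-based filter are all hypothetical placeholders (the real format is described in Format & Sample Data for Open Georgia, and the real criteria in the Simple Open Georgia Use Case).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class MapperSketch {
    /** Hypothetical, simplified stand-in for SalaryReport. */
    static final class Report {
        final String title;
        final double salary;
        final int year;

        Report(String title, double salary, int year) {
            this.title = title;
            this.salary = salary;
            this.year = year;
        }
    }

    /** Naive parse of a placeholder "title,salary,year" row; quoted fields are not handled. */
    static Report parse(String csvRow) {
        String[] f = csvRow.split(",");
        return new Report(f[0].trim(),
                          Double.parseDouble(f[1].trim()),
                          Integer.parseInt(f[2].trim()));
    }

    /** Mirrors the map() flow: parse the row, drop it if it fails the criteria, else emit (title, salary). */
    static void map(String csvRow, Predicate<Report> criteria,
                    List<Map.Entry<String, Double>> out) {
        Report report = parse(csvRow);
        if (!criteria.test(report)) {
            return; // record does not meet the use case criteria
        }
        out.add(new SimpleEntry<>(report.title, report.salary));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Double>> out = new ArrayList<>();
        Predicate<Report> only2010 = r -> r.year == 2010; // illustrative filter only
        map("TEACHER,45000.50,2010", only2010, out);
        map("TEACHER,47000.00,2011", only2010, out);
        System.out.println(out);
    }
}
```

In a real Hadoop mapper the emit step would be a call to context.write() with Writable key and value types rather than an append to a list.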
The Reducer
SalaryStatisticsReducer simply calculates the total number of people for the given job title along with the minimum/maximum/average statistics.
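The per-title calculation can be illustrated with a single pass over the salaries grouped under one job title. This is a minimal plain-Java sketch, not the actual SalaryStatisticsReducer; the class name SalaryStatsSketch and its fields are hypothetical.

```java
import java.util.Arrays;

final class SalaryStatsSketch {
    final long count;
    final double min;
    final double max;
    final double average;

    private SalaryStatsSketch(long count, double min, double max, double average) {
        this.count = count;
        this.min = min;
        this.max = max;
        this.average = average;
    }

    /** Computes count, minimum, maximum, and average in one pass over the values for one title. */
    static SalaryStatsSketch of(Iterable<Double> salaries) {
        long count = 0;
        double sum = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double salary : salaries) {
            count++;
            sum += salary;
            min = Math.min(min, salary);
            max = Math.max(max, salary);
        }
        if (count == 0) {
            throw new IllegalArgumentException("no salaries for this title");
        }
        return new SalaryStatsSketch(count, min, max, sum / count);
    }

    public static void main(String[] args) {
        SalaryStatsSketch s = of(Arrays.asList(30000.0, 50000.0, 40000.0));
        System.out.printf("count=%d min=%.2f max=%.2f avg=%.2f%n",
                          s.count, s.min, s.max, s.average);
    }
}
```

A single pass suits a reducer because Hadoop hands the values to reduce() as an Iterable that is meant to be traversed once.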
The Driver
GenerateStatistics pulls it all together so the MapReduce job can be run.
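The real driver configures and submits a Hadoop job. To show only the data flow it wires together, the sketch below simulates the map, group-by-title, and reduce stages in memory using plain Java. It is not the GenerateStatistics code: the class name LocalPipelineSketch, the "title,salary,year" row layout, and the year filter are hypothetical placeholders.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LocalPipelineSketch {
    /**
     * Filters rows (map stage), groups salaries by title (shuffle stage),
     * and accumulates count/min/max/average per title (reduce stage).
     */
    static Map<String, DoubleSummaryStatistics> run(List<String> rows, int wantedYear) {
        Map<String, DoubleSummaryStatistics> byTitle = new TreeMap<>();
        for (String row : rows) {
            String[] f = row.split(","); // naive split; quoted fields are not handled
            String title = f[0].trim();
            double salary = Double.parseDouble(f[1].trim());
            int year = Integer.parseInt(f[2].trim());
            if (year != wantedYear) {
                continue; // illustrative filter only
            }
            byTitle.computeIfAbsent(title, t -> new DoubleSummaryStatistics())
                   .accept(salary);
        }
        return byTitle;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "TEACHER,40000,2010",
            "TEACHER,50000,2010",
            "PRINCIPAL,90000,2010",
            "TEACHER,99999,2011");
        run(rows, 2010).forEach((title, stats) ->
            System.out.println(title + " -> " + stats));
    }
}
```

In an actual Hadoop job the grouping is done by the framework between the map and reduce phases, so the driver only has to name the mapper, the reducer, the key and value types, and the input and output paths.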
...