This is really a general Java question. It can be done programmatically (see hint), but there is no basic "by convention" approach where you create something like a properties.config file, put the KVPs in it, and have them ~automagically~ picked up.  Well... not for Java main()'s...  So, for just firing something like this off from the command line, a simple wrapper script would do quite nicely.  You could then parametrize all the other values, such as the source and target directories.
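As a rough sketch of the programmatic route, the snippet below reads a properties-style file and turns each KVP into a -Dkey=value argument that a wrapper script could splice into the command line. The class name PropsToArgs, the method name toGenericOptions, and the idea of printing the arguments for a script to consume are all assumptions made for illustration. Whether the target program honors -D arguments in that position is also an assumption to verify.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper: not part of Hadoop or Oozie.
class PropsToArgs {

    /** Turns every key=value pair read from the source into a "-Dkey=value" argument. */
    static List<String> toGenericOptions(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);

        List<String> args = new ArrayList<>();
        // Sort the keys so the output order is predictable.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            args.add("-D" + key + "=" + props.getProperty(key));
        }
        return args;
    }

    public static void main(String[] argv) throws IOException {
        // argv[0] is the path to the properties file, e.g. properties.config
        try (Reader in = Files.newBufferedReader(Paths.get(argv[0]))) {
            System.out.println(String.join(" ", toGenericOptions(in)));
        }
    }
}
```

A wrapper script could capture the printed line and pass it along with the remaining parameters, such as the source and target directories.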

Now, if you were using Oozie, you have some other options.  As https://cwiki.apache.org/confluence/display/OOZIE/Java+Cookbook shows, there is a <java-opts> tag that can be used for parameters such as this when using the "Java action".  The example they show is just for a memory setting (e.g. -Xms512m), but http://jayatiatblogs.blogspot.com/2011/05/building-java-action-in-oozie.html shows an example of passing in the options -Denv=stg -DPP=DB_PASSPHRASE.  Furthermore, https://issues.cloudera.org/browse/HUE-1030 (notice from the URL that Hue is a Cloudera project that uses the Apache License, but is not a true ASF project ;-) shows how the <java-opts> and <arg> tags work within a single action.
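To make the difference between the two tags concrete, here is a minimal sketch of what the main class on the receiving end might look like. A -D option given to the JVM shows up as a system property, while <arg> values arrive as ordinary main() arguments. The class name EnvAwareMain and the "dev" fallback are made up for illustration; only the env key comes from the -Denv=stg example above.

```java
import java.util.Arrays;

// Hypothetical main class for a Java action; the name is illustrative only.
class EnvAwareMain {

    /** Returns the "env" system property, or the fallback when it is not set. */
    static String environment(String fallback) {
        return System.getProperty("env", fallback);
    }

    public static void main(String[] args) {
        // A -Denv=stg JVM option is readable as a system property.
        System.out.println("env  = " + environment("dev"));
        // Plain arguments are delivered in the args array.
        System.out.println("args = " + Arrays.toString(args));
    }
}
```

Run locally, something like java -Denv=stg EnvAwareMain in out should report stg for the property and list in and out as the arguments.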

All that said, teragen/sort are MapReduce Java applications, so Oozie can leverage the MapReduce action described at https://cwiki.apache.org/confluence/display/OOZIE/Map+Reduce+Cookbook.  You'll notice the generic <java-opts> tag is gone, but there is a <property> tag within <configuration>.  The example there sets the mapred.job.queue.name KVP, which is one of the settings being asked about, and the same pattern can be used to pass in the number of mappers and reducers, too.

Maybe I'll start answering all of my email this way!