Site Map
A directory tree view of all the pages on this wiki.
All blog posts; most recent first.
Blog Posts
-
moving my tech blog (already missing confluence)
created by
Dec 18, 2020
-
hive delta file compaction (minor and major)
created by
Dec 23, 2019
-
hive acid transactions with partitions (a behind the scenes perspective)
created by
Dec 22, 2019
-
viewing the content of ORC files (using the Java ORC tool jar)
created by
Dec 12, 2019
-
topology supervision features of streaming frameworks (or lack thereof)
created by
Mar 21, 2019
-
are partially-written hdfs files accessible? (not exactly, but much more yes than I previously thought)
created by
Mar 21, 2019
-
use spark to calculate salary statistics for georgia educators (the fourth book of the trilogy)
created by
Mar 09, 2019
-
building a c# storm topology (yes, it is a jvm-based framework)
created by
Jun 23, 2018
-
learning something new every day (seems hdfs is not as immutable as i thought)
created by
Oct 03, 2017
-
accepting "best answer" on hcc (it is the ~right~ thing to do)
created by
Apr 10, 2017
-
NoClassDefFoundError for Log4jLoggerFactory on hdp 2.5.3 when running the KafkaSpout in your topology? (how's that for a title?)
created by
Mar 10, 2017
-
initial hbase grants on a new secure hadoop cluster (without ranger)
created by
Mar 04, 2017
-
opening up a port on centos 7 firewall (using firewalld)
created by
Mar 02, 2017
-
my talk at devnexus (links to video and preso)
created by
Feb 26, 2017
-
why you should be a supporting "member" of oss (hey, it works for npr)
created by
Dec 28, 2016
-
how to counteract ageism? (maintain relevance!)
created by
Dec 28, 2016
-
don't be in such a hurry to change your job (the suck continuum explained)
created by
Oct 20, 2016
-
the agile manifesto (it is still a good idea)
created by
Aug 12, 2016
-
joining multiple datasets with pig (i/o courtesy of hcatloader & hcatstorer)
created by
Aug 08, 2016
-
storing dynamically created file names with pig (piggybank's multistorage to the rescue)
created by
Jul 26, 2016
-
unboxing my new little box (my first intel nuc)
created by
Jun 01, 2016
-
why spark's mapPartitions transformation is faster than map (calls your function once/partition, not once/element)
created by
May 19, 2016
-
performing a non-root ambari install (with hortonworks admin 1 course)
created by
Apr 19, 2016
-
need an overview of hadoop? (i need some reviewers)
created by
Mar 17, 2016
-
some "mandatory" training really is MANDATORY (termination threat is a clue)
created by
Feb 04, 2016
-
novel thoughts on national pride (ask not...)
created by
Jan 14, 2016
-
never try (well... if your employer doesn't value you or your opinion)
created by
Jan 12, 2016
-
authoring presentations with markdown (deckset gets you pretty far)
created by
Dec 09, 2015
-
viewing diffs between powerpoint decks (with a little help from adobe)
created by
Oct 16, 2015
-
transitioned to training (and loving it)
created by
Oct 15, 2015
-
hadoop mount points (more art than science)
created by
Sept 02, 2015
-
trying out hive testbench on hdp sandbox (and packaging it up for deployment elsewhere)
created by
Jul 07, 2015
-
installing hdp 2.2 with ambari 2.0 (moving to the amazon cloud)
created by
Jun 30, 2015
-
presenting at hadoop summit (archiving evolving databases in hive)
created by
Jun 11, 2015
-
installing hdp 2.2 with ambari 2.0 (moving to the azure cloud)
created by
May 06, 2015
-
connecting dbvisualizer to hive (running on hdp 2.2)
created by
Apr 10, 2015
-
got a hadoop question? (then ask lester!)
created by
Mar 28, 2015
-
took the pig/hive test (got a shiny new certificate)
created by
Mar 28, 2015
-
declaring work is beneath you (probably not the best course of action)
created by
Feb 25, 2015
-
os patching your hadoop cluster (pre & post rolling upgrades)
created by
Feb 13, 2015
-
parameterizing mapred.* properties (cli vs oozie)
created by
Feb 05, 2015
-
hadoop mini smoke test (VERY mini)
created by
Jan 10, 2015
-
a lightning quick tutorial on pdsh (for when you need to run the same command on many machines)
created by
Jan 09, 2015
-
help me go to belgium (not asking for money, just a couple of votes)
created by
Jan 08, 2015
-
improving datanode resiliency (it's all about the settings)
created by
Dec 02, 2014
-
simple hadoop cluster user provisioning process (simple = w/o pam or kerberos)
created by
Nov 24, 2014
-
hadoop worker node configuration recommendations (1-2-10)
created by
Nov 13, 2014
-
a patent for the "idea" of ingesting data into hadoop (is it really "sponge worthy"?)
created by
Oct 17, 2014
-
changes to hive's decimal datatype (it could cost you lots of pennies)
created by
Oct 15, 2014
-
stinger.next to the rescue (but you do have stinger.NOW tuning options available, well, "now")
created by
Oct 10, 2014
-
a robust set of hadoop master nodes (it is hard to swing it with two machines)
created by
Sept 15, 2014
-
obtained hortonworks' apache hadoop administrator certification (finally)
created by
Sept 05, 2014
-
hadoop security (it's all about layering)
created by
Aug 25, 2014
-
hadoop superuser (you can have more than 'hdfs')
created by
Aug 13, 2014
-
hadoop demystified presentation (with atlanta's .net user group)
created by
Jul 30, 2014
-
hadoop streaming with .net map reduce api (executing on hdp for windows)
created by
Jul 28, 2014
-
installing hdp on windows (and then running something on it)
created by
Jul 23, 2014
-
small files and hadoop's hdfs (bonus: an inode formula)
created by
Jul 11, 2014
-
volvo thinks like me (they've got two of my mantras covered)
created by
Jun 17, 2014
-
setting up hdp 2.1 with non-standard users for hadoop services (why not use a non-standard user for ambari, too)
created by
May 07, 2014
-
using your mac to install a virtualized hadoop cluster? (then set up a local repo on it)
created by
May 06, 2014
-
use hive to calculate salary statistics for georgia educators (third of a three-part series)
created by
Apr 30, 2014
-
use pig to calculate salary statistics for georgia educators (second of a three-part series)
created by
Apr 30, 2014
-
use mapreduce to calculate salary statistics for georgia educators (first of a three-part series)
created by
Apr 30, 2014
-
hadoop component versions by distributions (the open source ones)
created by
Apr 29, 2014
-
feeling a bit prolific (or maybe i'm just a smart aleck)
created by
Apr 14, 2014
-
manually installing hue (on my virtualized 5-node cluster)
created by
Apr 08, 2014
-
create and share a hive udf (the cli is your friend)
created by
Mar 29, 2014
-
create and share a pig udf (anyone can do it)
created by
Mar 29, 2014
-
stopping oozie from limiting the number of reducers on your hive action (just add some more xml)
created by
Mar 20, 2014
-
how do i load a fixed-width formatted file into hive? (with a little help from pig)
created by
Mar 06, 2014
-
visiting the computer history museum (yes, i'm a geek)
created by
Mar 01, 2014
-
confluence column width hack (where have you been!?!?)
created by
Jan 30, 2014
-
what's after the hortonworks sandbox? (a 5-node cluster!)
created by
Jan 20, 2014
-
building a virtualized 5-node HDP 2.0 cluster (all within a mac)
created by
Jan 19, 2014
-
disruptive possibilities (the rise of platform architecture)
created by
Dec 30, 2013
-
foxtrot and java (it still cracks me up)
created by
Dec 18, 2013
-
hadoop world 2013 (reflections from nyc)
created by
Nov 05, 2013
-
sometimes the pig walks to slaughter because he knows it is better for the farmer (or the team)
created by
Oct 24, 2013
-
too big to ignore (too boring to read)
created by
Oct 23, 2013
-
cat herding (bringing together information, ideas, and technologies)
created by
Oct 09, 2013
-
agile = more meetings? (wtf!)
created by
Oct 03, 2013
-
taking sides (finally)
created by
Aug 22, 2013
-
scaled agile framework (please share your experiences)
created by
Aug 12, 2013
-
fancy yourself a data scientist? (then show me the money!)
created by
Aug 08, 2013
-
fruITion and recrEAtion (a double-header book review)
created by
Jul 07, 2013
-
hadoop yarn (in a nutshell)
created by
Jun 27, 2013
-
just reboot it (when was that ever a good idea?)
created by
Jun 19, 2013
-
assholes and prima donnas (you need a few)
created by
Jun 03, 2013
-
i'm a certified hadoop developer (so, what does that mean?)
created by
Apr 11, 2013
-
hey, i'm here for you (but, you need to show up)
created by
Apr 09, 2013
-
published again (well… not really…)
created by
Mar 24, 2013
-
what the world needs now is another nosql preso (like i need a hole in my head)
created by
Feb 07, 2013
-
do we really have bugs? (did he really ask that?)
created by
Feb 03, 2013
-
how projects really start (it's all about the money)
created by
Jan 27, 2013
-
leadership principles (shouldn't the rangers know?)
created by
Jan 03, 2013
-
generalizing specialists (or is it specializing generalists?)
created by
Dec 20, 2012
-
we need collaborators (not a chief collaboration officer)
created by
Dec 03, 2012
-
emailing manifestos (just don't do it)
created by
Nov 28, 2012
-
lucky (it doesn't mean privileged)
created by
Nov 26, 2012
-
are you a mort, elvis or einstein (or are these labels nonsense)?
created by
Nov 20, 2012
-
enterprise 2.0 book review (using web 2.0 technologies within organizations)
created by
Nov 18, 2012
-
give as few orders as possible (encourage autonomy and responsibility)
created by
Nov 03, 2012