Hive Cheat Sheet
ORC
File dump utility; https://stackoverflow.com/questions/20847024/how-to-see-contents-of-hive-orc-files-in-linux
HCC Best Practices Article; https://community.hortonworks.com/articles/75501/orc-creation-best-practices.html
Benchmarking; https://thisdataguy.com/2018/12/21/orc-benchmarking/
LLAP
Community Connection Deep Dive; https://community.cloudera.com/t5/Community-Articles/Hive-LLAP-deep-dive/ta-p/248893
Hortonworks LLAP Connector; https://github.com/hortonworks-spark/spark-llap (not yet in either Hive or Spark ASF project)
Memory / Configuration Calculator; https://github.com/dstreev/hive_llap_calculator
JOINS
Old Facebook-Engineering post; https://www.facebook.com/notes/facebook-engineering/join-optimization-in-apache-hive/470667928919/
Hive Warehouse Connector (HWC)
HDP 3.1.5 Docs; https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/integrating-hive/content/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
HDP < 3.1.5 (separate catalogs); http://docs.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP3/HDP-3.0.0/integrating-hive/hive_integrating_hive_and_bi.pdf
Druid
External Resources
Date Functions write-up
David Streever's Tuning Hive-Tez article
Popular Config Properties (can also be passed as "hive --hiveconf name=value"; repeat --hiveconf once per property)
These entries can also go in a ~/.hiverc file, which the Hive CLI runs at startup.
Configuration Property | Notes |
---|---|
set hive.cli.print.header=true; | Shows column names at top of query results |
set hive.cli.print.current.db=true; | Shows the current database in the command prompt |
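
As a rough sketch of the ~/.hiverc approach, the snippet below writes the two settings from the table into a hiverc-style file. `HIVERC_PATH` is a hypothetical variable introduced here for illustration; it defaults to a scratch file so that an existing `~/.hiverc` is not overwritten.

```shell
# HIVERC_PATH is an assumption of this sketch, not something Hive reads.
# Set it to "$HOME/.hiverc" to have the Hive CLI pick the file up at startup.
HIVERC_PATH="${HIVERC_PATH:-$(mktemp)}"

# Quoted heredoc delimiter: the lines are written literally, with no shell expansion.
cat > "$HIVERC_PATH" <<'EOF'
set hive.cli.print.header=true;
set hive.cli.print.current.db=true;
EOF

# Equivalent one-off form on the command line (one --hiveconf per property):
#   hive --hiveconf hive.cli.print.header=true --hiveconf hive.cli.print.current.db=true
cat "$HIVERC_PATH"
```

The file form suits settings wanted in every session; the `--hiveconf` form suits a single invocation.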
Other Stuff
LLAP (Low Latency Analytical Processing, aka Live Long And Process)
Random Notes
Limiting HS2 Concurrent Connections
load data local inpath '/home/user1/test.txt' into table my_table;
Connecting via Beeline
In two steps (start Beeline, then connect from its prompt)
/usr/hdp/current/hive/bin/beeline
!connect jdbc:hive2://$HIVE_SERVER_FQDN:10000 $HIVE_USER password org.apache.hive.jdbc.HiveDriver
As a single line
beeline -u jdbc:hive2://$HIVE_SERVER_FQDN:10000 -n $HIVE_USER
To quit, just type
!q
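
For scripted use, Beeline can also run a statement and exit without an interactive session. The sketch below assumes the same `$HIVE_SERVER_FQDN` and `$HIVE_USER` placeholders as the single-line command; the default values and the `RUN_BEELINE` guard are hypothetical, added only so the snippet is safe to run on a machine without a reachable HiveServer2.

```shell
# Placeholder values; both defaults are assumptions for illustration.
HIVE_SERVER_FQDN="${HIVE_SERVER_FQDN:-hs2.example.com}"
HIVE_USER="${HIVE_USER:-$(id -un)}"
JDBC_URL="jdbc:hive2://${HIVE_SERVER_FQDN}:10000"

# -e runs the quoted statement and then exits, so no "!q" is needed.
# RUN_BEELINE is a hypothetical opt-in switch for this sketch only.
if [ -n "${RUN_BEELINE:-}" ] && command -v beeline >/dev/null 2>&1; then
  beeline -u "$JDBC_URL" -n "$HIVE_USER" -e 'show databases;'
else
  echo "would run: beeline -u $JDBC_URL -n $HIVE_USER -e 'show databases;'"
fi
```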
Setting hive properties via JDBC connection string; https://community.hortonworks.com/articles/60309/working-with-variables-in-hive-hive-shell-and-beel.html
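
A minimal sketch of that connection-string form: in a HiveServer2 JDBC URL, Hive configuration properties follow a `?` and Hive variables follow a `#`, with multiple entries separated by `;`. The helper name `hive_jdbc_url`, the host name, and the property values below are all hypothetical and chosen only for illustration.

```shell
# hive_jdbc_url is a hypothetical helper, not part of Hive or Beeline.
# Usage: hive_jdbc_url HOST [PORT] [DB] [CONF_LIST] [VAR_LIST]
#   CONF_LIST: "prop1=val1;prop2=val2"  (appended after "?")
#   VAR_LIST:  "var1=val1;var2=val2"    (appended after "#")
hive_jdbc_url() {
  host="$1"
  port="${2:-10000}"
  db="${3:-default}"
  conf_list="${4:-}"
  var_list="${5:-}"

  url="jdbc:hive2://${host}:${port}/${db}"
  if [ -n "$conf_list" ]; then url="${url}?${conf_list}"; fi
  if [ -n "$var_list" ]; then url="${url}#${var_list}"; fi
  printf '%s\n' "$url"
}

# Example values are illustrative only.
URL="$(hive_jdbc_url hs2.example.com 10000 default \
  'hive.exec.parallel=true;tez.queue.name=default' 'run_date=2020-01-01')"
echo "$URL"

# Always quote the URL when passing it on: ';', '?' and '#' are special to the shell.
#   beeline -u "$URL" -n "$HIVE_USER"
```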