...
Since there were changes in this partition from transactions 1, 3, and 5, we now see rolled-together versions of the delta directories spanning those transaction IDs. Let's verify that the files contain the rolled-up details in a single file each for the delta and delete_delta transactions.
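The roll-up naming rule can be sketched in a few lines. The helper below is illustrative only (not a Hive API), and the seven-digit zero padding is an assumption based on the directory names shown in the listings:

```python
def rolled_delta_name(delta_dirs, prefix="delta"):
    """Given per-transaction delta directory names such as
    delta_0000001_0000001, return the name of the single rolled-up
    directory a minor compaction would produce, spanning the lowest
    through highest write IDs."""
    ids = []
    for d in delta_dirs:
        lo, hi = d.rsplit("_", 2)[-2:]   # trailing low/high write IDs
        ids.extend([int(lo), int(hi)])
    return f"{prefix}_{min(ids):07d}_{max(ids):07d}"

# Deltas from transactions 1, 3, and 5 roll into one directory:
print(rolled_delta_name(["delta_0000001_0000001",
                         "delta_0000003_0000003",
                         "delta_0000005_0000005"]))
# -> delta_0000001_0000005
```

The same rule applies to delete_delta directories, just with the longer prefix.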
...
```
ALTER TABLE try_it partition (prt='p1') COMPACT 'major';
ALTER TABLE try_it partition (prt='p2') COMPACT 'major';
ALTER TABLE try_it partition (prt='p3') COMPACT 'major';

+---------------+-----------+----------+------------+--------+------------+-----------+----------------+---------------+-------------------------+
| compactionid  | dbname    | tabname  | partname   | type   | state      | workerid  | starttime      | duration      | hadoopjobid             |
+---------------+-----------+----------+------------+--------+------------+-----------+----------------+---------------+-------------------------+
| CompactionId  | Database  | Table    | Partition  | Type   | State      | Worker    | Start Time     | Duration(ms)  | HadoopJobId             |
| 1             | default   | try_it   | prt=p1     | MINOR  | succeeded  | ---       | 1576145642000  | 179000        | job_1575915931720_0012  |
| 2             | default   | try_it   | prt=p2     | MINOR  | succeeded  | ---       | 1576181672000  | 22000         | job_1575915931720_0013  |
| 3             | default   | try_it   | prt=p3     | MINOR  | succeeded  | ---       | 1576181677000  | 37000         | job_1575915931720_0014  |
| 4             | default   | try_it   | prt=p1     | MAJOR  | succeeded  | ---       | 1576183603000  | 31000         | job_1575915931720_0015  |
| 5             | default   | try_it   | prt=p2     | MAJOR  | succeeded  | ---       | 1576187575000  | 50000         | job_1575915931720_0017  |
| 6             | default   | try_it   | prt=p3     | MAJOR  | succeeded  | ---       | 1576187583000  | 57000         | job_1575915931720_0018  |
+---------------+-----------+----------+------------+--------+------------+-----------+----------------+---------------+-------------------------+
7 rows selected (0.029 seconds)
```
This will build a single “base” file for each partition.
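The base directory produced by a major compaction is named after the highest write ID it covers, which matches the `base_0000005`, `base_0000004`, and `base_0000006` directories in the listing below. A minimal sketch of that rule (the helper is hypothetical, not a Hive API):

```python
def base_name_after_major(delta_dirs):
    """A major compaction rewrites all deltas (and any prior base) into a
    single base_<hi> directory, where <hi> is the highest write ID covered
    by the compacted deltas."""
    hi = max(int(d.rsplit("_", 1)[-1]) for d in delta_dirs)
    return f"base_{hi:07d}"

# Partition p1 above had deltas through write ID 5:
print(base_name_after_major(["delta_0000001_0000005"]))
# -> base_0000005
```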
```
$ hdfs dfs -ls -R /wa/t/m/h/try_it
drwxrwx---+  - hive hadoop    0 2019-12-12 20:47 /wa/t/m/h/try_it/prt=p1
drwxrwx---+  - hive hadoop    0 2019-12-12 20:46 /wa/t/m/h/try_it/prt=p1/base_0000005
-rw-rw----+  3 hive hadoop   48 2019-12-12 20:46 /wa/t/m/h/try_it/prt=p1/base_0000005/_metadata_acid
-rw-rw----+  3 hive hadoop    1 2019-12-12 20:46 /wa/t/m/h/try_it/prt=p1/base_0000005/_orc_acid_version
-rw-rw----+  3 hive hadoop  228 2019-12-12 20:46 /wa/t/m/h/try_it/prt=p1/base_0000005/bucket_00000
drwxrwx---+  - hive hadoop    0 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p2
drwxrwx---+  - hive hadoop    0 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p2/base_0000004
-rw-rw----+  3 hive hadoop   48 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p2/base_0000004/_metadata_acid
-rw-rw----+  3 hive hadoop    1 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p2/base_0000004/_orc_acid_version
-rw-rw----+  3 hive hadoop  815 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p2/base_0000004/bucket_00000
drwxrwx---+  - hive hadoop    0 2019-12-12 21:54 /wa/t/m/h/try_it/prt=p3
drwxrwx---+  - hive hadoop    0 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p3/base_0000006
-rw-rw----+  3 hive hadoop   48 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p3/base_0000006/_metadata_acid
-rw-rw----+  3 hive hadoop    1 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p3/base_0000006/_orc_acid_version
-rw-rw----+  3 hive hadoop  835 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p3/base_0000006/bucket_00000
```
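One way to confirm the layout above is to scan the `hdfs dfs -ls -R` output and check that each partition now holds exactly one base directory. The helper below is an illustrative assumption (not part of Hive), and the partition-key name `prt` matches the table in this example:

```python
import re

def base_dirs_per_partition(ls_output):
    """Group base_<writeId> directory names by partition, given the text
    output of `hdfs dfs -ls -R` over an ACID table's location."""
    bases = {}
    for line in ls_output.splitlines():
        # Match paths ending in .../prt=<value>/base_<digits>
        m = re.search(r"/(prt=[^/]+)/(base_\d+)$", line)
        if m:
            bases.setdefault(m.group(1), []).append(m.group(2))
    return bases

sample = """\
drwxrwx---+  - hive hadoop 0 2019-12-12 20:46 /wa/t/m/h/try_it/prt=p1/base_0000005
drwxrwx---+  - hive hadoop 0 2019-12-12 21:53 /wa/t/m/h/try_it/prt=p2/base_0000004
"""
print(base_dirs_per_partition(sample))
# -> {'prt=p1': ['base_0000005'], 'prt=p2': ['base_0000004']}
```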
...
Again, the contents of the table, now fully compacted into “base” files, look like the following.
...