Friday, April 27, 2018

Hive | Impala | External Table | Parquet Storage Format | Renaming Columns | All Rows show NULL




When an existing Hive table stores its data in Parquet format and one or more of its columns are renamed, the table's column names no longer match the column names stored in the Parquet file schema. In that case, the parquet.column.index.access property must be set to true on the table.


After renaming the table columns, the values in the renamed columns are returned as NULL for all rows.


No exceptions are produced.
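
The symptom can be reproduced with a sketch like the following. The table name, column names, and location are hypothetical and used only for illustration; the NULL result is the behaviour described above, not output captured from a specific Hive version.

```sql
-- Hypothetical table, columns, and path, for illustration only.
CREATE EXTERNAL TABLE parquet_rename_demo (id INT, name STRING)
STORED AS PARQUET
LOCATION '/tmp/parquet_rename_demo';

INSERT INTO TABLE parquet_rename_demo VALUES (1, 'alice');

-- Rename a column in the table metadata only.
-- The Parquet files already written still store the column as "name".
ALTER TABLE parquet_rename_demo CHANGE name full_name STRING;

-- With name-based column resolution, full_name has no match in the
-- file schema, so it is expected to come back as NULL without any error.
SELECT id, full_name FROM parquet_rename_demo;
```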

To fix this, set the property parquet.column.index.access=true on the table, so that Hive resolves Parquet columns by their position rather than by their name.

Example:

ALTER TABLE parquet_columnar_renamed SET TBLPROPERTIES ('parquet.column.index.access'='true');

Or alternatively in the Hive session:

set parquet.column.index.access=true; 
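
To check that the table-level setting is in place, one option is to read the property back and then query the table again. This is a minimal sketch that reuses the parquet_columnar_renamed table from the example above; the LIMIT value is arbitrary.

```sql
-- Read back the table property set by ALTER TABLE ... SET TBLPROPERTIES.
SHOW TBLPROPERTIES parquet_columnar_renamed('parquet.column.index.access');

-- Re-run a query; the renamed columns should now be matched by position.
SELECT * FROM parquet_columnar_renamed LIMIT 10;
```

Because columns are matched by position with this setting, the order of the columns in the table definition needs to line up with the order in the Parquet files.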
