Job Failure With Schema Changes In Hive When Using Parquet for Storage
Users of Hive tables with Parquet storage notice job errors during processing after the Hive schema changes.
Currently, when the schema of a Hive table that uses Parquet storage changes, the user needs to recreate any Data Link or Import Job associated with that table.
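To know when a Data Link or Import Job needs to be recreated, it can help to compare the table's current column list against a previously saved copy. The following is a minimal sketch of such a comparison, not a Datameer or Hive API: the function name `diff_schemas` and the list-of-pairs schema representation are assumptions made for illustration, and how the column lists are collected (for example, from the output of a Hive `DESCRIBE` statement) is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: ordered (column name, Hive type) pairs.
Schema = List[Tuple[str, str]]


def diff_schemas(old: Schema, new: Schema) -> Dict[str, list]:
    """Report columns that were added, removed, or changed type."""
    old_types = dict(old)
    new_types = dict(new)
    added = [name for name, _ in new if name not in old_types]
    removed = [name for name, _ in old if name not in new_types]
    retyped = [
        (name, old_types[name], new_types[name])
        for name, _ in old
        if name in new_types and old_types[name] != new_types[name]
    ]
    return {"added": added, "removed": removed, "retyped": retyped}


if __name__ == "__main__":
    # Illustrative column lists only; not taken from a real table.
    saved = [("id", "bigint"), ("name", "string"), ("price", "float")]
    current = [("id", "bigint"), ("price", "double"), ("created", "timestamp")]
    changes = diff_schemas(saved, current)
    if any(changes.values()):
        print("Schema changed; recreate the Data Link or Import Job:", changes)
```

If any of the three lists is non-empty, the schema has changed since the copy was saved and the associated Data Link or Import Job should be recreated.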
Datameer recommended best practices:
When the schema of those tables is likely to change, use Data Links instead of Import Jobs. Data Links prevent data copying, which limits the amount of data that needs to be migrated and minimizes the chance of a small schema change being disruptive.