Hive Jobs Fail with Custom SerDe JARs

Problem

While upgrading Datameer, we needed to migrate custom SerDe JAR files from the existing deployment to the target deployment.

The documentation currently recommends the procedure described here:

https://documentation.datameer.com/documentation/display/DAS50/Frequently+Asked+Hadoop+Questions#FrequentlyAskedHadoopQuestions-Q.HowcanIuseacustomHiveSerDe?

After following that procedure and starting Datameer, we noticed that the Hive plug-in was not loading at startup.

Further investigation of the log files revealed the following error messages:

"conductor.log":

[somasul]  INFO [2017-04-13 00:23:40.798] [qtp997120758-20] (PluginUtil.java:225) - Installing plugin in '/home/datameer/Datameer-5.11.37-apache-2.4.1/etc/custom-plugins'
[somasul]  INFO [2017-04-13 00:23:41.199] [qtp997120758-20] (ConductorPluginRegistryConfiguration.java:93) - Scanning plugin: plugin-hive-0.13.1 (Hive 0.13.1 plug-in)
[somasul]  WARN [2017-04-13 00:23:41.251] [qtp997120758-20] (ConductorPluginRegistryConfiguration.java:138) - Unable to load plugin class name: datameer.das.plugin.hive.metastore.HiveMetastoreDataStoreType for plugin:Hive 0.13.1 plug-in because:Plugin for id 'plugin-hive-0.13.1' could not be found
[somasul]  WARN [2017-04-13 00:23:41.288] [qtp997120758-20] (ConductorPluginRegistryConfiguration.java:138) - Unable to load plugin class name: datameer.das.plugin.hive.hiveserver2.HiveServer2DataStoreType for plugin:Hive 0.13.1 plug-in because:Plugin for id 'plugin-hive-0.13.1' could not be found
[somasul]  WARN [2017-04-13 00:23:41.294] [qtp997120758-20] (ConductorPluginRegistryConfiguration.java:138) - Unable to load plugin class name: datameer.das.plugin.hive.hiveserver2.sentry.SentryAwareHiveServer2DataType for plugin:Hive 0.13.1 plug-in because:Plugin for id 'plugin-hive-0.13.1' could not be found
[anonymous]  INFO [2017-04-12 22:52:16.568] [main] (PluginRegistryImpl.java:585) - Registered Plugins:
[anonymous]  INFO [2017-04-12 22:52:16.569] [main] (PluginRegistryImpl.java:591) -  Hive 0.13.1 plug-in (plugin-hive-0.13.1, zip=null, copyToCluster=false)

"job.log":

diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.IllegalStateException: No plugin with id 'plugin-hive-0.13.1' found for deserializing class 'datameer.das.plugin.hive.HiveImportFormat'.
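
To find every plug-in id that the logs report as missing, the two message patterns above can be matched with a short script. This is a minimal sketch: the regular expression is inferred only from the excerpts in this article, the function name is hypothetical, and the sample lines are abbreviated copies of the messages above.

```python
import re

# Matches both phrasings seen in the excerpts above:
#   "Plugin for id 'X' could not be found"   (conductor.log)
#   "No plugin with id 'X' found ..."         (job.log)
PLUGIN_ID_RE = re.compile(r"[Pp]lugin (?:for|with) id '([^']+)'")


def missing_plugin_ids(lines):
    """Return the set of plug-in ids that the given log lines report as not found."""
    ids = set()
    for line in lines:
        match = PLUGIN_ID_RE.search(line)
        if match:
            ids.add(match.group(1))
    return ids


# Abbreviated copies of the log messages quoted in this article.
sample = [
    "... because:Plugin for id 'plugin-hive-0.13.1' could not be found",
    "... No plugin with id 'plugin-hive-0.13.1' found for deserializing class "
    "'datameer.das.plugin.hive.HiveImportFormat'.",
    "... Scanning plugin: plugin-hive-0.13.1 (Hive 0.13.1 plug-in)",
]
print(missing_plugin_ids(sample))
```

In practice the lines would come from reading "conductor.log" and "job.log" rather than from a hard-coded list.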

 

Cause

A reference to the plug-in stored in the supporting "dap" database was causing the Hive plug-in to fail to load. The entry was located in the "dap.plugin_state" table.

The "plugin_id" column contained an entry matching the id in the error log, "plugin-hive-0.13.1".
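
The lookup can be sketched as a parameterized query. The example below is an illustration only: it uses an in-memory SQLite database attached under the name "dap" as a stand-in for the real database, the table has just the "plugin_id" column because that is the only column named in this article, and the function name and the second sample id are made up. Against the real database the driver and its placeholder style will differ.

```python
import sqlite3

# Stand-in for the real "dap" database: an in-memory SQLite database attached
# under the name "dap" so the query text can read dap.plugin_state.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS dap")

# Assumption: only plugin_id is known from the article; the real table
# almost certainly has more columns.
conn.execute("CREATE TABLE dap.plugin_state (plugin_id TEXT)")
conn.executemany(
    "INSERT INTO dap.plugin_state (plugin_id) VALUES (?)",
    [("plugin-hive-0.13.1",), ("plugin-example",)],  # second id is made up
)


def find_plugin_state(connection, plugin_id):
    """Return the plugin_id values in dap.plugin_state that match plugin_id."""
    cursor = connection.execute(
        "SELECT plugin_id FROM dap.plugin_state WHERE plugin_id = ?",
        (plugin_id,),
    )
    return [row[0] for row in cursor.fetchall()]


matches = find_plugin_state(conn, "plugin-hive-0.13.1")
print(matches)
```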

 

Solution

We performed the following sequence, after which Datameer loaded the augmented Hive plug-in:

  • stopped the Datameer service
  • removed all references to Hive from "<datameer>/plugins"
  • removed the database entry for the Hive plug-in from "dap.plugin_state"
  • started the Datameer service and verified that no Hive references remained
  • stopped the Datameer service
  • placed the augmented Hive plug-in into "<datameer>/plugins"
  • started the Datameer service and verified that the plug-in was loaded, using the "conductor.log" messages*
  • tested Hive executions successfully
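
The two removal steps above can be sketched as follows, to be run only while the Datameer service is stopped and after backing up the database. This is a hedged illustration, not a supported tool: the function names are hypothetical, the directory is a temporary stand-in for "<datameer>/plugins", the file listing is a dry run that deletes nothing, and the database is an in-memory SQLite stand-in with only the "plugin_id" column named in this article.

```python
import sqlite3
import tempfile
from pathlib import Path


def hive_plugin_files(plugins_dir):
    """List entries in the plug-ins directory whose name mentions Hive (dry run)."""
    return sorted(p for p in Path(plugins_dir).iterdir() if "hive" in p.name.lower())


def delete_plugin_state(connection, plugin_id):
    """Delete matching rows from dap.plugin_state and return how many were removed."""
    with connection:  # commits on success, rolls back if an exception is raised
        cursor = connection.execute(
            "DELETE FROM dap.plugin_state WHERE plugin_id = ?", (plugin_id,)
        )
    return cursor.rowcount


# Stand-in for "<datameer>/plugins": a temporary directory with sample files.
with tempfile.TemporaryDirectory() as tmp:
    plugins = Path(tmp)
    (plugins / "plugin-hive-0.13.1-5.11.37.zip").touch()
    (plugins / "plugin-other.zip").touch()  # made-up file name
    candidates = [p.name for p in hive_plugin_files(plugins)]

# Stand-in for the "dap" database (assumed schema: plugin_id only).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS dap")
conn.execute("CREATE TABLE dap.plugin_state (plugin_id TEXT)")
conn.execute("INSERT INTO dap.plugin_state (plugin_id) VALUES ('plugin-hive-0.13.1')")

removed = delete_plugin_state(conn, "plugin-hive-0.13.1")
remaining = conn.execute("SELECT COUNT(*) FROM dap.plugin_state").fetchone()[0]
print(candidates, removed, remaining)
```

Listing the candidate files before removing anything makes it possible to review exactly what will be deleted from the plug-ins directory.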

* "conductor.log" messages confirming that the plug-in was registered:

[anonymous]  INFO [2017-04-19 14:50:24.466] [main] (PluginRegistryImpl.java:585) - Registered Plugins:
[anonymous]  INFO [2017-04-19 16:02:21.713] [main] (PluginRegistryImpl.java:591) -  Hive 0.13.1 plug-in (plugin-hive-0.13.1, zip=/opt/datameer/Datameer-5.11.37-apache-2.4.1/plugins/plugin-hive-0.13.1-5.11.37.zip, copyToCluster=true)
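
The difference between the failing and the working registration lines can be checked with a small parser. This is a sketch under stated assumptions: the pattern is derived only from the two "Registered Plugins" entries quoted in this article, the function name is hypothetical, and treating "zip=null" as a sign that the plug-in archive was not picked up is an inference from those two excerpts rather than documented behavior.

```python
import re

# Matches the "(id, zip=..., copyToCluster=...)" part of a registration entry.
REGISTERED_RE = re.compile(
    r"\((?P<id>[^,()]+), zip=(?P<zip>[^,]+), copyToCluster=(?P<copy>true|false)\)"
)


def parse_registration(line):
    """Parse one registered-plug-in log entry; return None if it does not match."""
    match = REGISTERED_RE.search(line)
    if match is None:
        return None
    zip_path = match.group("zip")
    return {
        "plugin_id": match.group("id"),
        "zip": None if zip_path == "null" else zip_path,
        "copy_to_cluster": match.group("copy") == "true",
    }


# The two entries quoted in this article (before and after the fix).
before = (
    "[anonymous]  INFO [2017-04-12 22:52:16.569] [main] (PluginRegistryImpl.java:591) - "
    " Hive 0.13.1 plug-in (plugin-hive-0.13.1, zip=null, copyToCluster=false)"
)
after = (
    "[anonymous]  INFO [2017-04-19 16:02:21.713] [main] (PluginRegistryImpl.java:591) - "
    " Hive 0.13.1 plug-in (plugin-hive-0.13.1, "
    "zip=/opt/datameer/Datameer-5.11.37-apache-2.4.1/plugins/plugin-hive-0.13.1-5.11.37.zip, "
    "copyToCluster=true)"
)

before_info = parse_registration(before)
after_info = parse_registration(after)
print(before_info["zip"], after_info["zip"])
```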