MapR-5.2: JDBC Hive Import Job Fails With "Failed to Create Splits"

Problem

When a Hive JDBC connection is used (LDAP authentication, user = datameer), import jobs with multiple partitions fail with the following error in the tasklog-failed-syslog...input.log file:

 2017-01-17 10:04:28,812 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |input.TezSplitGenerator|: Determined 2 avaliable containers by node count (node count=2)
 2017-01-17 10:04:28,857 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |input.TezSplitGenerator|: Splitting the raw Data. Total Available Containers:2 based on 2.
 2017-01-17 10:04:28,859 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |importjob.Splitter$SplitHintBuilder|: Configured wave count=6, wave count left over=1
 2017-01-17 10:04:29,079 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |jdbc.JdbcSplitter|: SplitHint{numMapTasks=11, minSplitSize=16777216, maxSplitSize=5368709120, minSplitCount=0, maxSplitCount=2147483647}
 2017-01-17 10:04:29,080 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |jdbc.JdbcSplitter|: number of desired splits: 11
 2017-01-17 10:04:29,137 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |jdbc.Utils|: Supplied authorities: ec2-54-170-194-123.eu-west-1.compute.amazonaws.com:10000
 2017-01-17 10:04:29,138 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |jdbc.Utils|: Resolved authority: ec2-54-170-194-123.eu-west-1.compute.amazonaws.com:10000
 2017-01-17 10:04:29,167 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |jdbc.HiveConnection|: Will try to open client transport with JDBC Uri: jdbc:hive2://ec2-54-170-194-123.eu-west-1.compute.amazonaws.com:10000
 2017-01-17 10:04:29,313 [INFO] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |db.JdbcConnector|: SELECT MIN( userid ),MAX( userid ) FROM datameer . test_bookorders_avro
 2017-01-17 10:04:30,334 [ERROR] [InputInitializer {Map for sheets:[import] (066ec437-e810-428b-8c27-20d05e0e55ea)} #0|#0] |input.AbstractDatameerInputInitializer|: Unable to initialise split event(s)
 java.lang.RuntimeException: failed to create splits
 at datameer.dap.common.job.dapimport.jdbc.JdbcSplitter.createSplits(JdbcSplitter.java:99)
 at datameer.dap.common.job.dapimport.jdbc.JdbcSplitter.createSplits(JdbcSplitter.java:19)
 at datameer.dap.common.job.mr.input.v2.impl.ImportSplitter.createSplits(ImportSplitter.java:31)
 at datameer.dap.common.job.mr.input.v2.impl.CombineSplitter.createSplits(CombineSplitter.java:41)
 at datameer.dap.common.graphv2.hadoop.ExternalDataReader.createSplits(ExternalDataReader.java:56)
 at datameer.plugin.tez.input.TezInputFormat$DataHandleInputFormat.createSplits(TezInputFormat.java:62)
 at datameer.plugin.tez.input.TezSplitGenerator.createSplitEvents(TezSplitGenerator.java:86)
 at datameer.plugin.tez.input.TezSplitGenerator.initialize(TezSplitGenerator.java:75)
 at datameer.plugin.tez.input.TezSplitGenerator.initializeEvents(TezSplitGenerator.java:71)
 at datameer.plugin.tez.input.AbstractDatameerInputInitializer.initialize(AbstractDatameerInputInitializer.java:35)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:264)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:258)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:245)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
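
The log shows the splitter requesting 11 splits and then issuing a SELECT MIN( userid ),MAX( userid ) query, which is the statement that fails. One plausible reading, not confirmed by this article, is that the bounds are used to cut the key range into the desired number of splits, which would explain why the failure appears during split creation. The sketch below illustrates that idea only; split_ranges and its parameters are hypothetical names and are not Datameer code.

```python
def split_ranges(min_key, max_key, desired_splits):
    """Cut the inclusive integer range [min_key, max_key] into contiguous ranges.

    Hypothetical illustration of range-based splitting; not Datameer's implementation.
    Returns at most desired_splits (start, end) tuples, both ends inclusive.
    """
    if desired_splits < 1:
        raise ValueError("desired_splits must be at least 1")
    if max_key < min_key:
        raise ValueError("max_key must not be smaller than min_key")

    total = max_key - min_key + 1
    # Never produce more ranges than there are distinct key values.
    count = min(desired_splits, total)
    base, extra = divmod(total, count)

    ranges = []
    start = min_key
    for i in range(count):
        # The first `extra` ranges take one additional key each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Assumed example bounds; the real MIN/MAX values are not given in the log.
    for lo, hi in split_ranges(1, 110, 11):
        print(f"userid BETWEEN {lo} AND {hi}")
```

Under this reading, the bounds query must succeed before any split can be defined, so an error in that single Hive statement is enough to abort the whole import.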

Cause

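HiveServer2 has impersonation enabled (hive.server2.enable.doAs, which is true by default in Apache Hive), so it tries to run each query as the connecting user (here, datameer) rather than as the HiveServer2 service user. The cluster does not support secure impersonation, so the MapReduce task that Hive launches for the splitter's SELECT MIN( userid ),MAX( userid ) query fails with return code 1, and split creation fails with it.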

Solution

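Disable impersonation by setting the following property in hive-site.xml on the HiveServer2 host, then restart HiveServer2 so that the change takes effect: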

 <property>
     <name>hive.server2.enable.doAs</name>
     <value>false</value>
 </property>
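
To confirm that the property is actually present in the file HiveServer2 reads, a small check such as the following can help. It is a minimal sketch that assumes hive-site.xml uses the usual Hadoop-style configuration/property/name/value layout; the function names and the path passed on the command line are illustrative and not part of any Hive or Datameer tooling.

```python
import sys
import xml.etree.ElementTree as ET


def read_hive_property(hive_site_path, name):
    """Return the value of a property in a Hadoop-style config file, or None if absent.

    If the property appears more than once, the last occurrence in the file is returned.
    """
    root = ET.parse(hive_site_path).getroot()
    found = None
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            found = value.strip() if value is not None else None
    return found


def impersonation_disabled(hive_site_path):
    """True only if hive.server2.enable.doAs is explicitly set to false in the file."""
    value = read_hive_property(hive_site_path, "hive.server2.enable.doAs")
    return value is not None and value.lower() == "false"


if __name__ == "__main__":
    # Usage (path is an assumption): python check_doas.py path/to/hive-site.xml
    path = sys.argv[1]
    if impersonation_disabled(path):
        print("hive.server2.enable.doAs is set to false")
    else:
        print("hive.server2.enable.doAs is missing or not false")
```

The check only inspects the file on disk; it does not tell you whether a running HiveServer2 instance has already picked up the change.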