Job Failure - ClassNotFoundException for One or More Datameer Core Class Files

Problem

Some Datameer jobs in the environment fail with errors indicating java.lang.ClassNotFoundException for one or more Datameer classes. Here is an example:

ERROR [2015-01-01 00:00:00.000] [JobScheduler thread-1] (JobScheduler.java:800) - Job 1234 failed with exception.
java.lang.RuntimeException: java.lang.RuntimeException: Failed to run cluster job for 'Workbook job (1234): Workbook#Attributes(Group by operation)'
        at datameer.dap.common.graphv2.ConcurrentClusterSession$1.run(ConcurrentClusterSession.java:51)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.RuntimeException: Failed to run cluster job for 'Workbook job (1234): Workbook#Attributes(Group by operation)'
        at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:196)
        at datameer.dap.common.graphv2.ConcurrentClusterSession$1.run(ConcurrentClusterSession.java:48)
        ... 6 more
Caused by: java.lang.RuntimeException: Job job_201503171121_1154 failed! Failure info: NA
        at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:49)
        at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:31)
        at datameer.dap.common.graphv2.hadoop.MrJob.runImpl(MrJob.java:197)
        at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:129)
        at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:189)
        ... 7 more
Caused by: java.io.IOException: Job job_201503171121_1154 failed! Failure info: NA
        at datameer.dap.common.job.mr.HadoopMrJobClient.waitUntilJobCompletion(HadoopMrJobClient.java:172)
        at datameer.dap.common.job.mr.HadoopMrJobClient.runJobImpl(HadoopMrJobClient.java:76)
        at datameer.dap.common.job.mr.MrJobClient.runJob(MrJobClient.java:34)
        at datameer.dap.common.graphv2.hadoop.MrJob.runImpl(MrJob.java:185)
        ... 9 more
Caused by: java.lang.RuntimeException: Task: Error: java.lang.ClassNotFoundException: datameer.das.functions.logical.OrFunction$1
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at datameer.dap.sdk.plugin.PluginClassLoader.loadClass(PluginClassLoader.java:59)
        at datameer.das.functions.logical.OrFunction.createComputor(OrFunction.java:31)
        at datameer.dap.common.formula.SimpleFunctionExpression.getValueComputor(SimpleFunctionExpression.java:80)
        at datameer.dap.common.formula.SimpleFunctionExpression.computeValue(SimpleFunctionExpression.java:41)
        at datameer.dap.common.graphv2.FilterRecordProcessor$1.apply(FilterRecordProcessor.java:36)
        at datameer.dap.common.graphv2.FilterRecordProcessor$1.apply(FilterRecordProcessor.java:31)
        at datameer.dap.sdk.sequence.Sequence$11.computeNext(Sequence.java:560)
        at datameer.dap.sdk.sequence.Sequence$Simple.moveToNext(Sequence.java:157)
        at datameer.dap.sdk.sequence.Sequence$13.moveToNext(Sequence.java:603)
        at datameer.dap.common.graphv2.hadoop.MrJobKeyValueMapper.run(MrJobKeyValueMapper.java:76)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:351)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
        at org.apache.hadoop.mapred.Child.main(Child.java:271)

In the log above, the missing class is datameer.das.functions.logical.OrFunction$1, an anonymous inner class of OrFunction that ships in the Datameer jar files. This matters because it distinguishes this symptom from one involving a missing Hadoop library.
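
As a rough aid for making that distinction, the sketch below pulls every class named in a ClassNotFoundException out of log text and labels it by package prefix. The function name classify_missing_classes is hypothetical, and the assumption that all Datameer classes start with "datameer." is drawn only from the class names visible in the log above.

```shell
# Sketch: read log text on stdin and print each distinct class named in a
# ClassNotFoundException, prefixed with "datameer" or "other".
# Assumption: Datameer classes live under the "datameer." package prefix.
classify_missing_classes() {
    grep -o 'ClassNotFoundException: [A-Za-z0-9_.$]*' |
        sed 's/^ClassNotFoundException: //' |
        sort -u |
        while read -r cls; do
            case "$cls" in
                datameer.*) echo "datameer $cls" ;;
                *)          echo "other $cls" ;;
            esac
        done
}
```

To use it, redirect the Datameer application log into the function, for example classify_missing_classes < "$LOG_FILE", where LOG_FILE is a placeholder for the log location in your installation. Lines labeled "datameer" match the symptom described here; lines labeled "other" may point to a different problem, such as a missing Hadoop library.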

Cause

This error occurs when required Datameer jar files have been removed from the Hadoop Distributed Cache. Tasks running on the cluster then cannot load the classes contained in those jars at run time.

Solution

To work around this issue, follow these steps to manually repopulate the Job Jar files on HDFS:

  1. Stop Datameer.
  2. List the contents of the following directory in HDFS:
    <datameer_private_directory>/jobjar
  3. Rename this directory in HDFS from jobjar to jobjar-old.
  4. Start Datameer.
  5. Ensure that the
    <datameer_private_directory>/jobjar
    directory is recreated in HDFS.
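
The HDFS parts of the steps above (2, 3, and 5) can be sketched with the standard hadoop fs shell. This is a minimal sketch, not an official Datameer procedure: the function names and the HADOOP_CMD override are hypothetical, and the argument must be the actual <datameer_private_directory> path for your installation, which is not filled in here. Stopping and starting Datameer (steps 1 and 4) is left out because it depends on how Datameer is installed.

```shell
# Sketch: run as a user with write access to the Datameer private directory.
# $1 is the HDFS path of <datameer_private_directory> (supplied by you).
# HADOOP_CMD is a hypothetical override so the commands can be previewed.

rename_jobjar() {
    private_dir="$1"
    hadoop_cmd="${HADOOP_CMD:-hadoop}"

    # Step 2: list the current job jar directory; stop if it cannot be listed.
    "$hadoop_cmd" fs -ls "$private_dir/jobjar" || return 1

    # Step 3: rename jobjar to jobjar-old.
    "$hadoop_cmd" fs -mv "$private_dir/jobjar" "$private_dir/jobjar-old"
}

check_jobjar() {
    # Step 5: after Datameer has been started again, confirm that the
    # jobjar directory exists once more.
    "${HADOOP_CMD:-hadoop}" fs -ls "$1/jobjar"
}
```

Setting HADOOP_CMD=echo before calling rename_jobjar prints the commands instead of running them, which allows a review before anything is changed in HDFS. Run rename_jobjar only while Datameer is stopped, and check_jobjar only after it has been started again.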

Comments

  • Kewal Chopra

    Hi Team,

    Please suggest the reason for this. We can't make these changes in production every time a job fails in this manner.

  • Joel Stewart

    The cause is unknown, Kewal. If this issue happens consistently, I'd recommend following up with Datameer Support. I'd also recommend following up with the support team for your Hadoop vendor.