Tez and SmallJob Jobs Fail with NullPointerException for addFolderToZip

Problem

Jobs executed on either the Tez or SmallJob framework fail, and a NullPointerException is logged in the Job Trace:

ERROR [2015-01-01 00:00:00.000] [MrPlanRunnerV2] (ClusterSession.java:192) - Failed to run cluster job 'Sample job (12345): MYJOBNAME#datalink-sample(Limit record processor)' [0 sec]
java.lang.NullPointerException
        at datameer.dap.sdk.util.ZipUtil.addFolderToZip(ZipUtil.java:89)
        at datameer.dap.sdk.util.ZipUtil.compress(ZipUtil.java:81)
        at datameer.plugin.tez.TezClientFacade.createTezPluginJar(TezClientFacade.java:242)
        at datameer.plugin.tez.TezClientFacade.createTaskLocalResourceMap(TezClientFacade.java:199)
        at datameer.plugin.tez.session.TezSessionImpl.<init>(TezSessionImpl.java:35)
        at datameer.plugin.tez.session.TezSessionPool.createSession(TezSessionPool.java:94)
        at datameer.plugin.tez.session.TezSessionPool.createNewSession(TezSessionPool.java:139)
        at datameer.plugin.tez.session.PoolingTezSessionFactory.get(PoolingTezSessionFactory.java:105)
        at datameer.plugin.tez.session.SkipPoolingTezSessionFactory.get(SkipPoolingTezSessionFactory.java:24)
        at datameer.plugin.tez.session.TezSessionFactory$ReuseSessionFactory.createNewSession(TezSessionFactory.java:103)
        at datameer.plugin.tez.session.PoolingTezSessionFactory.get(PoolingTezSessionFactory.java:105)
        at datameer.plugin.tez.session.TrackRunningSessionFactory.get(TrackRunningSessionFactory.java:47)
        at datameer.plugin.tez.session.TrackRunningSessionFactory.get(TrackRunningSessionFactory.java:47)
        at datameer.plugin.tez.DagRunner.submit(DagRunner.java:69)
        at datameer.plugin.tez.smalljob.SmallJobClusterJob.runImpl(SmallJobClusterJob.java:144)
        at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:129)
        at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:186)
        at datameer.dap.common.graphv2.mixedframework.MixedClusterSession.execute(MixedClusterSession.java:48)
        at datameer.dap.common.graphv2.ClusterSession.runAllClusterJobs(ClusterSession.java:360)
        at datameer.dap.common.graphv2.MrPlanRunnerV2.run(MrPlanRunnerV2.java:86)
        at java.lang.Thread.run(Thread.java:745)

Jobs executed on the MapReduce framework are not affected.

Cause

This issue is caused by a misconfigured Datameer TMP variable: it points to the absolute path "/tmp" instead of the relative path "tmp".

Solution

To resolve this issue, ensure that the Datameer TMP variable configured in the <INSTALLDIR>/etc/das-env.sh file points to the relative path "tmp" and not the absolute path "/tmp". Here is an example of the variable set properly:

# Temp folder where java should write temp files
export TMP=tmp
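
To confirm the current setting before editing the file, a small shell check along the following lines may help. This is only a sketch: the function name check_datameer_tmp is hypothetical and not part of Datameer, and it assumes the variable is assigned on a single line of the form "TMP=..." or "export TMP=..." with no trailing comment.

```shell
# Hypothetical helper (not shipped with Datameer): reports the TMP value
# assigned in a das-env.sh style file and flags anything other than "tmp".
check_datameer_tmp() {
    env_file="$1"
    # Read the last uncommented TMP assignment, with or without "export".
    value=$(sed -E -n 's/^[[:space:]]*(export[[:space:]]+)?TMP=//p' "$env_file" | tail -n 1)
    # Drop optional surrounding quotes.
    value=$(printf '%s' "$value" | tr -d "\"'")
    if [ "$value" = "tmp" ]; then
        echo "OK: TMP=$value"
        return 0
    fi
    echo "CHECK: TMP=${value:-<unset>} (expected tmp)"
    return 1
}
```

It would be called as check_datameer_tmp <INSTALLDIR>/etc/das-env.sh, substituting the actual installation directory. The function returns a non-zero status when TMP is missing or set to anything other than "tmp".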

For more information, please contact Datameer Support. This issue may be referenced as DAP-23621.