Tez Jobs Fail with UnsatisfiedLinkError Due to SnappyCodec Loading Issue

Problem

Smart Execution jobs fail with an error. The files were compressed with Snappy version 1.1.4 but are read with version 1.1.3.

Error message

...
2015-01-13 09:39:29,694 ERROR [TezChild] org.apache.tez.runtime.task.TezTaskRunner: Exception of type Error. Exiting now
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
        at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
        at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
        at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:190)
        at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1915)
        at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1810)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1759)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
        at datameer.dap.common.data.DatameerFile.open(DatameerFile.java:108)
        at datameer.dap.common.graphv2.hadoop.GlobalDatameerFileReader.openSingleSplit(GlobalDatameerFileReader.java:175)
        at datameer.dap.common.graphv2.hadoop.GlobalDatameerFileReader$3.apply(GlobalDatameerFileReader.java:201)
        at datameer.dap.common.graphv2.hadoop.GlobalDatameerFileReader$3.apply(GlobalDatameerFileReader.java:197)
        at datameer.dap.sdk.sequence.Sequence$13.moveToNext(Sequence.java:608)
        at datameer.plugin.tez.input.AliasedRecords$CombinedAliasedRecords$1.computeNext(AliasedRecords.java:63)
        at datameer.plugin.tez.input.AliasedRecords$CombinedAliasedRecords$1.computeNext(AliasedRecords.java:52)
        at datameer.dap.sdk.sequence.Sequence$Simple.moveToNext(Sequence.java:157)
        at datameer.dap.sdk.sequence.Sequence$11.computeNext(Sequence.java:558)
        at datameer.dap.sdk.sequence.Sequence$Simple.moveToNext(Sequence.java:157)
        at datameer.dap.sdk.sequence.Sequence$13.moveToNext(Sequence.java:603)
        at datameer.dap.sdk.sequence.Sequence$14.computeNext(Sequence.java:647)
        at datameer.dap.sdk.sequence.Sequence$Simple.moveToNext(Sequence.java:157)
        at datameer.plugin.tez.processing.SimpleVertexProcessor.run(SimpleVertexProcessor.java:161)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
...
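
To confirm that a task log shows this particular failure rather than a different UnsatisfiedLinkError, the log can be searched for the signature from the trace above. The following is a minimal sketch; the function name has_snappy_link_error is hypothetical and not part of Datameer, Tez, or Hadoop.

```python
import re

# Signature taken from the stack trace above: an UnsatisfiedLinkError raised
# by NativeCodeLoader.buildSupportsSnappy.
SNAPPY_LINK_ERROR = re.compile(
    r"java\.lang\.UnsatisfiedLinkError: .*NativeCodeLoader\.buildSupportsSnappy"
)


def has_snappy_link_error(log_text: str) -> bool:
    """Return True if any line of the log contains the Snappy loading failure."""
    return any(SNAPPY_LINK_ERROR.search(line) for line in log_text.splitlines())
```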

Troubleshooting steps

A check of the Hadoop cluster found the following settings:

tez.am.launch.env="/usr/lib/hadoop/lib/native</value>" 
tez.task.launch.env="/usr/lib/hadoop/lib/native</value>"

The properties must be set exactly as follows, without quotes:

tez.am.launch.env=LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native
tez.task.launch.env=LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native
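
The difference between the settings that were found and the expected ones can be checked mechanically. The sketch below is a hypothetical helper (check_tez_env_line is not part of any Datameer or Tez tooling) and assumes the properties are available as plain key=value lines. It flags quotes, leftover XML markup, and a missing LD_LIBRARY_PATH= prefix.

```python
from typing import List

EXPECTED_PREFIX = "LD_LIBRARY_PATH="
TEZ_ENV_KEYS = ("tez.am.launch.env", "tez.task.launch.env")


def check_tez_env_line(line: str) -> List[str]:
    """Return the problems found in one 'key=value' property line."""
    # Split at the first '=' only, because the value itself contains '='.
    key, sep, value = line.strip().partition("=")
    if not sep or key not in TEZ_ENV_KEYS:
        return []  # not one of the two properties this sketch checks
    problems = []
    if '"' in value:
        problems.append("value contains a double quote")
    if "<" in value or ">" in value:
        problems.append("value contains XML markup")
    if not value.strip('"').startswith(EXPECTED_PREFIX):
        problems.append("value does not start with LD_LIBRARY_PATH=")
    return problems
```

Applied to the settings found on the cluster, this reports all three problems. For the corrected lines it returns an empty list.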

Cause

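The Tez launch environment is not configured properly for Datameer jobs: the Tez application master and task containers start without LD_LIBRARY_PATH pointing to the Hadoop native library directory, so the native Snappy support cannot be loaded.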
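
Before changing the properties, it can help to confirm that the directory they point to actually contains the native libraries on the cluster nodes. The following is a rough sketch under stated assumptions: the library name prefixes libhadoop.so and libsnappy.so are assumed and may differ by distribution, and missing_native_libs is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Assumption: library name prefixes to look for. Exact file names and version
# suffixes vary by Hadoop distribution.
EXPECTED_LIB_PREFIXES = ("libhadoop.so", "libsnappy.so")


def missing_native_libs(native_dir: str) -> List[str]:
    """Return the expected library prefixes with no matching file in native_dir."""
    directory = Path(native_dir)
    if not directory.is_dir():
        # A missing directory means none of the libraries can be found.
        return list(EXPECTED_LIB_PREFIXES)
    names = [entry.name for entry in directory.iterdir()]
    return [
        prefix
        for prefix in EXPECTED_LIB_PREFIXES
        if not any(name.startswith(prefix) for name in names)
    ]
```

For example, missing_native_libs("/usr/lib/hadoop/lib/native") run on a worker node returns an empty list when files matching both prefixes are present.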

Solution

To resolve this issue, ensure that the following properties are set in the Datameer Custom Properties:

tez.am.launch.env=LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native
tez.task.launch.env=LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native

Note

The Tez properties must be specified without quotes ("). Use

tez.task.launch.env=LD_LIBRARY_PATH=... and NOT tez.task.launch.env="LD_LIBRARY_PATH=...

Once these properties are set, rerun the job to verify that the issue is resolved.