NoInputPathFoundException Generated when connecting to Hive data

Problem

When a user attempts to create a new Import Job or Data Link to a Hive data source, the following error is displayed in the Datameer GUI: "Unable to parse input sample: Could not find any existing file for".

The error message continues with details referencing the HDFS file path that was expected to be available but could not be accessed.

The following error is logged in the conductor.log file:

 [admin] ERROR [2014-01-01 00:00:00] (DefineFieldsController.java:216) - Can not parse input sample
 datameer.dap.sdk.importjob.NoInputPathFoundException: Could not find any existing file for
        at datameer.dap.sdk.importjob.FileSplitter.checkIfAnyFileFound(FileSplitter.java:113)
        at datameer.dap.sdk.importjob.FileSplitter.createPreviewSplits(FileSplitter.java:107)
        at datameer.dap.common.job.dapimport.ImportPreviewCreator$RecordSourceProviderImpl.iterator(ImportPreviewCreator.java:212)
        at datameer.dap.common.job.dapimport.ImportPreview.parseRawRecords(ImportPreview.java:143)
        at datameer.dap.common.job.dapimport.ImportPreview.parse(ImportPreview.java:115)
        at datameer.dap.conductor.webapp.controller.data.DefineFieldsController.parseRecords(DefineFieldsController.java:211)
        at datameer.dap.conductor.webapp.controller.data.DefineFieldsController.pushRecordsAndLogUrl(DefineFieldsController.java:187)
        at datameer.dap.conductor.webapp.controller.data.DefineFieldsController.provideRecords(DefineFieldsController.java:88)
        at datameer.dap.conductor.webapp.controller.data.importjob.DataSourceW3DefineFieldsController.showDefineFields(DataSourceW3DefineFieldsController.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
        at java.lang.reflect.Method.invoke(Method.java:611)
		...

Cause

The cause of this issue is an incompatibility between native libraries: the native libraries on the Hadoop cluster differ from those shipped with the Datameer installation.

Solution

To resolve this issue, compare the native libraries of the Datameer installation with those on the Hadoop cluster.

  1. List the library files contained in the <DM_INSTALLATION_DIR>/lib/native directory. The path may also include an architecture-specific subdirectory. To confirm that the location is correct, the listing should include libhadoop.so.
  2. Make a copy of these library files for comparison. For example, copy them to /tmp/datameer.
  3. Find the lib/native path on the Hadoop cluster host members (e.g. find / -name "libhadoop.so").
  4. Make a copy of these library files for comparison. For example, copy them to /tmp/hadoop on the Datameer application server.
  5. For each native library file found in the Datameer native library directory, run a "diff" command to determine whether the binary files match exactly or differ (e.g. diff /tmp/datameer/libhadoop.so /tmp/hadoop/libhadoop.so).
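
The comparison in steps 1-5 can be scripted. The sketch below is illustrative, not part of the product: the directory paths are the example copy locations from the steps above, and the function name compare_native_libs is a placeholder of our own.

```shell
#!/bin/sh
# Compare every native library copied from the Datameer installation against
# the copy taken from the Hadoop cluster. Substitute your actual copy
# directories (the examples below follow steps 2 and 4: /tmp/datameer and
# /tmp/hadoop).
compare_native_libs() {
    dm_dir="$1"      # copies of the Datameer native libraries
    hadoop_dir="$2"  # copies of the cluster native libraries

    for lib in "$dm_dir"/*; do
        name=$(basename "$lib")
        if [ ! -f "$hadoop_dir/$name" ]; then
            # Library exists on the Datameer side only
            echo "$name: missing on cluster side"
        elif diff -q "$lib" "$hadoop_dir/$name" >/dev/null 2>&1; then
            echo "$name: identical"
        else
            echo "$name: DIFFERS"
        fi
    done
}

# Example invocation:
# compare_native_libs /tmp/datameer /tmp/hadoop
```

Any line reporting "DIFFERS" (or a missing file) indicates a mismatch that should be raised with Datameer Support, as described below.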

If there are differences between the native libraries on the Datameer application server and on the Hadoop cluster, contact Datameer Support to discuss merging the native libraries from the Hadoop cluster onto the Datameer application server.