How to Synchronize Hadoop Libraries to Datameer with Cloudera

Goal

This article describes how to synchronize Hadoop library files to a Datameer instance. This process is needed when Datameer is not built for the specific maintenance release or patch version of Hadoop that the cluster is running.

Learn

To ensure the highest level of compatibility between Datameer and a Hadoop cluster, Datameer recommends synchronizing the Hadoop library files from the cluster to the Datameer server.

This synchronization is already done for the Hadoop versions that Datameer distributes binaries for. However, the library files may become inconsistent if patches or maintenance releases are applied to the Hadoop cluster beyond the version that Datameer was built for.

To begin, start with the Datameer build for the Hadoop version that is closest to, but earlier than, the version running on your cluster. For example, if your cluster is running CDH 5.8.5 and Datameer provides binaries for CDH 5.8.0, 5.8.4, and 5.9.0, start from the CDH 5.8.4 build, since it is the closest version that precedes your target version.
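
The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function names `parse_version` and `pick_base_version` are hypothetical, and the sketch assumes version strings are plain dotted numbers such as "5.8.4".

```python
def parse_version(text):
    """Turn a dotted version string such as '5.8.4' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def pick_base_version(cluster_version, available_versions):
    """Return the highest available version that does not exceed the
    cluster version, or None if every available version is newer."""
    target = parse_version(cluster_version)
    candidates = [v for v in available_versions if parse_version(v) <= target]
    if not candidates:
        return None
    return max(candidates, key=parse_version)
```

With the versions from the example, `pick_base_version("5.8.5", ["5.8.0", "5.8.4", "5.9.0"])` returns `"5.8.4"`. Comparing tuples of integers rather than raw strings keeps a version such as 5.10.0 ordered after 5.9.0.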

To manually synchronize the Hadoop library files, please follow these steps: 

  1. Stop the Datameer service.
  2. Locate the following JAR files in the <Datameer>/webapps/conductor/WEB-INF/lib directory and move them outside of the Datameer directory structure as a backup:
    hadoop-common-*-cdh*.jar
    avro-*-cdh*.jar
    hadoop-hdfs-*-cdh*.jar
    hadoop-mapreduce-client-shuffle-*-cdh*.jar
    hadoop-mapreduce-client-app-*-cdh*.jar
    hadoop-openstack-*-cdh*.jar
    hadoop-annotations-*-cdh*.jar
    hadoop-mapreduce-client-common-*-cdh*.jar
    hadoop-yarn-api-*-cdh*.jar
    hadoop-auth-*-cdh*.jar
    hadoop-mapreduce-client-core-*-cdh*.jar
    hadoop-yarn-client-*-cdh*.jar
    hadoop-aws-*-cdh*.jar
    hadoop-mapreduce-client-hs-*-cdh*.jar
    hadoop-yarn-common-*-cdh*.jar
    hadoop-mapreduce-client-hs-plugins-*-cdh*.jar
    hadoop-yarn-server-common-*-cdh*.jar
    hadoop-distcp-*-cdh*.jar
    hadoop-mapreduce-client-jobclient-*-cdh*.jar
  3. Locate the corresponding JAR files in the current CDH distribution on the cluster and copy them to the <Datameer>/webapps/conductor/WEB-INF/lib directory.
  4. Start the Datameer service.
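
Steps 2 and 3 can be scripted. The following is a minimal Python sketch, not a Datameer-supplied tool: the function names `backup_jars` and `copy_cluster_jars` are hypothetical, the pattern list is abbreviated and should be extended with every pattern from step 2, and the directories passed in are whatever applies to your installation. Stopping and starting the Datameer service (steps 1 and 4) is left to the operator.

```python
import fnmatch
import shutil
from pathlib import Path

# Abbreviated for illustration; extend with every pattern listed in step 2.
JAR_PATTERNS = [
    "hadoop-common-*-cdh*.jar",
    "avro-*-cdh*.jar",
    "hadoop-hdfs-*-cdh*.jar",
]


def matches_any(name, patterns):
    """Return True if the file name matches at least one glob pattern."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def backup_jars(lib_dir, backup_dir, patterns=JAR_PATTERNS):
    """Step 2: move matching JARs out of lib_dir into backup_dir.

    Returns the sorted list of file names that were moved.
    """
    lib_dir, backup_dir = Path(lib_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for jar in sorted(lib_dir.iterdir()):
        if jar.is_file() and matches_any(jar.name, patterns):
            shutil.move(str(jar), str(backup_dir / jar.name))
            moved.append(jar.name)
    return moved


def copy_cluster_jars(cluster_jar_dir, lib_dir, patterns=JAR_PATTERNS):
    """Step 3: copy matching JARs from the cluster's JAR directory into lib_dir.

    Returns the sorted list of file names that were copied.
    """
    cluster_jar_dir, lib_dir = Path(cluster_jar_dir), Path(lib_dir)
    copied = []
    for jar in sorted(cluster_jar_dir.iterdir()):
        if jar.is_file() and matches_any(jar.name, patterns):
            shutil.copy2(str(jar), str(lib_dir / jar.name))
            copied.append(jar.name)
    return copied
```

A typical run would call `backup_jars` with the <Datameer>/webapps/conductor/WEB-INF/lib directory and a backup directory outside the Datameer installation, then call `copy_cluster_jars` with the directory that holds the CDH JARs on your cluster. Because the patterns are broad globs, they can match more files than intended (for example, `avro-*-cdh*.jar` also matches other Avro artifacts), so it is worth reviewing both returned lists before starting the Datameer service.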

Please contact Datameer Technical Support for additional information.