Operation Procedure - Manual failover between HDFS NameNodes (NN)

Problem

After a manual failover between HDFS NameNodes (NN), you see errors like:

NoInputPathFoundException: Could not find any existing file for hdfs://<namenode>:8020/user/datameer/workbooks/<configID>/<execID>/<workSheetName>/data 

The Hadoop cluster is fully operational but random Datameer jobs fail with the error above.

 

Cause

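Datameer records the URI of the NN that was active at the time of workbook execution. After a failover, those stored references still point to the NN that is no longer active, so jobs that read the affected data fail.

You can check which NameNodes (NN) are referenced within the database via: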

mysql -udap -pdap dap -Bse "SELECT uri FROM data" | cut -d"/" -f3 | sort | uniq 

If you are affected by this issue, the output lists at least two distinct NameNode addresses (host:port).
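
To make the output of the check easier to read, here is a minimal sketch of what the cut/sort/uniq part of the pipeline does. It runs on made-up sample URIs rather than on the real database; the host names and the helper name distinct_namenodes are assumptions for illustration only.

```shell
# Hypothetical helper: reads data URIs on stdin and prints each distinct
# NameNode address (the third "/"-separated field, i.e. host:port) once.
distinct_namenodes() {
  cut -d"/" -f3 | sort | uniq
}

# Made-up sample rows standing in for the output of "SELECT uri FROM data".
sample_uris='hdfs://nn1.example.com:8020/user/datameer/workbooks/1/10/Sheet1/data
hdfs://nn2.example.com:8020/user/datameer/workbooks/2/11/Sheet1/data
hdfs://nn1.example.com:8020/user/datameer/workbooks/3/12/Sheet1/data'

namenodes=$(printf '%s\n' "$sample_uris" | distinct_namenodes)
printf '%s\n' "$namenodes"
```

Two lines of output, as in this sample, mean that the stored paths reference two different NameNodes.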

 

Solution

Update the paths stored in the Datameer application database so that they point to the currently active NN. The first argument is the old path prefix, the second is the new one:

./bin/update_paths.sh hdfs://<passiveNameNode>:8020/user/datameer hdfs://<activeNameNode>:8020/user/datameer 
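
The following sketch illustrates the intended effect of the update on a single stored URI, assuming the script replaces the old path prefix with the new one. It does not show how update_paths.sh works internally; the host names and variable names are made up for illustration.

```shell
# Assumed old and new prefixes, in the same order as the script arguments.
old_prefix='hdfs://nn1.example.com:8020/user/datameer'
new_prefix='hdfs://nn2.example.com:8020/user/datameer'

# Made-up URI as it might be stored before the update.
uri='hdfs://nn1.example.com:8020/user/datameer/workbooks/1/10/Sheet1/data'

# Swap the prefix only if the URI actually starts with the old prefix.
case "$uri" in
  "$old_prefix"/*) rewritten="$new_prefix${uri#"$old_prefix"}" ;;
  *)               rewritten="$uri" ;;
esac

echo "$rewritten"
```

Only the NameNode part changes; the rest of the path below /user/datameer stays the same.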

Verify that the new NN has been applied to the database and that the path is correct via:

mysql -udap -pdap dap -Bse "SELECT uri FROM data" | cut -d"/" -f3,4,5 | sort | uniq 
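
If you want to turn this check into a pass/fail test, a sketch along the following lines may help. The helper name single_location is hypothetical and the sample URIs are made up; the helper only counts how many distinct NameNode and base path combinations appear on its input.

```shell
# Hypothetical helper: succeeds only if every URI on stdin shares the same
# NameNode address and base path (fields 3 to 5 when split on "/").
single_location() {
  count=$(cut -d"/" -f3,4,5 | sort | uniq | wc -l | tr -d ' ')
  [ "$count" -eq 1 ]
}

# Made-up sample rows as they might look after a successful update.
if printf '%s\n' \
    'hdfs://nn2.example.com:8020/user/datameer/workbooks/1/10/Sheet1/data' \
    'hdfs://nn2.example.com:8020/user/datameer/workbooks/2/11/Sheet1/data' \
    | single_location; then
  after_update="consistent"
else
  after_update="mixed"
fi

echo "$after_update"
```

Against the real database, the output of the mysql command above could be piped into single_location in place of the sample rows. Note that the helper also fails when it receives no rows at all.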

The above steps are the recommended standard operating procedure for a manual failover between HDFS NNs.

Further Information

To avoid this manual step, you can configure Datameer for HDFS High Availability (HA).