How to Troubleshoot Performance Degradations

Goal

How to troubleshoot performance degradations.

Intention

When performance degrades, it is often unclear which components are involved and what needs to be checked. This article provides some guidance on what to collect and review.

Learn

Datameer

  • Note the software and release versions of the involved operating system (OS), Datameer (DM) package, cluster distribution (HDP, CDH, ...), and browser.
  • Capture logs as early as possible, including the application log file and job trace(s). Double-check that they contain the correct information, and force collection of the Hadoop task logs if they are missing. From <datameer-install-path>/logs, include useraction.log, metrics.log.*, metrics-jamon.log.*, jvm-stdout.log, httperror.log, <date>.stderrout.log, and hadoop-metrics.log.*.
  • Capture the configuration files from <datameer-install-path>/logs and <datameer-install-path>/etc, and check them for valid values.
  • Create a heap dump of the Datameer Java process. It might be necessary to force it.
  • Gather a database dump.
  • Gather Datameer Hadoop Cluster configuration.
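
The log collection step above can be sketched as a small Python helper. This is only an illustration, not a Datameer tool: the function names select_logs and bundle_logs are hypothetical, and the pattern "*.stderrout.log" is an assumption about how the <date> prefix of <date>.stderrout.log can be matched. The application log file is not included because its name is not given above.

```python
import fnmatch
import tarfile
from pathlib import Path

# File names taken from the log list above. "*.stderrout.log" stands in
# for "<date>.stderrout.log" (assumption about the date prefix).
LOG_PATTERNS = [
    "useraction.log",
    "metrics.log.*",
    "metrics-jamon.log.*",
    "jvm-stdout.log",
    "httperror.log",
    "*.stderrout.log",
    "hadoop-metrics.log.*",
]


def select_logs(log_dir, patterns=LOG_PATTERNS):
    """Return the files in log_dir whose names match any pattern, sorted."""
    return sorted(
        p for p in Path(log_dir).iterdir()
        if p.is_file() and any(fnmatch.fnmatch(p.name, pat) for pat in patterns)
    )


def bundle_logs(log_dir, archive_path):
    """Write the selected logs into a gzip-compressed tar archive.

    Returns the names of the files that were added.
    """
    files = select_logs(log_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            # Store only the file name so the archive has a flat layout.
            tar.add(f, arcname=f.name)
    return [f.name for f in files]
```

Calling bundle_logs with the path to <datameer-install-path>/logs and a target file name would produce a single archive that can be attached to a support request. Writing the archive outside the logs directory avoids mixing it with the files being collected.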

Environment

  • Gather core-site.xml and hdfs-site.xml from the cluster.
  • Review the settings in job-conf-cluster.xml from the available job traces.
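
One way to review these files is to compare the settings a job actually ran with against the files gathered from the cluster. The sketch below assumes that job-conf-cluster.xml uses the same <configuration>/<property>/<name>/<value> layout as core-site.xml and hdfs-site.xml, which should be verified against a real job trace. The function names load_hadoop_conf and diff_conf are hypothetical.

```python
import xml.etree.ElementTree as ET


def load_hadoop_conf(path):
    """Parse a Hadoop-style XML configuration file into a {name: value} dict."""
    root = ET.parse(path).getroot()
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is stored as an empty string.
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def diff_conf(left, right):
    """Return {name: (left_value, right_value)} for settings that differ.

    A setting that is missing on one side is reported as None on that side.
    """
    return {
        key: (left.get(key), right.get(key))
        for key in sorted(set(left) | set(right))
        if left.get(key) != right.get(key)
    }
```

For example, diff_conf(load_hadoop_conf("hdfs-site.xml"), load_hadoop_conf("job-conf-cluster.xml")) lists every property whose value differs between the two files. A job configuration usually contains many more properties than a single site file, so expect a long list of entries with None on the left side.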

Browser

Data Flow

  • Stop unnecessary imports, jobs, and workbooks, and remove them from the queue.
  • Depending on the issue, get the data pipeline (the workbook and any upstream workbooks, import jobs, and data links) as JSON, along with sample data.
  • Capture screenshots.
  • Identify which workbooks or sheets the issue occurs on and which it does not occur on, to try to detect a predictable pattern.
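
The pattern search in the last step can be made more systematic by writing down a few traits per workbook and checking which traits appear only on the affected ones. The sketch below is a simple heuristic rather than a diagnosis. The function name distinguishing_traits and the trait names in the example ("source", "joins") are made up for illustration.

```python
def distinguishing_traits(observations):
    """Find (attribute, value) pairs seen only on affected workbooks.

    observations: list of (traits, is_affected) tuples, where traits is a
    dict such as {"source": "hive", "joins": "yes"} with hashable values.
    """
    affected, unaffected = set(), set()
    for traits, is_affected in observations:
        target = affected if is_affected else unaffected
        target.update(traits.items())
    # Keep only pairs that never show up on an unaffected workbook.
    return sorted(affected - unaffected)


# Hypothetical notes on three workbooks.
notes = [
    ({"source": "hive", "joins": "yes"}, True),
    ({"source": "hive", "joins": "no"}, False),
    ({"source": "csv", "joins": "yes"}, True),
]
print(distinguishing_traits(notes))
```

With these example notes, the result contains ("joins", "yes") and ("source", "csv"), because neither pair occurs on the unaffected workbook. With only a handful of observations such a result is a hint about where to look next, not proof of the cause.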

Contact Datameer support. It might be necessary to proceed with further data collection for root cause analysis. Providing as much information as possible helps us resolve the issue quickly.