XMLStreamException: ParseError ... Message: JAXP00010001: The parser has encountered more than "64000" entity expansions

Problem

When parsing a large XML file, the Datameer import fails with the following error:

XMLStreamException: ParseError at [row,col]:[1,1] Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.

The XML file is about 65 MB compressed, with about 670,000 records, and only about 64,000 records get imported.

Cause

This is a bug in JDK 7u45 that was fixed in JDK 7u55. For more information, see DAP-20704 and the article JAXP00010001: The parser has encountered more than "64000" entity expansion.

Solution

Either upgrade the JDK to 7u55 or, as a workaround, add the Java parameter -Djdk.xml.entityExpansionLimit=0 to Datameer. If the cluster also runs the affected JDK, this parameter must be added on all data nodes as well.
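
To confirm that a JVM actually picks up the parameter, a small stand-alone check can help. The following is a minimal sketch, not part of Datameer: the class name EntityLimitCheck and its helper methods are hypothetical and used only for illustration. It prints the running Java version, reports how the jdk.xml.entityExpansionLimit system property is set, and parses a tiny XML string with the JDK's StAX parser as a sanity check.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class for illustration; it is not shipped with Datameer.
public class EntityLimitCheck {

    // System property named in the workaround above.
    static final String LIMIT_PROPERTY = "jdk.xml.entityExpansionLimit";

    // Turns the raw property value into a human-readable description.
    static String describeLimit(String value) {
        if (value == null) {
            return "not set (JDK default applies)";
        }
        try {
            long limit = Long.parseLong(value.trim());
            // A value of 0 or less means the parser enforces no limit.
            return limit <= 0 ? "no limit" : "limit of " + limit;
        } catch (NumberFormatException e) {
            return "not a number: " + value;
        }
    }

    // Counts start elements with StAX to confirm the parser runs.
    static int countStartElements(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println(LIMIT_PROPERTY + ": "
                + describeLimit(System.getProperty(LIMIT_PROPERTY)));
        System.out.println("start elements parsed: "
                + countStartElements("<records><record/><record/></records>"));
    }
}
```

Running it as java -Djdk.xml.entityExpansionLimit=0 EntityLimitCheck should report "no limit"; running it without the parameter reports that the property is not set. Where the parameter is added for Datameer itself and for the cluster nodes depends on the installation.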