Housekeeping job stopped working after upgrading from 5.x to 7.1.x
After upgrading from a 5.11.x release to a 7.1.x release, housekeeping fails with the following exception:
[system] ERROR [2019-01-11 11:40:56.838] [HousekeepingService thread-1] (EventPublishingResult.java:30) - Name 'unsaved 2019-05-02 11:27:52' contains invalid characters. Allowed characters are a-z, A-Z, 0-9, '_', ' '.
java.lang.IllegalStateException: Name 'unsaved 2019-05-02 11:27:52' contains invalid characters. Allowed characters are a-z, A-Z, 0-9, '_', ' '.
Over time, this results in high file system utilization because old data is not cleaned up.
Upgrading Datameer from 5.11 or lower to 7.1.x or higher can create orphaned data sources in the database.
When a new workbook is created in Datameer 5.11, a temporary file name is saved to the Datameer MySQL database in the following format:
unsaved yyyy-MM-dd hh:mm:ss
As of Datameer 7.1, the internal naming convention has changed, and the dash "-" is no longer allowed in file names.
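To see why the old temporary names are rejected, the check can be approximated with a regular expression. This is a minimal sketch, not Datameer's actual validation code: the pattern ALLOWED_NAME and the functions is_valid_name and invalid_characters are hypothetical, derived only from the allowed characters listed in the error message above.

```python
import re

# Hypothetical pattern built from the error message: a-z, A-Z, 0-9, '_', ' '.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_ ]+")


def is_valid_name(name):
    """Return True if the name uses only the allowed characters."""
    return ALLOWED_NAME.fullmatch(name) is not None


def invalid_characters(name):
    """Return the distinct disallowed characters, in order of first appearance."""
    seen = []
    for ch in name:
        if not ALLOWED_NAME.fullmatch(ch) and ch not in seen:
            seen.append(ch)
    return seen


name = "unsaved 2019-05-02 11:27:52"
print(is_valid_name(name))       # False
print(invalid_characters(name))  # ['-', ':']
```

Under this reading of the error message, the colon in the time portion also falls outside the allowed set, in addition to the dash.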
First, list all affected artifacts by running the following MySQL query:
SELECT *
FROM dap_job_configuration
WHERE id NOT IN (SELECT id FROM data_source_configuration)
  AND id NOT IN (SELECT id FROM workbook_configuration)
  AND id NOT IN (SELECT id FROM data_sink_configuration);
Note the id and dap_file__id values returned.
To allow housekeeping to run again, we need to delete the orphaned data sources.
Execute the following queries after replacing the <id_n> and <dap_file__id_n> placeholders with the id and dap_file__id values noted above. Each list accepts any number of comma-separated entries. After the queries have run, verify that housekeeping is running again.
DELETE FROM job_configuration_property WHERE configuration_fk IN (<id_1>, <id_2>, <id_n>);
DELETE FROM dap_job_configuration WHERE dap_file__id IN (<dap_file__id_1>, <dap_file__id_2>, <dap_file__id_n>);
DELETE FROM dap_file WHERE id IN (<dap_file__id_1>, <dap_file__id_2>, <dap_file__id_n>);
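Filling in the placeholders by hand is error-prone when many rows are affected. The following is a minimal sketch that builds the three statements above from the noted values. The helper build_cleanup_statements is hypothetical and not part of Datameer, and it assumes both id columns hold integers. It only prints SQL for review and does not connect to the database.

```python
def _id_list(values):
    """Render ids as a comma-separated list of integers; empty input is rejected."""
    ids = [int(v) for v in values]  # int() raises ValueError on non-numeric strings
    if not ids:
        raise ValueError("at least one id is required")
    return ", ".join(str(i) for i in ids)


def build_cleanup_statements(job_ids, dap_file_ids):
    """Return the three DELETE statements in the order given in this article."""
    jobs = _id_list(job_ids)
    files = _id_list(dap_file_ids)
    return [
        f"DELETE FROM job_configuration_property WHERE configuration_fk IN ({jobs});",
        f"DELETE FROM dap_job_configuration WHERE dap_file__id IN ({files});",
        f"DELETE FROM dap_file WHERE id IN ({files});",
    ]


# Example values are placeholders, not real ids from any installation.
for statement in build_cleanup_statements([101, 102], [201, 202]):
    print(statement)
```

Converting every value with int() before building the lists keeps stray text from the query output out of the generated statements. Review the printed SQL before running it against the Datameer database.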