Some Workbooks not running after update to Datameer 6.3.3

After upgrading Datameer from 6.1.23 to 6.3.3, some workbooks can no longer be processed. Editing the workbooks in the browser works without problems, but running them on the cluster fails with an error like this:

INFO [2017-11-23 12:58:11.501] [JobScheduler thread-1] (JobScheduler.java:412) - Starting job 3386275 (DAS Version: 6.3.3, Revision: 02d087faefb94aca6628b6b6a67ff5ed67833d8c, Hadoop-Distribution: 2.6.0-cdh5.11.0 (cdh-5.11.0), JVM: 1.7)
INFO [2017-11-23 12:58:11.504] [JobScheduler thread-1] (NormalJobDriver.java:124) - Checking if JobExecutionValueObject{_id=3386275} can be started
INFO [2017-11-23 12:58:11.537] [JobScheduler thread-1] (JobScheduler.java:444) - [Job 3386275] Preparing job in job scheduler thread for WorkbookConfigurationImpl{id=1770}...
INFO [2017-11-23 12:58:11.704] [JobScheduler thread-1] (JobScheduler.java:447) - [Job 3386275] Preparing job in job scheduler thread for WorkbookConfigurationImpl{id=1770}... done (0 sec)
INFO [2017-11-23 12:58:11.707] [JobScheduler worker1-thread-24438] (JobSchedulerJob.java:89) - [Job 3386275] Preparing job for WorkbookConfigurationImpl{id=1770}...
INFO [2017-11-23 12:58:11.731] [JobScheduler worker1-thread-24438] (JobSchedulerJob.java:94) - [Job 3386275] Preparing job for WorkbookConfigurationImpl{id=1770}... done (0 sec)
INFO [2017-11-23 12:58:11.739] [JobScheduler worker1-thread-24438] (JobSchedulerJob.java:109) - Starting job ...
INFO [2017-11-23 12:58:11.775] [JobScheduler worker1-thread-24438] (WorkbookJob.java:231) - Registering operations for sheet 'EXOP' (keep=false) with datameer.dap.common.sheet.StaticDataSheetBuilder@e00635e
INFO [2017-11-23 12:58:11.787] [JobScheduler worker1-thread-24438] (WorkbookJob.java:231) - Registering operations for sheet 'EXOP_Dedup' (keep=false) with GroupBySheetBuilder{EXOP_Dedup, source-sheet=EXOP, key-expressions=[GROUPBY(#EXOP!ID)]}
INFO [2017-11-23 12:58:11.787] [JobScheduler worker1-thread-24438] (WorkbookJob.java:231) - Registering operations for sheet 'splitMessages' (keep=false) with FormulaSheetBuilder{splitMessages, source-sheet=EXOP_Dedup, expressions=ExpressionContext[column='ID', id='0', index=0, expression=COPY(#EXOP_Dedup!ID)], ExpressionContext[column='Raw', id='1', index=5, expression=IF(STARTSWITH(#EXOP_Dedup!Messages;"[");EXPAND(JSONTOLIST(#EXOP_Dedup!Messages));#EXOP_Dedup!Messages)], ExpressionContext[column='Message_ID', id='2', index=1, expression=INT(JSON_VALUE(#Raw;"ID"))], ...}
INFO [2017-11-23 12:58:11.790] [JobScheduler worker1-thread-24438] (WorkbookJob.java:231) - Registering operations for sheet 'splitTextBlocks' (keep=false) with GroupBySheetBuilder{splitTextBlocks, source-sheet=splitMessages, key-expressions=[GROUPBY(#splitMessages!Message_ID)]}
INFO [2017-11-23 12:58:11.791] [JobScheduler worker1-thread-24438] (WorkbookJob.java:231) - Registering operations for sheet 'splitPoints' (keep=false) with FormulaSheetBuilder{splitPoints, source-sheet=EXOP_Dedup, expressions=ExpressionContext[column='ID', id='0', index=0, expression=COPY(#EXOP_Dedup!ID)], ExpressionContext[column='Points_Raw', id='1', index=10, expression=IF(STARTSWITH(#EXOP_Dedup!Features_Points;"[");EXPAND(JSONTOLIST(#EXOP_Dedup!Features_Points));#EXOP_Dedup!Features_Points)], ExpressionContext[column='Point_Geometry', id='2', index=1, expression=JSON_VALUE(#Points_Raw;"Geometry")], ...}
ERROR [2017-11-23 12:58:12.748] [JobScheduler thread-1] (JobScheduler.java:885) - Job 3386275 failed with exception.
datameer.com.google.common.base.VerifyException: RecordSource for sheet 'splitPoints' (das.internal.FormulaSheetType) of incorrect type. Is RecordType{[STRING, STRING, INTEGER, STRING, FLOAT, FLOAT, STRING, STRING]} but expected RecordType{[INTEGER, STRING, FLOAT, FLOAT, STRING, STRING, STRING, STRING]}.
at datameer.com.google.common.base.Verify.verify(Verify.java:125)
at datameer.dap.common.job.WorkbookJob.registerJobOperations(WorkbookJob.java:238)
at datameer.dap.common.job.DatameerJob.createExecutionPlan(DatameerJob.java:78)
at datameer.dap.common.job.DasJobCallable.call(DasJobCallable.java:107)
at datameer.dap.common.job.DasJobCallable.call(DasJobCallable.java:69)
at datameer.dap.conductor.jobscheduler.JobSchedulerJob$1.call(JobSchedulerJob.java:110)
at datameer.dap.conductor.jobscheduler.JobSchedulerJob$1.call(JobSchedulerJob.java:77)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:117)
at datameer.dap.conductor.jobscheduler.JobSchedulerJob.call(JobSchedulerJob.java:77)
at datameer.dap.conductor.jobscheduler.JobSchedulerJob.call(JobSchedulerJob.java:40)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
INFO [2017-11-23 12:58:12.782] [JobScheduler thread-1] (JobScheduler.java:960) - Computing after job completion operations for execution 3386275 (type=NORMAL)
INFO [2017-11-23 12:58:12.782] [JobScheduler thread-1] (JobScheduler.java:964) - Finished computing after job completion operations for execution 3386275 (type=NORMAL) [0 sec]
WARN [2017-11-23 12:58:12.787] [JobScheduler thread-1] (JobScheduler.java:782) - Job DapJobExecution{id=3386275, type=NORMAL, status=ERROR} completed with status ERROR.
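
The exception reports two record types for sheet 'splitPoints'. One quick way to see how they differ is to parse both lists out of the message and compare them position by position. The sketch below is only a diagnostic aid and not part of Datameer; the helper names `record_types` and `mismatches` are made up for illustration.

```python
import re

# Exception message copied from the log above.
ERROR = (
    "RecordSource for sheet 'splitPoints' (das.internal.FormulaSheetType) of incorrect type. "
    "Is RecordType{[STRING, STRING, INTEGER, STRING, FLOAT, FLOAT, STRING, STRING]} "
    "but expected RecordType{[INTEGER, STRING, FLOAT, FLOAT, STRING, STRING, STRING, STRING]}."
)


def record_types(message):
    """Return the (actual, expected) column type lists found in the message."""
    found = re.findall(r"RecordType\{\[([^\]]*)\]\}", message)
    if len(found) != 2:
        raise ValueError("expected exactly two RecordType lists in the message")
    actual, expected = ([t.strip() for t in f.split(",")] for f in found)
    return actual, expected


def mismatches(actual, expected):
    """Return (position, actual type, expected type) for every differing position."""
    return [(i, a, e) for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


actual, expected = record_types(ERROR)
for i, a, e in mismatches(actual, expected):
    print(f"position {i}: is {a}, expected {e}")

# True when both lists hold the same types and only the order differs.
print("same types in different order:", sorted(actual) == sorted(expected))
```

For the message above, this reports differences at positions 0, 2, 3, 4 and 5, while both lists contain the same types (five STRING, two FLOAT, one INTEGER). That would be consistent with a column-ordering problem rather than a changed column type, though it does not prove it.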
Michael Ahn

7 comments

  • Michael Ahn

    This is the sheet summary of sheet splitPoints from the browser:

    Column 1 : =COPY(#EXOP_Dedup!ID)
    Column 2 : =JSON_VALUE(#Points_Raw;"Geometry")
    Column 3 : =FLOAT(JSON_VALUE(#Koordinates;"x"))
    Column 4 : =FLOAT(JSON_VALUE(#Koordinates;"y"))
    Column 5 : =JSON_VALUE(#Points_Raw;"Accuracy")
    Column 6 : =JSON_VALUE(#Points_Raw;"Countrywide")
    Column 7 : =JSON_VALUE(#Points_Raw;"Country")
    Column 8 : =JSON_VALUE(#Points_Raw;"Admin0")
    Column 9 : =JSON_VALUE(#Points_Raw;"Admin1")
    Column 10 : =JSON_VALUE(#Points_Raw;"Admin2")
    Column 11 : =IF(STARTSWITH(#EXOP_Dedup!Features_Points;"[");EXPAND(JSONTOLIST(#EXOP_Dedup!Features_Points));#EXOP_Dedup!Features_Points)
    Column 12 : =WKB2JSON(#Point_Geometry)
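
For comparison, the log line for 'splitPoints' lists an id and an index for each ExpressionContext. The sketch below pulls those pairs out of the (truncated) log excerpt and flags columns where the two values differ. The regular expression and the helper name `id_index_pairs` are assumptions based only on the log format shown in the original post.

```python
import re

# Excerpt of the 'splitPoints' log line from the original post (truncated there with "...").
LOG_LINE = r"""FormulaSheetBuilder{splitPoints, source-sheet=EXOP_Dedup, expressions=ExpressionContext[column='ID', id='0', index=0, expression=COPY(#EXOP_Dedup!ID)], ExpressionContext[column='Points_Raw', id='1', index=10, expression=IF(STARTSWITH(#EXOP_Dedup!Features_Points;"[");EXPAND(JSONTOLIST(#EXOP_Dedup!Features_Points));#EXOP_Dedup!Features_Points)], ExpressionContext[column='Point_Geometry', id='2', index=1, expression=JSON_VALUE(#Points_Raw;"Geometry")], ...}"""

# Assumed pattern: matches the start of each ExpressionContext entry as printed in the log.
PATTERN = re.compile(r"ExpressionContext\[column='([^']*)', id='(\d+)', index=(\d+)")


def id_index_pairs(line):
    """Return (column name, id, index) for every ExpressionContext in the line."""
    return [(name, int(cid), int(idx)) for name, cid, idx in PATTERN.findall(line)]


pairs = id_index_pairs(LOG_LINE)
diverging = [(name, cid, idx) for name, cid, idx in pairs if cid != idx]

for name, cid, idx in pairs:
    marker = "differs" if cid != idx else "same"
    print(f"{name}: id={cid}, index={idx} ({marker})")
```

In the excerpt, 'Points_Raw' has id 1 but index 10, and 'Point_Geometry' has id 2 but index 1. If index is zero-based, that matches this summary (Column 11 and Column 2), so index appears to be the display position. The log excerpt is truncated, so the remaining columns cannot be checked this way.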
  • Joel Stewart

    Thank you for sharing, Michael. Within the JSON, I could see a difference between the columnID and the columnIndex fields for the splitPoints sheet. As a test, I manually updated these values so that they match each other. The updated workbook is available here: https://datameer.box.com/s/qf2s2k7gpyji4d0w35deskovjqy97q3i

    Can you let us know if this resolves the issue? If so, we'll also work on a reproduction to resolve this programmatically moving forward. 

  • Michael Ahn

    I uploaded your modified workbook, but the error stays the same when I try to run it on the cluster.

  • Joel Stewart

    Thanks for testing that version of the workbook. I'm disappointed that it did not provide any relief. Do you have a MySQL backup of the dap database from before the upgrade? With it, we could take a closer look and try to reproduce this in our lab environment.

  • Michael Ahn

    I have the backup of the database before the upgrade. Please contact me directly to get the download link. I don't want to post it on a public forum.
