Datameer commands & Hive queries



I'm just wondering if there is a way to translate Datameer commands (applied to worksheets) into Hive queries. For instance, say I have sheet0 with column A and sheet1 with column B, which is obtained by groupBy(#data!A). In Hive, this would be



select A from datasource
group by A


no? Is there a way to automatically generate such Hive queries for the formulas in a worksheet? Even for just a fragment of Datameer commands, it would still be very helpful. I don't need the resulting queries to be optimised; I just need a "correct" translation, or say, a compiler. Any opinion on this?
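
To make concrete what I mean by a translation for a fragment, here is a minimal Python sketch that handles only a single groupBy formula. Everything in it is an assumption for illustration: the `translate` function, the regular expression for the formula syntax, and the sheet-to-table mapping are hypothetical and not part of any Datameer API. It selects only the grouped column, since Hive generally rejects non-aggregated columns that are not in the GROUP BY clause.

```python
import re

# Hypothetical pattern for one formula shape: groupBy(#sheet!column).
# This is an assumed syntax for illustration, not a Datameer grammar.
GROUP_BY = re.compile(r"^\s*groupBy\(\s*#(\w+)!(\w+)\s*\)\s*$", re.IGNORECASE)


def translate(formula, table_for_sheet):
    """Translate a single groupBy formula into a Hive query string.

    table_for_sheet maps a sheet name to the Hive table backing it
    (an assumed mapping that the user would have to supply).
    """
    match = GROUP_BY.match(formula)
    if match is None:
        raise ValueError(f"unsupported formula: {formula!r}")
    sheet, column = match.groups()
    table = table_for_sheet[sheet]  # raises KeyError for an unmapped sheet
    # Backticks quote identifiers in Hive.
    return f"SELECT `{column}` FROM `{table}` GROUP BY `{column}`"


print(translate("groupBy(#data!A)", {"data": "datasource"}))
```

Any formula outside this one shape raises a ValueError, which is the behaviour I would expect from a translator that only covers a fragment.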

Thank you,


Lacramioara Astefanoaei


    Alan Mark


    Unfortunately, I believe this is outside the scope of the workbook architecture. Additionally, the formulas entered in a workbook don't directly reflect what will be generated. Depending on the cluster size and number of nodes, we may break queries up into smaller chunks for parallel execution. Remember, Datameer is a job compiler: we build a job and submit it to the Hadoop cluster. A primary goal is to remove the need to know or interact with SQL.

    The closest you could get would be to monitor Hive itself during job execution to see what SQL was used to retrieve the data.


    Alan Mark
