Precisely Controlling Hadoop Splits
In some situations, a job's performance may improve when the number of splits for that job is increased or decreased.
The following parameters are available to control the splitting of a job. Each of these can be set in the Custom Hadoop Properties for the job:
- das.splitting.max-split-count - Explicitly set the maximum number of splits for a particular job.
- das.splitting.min-split-count - Explicitly set the minimum number of splits for a particular job.
- das.splitting.min-split-size-hard - Explicitly set the minimum split size for a particular job.
- mapred.max.split.size - Set the maximum size of a split, in bytes.
- mapred.min.split.size - Set the minimum size of a split, in bytes.
Note that the two "mapred" properties above are native Hadoop properties; Datameer does not enforce them.
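As an illustration, the properties listed above could be entered in a job's Custom Hadoop Properties roughly as follows. This is a minimal sketch, not a recommended configuration: it assumes the field accepts one key=value pair per line, and every value shown is a hypothetical placeholder chosen only for the example. The "#" lines are annotations for the reader and may need to be removed if the field does not accept comments. das.splitting.min-split-size-hard is left out because its unit is not stated here.

```properties
# Hypothetical values for illustration only; tune them for the actual job.
# Bound the number of splits for this job.
das.splitting.max-split-count=200
das.splitting.min-split-count=20
# Native Hadoop properties, in bytes (256 MB and 64 MB); not enforced by Datameer.
mapred.max.split.size=268435456
mapred.min.split.size=67108864
```

Raising the minimum split count or lowering the maximum split size generally produces more, smaller splits; doing the opposite produces fewer, larger ones.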