requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941


We are installing Datameer on an EC2 Cloudera cluster at the partner training. We ran into the following error: "requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941". We increased the YARN container memory to more than 3 GB, but we still get the issue. We are thinking of setting the following custom properties to reduce the requested memory. Is this the correct approach? What else can we change in Datameer or on the cluster? Thanks!

das.job.map-task.memory=1024

das.job.reduce-task.memory=1024

das.job.application-manager.memory=1024

tez.am.resource.memory.mb=1024

tez.task.resource.memory.mb=1024
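
For context on why those properties should help: the exception means the Tez application master asked YARN for a 2048 MB container, but the ResourceManager's per-container ceiling (`yarn.scheduler.maximum-allocation-mb`) is only 1941 MB. So either lower the request below 1941 MB (which `tez.am.resource.memory.mb=1024` and the other properties above do), or raise the ceiling on the cluster side. A sketch of the cluster-side fix in `yarn-site.xml` (the 4096 values are examples only; pick values that fit your node memory, and restart the ResourceManager/NodeManagers after changing them):

```xml
<!-- yarn-site.xml: raise the per-container ceiling so a 2048 MB request passes validation -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- example value; must not exceed yarn.nodemanager.resource.memory-mb -->
  <value>4096</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- total memory each NodeManager offers to containers -->
  <value>4096</value>
</property>
```

Note that "increasing the YARN container memory" only helps if it was the scheduler maximum (`yarn.scheduler.maximum-allocation-mb`) that was raised, not just the NodeManager total; if the scheduler maximum still reads 1941 in the error, that setting was not picked up.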


INFO [2016-02-23 18:38:10.843] [JobScheduler thread-1] (JobScheduler.java:382) - Starting job 7 (DAS Version: 5.11.5, Revision: fb871300daf4cb9ecc756d45993ac951c05710f1, Hadoop-Distribution: 2.6.0-cdh5.5.0 (cdh-5.5.0-mr2), JVM: 1.7)
INFO [2016-02-23 18:38:10.902] [JobScheduler thread-1] (JobScheduler.java:408) - [Job 7] Preparing job in job scheduler thread for SystemJobConfiguration{id=1}...
INFO [2016-02-23 18:38:10.903] [JobScheduler thread-1] (JobScheduler.java:411) - [Job 7] Preparing job in job scheduler thread for SystemJobConfiguration{id=1}... done (0 sec)
INFO [2016-02-23 18:38:10.910] [JobScheduler worker1-thread-1] (JobSchedulerJob.java:94) - [Job 7] Preparing job for SystemJobConfiguration{id=1}...
INFO [2016-02-23 18:38:10.913] [JobScheduler worker1-thread-1] (JobSchedulerJob.java:99) - [Job 7] Preparing job for SystemJobConfiguration{id=1}... done (0 sec)
INFO [2016-02-23 18:38:10.977] [JobScheduler worker1-thread-1] (JobSchedulerJob.java:120) - Starting job ...
INFO [2016-02-23 18:38:11.096] [JobScheduler worker1-thread-1] (SmartClusterJobFlowReporter.java:34) - {"jobName":"System job (7): check_system_job","graph":"digraph G {\n \"89f92a35-67c2-440b-b5de-c2f9f6c7b348\" [label \u003d \"RecordStream{sheetName\u003dcheck_system_job, description\u003dCluster assertion processor}\"];\n \"47c61c76-67dc-4c35-a427-1e8a40d6d69a\" [label \u003d \"GeneratingRecordSource{description\u003dsingle record}\"];\n \"47c61c76-67dc-4c35-a427-1e8a40d6d69a\" -\u003e \"89f92a35-67c2-440b-b5de-c2f9f6c7b348\" [style\u003dsolid];\n}\n","time":"20160223T183811.096-0500"}
INFO [2016-02-23 18:38:11.101] [JobScheduler worker1-thread-1] (MrPlanRunnerV2.java:61) - Allow running Datameer job with up to 1 concurrent cluster jobs.
INFO [2016-02-23 18:38:11.132] [MrPlanRunnerV2] (JobExecutionTraceService.java:82) - Creating local job execution trace log at /home/ec2-user/Datameer-5.11.5-cdh-5.5.0-mr2/temp/cache/dfscache/local-job-execution-traces/7
INFO [2016-02-23 18:38:11.136] [MrPlanRunnerV2] (SmartClusterJobFlowReporter.java:34) - {"jobName":"System job (7): check_system_job","open":["check_system_job (89f92a35-67c2-440b-b5de-c2f9f6c7b348)"],"inProgress":[],"time":"20160223T183811.136-0500"}
INFO [2016-02-23 18:38:11.138] [MrPlanRunnerV2] (SmartClusterJobFlowReporter.java:34) - {"jobName":"System job (7): check_system_job","executionFramework":"TEZ","inputsWithDataStatistics":[{"_1":"single record (47c61c76-67dc-4c35-a427-1e8a40d6d69a)","_2":{"bytes":null,"failedRecordCount":0,"outputPartitionCount":null,"recordCount":1,"uncompressedBytes":null}}],"time":"20160223T183811.137-0500"}
INFO [2016-02-23 18:38:11.139] [MrPlanRunnerV2] (SmartClusterJobFlowResolver.java:83) - Using Tez Execution Framework for System job (7): check_system_job with inputs: single record (47c61c76-67dc-4c35-a427-1e8a40d6d69a) because number of bytes of input 'single record' is unknown.
INFO [2016-02-23 18:38:11.148] [MrPlanRunnerV2] (TezClusterSession.java:42) - Creating a TEZ job for session System job (7): check_system_job with a job count 1
INFO [2016-02-23 18:38:11.172] [MrPlanRunnerV2] (ClusterSession.java:175) - -------------------------------------------
INFO [2016-02-23 18:38:11.172] [MrPlanRunnerV2] (ClusterSession.java:176) - Running cluster job (TEZ) for 'System job (7): check_system_job#check_system_job(Cluster assertion processor)#check_system_job(Clus'
INFO [2016-02-23 18:38:11.172] [MrPlanRunnerV2] (ClusterSession.java:178) - Output (final): check_system_job (89f92a35-67c2-440b-b5de-c2f9f6c7b348)
INFO [2016-02-23 18:38:11.172] [MrPlanRunnerV2] (ClusterSession.java:180) - -------------------------------------------
INFO [2016-02-23 18:38:11.270] [MrPlanRunnerV2] (ClusterJobFlow.java:150) - Created configuration for MultiOutputClusterJobFlow{mapChain=NormalChain{streams=[RecordStream[sheetName=check_system_job,description=Cluster assertion processor]]}}: ClusterJobConfiguration{enabledConsumers=[151eb686-5cde-4bbc-bb2c-eb66a7761408]}
INFO [2016-02-23 18:38:11.317] [MrPlanRunnerV2] (TezJob.java:168) - Submitting DAG to Tez cluster with name:System job (7): check_system_job#check_system_job(Cluster assertion processor)#check_system_job(Clus (50ede78a-5418-46f4-8be8-30d49a84f178)
INFO [2016-02-23 18:38:11.324] [MrPlanRunnerV2] (TezClientFacade.java:72) - Cleaning up tmp/tez-plugin-jars
INFO [2016-02-23 18:38:11.326] [MrPlanRunnerV2] (LightweightDasJobContext.java:69) - Synchronize global task local resources with remote hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.404] [MrPlanRunnerV2] (LightweightDasJobContext.java:85) - Synchronize job-specific task local resources with remote hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.405] [MrPlanRunnerV2] (TezClientFacade.java:236) - Creating jar for plugin tez at 'tmp/tez-plugin-jars/plugin-tez-1456270548000.jar' from '/home/ec2-user/Datameer-5.11.5-cdh-5.5.0-mr2/tmp/das-plugins4541735943498581626.folder/plugin-tez-5.11.5.zip/plugin-tez-5.11.5/classes/datameer'
INFO [2016-02-23 18:38:11.476] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/plugin-tez-1456270548000.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.497] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/plugin-tez-1456270548000.jar.md5: 765f9678fec81167539fec8790577fac
INFO [2016-02-23 18:38:11.499] [MrPlanRunnerV2] (LocalResourceUploader.java:149) - uploading (184.9 KB) tmp/tez-plugin-jars/plugin-tez-1456270548000.jar to hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/plugin-tez-1456270548000.jar_765f9678fec81167539fec8790577fac.jar
INFO [2016-02-23 18:38:11.590] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource '/home/ec2-user/Datameer-5.11.5-cdh-5.5.0-mr2/webapps/conductor/WEB-INF/lib/hadoop-mapreduce-client-core-2.6.0-cdh5.5.0.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.678] [MrPlanRunnerV2] (TezSessionImpl.java:45) - Creating new TezClient...
INFO [2016-02-23 18:38:11.737] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/tez-libs-1456270548000/tez-dag-0.7.0.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.785] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/tez-libs-1456270548000/tez-dag-0.7.0.jar.md5: 1c9935416dcde3887885af0d04c9077e
INFO [2016-02-23 18:38:11.787] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/tez-libs-1456270548000/tez-runtime-internals-0.7.0.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.788] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/tez-libs-1456270548000/tez-runtime-internals-0.7.0.jar.md5: 7524b98873bb0b714578caa737473b53
INFO [2016-02-23 18:38:11.790] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/tez-libs-1456270548000/tez-runtime-library-0.7.0.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.792] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/tez-libs-1456270548000/tez-runtime-library-0.7.0.jar.md5: c07f7229d1230ad569958a6b91349a28
INFO [2016-02-23 18:38:11.793] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/tez-libs-1456270548000/tez-api-0.7.0.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.797] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/tez-libs-1456270548000/tez-api-0.7.0.jar.md5: c6c17bcebd9d816e4bb88b614fc4cb1e
INFO [2016-02-23 18:38:11.799] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/tez-libs-1456270548000/commons-collections4-4.1.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.802] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/tez-libs-1456270548000/commons-collections4-4.1.jar.md5: 45af6a8e5b51d5945de6c7411e290bd1
INFO [2016-02-23 18:38:11.804] [MrPlanRunnerV2] (LightweightDasJobContext.java:111) - Synchronize additional task local resource 'tmp/tez-plugin-jars/tez-libs-1456270548000/tez-common-0.7.0.jar' with remote filesystem hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars
INFO [2016-02-23 18:38:11.804] [MrPlanRunnerV2] (LocalResourceUploader.java:132) - created md5 file tmp/tez-plugin-jars/tez-libs-1456270548000/tez-common-0.7.0.jar.md5: b10f2d1bb2a54a250ce3475c28e1cb94
INFO [2016-02-23 18:38:11.832] [MrPlanRunnerV2] (TezClient.java:153) - Tez Client Version: [ component=tez-api, version=0.7.0, revision=fb4df0d61da1bb576f22e4ebdd47c0000ed2e71f, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-08-11T21:31:14Z ]
INFO [2016-02-23 18:38:11.832] [MrPlanRunnerV2] (TezClientFacade.java:318) - Starting Tez session ...
INFO [2016-02-23 18:38:11.846] [MrPlanRunnerV2] (RMProxy.java:98) - Connecting to ResourceManager at ec2-54-162-103-167.compute-1.amazonaws.com/10.37.210.43:8032
INFO [2016-02-23 18:38:11.847] [MrPlanRunnerV2] (TezClient.java:333) - Session mode. Starting session.
INFO [2016-02-23 18:38:11.850] [MrPlanRunnerV2] (TezClientUtils.java:172) - Using tez.lib.uris value from configuration: hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/tez-dag-0.7.0.jar_1c9935416dcde3887885af0d04c9077e.jar,hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/tez-runtime-internals-0.7.0.jar_7524b98873bb0b714578caa737473b53.jar,hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/tez-runtime-library-0.7.0.jar_c07f7229d1230ad569958a6b91349a28.jar,hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/tez-api-0.7.0.jar_c6c17bcebd9d816e4bb88b614fc4cb1e.jar,hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/commons-collections4-4.1.jar_45af6a8e5b51d5945de6c7411e290bd1.jar,hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/jobjars/5.11.5/tez-jars/tez-common-0.7.0.jar_b10f2d1bb2a54a250ce3475c28e1cb94.jar
INFO [2016-02-23 18:38:11.957] [MrPlanRunnerV2] (TezCommonUtils.java:122) - Tez system stage directory hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/temp/job-7/.staging-9067737f-803d-43a9-96cb-218a6a0fdcd5/.tez/application_1456269186049_0005 doesn't exist and is created
INFO [2016-02-23 18:38:12.190] [MrPlanRunnerV2] (TezJob.java:157) - Completed Tez job 'System job (7): check_system_job#check_system_job(Cluster assertion processor)#check_system_job(Clus' with output path: hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/temp/job-7/...
INFO [2016-02-23 18:38:12.191] [MrPlanRunnerV2] (ClusterJob.java:134) - Tez Execution Framework completed cluster job 'System job (7): check_system_job#check_system_job(Cluster assertion processor)#check_system_job(Clus' [1 sec]
ERROR [2016-02-23 18:38:12.192] [MrPlanRunnerV2] (ClusterSession.java:198) - Failed to run cluster job 'System job (7): check_system_job#check_system_job(Cluster assertion processor)#check_system_job(Clus' [1 sec]
java.lang.RuntimeException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:49)
at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:31)
at datameer.plugin.tez.TezClientFacade.createClient(TezClientFacade.java:161)
at datameer.plugin.tez.TezClientFacade.createSession(TezClientFacade.java:181)
at datameer.plugin.tez.session.TezSessionImpl.<init>(TezSessionImpl.java:46)
at datameer.plugin.tez.session.TezSessionFactory$AlwaysNewSessionFactory.get(TezSessionFactory.java:50)
at datameer.plugin.tez.session.TezSessionFactory$ReuseSessionFactory.createNewSession(TezSessionFactory.java:104)
at datameer.plugin.tez.session.PoolingTezSessionFactory.get(PoolingTezSessionFactory.java:108)
at datameer.plugin.tez.session.TrackRunningSessionFactory.get(TrackRunningSessionFactory.java:50)
at datameer.plugin.tez.session.TrackRunningSessionFactory.get(TrackRunningSessionFactory.java:50)
at datameer.plugin.tez.DagRunner.submit(DagRunner.java:79)
at datameer.plugin.tez.TezJob.runTezDag(TezJob.java:169)
at datameer.plugin.tez.TezJob.runImpl(TezJob.java:147)
at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:128)
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:184)
at datameer.dap.common.graphv2.mixedframework.MixedClusterSession.execute(MixedClusterSession.java:48)
at datameer.dap.common.graphv2.ClusterSession.runAllClusterJobs(ClusterSession.java:347)
at datameer.dap.common.graphv2.MrPlanRunnerV2.run(MrPlanRunnerV2.java:103)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:93)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:170)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:34)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:30)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at datameer.dap.common.filesystem.Impersonator.doAs(Impersonator.java:31)
at datameer.dap.common.security.RunAsThread.run(RunAsThread.java:30)
Caused by: org.apache.tez.dag.api.TezException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at org.apache.tez.client.TezClient.start(TezClient.java:370)
at datameer.plugin.tez.TezClientFacade$CreateTezClientAction.run(TezClientFacade.java:319)
at datameer.plugin.tez.TezClientFacade.createClient(TezClientFacade.java:159)
... 25 more
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:235)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy138.submitApplication(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:240)
at org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72)
at org.apache.tez.client.TezClient.start(TezClient.java:365)
... 27 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException): Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy137.submitApplication(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:232)
... 37 more
INFO [2016-02-23 18:38:12.195] [MrPlanRunnerV2] (ClusterSession.java:201) - -------------------------------------------
INFO [2016-02-23 18:38:12.195] [MrPlanRunnerV2] (MixedClusterSession.java:53) - Committing 1 nested sessions for MixedClusterSession{tempJobOutput=hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/temp/job-7, finalJobOutput=hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7}.
INFO [2016-02-23 18:38:12.196] [MrPlanRunnerV2] (MixedClusterSession.java:57) - Committing TezClusterSession{tempJobOutput=hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/temp/job-7, finalJobOutput=hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7}.
INFO [2016-02-23 18:38:12.196] [MrPlanRunnerV2] (ClusterSession.java:80) - Committing failed job and moving data from 'hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/temp/job-7' to 'hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7'.
INFO [2016-02-23 18:38:12.218] [MrPlanRunnerV2] (ClusterSession.java:106) - Completed job flow with FAILURE and 0 completed cluster jobs. (hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7)
INFO [2016-02-23 18:38:12.218] [MrPlanRunnerV2] (PoolingTezSessionFactory.java:151) - Closing ReuseSessionFactory{source=AlwaysNewSessionFactory{}}.
INFO [2016-02-23 18:38:12.364] [MrPlanRunnerV2] (HarBuilder.java:77) - Created har file at hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/job-metadata.har.tmp out of [hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/check_system_job/job-conf.xml, hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/job-plan-compiled.dot, hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/job-plan-original.dot, hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/job-definition.json]. Moving it to hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/job-metadata.har
INFO [2016-02-23 18:38:12.374] [MrPlanRunnerV2] (MrPlanRunnerV2.java:147) - Deleting temporary job directory hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/temp/job-7
INFO [2016-02-23 18:38:12.388] [MrPlanRunnerV2] (DatameerJobStorage.java:70) - Copying job execution trace log from /home/ec2-user/Datameer-5.11.5-cdh-5.5.0-mr2/temp/cache/dfscache/local-job-execution-traces/7 to hdfs://ec2-54-162-103-167.compute-1.amazonaws.com:8020/user/datameer/systemjobs/1/7/job-execution-trace.log
INFO [2016-02-23 18:38:12.408] [JobScheduler worker1-thread-1] (DapJobCounter.java:172) - Job FAILURE with '0' mr-jobs and following counters:
ERROR [2016-02-23 18:38:12.943] [JobScheduler thread-1] (JobScheduler.java:813) - Job 7 failed with exception.
java.lang.RuntimeException: Failed to run cluster job for 'System job (7): check_system_job#check_system_job(Cluster assertion processor)#check_system_job(Clus'
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:199)
at datameer.dap.common.graphv2.mixedframework.MixedClusterSession.execute(MixedClusterSession.java:48)
at datameer.dap.common.graphv2.ClusterSession.runAllClusterJobs(ClusterSession.java:347)
at datameer.dap.common.graphv2.MrPlanRunnerV2.run(MrPlanRunnerV2.java:103)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:93)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:170)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:34)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:30)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at datameer.dap.common.filesystem.Impersonator.doAs(Impersonator.java:31)
at datameer.dap.common.security.RunAsThread.run(RunAsThread.java:30)
Caused by: java.lang.RuntimeException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:49)
at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:31)
at datameer.plugin.tez.TezClientFacade.createClient(TezClientFacade.java:161)
at datameer.plugin.tez.TezClientFacade.createSession(TezClientFacade.java:181)
at datameer.plugin.tez.session.TezSessionImpl.<init>(TezSessionImpl.java:46)
at datameer.plugin.tez.session.TezSessionFactory$AlwaysNewSessionFactory.get(TezSessionFactory.java:50)
at datameer.plugin.tez.session.TezSessionFactory$ReuseSessionFactory.createNewSession(TezSessionFactory.java:104)
at datameer.plugin.tez.session.PoolingTezSessionFactory.get(PoolingTezSessionFactory.java:108)
at datameer.plugin.tez.session.TrackRunningSessionFactory.get(TrackRunningSessionFactory.java:50)
at datameer.plugin.tez.session.TrackRunningSessionFactory.get(TrackRunningSessionFactory.java:50)
at datameer.plugin.tez.DagRunner.submit(DagRunner.java:79)
at datameer.plugin.tez.TezJob.runTezDag(TezJob.java:169)
at datameer.plugin.tez.TezJob.runImpl(TezJob.java:147)
at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:128)
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:184)
... 13 more
Caused by: org.apache.tez.dag.api.TezException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at org.apache.tez.client.TezClient.start(TezClient.java:370)
at datameer.plugin.tez.TezClientFacade$CreateTezClientAction.run(TezClientFacade.java:319)
at datameer.plugin.tez.TezClientFacade.createClient(TezClientFacade.java:159)
... 25 more
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:235)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy138.submitApplication(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:240)
at org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72)
at org.apache.tez.client.TezClient.start(TezClient.java:365)
... 27 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException): Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=2048, maxMemory=1941
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:203)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:377)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:320)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:574)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:213)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy137.submitApplication(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:232)
... 37 more
INFO [2016-02-23 18:38:12.982] [JobScheduler thread-1] (JobScheduler.java:884) - Computing after job completion operations for execution 7 (type=NORMAL)
INFO [2016-02-23 18:38:12.983] [JobScheduler thread-1] (JobScheduler.java:888) - Finished computing after job completion operations for execution 7 (type=NORMAL) [0 sec]
WARN [2016-02-23 18:38:12.994] [JobScheduler thread-1] (JobScheduler.java:743) - Job DapJobExecution{id=7, type=NORMAL, status=ERROR} completed with status ERROR.

Nikhil Kumar

Official comment

    Brian Junio

    This error indicates that the job is not failing on a Datameer framework-specific memory setting, but on a cluster-side limit: the requested container memory (2048 MB) exceeds YARN's configured maximum allocation (the maxMemory=1941 in the message, i.e. yarn.scheduler.maximum-allocation-mb).

    By setting the following parameters' values lower than the indicated maximum, you can resolve this issue:

    das.job.map-task.memory=1024
    das.job.reduce-task.memory=1024
    das.job.application-manager.memory=1024
    tez.am.resource.memory.mb=1024
    tez.task.resource.memory.mb=1024

    Here 1024 is simply a value below the maximum (1941 MB) reported in the provided log file.
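    To see why any value at or below the maximum passes, the ResourceManager's check (SchedulerUtils.validateResourceRequest in the trace above) can be mirrored in a few lines. This is an illustrative sketch, not Datameer or YARN code; the function name and message format merely echo the log:

    ```python
    def validate_resource_request(requested_mb, max_mb):
        """Sketch of YARN's scheduler check: reject requests that are
        negative or exceed the configured maximum allocation."""
        if requested_mb < 0 or requested_mb > max_mb:
            raise ValueError(
                "Invalid resource request, requested memory < 0, or "
                "requested memory > max configured, "
                f"requestedMemory={requested_mb}, maxMemory={max_mb}")
        return requested_mb

    # The failing request from this log: 2048 MB against a 1941 MB maximum.
    try:
        validate_resource_request(2048, 1941)
    except ValueError as e:
        print(e)

    # The fix from this thread: drop the request to 1024 MB.
    print(validate_resource_request(1024, 1941))
    ```

    Increasing the container size on one host does not help unless yarn.scheduler.maximum-allocation-mb itself is raised (and the ResourceManager restarted), which is why lowering the Datameer/Tez request is the quicker fix here.
    
    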



    Nikhil Kumar

    Good news: setting these properties solved the issue.

    das.job.map-task.memory=1024 
    das.job.reduce-task.memory=1024 
    das.job.application-manager.memory=1024
    tez.am.resource.memory.mb=1024
    tez.task.resource.memory.mb=1024