OOM | Not able to query Spark Temporary table - apache-spark

I have 4.5 million records in a Hive table.
My requirement is to cache this table as a temporary table through the Spark Thrift Server (via beeline), so that Tableau can query the temporary table and generate reports.
I have a 4-node cluster; each node has 50 GB RAM and 25 vCores. I'm using HDP 2.3 with Spark 1.4.1.
Issue:
I'm able to cache the table in less than a minute and can get the correct count from the temp table. But when I try to execute a select query on a single column (using beeline, against the same Spark sqlContext), I hit an OOM error.
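The failing query is just a single-column select on the cached temp table, something like (column name hypothetical):
$SPARK_HOME/bin/beeline> select col1 from temp1;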
I tried the configurations below without any luck:
1) sudo ./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.bind.host=10.74.129.175 --hiveconf hive.server2.thrift.port=10002 --master yarn-client --driver-memory 35g --driver-cores 25 --num-executors 4 --executor-memory 35g --executor-cores 25
$SPARK_HOME/bin/beeline> cache table temp1 as select * from hive_table;
and set the following config in the spark-defaults.conf file:
spark.driver.maxResultSize 20g
spark.kryoserializer.buffer.max 2000mb
spark.rdd.compress true
spark.speculation true
2) sudo ./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.bind.host=10.74.129.175 --hiveconf hive.server2.thrift.port=10002 --master yarn-client --driver-memory 35g --driver-cores 5 --num-executors 11 --executor-memory 35g --executor-cores 5
$SPARK_HOME/bin/beeline> cache table temp1 as select * from hive_table;
and set the following config in the spark-defaults.conf file:
spark.driver.maxResultSize 20g
spark.kryoserializer.buffer.max 2000mb
spark.rdd.compress true
spark.speculation true
As per my understanding, I have enough RAM on the driver machine and should be able to bring the result of the select back to the driver.

Related

Spark fail if not all resources are allocated

Does Spark or YARN have any flag to fail the job fast if we can't allocate all resources?
For example, if I run
spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-client \
--num-executors 7 \
--driver-memory 512m \
--executor-memory 4g \
--executor-cores 1 \
/usr/hdp/current/spark2-client/examples/jars/spark-examples_*.jar 1000
Right now, if Spark can allocate only 5 executors, it will just go with 5. Can we make it run only with 7, or fail otherwise?
You can set the spark.dynamicAllocation.minExecutors config in your job. For that you need to set spark.dynamicAllocation.enabled=true, as detailed in this doc.
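For example, a sketch of the SparkPi submit from the question with those settings (flag values illustrative; dynamic allocation on YARN also expects the external shuffle service, and --num-executors is dropped because the minimum is now driven by minExecutors):
spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-client \
--driver-memory 512m \
--executor-memory 4g \
--executor-cores 1 \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.minExecutors=7 \
--conf spark.shuffle.service.enabled=true \
/usr/hdp/current/spark2-client/examples/jars/spark-examples_*.jar 1000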

pyspark with spark 2.4 on EMR SparkException: Cannot broadcast the table that is larger than 8GB

I've checked the other posts related to this error and haven't found anything that works.
What I'm trying to do:
df = spark.sql("""
SELECT DISTINCT
action.AccountId
...
,to_date(date) as Date
FROM sc_raw_report LEFT JOIN adwords_accounts ON action.AccountId=sc_raw_report.customer_id
WHERE date >= to_date(concat_ws('-',2018,1,1))
GROUP BY action.AccountId
,Account_Name
...
,to_date(date)
,substring(timestamp,12,2)
""")
df.show(5, False)
and then a saveAsTable. Nonetheless, it returns an error:
py4j.protocol.Py4JJavaError: An error occurred while calling o119.showString.
: org.apache.spark.SparkException: Exception thrown in awaitResult:
[...]
Caused by: org.apache.spark.SparkException: Cannot broadcast the table that is larger than 8GB: 13 GB
I've tried setting:
'spark.sql.autoBroadcastJoinThreshold': '-1'
but it did nothing.
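(In PySpark terms that would be something like the line below, set on the session before the query is built; this is a sketch of one way to set it, not necessarily exactly how my job does it:)
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")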
The table adwords_account is very small, and a print of df.count() on sc_raw_report returns 2022197.
emr-5.28.0, Spark 2.4.4
My cluster core nodes: 15 × r4.4xlarge (16 vCore, 122 GiB memory, EBS-only storage)
main (master) node: r5a.4xlarge (16 vCore, 128 GiB memory, EBS-only storage)
with config for spark-submit --deploy-mode cluster:
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf fs.s3a.attempts.maximum=30 \
--conf spark.sql.crossJoin.enabled=true \
--executor-cores 5 \
--num-executors 5 \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.memoryOverhead=3g \
--driver-memory 22g \
--executor-memory 22g \
--conf spark.executor.instances=49 \
--conf spark.default.parallelism=490 \
--conf spark.driver.maxResultSize=0 \
--conf spark.sql.broadcastTimeout=3600
Anyone know what I can do here?
EDIT: additional info:
Upgrading to 16 instances, or to r4.8xlarge (32 vCPU, 244 GB RAM), did nothing either.
[Screenshot: the step graph goes idle before the broadcast error is thrown.]
[Screenshot: the executors report a few moments before the crash.]
The config:
spark.serializer.objectStreamReset 100
spark.sql.autoBroadcastJoinThreshold -1
spark.executor.memoryOverhead 3g
spark.driver.maxResultSize 0
spark.shuffle.service.enabled true
spark.rdd.compress True
spark.stage.attempt.ignoreOnDecommissionFetchFailure true
spark.sql.crossJoin.enabled true
hive.metastore.client.factory.class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory
spark.scheduler.mode FIFO
spark.driver.memory 22g
spark.executor.instances 5
spark.default.parallelism 490
spark.resourceManager.cleanupExpiredHost true
spark.executor.id driver
spark.driver.extraJavaOptions -Dcom.amazonaws.services.s3.enableV4=true
spark.hadoop.fs.s3.getObject.initialSocketTimeoutMilliseconds 2000
spark.submit.deployMode cluster
spark.sql.broadcastTimeout 3600
spark.master yarn
spark.sql.parquet.output.committer.class com.amazon.emr.committer.EmrOptimizedSparkSqlParquetOutputCommitter
spark.ui.filters org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
spark.blacklist.decommissioning.timeout 1h
spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
spark.executor.memory 22g
spark.dynamicAllocation.enabled false
spark.sql.catalogImplementation hive
spark.executor.cores 5
spark.decommissioning.timeout.threshold 20
spark.hadoop.mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem true
spark.hadoop.yarn.timeline-service.enabled false
spark.yarn.executor.memoryOverheadFactor 0.1875
After the ShuffleMapStage, part of the shuffle block needs to be broadcast at the driver.
Please make sure the driver (in your case the YARN AM, since you deploy in cluster mode) has enough memory/overhead.
Could you post the sc runtime config?
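A minimal sketch of how the runtime config can be dumped from PySpark, assuming the usual spark session object:
# print every (key, value) pair the SparkContext is actually running with
for key, value in sorted(spark.sparkContext.getConf().getAll()):
    print("{} = {}".format(key, value))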

Spark: Entire dataset concentrated in one executor

I am running a Spark job with 3 files, each 100 MB in size. For some reason my Spark UI shows the entire dataset concentrated in 2 executors. This is making the job run for 19 hrs, and it is still running.
Below is my Spark configuration. Spark 2.3 is the version used.
spark2-submit --class org.mySparkDriver \
--master yarn-cluster \
--deploy-mode cluster \
--driver-memory 8g \
--num-executors 100 \
--conf spark.default.parallelism=40 \
--conf spark.yarn.executor.memoryOverhead=6000mb \
--conf spark.dynamicAllocation.executorIdleTimeout=6000s \
--conf spark.executor.cores=3 \
--conf spark.executor.memory=8G \
I tried repartitioning inside the code, which works, as it splits the file into 20 partitions (I used rdd.repartition(20)). But why should I repartition? I believe specifying spark.default.parallelism=40 in the script should let Spark divide the input across 40 executors and process the file in 40 executors.
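Roughly what the repartition looks like in my code (assuming the input is read with textFile; the path is a placeholder):
# read the three ~100 MB files and force 20 partitions
rdd = sc.textFile("/path/to/input/").repartition(20)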
Can anyone help?
Thanks,
Neethu
I am assuming you're running your jobs on YARN; if yes, you can check the following properties:
yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-vcores
yarn.nodemanager.resource.cpu-vcores
In YARN these properties affect the number of containers that can be instantiated on a NodeManager, based on the spark.executor.cores and spark.executor.memory property values (along with the executor memory overhead).
For example, take a cluster with 10 nodes (RAM: 16 GB, cores: 6) set with the following YARN properties:
yarn.scheduler.maximum-allocation-mb=10GB
yarn.nodemanager.resource.memory-mb=10GB
yarn.scheduler.maximum-allocation-vcores=4
yarn.nodemanager.resource.cpu-vcores=4
Then with the Spark properties spark.executor.cores=2 and spark.executor.memory=4GB you can expect 2 executors per node, so in total you'll get 19 executors + 1 container for the driver.
If the Spark properties are spark.executor.cores=3 and spark.executor.memory=8GB, then you will get 9 executors (only 1 executor per node) + 1 container for the driver.
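The same arithmetic as a quick Python sketch (memory overhead ignored for simplicity, as in the figures above):
# per-node executor count is limited by both the memory ceiling and the vcore ceiling
nodes, node_mem_gb, node_vcores = 10, 10, 4

def executors_per_node(exec_mem_gb, exec_cores):
    return min(node_mem_gb // exec_mem_gb, node_vcores // exec_cores)

print(executors_per_node(4, 2) * nodes - 1)  # 19 executors, plus 1 container for the driver
print(executors_per_node(8, 3) * nodes - 1)  # 9 executors, plus 1 container for the driver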
You can refer to the link for more details.
Hope this helps

Spark - Capping the number of CPU cores or memory of slave servers

I am using Spark 2.1. This question is for use cases where some of the Spark slave servers run other apps as well. Is there a way to tell the Spark master to use only a certain number of CPU cores or a certain amount of memory on a slave server?
Thanks.
To limit the total number of cores used by a Spark job, add the --total-executor-cores option to your spark-submit command. To limit the amount of memory used by each executor, use the --executor-memory option. For example:
spark-submit --total-executor-cores 10 \
--executor-memory 8g \
--class com.example.SparkJob \
SparkJob.jar
This also works with spark-shell:
spark-shell --total-executor-cores 10 \
--executor-memory 8g

Over utilization of yarn resources with spark

I have an EMR cluster with the below configuration:
Data Nodes : 6
RAM per Node : 56 GB
Cores per Node: 32
Instance Type: M4*4xLarge
I am running the below spark-sql commands to execute 5 Hive scripts in parallel.
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive1.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive2.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive3.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive4.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive5.hql
But 270 GB of memory is being utilized by YARN.
As per the parameters in the given command, the 5 jobs together should utilize only 120 GB RAM:
1 executor * 20 GB + 4 GB driver = 24 GB RAM per job
5 jobs = 5 * 24 = 120 GB
So why is YARN utilizing 270 GB RAM? (No other Hadoop jobs are running in the cluster.)
Do I need to include any extra parameters to limit YARN resource utilization?
Set "spark.dynamicAllocation.enabled" to false in spark-defaults.conf (../../spark/spark-x.x.x/conf/spark-defaults.conf).
This should help you limit/avoid dynamic allocation of resources.
Even though we set the executor memory in the command, Spark allocates memory dynamically if resources are available in the cluster. To restrict the memory usage to only the configured executor memory, the Spark dynamic allocation parameter should be set to false.
You can change it directly in the Spark config file or pass it as a config parameter to the command:
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G --conf spark.dynamicAllocation.enabled=false -f hive1.hql
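Or, equivalently, as a line in spark-defaults.conf:
spark.dynamicAllocation.enabled false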
