PBS jobs vs PBS job arrays

What is the difference between submitting individual jobs as separate PBS scripts and submitting them as a single PBS job array? (I am seeing a significant run-time improvement with the latter.)

Related

Spark long running jobs with dataset

I have Spark code that used to run batch jobs (each job spans anywhere from a few seconds to a few minutes). Now I want to take this same code and run it as a long-running process. To do this, I plan to create the Spark context only once and then, in a while loop, wait for new configs/tasks to arrive and execute them.
So far, whenever I run this code, my application stops after 5-6 iterations without any exception or error being printed. The long-running job has been assigned 1 executor with 10GB of memory and a Spark driver with 4GB of memory (which was sufficient for our batch jobs). So my question is: what do we need to change in the code itself to move from small batch jobs to long-running jobs? I have seen this useful link - http://mkuthan.github.io/blog/2016/09/30/spark-streaming-on-yarn/ - but it is mostly about Spark configurations to keep jobs running for a long time.
Spark version - 2.3 (can move to Spark 2.4.1), running on a YARN cluster.
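To make the intent concrete, here is a minimal, hypothetical sketch of the pattern described above (the SparkSession is created once, work arrives in a loop). fetchNextConfig and runBatch are placeholders for whatever queue and batch logic you already have, and the per-iteration cleanup is one of the things that tends to matter when moving from batch to long-running:

```scala
import org.apache.spark.sql.SparkSession

object LongRunningDriver {
  def main(args: Array[String]): Unit = {
    // Create the SparkSession/SparkContext once and reuse it for every task.
    val spark = SparkSession.builder()
      .appName("long-running-batch-runner")
      .getOrCreate()

    var keepRunning = true
    while (keepRunning) {
      // Hypothetical blocking call that returns the next task/config,
      // or None when the service should shut down.
      fetchNextConfig() match {
        case Some(config) =>
          try {
            runBatch(spark, config) // hypothetical: your existing batch logic
          } catch {
            // Catch per-iteration failures so one bad task does not
            // silently terminate the whole loop.
            case e: Exception =>
              System.err.println(s"Batch for '$config' failed: ${e.getMessage}")
          } finally {
            // Release anything cached during this iteration so memory
            // does not accumulate across iterations.
            spark.sparkContext.getPersistentRDDs.values.foreach(_.unpersist())
          }
        case None =>
          keepRunning = false
      }
    }
    spark.stop()
  }

  // Stubs so the sketch compiles; replace with real implementations.
  def fetchNextConfig(): Option[String] = None
  def runBatch(spark: SparkSession, config: String): Unit = ()
}
```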

In Apache Spark, do Tasks in the same Stage work simultaneously or not?

Do tasks in the same stage work simultaneously? If so, what does the line between partitions in a stage refer to? Example of a DAG.
Here is a good link for your reading that explains the DAG in detail, plus a few other things that may be of interest: databricks blog on DAG.
I can try to explain. As each stage is created, it has a set of tasks that are divided up. When an action is encountered, the driver sends the tasks to the executors. Based on how your data is partitioned, N tasks are invoked on the data in your distributed cluster. The arrows you are seeing are the execution plan: for example, it cannot run the map function before reading the file. Each node that holds some of the data will execute those tasks in the order given by the DAG.
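As a rough illustration of that description (the path and partition count below are made up, not from the question): narrow transformations such as map stay in one stage and run one task per partition in parallel, while a shuffle such as reduceByKey starts a new stage that the DAG orders after the first.

```scala
// sc is an existing SparkContext (e.g. in spark-shell).
val lines  = sc.textFile("hdfs:///data/input.txt", 4)    // 4 partitions -> 4 parallel tasks
val pairs  = lines.map(line => (line.split(",")(0), 1))   // narrow: still stage 1
val counts = pairs.reduceByKey(_ + _)                     // shuffle boundary -> stage 2
counts.count()                                            // action: driver submits the job
```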

Distribution of spark code into jobs, stages and tasks [duplicate]

This question already has answers here:
What is the concept of application, job, stage and task in spark? (5 answers)
Closed 5 years ago.
As per my understanding, each action in the whole application is translated into a job, each shuffle boundary within a job is translated into a stage, and each partition of a stage's input is translated into a task.
Please correct me if I am wrong; I am unable to find an actual definition.
Invoking an action inside a Spark application triggers the launch of a Spark job to fulfill it. Spark examines the DAG and formulates an execution plan. The execution plan consists of assembling the job’s transformations into stages.
When Spark optimises code internally, it splits it into stages, where each stage consists of many little tasks. Each stage contains a sequence of transformations that can be completed without shuffling the full data.
Every task for a given stage is a single-threaded atom of computation consisting of exactly the same code, just applied to a different set of data. The number of tasks is determined by the number of partitions.
To manage the job flow and schedule tasks, Spark relies on an active driver process. The executor processes are responsible for executing this work, in the form of tasks, as well as for storing any data that the user chooses to cache. A single executor has a number of slots for running tasks and will run many concurrently throughout its lifetime.
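A short, hedged illustration of that terminology (the path and partition count are invented for the example): nothing runs until an action is called, each action launches its own job, the shuffle splits the work into two stages, and the post-shuffle stage runs as many tasks as it has partitions.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("terminology-demo").getOrCreate()

val words = spark.sparkContext
  .textFile("hdfs:///logs/*.txt")          // lazy: no job yet
  .flatMap(_.split("\\s+"))                // narrow transformations share a stage
  .map(word => (word, 1))
  .reduceByKey(_ + _, 8)                   // shuffle: the next stage will have 8 tasks

words.count()                              // action #1 -> job 1
words.take(10)                             // action #2 -> job 2 (typically reuses the shuffle output)
```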

Could FAIR scheduling mode make Spark Streaming jobs that read from different topics run in parallel?

I use Spark 2.1 and Kafka 0.9.
Under fair sharing, Spark assigns tasks between jobs in a “round robin” fashion, so that all jobs get a roughly equal share of cluster resources. This means that short jobs submitted while a long job is running can start receiving resources right away and still get good response times, without waiting for the long job to finish.
According to this, if I have multiple jobs from multiple threads in the case of Spark Streaming (one topic per thread), is it possible that multiple topics can run simultaneously if I have enough cores in my cluster, or would it just do a round robin across pools but run only one job at a time?
Context:
I have two topics, T1 and T2, both with one partition each. I have configured a pool with scheduleMode set to FAIR. I have 4 cores registered with Spark. Each topic has two actions (hence two jobs - 4 jobs in total across topics). Let's say J1 and J2 are the jobs for T1 and J3 and J4 are the jobs for T2. What Spark does in FAIR mode is execute J1, J3, J2, J4, but at any time only one job is executing. Since each topic has only one partition, only one core is being used and 3 are just free. This is something I don't want.
Is there any way I can avoid this?
if I have multiple jobs from multiple threads...is it possible that multiple topics can run simultaneously
Yes. That's the purpose of FAIR scheduling mode.
As you may have noticed, I removed "Spark Streaming" from your question since it does not contribute in any way to how Spark schedules Spark jobs. It does not really matter whether you start your Spark jobs from a "regular" application or Spark Streaming one.
Quoting Scheduling Within an Application (highlighting mine):
Inside a given Spark application (SparkContext instance), multiple parallel jobs can run simultaneously if they were submitted from separate threads.
By default, Spark’s scheduler runs jobs in FIFO fashion. Each job is divided into "stages" (e.g. map and reduce phases), and the first job gets priority on all available resources while its stages have tasks to launch, then the second job gets priority, etc.
And then comes the quote you used in your question, which should now be clearer:
it is also possible to configure fair sharing between jobs. Under fair sharing, Spark assigns tasks between jobs in a "round robin" fashion, so that all jobs get a roughly equal share of cluster resources.
So, speaking about Spark Streaming you'd have to configure FAIR scheduling mode and Spark Streaming's JobScheduler should submit Spark jobs per topic in parallel (haven't tested it out myself so it's more theory than practice).
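For reference, a minimal, untested sketch of that setup: FAIR mode enabled, one thread per topic, each thread putting its jobs in its own pool. The pool names and the processTopic body are assumptions for illustration, not taken from the question.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("fair-scheduling-demo")
  .config("spark.scheduler.mode", "FAIR")
  .getOrCreate()

// Hypothetical per-topic work; each action called here becomes a Spark job.
def processTopic(topic: String, pool: String): Unit = {
  // Jobs submitted from this thread are assigned to the given pool.
  spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool)
  val df = spark.read.json(s"hdfs:///staging/$topic")   // placeholder source
  println(s"$topic: ${df.count()} records")
}

val threads = Seq(("T1", "pool_t1"), ("T2", "pool_t2")).map { case (topic, pool) =>
  new Thread(new Runnable {
    override def run(): Unit = processTopic(topic, pool)
  })
}
threads.foreach(_.start())
threads.foreach(_.join())
```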
I think that the fair scheduler alone will not help, as it's the Spark Streaming engine that takes care of submitting the Spark jobs and normally does so in sequential mode.
There's a non-documented configuration parameter in Spark Streaming: spark.streaming.concurrentJobs[1], which is set to 1 by default. It controls the parallelism level of jobs submitted to Spark.
By increasing this value, you may see parallel processing of the different spark stages of your streaming job.
I would think that by combining this configuration with the fair scheduler in Spark, you should be able to achieve controlled parallel processing of the independent topic consumers. This is mostly uncharted territory.
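A sketch of how that undocumented setting would be wired in, assuming the per-topic DStream setup is done elsewhere; since spark.streaming.concurrentJobs is not documented, treat this as experimental rather than a recommended configuration:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("multi-topic-streaming")
  .set("spark.scheduler.mode", "FAIR")          // fair sharing between jobs
  .set("spark.streaming.concurrentJobs", "2")   // undocumented; default is 1

val ssc = new StreamingContext(conf, Seconds(10))
// ... create one Kafka DStream per topic and register their output operations ...
ssc.start()
ssc.awaitTermination()
```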

Spark Yarn running 1000 jobs in queue

I am trying to schedule 1000 jobs on a YARN cluster. I want to run more than 1000 jobs daily at the same time and have YARN manage the resources. For 1000 files of different categories in HDFS, I am building the spark-submit command from Python and executing it. But I am getting an out-of-memory error because spark-submit uses driver memory.
How can I schedule 1000 jobs on a Spark YARN cluster? I even tried the Oozie job-scheduling framework along with Spark; it did not work as expected with HDP.
Actually, you might not need 1000 jobs to read 1000 files from HDFS. You could try to load everything into a single RDD (the APIs support reading multiple files and wildcards in paths). After reading all the files into a single RDD, focus on ensuring you have enough memory, cores, etc. assigned to it, and then look at your business logic to avoid costly operations like shuffles.
But if you insist that you need to spawn 1000 jobs, one for each file, you should look at --executor-memory and --executor-cores (along with --num-executors for parallelism). These give you leverage to optimise the memory/CPU footprint.
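As a rough sketch of the "one RDD instead of 1000 jobs" suggestion (the paths below are invented): textFile accepts wildcards and comma-separated paths, so all the files can be loaded in a single job, and wholeTextFiles keeps the file name alongside its content if the category is encoded in the path. Executor sizing (--executor-memory, --executor-cores, --num-executors) is still passed on the spark-submit command line, not set here.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("bulk-load").getOrCreate()
val sc = spark.sparkContext

// One RDD over every file matched by the wildcard: a single job instead of 1000.
val allLines = sc.textFile("hdfs:///data/categories/*/*.csv")

// Or keep (path, content) pairs when the category must be recovered from the file name.
val byFile = sc.wholeTextFiles("hdfs:///data/categories/*/*.csv")

println(s"total lines: ${allLines.count()}")
```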
Also, out of curiosity: you say you get the OOM during spark-submit (using driver memory). The driver doesn't really use much memory at all, unless you do things like collect or take on a large dataset, which bring data from the executors to the driver. Are you firing the jobs in yarn-client mode? Another hunch is to check whether the box from which you spawn the Spark jobs even has enough memory just to spawn them in the first place.
It would be easier if you could also paste some logs here.
