GPars to create parallel jobs in the Jenkins Pipeline plugin - multithreading

My environment is TIBCO, where I have 200 to 300 services. My design needs to trigger the 'stop' command to all of these services in parallel. Please provide some insight into how to code the threading mechanism with GPars. Examples will help.

GParsPool.withPool(PreVerifyManager.THREADS) {
    // eachParallel runs the closure for every service path concurrently on the pool
    servicePathList.eachParallel {
        InitiateApplicationDowntime(it)
    }
}
I was a total newbie with Groovy; after two months of learning, researching, and working, this fit my requirement exactly.

Related

Is it possible to run different tasks on different schedules with prefect?

I'm moving my first steps with prefect, and I'm trying to see what its degrees of freedom are. To this end, I'm investigating whether prefect supports running different tasks on different schedules in the same python process. For example, Task A might have to run every 5 minutes, while Task B might run twice a day with a Cron scheduler.
It seems to me that schedules are associated with a Flow, not with a task, so to do the above one would have to create two distinct one-task Flows, each with its own schedule. But even so, given that running a flow is a blocking operation, I can't see how to "start" both flows concurrently (or pseudo-concurrently; I'm perfectly aware the flows won't execute on separate threads).
Is there a built-in way of getting the tasks running on their independent schedules? I'm under the impression that there is a way to achieve this, but given my limited experience with prefect, I'm completely missing it.
Many thanks in advance for any pointers.
You are right that schedules are associated with Flows and not Tasks, so the only place to add a schedule is a Flow. Running a Flow is a blocking operation if you are using only the open source Prefect core. For production use cases, it's recommended to run your Flows against Prefect Cloud or Prefect Server: Cloud is the managed offering, and Server is when you host it yourself. Note that Cloud has a very generous free tier.
When using a backend, you will use an agent that will kick off the flow run in a new process. This will not be blocking.
To start with using a backend, you can check the docs here
This Prefect Discourse topic discusses a very similar problem and shows how you could solve it using a flow-of-flows orchestrator pattern.
One way to approach it is to leverage Caching to avoid recomputation of certain tasks that require lower-frequency scheduling than the main flow.
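To make the "two one-task Flows, each with its own schedule" idea concrete without a backend, here is a minimal plain-Python sketch (this is not Prefect's API; `task_a`, `task_b`, and `run_on_schedule` are hypothetical names) of running two tasks pseudo-concurrently on independent schedules in one process, which is roughly what an agent achieves by spawning flow runs separately:

```python
import threading
import time

def run_on_schedule(task, interval_s, stop_event, results):
    """Repeatedly run `task` every `interval_s` seconds until stopped."""
    while not stop_event.is_set():
        results.append(task())
        stop_event.wait(interval_s)  # returns early if stop is signalled

def task_a():
    return "A"

def task_b():
    return "B"

stop = threading.Event()
results_a, results_b = [], []

# Each "flow" gets its own thread, so its schedule is independent
# and neither one blocks the other.
ta = threading.Thread(target=run_on_schedule, args=(task_a, 0.05, stop, results_a))
tb = threading.Thread(target=run_on_schedule, args=(task_b, 0.2, stop, results_b))
ta.start()
tb.start()

time.sleep(0.5)  # let both schedules tick a few times
stop.set()
ta.join()
tb.join()
```

With the intervals above, the faster schedule fires several times more often than the slower one, exactly the Task-A-every-5-minutes vs. Task-B-twice-a-day pattern in miniature.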

How does Node.js handle jobs in parallel? [Bull v3]

I have started experimenting in Node.js lately.
I'm currently setting up an app which will handle multiple queues in parallel by utilising the bull library, to run heavy jobs in the background.
I'm looking for an answer which I hope I did not miss in the documentation.
It's still somewhat "blurry" to me how this library handles those tasks in parallel.
So I have the following scenario:
2 jobs are running at the same time; both of them are heavy and take some time to finish.
During the run of those 2 jobs, I can still use the rest of the application: the event loop is not blocked.
What I take from that is that something else probably handles those 2 jobs, since JavaScript is single-threaded. What is this?
Any guidance or any advice will be highly appreciated!
https://github.com/OptimalBits/bull
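The puzzle above comes down to how heavy work can run without blocking a single-threaded loop: bull can run processors in separate child processes (its "sandboxed processors"), so the Node event loop stays free. As a language-neutral illustration of that same pattern (sketched in Python rather than Node; `heavy_job` is a made-up stand-in), here is CPU-heavy work offloaded to a process pool while the main thread remains responsive:

```python
from concurrent.futures import ProcessPoolExecutor

def heavy_job(n):
    """CPU-heavy work that would block a single-threaded loop if run inline."""
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    # The pool runs each job in a separate OS process, so the main
    # thread is not blocked while both jobs execute in parallel.
    with ProcessPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(heavy_job, 100_000)
        f2 = pool.submit(heavy_job, 100_000)
        # ...the main thread is free to do other work here...
        f1.result()
        f2.result()
```

The same division of labour applies in Node: the queue state lives in Redis, and the heavy processing happens outside the main event loop.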

Keep track of all the parameters of spark-submit

I have a team where many members have permission to submit Spark tasks to YARN (the resource manager) from the command line. It's hard to track who is using how many cores, who is using how much memory, etc. Now I'm looking for a piece of software, a framework, or anything that could help me monitor the parameters each member used. It would act as a bridge between the client and YARN; then I could use it to filter the submit commands.
I did take a look at MLflow and I really like MLflow Tracking, but it was designed for the ML training process. I wonder if there is an alternative for my purpose, or any other solution to the problem.
Thank you!
My recommendation would be to build such a tool yourself, as it's not too complicated:
have a wrapper script around spark-submit which logs the usage in a DB, and after the Spark job finishes the wrapper will know to release that information. It could be done really easily.
In addition, you can even block new spark-submits if your team has already asked for too many resources.
And as you build it yourself it's really flexible, as you can even create "sub-teams" or anything else you want.
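A minimal sketch of that wrapper idea, assuming everything here (the flag list, the `submissions` table, the function names) is made up for illustration; a real wrapper would finish by exec-ing the actual spark-submit with the same arguments, and could refuse the submit when the logged totals exceed a quota:

```python
import sqlite3

def parse_resources(argv):
    """Pull the resource-related flags out of a spark-submit command line."""
    tracked = {"--executor-memory", "--executor-cores",
               "--num-executors", "--driver-memory"}
    params, i = {}, 0
    while i < len(argv):
        if argv[i] in tracked and i + 1 < len(argv):
            params[argv[i]] = argv[i + 1]
            i += 2
        else:
            i += 1
    return params

def log_submission(conn, username, argv):
    """Record who asked for which resources before handing off to spark-submit."""
    conn.execute("CREATE TABLE IF NOT EXISTS submissions "
                 "(username TEXT, flag TEXT, value TEXT)")
    for flag, value in parse_resources(argv).items():
        conn.execute("INSERT INTO submissions VALUES (?, ?, ?)",
                     (username, flag, value))
    conn.commit()

# What the wrapper would see for one team member's submit:
cmd = ["--master", "yarn", "--executor-memory", "4g",
       "--executor-cores", "2", "job.py"]
conn = sqlite3.connect(":memory:")  # a real wrapper would use a shared DB file
log_submission(conn, "alice", cmd)
```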

How to execute parallel computing between several instances in Google Cloud Compute Engine?

I've recently encountered a problem processing an 8 GB pickle file with a Python script using VMs in Google Cloud Compute Engine. The problem is that the processing takes too long, and I am searching for ways to reduce it. One possible solution could be splitting the work across processes within the script, or mapping it across the CPUs of several VMs. If somebody knows how to do this, please share!
You can use Clusters for Large-scale Technical Computing in Google Cloud Platform (GCP). Open source software like ElastiCluster provides cluster management and support for provisioning nodes on Google Compute Engine (GCE).
After the cluster is operational, a workload manager handles task execution and node allocation. There are a variety of popular commercial and open source workload managers, such as HTCondor from the University of Wisconsin, Slurm from SchedMD, Univa Grid Engine, and LSF Symphony from IBM.
This article is also helpful.
This looks like an HPC problem. Have a look at this link: https://cloud.google.com/solutions/architecture/highperformancecomputing.
There are lots of valuable solutions to your problem, but it depends on the details of your case. A simple first approach could be to logically split your task into small jobs; then you can assign a subset of these jobs to each GCE instance in your group of dedicated instances.
You could consider creating a group of a predefined number of instances. Each run could rely on a startup script to fetch the job it must execute. When the job finishes, the instance can be deleted and replaced by a new one (Google Compute Engine Managed Instance Groups will create a new instance automatically). You only need to manage when the group should start and stop.
Furthermore, you can consider preemptible instances (much cheaper).
Hope this helps you.
Bye
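Before spanning multiple VMs, the same split-into-small-jobs idea can be applied across the CPUs of a single VM with the standard library. A minimal sketch, where `process_chunk` is a hypothetical stand-in for whatever per-chunk work the script does on the unpickled data:

```python
from multiprocessing import Pool

def process_chunk(chunk):
    """Stand-in for the real per-chunk work on the unpickled data."""
    return sum(x * x for x in chunk)

def split(data, n_chunks):
    """Split the workload into roughly equal chunks, one per worker."""
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(1000))
    chunks = split(data, 4)
    # Each chunk is processed in its own worker process, in parallel.
    with Pool(processes=4) as pool:
        partials = pool.map(process_chunk, chunks)
    total = sum(partials)
```

The same split/assign pattern scales up to the multi-VM case described above: each instance's startup script would fetch one subset of the chunks instead of a `Pool` worker.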

How to implement a quartz or cron scheduler in IBM Bluemix?

Hi, I have an application which needs to be scheduled in order to perform a task continuously, but I am doing the development on the IBM Bluemix cloud, and after about 10 days of research I have not been able to find a working way to implement a Quartz or cron scheduler in a web application on Bluemix.
What I have found is a service inside Bluemix called Workload Scheduler, but I have had no success so far with the steps described there.
Secondly, I found a blog with some steps for implementing a scheduler in Bluemix, but no success from that either; the link is below.
Link: http://sureshgarrepalli.blogspot.in/2015_08_01_archive.html
If anyone here can help me with this, it would be a great help. Thanks.
I am using Java and would prefer the Quartz scheduler over a cron job.
If someone has a snippet of working code, it would be much appreciated. Thanks.
Have you considered using OpenWhisk, which has the capability of triggering logic on a schedule?
See here for details: https://console.ng.bluemix.net/docs/openwhisk/openwhisk_alarms.html#openwhisk_catalog_alarm
Can you add more details on what you want to do? If you want one of your RESTful services to be invoked, the Workload Scheduler service will help.
If you want a workflow with steps spanning different systems, to monitor the execution of those steps from a user interface, to be notified if anything goes wrong, etc., the Workload Scheduler service will be your first choice.
Thanks, Umberto
