How to make multiple Gearman workers work on a single job - gearman

I am processing 100,000 (1 lakh) URLs using a Perl Gearman client and workers.
I need your help to run a single job across multiple workers (i.e., if I have 5 workers and 1 client, I want all 5 workers to do that one client's job). Currently I am running 20 clients and 30 workers, but only 20 workers are running jobs; the remaining 10 workers are idle.
Thanks in advance

A Gearman worker grabs one job and treats it as a single execution unit. If you would like to run one job on multiple workers, you should divide your job into several sub-jobs.

You can create a manager which manages the sub-jobs and coordinates the other workers.
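A minimal sketch of the sub-job idea in Python. The chunking itself is plain Python; the Gearman submission lines are commented out because they assume the python-gearman client, a running gearmand, and a hypothetical url_download task registered by the workers:

```python
# Split one large job (100,000 URLs) into sub-jobs that N workers
# can each grab independently from the Gearman server.

def split_into_subjobs(urls, n_subjobs):
    """Divide a list of URLs into n_subjobs roughly equal chunks."""
    chunk_size = -(-len(urls) // n_subjobs)  # ceiling division
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]

urls = ["http://example.com/page/%d" % i for i in range(100000)]
subjobs = split_into_subjobs(urls, 5)

# Each chunk can then be submitted as its own Gearman job, e.g.:
# client = gearman.GearmanClient(['localhost:4730'])
# for chunk in subjobs:
#     client.submit_job('url_download', "\n".join(chunk), background=True)
```

With 5 sub-jobs submitted, all 5 workers have something to grab, instead of one worker holding the single big job.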

The approach you need is called fan-out, but Gearman can't do this; Gearman hands each job to exactly one worker. You would have to use a message queue such as RabbitMQ instead, which can deliver the same message to several different workers through a fanout exchange.
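To make the difference concrete, here is a toy in-memory model of fanout semantics (the class and names are made up for illustration; a real RabbitMQ fanout exchange does this across the network): every bound queue receives its own copy of each message, whereas Gearman gives each job to exactly one worker.

```python
# Toy model of a fanout exchange: publish() copies the message into
# every bound queue, so every consumer sees every message.
class FanoutExchange:
    def __init__(self):
        self.queues = []

    def bind(self):
        """Bind a new queue to the exchange and return it."""
        q = []
        self.queues.append(q)
        return q

    def publish(self, message):
        for q in self.queues:
            q.append(message)  # each queue gets its own copy

exchange = FanoutExchange()
worker_a = exchange.bind()
worker_b = exchange.bind()
exchange.publish("process http://example.com/page/1")
```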

Related

How to manage Managed Executor Service

I'm using a Managed Executor Service to implement a process manager which processes tasks in the background upon receiving a JMS message event. Normally there will be a small number of tasks running (maybe 10 max), but what if something happens and my application starts getting hundreds of JMS message events? How do I handle such an event?
My thought is to limit the number of threads if possible, save all the other messages to a database, and run them when a thread becomes available. Thanks in advance.

"My thought is to limit the number of threads if possible, save all the other messages to a database, and run them when a thread becomes available."
The detailed answer to this question depends on which Java EE app server you choose to run on, since they all have slightly different configuration.
Any Java EE app server will allow you to configure the thread pool size of your Managed Executor Service (MES); this is the number of worker threads in your thread pool.
Say you have 10 worker threads and you get flooded with 100 requests all at once: the MES will keep a queue of backlogged requests, and the worker threads will take work off the queue whenever they finish, until the queue is empty.
Now, it's fine if work goes to the queue sometimes, but if your work queue grows faster overall than your worker threads can drain it, you will run into problems: the backlog will keep growing and your server will eventually run out of memory. The solution is to increase your thread pool size.
"What if something happens and my application starts getting hundreds of JMS message events? How do I handle such an event?"
If the load on your server will be so sporadic that tasks would need to be saved to a database, the best approach would be one of the following:
increase the thread pool size
have the server immediately reject incoming tasks when the backlog queue is full
have clients do a blocking wait until the server's task queue is no longer full (I would only advise this option if client task submission is in no way connected to user experience)
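The "reject immediately when the backlog is full" option can be sketched language-agnostically. Here is a rough Python stand-in for the MES (a fixed thread pool plus a bounded backlog queue; the pool and queue sizes are arbitrary), where put_nowait() raising on a full queue plays the role of the immediate rejection:

```python
# A bounded backlog models the MES request queue; POOL_SIZE models the
# configured thread pool size. submit() rejects work when the backlog
# is full instead of letting it grow without bound.
import queue
import threading

POOL_SIZE = 10
backlog = queue.Queue(maxsize=50)  # bounded backlog

def worker():
    while True:
        task = backlog.get()
        if task is None:          # shutdown sentinel
            break
        task()                    # run the task
        backlog.task_done()

threads = [threading.Thread(target=worker, daemon=True) for _ in range(POOL_SIZE)]
for t in threads:
    t.start()

def submit(task):
    """Return True if the task was accepted, False if the backlog is full."""
    try:
        backlog.put_nowait(task)
        return True
    except queue.Full:
        return False
```

A real MES handles the queueing for you; the point of the sketch is only that a bounded queue turns "run out of memory eventually" into an explicit accept/reject decision at submission time.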

Fork Node.js clusters as working load changes

I am trying to fork worker clusters up to a maximum of 10, and only when the working load increases. Can it be done?
I have tried strong-cluster-control's setSize, but I can't find an easy way of forking automatically (for example, forking when many requests are coming in), or of closing/"suiciding" forks (maybe with a timeout if nothing is being done, like in this answer).
This is my repo's main file at GitHub
Thank you in advance!!
I assume that you already have some idea of how you would like to spread your load, so I will not go into details about that and will instead focus on the interprocess communication this requires.
Notifying the master
To send arbitrary data to the master, you can use process.send() from a worker. The way I would go about this is roughly along these steps:
The application is started
Minimum amount of workers are spawned
Each worker sends the master a request message via process.send() every time it receives a new request
The master keeps track of the request events from all workers
If the number of request events rises above a predefined threshold (e.g., > 100 requests/s), the master spawns a new worker
If the number of request events falls below a predefined threshold, the master asks one of the workers to stop accepting new requests and close itself gracefully (note that it should not simply kill the process, to avoid interrupting ongoing requests)
The main point is: do not focus on time - focus on rate. In an application that is supposed to handle tens to thousands of requests per second, a setTimeout() (whose task might be to kill a worker that has been idle for too long) will never fire, because Node.js distributes your load evenly across your workers - you could start with one worker, but once you reach your maximum you will never drop back to one worker under continuous load, even if there is only one request per second.
It should be noted that it is counterproductive to spawn more workers than the number of CPU cores you have at your disposal. It might, however, be beneficial to start with a single worker and incrementally increase the number up to the core count as load increases.
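The spawn/retire decision the steps above describe boils down to a small piece of rate-based logic. A sketch in Python (the thresholds are assumptions, the 10-worker cap comes from the question, and the cluster.fork()/process.send() plumbing is omitted):

```python
# Rate-based scaling decision: the master tracks requests/s and decides
# whether to spawn a new worker, retire one, or do nothing.

MIN_WORKERS = 1
MAX_WORKERS = 10          # hard cap from the question
SPAWN_ABOVE = 100         # requests/s per worker before spawning
RETIRE_BELOW = 20         # requests/s per worker before retiring

def desired_workers(current_workers, requests_per_second):
    """Return the new worker count based on the observed request rate."""
    rate_per_worker = requests_per_second / current_workers
    if rate_per_worker > SPAWN_ABOVE and current_workers < MAX_WORKERS:
        return current_workers + 1
    if rate_per_worker < RETIRE_BELOW and current_workers > MIN_WORKERS:
        return current_workers - 1   # ask one worker to drain and exit
    return current_workers
```

The master would call this each time it recomputes the request rate, then fork or gracefully retire a worker when the returned count differs from the current one.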

How many master nodes can we create on a single server using Node.js?

Hi, I am implementing an email client application. My application is going to deal with 10000 × 10000 records, so for scalability I preferred the cluster concept in Node.js. My requirement is that every 1000 records should be handled by one master node with the help of its workers. So my question is: on a single server, how many master nodes are allowed? If anyone knows, please let me know. Thanks in advance.
Master 1 -> should handle 1000 records
4 workers (4 records can be processed at a time if the CPU has 4 cores)
Master 2 -> should handle 1000 records
4 workers (4 records can be processed at a time if the CPU has 4 cores)
I need to handle the records like the above.
The answer is 1. Node.js scales nicely on one CPU, and to use more CPUs you have the workers. You also can't have multiple masters share the same port, so how would the email clients know how to connect to the additional masters?

Does a Python Gearman worker accept multiple tasks?

For example:
I have a task named "URLDownload"; its function is to download a large file from the internet.
Now I have one worker process running, but about 1000 files to download.
It is easy for a client process to create 1000 tasks and send them to the Gearman server.
My question is: will the worker process do the tasks one by one, or will it accept multiple tasks at once?
If the worker process can accept multiple tasks, how can I limit the task pool size in the worker process?
Workers process one request at a time. You have a few options:
1) You can run multiple workers (this is the most common method). Workers sit in poll() when they aren't processing, so this model works pretty well.
2) Write a fork() implementation around the worker. This way you can fire up a set number of worker processes without having to monitor each one by hand.
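Option 2 can be sketched with Python's pool abstractions, which cap the task pool size at a fixed number. multiprocessing.Pool would give you real fork()ed worker processes; ThreadPool shares the same API and is used here only so the sketch runs anywhere, and download() is a stub standing in for the real task body so no network or Gearman server is needed:

```python
# Cap concurrency with a fixed-size pool: at most POOL_SIZE downloads
# are in flight at once, however many tasks are queued.
from multiprocessing.pool import ThreadPool

POOL_SIZE = 4  # at most 4 downloads running at a time

def download(url):
    """Stub for the real URLDownload task body."""
    return "downloaded %s" % url

urls = ["http://example.com/file%d" % i for i in range(10)]
with ThreadPool(POOL_SIZE) as pool:
    results = pool.map(download, urls)  # results come back in input order
```

In a real deployment each pool worker would instead run a Gearman worker loop (register the task, then call work()), so the pool size is what limits how many tasks are processed simultaneously.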

Tomcat thread control

I have two Tomcat servers running at the same time. Reports requested from server 1 are sent to server 2 for processing. How would I go about managing the threads on server 2? For example, if I wanted to queue up the work, how would I do that?
Use a message queue (like RabbitMQ) in the middle to queue up the tasks that need to be done.
Then your report-generating server can pull jobs from the queue and work on them. If you need to slow down or speed up, you can change the number of "workers" running.
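An in-process sketch of that pattern, with server 1 as producer and server 2's report workers as consumers (with RabbitMQ, the queue.Queue below becomes a broker queue and the threads become processes on another machine, but the shape is the same):

```python
# Producer/consumer through a queue in the middle: the producer only
# enqueues; a pool of workers drains the queue at its own pace.
import queue
import threading

jobs = queue.Queue()              # stands in for the message queue
results = []
results_lock = threading.Lock()

def report_worker():
    """Server 2: pull report jobs off the queue and process them."""
    while True:
        report_id = jobs.get()
        if report_id is None:     # shutdown sentinel
            break
        with results_lock:
            results.append("report %d done" % report_id)
        jobs.task_done()

NUM_WORKERS = 3                   # "speed up" by raising this number
workers = [threading.Thread(target=report_worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

for report_id in range(8):        # server 1: enqueue report requests
    jobs.put(report_id)
jobs.join()                       # wait until every report is processed

for _ in workers:                 # shut the workers down
    jobs.put(None)
for w in workers:
    w.join()
```

The key property is that the producer never blocks on report generation: throughput is tuned entirely by the number of consumers.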
