I need to run several tasks periodically in my app, but they have different periods. How can I do this?
Run each task in its own timer thread.
Run all the periodic tasks in the same timer thread, but check the time to see whether each task should be activated.
Do you have a better solution?
It would mostly depend on how many tasks you have to run.
With two or three tasks it would make sense to have a separate timer for each task, but this will get unwieldy with more tasks.
If there is a good number of tasks, I would have a single timer that checks a list of tasks to see whether any are ready to run. That way, to add a task, you just add it to the list. Having a list of tasks would also make it easy for the tasks to be data driven.
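To make the single-timer idea concrete, here is a minimal C# sketch (the question mentions the Timer control, so .NET is assumed); ScheduledTask, SingleTimerScheduler, and the member names are illustrative, not from any particular library:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Illustrative only: one timer tick drives many tasks with different periods.
class ScheduledTask
{
    public TimeSpan Period;   // how often the task should run
    public DateTime NextRun;  // when it is next due
    public Action Work;       // the work to perform
}

class SingleTimerScheduler
{
    private readonly List<ScheduledTask> _tasks = new List<ScheduledTask>();
    private Timer _timer;

    public void Add(TimeSpan period, Action work)
    {
        lock (_tasks)
        {
            _tasks.Add(new ScheduledTask
            {
                Period = period,
                NextRun = DateTime.UtcNow + period,
                Work = work
            });
        }
    }

    public void Start(TimeSpan tickInterval)
    {
        // A single System.Threading.Timer checks the whole list on every tick.
        _timer = new Timer(_ => Tick(), null, tickInterval, tickInterval);
    }

    private void Tick()
    {
        DateTime now = DateTime.UtcNow;
        lock (_tasks)
        {
            foreach (ScheduledTask task in _tasks)
            {
                if (now >= task.NextRun)
                {
                    task.Work();                       // run it (or hand it off to a worker)
                    task.NextRun = now + task.Period;  // schedule its next activation
                }
            }
        }
    }
}
```

Adding another periodic task is then just another Add call with its own period, which is also what makes the list easy to drive from data (for example, configuration).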
Sounds like you should execute each task on its own thread. That will ease the configuration of the timing and the starting/stopping of each task.
Using the Timer control is a good option if each task should execute at a fixed time interval.
I would like to know the correct way to visualize background jobs that run on a schedule controlled by a scheduling service.
In my opinion the correct way should be to have an action representing the scheduling itself, followed by a fork node which splits the flow into one branch for each of the scheduled jobs.
Example: on one schedule, Service X is supposed to collect data from an API every day; on another schedule, Service Y is supposed to aggregate the collected data.
I've tried to research older threads and find any diagram representing a similar activity.
Your current diagram says that:
first the scheduler does something (e.g. identifying the jobs to launch)
then passes control in parallel to all the jobs it wants to launch
no other jobs are scheduled after the scheduler finishes its task
the first job that finishes interrupts all the others.
The way it should work would be:
first the scheduler is set up
then the setup launches the real scheduler, which will run in parallel to the scheduled jobs
scheduled jobs can finish (flow final node, i.e. a circle with an X inside) without terminating everything
the activity would stop (activity final node) only when the scheduler is finished.
Note that the UML specifications do not specify how parallelism is implemented, and neither does your scheduler: whether it is true parallelism using multithreaded or multiple CPUs, or time slicing where interrupts are used to switch between tasks that in reality execute in small sequential pieces, is not relevant for this modeling.
The remaining challenges are:
the scheduler could launch additional jobs. One way of doing it could be to fork back to itself and to a new job.
the scheduler could launch a variable number of jobs in parallel. A better way to represent this is with a «parallel» expansion region, with the input corresponding to task objects, and actions that consume the tasks by executing them.
if the scheduler runs in parallel to the expansion region, you could also imagine that the scheduler provides additional input at any moment (new tasks to be processed).
I want to find a way to execute the script tasks of the separate instances of the same workflow sequentially.
In my case, multiple workflow instances are started in parallel on one resource by a script task, based on some attributes of the resource that the master flow is opened on, and the script tasks of those instances run in parallel, which I don't want. I tried both settings of the "Asynchronous" flag, but the script tasks still execute in parallel. For now I'm saving the duration for a sleep() call as a variable in the function that starts those instances, setting different values depending on a condition. It basically works, but it is not best practice, so maybe some of you more experienced colleagues can help me find a nicer way to solve my problem.
Use inline signal or message events to signal from one process to the other. On completion of one task in one process, signal the release of the next task in the next process.
Continue until all tasks are complete.
I have a process scheduled using Timer and TimerTask that runs nightly. Currently it takes about an hour to finish. Considering there are only 6000 records to loop through, upper management feels it is a very inefficient job. So I wanted to know if I could spawn multiple threads of the same job with different datasets. Probably each thread would process only 500 records at a time.
If I am hitting the same table for reads, inserts, and updates using multiple threads, is that OK to do?
If so, how do I run multiple threads within a timer task? I suppose I could just create the threads and run them, but how do I ensure they run simultaneously rather than sequentially?
I am using Java 1.4, this runs on JBoss 2.4, and I make use of EJB 1.1 session beans in the process to read/update/add data.
There isn't enough info in your post for a surefire answer, but I'll share some thoughts:
It depends. Generally you can do reads in parallel, but not writes. If you're doing much more reading than writing, you're probably ok, but you may find yourself dealing with frustrating race conditions.
It depends. You are never guaranteed to have threads run in parallel. That's up to the cpu/kernel/jvm to decide. You just make threads to tell the machine that it's allowed to execute them in parallel.
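The environment in the question is Java 1.4, but the shape of the approach is language-neutral: partition the records into batches, give each batch its own thread from the timer callback, and join them all before declaring the run finished. A rough C# sketch of just that shape, with LoadRecords and ProcessRecord as hypothetical stand-ins for the real data access code:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class Record { }

class NightlyJob
{
    private const int BatchSize = 500;

    // Called from the timer callback: it only fans the work out and waits for it.
    public void Run()
    {
        List<Record> records = LoadRecords();   // hypothetical: fetch the ~6000 records
        List<Thread> threads = new List<Thread>();

        for (int start = 0; start < records.Count; start += BatchSize)
        {
            int from = start;                                   // copy loop state for the closure
            int to = Math.Min(start + BatchSize, records.Count);
            Thread t = new Thread(() =>
            {
                for (int i = from; i < to; i++)
                    ProcessRecord(records[i]);                  // hypothetical per-record work
            });
            threads.Add(t);
            t.Start();   // all batch threads are started before any is joined
        }

        foreach (Thread t in threads)
            t.Join();    // wait for every batch before reporting the run as done
    }

    // Placeholders so the sketch is self-contained.
    private List<Record> LoadRecords() { return new List<Record>(); }
    private void ProcessRecord(Record r) { /* read/insert/update via your data layer */ }
}
```

As the answer above notes, starting the threads before joining only gives the runtime permission to run them concurrently; whether they truly execute in parallel is up to the CPU and scheduler, and concurrent writes to the same table still need the database (or your transactions) to arbitrate.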
I have a gearman job that runs and itself executes more jobs, which in turn may execute more jobs. I would like some kind of callback when all nested jobs have completed. I can easily do this, but my implementations would tie up workers (spinning until children are complete), which I do not want to do.
Is there a workaround? There is no concept of "groups" in Gearman AFAIK, so I can't add jobs to a group and have something fire once that group has completed.
As you say, there's nothing built into Gearman to handle this. If you don't want to tie up a worker (and have that worker add tasks and track their completion for you), you'll have to do out-of-band status tracking.
One way to do this is to keep a pair of counters per group identifier in memcached: increment the number of finished subtasks when a task finishes, and increment the total number of tasks when you add a new one to the same group. You can then poll memcached to see the current state of execution (tasks finished vs. tasks total).
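A sketch of that counter scheme, written against a hypothetical IMemcachedClient interface (Increment/Get) standing in for whichever memcached client you use; the key naming convention is also just an assumption:

```csharp
using System;

// Hypothetical minimal client surface; substitute your real memcached library here.
interface IMemcachedClient
{
    long Increment(string key, long delta);   // atomic increment, returns the new value
    long Get(string key);                     // read a counter, 0 if missing
}

class GroupTracker
{
    private readonly IMemcachedClient _cache;
    private readonly string _groupId;

    public GroupTracker(IMemcachedClient cache, string groupId)
    {
        _cache = cache;
        _groupId = groupId;
    }

    // Call before submitting a new (possibly nested) job for this group.
    public void JobAdded()
    {
        _cache.Increment("group:" + _groupId + ":total", 1);
    }

    // Call from the worker when a job belonging to this group finishes.
    public void JobFinished()
    {
        _cache.Increment("group:" + _groupId + ":done", 1);
    }

    // Poll this from whichever side wants the "all nested jobs done" callback.
    public bool IsComplete()
    {
        long total = _cache.Get("group:" + _groupId + ":total");
        long done = _cache.Get("group:" + _groupId + ":done");
        return total > 0 && done >= total;
    }
}
```

The ordering matters: increment the total counter before submitting the child job, otherwise a poll can momentarily see finished equal to total while a just-spawned job is still in flight.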
My question might sound a bit naive, but I'm pretty new to multi-threaded programming.
I'm writing an application which processes incoming external data. For each piece of data that arrives, a new task is created in the following way:
System.Threading.Tasks.Task.Factory.StartNew(() => methodToActivate(data));
The items of data arrive very fast (every second, half second, etc.), so many tasks are created. Handling each task might take around a minute. When testing I saw that the number of threads keeps increasing. How can I limit the number of tasks created so that the number of actual working threads stays stable and efficient? My computer is only dual core.
Thanks!
One of your issues is that the default scheduler sees tasks that last for a minute and assumes that they are blocked on other tasks that have yet to be executed. To try to unblock things it schedules more pending tasks, hence the thread growth. There are a couple of things you can do here:
Make your tasks shorter (probably not an option).
Write a scheduler that deals with this scenario and doesn't add more threads.
Use ThreadPool.SetMaxThreads to prevent unbounded thread pool growth (see the sketch below).
See the section on Thread Injection here:
http://msdn.microsoft.com/en-us/library/ff963549.aspx
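For the SetMaxThreads option, a minimal sketch; the cap of 4 is illustrative, and note that ThreadPool.SetMaxThreads is process-wide, so it affects everything else that uses the thread pool:

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        int worker, io;
        ThreadPool.GetMaxThreads(out worker, out io);
        Console.WriteLine("Current max: {0} worker / {1} IO threads", worker, io);

        // Cap worker threads so thread injection cannot grow the pool without bound.
        // 4 is illustrative; the call fails if the cap is below the processor count
        // or the current minimum thread count.
        bool ok = ThreadPool.SetMaxThreads(4, io);
        Console.WriteLine(ok ? "Worker thread cap applied." : "Cap rejected.");
    }
}
```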
You should look into using the producer/consumer pattern with a BlockingCollection<T> around a ConcurrentQueue<T> where you set the BoundedCapacity to something that makes sense given the characteristics of your workload. You can make your BoundedCapacity configurable and then tweak as you run through some profiling sessions to find the sweet spot.
While it's true that the TPL will take care of queueing up the tasks you create, creating too many tasks does not come without penalties. Also, what's the point in producing more work than you can consume? You want to produce enough work that the consumers will never be starved, but you don't want to get too far ahead of yourself, because that's just wasting resources and potentially stealing those very same resources from your consumers.
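A minimal producer/consumer sketch along those lines; the bounded capacity of 16 and the choice of one consumer per core are assumptions to profile, and MethodToActivate stands in for the handler from the question:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Pipeline
{
    // Bounded queue: Add blocks once 16 items are waiting, which throttles the producer.
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(new ConcurrentQueue<string>(), 16);

    public void Start(int consumerCount)
    {
        for (int i = 0; i < consumerCount; i++)
        {
            Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty and completes once
                // CompleteAdding has been called and the queue is drained.
                foreach (string data in _queue.GetConsumingEnumerable())
                    MethodToActivate(data);                 // the ~1 minute handler
            }, TaskCreationOptions.LongRunning);            // hint: dedicated thread per consumer
        }
    }

    // Call as each piece of external data arrives.
    public void OnDataArrived(string data)
    {
        _queue.Add(data);
    }

    // Call when no more data will arrive so the consumers can drain and exit.
    public void Shutdown()
    {
        _queue.CompleteAdding();
    }

    private void MethodToActivate(string data) { /* existing processing */ }
}
```

With two consumers on a dual-core machine the number of working threads stays flat, and the bounded capacity is the knob to expose in configuration and tune, as suggested above.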
You can create a custom TaskScheduler for the Task Parallel Library and then schedule tasks on it by passing an instance to the TaskFactory constructor.
Here's one example of how to do that: Task Scheduler with a maximum degree of parallelism.
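To show the wiring, here is a deliberately simple, self-contained scheduler that caps concurrency with its own dedicated background threads; it is not the implementation behind the link (which queues onto the thread pool), just a minimal sketch of passing a custom TaskScheduler to a TaskFactory:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// A simple scheduler: queued tasks run on a fixed number of dedicated threads.
class LimitedConcurrencyTaskScheduler : TaskScheduler
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();

    public LimitedConcurrencyTaskScheduler(int maxConcurrency)
    {
        for (int i = 0; i < maxConcurrency; i++)
        {
            Thread worker = new Thread(() =>
            {
                foreach (Task task in _queue.GetConsumingEnumerable())
                    TryExecuteTask(task);          // run tasks one at a time per worker
            });
            worker.IsBackground = true;
            worker.Start();
        }
    }

    protected override void QueueTask(Task task)
    {
        _queue.Add(task);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false;   // keep it simple: never inline
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return _queue.ToArray();
    }
}

class Program
{
    // e.g. 2 for a dual-core machine
    static readonly TaskFactory Factory =
        new TaskFactory(new LimitedConcurrencyTaskScheduler(2));

    static void Main()
    {
        for (int i = 0; i < 10; i++)
        {
            int n = i;
            Factory.StartNew(() => Console.WriteLine("task " + n));   // at most 2 run at once
        }
        Console.ReadLine();   // keep the process alive while the background workers drain
    }
}
```

Anything started through Factory then runs on at most two threads, no matter how fast the incoming data creates new tasks.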