Should a batch scheduled process (for example, a nightly process) be modeled as a Use Case? It is something the system should do, but there is no Actor "using" the feature, because it is scheduled.
Any suggestions?
Thanks!
We've defined a 'Scheduler' actor to model that scenario. The Scheduler usually has its own set of use cases: batch jobs, executables that need to run regularly, and so on. For example, the Use Case for a job that runs 24 times a day can be written as "The Use Case begins when the current time is on the hour." We try not to include too many of these cases because it is too easy to get bogged down in implementation details; we wait until genuinely important activities have to be timed, like monthly close procedures for the accounting department. The use cases don't mention any software specifics (like the name of the scheduling software), just that the Use Case is triggered by the Scheduler actor on a given day and/or time.
First attempt:
Time can be an actor in your use case.
But, as you said, it is strange as a primary actor.
You can consider a human alternative.
So ask yourself:
The system automatically runs a batch scheduled process, but when? How?
So WHO will tell the system when and how to run your scheduled process? Is there a role that configures the batch scheduled process? If so..
Second attempt:
There is a good article on the IBM site, Dear Dr. Use Case: Is the Clock an Actor?
And you can check a similar question: Is TIME an actor in a use case?
The system (OS) is the "actor":
http://en.wikipedia.org/wiki/Actor_%28UML%29
In UML, an "Actor" is not just a person; it can be a process or the OS. You just add a stereotype indicating it is a "system".
For example, a program that sends a token or NFT to a specific address once a month.
No program on Solana will be executed unless an off-chain actor submits a transaction containing an instruction for that program. There is no timer mechanism inherent to Solana that will automatically execute your transactions at a later date.
You can write a program to restrict an instruction such that it can only be executed successfully once per month. The program can compare the current timestamp against the previous execution to decide whether it's allowed to execute now. Or it could check the number of months since the previous execution and transfer the appropriate number of tokens that should be available after that many months.
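To make that check concrete, here is a minimal sketch of the gating logic. Real Solana programs are written in Rust and read the time from the Clock sysvar; the Python below (with hypothetical names like last_run_ts and tokens_per_month) only illustrates the arithmetic:

    SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # rough; a real program might compare calendar months

    def months_elapsed(last_run_ts: int, now_ts: int) -> int:
        # Whole "months" since the last successful execution.
        return (now_ts - last_run_ts) // SECONDS_PER_MONTH

    def monthly_release(state: dict, now_ts: int, tokens_per_month: int) -> int:
        # Fail unless at least one month has elapsed, then release tokens
        # for every month accrued since the previous execution.
        months = months_elapsed(state["last_run_ts"], now_ts)
        if months < 1:
            raise ValueError("too early: may run at most once per month")
        state["last_run_ts"] += months * SECONDS_PER_MONTH
        return months * tokens_per_month  # amount to transfer

Advancing last_run_ts by whole months (rather than setting it to "now") keeps the schedule from drifting when the crank is turned late.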
Additionally, you need to consider the incentives of the actor who submits the instruction. Does an ordinary user have reason to execute some instructions in your program already? If it fits within the compute budget, you can bundle this monthly logic along with the other logic that users routinely execute. If not, then you need to incentivize someone else to make sure the instruction is executed often enough. You could just submit a transaction once a month on your own. Or you could design your program to collect fees from ordinary users so it can pay rewards to a crank turner who runs these periodic instructions so you don't have to. You also need to let people know they can get paid for running a crank.
So, there are ways to get things to run periodically, but you need to get creative to make it happen. There are some interesting ideas that build on the primitives I have described; you can go pretty far down this rabbit hole. It has been proposed that multisig can play a role in a generic cron timer. As always, this would still require someone to turn the crank by submitting transactions to the network periodically. https://github.com/solana-labs/solana/issues/17218
I am trying to implement a Django web application (on Python 3.8.5) which allows a user to create “activities” where they define an activity duration and then set the activity status to “In progress”.
The POST action to the View writes the new status, the duration and the start time (the end time, based on start time and duration, could of course also be added here).
The back-end should then keep track of the duration and automatically change the status to “Finished”.
User actions can also change the status to “Finished” before the calculated end time (i.e. the timer no longer needs to be tracked).
I am fairly new to Python, so I need some advice on the smartest way to implement such a concept.
It needs to be efficient and scalable – I’m currently using a Heroku Free account so I have limited system resources, but efficiency would also be important for future production implementations of course.
I have looked at the Python threading Timer, and this seems to work on a basic level, but I’ve not been able to determine what kind of constraints this places on the system – e.g. whether the spawned Timer thread might prevent the main thread from finishing and releasing resources (i.e. Heroku Dyno threads), etc.
I have read that persistence might be a problem (if the server goes down), and I haven’t found a way to cancel the timer from another process (the .cancel() method seems to rely on having the original object to cancel, and I’m not sure if this is achievable from another process).
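To make the concern concrete, this is the kind of usage I have been testing (a minimal sketch):

    import threading

    def finish_activity(activity_id):
        # In the real app this would set the activity's status to "Finished".
        print(f"activity {activity_id} finished")

    timer = threading.Timer(3600.0, finish_activity, args=(42,))
    timer.start()   # fires once, an hour from now, in *this* process only

    # Cancelling requires a reference to this exact object; there is no
    # handle another process could use, and the timer is lost on restart.
    # A non-daemon Timer also keeps the process alive until it fires.
    timer.cancel()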
I was also wondering about a more “background” approach, i.e. a single process which is constantly checking the database looking for activity records which have reached their end time and swapping the status.
But what would be the best way of implementing such a server?
Is it practical to read the database every second to find records with an end time of “now”? I need the status to change in real-time when the end time is reached.
Is something like Celery a good option, or is it overkill for a single process like this?
As I said I’m fairly new to these technologies, so I may be missing other obvious solutions – please feel free to enlighten me!
Thanks in advance.
To achieve this you need some kind of task-scheduling functionality. For a fast, simple implementation, the Timer object from the threading module is a good solution.
A more complete solution is to use Celery. Even if you are new to it, digging into Celery gives you a good head start: it works as a queue manager, distributing your work easily across several threads or processes.
You mentioned that you want it to be efficient and scalable, so I guess you will eventually need similar functionality that requires multiprocessing and scheduling; for that reason my recommendation is to use Celery.
You can integrate it into your Django application easily by following the documentation: Integrate Django with Celery.
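As a minimal sketch of that approach, you could schedule a task for the activity's end time with apply_async(eta=...) and revoke it if the user finishes early. The app, model, and field names below (myapp, Activity, task_id) are hypothetical, and it assumes Celery is wired into the Django project as in the linked documentation:

    # tasks.py
    from celery import shared_task

    @shared_task
    def finish_activity(activity_id):
        from myapp.models import Activity  # hypothetical app and model
        # Filtering on the current status makes the task a harmless no-op
        # if the user already finished the activity early.
        Activity.objects.filter(id=activity_id, status="In progress") \
                        .update(status="Finished")

    # In the view, after saving the activity:
    #     result = finish_activity.apply_async(args=[activity.id], eta=activity.end_time)
    #     activity.task_id = result.id   # store it so the task can be cancelled
    #
    # If the user sets the status to "Finished" themselves:
    #     from myproject.celery import app
    #     app.control.revoke(activity.task_id)

With a persistent broker the scheduled task also survives dyno restarts, which addresses the persistence worry above.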
I've managed to use the Job scheduling example for a project I'm working on. There is an additional constraint I would like to add: some Resources should be blocked at certain times. For example, a Global renewable Resource shouldn't be used between minutes 10 and 20. Is this currently already doable and, if not, how can it be done in the score calculation?
Thanks
Use a custom shadow variable listener to predict the starting time of each task.
Then simply have a hard constraint to check that the task won't overlap with its blocks.
Penalize the amount of overlap to avoid a "score trap".
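For illustration, a sketch of that overlap penalty (plain Python rather than OptaPlanner's Java constraint code; variable names are hypothetical):

    def overlap(task_start, task_end, block_start, block_end):
        # Time by which the task intrudes into the blocked window; 0 if disjoint.
        return max(0, min(task_end, block_end) - max(task_start, block_start))

    # The example from the question: a resource blocked between minutes 10 and 20.
    assert overlap(15, 25, 10, 20) == 5   # penalize by 5, not merely "overlaps"
    assert overlap(25, 30, 10, 20) == 0   # no penalty once the task is clear

Penalizing by the amount of overlap gives the solver a gradient to follow, which is exactly what avoids the score trap.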
I want to design a job scheduler cluster, which contains several hosts that do cron job scheduling. For example, when a job that needs to run every 5 minutes is submitted to the cluster, the cluster should decide which host fires the next run, making sure of:
Disaster tolerance: as long as not all of the hosts are down, the job should be fired successfully.
Validity: only one host fires each job run.
Because of disaster tolerance, a job cannot be bound to a specific host. One way is to have all the hosts poll a DB table (with a lock, certainly), which guarantees that only one host gets the next job run. Since this locks the table often, is there any better design?
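To make that concrete, the polling could claim one due row per transaction with row-level locking instead of locking the whole table. This sketch assumes PostgreSQL (for SKIP LOCKED), a psycopg2-style connection, and a hypothetical jobs table:

    CLAIM_SQL = """
        SELECT id, command
          FROM jobs                    -- hypothetical table
         WHERE next_run_at <= now()
         ORDER BY next_run_at
         LIMIT 1
           FOR UPDATE SKIP LOCKED     -- other hosts skip rows already claimed
    """

    def claim_and_run(conn, run_job, reschedule):
        with conn:                     # one transaction per claim
            with conn.cursor() as cur:
                cur.execute(CLAIM_SQL)
                row = cur.fetchone()
                if row is None:
                    return False       # nothing due, or another host holds it
                job_id, command = row
                run_job(command)
                reschedule(cur, job_id)  # write the job's next_run_at
        return True

Every host can run this loop concurrently: the row lock gives validity (one host per run), and any surviving host can pick up the work, which gives the disaster tolerance.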
Use the Quartz framework for that. It has a cron-like syntax, can be clustered, and only one of the hosts in the cluster will run a given job at a time. If a host or job fails, another host will retry the pending job.
I found Dkron (a distributed job scheduling system) by googling. It has a REST API and looks good; I plan to try it.
Dkron site
I'm not sure how to design one, but there are open-source products that do this which can serve as an example. One is the Quartz scheduler mentioned above.
But, apparently, WalmartLabs evaluated Quartz, found it to be not good enough, and thus created and open-sourced a (in their opinion) better alternative to it called BigBen. Perhaps you could also look at that one.
Consider using AWS Simple Workflow Service if you are OK with using AWS web services. The benefit over something like Quartz is that it doesn't depend on a database that you have to host, and it can provide much more than scheduling. For example, it can run activities that fix your cluster, or page you if scheduling is not possible for any reason. Here is an example of a cron workflow.
I needed something like this long ago, when synchronisation was done with floppy disks. You should be clear about three things, which seem simple, but in a distributed environment they aren't :-)
"Synchronisation Sections"
If you get a net split, which means your cluster is split into two separate sections which can communicate inside each section but not between the two sections, then "fire the job exactly once" can only be achieved per synchronisation section.
"Disaster"
If almost all of the time all computers are up and running, only very seldom one fails, and the failure of two is almost unthinkable, it's a completely different thing than when every host is running only part of the time, the connections are unstable, or the synchronisation is done by dial-up connections or by floppies. If you want to deal even with a net split, it becomes really, really complicated.
If you want to deal with malicious hosts, you have yet another problem.
"Validity"
To fire every job exactly once, you have to synchronize faster than the job-firing interval.
edit: A tip for scheduler-task design. I had a big text file containing lines. Every line is a job task, starting with the job type, then the time to execute, then the command, and last but not least an optional resubmission interval for repeating tasks. Syncing means merging. Executed tasks are deleted. If resubmission is on, a new task is inserted or appended.
In an ideal world, where every host is always connected to the others, I would implement something like a token ring. If there is no master, one is selected by the hosts, and the master is expected to schedule everything until it stops sending heartbeats for some time. If there are two masters, they negotiate for one of them to become master (maybe by lower MAC address... whatever).
If you have to deal with malicious hosts, you can use a solution to the Byzantine generals problem. The selection of the master is already pretty well proven against malicious hosts. With a little RSA crypto, the selected master can sign every command; replay attacks can be handled with timestamps or growing indices... voilà.
Only as a story from an old programmer, not intended for today's everything-is-always-connected-to-the-internet world:
My big problem about 20 years ago was that the hosts were synchronized anywhere from once an hour or once a day to once a week or once a month. So the solution was to have different command types:
1. execute on every host at a given date (which is far enough in the future for synchronisation)
2. execute on the host whose "whoami" contains a certain substring.
3. execute on a random host with low probability, and send an acknowledgement to all others that it has already been executed.
The third command type does something like "fire only once" if the synchronisation is much faster than the probability of execution. It needs no master-slave architecture, and it works pretty well if you know the synchronisation intervals.
Check out Chronos (https://mesos.github.io/chronos/), which runs on top of the Mesos resource scheduler (https://mesos.apache.org/).
I have an application that has to launch jobs repeatedly. But (yes, that would have been too easy without a but...) I would like users to define their backup frequency in the application.
In the worst case, they would have to choose between:
weekly,
daily,
every 12 hours,
every 6 hours,
hourly
In the best case, they should be able to use crontab expressions (see the documentation for examples).
How should I do this? Do I launch a job every minute that checks the last execution time and the frequency, and then launches another job if needed? Do I create some sort of queue that is executed by a master job?
Any clues, ideas, opinions, best practices, and experiences are welcome!
EDIT: Solved this problem using the Akka scheduler. OK, this is a technical solution, not a design answer, but still, everything works great.
Each user-defined repetition is an actor that sends messages every period to a new actor that executes the actual job.
There may be two ways to do this depending on your requirements/architecture:
If you can only use Play:
The user creates the job and the frequency at which it will run (crontab, whatever).
On saving the job, you calculate the first time it will have to run. You then add an entry to a JOBS table with the execution time, job id, and any other information required. This is needed because Play is stateless and information must be stored in the DB for later retrieval.
You have a job that queries the table for entries whose execution date is earlier than now. It retrieves the first one, runs it, removes it from the table, and adds a new entry for the next execution (see the sketch after this list). You should keep an execution counter so that if a task fails (which means the entry is not removed from the DB), it won't block execution of the other tasks by being tried again and again.
This job is set to run every second. That way, while there is information in the table, the requests should be executed about as often as they are required. As Play won't spawn a new job while the current one is working, if you have enough tasks this one job will serve them all. If not, it will be killed at some point and restored when required.
Of course, the users' crons will not be too precise, as you have to account for your own cron delays plus the execution delays of all the tasks in the queue, which run sequentially. Not the best approach, unless you somehow disallow crons which run every second, or (to be safe) more often than every minute. Checking the execution time of the crons and killing them if they run over a certain amount of time would be a good idea.
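For illustration, the loop described above might look like this sketch (Play jobs are written in Java/Scala; Python is used here only to show the logic, and the db and entry helpers are hypothetical):

    import datetime

    MAX_ATTEMPTS = 3   # keep one permanently failing entry from blocking the rest

    def run_due_jobs(db):
        now = datetime.datetime.utcnow()
        for entry in db.fetch_due(now):        # entries with execution_time <= now
            if entry.attempts >= MAX_ATTEMPTS:
                db.mark_failed(entry)          # give up on this entry
                continue
            try:
                entry.run()
            except Exception:
                db.increment_attempts(entry)   # stays in the table, retried next tick
                continue
            db.remove(entry)
            db.insert(entry.job_id, entry.next_execution_time())  # next occurrence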
If you can use more than Play:
The better alternative, I believe, is to use Quartz (see this) to create a future execution when the user creates the job, and to reprogram it once the execution is over.
There was a discussion on Google Groups about this. As far as I remember, you must define a job which starts every 6 hours and checks which backups must be done. So you must remember when the last backup job finished and do that check yourself. I'm unsure whether Quartz can handle such a requirement.
I looked in the source code (always a good source ;-)) and found a method every, which I think should do what you want. However, I'm unsure whether this is a clever design, because if you have 1000 users you will then have 1000 jobs. I'm unsure whether Play was built to handle such a large number of jobs.
[Update] For cron expressions you should have a look at JobPlugin.scheduleForCRON()
There are several ways to solve this.
If you don't have a really huge load of jobs, I'd just persist them to a table with the flexibility you require. Then check all of them every hour (or the lowest interval you support) and run those that are eligible. Simple.
Or, if you prefer to use cron syntax anyway, just write (export) the jobs to a user crontab using a wrapper which calls back into your running app, or which starts the job in a standalone process if that's possible.