Repeating same analysis decreases completion time. How to avoid that? - alloy

I noticed that repeating the same analysis several times drastically decreases the time needed for completion.
In my case, a generation that takes 1700ms on the first run takes a mere 200ms after several repetitions.
I guess that the Analyzer, or the underlying SAT solvers, keeps track of previous analyses, and that's certainly for the better in most cases.
But currently I would like to have a more or less constant completion time. So my question is:
(tl;dr)
Is there a way to empty the Analyzer "cache" (other than restarting the Analyzer)?
EDIT
I just made several runs of this model of mine, and here is what I get:
run #1:
113309 vars. 3023 primary vars. 298922 clauses. 1964ms.
run #2:
113309 vars. 3023 primary vars. 298922 clauses. 1081ms.
run #3:
113309 vars. 3023 primary vars. 298922 clauses. 514ms.
run #4:
113309 vars. 3023 primary vars. 298922 clauses. 380ms.
run #5:
113309 vars. 3023 primary vars. 298922 clauses. 342ms.
run #6:
113309 vars. 3023 primary vars. 298922 clauses. 438ms.

I've noticed the same behavior many times, and I've never been certain about why it happens. As far as I know, neither the Alloy Analyzer nor Kodkod maintains an explicit cache of any sort (of course, there are caches used within a single translation/execution, but I don't think they are carried over between executions).
My simple explanation is that the first "slow" run is due to a "cold start": the Analyzer runs on the JVM, so early executions pay for class loading and JIT compilation. One argument for that is that if you open two unrelated Alloy models, first execute a command from the first model, and then execute a command from the second model, the second execution (in my experience) still runs "faster" than the same command executed from a cold start.
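
If a roughly constant completion time matters more than raw speed, one workaround consistent with the cold-start explanation is to pay the warm-up cost on every run by launching each analysis in a fresh JVM. The sketch below is just that, a sketch: the RunOnce driver class, jar name, and model file are assumptions, not Analyzer features.

import java.util.concurrent.TimeUnit;

// Hypothetical launcher: each analysis runs in a freshly started JVM, so
// class loading and JIT warm-up are paid on every run, giving roughly
// constant (if slower) completion times. "RunOnce" stands for your own
// driver class that parses the model and executes one command.
public class ConstantTimeRunner {
    public static void main(String[] args) throws Exception {
        for (int i = 1; i <= 6; i++) {
            long start = System.nanoTime();
            Process p = new ProcessBuilder(
                    "java", "-cp", "alloy4.jar:.", "RunOnce", "model.als")
                    .inheritIO()
                    .start();
            p.waitFor();
            long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("run #" + i + ": " + ms + "ms");
        }
    }
}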

Related

Run job repeatedly, but with no overlap and not at precise scheduled times

I have a background task that needs to be run repeatedly, every hour or so, sending me an email whenever the task emits non-trivial output.
I'm currently using cron for that, but it's somewhat ill-suited: it forces me to choose exact times at which the command is run, and it doesn't prevent overlap.
An alternative would be to run the script in a loop with sleep 3600 at the end of each iteration, but this then needs extra work to make sure the script is always restarted after boot and such.
Ideally, I'd like a cron-like tool where I can give a set of commands to run repeatedly with approximate execution rates, and the tool will run them "when convenient" and without overlapping executions of different iterations of a command (or even without overlapping executions of any command).
Short of writing such a tool myself, what would be the recommended approach?
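
For comparison, the requested semantics (approximate rate, no overlap) are easy to sketch in Java with a ScheduledExecutorService; the task body below is a placeholder for the real command.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HourlyRunner {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        // scheduleWithFixedDelay measures the delay from the END of one run
        // to the start of the next, so iterations can never overlap and the
        // rate is approximate rather than tied to exact clock times.
        scheduler.scheduleWithFixedDelay(() -> {
            // placeholder: run the real task and mail any non-trivial output
            System.out.println("running task at " + java.time.Instant.now());
        }, 0, 1, TimeUnit.HOURS);
    }
}

Such a wrapper still shares the loop approach's drawback: something like a systemd unit or init script is needed to keep the process alive across reboots.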

Block Resource in Optaplanner Job scheduling

I've managed to use the Job scheduling example for a project I'm working on. There is an additional constraint I would like to add: some resources should sometimes be blocked. For example, a global renewable resource shouldn't be used between minutes 10 and 20. Is this currently already doable, and if not, how can it be done in the score calculation?
Thanks
Use a custom shadow variable listener to predict the starting time of each task.
Then simply have a hard constraint to check that the task won't overlap with its blocks.
Penalize the amount of overlap to avoid a "score trap".
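
As a rough illustration of "penalize the amount of overlap" (the interval representation and the fixed blocked window are assumptions, not part of the OptaPlanner example):

// Minimal sketch of an overlap penalty between a task's predicted interval
// [startTime, endTime) and a blocked window [blockStart, blockEnd) on the
// same resource. Returning the overlap amount rather than a constant avoids
// a "score trap": the solver is rewarded for every minute of overlap it
// removes, not only for eliminating the overlap entirely.
public final class BlockedWindowPenalty {

    public static int overlapMinutes(int startTime, int endTime,
                                     int blockStart, int blockEnd) {
        return Math.max(0, Math.min(endTime, blockEnd)
                         - Math.max(startTime, blockStart));
    }

    public static void main(String[] args) {
        // A task predicted to run from minute 15 to 25 overlaps the blocked
        // window [10, 20) by 5 minutes, so the hard constraint weighs in at 5.
        System.out.println(overlapMinutes(15, 25, 10, 20)); // prints 5
    }
}

The overlap amount would be the weight of the hard constraint match, with the predicted starting time supplied by the shadow variable listener.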

Why in kubernetes cron job two jobs might be created, or no job might be created?

The k8s Cron Job Limitations documentation mentions that there is no guarantee that a job will be executed exactly once:
A cron job creates a job object about once per execution time of its schedule. We say "about" because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.
Could anyone explain:
why could this happen?
what are the probabilities/statistics that this could happen?
will it be fixed in some reasonable future in k8s?
are there any workarounds to prevent such behavior (if the running job can't be implemented as idempotent)?
do other cron-related services suffer from the same issue? Maybe it is a core cron problem?
The controller:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/cronjob/cronjob_controller.go
starts with a comment that lays the groundwork for an explanation:
I did not use watch or expectations. Those add a lot of corner cases, and we aren't expecting a large volume of jobs or scheduledJobs. (We are favoring correctness over scalability.)
If we find a single controller thread is too slow because there are a lot of Jobs or CronJobs, we can parallelize by Namespace. If we find the load on the API server is too high, we can use a watch and UndeltaStore.
Just periodically list jobs and SJs, and then reconcile them.
Periodically means every 10 seconds:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/cronjob/cronjob_controller.go#L105
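
A toy, level-triggered version of that design (the reconcile body is stubbed out; this is not the actual controller code):

import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy sketch of the controller's approach: no watches, just "list
// everything and reconcile" on a fixed 10-second tick. If the process is
// down across a scheduled time, the corresponding ticks simply never
// happen - which is exactly the window in which a run can be missed.
public class ReconcileLoop {
    public static void main(String[] args) {
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            // stub: list all CronJobs and Jobs, compare desired vs. actual,
            // and create/delete Jobs to reconcile the difference
            System.out.println("reconciling at " + java.time.Instant.now());
        }, 0, 10, TimeUnit.SECONDS);
    }
}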
The documentation following the quoted limitations also has some useful color on some of the circumstances under which 2 jobs or no jobs may be launched on a particular schedule:
If startingDeadlineSeconds is set to a large value or left unset (the default) and if concurrencyPolicy is set to Allow, the jobs will always run at least once.
Jobs may fail to run if the CronJob controller is not running or broken for a span of time from before the start time of the CronJob to start time plus startingDeadlineSeconds, or if the span covers multiple start times and concurrencyPolicy does not allow concurrency. For example, suppose a cron job is set to start at exactly 08:30:00 and its startingDeadlineSeconds is set to 10. If the CronJob controller happens to be down from 08:29:00 to 08:42:00, the job will not start. Set a longer startingDeadlineSeconds if starting later is better than not starting at all.
At a higher level, solving for exactly-once execution in a distributed system is hard:
https://bravenewgeek.com/you-cannot-have-exactly-once-delivery/
Clocks and time synchronization in a distributed system are also hard:
https://8thlight.com/blog/rylan-dirksen/2013/10/04/synchronization-in-a-distributed-system.html
To the questions:
why could this happen?
For instance, the node hosting the CronJobController fails at the time a job is supposed to run.
what are the probabilities/statistics that this could happen?
Very unlikely for any given run, but over a large enough number of runs you are very unlikely to escape facing this issue.
will it be fixed in some reasonable future in k8s?
There are no idempotency-related issues under the area/batch label in the k8s repo, so one would guess not.
https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue+label%3Aarea%2Fbatch
are there any workarounds to prevent such behavior (if the running job can't be implemented as idempotent)?
Think more about the specific definition of idempotent, and about the particular points in the job where there are commits. For instance, jobs can be made to support more-than-once execution if they save state to staging areas, and then there is an election process to determine whose work wins (a toy sketch of that pattern follows these questions).
do other cron-related services suffer from the same issue? Maybe it is a core cron problem?
Yes, it's a core distributed systems problem.
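
A toy illustration of the staging/election idea mentioned above (the file-based marker and the slot key are made up; a real job would more likely claim a database row or an object-store key, and on k8s the marker would need to live on storage shared by all job pods):

import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy idempotent job: the scheduled slot (e.g. "2024-05-01T08-30") acts as
// an idempotency key. Files.createFile is atomic, so if two jobs are
// launched for the same slot, exactly one wins the "election" and does the
// work; the loser sees FileAlreadyExistsException and exits cleanly.
public class IdempotentJob {
    public static void main(String[] args) throws Exception {
        String slot = args.length > 0 ? args[0] : "2024-05-01T08-30";
        Path marker = Path.of("/shared/job-" + slot + ".claimed");
        try {
            Files.createFile(marker); // atomic claim of this slot
        } catch (FileAlreadyExistsException e) {
            System.out.println("slot " + slot + " already handled; exiting");
            return;
        }
        System.out.println("doing the real work for slot " + slot);
    }
}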
For most users, the k8s documentation gives perhaps a more precise and nuanced answer than is necessary. If your scheduled job is controlling some critical medical procedure, it is really important to plan for failure cases. If it's just doing some system cleanup, missing a scheduled run doesn't much matter. Almost by definition, nearly all users of k8s CronJobs fall into the latter category.

How to make FIO replay a trace with multiple threads

I'm trying to use fio to replay some block traces.
The job file I wrote looks like:
[global]
name=replay
filename=/dev/md0
; bypass the page cache
direct=1
ioengine=psync
[replay]
; the trace to replay, in fio's iolog format
read_iolog=iolog.fio
; 0 = do not strip stalls, i.e. honor the timestamps in the iolog
replay_no_stall=0
write_lat_log=replay_metrics
numjobs=1
The key here is that I want to use "psync" as the ioengine and replay the iolog.
However, with psync, fio seems to ignore the "replay_no_stall" option and disregards the timestamps in the iolog.
And by setting numjobs to 4, fio seems to make 4 copies of the same workload, instead of using 4 threads to split the workload.
So, how can I make fio with psync respect the timestamps, and use multiple threads to replay the trace?
Without seeing a small problem snippet of the iolog itself, I can't say why the replay is always going as fast as possible. Be aware that waits are in milliseconds, and successive waits in the iolog MUST increase if the later ones are to have an effect (they are relative to the start of the job itself, not to each other or to the previous I/O). See the "Trace file format v2" section of the HOWTO for more details. This problem sounds like a good question for the fio mailing list (but as it's a question, please don't put it in the bug tracker).
numjobs is documented as only creating clones in the HOWTO so your experience matches the documented behaviour.
Sadly, fio replay currently (end of 2016) doesn't work in a way that lets a single replay file be arbitrarily split among multiple jobs, and you need multiple jobs for fio to use multiple threads/processes. If you don't mind losing I/O ordering between jobs, you could split the iolog into 4 pieces and create a job that uses each of the new iolog files, as sketched below.
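
A rough sketch of that splitting approach, assuming the v2 iolog layout described in the HOWTO (a version header and per-file add/open/close lines, with one four-token I/O entry per line in between); the file names are made up:

import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Rough sketch: split a v2 iolog round-robin into N pieces. Structural
// lines (the version header and add/open/close/wait lines, which have
// fewer than four tokens) are copied into every piece so each remains a
// valid iolog; only the four-token I/O entries (filename action offset
// length) are distributed. I/O ordering across the resulting jobs is
// deliberately given up.
public class IologSplitter {
    public static void main(String[] args) throws IOException {
        int pieces = 4;
        List<String> lines = Files.readAllLines(Path.of("iolog.fio"));
        PrintWriter[] out = new PrintWriter[pieces];
        for (int i = 0; i < pieces; i++) {
            out[i] = new PrintWriter(Files.newBufferedWriter(
                    Path.of("iolog_part" + i + ".fio")));
        }
        int next = 0;
        for (String line : lines) {
            if (line.trim().split("\\s+").length == 4) {
                out[next].println(line);       // an actual I/O entry
                next = (next + 1) % pieces;
            } else {
                for (PrintWriter w : out) {
                    w.println(line);           // header/add/open/close/wait
                }
            }
        }
        for (PrintWriter w : out) {
            w.close();
        }
    }
}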

postgresql concurrent queries debug

There is a multithreaded application executing some PL/pgSQL function. That function inserts records into a critically important resource (a table). It also executes some select/update/etc. operations along the way.
The issue is that we sometimes face duplicate (2-3) records, each passed to the function in a parallel thread, and they all end up inserted into the table as a result of the function execution, although they should not be.
It happens because the transactions are executed in parallel and have no idea that the same record is being prepared for insert in a parallel transaction.
The table is critically important, and all kinds of LOCK TABLE are extremely unwelcome (LOCK FOR SHARE MODE, meanwhile, gave us some useful experience).
So, the question is: is there any best practice for organizing a PL/pgSQL function that works with a critical resource (table), is executed by a multithreaded app, and produces no harmful locks on this resource?
PS. I know that some thread partitioning by record.ID in the app is a possible solution, but I'm interested in a PL/pgSQL solution first of all.
Sometimes you can use advisory locks - http://www.postgresql.org/docs/current/static/explicit-locking.html. These locks key on application-chosen numbers, so you can serialize only the transactions working on some subset of numbers derived from your data. I used them for synchronizing parallel inserts with success; a minimal sketch follows.
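
A minimal JDBC sketch of that idea (the table, column, key derivation, and connection details are all made up): take a transaction-scoped advisory lock on a number derived from the record's natural key before the duplicate check, so two parallel transactions preparing the same record serialize on the lock instead of on the table.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Minimal sketch: serialize concurrent inserts of the "same" record with
// pg_advisory_xact_lock. The lock key is derived from the record's natural
// key, so only transactions working on the same record block each other;
// the table itself is never locked. The lock is released automatically at
// commit or rollback.
public class AdvisoryLockInsert {
    public static void main(String[] args) throws Exception {
        String naturalKey = "order-12345"; // whatever identifies a duplicate
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "secret")) {
            con.setAutoCommit(false);
            // hashtext() maps the key to an integer lock number, so only
            // colliding keys contend for the same advisory lock
            try (PreparedStatement lock = con.prepareStatement(
                    "SELECT pg_advisory_xact_lock(hashtext(?))")) {
                lock.setString(1, naturalKey);
                lock.executeQuery();
            }
            // check-then-insert is now safe for this key: no parallel
            // transaction can sit between the same check and insert
            try (PreparedStatement ins = con.prepareStatement(
                    "INSERT INTO critical_table(natural_key) SELECT ? "
                    + "WHERE NOT EXISTS (SELECT 1 FROM critical_table "
                    + "WHERE natural_key = ?)")) {
                ins.setString(1, naturalKey);
                ins.setString(2, naturalKey);
                ins.executeUpdate();
            }
            con.commit();
        }
    }
}

The same pg_advisory_xact_lock(hashtext(...)) call can be issued as the first statement inside the PL/pgSQL function itself, which keeps the synchronization entirely on the database side, as the question asks.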
