JCL Return code FLUSH - mainframe

//STE1 IF RC EQ 1 THEN
....
//ENDIF
The return code is showing FLUSH and all the other steps in the job are not executing because of this.
Can anyone help me with this?
Is it because I haven't given an ELSE?

If you have conditions for running steps, either COND or IF, and the condition determines that a step is not run, then there is no "Return Code" from the step. The step is not run, it is FLUSHed, so there is no RC.
If the rest of the steps in your JOB are expecting to run on a RC=0, then you will have to change something.
Consult the JCL Reference; you have other options, like EVEN and ONLY, but these may not suit (I haven't a clue, as I don't know exactly what you are trying to do).
//STEPA
...
//STEPB
...
//STEPC
If STEPB depends on STEPA, and so may or may not run depending on the RC from STEPA, you need to decide what is needed for STEPC. You have three situations: STEPB not run; STEPB runs with a zero RC; STEPB runs with a non-zero RC. What should STEPC do in each case?
If STEPC has no conditional processing, then it will just run whatever happens to STEPB (except after an abend, when there is no EVEN).
If STEPC needs to run conditionally, you have to decide what it is about STEPA and STEPB which tells you how to run it.
If your JOB is big, and the conditions are complex, consider splitting it into separate JOBs and letting the Scheduler take care of it.
If your JCL is destined for Production, there should be JCL Standards to follow, and if you are unclear how to do something, you should consult those responsible for the Production JCL; they will tell you how they want it, and whether you even need to be concerned about it (as they may well just re-write it from scratch anyway).

When a particular step in a JOB is skipped due to the COND parameter or any other reason, what will be the return code displayed in the spool?

Related

Reproducing race conditions: continue all threads except one in gdb

I have found a race condition in my application code and now I am wondering how I could create a test case for it, that can be run as a test script that is determined to trigger a specific effect of the race condition and doesn't require a reproduction code patch and/or a manual gdb session.
The situation is a schoolbook example of a race condition: I have an address A and thread 1 wants to write to its location and thread 2 wants to read from it.
So I was thinking of writing a gdb script for this that breaks when thread 1 is about to write at address A, then writes some garbage into A, then has all threads continue except for thread 1. Then fire the query that causes thread 2 to be guaranteed to read the garbage at A, and then causes a segmentation fault or something.
I guess this is the reverse of set scheduler-locking = on. I was hoping there is a way to do something like this in a gdb script, but I am afraid there isn't. Hopefully somebody can prove me wrong.
Also, I am open to a non-gdb-based solution. The main point is that this race condition test can run automatically without requiring source code modifications. Think of it as an integration test.
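For what it's worth, a minimal sketch of the gdb side using gdb's Python scripting (the source location, the variable name A, and the thread number are placeholders; in the two-thread case described, scheduler-locking gives the "everyone but thread 1" effect, because only the selected thread is resumed):
# poison_race.py -- run as: gdb -x poison_race.py ./app   (all names are placeholders)
import gdb

gdb.execute("break writer.c:42")            # the line in thread 1 just before the write to A
gdb.execute("run")                          # returns once thread 1 hits the breakpoint

gdb.execute("set variable A = 0xdeadbeef")  # poison A (or *A, depending on how A is declared)
gdb.execute("set scheduler-locking on")     # from now on, continue resumes only the selected thread
gdb.execute("thread 2")                     # switch to the reading thread
gdb.execute("continue")                     # thread 2 runs alone and is forced to see the garbage
The test harness can then check gdb's output (or its exit status, when run with -batch) for the expected SIGSEGV; whether the reader really is thread 2 depends on your program, so an info threads step may be needed first.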

GitLab CI: How to keep a job going when it fails

I have this gitlab-ci job and I would like it to just ignore failures and keep going. Do you have a way of doing that? Note that allow_fail: true does not work, because it just ignores that the job has failed; I want the job to keep executing in spite of failing commands in the middle.
palms up, serious look: "We don't do that here"
The pipeline is supposed to work every time, and by design its commands cannot fail. You can however:
change the command logic to avoid failure
split the commands into different jobs, using when: on_failure to manage the workflow
force the commands to have a clean exit code (i.e. appending || true to the fallible command)
During debugging I often use the third option after a debug statement, or after commands whose behaviour I'm not sure about. The definitive version, however, is supposed to always work.

Threshold for the allowed number of failed HyperDrive runs

Because "reasons", we know that when we use azureml-sdk's HyperDriveStep we expect a number of HyperDrive runs to fail -- normally around 20%. How can we handle this without failing the entire HyperDriveStep (and then all downstream steps)? Below is an example of the pipeline.
I thought there would be an HyperDriveRunConfig param to allow for this, but it doesn't seem to exist. Perhaps this is controlled on the Pipeline itself with the continue_on_step_failure param?
The workaround we're considering is to catch the failed run within our train.py script and manually log the primary_metric as zero.
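A rough sketch of that workaround inside train.py (the metric name "accuracy" and the train_and_evaluate() entry point are placeholders for whatever the HyperDriveConfig actually uses):
from azureml.core import Run

run = Run.get_context()                 # the HyperDrive child run
try:
    metric = train_and_evaluate()       # placeholder for the real training code
    run.log("accuracy", metric)         # the primary metric HyperDrive optimises
except Exception:
    # Swallow the failure so the child run completes, and log the worst
    # possible primary metric so HyperDrive simply ranks this run last.
    run.log("accuracy", 0.0)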
Thanks for your question.
I'm assuming that HyperDriveStep is one of the steps in your Pipeline and that you want the remaining Pipeline steps to continue, when HyperDriveStep fails, is that correct?
Enabling continue_on_step_failure should allow the rest of the pipeline steps to continue when any single step fails (see the sketch at the end of this answer).
Additionally, the HyperDrive run consists of multiple child runs, controlled by the HyperDriveConfig. If the first 3 child runs explored by HyperDrive fail (e.g. with user script errors), the system automatically cancels the entire HyperDrive run, in order to avoid further wasting resources.
Are you looking to continue other Pipeline steps when the HyperDriveStep fails? Or are you looking to continue other child runs within the HyperDrive run when the first 3 child runs fail?
Thanks!
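For reference, a minimal sketch of submitting the pipeline with continue_on_step_failure enabled (here `steps` stands for the HyperDriveStep and downstream steps from your question, and the experiment name is a placeholder):
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline

ws = Workspace.from_config()                    # assumes a local config.json for your workspace
pipeline = Pipeline(workspace=ws, steps=steps)  # `steps` = the steps defined in the question
pipeline_run = pipeline.submit(
    experiment_name="hyperdrive-pipeline",      # placeholder experiment name
    continue_on_step_failure=True,              # let downstream steps run even if a step fails
)
pipeline_run.wait_for_completion(show_output=True)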

Shake: Signal whether anything had to be rebuilt at all

I use shake to build a bunch of static webpages, which I then have to upload to a remote host, using sftp. Currently, the cronjob runs
git pull # get possibly updated sources
./my-shake-system
lftp ... # upload
I’d like to avoid running the final command if shake did not actually rebuild anything. Is there a way to tell shake “Run command foo, after everything else, and only if you changed something!”?
Or alternatively, have shake report whether it did something in the process exit code?
I guess I can add a rule that depends on all possibly generated files, but that seems to be redundant and error-prone.
Currently there is no direct/simple way to determine if anything built. It's also not as useful a concept as it is for simpler build systems, as certain rules (especially those that define storedValue to return Nothing) will always "rerun", but then very quickly decide they don't need to run the rules that depend on them. To Shake, that is the same as rerunning. I can think of a few approaches; which one is best probably depends on your situation:
Tag the interesting rules
You could tag each interesting rule (one that produces something that needs uploading) with a function that writes to a specific file. If that specific file exists, then you need to upload. This might work slightly better, as if you do multiple Shake runs, and in the first something changes but the second nothing does, the file will still be present. If it makes sense, use an IORef instead of a file.
Use profiling
Shake has quite advanced profiling. If you pass shakeProfile=["output.json"] it will produce a JSON file detailing what built and when. Runs are indexed by an Int, with 0 for the most recent run, and any runs that built nothing are excluded. If you have one rule that always fires (e.g. write to a dummy file with alwaysRerun) then if anything fired at the same time, it rebuilt.
Watch the .shake.database file size
Shake has a database, stored under shakeFiles. On each uninteresting run it will grow by a fairly small amount (~100 bytes), but by a fixed amount for your system. If it changes in size by a greater amount, then it did something interesting.
Of these approaches, tagging the interesting rules is probably the simplest and most direct (although does run the risk of you forgetting to tag something).

Is it OK to use a non-zero return code for a process that executed successfully?

I'm implementing a simple job scheduler, which spawns a new process for every job to run. When a job exits, I'd like it to report the number of actions executed to the scheduler.
The simplest way I could find, is to exit with the number of actions as a return code. The process would for example exit with return code 3 for "3 actions executed".
But since the standard (AFAIK) is to use return code 0 when a process exits successfully, and any other value when there was an error, would this approach risk creating any problems?
Note: the child process is not an executable script, but a fork of the parent, so not accessible from the outside world.
What you are looking for is inter-process communication, and there are plenty of ways to do it:
Sockets
Shared memory
Pipes
Exclusive file descriptors (to some extent; rather go for something else if you can)
...
The return code convention is not something a regular programmer should dare to violate.
The only risk is confusing a calling script. What you describe makes sense, since what you really want is the count. As Joe said, use negative values for failures, and you should consider including a --help option that explains the return values ... so you can figure out what this code is doing when you try to use it next month.
I would use logs for it: log the number of actions executed to the scheduler. This way you can also log datetimes and other extra info.
I would not change the return convention...
If the scheduler spawns a child and you are the one writing it, you could also open a pipe per child, or named pipes, or maybe Unix domain sockets, and use that for inter-process communication, writing the number of processed jobs there (see the sketch below).
I would stick with conventions, namely returning 0 for success, especially if your program is visible/usable by other people, or at least document those decisions well.
Anyway, apart from conventions there are also standards.
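For what it's worth, a minimal sketch of the pipe idea in Python (the question doesn't say what language the scheduler is in, so the fork/pipe layout and the action count below are just assumptions for illustration):
import os
import sys

r, w = os.pipe()                      # one pipe per child, created before the fork
pid = os.fork()
if pid == 0:                          # child: the job
    os.close(r)
    actions_executed = 3              # placeholder for the real work
    os.write(w, str(actions_executed).encode())
    os.close(w)
    sys.exit(0)                       # exit code stays 0 = success, per convention
else:                                 # parent: the scheduler
    os.close(w)
    count = int(os.read(r, 32) or b"0")
    os.close(r)
    os.waitpid(pid, 0)
    print(f"child {pid} executed {count} action(s)")
This keeps the exit code free to mean success/failure while the count travels over the pipe.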
