How to FTP a file with a run number appended to the end of the file name using JCL - mainframe

We have a job that runs 4 times a day. Each time it runs, it creates a file and FTPs it as "LCD_BI_ORDERINFO_&OYMD._&OHHMM._001_01.txt". I want each run to append 01, 02, 03, or 04 to the end of the file name, like this:
LCD_BI_ORDERINFO_&OYMD._001_01.txt
LCD_BI_ORDERINFO_&OYMD._001_02.txt
LCD_BI_ORDERINFO_&OYMD._001_03.txt
LCD_BI_ORDERINFO_&OYMD._001_04.txt
Thanks in advance

Wrap your existing job as a cataloged procedure with a parameter indicating the suffix number. Now, instead of one job that runs four times a day, have four different sets of execution JCL for the cataloged proc, each of which will execute once a day, specifying the correct suffix number.
Schedule these in your shop's job scheduler.
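A minimal sketch of the idea, with hypothetical names throughout (proc ORDPROC, symbolic parameter SUFFIX, dataset ORDER.EXTRACT, host/user placeholders); &OYMD. is assumed to be supplied by your scheduler as it is today. Resolving &SUFFIX inside the in-stream FTP input relies on SYMBOLS=JCLONLY, which needs z/OS 2.1 or later; on older systems, build the FTP input file in a preceding step instead.
//ORDPROC  PROC SUFFIX=01
//* ... your existing extract step(s) go here ...
//FTPSTEP  EXEC PGM=FTP
//SYSPRINT DD SYSOUT=*
//INPUT    DD *,SYMBOLS=JCLONLY
your.ftp.host
youruser
yourpass
PUT 'ORDER.EXTRACT' LCD_BI_ORDERINFO_&OYMD._001_&SUFFIX..txt
QUIT
/*
Each of the four sets of execution JCL then runs once a day and overrides the symbolic; the second one, for example, would contain:
//STEP1    EXEC ORDPROC,SUFFIX=02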

Related

How to generate batch files in a loop in Python (not a loop in a batch file) with slightly changed parameters per iteration

I am a researcher who needs to run files for a set of years on a SLURM system (a high-performance computing center). The available nodes for long compute times have a long queue. I have 42 years to run, and the only way to get my files processed quickly (given the wait times and the many GB of data involved) is to submit them individually, one batch file per year, as jobs. I cannot include multiple years in a single batch file, or I have to wait a week in the queue due to the time I have to reserve per batch file. This is the fastest way my university's system lets me run my data.
To do this, I have 2 lines in my batch script that I have to change every time: the name of the job, and the last line, which is the Python script name plus a parameter being passed to it (the year), like so: pythonscript.py 2020.
I would like to generate batch files with a Python or other script I can run, where it loops over a list of years, changes the job name to jobNameYEAR, changes the last line to pythonscript.py YEAR, writes that to a file jobNameYEAR.sl, then continues the loop to output the next batch file. ...Even better if it can also submit the job (sjob jobNameYEAR) before continuing in the loop, but I realize maybe this is asking too much. But separately...
Is there a way to submit jobs in a loop once these files are created? E.g. loop through the year list and submit sjob jobName2000.sl, sjob jobName2001.sl, sjob jobName2002.sl.
I do not want a loop in the batch file changing the variable; that would mean reserving too many hours on the SLURM system for a single job. I want a loop outside the batch file that generates multiple batch files I can submit as jobs.
Thank you for your help!
This is what one of my .sl files looks like, it works fine, I just want to generate these files in a loop so I can stop editing them by hand:
#!/bin/bash -l
# The -l above is required to get the full environment with modules
# Set the allocation to be charged for this job
# not required if you have set a default allocation
#SBATCH -A MYFOLDER
# The name of the job
#SBATCH -J jobNameYEAR
# 3 hour wall-clock time will be given to this job
#SBATCH -t 3:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=30GB
# Job partition
#SBATCH -p main
# load the anaconda module
ml PDC/21.11
ml Anaconda3/2021.05
conda activate myEnv
python pythonfilename.py YEAR
Create a script with the following content (let's call it chainsubmit.sh):
#!/bin/bash
SCRIPT=${1?Usage: $0 script.slurm arg1 arg2 ...}
shift
ARG=$1
# Submit the first job; --parsable makes sbatch print only the job ID
ID=$(sbatch --job-name=jobName$ARG --parsable $SCRIPT $ARG)
shift
for ARG in "$@"; do
    # Chain each following job on successful completion of the previous one
    # (${ID%%;*} strips any ";cluster" suffix --parsable may append)
    ID=$(sbatch --job-name=jobName$ARG --parsable --dependency=afterok:${ID%%;*} $SCRIPT $ARG)
done
Then, adapt your script so that the last line
python pythonfilename.py YEAR
is replaced with
python pythonfilename.py $1
Finally, submit all the jobs with
./chainsubmit.sh jobName.sl {2000..2004}
for instance, for YEAR ranging from 2000 to 2004.
… script I can run, where it loops over a list of years and just changes the job name to jobNameYEAR and changes the last line to pythonscript.py YEAR, writes that to a file jobNameYEAR.sl… submit the job (sjob jobNameYEAR) before continuing in the loop…
It can easily be done with a few shell commands and sed. Assume you have a template file jobNameYEAR.sl as shown, which literally contains jobNameYEAR and YEAR as the parameters. Then we can substitute YEAR with each given year in the loop, e.g.
seq 2000 2002 | while read year
do
    sed "s/YEAR$/$year/" jobNameYEAR.sl > jobName$year.sl
    sjob jobName$year.sl
done
If your years aren't in sequence, we can use e.g. printf '%s\n' 1962 1965 1970 instead of seq … (printf puts one year per line, which is what while read expects).
Other variants are also possible on Linux, like for year in {2000..2002} instead of seq 2000 2002 | while read year, or using envsubst instead of sed, as sketched below.
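For the envsubst variant, the template's placeholders would have to be written as shell-style references (e.g. jobName${YEAR} and ${YEAR}); a sketch, assuming such a template saved as jobNameTemplate.sl:
for year in {2000..2002}
do
    # substitute only $YEAR references from the environment
    YEAR=$year envsubst '$YEAR' <jobNameTemplate.sl >jobName$year.sl
    sjob jobName$year.sl
done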

Scheduling more jobs than MaxArraySize

Let's say I have 6233 simulations to run. The commands are generated and stored in a file, one per line. I would like to use Slurm to schedule and run these commands. However, the MaxArraySize limit is 2000, so I can't use one job array to schedule all of them.
One solution is given here, where we create four separate jobs and use arithmetic indexing into the file, with the last job having a smaller number of tasks to run (233).
Is it possible to do this using one sbatch script with one job ID?
I set ntasks=1 when using job arrays. Do larger ntasks help in such situations?
Update:
Following Damien's solution and examples given here, I ended up with the following line in my bash script:
curID=$(( ${SLURM_ARRAY_TASK_ID} * ${SLURM_NTASKS} + ${SLURM_PROCID} ))
The same can be done using Python (shown in the referenced page). The only difference is that the environment variables should be imported into the script.
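The resulting index can then pick the command out of the file, for example (assuming a hypothetical commands.txt holding one command per line; this runs inside each task started by srun, where SLURM_PROCID is set):
# curID is 0-based, sed line numbers are 1-based
command=$(sed -n "$(( curID + 1 ))p" commands.txt)
eval "$command"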
Is it possible to do this using one sbatch script with one job ID?
No, that solution will give you multiple job IDs.
I set ntasks=1 when using job arrays. Do larger ntasks help in such situations?
Yes, that is a factor that you can leverage.
Each job in the array can spawn multiple tasks (--ntasks=...). In that case, the line number in the command file must be computed from $SLURM_ARRAY_TASK_ID and $SLURM_PROCID, and the program must be started with srun. Each task in a job member of the array will run in parallel. How large the job can be will depend on the MaxJobsize limit defined on the cluster/partition/qos you have access to.
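A minimal sketch of that first option, with illustrative sizes (624 array members times 10 tasks gives 6240 slots for the 6233 commands) and a hypothetical commands.txt:
#!/bin/bash
#SBATCH --array=0-623
#SBATCH --ntasks=10
# srun starts 10 tasks; each one computes its own line in commands.txt
srun bash -c '
    LINE=$(( SLURM_ARRAY_TASK_ID * SLURM_NTASKS + SLURM_PROCID + 1 ))
    CMD=$(sed -n "${LINE}p" commands.txt)
    # guard against the few unused slots past line 6233
    if [ -n "$CMD" ]; then eval "$CMD"; fi
'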
Another option is to chain the tasks inside each job of the array, with a Bash loop (for i in $(seq ...); do ...; done). In that case, the line number in the command file must be computed from $SLURM_ARRAY_TASK_ID and $i. Each task in a job member of the array will run serially. How large the job can be will depend on the MaxWall limit defined on the cluster/partition/qos you have access to.
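And a sketch of this second option, again with a hypothetical commands.txt and 10 serial commands per array member:
#!/bin/bash
#SBATCH --array=0-623
#SBATCH --ntasks=1
# each array member works through its own slice of 10 lines serially
for i in $(seq 0 9); do
    LINE=$(( SLURM_ARRAY_TASK_ID * 10 + i + 1 ))
    CMD=$(sed -n "${LINE}p" commands.txt)
    if [ -n "$CMD" ]; then eval "$CMD"; fi
done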

Cron Expression: starting a job after completion of another job

I have to perform 2 jobs - A and B. Job 'A' is to be performed at 9:00 am on every weekday. I don't know the duration of job 'A', though; the duration may vary.
I also want to perform job 'B' 3 minutes after the completion of job 'A'.
Can anyone suggest a cron expression for this, please?
Assuming you are trying to run the second job three minutes after the first job completes: let's say you have Job A, which involves calling /home/user/job_a.sh, and once that completes you want to run /home/user/job_b.sh. Instead of trying to set up two different cron jobs, you could just make a separate script, say job_c.sh. All job_c.sh does is run Job A, wait three minutes and then run Job B.
Basically, rather than calling two Cron Jobs and trying to sort out timing for both of them, you can just establish one Cron Job which runs both jobs.
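A minimal sketch of that wrapper, reusing the hypothetical paths above:
#!/bin/bash
# job_c.sh: run Job A, wait three minutes, then run Job B
/home/user/job_a.sh
sleep 180
/home/user/job_b.sh
A single crontab entry such as 00 9 * * 1-5 /home/user/job_c.sh then covers both jobs.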
On the other hand, if you want to run the second job three minutes after the first one starts, then you might as well create two cron jobs three minutes apart, which would look something like this:
00 9 * * 1-5 /home/user/job_a.sh
03 9 * * 1-5 /home/user/job_b.sh

Creating a cron job that sends output to a file every day and overwrites this file every month

I need help with a cron job that sends output to a file every day and overwrites this file every month. My only problem is how to make it overwrite each month, and I need this in one job, so creating 2 jobs (one that outputs to a file and another that removes it every month) is out of the picture.
You could run it every day but use date +%d to print the day of the month and act differently (call with > to clobber the file instead of >> to append) based on that.
Note that some cron daemons require % to be escaped, hence \%.
# Run every day at 00:30 but overwrite the file on the 1st of the month; append every other day.
# Note that this requires bash as your shell.
# May need to override with SHELL=/bin/bash
30 00 * * * if [ "$(date +\%d)" = "01" ]; then /your/command > /your/logfile; else /your/command >> /your/logfile; fi
Edit:
You mention in comments above that your actual goal is log rotation.
The norm for Linux systems is to use something like logrotate to manage logs like this. That also has the advantage that you can keep multiple previous log files and compress them if you like.
I would recommend using a logrotate config snippet to accomplish your goal instead of doing it in the cron job itself; putting this in the cron job is counter-intuitive if it's merely for log rotation.
Here's an example logrotate snippet, which may go in a location like /etc/logrotate.d/yourapp depending on which Linux distribution you're using.
/var/log/yourlog {
daily
missingok
# keep one year of logs
rotate 365
compress
# keep the first one uncompressed for ease of viewing
delaycompress
}
This will result in your log file being rotated daily, with the first iteration being like /var/log/yourlog.1 and then compressed iterations like /var/log/yourlog.2.gz, /var/log/yourlog.3.gz and so on.
In my opinion therefore, your question is not actually a cron question. The kind of cron trickery used above would only be appropriate in situations such as when you want a job to fire on the last Sunday of the month, or the last day of the month, or other criteria that can't be expressed in cron syntax.

Cron job with two parameters - start time and end time - should cover the entire day

I have written a shell script for data extraction that accepts two parameters - a start time and an end time in YYDDMMHHSSSS format. The shell script in turn runs SQL queries and fetches data between these two date parameters.
My intention is to deploy the shell script as a cron job that runs at least once every day (preferably every 6 hours). The second time it runs, it should use the last end time as the start time, and the new end time as, say, (start time + 6 hours), so all data is always extracted exactly once. Another job will kick off at 12 midnight every day and pick up the data that the shell script deposited for that day.
I have never set up a cron job before, but it looks doable from what I have read; I'm just not sure whether the above can be done?
Cron executes jobs at specific times and/or days, with all parameters for the script defined at the time the job is placed into the cron job table. The script needs to handle all other requirements. If your requirements are based on the current time and the last time the script was executed, then the script will need to preserve the time of execution each time it is run and obtain the time of the previous invocation from the information preserved.
In this particular case, because you are accessing a database, I suggest that you use the database to preserve time of the previous script execution.
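A minimal sketch of that idea using a state file (the database variant would store the timestamp in a table instead); the extraction script path and the timestamp format are assumptions:
#!/bin/bash
# Wrapper to be run from cron, e.g.:  0 */6 * * * /path/to/extract_wrapper.sh
STATE=/var/tmp/extract.last          # holds the end time of the previous run
END=$(date '+%y%d%m%H%M%S')          # adapt to your script's expected format
if [ -f "$STATE" ]; then
    START=$(cat "$STATE")            # resume from where the last run stopped
else
    START=$END                       # first run: nothing extracted yet
fi
# Record the new end time only if the extraction succeeded
/path/to/extract.sh "$START" "$END" && echo "$END" > "$STATE"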
