Setting nextFireTime of a quartz job manually in groovy - groovy

I want to use my own error handling for Quartz jobs. Each job has a different waiting time after an exception occurs. For example, a job runs every 30 seconds, but when an exception occurs, it should wait for 5 minutes before running again.
I tried this approach, but it doesn't work:
SchedulerFactory sf = new StdSchedulerFactory()
Scheduler sched = sf.getScheduler()
def name = "jobname"
Trigger trigger = sched.getTrigger(new TriggerKey("trigger_" + name))
def currentDate = new Date()
use (TimeCategory) {
    currentDate = currentDate + 300.seconds
}
trigger.nextFireTime = currentDate
The job still runs again after 30 seconds.
What am I doing wrong?

I may be wrong, but are you sure you can reschedule a job simply by setting the nextFireTime property?
I think you have to use http://quartz-scheduler.org/api/2.2.0/org/quartz/Scheduler.html#rescheduleJob(org.quartz.TriggerKey, org.quartz.Trigger) to reschedule a job.
For example:
SchedulerFactory sf = new StdSchedulerFactory()
Scheduler sched = sf.getScheduler()
def name = "jobname"
Trigger trigger = sched.getTrigger(new TriggerKey("trigger_" + name))
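// 5 minutes in milliseconds; assumes the trigger is a SimpleTrigger implementation
trigger.repeatInterval = 300000
// Quartz 2.x signature: rescheduleJob(TriggerKey, Trigger)
sched.rescheduleJob(trigger.key, trigger)
That would run the job in 5 minutes, and then you'd have to reschedule it again to return to the 30-second interval.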
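The timing rule from the question (30 seconds normally, 5 minutes after an exception) can be sketched independently of Quartz. The Python below only illustrates that arithmetic; `next_fire_time` and the two constants are hypothetical names, not part of the Quartz API.

```python
from datetime import datetime, timedelta

NORMAL_INTERVAL = timedelta(seconds=30)  # regular schedule from the question
ERROR_BACKOFF = timedelta(minutes=5)     # waiting time after an exception


def next_fire_time(now: datetime, failed: bool) -> datetime:
    """Return when the job should run next, given whether the last run failed."""
    return now + (ERROR_BACKOFF if failed else NORMAL_INTERVAL)


now = datetime(2024, 1, 1, 12, 0, 0)
print(next_fire_time(now, failed=False))  # 2024-01-01 12:00:30
print(next_fire_time(now, failed=True))   # 2024-01-01 12:05:00
```

The value computed this way is what you would hand to the scheduler when rescheduling, rather than assigning it to the trigger directly.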

Related

MLflow is taking longer than expected time to finish logging metrics and parameters

I'm running code that performs multiple iterations for a set of products to select the best-performing model. While running multiple iterations for a single product, I need to log the details of every run using MLflow (inside a pandas UDF). Logging an individual iteration takes around 2 seconds, but the parent run under which I track every iteration takes 1.5 hours to finish. Here is the code:
@F.pandas_udf( model_results_schema, F.PandasUDFType.GROUPED_MAP )
def get_gam_pe_results( model_input ):
    ...
    ...
    for j, gam_terms in enumerate(term_list[-1]):
        results_iteration_output_1, results_iteration_output, results_iteration_all = run_gam_model(gam_terms)
        results_iteration_version = results_iteration_version.append(results_iteration_output)
        unique_id = uuid.uuid1()
        metric_list = ["AIC", "AICc", "GCV", "adjusted_R2", "deviance", "edof", "elasticity_in_k", "loglikelihood",
                       "scale"]
        param_list = ["features"]
        start_time = str(datetime.now())
        with mlflow.start_run(run_id=parent_run_id, experiment_id=experiment_id):
            with mlflow.start_run(run_name=str(model_input['prod_id'].iloc[1]) + "-" + unique_id.hex,
                                  experiment_id=experiment_id, nested=True):
                for item in results_iteration_output.columns.values.tolist():
                    if item in metric_list:
                        mlflow.log_metric(item, results_iteration_output[item].iloc[0])
                    if item in param_list:
                        mlflow.log_param(item, results_iteration_output[item].iloc[0])
                end_time = str(datetime.now())
                mlflow.log_param("start_time", start_time)
                mlflow.log_param("end_time", end_time)
Outside the pandas UDF:
current_time = str(datetime.today().replace(microsecond=0))
run_id = None
with mlflow.start_run(run_name="MLflow_pandas_udf_testing-"+current_time, experiment_id=experiment_id) as run:
    run_id = run.info.run_uuid
    gam_model_output = (Product_data
                        .withColumn("run_id", F.lit(run_id))
                        .groupby(['prod_id'])
                        .apply(get_gam_pe_results)
                        )
Note: I'm running this entire code on Databricks (the cluster has 8 cores and 28 GB of RAM).
Any idea why the parent run takes so long to finish when each iteration takes only 2 seconds?

Schedule jobs dynamically with Flask APScheduler

I'm trying the code given at advanced.py with the following modification for adding new jobs.
I created a route for adding a new job
@app.route('/add', methods = ['POST'])
def add_to_schedule():
    data = request.get_json()
    print(data)
    d = {
        'id': 'job'+str(random.randint(0, 100)),
        'func': 'flask_aps_code:job1',
        'args': (random.randint(200,300),random.randint(200, 300)),
        'trigger': 'interval',
        'seconds': 10
    }
    print(app.config['JOBS'])
    scheduler.add_job(id = 'job'+str(random.randint(0, 100)), func = job1)
    #app.config.from_object(app.config['JOBS'].append(d))
    return str(app.config['JOBS']), 200
I tried adding the jobs to config['JOBS'] as well as with scheduler.add_job, but none of my new jobs get executed. Additionally, my first scheduled job doesn't get executed until I press Ctrl+C in the terminal, after which it seems to execute twice. What am I missing?
Edit: The job running twice seems to be caused by Flask reloading, so ignore that.

How to prevent Execution usage limit in scheduled scripts

I am using a scheduled script that creates custom records based on certain criteria. Each time the scheduled script runs, it should create approximately 100,000 records, but it times out after creating 5,000 or 10,000 records. I am using the check below to avoid hitting the execution usage limit, but even with it the script does not work. Can anyone suggest something or provide any information? Any suggestions are welcome and highly appreciated.
In my for loop I am using the check below. With it included, the scheduled script is still only able to create up to 5,000 or 10,000 records.
if (nlapiGetContext().getRemainingUsage() <= 0 && (i+1) < results.length )
{
    var stateMain = nlapiYieldScript();
}
If you are going to reschedule using the nlapiYieldScript mechanism, then you also need to call nlapiSetRecoveryPoint at the point where you want the script to resume. See the Help documentation for each of these methods, as well as the page titled Setting Recovery Points in Scheduled Scripts.
Be aware that nlapiSetRecoveryPoint uses 100 governance units, so you will need to account for this in your getRemainingUsage check.
@rajesh, you are only checking the remaining usage. Also check the execution time limit, which is 1 hour for any scheduled script. Something like the snippet below:
var checkIfYieldOrContinue = function(startTime) {
    var endTime = new Date().getTime();
    var timeElapsed = (endTime * 0.001) - (startTime * 0.001);
    if (nlapiGetContext().getRemainingUsage() < 3000 ||
        timeElapsed > 3500) { //3500 secs
        nlapiLogExecution('AUDIT', 'Remaining Usage: ' + nlapiGetContext().getRemainingUsage() + '. Time elapsed: ' + timeElapsed);
        startTime = new Date().getTime();
        var yieldStatus = nlapiYieldScript();
        nlapiLogExecution('AUDIT', 'script yielded.' + yieldStatus.status);
        nlapiLogExecution('AUDIT', 'script yielded reason.' + yieldStatus.reason);
        nlapiLogExecution('AUDIT', 'script yielded information.' + yieldStatus.information);
    }
};
Inside your for loop, you can call this method like this:
var startTime = new Date();
if ((i+1) < results.length ) {
    //do your operations here and then...
    checkIfYieldOrContinue(startTime);
}
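The yield decision above combines two budgets: remaining governance units and elapsed wall-clock time. As a language-neutral illustration, here is the same check sketched in Python. `should_yield` and its default thresholds are hypothetical; they only mirror the values in the snippet (3000 units, 3500 seconds) and do not call any NetSuite API.

```python
def should_yield(remaining_usage: int, elapsed_seconds: float,
                 min_usage: int = 3000, max_seconds: float = 3500) -> bool:
    """Return True when either budget is nearly exhausted.

    min_usage should leave room for the work of one more iteration plus the
    cost of setting a recovery point; max_seconds stays below the 1-hour limit.
    """
    return remaining_usage < min_usage or elapsed_seconds > max_seconds


print(should_yield(remaining_usage=9000, elapsed_seconds=120))   # False
print(should_yield(remaining_usage=2500, elapsed_seconds=120))   # True
print(should_yield(remaining_usage=9000, elapsed_seconds=3550))  # True
```

Keeping the thresholds as parameters makes it easy to raise the usage margin if a single iteration turns out to cost more than expected.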
I have a script that lets you process an array like a forEach. The script checks each iteration and calculates the maximum usage and yields when there is not enough usage left to cover the max.
Head over to https://github.com/BKnights/KotN-Netsuite and download simpleBatch.js

JMeter: How to run test from .bat file for a specific duration ignoring TG Start & End Times

I have a test with 3 thread groups:
Startup
Actual Test
Cleanup
I do not want to have to specify a Scheduler Start/End time for each Thread Group. Instead, I want anyone to kick off the test whenever necessary from a .bat file and have it run for a duration specified in the .bat file.
My .bat file is configured as follows where I want the test to run for 30 minutes (1800 seconds):
@echo on
call ..\..\binaries\apache-jmeter-2.13\bin\jmeter -Jduration=1800 -Jhostname=localhost -Jport=18100 -n -t "API Performance.jmx" -l performanceAPITestResults.log
If I run the test from the .bat file as outlined and no Scheduler set for each Thread Group, then the test only runs once and exits. (approx. 90 seconds)
In the test, if I enable the Scheduler in each TG and specify a date in the past along with a duration for each, kicking off the .bat file results in the test being run only once and the duration being ignored. If I specify a date in the future, the test hangs, waiting for the future start time.
Anyone have any suggestions?
Additional Details
In the .jmx test, I seem to have to specify the following Scheduler in each TG:
Startup TG
  Start Date = today @ 11:00:00
  Stop Date = today @ 11:00:10
  Duration (seconds) = 10
  Startup Delay = null
Test TG
  Start Date = today @ 11:00:10
  Stop Date = today @ 11:30:10
  Duration (seconds) = 1800
  Startup Delay = null
Teardown TG
  Start Date = today @ 11:30:10
  Stop Date = today @ 11:30:15
  Duration (seconds) = 4
  Startup Delay = null
JMeter ignores the start time if it is past time. You can just parameterize the duration.
Just pass 3 arguments for duration like this.
jmeter -n -t test.jmx -Jsetup.duration=10 -Jtest.duration=1800 -Jtear.duration=4
In your test
Startup TG
  Start Date = today @ 11:00:00
  Stop Date = today @ 11:00:10
  Duration (seconds) = ${__P(setup.duration)}
  Startup Delay = null
Test TG
  Start Date = today @ 11:00:10
  Stop Date = today @ 11:30:10
  Duration (seconds) = ${__P(test.duration)}
  Startup Delay = null
Teardown TG
  Start Date = today @ 11:30:10
  Stop Date = today @ 11:30:15
  Duration (seconds) = ${__P(tear.duration)}
  Startup Delay = null
More info on the __P function:
${__P(property_name,default_value)} - When used like this, if the property is not passed to the test, the test uses the default value.
So jmeter -n -t test.jmx is enough to invoke the test.
Startup TG
  Start Date = today @ 11:00:00
  Stop Date = today @ 11:00:10
  Duration (seconds) = ${__P(setup.duration,10)}
  Startup Delay = null
Test TG
  Start Date = today @ 11:00:10
  Stop Date = today @ 11:30:10
  Duration (seconds) = ${__P(test.duration,1800)}
  Startup Delay = null
Teardown TG
  Start Date = today @ 11:30:10
  Stop Date = today @ 11:30:15
  Duration (seconds) = ${__P(tear.duration,4)}
  Startup Delay = null
If you want to override the default values, you need to pass only those properties.
Ex: jmeter -n -t test.jmx -Jtest.duration=3600 -Jtear.duration=20
will run the setup TG for 10 seconds with the default value, test TG for 3600 seconds with the overridden value and tear TG for 20 seconds with overridden value.
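To make the override behaviour concrete, here is a small Python sketch that mimics how -J flags and ${__P(name,default)} interact. It is only a model of the lookup described above, not JMeter code; `parse_j_flags` and `p` are hypothetical helper names.

```python
def parse_j_flags(argv):
    """Collect -Jname=value flags into a dict; other arguments are ignored."""
    props = {}
    for arg in argv:
        if arg.startswith("-J") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            props[name] = value
    return props


def p(props, name, default):
    """Model of ${__P(name,default)}: the passed property wins, else the default."""
    return props.get(name, default)


argv = ["-n", "-t", "test.jmx", "-Jtest.duration=3600", "-Jtear.duration=20"]
props = parse_j_flags(argv)
defaults = [("setup.duration", "10"), ("test.duration", "1800"), ("tear.duration", "4")]
durations = {name: p(props, name, default) for name, default in defaults}
print(durations)
# {'setup.duration': '10', 'test.duration': '3600', 'tear.duration': '20'}
```

Only the properties named on the command line change; everything else falls back to the default written in the test plan.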

run node scheduler between 9am and 5pm - monday to friday

I have a simple Node.js parser that has to push data to a remote server during work hours only and sleep for the rest of the time.
Looking at the available modules, schedule and node-cron (https://github.com/ncb000gt/node-cron) seem to cover part of my requirement.
I am using the PM2 module to restart the process when it goes down.
Here is what I have so far in CoffeeScript:
runParser = (callback) ->
  #...
  console.log 'waking up parser...'
  parseAll()
  return
_jobs = [ {
  name: 'Start parser'
  cronTime: '00 34 16 * * 1-5'
  onTick: runParser
  start: true
  id: 'parsedbf'
  #timeZone: 'Europe/London'
} ]
_cronJobs = {}
schedule = ->
  _jobs.map (job) ->
    _cronJobs[job.id] = new cronJob(job)
    console.log util.format('%s cronjob scheduled at %s on timezone', job.name, job.cronTime)
    return
  return
run = ->
  start = moment('08:30','HH:mm').valueOf()
  now = moment().valueOf()
  end = moment('18:00','HH:mm').valueOf()
  if start < now and now < end
    runParser()
  else
    schedule(console.info 'scheduler started...')
run(console.info 'sync code statrted after a hard reboot...')
My question: how do I change the script so that at 18:30 the parser just goes idle?
Should I use schedule.js (http://bunkat.github.io/schedule/index.html)? If so, how do I modify the code for this?
Any advice is much appreciated.
Is there any reason you can't just run this in cron? You have the question tagged with cron, which refers to the Unix utility, and it was made to do this sort of thing. You could use a combination of cron and forever: one call starts the script in the morning and another stops it in the evening, and it runs continuously in between.
