We use Hudson for our continuous build environment. For some reason, the SCM polling thread sometimes hangs after a while. I've experimented a lot with the settings, but nothing seems to really work. How can this be fixed, and are there scripts out there that can detect such a case so Hudson can be restarted? By the way, restarting Hudson is currently the only way we can resolve this issue.
That is similar to bug 5413, which should have been solved since late 2010 by HUDSON-5977 (Hudson 1.380+, or Jenkins nowadays).
That thread includes a way to kill any thread stuck on the polling step:
A very primitive Groovy script (I'm too lazy to develop something better, as this is not a very important issue) is below.
It may happen that it also kills SCM polling threads which are not stuck, but we run this script automatically only once a day, so it doesn't cause any trouble for us.
You can improve it, e.g. by saving the ids and names of the SCM polling threads, checking again after some time, and killing only the threads whose ids are on the list from the previous check (a sketch of that two-pass approach follows the script below).
Thread.getAllStackTraces().keySet().each() { item ->
    if (item.getName().contains("SCM polling") &&
        item.getName().contains("waiting for hudson.remoting")) {
        println "Interrupting thread " + item.getId()
        item.interrupt()
    }
}
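For reference, a minimal sketch of that two-pass improvement (the names and the five-minute wait are illustrative, not from the original answer): remember the ids of suspicious threads, wait a while, and interrupt only the ones that are still stuck.
def suspects = Thread.getAllStackTraces().keySet().findAll { item ->
    item.getName().contains("SCM polling") &&
    item.getName().contains("waiting for hudson.remoting")
}.collect { it.getId() }

// Wait before re-checking; five minutes is an arbitrary choice.
sleep(5 * 60 * 1000)

Thread.getAllStackTraces().keySet().each { item ->
    if (suspects.contains(item.getId()) &&
        item.getName().contains("SCM polling") &&
        item.getName().contains("waiting for hudson.remoting")) {
        println "Interrupting thread " + item.getName()
        item.interrupt()
    }
}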
The other answer didn't work for me, but the following script, which finds the polling threads causing this problem, did:
Jenkins.instance.getTrigger("SCMTrigger").getRunners().each() { item ->
    println(item.getTarget().name)
    println(item.getDuration())
    println(item.getStartTime())
    long millis = Calendar.instance.time.time - item.getStartTime()
    if (millis > (1000 * 60 * 3)) { // 1000 millis in a second * 60 seconds in a minute * 3 minutes
        Thread.getAllStackTraces().keySet().each() { tItem ->
            if (tItem.getName().contains("SCM polling") && tItem.getName().contains(item.getTarget().name)) {
                println "Interrupting thread " + tItem.getName()
                tItem.interrupt()
            }
        }
    }
}
I am very new to NodeJS and am trying to develop an application which acts as a scheduler that fetches data from one ELK instance and sends the processed data to another ELK instance. I am able to achieve the expected behaviour, but after completing all the processing, the scheduler job does not exit and instead waits for the next scheduled run to come up.
Note: This scheduler runs every 3 minutes.
job.js
const self = module.exports = {
    async schedule() {
        if (process.env.SCHEDULER == "MinuteFrequency") {
            var timenow = moment().seconds(0).milliseconds(0).valueOf();
            var endtime = timenow - 60000;
            var starttime = endtime - 60000 * 3;
            // sendData is an async method
            reports.sendData(starttime, endtime, "SCHEDULER");
        }
    }
}
I tried various solutions such as Promise.allSettled(...), Promise.resolve(true), etc., but was not able to fix this.
As per my requirement, I want the scheduler to complete its processing and exit so that I can save some resources, as I am planning to deploy the application using Kubernetes cronjobs.
When all your work is done, you can call process.exit() to cause your application to exit.
In this particular code, you may need to know when reports.sendData() is actually done before exiting. We would have to see that code to know how to tell when it has finished. Just because it's an async function doesn't mean it's written properly to return a promise that resolves when its work is done. If you want further help, show us the code for sendData() and any code that it calls.
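A minimal sketch of that idea, assuming sendData really does return a promise that resolves when its work is finished (the require path for reports is a placeholder):
const moment = require("moment");
const reports = require("./reports"); // placeholder path for the module that exports sendData

async function schedule() {
    if (process.env.SCHEDULER == "MinuteFrequency") {
        const timenow = moment().seconds(0).milliseconds(0).valueOf();
        const endtime = timenow - 60000;
        const starttime = endtime - 60000 * 3;
        // Wait for the work to actually finish before allowing the process to exit.
        await reports.sendData(starttime, endtime, "SCHEDULER");
    }
}

schedule()
    .then(() => process.exit(0)) // all done, exit cleanly
    .catch((err) => {
        console.error(err);
        process.exit(1); // non-zero exit code so the Kubernetes cronjob reports a failure
    });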
I've been working with Node for the first time in a while again and stumbled upon node-schedule, which for the most part has been a breeze. However, I've found resuming a scheduled task after cancelling it via job.cancel() pretty difficult.
For the record, I'm using schedule to perform specific actions at a specific date (non-recurring) and under some circumstances cancel the task at a specific date but would later like to resume it.
I tried using job.cancel(true) after cancelling it via plain job.cancel() first, as the documentation states that this would reschedule the task, but it has not worked for me. Using job.reschedule() after having cancelled the job first yields the same result.
I could probably come up with an inelegant solution, but I thought I'd ask if anyone knows of an elegant one first.
It took me a while to understand the node-schedule documentation ^^
To un-cancel a job, you have to pass some options to reschedule.
If you don't pass anything to reschedule, the function returns false (an error occurred).
For example, you can declare the options and pass that variable like this:
const schedule = require('node-schedule');

let options = {rule: '*/1 * * * * *'}; // Declare schedule rules

let job = schedule.scheduleJob(options, () => {
    console.log('Job processing!');
});

job.cancel();            // Cancel the job
job.reschedule(options); // Reschedule the job
Hope it helps.
I am using Qt5 under Windows7.
I know how to create a task using QThread, but my problem is:
How do I run it every day at 03:00AM?
I was thinking about QTimer, but it doesn't seem to fit... it can't somehow be tied to 03:00 AM.
Just to make it clear: I can't use an external Windows application. It must be coded inside my Qt app, as it does some cleanup work too: clean up the history list, trim it down to 1000 lines (or whatever), etc. So you see, I can't do that using the Task Scheduler or similar Windows tools...
You can use the Windows Task Scheduler to do this for you.
What's wrong with using a QTimer? I agree that a task scheduler is the better option: with a timer, only about 0.03% of the time the code is executed is it really supposed to do something. If the exact moment is not that important, you can increase the timer interval and widen the check boundaries to reduce the unnecessary calls. But if you prefer such a solution, this should work:
someclass::someclass(){
    member_timer = new QTimer(this);
    QObject::connect(member_timer, SIGNAL(timeout()), this, SLOT(check_time()));
    member_timer->start(30000); // check every 30 seconds
    member_cleanup_performed = false;
}

void someclass::check_time(){
    QTime ctime = QTime::currentTime();
    if(ctime.hour() == 3 && ctime.minute() == 0){
        if(member_cleanup_performed == false){
            this->cleanup();
            member_cleanup_performed = true;
        }
    }else{
        member_cleanup_performed = false;
    }
}
If you can use C++11, have a look at std::this_thread::sleep_until.
Run it in a separate thread and let the thread emit a signal connected to a slot in the main thread, which then performs the action. That of course requires that your application is actually running at 3 am.
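If it helps, a minimal C++11 sketch of that approach (the names are illustrative; in the Qt application the callback would emit a signal connected, via a queued connection, to a slot in the main thread):
#include <chrono>
#include <ctime>
#include <functional>
#include <thread>

// Compute the next occurrence of 03:00 local time.
std::chrono::system_clock::time_point next_3am()
{
    using namespace std::chrono;
    auto now = system_clock::now();
    std::time_t tnow = system_clock::to_time_t(now);
    std::tm local = *std::localtime(&tnow);
    local.tm_hour = 3;
    local.tm_min = 0;
    local.tm_sec = 0;
    auto next = system_clock::from_time_t(std::mktime(&local));
    if (next <= now)
        next += hours(24); // already past 03:00 today, schedule for tomorrow
    return next;
}

// Run this in a separate std::thread; in the Qt app, 'task' would emit the signal.
void run_daily_at_3am(std::function<void()> task)
{
    for (;;) {
        std::this_thread::sleep_until(next_3am());
        task();
    }
}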
I have an IRC bot written in Perl, using the deprecated, undocumented and unloved Net::IRC library. Still, it runs just fine... unless the connection goes down. It appears that the library ceased to be updated before they've implemented support for reconnecting. The obvious solution would be to rewrite the whole bot to make use of the library's successors, but that would unfortunately require rewriting the whole bot.
So I'm interested in workarounds.
Current setup I have is supervisord configured to restart the bot whenever the process exits unexpectedly, and a cron job to kill the process whenever internet connectivity is lost.
This does not work as I would like it to, because the bot seems incapable of detecting that it has lost connectivity due to internet outage. It will happily continue running, doing nothing, pretending to still be connected to the IRC server.
I have the following code as the main program loop:
while (1) {
    $irc->do_one_loop;
    # can add stuff here
}
What I would like it to do is:
a) detect that the internet has gone down,
b) wait until the internet has gone up,
c) exit the script, so that supervisord can resurrect it.
Are there any other, better ways of doing this?
EDIT: The in-script method did not work, for unknown reasons. I'm trying to make a separate script to solve it.
#!/usr/bin/perl
use Net::Ping::External;

while (1) {
    # Wait here while the connection is up; this loop exits once a ping fails.
    while (Net::Ping::External::ping(host => "8.8.8.8")) { sleep 5; }
    # The connection is down; wait until it comes back up.
    sleep 5 until Net::Ping::External::ping(host => "8.8.8.8");
    # Kill the bot so supervisord can restart it.
    system("sudo kill `pgrep -f 'perl painbot.pl'`");
}
Assuming that do_one_loop will not hang (you may need to add an alarm if it does), you'll need to actively poll something to tell whether or not the network is up. Something like this should work: after a failure, it pings every 5 seconds until it gets a response, then exits.
use Net::Ping::External;

sub connectionCheck {
    return if Net::Ping::External::ping(host => "8.8.8.8");
    sleep 5 until Net::Ping::External::ping(host => "8.8.8.8");
    exit;
}
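Called from the existing main loop, for example:
while (1) {
    $irc->do_one_loop;
    connectionCheck(); # exits once the connection has dropped and come back, so supervisord restarts the bot
}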
Edit:
Since do_one_loop does seem to hang, you'll need some way to wrap a timeout around it. The amount of time depends on how long you expect it to run for, and how long you are willing to wait if it becomes unresponsive. A simple way to do this is using alarm (assuming you are not on Windows):
local $SIG{'ALRM'} = sub { die "Timeout" };
alarm 30; # 30 seconds
eval {
    $irc->do_one_loop;
    alarm 0;
};
The Net::IRC main loop has support for timeouts and scheduled events.
Try something like this (I haven't tested it, and it's been 7 years since I last used the module...):
# connect to IRC, add event handlers, etc.

$time_of_last_ping = $time_of_last_pong = time;
$irc->timeout(30);

# Can't handle PONG in Net::IRC (!), so handle the "No origin specified" error
# (this may not work for you; you may rather do this some other way)
$conn->add_handler(409, sub { $time_of_last_pong = time });

while (1) {
    $irc->do_one_loop;

    # check internet connection: send PING to server
    if ( time - $time_of_last_ping > 30 ) {
        $conn->sl("PING"); # Should be "PING anything"
        $time_of_last_ping = time;
    }
    last if time - $time_of_last_pong > 90;
}
I have a list on which I have an ItemUpdated handler.
When I edit using the datasheet view and modify every item, the ItemUpdated event will obviously run for every single item.
In my ItemUpdated event, I want it to check if there is a Timer Job scheduled to run. If there is, then extend the SPOneTimeSchedule schedule of this job to delay it by 5 seconds. If there isn't, then create the Timer Job and schedule it for 5 seconds from now.
I've tried looking to see whether the job definition exists in the handler; if it does exist, I extend the schedule by 5 seconds, and if it doesn't, I create the job definition to run in a minute's time.
MyTimerJob rollupJob = null;

foreach (SPJobDefinition job in web.Site.WebApplication.JobDefinitions)
{
    if (job.Name == Constants.JOB_ROLLUP_NAME)
    {
        rollupJob = (MyTimerJob)job;
    }
}

if (rollupJob == null)
{
    rollupJob = new MyTimerJob(Constants.JOB_ROLLUP_NAME, web.Site.WebApplication);
}

SPOneTimeSchedule schedule = new SPOneTimeSchedule(DateTime.Now.AddSeconds(5));
rollupJob.Schedule = schedule;
rollupJob.Update();
When I try this out on the server, I get a lot of errors:
"An update conflict has occurred, and you must re-try this action. The object MyTimerJob Name=MyTimerJobName Parent=SPWebApplication Name=SharePoint -80 is being updated by NT AUTHORITY\NETWORK SERVICE in the w3wp process."
I think the job is probably running for the first time and once running, the other ItemUpdated events are coming in and finding the existing Job definition. It then tries to Update this definition even though it is currently being used. Should I make a new Job Definition name so that it doesn't step on top of the first? Or raise the time to a minute?
I solved this myself by just setting the delay to a minute's time from now, regardless of whether a definition is found. This way, while the job is busy, the scheduling keeps getting pushed back until it is done processing.
This is because the event is asynchronous. You'll need to rethink exactly what you're trying to solve with this code and potentially re-factor it.
Maybe you should try using "lock" on the timer job object?