I have several maintenance job stuck on my sql server , stuck from the 10th of may , what should I do first? I try to stop them from sql agent but nothing happens
I did a check with sp_whoisactive
have this kind of statments
(327853387ms)PREEMPTIVE_OS_AUTHZINITIALIZECON
update: I restarted the agent but the jobs after a while re appear with the ne execution (they are scheduled jobs)
At the end the solution was to restart sql service, the restart of the sql agent was not sufficient
Related
currently experiencing an issue with azure pipelines where a job seems to be stuck running stopping other jobs from being processed. The running job has been cancelled yet the agent says it is running, are there any solutions to this? We've tried deleting the 'azure pipelines', turning the agent off and back on again but no luck, is this likely to be an azure bug? We have not hit any caps or limits
Below you can see there is one running job.
When I click into azure pipelines no processes are running
But the agent thinks it is running Job 938 but as can be seen it is not running
Any help appreciated, thanks
I have a cron scheduled to run on Thor cluster. Is there a way to monitor a cron running on HPCC Cluster and send a notification if the cron is not running due to a failure or system shutdown?
Akhilesh,
The only way I can think of to do that would be to make the CRON job periodically send a "ping" of some sort (an email, or update a semaphore file, or ... ) then have a separate process running on another box to alert someone if that "ping" doesn't arrive as scheduled (indicating the CRON job is no longer working).
HTH,
Richard
I need to know , how to stop a azure databricks cluster by doing configuration when it is running infinitely for executing a job.(without manual stopping)and as well as create an email alert for it, as the job running time exceeds its usual running time.
You can do this in the Jobs UI, Select your job, under Advanced, edit the Alerts and Timeout values.
This Databricks docs page may help you: https://docs.databricks.com/jobs.html
Sorry but can't find he configuration point a need. I schedule spark application, sometimes they may not succeed after 1 hour, in this case I want to automatically kill this task (because I am sure it will never succeed, and another scheduling may start).
I found a timeout configuration, but as I understand it, this is used to delay the start of a workflow.
So is there a kind of living' timeout ?
Oozie cannot kill a workflow that it triggered. However you can ensure that a single workflow is running at same time by setting Concurrency = 1 in the Coordinator.
Also you can have a second Oozie workflow monitoring the status of the Spark job.
Anyawy, you should investigate the root cause of Spark job not successful or being blocked.
I have been using Google Dataproc for a few weeks now and since I started I had a problem with canceling and stopping jobs.
It seems like there must be some server other than those created on cluster setup, that keeps track of and supervises jobs.
I have never had a process that does its job without error actually stop when I hit stop in the dev console. The spinner just keeps spinning and spinning.
Cluster restart or stop does nothing, even if stopped for hours.
Only when the cluster is entirely deleted will the jobs disappear... (But wait there's more!) If you create a new cluster with the same settings, before the previous cluster's jobs have been deleted, the old jobs will start on the new cluster!!!
I have seen jobs that terminate on their own due to OOM errors restart themselves after cluster restart! (with no coding for this sort of fault tolerance on my side)
How can I forcefully stop Dataproc jobs? (gcloud beta dataproc jobs kill does not work)
Does anyone know what is going on with these seemingly related issues?
Is there a special way to shutdown a Spark job to avoid these issues?
Jobs keep running
In some cases, errors have not been successfully reported to the Cloud Dataproc service. Thus, if a job fails, it appears to run forever even though it (has probably) failed on the back end. This should be fixed by a soon-to-be released version of Dataproc in the next 1-2 weeks.
Job starts after restart
This would be unintended and undesirable. We have tried to replicate this issue and cannot. If anyone can replicate this reliably, we'd like to know so we can fix it! This may (is provably) be related to the issue above where the job has failed but appears to be running, even after a cluster restarts.
Best way to shutdown
Ideally, the best way to shutdown a Cloud Dataproc cluster is to terminate the cluster and start a new one. If that will be problematic, you can try a bulk restart of the Compute Engine VMs; it will be much easier to create a new cluster, however.