Python 3 BlockingScheduler killed without apparent reason - python-3.5

I am running a basic blocking scheduler and it is being killed for no apparent reason. In my console, a "Killed" message appears, but that's all. Any idea how I could find out why it was killed? My function is as simple as the one below.
from apscheduler.schedulers.blocking import BlockingScheduler
import pandas as pd
import time

sched = BlockingScheduler()

@sched.scheduled_job('cron', day_of_week='mon,tue', hour=17, minute=45)
def scheduled_job():
    print("Start time: ", pd.datetime.now(), "\n")
    fct.start()  # fct is the asker's own object, defined elsewhere
    time.sleep(100)
    fct.stop()
    print("End time: ", pd.datetime.now(), "\n\n")
    return

sched.start()
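On Linux, a bare "Killed" message usually means the kernel's OOM killer terminated the process for using too much memory; the reason is logged by the kernel (dmesg, /var/log/syslog) rather than by Python. As an aid for narrowing it down from inside the job, here is a minimal sketch (the helper name log_peak_memory is mine) using the standard-library resource module to log peak memory on each run:

import resource

def log_peak_memory(label):
    # ru_maxrss is reported in kilobytes on Linux
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print("%s peak RSS: %.1f MB" % (label, peak_kb / 1024.0))

Calling this at the start and end of scheduled_job would show whether memory grows from run to run.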

Related

Passing run_date in seconds to apscheduler's add_job

I want to schedule a function in APScheduler to happen only once, at a specified time from now. The date trigger's run_date accepts datetime objects and/or strings thereof:
from datetime import datetime
import os
from apscheduler.schedulers.asyncio import AsyncIOScheduler

try:
    import asyncio
except ImportError:
    import trollius as asyncio

def tick():
    print('Tick! The time is: %s' % datetime.now())

if __name__ == '__main__':
    scheduler = AsyncIOScheduler()
    scheduler.add_job(tick, 'date', run_date=datetime.fromordinal(datetime.toordinal(datetime.now()) + 1))
    scheduler.start()
    print('Press Ctrl+{0} to exit'.format('Break' if os.name == 'nt' else 'C'))

    # Execution will block here until Ctrl+C (Ctrl+Break on Windows) is pressed.
    try:
        asyncio.get_event_loop().run_forever()
    except (KeyboardInterrupt, SystemExit):
        pass
Is there built-in functionality in APScheduler to specify the delay directly in seconds, e.g. something like application_start_time + specified_delay? I tried datetime.fromordinal(datetime.toordinal(datetime.now())+1) as the argument to run_date to fire 1 second after my application starts, but this just hangs forever without ever calling the tick function, showing the following deprecation warning:
Press Ctrl+C to exit
/tmp/a.py:30: DeprecationWarning: There is no current event loop
asyncio.get_event_loop().run_forever()
datetime.fromordinal(datetime.toordinal(datetime.now()) + 1) is midnight of the next day, not one second from now, which is why nothing fires. What you probably wanted was:
from datetime import datetime, timedelta, timezone
scheduler.add_job(tick, "date", run_date=datetime.now(timezone.utc) + timedelta(seconds=1))
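On Python 3.10+, asyncio.get_event_loop() also warns when no loop is running, as seen above. A minimal sketch of one way around it, assuming APScheduler 3.x (whose AsyncIOScheduler accepts an event_loop argument): create the loop explicitly and hand it to the scheduler.

import asyncio
from datetime import datetime, timedelta, timezone
from apscheduler.schedulers.asyncio import AsyncIOScheduler

def tick():
    print('Tick! The time is: %s' % datetime.now())

if __name__ == '__main__':
    loop = asyncio.new_event_loop()  # create the loop explicitly instead of relying on get_event_loop()
    asyncio.set_event_loop(loop)
    scheduler = AsyncIOScheduler(event_loop=loop)
    # fire roughly one second after the application starts
    scheduler.add_job(tick, 'date', run_date=datetime.now(timezone.utc) + timedelta(seconds=1))
    scheduler.start()
    try:
        loop.run_forever()
    except (KeyboardInterrupt, SystemExit):
        pass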

How to exit ThreadPoolExecutor with statement immediately when a future is running

Coming from a .NET background, I am trying to understand Python multithreading using concurrent.futures.ThreadPoolExecutor and submit. I was trying to add a timeout to some code for a test, but have realised I don't exactly understand some elements of what I'm trying to do. I have put some simplified code below. I would expect the method to return after around 5 seconds, when the call to concurrent.futures.wait(futures, return_when=FIRST_COMPLETED) completes. In fact it takes the full 10 seconds. I suspect it has to do with my understanding of the with statement, since changing the code to thread_pool = concurrent.futures.ThreadPoolExecutor(max_workers=2) results in the behaviour I would expect. Adding a call to the shutdown method doesn't do anything, as all the futures are already running. Is there a way to exit out of the with statement immediately following the call to wait? I have tried using break and return but they have no effect. I am using Python 3.10.8.
from concurrent.futures import FIRST_COMPLETED
from datetime import datetime
import concurrent.futures
import time

def test_multiple_threads():
    set_timeout_on_method()
    print("Current Time =", datetime.now())  # Prints time N + 10

def set_timeout_on_method():
    futures = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as thread_pool:
        print("Current Time =", datetime.now())  # Prints time N
        futures.append(thread_pool.submit(time.sleep, 5))
        futures.append(thread_pool.submit(time.sleep, 10))
        concurrent.futures.wait(futures, return_when=FIRST_COMPLETED)
        print("Current Time =", datetime.now())  # Prints time N + 5
    print("Current Time =", datetime.now())  # Prints time N + 10
AFAIK, there is no native way to terminate threads from ThreadPoolExecutor and it's supposedly not even a good idea, as described in existing answers (exhibit A, exhibit B).
It is possible to do this with processes in ProcessPoolExecutor, but even then the main process would apparently wait for all the processes that have already started:
If wait is False then this method will return immediately and the resources associated with the executor will be freed when all pending futures are done executing. Regardless of the value of wait, the entire Python program will not exit until all pending futures are done executing.
This means that even though the "End #" would be printed after about 5 seconds, the script would terminate after about 20 seconds.
from concurrent.futures import FIRST_COMPLETED, ProcessPoolExecutor, wait
from datetime import datetime
from time import sleep

def multiple_processes():
    print("Start #", datetime.now())
    set_timeout_on_method()
    print("End #", datetime.now())

def set_timeout_on_method():
    futures = []
    with ProcessPoolExecutor() as executor:
        futures.append(executor.submit(sleep, 5))
        futures.append(executor.submit(sleep, 10))
        futures.append(executor.submit(sleep, 20))
        print("Futures created #", datetime.now())
        if wait(futures, return_when=FIRST_COMPLETED):
            print("Shortest future completed #", datetime.now())
            executor.shutdown(wait=False, cancel_futures=True)

if __name__ == "__main__":
    multiple_processes()
With max_workers set to 1, the entire script would take about 35 seconds because (to my surprise) the last future doesn't get cancelled, despite cancel_futures=True.
You could kill the workers, though. This would make the main process finish without delay:
...
with ProcessPoolExecutor(max_workers=1) as executor:
    futures.append(executor.submit(sleep, 5))
    futures.append(executor.submit(sleep, 10))
    futures.append(executor.submit(sleep, 20))
    print("Futures created #", datetime.now())
    if wait(futures, return_when=FIRST_COMPLETED):
        print("Shortest future completed #", datetime.now())
        subprocesses = [p.pid for p in executor._processes.values()]
        executor.shutdown(wait=False, cancel_futures=True)
        for pid in subprocesses:
            os.kill(pid, signal.SIGTERM)  # needs `import os` and `import signal`
...
Disclaimer: Please don't take this answer as advice for whatever you are trying to achieve. It's just brainstorming based on your code.
The problem is that you cannot cancel a Future that has already started:
Attempt to cancel the call. If the call is currently being executed or finished running and cannot be cancelled then the method will return False, otherwise the call will be cancelled and the method will return True.
To prove it I made the following changes:
from concurrent.futures import (
    FIRST_COMPLETED,
    ThreadPoolExecutor,
    wait as futures_wait,
)
from time import sleep
from datetime import datetime

def test_multiple_threads():
    set_timeout_on_method()
    print("Current Time =", datetime.now())  # Prints time N + 10

def set_timeout_on_method():
    with ThreadPoolExecutor(max_workers=2) as thread_pool:
        print("Current Time =", datetime.now())  # Prints time N
        futures = [thread_pool.submit(sleep, t) for t in (2, 10, 2, 100, 100, 100, 100, 100)]
        futures_wait(futures, return_when=FIRST_COMPLETED)
        print("Current Time =", datetime.now())  # Prints time N + 2 (first sleep is done)
        print([i.cancel() if not i.done() else "DONE" for i in futures])
    print("Current Time =", datetime.now())  # Prints time N + 10 (already-running sleeps finish)

if __name__ == '__main__':
    test_multiple_threads()
As you can see, only three of the tasks complete. ThreadPoolExecutor is actually based on the threading module, and a Thread in Python can't be stopped in any conventional way. Check this answer.
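If the goal is simply for set_timeout_on_method to return after about 5 seconds, one cooperative workaround (a sketch of my own, not from the answers above) is to drop the with statement, whose exit calls shutdown(wait=True), and have the workers poll a threading.Event instead of sleeping unconditionally:

from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from datetime import datetime
import threading

def interruptible_sleep(seconds, stop_event):
    # Event.wait returns as soon as the event is set, or after the timeout
    stop_event.wait(timeout=seconds)

def set_timeout_on_method():
    stop_event = threading.Event()
    thread_pool = ThreadPoolExecutor(max_workers=2)  # no `with`, so no implicit shutdown(wait=True)
    print("Current Time =", datetime.now())  # Prints time N
    futures = [thread_pool.submit(interruptible_sleep, t, stop_event) for t in (5, 10)]
    wait(futures, return_when=FIRST_COMPLETED)
    stop_event.set()  # ask the remaining workers to finish early
    thread_pool.shutdown(wait=False)
    print("Current Time =", datetime.now())  # Prints time N + 5

set_timeout_on_method()

The threads still run to completion, but they do so almost immediately once the event is set, so the method returns at roughly N + 5.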

subprocess.popen-process stops running while using it with SMACH

I'm simply trying to start a rosbag command from Python inside a SMACH. I figured out that one way to do so is to use subprocesses. My goal is that as soon as the rosbag starts, the state machine transitions to state T2 (and stays there).
However, when I start a rosbag using subprocess.Popen inside a SMACH state and then run rostopic echo 'topic', the rosbag appears to publish data properly at first, then suddenly stops publishing, and only when I end the SMACH with Ctrl+C does the rosbag publish some more data before it stops as well.
Is there any reasonable explanation for that (did I maybe miss a parameter, or is it just not possible to keep the node running that way)? Or is there maybe a better way to start the rosbag and let it run in the background?
(Btw, some other commands, like some roslaunch commands, also appear to stop working after they're started via subprocess.Popen!)
My code looks as follows:
#!/usr/bin/env python3
import os
import signal
import subprocess
import smach
import smach_ros
import rospy
import time
from gnss_navigation.srv import *

class t1(smach.State):
    def __init__(self, outcomes=['successful', 'failed', 'preempted']):
        smach.State.__init__(self, outcomes)

    def execute(self, userdata):
        if self.preempt_requested():
            self.service_preempt()
            return 'preempted'
        try:
            process1 = subprocess.Popen('rosbag play /home/faps/bags/2020-05-07-11-18-18.bag',
                                        stdout=subprocess.PIPE,
                                        shell=True, preexec_fn=os.setsid)
        except Exception:
            return 'failed'
        return 'successful'

class t2(smach.State):
    def __init__(self, outcomes=['successful', 'failed', 'preempted']):
        smach.State.__init__(self, outcomes)

    def execute(self, userdata):
        #time.sleep(2)
        if self.preempt_requested():
            self.service_preempt()
            return 'preempted'
        return 'successful'

if __name__ == "__main__":
    rospy.init_node('test_state_machine')

    sm_1 = smach.StateMachine(outcomes=['success', 'error', 'preempted'])
    with sm_1:
        smach.StateMachine.add('T1', t1(), transitions={'successful': 'T2', 'failed': 'error'})
        smach.StateMachine.add('T2', t2(), transitions={'successful': 'T2', 'failed': 'error', 'preempted': 'preempted'})

    # Execute SMACH plan
    outcome = sm_1.execute()
    print('exit-outcome:' + outcome)

    # Wait for ctrl-c to stop the application
    rospy.spin()
As explained in the comment section of this thread's answer, the problem appears when using subprocess.PIPE as stdout: nothing ever reads from the pipe, so once the pipe's buffer fills up, the child process blocks on its next write.
Therefore, the two possible solutions I used to solve the problem are:
If you don't care about print-outs and stuff -> use devnull as output:
FNULL = open(os.devnull, 'w')
process = subprocess.Popen('your command', stdout=FNULL, stderr=subprocess.STDOUT,
                           shell=True, preexec_fn=os.setsid)
If you do need print-outs and stuff -> create a log-file and use it as output:
log_file = open('path_to_log/log.txt', 'w')
process = subprocess.Popen('your command', stdout=log_file, stderr=subprocess.STDOUT,
                           shell=True, preexec_fn=os.setsid)
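On Python 3.3+, subprocess.DEVNULL achieves the same as the first solution without opening os.devnull by hand (a minor variation, same assumptions as above):

process = subprocess.Popen('your command', stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT,
                           shell=True, preexec_fn=os.setsid)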

apscheduler calling function params only once

I have tweaked this basic example to illustrate the point in the subject:
https://github.com/agronholm/apscheduler/blob/master/examples/executors/processpool.py
Here is the tweaked code (see args=[datetime.now()]):
#!/usr/bin/env python
from datetime import datetime
import os
from apscheduler.schedulers.blocking import BlockingScheduler

def tick(param):
    print('Tick! The time is: %s' % param)

if __name__ == '__main__':
    scheduler = BlockingScheduler()
    scheduler.add_executor('processpool')
    scheduler.add_job(tick, 'interval', seconds=3, args=[datetime.now()])
    print('Press Ctrl+{0} to exit'.format('Break' if os.name == 'nt' else 'C'))

    try:
        scheduler.start()
    except (KeyboardInterrupt, SystemExit):
        pass
When I run it, the output timestamp does not update:
$ ./test.py
Press Ctrl+C to exit
Tick! The time is: 2019-01-28 19:41:53.131599
Tick! The time is: 2019-01-28 19:41:53.131599
Tick! The time is: 2019-01-28 19:41:53.131599
Is this the expected behavior? I'm using Python 3.6.7 and apscheduler 3.5.3, thanks.
This has nothing to do with APScheduler. What you're doing could be rewritten like this:
args = [datetime.now()]
scheduler.add_job(tick, 'interval', seconds=3, args=args)
You're calling datetime.now() once, at scheduling time, and passing its return value in a list to scheduler.add_job(). Since what you're passing is a fixed datetime object, there is no way for APScheduler to re-evaluate datetime.now() each time the target function is executed.
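If the goal is a fresh timestamp on every run, the fix is to call datetime.now() inside the job function rather than at scheduling time, along these lines:

def tick():
    # evaluated on every execution, not once at add_job() time
    print('Tick! The time is: %s' % datetime.now())

scheduler.add_job(tick, 'interval', seconds=3)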

Scrapy - run at time interval

I have a spider for crawling a site and I want to run it every 10 minutes. I put it in a Python schedule and ran it. After the first run I got
ReactorNotRestartable
I tried this solution and got
AttributeError: Can't pickle local object 'run_spider.<locals>.f'
instead.
Edit:
I tried how-to-schedule-scrapy-crawl-execution-programmatically. The program runs without error and the crawl function runs every 30 seconds, but the spider doesn't run and I don't get any data.
from multiprocessing import Process, Queue
from scrapy import crawler
# DivarSpider is the asker's own spider class, imported from their project

def run_spider():
    def f(q):
        try:
            runner = crawler.CrawlerRunner()
            deferred = runner.crawl(DivarSpider)
            #deferred.addBoth(lambda _: reactor.stop())
            #reactor.run()
            q.put(None)
        except Exception as e:
            q.put(e)

    runner = crawler.CrawlerRunner()
    deferred = runner.crawl(DivarSpider)
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    result = q.get()
    p.join()
    if result is not None:
        raise result
The multiprocessing solution is a gross hack to work around a lack of understanding of how Scrapy and reactor management work. You can get rid of it, and everything becomes much simpler.
from twisted.internet.task import LoopingCall
from twisted.internet import reactor
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from yourlib import YourSpider

configure_logging()
runner = CrawlerRunner()
task = LoopingCall(lambda: runner.crawl(YourSpider))  # pass the spider class, not an instance
task.start(60 * 10)  # seconds
reactor.run()
The easiest way I know of is to use a separate script to call the script containing your Twisted reactor, like this:
import subprocess

cmd = ['python3', 'auto_crawl.py']
subprocess.Popen(cmd).wait()
To run your CrawlerRunner every 10 minutes, you could use a loop or crontab on this script.
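For example, a minimal wrapper for the loop variant (a sketch; auto_crawl.py stands in for the script that starts the reactor):

import subprocess
import time

while True:
    # each run gets a fresh process and therefore a fresh Twisted reactor
    subprocess.Popen(['python3', 'auto_crawl.py']).wait()
    time.sleep(60 * 10)  # 10 minutes between crawls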
