I need to pull data from a serial connection at a fixed interval of 2 seconds with a piece of Python code. The software runs on a Raspberry Pi 24/7.
As far as I can see, I have three options:
Start my Python script as a service (with systemd) and use APScheduler
Use a cron job (is that possible?)
Use another solution
What is the recommended way of doing it?
Here's how you can do this job with APScheduler:
from apscheduler.schedulers.background import BackgroundScheduler
import time

def pull_data():
    print("code comes here")

scheduler = BackgroundScheduler()
scheduler.add_job(pull_data, "interval", seconds=2)
scheduler.start()

# BackgroundScheduler runs in a separate thread, so keep the main thread alive.
while True:
    time.sleep(1)
APScheduler also supports async code:
import asyncio

from apscheduler.schedulers.asyncio import AsyncIOScheduler

async def pull_data():
    print("code comes here")

scheduler = AsyncIOScheduler()
scheduler.add_job(pull_data, "interval", seconds=2)
scheduler.start()

# AsyncIOScheduler needs a running asyncio event loop.
asyncio.get_event_loop().run_forever()
You can also do this job with the lightweight Python library schedule:
import time

import schedule

def pull_data():
    print("code comes here")

schedule.every(2).seconds.do(pull_data)

while True:
    schedule.run_pending()
    time.sleep(1)
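If you run the script as a systemd service anyway, a plain loop is often enough for a fixed 2-second poll. Below is a minimal sketch of that "other solution"; it assumes the pyserial package and a hypothetical device path /dev/ttyUSB0, neither of which comes from the question.

import time

import serial  # pyserial, assumed to be installed

# Assumed device path and baud rate; adjust for your hardware.
ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)

def pull_data():
    line = ser.readline()  # read one line from the serial port
    print(line.decode(errors='replace').strip())

while True:
    pull_data()
    time.sleep(2)  # fixed 2-second interval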
Related
I'm trying to schedule a task to run every hour. The task is a function that takes longer and longer to execute. The minimum break the function needs is an hour from its start.
Once the function starts, the one-hour clock should start for it to run again. That brings a problem: since the function takes longer and longer to complete, it could eventually take over an hour to finish, which means the scheduler would call the function again before it completes. The goal is to start the function every hour from when it began, but to wait for it to complete once it starts taking over an hour to run.
This is what I've been trying:
from threading import Thread
from apscheduler.schedulers.blocking import BlockingScheduler
import threading
import time

def job():
    # This is a function that takes longer
    # and longer
    # and longer
    # to complete
    pass

thread = Thread(target=job)
scheduler = BlockingScheduler()
scheduler.add_job(job, 'interval', hours=1)

if thread.is_alive():
    print("waiting")
else:
    scheduler.start()
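For reference, one simple way to get the behaviour described above, without a scheduler library at all, is a plain loop that measures how long the job took. This is only a sketch; the job body is a placeholder.

import time

INTERVAL = 3600  # one hour in seconds

def job():
    pass  # the increasingly long-running work goes here

while True:
    started = time.monotonic()
    job()
    elapsed = time.monotonic() - started
    # Start the next run an hour after the previous one *started*;
    # if the job itself overran the hour, start again immediately.
    if elapsed < INTERVAL:
        time.sleep(INTERVAL - elapsed)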
I'm trying to schedule a task with the module "schedule" to run every hour. My problem is that I need the task to run first, and then run again every hour.
This code works fine, but it waits an hour before the initial run:
import schedule
import time

def job():
    print("This happens every hour")

schedule.every().hour.do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
I would like to avoid doing this:

import schedule
import time

def job():
    print("This happens immediately then every hour")

schedule.every().hour.do(job)

i = 0
while i == 0:
    job()
    i = i + 1
while i == 1:
    schedule.run_pending()
Ideally it would be nice to have an option like this:
schedule.run_pending_now()
Probably the easiest solution is to just run it immediately as well as scheduling it, such as with:
import schedule
import time

def job():
    print("This happens every hour")

schedule.every().hour.do(job)
job()  # Runs now.

while True:
    schedule.run_pending()  # Runs every hour, starting one hour from now.
    time.sleep(1)
To run all jobs, regardless of whether they are scheduled to run or not, use schedule.run_all(). Jobs are re-scheduled after finishing, just as they would be if they were executed using run_pending().
def job_1():
    print('Foo')

def job_2():
    print('Bar')

schedule.every().monday.at("12:40").do(job_1)
schedule.every().tuesday.at("16:40").do(job_2)

schedule.run_all()

# Add the delay_seconds argument to run the jobs with a number
# of seconds delay in between.
schedule.run_all(delay_seconds=10)
If you have many tasks that take some time to execute and you want to run them independently at startup, you can use threading:
import schedule
import threading
import time

def job():
    print("This happens every hour")

def run_threaded(task):
    job_thread = threading.Thread(target=task)
    job_thread.start()

run_threaded(job)  # runs job once during start
schedule.every().hour.do(run_threaded, job)

while True:
    schedule.run_pending()  # runs every hour, starting one hour from now
    time.sleep(1)
Actually, I don't think calling the function directly would be so wise, since it will block the thread outside of the scheduler, right?
I think there is nothing wrong with adding the job twice: once to be executed immediately, and once on an interval of, say, every 30 seconds:

scheduler.add_job(MPOStarter.run, args=ppi_args)  # run once, immediately
scheduler.add_job(MPOStarter.run, "interval", seconds=30, args=ppi_args)  # then every 30 seconds
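Alternatively, in APScheduler 3.x, add_job accepts a next_run_time argument, so a single interval job can be made to fire immediately as well. A small sketch, using a placeholder pull_data function rather than the call above:

from datetime import datetime

from apscheduler.schedulers.background import BackgroundScheduler

def pull_data():
    print("tick")  # placeholder job

scheduler = BackgroundScheduler()
# next_run_time=datetime.now() makes the first run happen right away;
# after that the interval trigger takes over.
scheduler.add_job(pull_data, "interval", seconds=30, next_run_time=datetime.now())
scheduler.start()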
Following on from "run multiple instances of python script simultaneously", I can now write a Python program to run multiple instances.
import sys
import subprocess

for i in range(1000):
    subprocess.Popen([sys.executable, 'task.py', '{}in.csv'.format(i), '{}out.csv'.format(i)])
This starts 1000 subprocesses simultaneously. If the command each subprocess runs is computationally resource-intensive, this puts a heavy load on the machine (and may even crash it).
Is there a way to restrict the number of subprocesses running at a time? For example, something like this pseudocode:

if (number_of_subprocesses_running == 10) {
    wait();  // or sleep
}

That is, allow only 10 subprocesses to run at a time; when one of the ten finishes, start a new one.
A counting semaphore is a good old mechanism that can be used to control/manage the maximum number of concurrently running threads/processes.
But since each subprocess.Popen object (i.e. each process) needs to be waited on for termination, the official docs point out an important downside of subprocess.Popen.wait() (for this case of many concurrent sub-processes):

Note: The function is implemented using a busy loop (non-blocking call and short sleeps). Use the asyncio module for an asynchronous wait: see asyncio.create_subprocess_exec.
Thus, it's preferable for us to switch to:
asyncio.create_subprocess_exec
asyncio.Semaphore
How it can be implemented:
import asyncio
import sys

MAX_PROCESSES = 10

async def process_csv(i, sem):
    async with sem:  # controls/allows running 10 concurrent subprocesses at a time
        proc = await asyncio.create_subprocess_exec(sys.executable, 'task.py',
                                                    f'{i}in.csv', f'{i}out.csv')
        await proc.wait()

async def main():
    sem = asyncio.Semaphore(MAX_PROCESSES)
    await asyncio.gather(*[process_csv(i, sem) for i in range(1000)])

asyncio.run(main())
I have here a Python program written for an Enigma 2 Linux set-top box:
VirtualZap Python program for Enigma 2 based set top boxes
I want to automate the execution of the following function every minute:
def aktualisieren(self):
    self.updateInfos()
You can find the function defined at lines 436 and 437.
My problem is that the class VirtualZap contains only a constructor and no main method with the actual program loop, so it is difficult to implement threads or coroutines. Is there any way to automate the execution of the aktualisieren function?
There is the Advanced Python Scheduler (APScheduler):
from apscheduler.schedulers.blocking import BlockingScheduler

def aktualisieren(self):
    self.updateInfos()

scheduler = BlockingScheduler()
scheduler.add_job(aktualisieren, 'interval', minutes=1)
scheduler.start()
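Since aktualisieren only wraps a method call, another option, sketched here under the assumption that vz is an already-constructed VirtualZap instance, is to schedule the bound method on a non-blocking scheduler so the rest of the set-top box code keeps running:

from apscheduler.schedulers.background import BackgroundScheduler

# Assumption: `vz` is an existing VirtualZap instance created elsewhere.
scheduler = BackgroundScheduler()
scheduler.add_job(vz.updateInfos, 'interval', minutes=1)  # call the bound method every minute
scheduler.start()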
I am using the "watchdog" API to keep checking for changes in a folder on my filesystem. Whenever files change in that folder, I pass them to a particular function which starts a thread for each file.
But watchdog, or any other filesystem-watcher API that I know of, notifies users file by file, i.e. as the files come in, it notifies the user. I would like it to notify me about a whole bunch of files at a time, so that I can pass that list to my function and make use of multithreading. Currently, when I use watchdog, it notifies me about one file at a time and I can only pass that one file to my function. I want to pass it many files at a time to be able to use multithreading.
One solution that comes to my mind is this: when you copy a bunch of files into a folder, the OS shows you a progress bar. If it were possible for me to be notified when that progress bar finishes, that would be a perfect solution to my question. But I don't know if that is possible.
Also, I know that watchdog is a polling API, and an ideal API for watching a filesystem would be an interrupt-driven API like pyinotify. But I didn't find any API that was interrupt-driven and also cross-platform. iWatch is good, but only for Linux, and I want something for all OSes. So if you have suggestions for any other API, please let me know.
Thanks.
Instead of accumulating filesystem events, you could spawn a pool of worker threads which get tasks from a common queue. The watchdog thread could then put tasks in the queue as filesystem events occur. Done this way, a worker thread can start working as soon as an event occurs.
For example,
import logging
import queue
import threading
import time

import watchdog.events as events
import watchdog.observers as observers

logger = logging.getLogger(__name__)

class MyEventHandler(events.FileSystemEventHandler):
    def __init__(self, queue):
        self.queue = queue

    def on_any_event(self, event):
        super().on_any_event(event)
        self.queue.put(event)

def process(queue):
    # Worker loop: take events off the queue and handle them.
    while True:
        event = queue.get()
        logger.info(event)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG,
                        format='[%(asctime)s %(threadName)s] %(message)s',
                        datefmt='%H:%M:%S')

    event_queue = queue.Queue()
    num_workers = 4
    pool = [threading.Thread(target=process, args=(event_queue,))
            for _ in range(num_workers)]
    for t in pool:
        t.daemon = True
        t.start()

    event_handler = MyEventHandler(event_queue)
    observer = observers.Observer()
    observer.schedule(
        event_handler,
        path='/tmp/testdir',
        recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
Running
% mkdir /tmp/testdir
% script.py
yields output like
[14:48:31 Thread-1] <FileCreatedEvent: src_path=/tmp/testdir/.#foo>
[14:48:32 Thread-2] <FileModifiedEvent: src_path=/tmp/testdir/foo>
[14:48:32 Thread-3] <FileModifiedEvent: src_path=/tmp/testdir/foo>
[14:48:32 Thread-4] <FileDeletedEvent: src_path=/tmp/testdir/.#foo>
[14:48:42 Thread-1] <FileDeletedEvent: src_path=/tmp/testdir/foo>
[14:48:47 Thread-2] <FileCreatedEvent: src_path=/tmp/testdir/.#bar>
[14:48:49 Thread-4] <FileCreatedEvent: src_path=/tmp/testdir/bar>
[14:48:49 Thread-4] <FileModifiedEvent: src_path=/tmp/testdir/bar>
[14:48:49 Thread-1] <FileDeletedEvent: src_path=/tmp/testdir/.#bar>
[14:48:54 Thread-2] <FileDeletedEvent: src_path=/tmp/testdir/bar>
Doug Hellman has written an excellent set of tutorials (since edited into a book) that should help you get started:
on using Queue
the threading module
how to set up and use a pool of worker processes
how to set up a pool of worker threads
I didn't actually end up using a multiprocessing Pool or ThreadPool as discussed in the last two links, but you may find them useful anyway.
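For completeness, here is a rough sketch of the worker-pool alternative mentioned above, using concurrent.futures.ThreadPoolExecutor instead of the hand-rolled queue and worker loop. The handler class name and the /tmp/testdir path are assumptions carried over from the script above; this is not what the original answer used.

import time
from concurrent.futures import ThreadPoolExecutor

import watchdog.events as events
import watchdog.observers as observers

def handle(event):
    print(event)  # per-file work would go here

class PoolEventHandler(events.FileSystemEventHandler):
    def __init__(self, pool):
        self.pool = pool

    def on_any_event(self, event):
        # Hand each event straight to the thread pool instead of a queue.
        self.pool.submit(handle, event)

if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=4) as pool:
        observer = observers.Observer()
        observer.schedule(PoolEventHandler(pool), path='/tmp/testdir', recursive=True)
        observer.start()
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            observer.stop()
        observer.join()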