What is best practice to interact with subprocesses in python - python-3.x

I'm building an apllication which is intended to do a bulk-job processing data within another software. To control the other software automatically I'm using pyautoit, and everything works fine, except for application errors, caused from the external software, which occur from time to time.
To handle those cases, I built a watchdog:
It starts the script with the bulk job within a subprocess
process = subprocess.Popen(['python', job_script, src_path], stdout=subprocess.PIPE,
stderr=subprocess.PIPE, shell=True)
It listens to the system event using winevt.EventLog module
EventLog.Subscribe('System', 'Event/System[Level<=2]', handle_event)
In case of an error occurs, it shuts down everything and re-starts the script again.
Ok, if an system error event occurs, this event should get handled in a way, that the supprocess gets notified. This notification should then lead to the following action within the subprocess:
Within the subprocess there's an object controlling everything and continuously collecting
generated data. In order to not having to start the whole job from the beginnig, after re-starting the script, this object has to be dumped using pickle (which isn't the problem here!)
Listening to the system event from inside the subprocess didn't work. It results in a continuous loop, when calling subprocess.Popen().
So, my question is how I can either subscribe for system events from inside a childproces, or communicate between the parent and childprocess - means, sending a message like "hey, an errorocurred", listening within the subprocess and then creating the dump?
I'm really sorry not being allowed to post any code in this case. But I hope (and actually think), that my description should be understandable. My question is just about what module to use to accomplish this in the best way?
Would be really happy, if somebody could point me into the right direction...
Br,
Mic

I believe the best answer may lie here: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.stdin
These attributes should allow for proper communication between the different processes fairly easily, and without any other dependancies.
Note that Popen.communicate() may suit better if other processes may cause issues.
EDIT to add example scripts:
main.py
from subprocess import *
import sys
def check_output(p):
out = p.stdout.readline()
return out
def send_data(p, data):
p.stdin.write(bytes(f'{data}\r\n', 'utf8')) # auto newline
p.stdin.flush()
def initiate(p):
#p.stdin.write(bytes('init\r\n', 'utf8')) # function to send first communication
#p.stdin.flush()
send_data(p, 'init')
return check_output(p)
def test(p, data):
send_data(p, data)
return check_output(p)
def main()
exe_name = 'Doc2.py'
p = Popen([sys.executable, exe_name], stdout=PIPE, stderr=STDOUT, stdin=PIPE)
print(initiate(p))
print(test(p, 'test'))
print(test(p, 'test2')) # testing responses
print(test(p, 'test3'))
if __name__ == '__main__':
main()
Doc2.py
import sys, time, random
def recv_data():
return sys.stdin.readline()
def send_data(data):
print(data)
while 1:
d = recv_data()
#print(f'd: {d}')
if d.strip() == 'test':
send_data('return')
elif d.strip() == 'init':
send_data('Acknowledge')
else:
send_data('Failed')
This is the best method I could come up with for cross-process communication. Also make sure all requests and responses don't contain newlines, or the code will break.

Related

What happens when using python input() if no TTY?

I am trying to write an API client for Telegram using Telethon.
https://github.com/LonamiWebs/Telethon
If you create a TelegramClient(session) it prompts for input upon initialization if your session isn’t authorized.
This is great when manually running the program from the terminal, but what if I want to run it inside a daemon or cron job?
They are using the default Input method from python3 to gather the input. I don’t see any way in the library to specify a session file and check if it’s logged in that can be run before initializing a TelegramClient, and it’s the initializer that will prompt for input if not logged in.
This feels like a catch 22! Does anybody know if this might produce an error that could be caught? Or what happens when input() is run with no tty? Would it just hang? Could I add a timeout in that case?
Thanks in advance for helping me understand better!
You are affirming that the initialization of TelegramClient invokes the input function as default, but this is done inside the TelegramClient.start method (docs).
Taking the solution that you give at the end of your question is a fair aproach, so let's use a timeout when invoking input.
from asyncio import get_event_loop, wait_for, TimeoutError
from functools import partial
from telethon import TelegramClient
async def ainput(prompt):
"""Reads input from stdin in an async way"""
loop = get_event_loop()
await loop.run_in_executor(None, print, prompt)
return await loop.run_in_executor(None, input)
async def get_code(timeout):
"""Waits for the code from stdin with a timeout"""
try:
return await wait_for(
ainput("Please, type the code you received: "),
timeout=timeout
)
except TimeoutError:
pass
client = TelegramClient(session, api_id, api_hash).start(
phone=phone,
code_callback=partial(get_code, 30)
)
You should keep in mind that when you call start the arguments phone, and password also reads from stdin if it isn't provided a callable or default value, so you can handle them like in this example with code_callback.
In your case you can get the code from a POST to your API or in other way, just get creative and write the callable that fits your needs.

Python aiogram bot: send message from another thread

The telegram bot I'm making can execute a function that takes a few minutes to process and I'd like to be able to continue to use the bot while it's processing the function.
I'm using aiogram, asyncio and I tried using Python threading to make this possible.
The code I currently have is:
import asyncio
from queue import Queue
from threading import Thread
import time
import logging
from aiogram import Bot, types
from aiogram.types.message import ContentType
from aiogram.contrib.middlewares.logging import LoggingMiddleware
from aiogram.contrib.fsm_storage.memory import MemoryStorage
from aiogram.dispatcher import Dispatcher, FSMContext
from aiogram.utils.executor import start_webhook
from aiogram.types import InputFile
...
loop = asyncio.get_event_loop()
bot = Bot(token=BOT_TOKEN, loop=loop)
dp = Dispatcher(bot, storage=MemoryStorage())
dp.middleware.setup(LoggingMiddleware())
task_queue = Queue()
...
async def send_result(id):
logging.warning("entered send_result function")
image_res = InputFile(path_or_bytesio="images/result/res.jpg")
await bot.send_photo(id, image_res, FINISHED_MESSAGE)
def queue_processing():
while True:
if not task_queue.empty():
task = task_queue.get()
if task["type"] == "nst":
nst.run(task["style"], task["content"])
send_fut = asyncio.run_coroutine_threadsafe(send_result(task['id']), loop)
send_fut.result()
task_queue.task_done()
time.sleep(2)
if __name__ == "__main__":
executor_images = Thread(target=queue_processing, args=())
executor_images.start()
start_webhook(
dispatcher=dp,
webhook_path=WEBHOOK_PATH,
skip_updates=False,
on_startup=on_startup,
host=WEBAPP_HOST,
port=WEBAPP_PORT,
)
So I'm trying to setup a separate thread that's running a loop that is processing a queue of slow tasks thus allowing to continue chatting with the bot in the meantime and which would send the result message (image) to the chat after it's finished with a task.
However, this doesn't work. My friend came up with this solution while doing a similar task about a year ago, and it does work in his bot, but it doesn't seem to work in mine.
Judging by logs, it never even enters the send_result function, because the warning never comes through. The second thread does work properly though and the result image is saved and is located in its assigned path by the time nst.run finishes working.
I tried A LOT of different things and I'm very puzzled why this solution doesn't work for me because it does work with another bot. For example, I tried using asyncio.create_task instead of asyncio.run_coroutine_threadsafe, but to no avail.
To my understanding, you don't need to pass a loop to aiogram's Bot or Dispatcher anymore, but in that case I don't know how to send a task to the main thread from the second one.
Versions I'm using: aiogram 2.18, asyncio 3.4.3, Python 3.9.10.
Solved, the issue was that you can't access the bot's loop directly (with bot.loop or dp.loop) even if you pass your own asyncio loop to the bot or the dispatcher.
So the solution was to access the main thread's loop by using asyncio.get_event_loop() (which returns currently running loop, if there's one) from within one of the message handlers, because the loop is running at this point, and pass it to asyncio.run_coroutine_threadsafe (I used the "task" dictionary for that) like this: asyncio.run_coroutine_threadsafe(send_result(task['id']), task['loop']).

python 3 - jack-client - using write_midi_event - jack keeps sending the midi event forever

I try to use python jack-client module to send a program change midi when I click on a button
here is a simplified version of the code :
def process_callback(frames: int):
global midiUi
if(midiUi is not None):
midiUi.process_callback(frames)
class MidiUi:
def __init__(self):
self.client = jack.Client('MidiUi')
self.client.set_process_callback(process_callback)
self.client.activate()
def sendProgramChange(self):
self.midiQueue.append([0xC0,0])
def process_callback(self,frames: int):
while(len(self.midiQueue)>0):
data = self.midiQueue.pop()
self.outMidiPort.clear_buffer()
buffer = self.outMidiPort.reserve_midi_event(0,len(data))
buffer[:] = bytearray(data)
self.outMidiPort.write_midi_event(0,buffer) # this only happens once yet midi input receives tons of program changes events
#raise jack.CallbackExit
midiUi = MidiUi()
while True:
....
#some button calls midiUi.sendProgramChange()
write_midi_event is called only once when pressing the button,
but apparently the destination midi port receives a continuous flow of midi C0 program changes (unless I call jack.CallbackExit, but then the call back never triggers again)
(I monitor my python script output using jack_midi_dump and midisnoop)
anyone know how to solve this ?
thanks for your help
I now user python-rtmidi for this matter
midiout = rtmidi.MidiOut(rtapi=rtmidi.API_UNIX_JACK)
rtMidiOutputPorts=midiout.get_ports()
then write data to the port
This post may be kind of old and it seems like you've got it figured out, but I did find a solution:
The midi client is sending whatever is in the buffer, which means that things like write_midi_event need to be cleared to stop sending it. Therefore, what helped me was near the beginning of my process, I had;
outport.clear_buffer()
Hope that helps ;)

How to shut down CherryPy in no incoming connections for specified time?

I am using CherryPy to speak to an authentication server. The script runs fine if all the inputted information is fine. But if they make an mistake typing their ID the internal HTTP error screen fires ok, but the server keeps running and nothing else in the script will run until the CherryPy engine is closed so I have to manually kill the script. Is there some code I can put in the index along the lines of
if timer >10 and connections == 0:
close cherrypy (< I have a method for this already)
Im mostly a data mangler, so not used to web servers. Googling shows lost of hits for closing CherryPy when there are too many connections but not when there have been no connections for a specified (short) time. I realise the point of a web server is usually to hang around waiting for connections, so this may be an odd case. All the same, any help welcome.
Interesting use case, you can use the CherryPy plugins infrastrcuture to do something like that, take a look at this ActivityMonitor plugin implementation, it shutdowns the server if is not handling anything and haven't seen any request in a specified amount of time (in this case 10 seconds).
Maybe you have to adjust the logic on how to shut it down or do anything else in the _verify method.
If you want to read a bit more about the publish/subscribe architecture take a look at the CherryPy Docs.
import time
import threading
import cherrypy
from cherrypy.process.plugins import Monitor
class ActivityMonitor(Monitor):
def __init__(self, bus, wait_time, monitor_time=None):
"""
bus: cherrypy.engine
wait_time: Seconds since last request that we consider to be active.
monitor_time: Seconds that we'll wait before verifying the activity.
If is not defined, wait half the `wait_time`.
"""
if monitor_time is None:
# if monitor time is not defined, then verify half
# the wait time since the last request
monitor_time = wait_time / 2
super().__init__(
bus, self._verify, monitor_time, self.__class__.__name__
)
# use a lock to make sure the thread that triggers the before_request
# and after_request does not collide with the monitor method (_verify)
self._active_request_lock = threading.Lock()
self._active_requests = 0
self._wait_time = wait_time
self._last_request_ts = time.time()
def _verify(self):
# verify that we don't have any active requests and
# shutdown the server in case we haven't seen any activity
# since self._last_request_ts + self._wait_time
with self._active_request_lock:
if (not self._active_requests and
self._last_request_ts + self._wait_time < time.time()):
self.bus.exit() # shutdown the engine
def before_request(self):
with self._active_request_lock:
self._active_requests += 1
def after_request(self):
with self._active_request_lock:
self._active_requests -= 1
# update the last time a request was served
self._last_request_ts = time.time()
class Root:
#cherrypy.expose
def index(self):
return "Hello user: current time {:.0f}".format(time.time())
def main():
# here is how to use the plugin:
ActivityMonitor(cherrypy.engine, wait_time=10, monitor_time=5).subscribe()
cherrypy.quickstart(Root())
if __name__ == '__main__':
main()

Creating detached processes from celery worker/alternative solution?

I'm developing a web service that will be used as a "database as a service" provider. The goal is to have a flask based small web service, running on some host and "worker" processes running on different hosts owned by different teams. Whenever a team member comes and requests a new database I should create one on their host. Now the problem... The service I start must be running. The worker however might be restarted. Could happen 5 minutes could happen 5 days. A simple Popen won't do the trick because it'd create a child process and if the worker stops later on the Popen process is destroyed (I tried this).
I have an implementation that's using multiprocessing which works like a champ, sadly I cannot use this with celery. so out of luck there. I tried to get away from the multiprocessing library with double forking and named pipes. The most minimal sample I could produce:
def launcher2(working_directory, cmd, *args):
command = [cmd]
command.extend(list(args))
process = subprocess.Popen(command, cwd=working_directory, shell=False, start_new_session=True,
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
with open(f'{working_directory}/ipc.fifo', 'wb') as wpid:
wpid.write(process.pid)
#shared_task(bind=True, name="Test")
def run(self, cmd, *args):
working_directory = '/var/tmp/workdir'
if not os.path.exists(working_directory):
os.makedirs(working_directory, mode=0o700)
ipc = f'{working_directory}/ipc.fifo'
if os.path.exists(ipc):
os.remove(ipc)
os.mkfifo(ipc)
pid1 = os.fork()
if pid1 == 0:
os.setsid()
os.umask(0)
pid2 = os.fork()
if pid2 > 0:
sys.exit(0)
os.setsid()
os.umask(0)
launcher2(working_directory, cmd, *args)
else:
with os.fdopen(os.open(ipc, flags=os.O_NONBLOCK | os.O_RDONLY), 'rb') as ripc:
readers, _, _ = select.select([ripc], [], [], 15)
if not readers:
raise TimeoutError(60, 'Timed out', ipc)
reader = readers.pop()
pid = struct.unpack('I', reader.read())[0]
pid, status = os.waitpid(pid, 0)
print(status)
if __name__ == '__main__':
async_result = run.apply_async(('/usr/bin/sleep', '15'), queue='q2')
print(async_result.get())
My usecase is more complex but I don't think anyone would want to read 200+ lines of bootstrapping, but this fails exactly on the same way. On the other hand I don't wait for the pid unless that's required so it's like start the process on request and let it do it's job. Bootstrapping a database takes roughly a minute with the full setup, and I don't want the clients standing by for a minute. Request comes in, I spawn the process and send back an id for the database instance, and the client can query the status based on the received instance id. However with the above forking solution I get:
[2020-01-20 18:03:17,760: INFO/MainProcess] Received task: Test[dbebc31c-7929-4b75-ae28-62d3f9810fd9]
[2020-01-20 18:03:20,859: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:16634 exited with 'signal 15 (SIGTERM)'
[2020-01-20 18:03:20,877: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 15 (SIGTERM).')
Traceback (most recent call last):
File "/home/pupsz/PycharmProjects/provider/venv37/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 15 (SIGTERM).
Which leaves me wondering, what might be going on. I tried an even more simple task:
#shared_task(bind=True, name="Test")
def run(self, cmd, *args):
working_directory = '/var/tmp/workdir'
if not os.path.exists(working_directory):
os.makedirs(working_directory, mode=0o700)
command = [cmd]
command.extend(list(args))
process = subprocess.Popen(command, cwd=working_directory, shell=False, start_new_session=True,
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
return process.wait()
if __name__ == '__main__':
async_result = run.apply_async(('/usr/bin/sleep', '15'), queue='q2')
print(async_result.get())
Which again fails with the very same error. Now I like Celery but from this it feels like it's not suited for my needs. Did I mess something up? Can it be achieved, what I need to do from a worker? Do I have any alternatives, or should I just write my own task queue?
Celery is not multiprocessing-friendly, so try to use billiard instead of multiprocessing (from billiard import Process etc...) I hope one day Celery guys do a heavy refactoring of that code, remove billiard, and start using multiprocessing instead...
So, until they move to multiprocessing we are stuck with billiard. My advice is to remove any usage of multiprocessing in your Celery tasks, and start using billiard.context.Process and similar, depending on your use-case.

Resources