Creating detached processes from celery worker/alternative solution? - python-3.x

I'm developing a web service that will be used as a "database as a service" provider. The goal is to have a flask based small web service, running on some host and "worker" processes running on different hosts owned by different teams. Whenever a team member comes and requests a new database I should create one on their host. Now the problem... The service I start must be running. The worker however might be restarted. Could happen 5 minutes could happen 5 days. A simple Popen won't do the trick because it'd create a child process and if the worker stops later on the Popen process is destroyed (I tried this).
I have an implementation that's using multiprocessing which works like a champ, sadly I cannot use this with celery. so out of luck there. I tried to get away from the multiprocessing library with double forking and named pipes. The most minimal sample I could produce:
def launcher2(working_directory, cmd, *args):
command = [cmd]
command.extend(list(args))
process = subprocess.Popen(command, cwd=working_directory, shell=False, start_new_session=True,
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
with open(f'{working_directory}/ipc.fifo', 'wb') as wpid:
wpid.write(process.pid)
#shared_task(bind=True, name="Test")
def run(self, cmd, *args):
working_directory = '/var/tmp/workdir'
if not os.path.exists(working_directory):
os.makedirs(working_directory, mode=0o700)
ipc = f'{working_directory}/ipc.fifo'
if os.path.exists(ipc):
os.remove(ipc)
os.mkfifo(ipc)
pid1 = os.fork()
if pid1 == 0:
os.setsid()
os.umask(0)
pid2 = os.fork()
if pid2 > 0:
sys.exit(0)
os.setsid()
os.umask(0)
launcher2(working_directory, cmd, *args)
else:
with os.fdopen(os.open(ipc, flags=os.O_NONBLOCK | os.O_RDONLY), 'rb') as ripc:
readers, _, _ = select.select([ripc], [], [], 15)
if not readers:
raise TimeoutError(60, 'Timed out', ipc)
reader = readers.pop()
pid = struct.unpack('I', reader.read())[0]
pid, status = os.waitpid(pid, 0)
print(status)
if __name__ == '__main__':
async_result = run.apply_async(('/usr/bin/sleep', '15'), queue='q2')
print(async_result.get())
My usecase is more complex but I don't think anyone would want to read 200+ lines of bootstrapping, but this fails exactly on the same way. On the other hand I don't wait for the pid unless that's required so it's like start the process on request and let it do it's job. Bootstrapping a database takes roughly a minute with the full setup, and I don't want the clients standing by for a minute. Request comes in, I spawn the process and send back an id for the database instance, and the client can query the status based on the received instance id. However with the above forking solution I get:
[2020-01-20 18:03:17,760: INFO/MainProcess] Received task: Test[dbebc31c-7929-4b75-ae28-62d3f9810fd9]
[2020-01-20 18:03:20,859: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:16634 exited with 'signal 15 (SIGTERM)'
[2020-01-20 18:03:20,877: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 15 (SIGTERM).')
Traceback (most recent call last):
File "/home/pupsz/PycharmProjects/provider/venv37/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 15 (SIGTERM).
Which leaves me wondering, what might be going on. I tried an even more simple task:
#shared_task(bind=True, name="Test")
def run(self, cmd, *args):
working_directory = '/var/tmp/workdir'
if not os.path.exists(working_directory):
os.makedirs(working_directory, mode=0o700)
command = [cmd]
command.extend(list(args))
process = subprocess.Popen(command, cwd=working_directory, shell=False, start_new_session=True,
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
return process.wait()
if __name__ == '__main__':
async_result = run.apply_async(('/usr/bin/sleep', '15'), queue='q2')
print(async_result.get())
Which again fails with the very same error. Now I like Celery but from this it feels like it's not suited for my needs. Did I mess something up? Can it be achieved, what I need to do from a worker? Do I have any alternatives, or should I just write my own task queue?

Celery is not multiprocessing-friendly, so try to use billiard instead of multiprocessing (from billiard import Process etc...) I hope one day Celery guys do a heavy refactoring of that code, remove billiard, and start using multiprocessing instead...
So, until they move to multiprocessing we are stuck with billiard. My advice is to remove any usage of multiprocessing in your Celery tasks, and start using billiard.context.Process and similar, depending on your use-case.

Related

How to implement custom timeout for function that connects to server

I want to establish a connection with a b0 client for Coppelia Sim using the Python API. Unfortunately, this connection function does not have a timeout and will run indefinitely if it fails to connect.
To counter that, I tried moving the connection to a separate process (multiprocessing) and check after a couple of seconds, whether the process is still alive. If it still is, I kill the process and continue with the program.
This sort of works, as it does not block my program anymore, however the process does not stop when the connection is successfully made, so it kills the process, even when the connection succeeds.
How can I fix this and also write the b0client to the global variable?
def connection_function():
global b0client
b0client = b0RemoteApi.RemoteApiClient('b0RemoteApi_pythonClient', 'b0RemoteApi', 60)
print('Success!')
return 0
def establish_b0_connection(timeout):
connection_process = multiprocessing.Process(target=connection_function)
connection_process.start()
# Wait for [timeout] seconds or until process finishes
connection_process.join(timeout=timeout)
# If thread is still active
if connection_process.is_alive():
print('[INITIALIZATION OF B0 API CLIENT FAILED]')
# Terminate - may not work if process is stuck for good
connection_process.terminate()
# OR Kill - will work for sure, no chance for process to finish nicely however
# connection_process.kill()
connection_process.join()
print('[CONTINUING WITHOUT B0 API CLIENT]')
return False
else:
return True
if __name__ == '__main__':
b0client = None
establish_b0_connection(timeout=5)
# Continue with the code, with or without connection.
# ...

What is best practice to interact with subprocesses in python

I'm building an apllication which is intended to do a bulk-job processing data within another software. To control the other software automatically I'm using pyautoit, and everything works fine, except for application errors, caused from the external software, which occur from time to time.
To handle those cases, I built a watchdog:
It starts the script with the bulk job within a subprocess
process = subprocess.Popen(['python', job_script, src_path], stdout=subprocess.PIPE,
stderr=subprocess.PIPE, shell=True)
It listens to the system event using winevt.EventLog module
EventLog.Subscribe('System', 'Event/System[Level<=2]', handle_event)
In case of an error occurs, it shuts down everything and re-starts the script again.
Ok, if an system error event occurs, this event should get handled in a way, that the supprocess gets notified. This notification should then lead to the following action within the subprocess:
Within the subprocess there's an object controlling everything and continuously collecting
generated data. In order to not having to start the whole job from the beginnig, after re-starting the script, this object has to be dumped using pickle (which isn't the problem here!)
Listening to the system event from inside the subprocess didn't work. It results in a continuous loop, when calling subprocess.Popen().
So, my question is how I can either subscribe for system events from inside a childproces, or communicate between the parent and childprocess - means, sending a message like "hey, an errorocurred", listening within the subprocess and then creating the dump?
I'm really sorry not being allowed to post any code in this case. But I hope (and actually think), that my description should be understandable. My question is just about what module to use to accomplish this in the best way?
Would be really happy, if somebody could point me into the right direction...
Br,
Mic
I believe the best answer may lie here: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.stdin
These attributes should allow for proper communication between the different processes fairly easily, and without any other dependancies.
Note that Popen.communicate() may suit better if other processes may cause issues.
EDIT to add example scripts:
main.py
from subprocess import *
import sys
def check_output(p):
out = p.stdout.readline()
return out
def send_data(p, data):
p.stdin.write(bytes(f'{data}\r\n', 'utf8')) # auto newline
p.stdin.flush()
def initiate(p):
#p.stdin.write(bytes('init\r\n', 'utf8')) # function to send first communication
#p.stdin.flush()
send_data(p, 'init')
return check_output(p)
def test(p, data):
send_data(p, data)
return check_output(p)
def main()
exe_name = 'Doc2.py'
p = Popen([sys.executable, exe_name], stdout=PIPE, stderr=STDOUT, stdin=PIPE)
print(initiate(p))
print(test(p, 'test'))
print(test(p, 'test2')) # testing responses
print(test(p, 'test3'))
if __name__ == '__main__':
main()
Doc2.py
import sys, time, random
def recv_data():
return sys.stdin.readline()
def send_data(data):
print(data)
while 1:
d = recv_data()
#print(f'd: {d}')
if d.strip() == 'test':
send_data('return')
elif d.strip() == 'init':
send_data('Acknowledge')
else:
send_data('Failed')
This is the best method I could come up with for cross-process communication. Also make sure all requests and responses don't contain newlines, or the code will break.

PyZMQ segmentation fault, messages not arriving after restart via bash script

im currently facing an issue in which a proxy throws a segmentation error after a random period of time. Restarting the proxy with a bash script leads to messages not arriving.
I was sadly not able to recreate the issue. I am aware that this most likely is related to a partly wrong utilization of zmq and that the error gets thrown by c python.
The System runs 3 diffrent processes.
process, which task is to handle send data, which is a subscriber
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://127.0.0.1:8100")
socket.setsockopt(zmq.SUBSCRIBE, b'')
poller = zmq.Poller()
poller.register(socket, zmq.POLLIN)
while True:
try:
socks = dict(poller.poll())
if socket in socks and socks[socket] == zmq.POLLIN:
action, values = socket.recv_pyobj()
##### handling of data #######
except Exception as e:
print(e)
process, being a proxy
def main():
context = zmq.Context()
# Socket facing clients
frontend = context.socket(zmq.XSUB)
frontend.bind("tcp://127.0.0.1:5557")
# Socket facing services
backend = context.socket(zmq.XPUB)
backend.bind("tcp://127.0.0.1:8100")
print("starting broker...")
while True:
try:
zmq.proxy(frontend, backend)
except KeyboardInterrupt:
print("stopping broker...")
frontend.close()
backend.close()
context.term()
quit()
except Exception as e:
print(f"failed with {e}")
if __name__ == "__main__":
main()
process running multiple threads which publish their data.
socket_pub = context.socket(zmq.PUB)
socket_pub.connect("tcp://127.0.0.1:5557")
while True:
# pre processing f.e. outgoing requests
######
# sending results
message = ["Action", {"foo": "bar"}]
socket_pub.send_pyobj(message)
As i was not able to recreate the error and therefor was not able to fix it i am trying to bypass it using a bash script.
The segementation error gets thrown in process nr. 2 (the proxy).
Thus the bash script simply restarts it.
#!/bin/bash
until python3 process2.py; do
echo "bridge broker crashed with exit code $?. Respawning.." >&2
sleep 1
done
The bash script correctly respawns the process if it died due to a segmentation fault.
But notifications from process 3 are not arriving in process 1 anymore.
I was not able to track down why this is happening. I rebuild it localy and if i manually restarted the proxy (without a segmentation fault) the messages directly arrived again.
Does anybody have a clue why this is happening or do i have to find the initial reason for the segmentation fault?

How to shut down CherryPy in no incoming connections for specified time?

I am using CherryPy to speak to an authentication server. The script runs fine if all the inputted information is fine. But if they make an mistake typing their ID the internal HTTP error screen fires ok, but the server keeps running and nothing else in the script will run until the CherryPy engine is closed so I have to manually kill the script. Is there some code I can put in the index along the lines of
if timer >10 and connections == 0:
close cherrypy (< I have a method for this already)
Im mostly a data mangler, so not used to web servers. Googling shows lost of hits for closing CherryPy when there are too many connections but not when there have been no connections for a specified (short) time. I realise the point of a web server is usually to hang around waiting for connections, so this may be an odd case. All the same, any help welcome.
Interesting use case, you can use the CherryPy plugins infrastrcuture to do something like that, take a look at this ActivityMonitor plugin implementation, it shutdowns the server if is not handling anything and haven't seen any request in a specified amount of time (in this case 10 seconds).
Maybe you have to adjust the logic on how to shut it down or do anything else in the _verify method.
If you want to read a bit more about the publish/subscribe architecture take a look at the CherryPy Docs.
import time
import threading
import cherrypy
from cherrypy.process.plugins import Monitor
class ActivityMonitor(Monitor):
def __init__(self, bus, wait_time, monitor_time=None):
"""
bus: cherrypy.engine
wait_time: Seconds since last request that we consider to be active.
monitor_time: Seconds that we'll wait before verifying the activity.
If is not defined, wait half the `wait_time`.
"""
if monitor_time is None:
# if monitor time is not defined, then verify half
# the wait time since the last request
monitor_time = wait_time / 2
super().__init__(
bus, self._verify, monitor_time, self.__class__.__name__
)
# use a lock to make sure the thread that triggers the before_request
# and after_request does not collide with the monitor method (_verify)
self._active_request_lock = threading.Lock()
self._active_requests = 0
self._wait_time = wait_time
self._last_request_ts = time.time()
def _verify(self):
# verify that we don't have any active requests and
# shutdown the server in case we haven't seen any activity
# since self._last_request_ts + self._wait_time
with self._active_request_lock:
if (not self._active_requests and
self._last_request_ts + self._wait_time < time.time()):
self.bus.exit() # shutdown the engine
def before_request(self):
with self._active_request_lock:
self._active_requests += 1
def after_request(self):
with self._active_request_lock:
self._active_requests -= 1
# update the last time a request was served
self._last_request_ts = time.time()
class Root:
#cherrypy.expose
def index(self):
return "Hello user: current time {:.0f}".format(time.time())
def main():
# here is how to use the plugin:
ActivityMonitor(cherrypy.engine, wait_time=10, monitor_time=5).subscribe()
cherrypy.quickstart(Root())
if __name__ == '__main__':
main()

Is there a way to run python flask function, every specific interval of time and display on the local server the output?

I am working python program using flask, where i want to extract keys from dictionary. this keys is in text format. But I want to repeat this above whole process after every specific interval of time. And display this output on local browser each time.
I have tried this using flask_apscheduler. The program run and shows output but only once, but dose not repeat itself after interval of time.
This is python program which i tried.
#app.route('/trend', methods=['POST', 'GET'])
def run_tasks():
for i in range(0, 1):
app.apscheduler.add_job(func=getTrendingEntities, trigger='cron', args=[i], id='j'+str(i), second = 5)
return "Code run perfect"
#app.route('/loc', methods=['POST', 'GET'])
def getIntentAndSummary(self, request):
if request.method == "POST":
reqStr = request.data.decode("utf-8", "strict")
reqStrArr = reqStr.split()
reqStr = ' '.join(reqStrArr)
text_1 = []
requestBody = json.loads(reqStr)
if requestBody.get('m') is not None:
text_1.append(requestBody.get('m'))
return jsonify(text_1)
if (__name__ == "__main__"):
app.run(port = 8000)
The problem is that you're calling add_job every time the /trend page is requested. The job should only be added once, as part of the initialization, before starting the scheduler (see below).
It would also make more sense to use the 'interval' trigger instead of 'cron', since you want your job to run every 5 seconds. Here's a simple working example:
from flask import Flask
from flask_apscheduler import APScheduler
import datetime
app = Flask(__name__)
#function executed by scheduled job
def my_job(text):
print(text, str(datetime.datetime.now()))
if (__name__ == "__main__"):
scheduler = APScheduler()
scheduler.add_job(func=my_job, args=['job run'], trigger='interval', id='job', seconds=5)
scheduler.start()
app.run(port = 8000)
Sample console output:
job run 2019-03-30 12:49:55.339020
job run 2019-03-30 12:50:00.339467
job run 2019-03-30 12:50:05.343154
job run 2019-03-30 12:50:10.343579
You can then modify the job attributes by calling scheduler.modify_job().
As for the second problem which is refreshing the client view every time the job runs, you can't do that directly from Flask. An ugly but simple way would be to add <meta http-equiv="refresh" content="1" > to the HTML page to instruct the browser to refresh it every second. A much better implementation would be to use SocketIO to send new data in real-time to the web client.
I would recommend that you start a demonized thread, import your application variable, then you can use with app.app_context() in order to log into to your console.
It's a little bit more fiddly but allows the application to run separated by different threads.
I use this method to fire off a bunch of http requests concurrently. The alternative is wait for each response before making a new one.
I'm sure you've realised that the thread will become occupied of you run an infinitely running command.
Make sure to demonize the thread so that when you stop your web app it will kill the thread at the same time gracefully.

Resources