Python RQ-Scheduler not giving any output - python-3.x

I am unable to get rq_scheduler working. Here is a simple example:
app.py
from flask import Flask
import datetime
from redis import Redis
from rq import Queue
from rq_scheduler import Scheduler
from tasks import example
app=Flask(__name__)
app.secret_key='abc'
app.redis = Redis.from_url('redis://')
app.task_queue = Queue('test', connection=app.redis)
scheduler = Scheduler(queue=app.task_queue,connection=app.redis)
#app.task_queue.enqueue('tasks.example',2)
#scheduler.enqueue_at(datetime.datetime(2020,4,16,10,46), example, 2)
scheduler.enqueue_in(datetime.timedelta(seconds=1), example, 2)
if __name__=='__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
tasks.py
import time
def example(seconds):
    print('Starting task')
    for i in range(seconds):
        print(i)
        time.sleep(1)
    print('Task completed')
In the app directory in terminal, I start the following in separate tabs:
$redis-server
$rq worker test
$rqscheduler
$python app.py
The first queue.enqueue works fine. Both scheduler tasks do nothing. What is wrong?

I suspect the confusion is that rqscheduler, by default, only checks for new jobs once a minute. You can change this with the -i flag, which sets the interval in seconds, and add the -v flag for more verbose output:
rqscheduler -i 1 -v
However I also noticed another issue with the above Flask code...
Probably because the dev server spawns a separate process, I found that scheduler.enqueue_in was enqueuing the job twice. This probably wouldn't be an issue if enqueue_in were called inside a view function. Where you have it placed, however, it runs when the application starts.
So when launching with the dev server, it gets executed twice. It then runs once more every time the autoreloader detects a code change: after starting the dev server and then saving one change to the code, 3 jobs in total have been enqueued.
For the purpose of testing this, it may be advisable just to have a simple python script which doesn't actually run the Flask app:
# enqueue_test.py
import datetime
from redis import Redis
from rq import Queue
from rq_scheduler import Scheduler
from tasks import example
r = Redis.from_url('redis://localhost:6379')
q = Queue('test', connection=r)
scheduler = Scheduler(queue=q, connection=r)
scheduler.enqueue_in(datetime.timedelta(seconds=1), example, 2)
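If the job does have to be enqueued from the Flask module itself, one possible guard is to check the WERKZEUG_RUN_MAIN environment variable, which the Werkzeug reloader sets to "true" in the child process that actually serves requests. The sketch below uses a hypothetical helper, should_enqueue_at_startup, which is not part of Flask, Werkzeug, or RQ. It only avoids the double enqueue at launch; the job would still be enqueued again after each autoreload.

```python
import os


def should_enqueue_at_startup(use_reloader, environ=None):
    """Return True in the one process that should run startup-time enqueues.

    Hypothetical helper for illustration, not part of Flask, Werkzeug, or RQ.
    """
    environ = os.environ if environ is None else environ
    if not use_reloader:
        # No reloader: there is only one process, so always run.
        return True
    # With the reloader, only the child process has WERKZEUG_RUN_MAIN="true".
    return environ.get("WERKZEUG_RUN_MAIN") == "true"


# Possible usage in app.py (names taken from the question):
# if should_enqueue_at_startup(use_reloader=True):
#     scheduler.enqueue_in(datetime.timedelta(seconds=1), example, 2)
```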

Related

What's the proper way to test a MongoDB connection with motor io?

I've got a simple FastAPI webapp going and I'd like to be able to check the database connection on startup (and retry connection if it fails)
I've got the following code, but it doesn't feel right
# main.py
import uvicorn
from backend.app import app
if __name__ == "__main__":
    uvicorn.run(app, port=8001)
# app.py
# ... omitted for brevity
from backend.database import notes, tags
# ... omitted for brevity
# database.py
from motor.motor_asyncio import AsyncIOMotorClient
from asyncio import get_event_loop
client = AsyncIOMotorClient("localhost", 27027)
loop = get_event_loop()
data = loop.run_until_complete(client.server_info())
db = client.notes_db
notes = db.notes
tags = db.tags
Without get_event_loop() and the subsequent loop.run_until_complete() call it won't test the database connection until you actually try to access / write to it.
My goal is to be able to halt the startup process until it can successfully connect to a database, is there any clean way to do this with Python and motor.io (https://motor.readthedocs.io/, sorry there's no tag for it) ?
The startup event in FastAPI is probably what you want here. In addition, this repository is a nice example, and this thread could provide you with more information. You can run your connection checks inside the startup event, which means the application won't start until the startup handler has completed successfully.
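As a rough sketch of the retry part, the helper below awaits a "ping" coroutine until it succeeds or a retry budget runs out. The name wait_for_database and its attempts and delay parameters are assumptions made for illustration; they are not part of motor or FastAPI.

```python
import asyncio


async def wait_for_database(ping, attempts=5, delay=1.0):
    """Await ping() until it succeeds, retrying up to `attempts` times.

    `ping` is any zero-argument coroutine function that raises on failure.
    Hypothetical helper for illustration only.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await ping()
        except Exception as exc:  # real code would catch narrower errors
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise ConnectionError(
        f"database unreachable after {attempts} attempts"
    ) from last_error


# Possible wiring, assuming the `app` and `client` objects from the question:
# @app.on_event("startup")
# async def check_database():
#     await wait_for_database(client.server_info)
```

Because the startup handler raises when the retries are exhausted, the application would fail to start instead of serving requests without a database.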

Is there a way to run python flask function, every specific interval of time and display on the local server the output?

I am working on a Python program using Flask, where I want to extract keys from a dictionary. These keys are in text format. I want to repeat this whole process at a fixed interval and display the output in the local browser each time.
I have tried this using flask_apscheduler. The program runs and shows output, but only once; it does not repeat after the interval.
This is the Python program I tried.
@app.route('/trend', methods=['POST', 'GET'])
def run_tasks():
    for i in range(0, 1):
        app.apscheduler.add_job(func=getTrendingEntities, trigger='cron', args=[i], id='j'+str(i), second = 5)
    return "Code run perfect"
@app.route('/loc', methods=['POST', 'GET'])
def getIntentAndSummary(self, request):
    if request.method == "POST":
        reqStr = request.data.decode("utf-8", "strict")
        reqStrArr = reqStr.split()
        reqStr = ' '.join(reqStrArr)
        text_1 = []
        requestBody = json.loads(reqStr)
        if requestBody.get('m') is not None:
            text_1.append(requestBody.get('m'))
        return jsonify(text_1)
if (__name__ == "__main__"):
    app.run(port = 8000)
The problem is that you're calling add_job every time the /trend page is requested. The job should only be added once, as part of the initialization, before starting the scheduler (see below).
It would also make more sense to use the 'interval' trigger instead of 'cron', since you want your job to run every 5 seconds. Here's a simple working example:
from flask import Flask
from flask_apscheduler import APScheduler
import datetime
app = Flask(__name__)
#function executed by scheduled job
def my_job(text):
    print(text, str(datetime.datetime.now()))
if (__name__ == "__main__"):
    scheduler = APScheduler()
    scheduler.add_job(func=my_job, args=['job run'], trigger='interval', id='job', seconds=5)
    scheduler.start()
    app.run(port = 8000)
Sample console output:
job run 2019-03-30 12:49:55.339020
job run 2019-03-30 12:50:00.339467
job run 2019-03-30 12:50:05.343154
job run 2019-03-30 12:50:10.343579
You can then modify the job attributes by calling scheduler.modify_job().
As for the second problem, refreshing the client view every time the job runs, you can't do that directly from Flask. An ugly but simple way would be to add <meta http-equiv="refresh" content="1" > to the HTML page to instruct the browser to refresh it every second. A much better implementation would be to use SocketIO to send new data to the web client in real time.
I would recommend that you start a daemonized thread, import your application variable, and then use with app.app_context() in order to log to your console.
It's a little more fiddly, but it lets the application run its work in separate threads.
I use this method to fire off a bunch of HTTP requests concurrently. The alternative is to wait for each response before making the next request.
I'm sure you've realised that the thread will stay occupied if you run a command that never finishes.
Make sure to daemonize the thread so that when you stop your web app, the thread is killed at the same time.
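A minimal sketch of the daemon-thread approach, using only the standard library, might look like this. The helper name start_periodic and its parameters are hypothetical. In a Flask app, func would typically wrap its body in with app.app_context().

```python
import threading


def start_periodic(func, interval, stop_event):
    """Call func() every `interval` seconds on a daemon thread.

    Hypothetical helper for illustration. The thread exits when
    stop_event is set, or is killed when the main program exits.
    """
    def loop():
        # Event.wait returns False on timeout and True once the event is set,
        # so this runs func() after each interval until asked to stop.
        while not stop_event.wait(interval):
            func()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


# Possible usage before app.run(...):
# stop = threading.Event()
# start_periodic(lambda: print("job run"), 5, stop)
```

Using an Event for the delay, instead of time.sleep, lets the loop be stopped promptly without waiting out the rest of the interval.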

APScheduler resets after every deploy

I have a script which, when run, adds RSS feed parsing tasks to some Celery queues. I have now implemented APScheduler to run the script every 2 hours to get new data from the feeds.
My implementation looks like this:
#!/usr/bin/env python
import atexit
import logging
import os
from logging import getLogger
from apscheduler.schedulers.blocking import BlockingScheduler
logger = getLogger('scheduled_parser')
PARSER_SCHEDULER = 'parser_scheduler'
def main():
    scheduler = BlockingScheduler(job_defaults={'coalesce': True})
    scheduler.add_jobstore('sqlalchemy',alias='scheduler_config', url=os.environ.get("DATABASE_URL"))
    scheduler.add_job(run_parser, 'interval', seconds=int(os.environ.get("SCHEDULER_RUN_FREQUENCY")),
                      id=PARSER_SCHEDULER, replace_existing=True)
    scheduler.start()
    atexit.register(lambda: scheduler.shutdown())
def run_parser():
    < code to add items to queues>
if __name__ == "__main__":
    logging.basicConfig()
    logger.setLevel(logging.INFO)
    main()
My code is deployed on Heroku and I have the following in my Procfile:
clock: python scheduled_parser
<celery worker processes>
I am having the following issues:
1. I am storing the scheduler job in persistent storage and I can even see it in my db, but when I do scheduler.get_job(PARSER_SCHEDULER,'scheduler_config') I get None.
2. Whenever I deploy on Heroku, I think the next run is being updated. For example, if the parser is set to run every 2 hours and the next run is going to be at 4:00pm, and I deploy on Heroku at 3:00pm, then my next run happens at 5:00pm instead of 4:00pm.
Not sure about your issue #1, but I think issue #2 is that on every deploy, this line is going to replace the job, thus resetting the schedule:
scheduler.add_job(run_parser, 'interval', seconds=int(os.environ.get("SCHEDULER_RUN_FREQUENCY")),
                  id=PARSER_SCHEDULER, replace_existing=True)
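To make the arithmetic concrete, here is a simplified model of what happens when the job is replaced on deploy. It assumes that a freshly added interval job with no explicit start date first fires one full interval after it is added. The function name and the dates are made up for illustration, and this is not APScheduler code.

```python
from datetime import datetime, timedelta


def next_run_after_replace(deploy_time, interval):
    """Simplified model: a replaced interval job restarts its clock at deploy time."""
    return deploy_time + interval


interval = timedelta(hours=2)
deploy_time = datetime(2020, 1, 1, 15, 0)  # hypothetical 3:00pm deploy

# The stored job was due at 4:00pm, but replacing it yields 5:00pm.
print(next_run_after_replace(deploy_time, interval))  # 2020-01-01 17:00:00
```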

Dispy, initiating a SharedJobCluster on a compute node

I am creating a compute cluster in python using dispy. One of my use cases would be very nicely solved by starting a process on a compute node that itself starts a distributed process. As such, I have implemented the SharedJobCluster on the primary scheduler, and also in the function that will be sent to the cluster (which should in turn, start a series of distributed processes). However, when the second SharedJobCluster is initiated, the code hangs and does not move past this line (nor show any errors).
Minimum working example:
def clusterfun():
    import dispy
    import test2
    import logging
    log_filename = 'worker.log'
    logging.basicConfig(filename=log_filename,
                        level=logging.DEBUG,
                        format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
                        datefmt='[%m-%d-%Y %H:%M:%S]')
    logging.info("Starting cluster...")
    # THE FOLLOWING LINE HANGS
    cluster = dispy.SharedJobCluster(test2.clusterfun2, port=0, scheduler_node='127.0.0.1')
    logging.info("Started cluster...")
    job = cluster.submit()
    logging.info("Submitted job...")
    return job()
if __name__ == '__main__':
    import dispy
    #
    # Start the Compute cluster
    #
    cluster = dispy.SharedJobCluster(clusterfun, port=0, depends=['test2.py'], scheduler_node='127.0.0.1')
    job = cluster.submit()
    print(job())
test2.py contains:
def clusterfun2():
    return "Foo"
For reference, I am currently running dispyscheduler.py, dispynode, and this Python code all on the same machine. This setup works, except when trying to initiate the embedded distribution task.
The worker.log output contains "Starting cluster..." but nothing else.
If I check the status of the node it says that it is running 1 job, but it never completes.

Handling atexit for multiple app objects with Flask dev server reloader

This is yet another flask dev server reloader question. There are a million questions asking why it loads everything twice, and this is not one of them. I understand that it loads everything twice, my question involves dealing with this reality and I haven't found an answer that I think addresses what I'm trying to do.
My question is, how can I clean up all app objects at exit?
My current approach is shown below. In this example I run my cleanup code using an atexit function.
from flask import Flask
app = Flask(__name__)
print("start_app_id: ", '{}'.format(id(app)))
import atexit
@atexit.register
def shutdown():
    print("AtExit_app_id: ", '{}'.format(id(app)))
    #do some cleanup on the app object here
if __name__ == "__main__":
    import os
    if os.environ.get('WERKZEUG_RUN_MAIN') == "true":
        print("reloaded_main_app_id: ", '{}'.format(id(app)))
    else:
        print("first_main_app_id: ", '{}'.format(id(app)))
    app.run(host='0.0.0.0', debug=True)
The output of this code is as follows:
start_app_id: 140521561348864
first_main_app_id: 140521561348864
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
* Restarting with stat
start_app_id: 140105598483312
reloaded_main_app_id: 140105598483312
* Debugger is active!
* Debugger pin code: xxx-xxx-xxx
^CAtExit_app_id: 140521561348864
Note that when first loaded, an app object with ID '864 is created. During the automatic reloading, a new app object with ID '312 is created. Then when I hit Ctrl-C (last line), the atexit routine is called and the original '864 app object is the one that is accessible using the app variable -- not the newer '312 app object.
I want to be able to do cleanup on all app objects floating around when the server closes or is Ctrl-C'd (in this case both '864 and '312). Any recommendations on how to do this?
Alternatively, if I could just run the cleanup on the newer '312 object created after reloading, I could also make that work. However, my current approach only lets me clean up the original app object.
Thanks.
UPDATE1: I found a link that suggested using try/finally instead of the atexit hook to accomplish what I set out to do above. Switching to this results in exactly the same behavior as atexit and therefore doesn't help with my issue:
from flask import Flask
app = Flask(__name__)
print("start_app_id: ", '{}'.format(id(app)))
if __name__ == "__main__":
    import os
    if os.environ.get('WERKZEUG_RUN_MAIN') == "true":
        print("reloaded_main_app_id: ", '{}'.format(id(app)))
    else:
        print("first_main_app_id: ", '{}'.format(id(app)))
    try:
        app.run(host='0.0.0.0', debug=True)
    finally:
        print("Finally_app_id: ", '{}'.format(id(app)))
        #do app cleanup code here
After some digging through the werkzeug source I found the answer. The answer is that it isn't possible to do what I wanted -- and this is by design.
When using the Flask dev server (werkzeug), it isn't possible to clean up all existing app objects upon termination (e.g. Ctrl-C) because the werkzeug server catches the KeyboardInterrupt exception and "passes" on it. You can see this in the last lines of the run_with_reloader function in werkzeug's _reloader.py:
def run_with_reloader(main_func, extra_files=None, interval=1,
                      reloader_type='auto'):
    """Run the given function in an independent python interpreter."""
    import signal
    reloader = reloader_loops[reloader_type](extra_files, interval)
    signal.signal(signal.SIGTERM, lambda *args: sys.exit(0))
    try:
        if os.environ.get('WERKZEUG_RUN_MAIN') == 'true':
            t = threading.Thread(target=main_func, args=())
            t.setDaemon(True)
            t.start()
            reloader.run()
        else:
            sys.exit(reloader.restart_with_reloader())
    except KeyboardInterrupt:
        pass
If you replace the above "except KeyboardInterrupt:" with "finally:" and then run the second code snippet from the original question, both of the created app objects are cleaned up as desired. Interestingly, the first code snippet (the one that uses @atexit.register) still doesn't work as desired after making this change.
So in conclusion, you can clean up all existing app objects when using the Flask dev server, but you need to modify the werkzeug source to do so.
