I have two tasks, A and B, in my workflow, and I want to skip B if task A fails. Any ideas, please?
@task
def A():
    pass

@task
def B():
    pass

with Flow("") as flow:
    ...
Prefect's default behavior is not to run task B if upstream task A fails. But those tasks must depend on each other either explicitly via the upstream_tasks keyword or implicitly by passing data between each other. You can also use triggers to control this behavior.
Example of how you can set dependencies:
from prefect import task, Flow

@task
def task_1():
    pass

@task
def task_2():
    pass

@task
def task_3():
    pass

with Flow("flow_with_dependencies") as flow:
    t1 = task_1()
    t2 = task_2(upstream_tasks=[t1])
    t3 = task_3(upstream_tasks=[t2])
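And a minimal sketch of the trigger approach mentioned above (assuming Prefect 1.x, where trigger functions live in prefect.triggers; all_successful is the default, so a task with it never runs after an upstream failure):

from prefect import task, Flow
from prefect.triggers import all_successful, any_failed

@task
def a():
    pass

@task(trigger=all_successful)  # the default: runs only if every upstream task succeeded
def b():
    pass

@task(trigger=any_failed)  # example alternative: runs only if some upstream task failed
def cleanup():
    pass

with Flow("flow_with_triggers") as flow:
    a_res = a()
    b(upstream_tasks=[a_res])
    cleanup(upstream_tasks=[a_res])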
This is the problem I am having. I won't share the actual code for confidentiality reasons; instead I will provide a dummy example.
Assume that we have a class as follows:
class SayHello:
    def __init__(self, name, id):
        self.name = name
        self.id = id

    # public func
    def doSomething(self, arg1, arg2):
        DoAHugeTaskWithArgument
Let's now say that in another module we have this:
from concurrent.futures import ProcessPoolExecutor

class CallOperations:
    def __init__(self):
        self.dummydict = {
            1: {"james": 20, "peter": 30, "victor": 40, "john": 45, "ali": 21, "tom": 41, "hector": 37},
            2: {"james": 23, "peter": 31, "victor": 44, "john": 46, "ali": 23, "tom": 44, "hector": 35},
        }

    def runProcessors(self):
        # run each batch of instances in a process pool
        for _, v in self.dummydict.items():
            Instances = [SayHello(g, b) for g, b in v.items()]
            with ProcessPoolExecutor(max_workers=2) as executor:
                future = [executor.submit(ins.doSomething, 2, 1235) for ins in Instances]
So the problem starts here. I want to know which instances are running their doSomething() function in their respective processes. I want to set a variable to 1 when the function of that instance is running in a process, and set it to 0 when it has completed.
Each instance has its own name and id. Is there a way to find out the name of the instance currently running in a process?
This problem is making me very confused and I cannot find a proper solution.
Thank you a lot.
If I understand your question correctly, you want to know when an instance of SayHello is executing and when it is not. You can set a variable (1 or 0) by using a Manager - but the usefulness of this can be debated. You might want to use a lock instead.
I had to tweak your code a bit but this is a running example. It picks one of your tasks as the one to monitor in the while loop. It is a dummy loop that never exits but you'll get the idea. It will keep polling the variable of one of your instances and you can see it change when that task is running, and then revert back to zero.
from time import sleep
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Manager

class SayHello:
    def __init__(self, name, id):
        self.name = name
        self.id = id
        self.status = Manager().Value("i", 0)  # shared integer: 1 while running, 0 otherwise

    # public func
    def doSomething(self, arg1, arg2):
        self.status.value = 1
        sleep(5)  # stand-in for the real work
        self.status.value = 0

class CallOperations:
    def __init__(self):
        self.dummydict = {
            1: {"james": 20, "peter": 30, "victor": 40, "john": 45, "ali": 21, "tom": 41, "hector": 37},
            2: {"james": 23, "peter": 31, "victor": 44, "john": 46, "ali": 23, "tom": 44, "hector": 35},
        }

    def runProcessors(self):
        for _, v in self.dummydict.items():
            Instances = [SayHello(g, b) for g, b in v.items()]
            f = Instances[3]  # the instance whose status we poll below
            executor = ProcessPoolExecutor(max_workers=2)
            future = [executor.submit(ins.doSomething, 2, 1235) for ins in Instances]
            while True:
                print(f.status.value)
                # Insert break condition here
                sleep(0.5)
            executor.shutdown()

foo = CallOperations()
foo.runProcessors()
The problem with this is that it can lead to a race condition, depending on what you do in your main program. If you want to do any operations on the instance while it is in a passive state, it might turn active just after you check the variable but before you have completed your actions in the main program.
Locks come to the rescue here, as you can also create a shared lock with Manager().Lock(). If your doSomething() tries to acquire the lock, and your main process does the same when operating on a passive instance, you avoid this problem. Of course, your main program could then block the executor from making progress: if it holds locks for lengthy operations, your two workers would end up stuck waiting on the locks of instances the main program is holding. Such a case would not be suitable for parallel processing implemented with executors.
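A minimal sketch of that lock approach, reusing the SayHello shape from above (the sleep stands in for the real work, and the non-blocking acquire in the main process is just one way to test for the passive state):

from time import sleep
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Manager

class SayHello:
    def __init__(self, name, id):
        self.name = name
        self.id = id
        self.lock = Manager().Lock()  # proxy lock shared across processes

    def doSomething(self, arg1, arg2):
        with self.lock:  # held for the whole duration of the task
            sleep(5)     # stand-in for the real work

if __name__ == "__main__":
    hello = SayHello("james", 20)
    with ProcessPoolExecutor(max_workers=2) as executor:
        executor.submit(hello.doSomething, 2, 1235)
        sleep(0.5)  # give the worker a moment to acquire the lock
        # Non-blocking acquire: failure means the task is currently active.
        if hello.lock.acquire(False):
            print(hello.name, "is passive; safe to operate on")
            hello.lock.release()
        else:
            print(hello.name, "is active")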
EDIT: if you are only interested in the running status, you can check the Future.running() status of your future objects, in this case items in your future array.
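For example, a small polling loop over the future list from the code above (executor.submit preserves order, so the futures can be zipped with Instances to recover the names):

# poll the Future objects instead of a shared variable
while not all(fut.done() for fut in future):
    running_names = [ins.name for ins, fut in zip(Instances, future) if fut.running()]
    print("currently running:", running_names)
    sleep(0.5)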
I have a question about pytest.
I would like to run the same pytest script with multiple threads.
But I am not sure how to create and run a thread that passes more than one parameter (and how to run threads with pytest).
For example, I have:
test_web.py
from selenium import webdriver
import pytest

class SAMPLETEST:
    self.browser = webdriver.Chrome()
    self.browser.get(URL)
    self.browser.maximize_window()

    def test_title(self):
        assert "Project WEB" in self.browser.title

    def test_login(self):
        print('Testing Login')
        ID_BOX = self.browser.find_element_by_id("ProjectemployeeId")
        PW_BOX = self.browser.find_element_by_id("projectpassword")
        ID_BOX.send_keys(self.ID)  # this place is for the ID; this param should come from thread_run.py
        PW_BOX.send_keys(self.PW)  # this place is for the PW; it is not working, and I am not sure how to get this data from thread_run.py
        PW_BOX.submit()
In thread_run.py:
import threading
import time
from test_web import SAMPLETEST

ID_List = ["0", "1", "2", "3", "4", "5", "6", "7"]
PW_LIST = ["0", "1", "2", "3", "4", "5", "6", "7"]
threads = []

print("1: Create thread")
for I in range(8):
    print("Append thread" + str(I))
    t = threading.Thread(target=SAMPLETEST, args=(ID_List[I], PW_LIST[I]))
    threads.append(t)

for I in range(8):
    print("Start thread:" + str(I))
    threads[I].start()
I was able to use threads to run many SAMPLETEST instances without pytest.
However, it is not working with pytest.
My questions are:
First, how do I initialize self.browser inside of SAMPLETEST? I am sure the code below will not work:
self.browser = webdriver.Chrome()
self.browser.get(URL)
self.browser.maximize_window()
Second, in thread_run.py, how can I pass the two arguments (ID and password) when I run a thread that calls SAMPLETEST in test_web.py?
ID_BOX.send_keys(self.ID)
PW_BOX.send_keys(self.PW)
I was trying to build a constructor (__init__) in the SAMPLETEST class but it wasn't working...
I am not really sure how to run threads (passing arguments or parameters) with pytest.
There are two scenarios I can read from this:
Prepare the test data and pass parameters into your test method, which can be achieved with pytest_generate_tests and the parametrize concept. You can refer to the documentation here. (See the sketch after this list.)
For running pytest with multiple threads or processes: pytest-xdist or pytest-parallel.
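A minimal sketch of the parametrize route, reusing the dummy IDs/passwords from the question (URL and the element IDs are taken from the question's code and would need to exist for a real run):

import pytest
from selenium import webdriver

ID_LIST = ["0", "1", "2", "3", "4", "5", "6", "7"]
PW_LIST = ["0", "1", "2", "3", "4", "5", "6", "7"]

@pytest.fixture
def browser():
    # one fresh browser per test case
    driver = webdriver.Chrome()
    driver.maximize_window()
    yield driver
    driver.quit()

@pytest.mark.parametrize("user_id,password", list(zip(ID_LIST, PW_LIST)))
def test_login(browser, user_id, password):
    browser.get(URL)  # URL as in the question
    id_box = browser.find_element_by_id("ProjectemployeeId")
    pw_box = browser.find_element_by_id("projectpassword")
    id_box.send_keys(user_id)
    pw_box.send_keys(password)
    pw_box.submit()

With pytest-xdist installed, running pytest -n 4 would then spread these parametrized cases over four workers.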
I had a similar issue and resolved it by passing the argument in the form of a list.
Like:
I replaced the line below:
thread_1 = Thread(target=fun1, args=10)
with
thread_1 = Thread(target=fun1, args=[10])
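My understanding of why: Thread expects args to be an iterable of positional arguments, so a bare integer cannot be unpacked when the thread calls the target. A one-element tuple works just as well:

thread_1 = Thread(target=fun1, args=(10,))  # note the trailing comma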
I am trying to set up a delayed task whose timing will depend on several parameters (either passed to it or obtained from a Redis database).
The pseudocode would look like this:
def main():
    scheduler = BackgroundScheduler()
    scheduler.add_job(delayed_task,
                      id=task_id,
                      next_run_time=somedate,
                      args=(task_id, some_data))
    scheduler.start()
    do_something_else()

def delayed_task(id, passed_data):
    rd = connect_to_redis()
    redis_data = rd.fetch_data(id)
    publish_data(passed_data, redis_data)
    updated_run_time = parse(redis_data)
    # obtain a scheduler object here
    scheduler.modify_job(id, next_run_time=updated_run_time)
The question is the following: is there a way to access the scheduler from a task?
The scheduler cannot be passed as a parameter to the task, as this raises
TypeError: can't pickle _thread.lock objects
For the same reason, I can't put all this in a class and have the task call a method, since the method's arguments would include self, i.e. the instance containing the scheduler, which results in the same pickling issue.
Is it possible to obtain an instance of the scheduler from the outside, the way I can create a new connection to Redis?
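One pattern that avoids the pickling problem altogether (a sketch, assuming APScheduler 3.x with the default in-memory job store): keep the scheduler as a module-level object. BackgroundScheduler runs jobs in threads of the same process, so the task can reference the scheduler directly instead of receiving it as an argument:

from datetime import datetime, timedelta
from time import sleep
from apscheduler.schedulers.background import BackgroundScheduler

# Module-level scheduler: delayed_task can see it directly, and nothing
# needs to be pickled because the job runs in a thread of this process.
scheduler = BackgroundScheduler()

def delayed_task(task_id, passed_data):
    print("running", task_id, passed_data)
    # In the real code this would be parsed from the Redis data:
    updated_run_time = datetime.now() + timedelta(seconds=5)
    # Re-adding is safer than modify_job for a one-shot job, which may
    # already have been removed by the time it fires; replace_existing
    # covers the case where it is still registered.
    scheduler.add_job(delayed_task,
                      id=task_id,
                      next_run_time=updated_run_time,
                      args=(task_id, passed_data),
                      replace_existing=True)

def main():
    scheduler.add_job(delayed_task,
                      id="task-1",
                      next_run_time=datetime.now() + timedelta(seconds=1),
                      args=("task-1", {"some": "data"}))
    scheduler.start()
    sleep(12)  # stand-in for do_something_else()

if __name__ == "__main__":
    main()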
I have a custom contextmanager I use (not a fixture) for setup and cleanup of a test:
from contextlib import contextmanager

@contextmanager
def db_content(*args, **kwargs):
    instance = db_insert( ... )
    yield instance
    db_delete(instance)

def test_my_test():
    with db_content( ... ) as instance:
        # ...
        assert result
The problem is that when the assertion fails, the db_delete() code, meaning the statements after the yield, is not executed.
I can see that it does work if I use a fixture:
@pytest.fixture
def db_instance():
    instance = db_insert( ... )
    yield instance
    db_delete(instance)

def test_my_test(db_instance):
    # ...
    assert result
However, fixtures are quite inflexible: I would like to pass different arguments to my context in each test, while fixtures would force me to define a different fixture for each case.
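(A sketch of the factory-as-fixture pattern, which can mitigate this by having a single fixture return a builder; db_insert/db_delete are the hypothetical helpers from above:)

import pytest

@pytest.fixture
def db_content():
    created = []

    def _make(*args, **kwargs):
        instance = db_insert(*args, **kwargs)
        created.append(instance)
        return instance

    yield _make
    # runs even when the test body fails an assertion
    for instance in created:
        db_delete(instance)

def test_my_test(db_content):
    instance = db_content("arg1", key="value")  # arbitrary per-test arguments
    assert instance is not None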
contextlib does not execute the post-yield statements if an exception was thrown. This is by design. To make it work you would have to write:
@contextmanager
def db_content(*args, **kwargs):
    instance = db_insert( ... )
    try:
        yield instance
    finally:
        db_delete(instance)
In my opinion this is counter-intuitive, as the try/finally is not applied to the yield for you.
I took the implementation of contextmanager and made a safe version that works as I expected; however, it is an entire code duplication. If anyone has a better workaround, I'd love to see it.
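One possible workaround that avoids copying contextlib's implementation (a sketch; it does not reproduce every edge case contextlib handles, e.g. a generator that wants to catch the exception itself): wrap the generator function so the code after its yield always runs in a finally.

from contextlib import contextmanager
from functools import wraps

def safe_contextmanager(gen_func):
    """Like @contextmanager, but always drives the generator past the
    yield, so cleanup code runs even when the with-body raises."""
    @wraps(gen_func)
    def factory(*args, **kwargs):
        @contextmanager
        def safe():
            gen = gen_func(*args, **kwargs)
            value = next(gen)            # run setup up to the yield
            try:
                yield value
            finally:
                try:
                    next(gen)            # run the cleanup after the yield
                except StopIteration:
                    pass
        return safe()
    return factory

# stand-ins for the question's helpers, just to make the sketch runnable
def db_insert(*args, **kwargs):
    print("insert", args, kwargs)
    return object()

def db_delete(instance):
    print("delete", instance)

@safe_contextmanager
def db_content(*args, **kwargs):
    instance = db_insert(*args, **kwargs)
    yield instance
    db_delete(instance)

The inner @contextmanager still does all the plumbing; the wrapper only guarantees that the generator is driven past its yield.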
I have an implementation of a BackgroundTask object that looks like the following:
from PyQt5.QtCore import QObject, pyqtSignal  # assuming PyQt5

class BackgroundTask(QObject):
    '''
    A utility class that makes running long-running tasks in a separate thread easier

    :type task: callable
    :param task: The task to run
    :param args: positional arguments to pass to task
    :param kwargs: keyword arguments to pass to task

    .. warning :: There is one **MAJOR** restriction in task: It **cannot** interact with any Qt GUI objects.
                  Doing so will cause the GUI to crash. This is a limitation in Qt's threading model, not with
                  this class's design.
    '''
    finished = pyqtSignal()  #: Signal that is emitted when the task has finished running

    def __init__(self, task, *args, **kwargs):
        super(BackgroundTask, self).__init__()
        self.task = task      #: The callable that does the actual task work
        self.args = args      #: positional arguments passed to task when it is called
        self.kwargs = kwargs  #: keyword arguments passed to task when it is called
        self.results = None   #: After :attr:`finished` is emitted, this will contain whatever value :attr:`task` returned

    def runTask(self):
        '''
        Does the actual calling of :attr:`task`, in the form ``task(*args, **kwargs)``, and stores the returned value
        in :attr:`results`
        '''
        flushed_print('Running Task')
        self.results = self.task(*self.args, **self.kwargs)
        flushed_print('Got My Results!')
        flushed_print('Emitting Finished!')
        self.finished.emit()

    def __repr__(self):
        return '<BackgroundTask(task={}, args={}, kwargs={})>'.format(self.task, self.args, self.kwargs)

    @staticmethod
    def build_and_run_background_task(thread, finished_notifier, task, *args, **kwargs):
        '''
        Factory method that builds a :class:`BackgroundTask` and runs it on a thread in one call

        :type finished_notifier: callable
        :param finished_notifier: Callback that will be called when the task has completed its execution. Signature: ``func()``
        :rtype: :class:`BackgroundTask`
        :return: The created :class:`BackgroundTask` object, which will be running in its thread.
                 Once finished_notifier has been called, the :attr:`results` attribute of the returned :class:`BackgroundTask` should contain
                 the return value of the input task callable.
        '''
        flushed_print('Setting Up Background Task To Run In Thread')
        bg_task = BackgroundTask(task, *args, **kwargs)
        bg_task.moveToThread(thread)
        bg_task.finished.connect(thread.quit)
        thread.started.connect(bg_task.runTask)
        thread.finished.connect(finished_notifier)
        thread.start()
        flushed_print('Thread Started!')
        return bg_task
As my docstrings indicate, this should allow me to pass an arbitrary callable and its arguments to build_and_run_background_task, and upon completion of the task it should call the callable passed as finished_notifier and kill the thread. However, when I run it with the following as finished_notifier
def done():
    flushed_print('Done!')
I get the following output:
Setting Up Background Task To Run In Thread
Thread Started!
Running Task
Got My Results!
Emitting Finished!
And that's it. The finished_notifier callback is never executed, and the thread's quit method is never called, suggesting the finished signal in BackgroundTask isn't actually being emitted. If, however, I bind to finished directly and call runTask directly (not in a thread), everything works as expected. I'm sure I just missed something stupid; any suggestions?
Figured out the problem myself: I needed to call qApp.processEvents() at the point in the application that was waiting for this operation to finish. I had also been testing on the command line, and that had masked the same problem.
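For anyone hitting the same thing, a sketch of the kind of wait loop that was starving the Qt event loop (done_flag is a made-up callable for illustration); without the processEvents() call, queued signal deliveries such as finished -> thread.quit never get dispatched:

from time import sleep
from PyQt5.QtWidgets import QApplication

def wait_for_task(done_flag):
    # Pump the event loop while waiting, so queued signal/slot
    # deliveries can run instead of piling up behind this loop.
    while not done_flag():
        QApplication.processEvents()
        sleep(0.05)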