I need to add a dynamic soft_time_limit to my task. The task should be executed with a soft_time_limit that is determined at call time; I don't want to hard-code a limit for this execution. Is there any way to define this?
@APP.task(acks_late=True, soft_time_limit=10000, trail=True, bind=True)
def execute_fun(self, data):
    try:
        do_work()
    except Exception as error:
        print('---error---', error)
In the function above I don't want to fix the soft_time_limit in the decorator; it should be taken as a dynamic time limit. How should I achieve this?
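One possible approach, sketched below under the assumption that your Celery version supports the soft_time_limit execution option of apply_async (the compute_limit_for helper is hypothetical): keep or drop the decorator default, and let the caller override the limit per call.

    # Caller-side sketch: override the soft time limit when sending the task.
    limit_seconds = compute_limit_for(data)  # hypothetical helper computing the limit

    execute_fun.apply_async(
        args=(data,),
        soft_time_limit=limit_seconds,  # takes precedence over any decorator value
    )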
Hope you are doing well!
I'm using a function that utilizes the lru_cache from the functools library, for example:
@functools.lru_cache(maxsize=pow(2, 13))
def get_id(word):
    # retrieving id, using cache if possible
    ...
On some occasions I would like to bypass the cache and get the ID straight from the source,
but I don't want to create two separate functions (one with the cache and one without) that run exactly the same code.
Reading the functools documentation on docs.python.org, I understand that the cache can be bypassed:
The original underlying function is accessible through the __wrapped__
attribute. This is useful for introspection, for bypassing the cache,
or for rewrapping the function with a different cache.
I've tried doing so with a wrapper function, but since the inner function exists only while the outer function is running, the cache was reset on every call.
I would appreciate any help on the matter.
What the documentation is telling you is that you can access the wrapped function directly this way (bypassing the caching):
get_id.__wrapped__(word="hello")
You could add one additional layer with a flag:
from functools import lru_cache

@lru_cache(maxsize=pow(2, 13))
def get_cached(word):
    return get_raw(word)

def get_raw(word):
    # your logic goes here...
    pass

def get_id(word, cached=True):
    if cached:
        return get_cached(word)
    else:
        return get_raw(word)
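A hedged usage sketch of that flag (the word "hello" is just an example input):

    get_id("hello")                 # served from, or added to, the cache
    get_id("hello", cached=False)   # always calls get_raw, skipping the cache
    print(get_cached.cache_info())  # lru_cache exposes hit/miss statistics on the wrapper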
Please consider the following code:
class Task1(TaskSet):
    @task
    def task1_method(self):
        pass

class Task2(TaskSet):
    @task
    def task2_method(self):
        pass

class UserBehaviour(TaskSet):
    tasks = [Task1, Task2]

class LoggedInUser(HttpUser):
    host = "http://localhost"
    wait_time = between(1, 5)
    tasks = [UserBehaviour]
When I execute the code above with just one user, the method Task2.task2_method never gets executed, only the method from Task1.
What can I do to make sure the code from both tasks gets executed for the same user?
I would like to do it this way because I want to separate the tasks into different files for better project organization. If that is not possible, how can I define tasks in different files in a way that lets me have tasks for each of my application modules?
I think I got it. To solve the problem I had to add a method at the end of each task set to stop its execution:
def stop(self):
    self.interrupt()
In addition to that, I had to change the inherited class to SequentialTaskSet so all tasks get executed in order.
This is the full code:
class Task1(SequentialTaskSet):
    @task
    def task1_method(self):
        pass

    @task
    def stop(self):
        self.interrupt()

class Task2(SequentialTaskSet):
    @task
    def task2_method(self):
        pass

    @task
    def stop(self):
        self.interrupt()

class UserBehaviour(SequentialTaskSet):
    tasks = [Task1, Task2]

class LoggedInUser(HttpUser):
    host = "http://localhost"
    wait_time = between(1, 5)
    tasks = [UserBehaviour]
Everything seems to be working fine now.
At first I thought this was a bug, but it is actually intended behaviour (although I don't really understand why it was implemented that way):
One important thing to know about TaskSets is that they will never stop executing their tasks, and hand over execution back to their parent User/TaskSet, by themselves. This has to be done by the developer by calling the TaskSet.interrupt() method.
https://docs.locust.io/en/stable/writing-a-locustfile.html#interrupting-a-taskset
I would solve this issue with inheritance: Define a base TaskSet or User class that has the common tasks, and then subclass it, adding the user-type-specific tasks/code.
If you define a base User class, remember to set abstract = True if you don't want Locust to run that user as well.
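A minimal sketch of that inheritance approach, assuming Locust's HttpUser, task and between; the class names, task bodies and endpoints here are hypothetical:

    from locust import HttpUser, task, between

    class BaseUser(HttpUser):
        abstract = True          # keeps Locust from spawning BaseUser itself
        host = "http://localhost"
        wait_time = between(1, 5)

        @task
        def common_task(self):   # shared by every user type
            self.client.get("/health")

    class ModuleOneUser(BaseUser):
        @task
        def task1_method(self):  # module-specific task, e.g. defined in its own file
            self.client.get("/module1")

    class ModuleTwoUser(BaseUser):
        @task
        def task2_method(self):
            self.client.get("/module2")

Each subclass can live in its own module and simply import BaseUser, which also addresses the original goal of keeping tasks in separate files.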
I am unable to understand the concept of the payload in Airflow with TriggerDagRunOperator. Please help me understand this term in a simple way.
The TriggerDagRunOperator triggers a DAG run for a specified dag_id. It needs a trigger_dag_id of type string and a python_callable param, which is a reference to a Python function that will be called with the context object and a placeholder object obj; your callable fills obj and returns it if you want a DagRun created. This obj object has run_id and payload attributes that you can modify in your function.
The run_id should be a unique identifier for that DAG run, and the payload has to be a picklable object that will be made available to your tasks while executing that DAG run. Your function header should look like def foo(context, dag_run_obj):
Picklable simply means it can be serialized by the pickle module. For a basic understanding of this, see "What can be pickled and unpickled?". The pickle protocol provides more details and shows how classes can customize the process.
Reference: https://github.com/apache/airflow/blob/d313d8d24b1969be9154b555dd91466a2489e1c7/airflow/operators/dagrun_operator.py#L37
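For illustration, a hedged sketch of how the callable and the operator fit together in the pre-2.0 API that the reference above points to; the dag_id, run_id format and payload contents are assumptions:

    from airflow.operators.dagrun_operator import TriggerDagRunOperator

    def conditionally_trigger(context, dag_run_obj):
        # Fill the placeholder: run_id identifies the run, payload is a picklable
        # object made available to the tasks of the triggered DAG run.
        dag_run_obj.run_id = "triggered__" + context["ds"]
        dag_run_obj.payload = {"message": "hello from the controller"}  # assumed contents
        return dag_run_obj  # returning the object tells Airflow to create the DagRun

    trigger = TriggerDagRunOperator(
        task_id="trigger_target_dag",
        trigger_dag_id="target_dag",        # assumed id of the DAG to trigger
        python_callable=conditionally_trigger,
        dag=dag,                            # the controller DAG object
    )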
I am trying to implement a class which uses a Python context manager.
Though I understand the general concept of __enter__ and __exit__, I don't see how to use the same context manager across multiple code blocks.
For example, take the case below:
from contextlib import contextmanager

@contextmanager
def backupContext(input):
    try:
        yield xyz  # a @contextmanager generator must yield, not return
    finally:
        revert(xyz)
class do_something:
    def __init__(self):
        self.context = contextVal

    def doResourceOperation_1(self):
        with backupContext(self.context) as context:
            do_what_you_want_1(context)

    def doResourceOperation_2(self):
        with backupContext(self.context) as context:
            do_what_you_want_2(context)
I am invoking the context manager twice. Suppose I want to do it only once, during __init__, use the same context manager object for all my operations, and then, when the object is deleted, perform the revert operation. How should I go about it?
Should I call __enter__ and __exit__ manually instead of using the with statement?
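One pattern worth sketching here, reusing the hypothetical names from the question: enter the context once with contextlib.ExitStack and close it explicitly when the object is finished, instead of calling __enter__ and __exit__ by hand.

    from contextlib import ExitStack

    class do_something:
        def __init__(self, contextVal):
            self._stack = ExitStack()
            # Enter backupContext once; it stays active until close() runs.
            self.context = self._stack.enter_context(backupContext(contextVal))

        def doResourceOperation_1(self):
            do_what_you_want_1(self.context)

        def doResourceOperation_2(self):
            do_what_you_want_2(self.context)

        def close(self):
            # Runs the finally/revert part of backupContext exactly once.
            self._stack.close()

An explicit close() (or making the class a context manager itself) is generally safer than relying on __del__, since Python gives no guarantee about when an object is deleted.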
What is the order in which Luigi executes the methods (run, output, requires)? I understand requires runs first, as a check of the validity of the task DAG, but shouldn't output be run after run()?
I'm actually trying to wait for a Kafka message in run and, based on that, trigger a bunch of other tasks and return a LocalTarget. Like this:
def run(self):
    for message in self.consumer:
        self.metadata_key = str(message.value, 'utf-8')
        self.path = os.path.join(settings.LUIGI_OUTPUT_PATH, self.metadata_key, self.batch_id)
        if not os.path.exists(self.path):
            os.mkdir(self.path)
        with self.conn.cursor() as cursor:
            all_accounts = cursor.execute('select domainname from tblaccountinfo;')
            for each in all_accounts:
                open(os.path.join(self.path, each), 'w').close()

def output(self):
    return LocalTarget(self.path)
However, I get an error saying:
Exception: path or is_tmp must be set
At the return LocalTarget(self.path) line. Why does Luigi try to execute the output() method before run() is done?
When you run a pipeline (i.e. one or more tasks), Luigi first checks whether its output targets already exist, and if not, schedules the task to run.
How does Luigi know what targets it must check? It just gets them calling your task's output() method.
It is not about the execution order. Luigi checks whether the file that output() is supposed to create already exists before moving the task to pending status, so it expects any variables used there to already be resolved. Here you are using self.path, which only gets set in the run method; that is why you get the error.
Either you have to create the path in the class itself and consume it in the output method, or create it in the output method itself and consume it in the run method, as below:
self.output().open('w').close()
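A hedged sketch of the first option, computing the path from task parameters so that output() can resolve it before run() executes; the task name is hypothetical, settings is carried over from the question, and turning metadata_key into a parameter is just one possible restructuring:

    import os
    import luigi
    from luigi import LocalTarget

    class WaitForKafkaMessage(luigi.Task):   # hypothetical task name
        batch_id = luigi.Parameter()
        metadata_key = luigi.Parameter()     # resolved up front, not inside run()

        def output(self):
            # The path depends only on parameters, so Luigi can check it
            # before run() has executed.
            path = os.path.join(settings.LUIGI_OUTPUT_PATH, self.metadata_key, self.batch_id)
            return LocalTarget(path)

        def run(self):
            # ... do the real work, then create the target so Luigi marks the task complete.
            self.output().open('w').close()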