ThreadPoolExecutor with Context Manager "cannot schedule new futures after shutdown" - python-3.x

I'm creating a thread manager class that handles executing tasks as threads and passing the results to the next process step. The flow works properly the first time a task is received, but the second execution fails with the following error:
...python3.8/concurrent/futures/thread.py", line 179, in submit
raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown
The tasks come from Cmd.cmdloop user input, so the script is persistent and meant not to shut down. Instead, run will be called multiple times, as input is received from the user.
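For context, the entry point looks roughly like this (a minimal sketch; the command name and manager attribute are placeholders):
import cmd

class TaskShell(cmd.Cmd):
    prompt = '> '

    def do_run(self, line):
        # each user command triggers another call to the manager's run()
        self.manager.run()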
I've implemented a ThreadPoolExecutor to handle the workload, and I'm gathering the results with concurrent.futures.as_completed so each item is passed to the next step in order of completion.
The run method below works perfectly for the first execution, but raises the error on the second execution of the same task (which succeeded during the first execution).
def run(self, _executor=None, _futures={}) -> bool:
    task = self.pipeline.get()
    with _executor or self.__default_executor as executor:
        _futures = {executor.submit(task.target.execute)}
        for future in concurrent.futures.as_completed(_futures):
            print(future.result())
    return True
So, the idea is that each call to run will create and tear down the executor with the context manager. But the error suggests the executor shut down after the first execution and cannot be reopened/recreated when run is called the second time... what is this error pointing to? What am I missing?
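The same error can be reproduced with a bare executor, which seems to confirm that exiting the with block is what shuts it down (a minimal sketch):
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)

with executor:                           # __exit__ calls executor.shutdown()
    executor.submit(print, 'first run')  # works

executor.submit(print, 'second run')     # RuntimeError: cannot schedule new futures after shutdown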
Any help would be great - thanks in advance.

Your easiest solution will be to use the multiprocessing library's ThreadPool instead of futures and a ThreadPoolExecutor with a context manager:
from multiprocessing.pool import ThreadPool

pool = ThreadPool(50)
pool.starmap(test_function, zip(array1, array2, ...))
pool.close()
pool.join()
Here, (array1[0], array2[0]) will be the values sent to test_function in the first thread, (array1[1], array2[1]) in the second thread, and so on.
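For instance, a self-contained sketch of that pairing (the function and lists here are placeholders):
from multiprocessing.pool import ThreadPool

def test_function(a, b):
    print(a, b)

array1 = ['x', 'y', 'z']
array2 = [1, 2, 3]

pool = ThreadPool(3)
# zip pairs the lists element-wise: ('x', 1), ('y', 2), ('z', 3)
pool.starmap(test_function, zip(array1, array2))
pool.close()
pool.join()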

Related

Python thread breaks down silently when run in the background

In Django I am using a thread to read an xlsx file in the background, but after some time it breaks down silently without giving any errors. Is there any way to start an independent thread that does not fail randomly?
thread_obj = threading.Thread(
    target=bulk_xlsx_obj.copy_bulk_xlsx
)
thread_obj.start()
You could set the thread as a daemon to avoid silent failure, as follows:
thread_obj = threading.Thread(
    target=bulk_xlsx_obj.copy_bulk_xlsx,
    daemon=True  # preferred over setDaemon(True), which is deprecated since Python 3.10
)
thread_obj.start()
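Separately, if the thread is dying because the target raises, wrapping the target so the exception gets logged makes the failure visible instead of silent (a sketch; bulk_xlsx_obj is assumed from the question):
import logging
import threading

def copy_with_logging():
    try:
        bulk_xlsx_obj.copy_bulk_xlsx()
    except Exception:
        # log the full traceback instead of letting the thread die silently
        logging.exception('copy_bulk_xlsx failed in background thread')

thread_obj = threading.Thread(target=copy_with_logging, daemon=True)
thread_obj.start()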

Want to deploy to servers in parallel using Groovy

I am trying to deploy to a list of servers in parallel to save some time. The names of the servers are listed in a collection: serverNames
The original code was:
serverNames.each({
    def server = new Server([steps: steps, hostname: it, domain: "test"])
    server.stopTomcat()
    server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
    PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
})
Basically I want to stop Tomcat, remove the old files, and then copy a war file to a location using the following lines:
server.stopTomcat()
server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
The original code worked properly: it took one server at a time from the serverNames collection and performed the 3 lines to do the deploy.
But now I have a requirement to run the deployment to the servers listed in serverNames in parallel.
Below is my new modified code:
def threads = []
def th
serverNames.each({
    def server = new Server([steps: steps, hostname: it, domain: "test"])
    th = new Thread({
        steps.echo "doing deployment"
        server.stopTomcat()
        server.ssh("rm -rf ${WEB_APPS_DIR}/pc*")
        PLMScriptUtils.secureCopy(steps, warFileLocation, it, WEB_APPS_DIR)
    })
    threads << th
})
threads.each {
    steps.echo "joining thread"
    it.join()
}
threads.each {
    steps.echo "starting thread"
    it.start()
}
The echo statements were added to visualize the flow.
With this the output is coming as:
joining thread
joining thread
joining thread
joining thread
starting thread
starting thread
starting thread
starting thread
The number of servers in the collection was 4, hence the thread is added and started 4 times. But it is not executing the 3 lines I want to run in parallel: "doing deployment" is never printed, and later the build fails with an exception.
Note that I am running this Groovy code as a pipeline through Jenkins. This whole piece of code is actually a function called deploy of the class deployment; my Jenkins pipeline creates an object of the class deployment and then calls the deploy function.
Can anyone help me with this? I am stuck like hell with this one. :-(
Have a look at the parallel step. In scripted pipelines (which you seem to be using), you can pass it a map of thread name to action (as a Groovy closure), which is then run in parallel.
deployActions = [
    Server1: {
        // stop tomcat etc.
    },
    Server2: {
        ...
    }
]
parallel deployActions
It is much simpler and the recommended way of doing it.

Twilio Taskrouter: How to prevent last worker in queue from being reassigned rejected task?

I'm using NodeJS to manage a Twilio Taskrouter workflow. My goal is to have a task assigned to an Idle worker in the main queue identified with queueSid, unless one of the following is true:
No workers in the queue are set to Idle
Reservations for the task have already been rejected by every worker in the queue
In these cases, the task should fall through to the next queue identified with automaticQueueSid. Here is how I construct the JSON for the workflow (it includes a filter such that an inbound call from an agent should not generate an outbound call to that same agent):
configurationJSON(){
    var config={
        "task_routing":{
            "filters":[
                {
                    "filter_friendly_name":"don't call self",
                    "expression":"1==1",
                    "targets":[
                        {
                            "queue":queueSid,
                            "expression":"(task.caller!=worker.contact_uri) and (worker.sid NOT IN task.rejectedWorkers)",
                            "skip_if": "workers.available == 0"
                        },
                        {
                            "queue":automaticQueueSid
                        }
                    ]
                }
            ],
            "default_filter":{
                "queue":queueSid
            }
        }
    }
    return config;
}
This results in no reservation being created after the task reaches the queue. My event logger shows that the following events have occurred:
workflow.target-matched
workflow.entered
task.created
That's as far as it gets and just hangs there. When I replace the line
"expression":"(task.caller!=worker.contact_uri) and (worker.sid NOT IN task.rejectedWorkers)"
with
"expression":"(task.caller!=worker.contact_uri)"
Then the reservation is correctly created for the next available worker, or sent to automaticQueueSid if no workers are available when the call comes in, so I guess the skip_if is working correctly. So maybe there is something wrong with how I wrote the target expression?
I tried working around this by setting a worker to unavailable once they reject a reservation, as follows:
clientWorkspace
    .workers(parameters.workerSid)
    .reservations(parameters.reservationSid)
    .update({
        reservationStatus:'rejected'
    })
    .then(reservation=>{
        //this function sets the worker's Activity to Offline
        var updateResult=worker.updateWorkerFromSid(parameters.workerSid,process.env.TWILIO_OFFLINE_SID);
    })
    .catch(err=>console.log("/agent_rejects: error rejecting reservation: "+err));
But what seems to be happening is that as soon as the reservation is rejected, before worker.updateWorkerFromSid() is called, Taskrouter has already generated a new reservation and assigned it to that same worker, and my Activity update fails with the following error:
Error: Worker [workerSid] cannot have its activity updated while it has 1 pending reservations.
Eventually, it seems that the worker is naturally set to Offline and the task does time out and get moved into the next queue, as shown by the following events/descriptions:
worker.activity.update: Worker [friendly name] updated to Offline Activity
reservation.timeout: Reservation [sid] timed out
task-queue.moved: Task [sid] moved out of TaskQueue [friendly name]
task-queue.timeout: Task [sid] timed out of TaskQueue [friendly name]
After this point the task is moved into the next queue automaticQueueSid to be handled by available workers registered with that queue. I'm not sure why a timeout is being used, as I haven't included one in my workflow configuration.
I'm stumped--how can I get the task to successfully move to the next queue upon the last worker's reservation rejection?
UPDATE: although @philnash's answer helped me correctly handle the worker.sid NOT IN task.rejectedWorkers issue, I ultimately ended up implementing this feature using the RejectPendingReservations parameter when updating the worker's availability.
Twilio developer evangelist here.
rejectedWorkers is not an attribute that is automatically handled by TaskRouter. You reference this answer by my colleague Megan, in which she says:
For example, you could update TaskAttributes to have a rejected worker SID list, and then in the workflow say that worker.sid NOT IN task.rejectedWorkerSids.
So, in order to filter by a rejectedWorkers attribute you need to maintain one yourself, by updating the task before you reject the reservation.
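The order of operations matters: update the task attributes first, then reject the reservation. A sketch of that flow (shown in Python with the Twilio helper library for illustration; the sids and the rejectedWorkers attribute name are assumptions taken from the question):
import json
from twilio.rest import Client

client = Client(ACCOUNT_SID, AUTH_TOKEN)
workspace = client.taskrouter.v1.workspaces(WORKSPACE_SID)

def reject_and_record(task_sid, reservation_sid, worker_sid):
    # 1. Record the rejecting worker on the task itself...
    task = workspace.tasks(task_sid).fetch()
    attributes = json.loads(task.attributes)
    attributes.setdefault('rejectedWorkers', []).append(worker_sid)
    workspace.tasks(task_sid).update(attributes=json.dumps(attributes))

    # 2. ...then reject the reservation, so the workflow expression
    #    "worker.sid NOT IN task.rejectedWorkers" sees the updated list.
    workspace.tasks(task_sid).reservations(reservation_sid).update(
        reservation_status='rejected'
    )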
Let me know if that helps at all.

How can I check if an exception was raised in a Gherkin test?

I'm writing a scenario that checks for a timeout exception. This is my Ruby code:
# Execute the constructed command, logging out each line
log.info "Executing '#{command.join(' ')}'"
begin
  timeout(config['deploy-timeout'].to_i) do
    execute_and_log command
  end
rescue Exception => e
  log.info 'Err, timeout, ouch'
  raise Timeout::Error
end
I'd like to check, in Gherkin/Cucumber, either the output for 'Err, timeout, ouch' or whether an exception was raised.
Then(/^the output should be 'Err, timeout, ouch'$/) do
  puts 'Err, timeout, ouch'
end
How can I do that?
Gherkin isn't the place where the magic happens. The magic happens in the step definition files, where you use Ruby, Selenium, RSpec and other technologies (e.g. Capybara) to create the desired behaviors. So to rephrase your question: "How can I test a timeout exception given a Cucumber, Ruby, RSpec and Selenium implementation?"
Selenium has the concept of implicit waits. That is, the duration over which selenium retries an operation before declaring failure. You can control implicit waits by setting the following in your env.rb:
# Set the amount of time the driver should wait when searching for elements
driver.manage.timeouts.implicit_wait = 20
# Sets the amount of time to wait for an asynchronous script to finish
# execution before throwing an error. If the timeout is negative, then the
# script will be allowed to run indefinitely.
driver.manage.timeouts.script_timeout = 20
# Sets the amount of time to wait for a page load to complete before throwing an error.
# If the timeout is negative, page loads can be indefinite.
driver.manage.timeouts.page_load = 20
The units are seconds. You will need to set implicit_wait higher than your 'Err, timeout, ouch' timeout. Think.
I believe that WebDriver throws Error::TimeOutError when a wait time is exceeded. Your code throws, what? Timeout::Error? So in the rescue sections:
Given(/^ .... $/) do
  begin
    # ...
  rescue Error::TimeOutError => e
    @timeout_exception = "Error::TimeOutError"
  rescue Timeout::Error => f
    @timeout_exception = "Err, timeout, ouch"
  end
end

Then(/^the output should be '(.+)'$/) do |expectedException|
  expect(@timeout_exception).to eq(expectedException), "Err, timeout, ouch"
end
The above assumes that you are using rspec-expectations, i.e. RSpec 3.
First, you need to be able to execute the code under test in such a way that the timeout is guaranteed. One option is to call the CLI and set up config files and whatnot, but you would be much better off if you can call just the relevant part of the code from your step definition directly.
Next, you want to be able to test the outcome of your 'When' steps in your 'Then' steps. You enclose the actual call in the 'When' step in a begin/rescue block and store the result, as well as any exception, somewhere the 'Then' step will be able to do assertions on them.
Something along these lines (using Cucumber instance variables as the shared context):
Given(/^a configuration that will cause a timeout$/) do
  @config = create_config_that_will_cause_timeout
end

When(/^a command is executed$/) do
  begin
    @result = execute_command(@config)
  rescue Exception => ex
    @exception = ex
  end
end

Then(/^there should have been a timeout$/) do
  log_content = load_log_content(@config)
  expect(@exception).to be_a(Timeout::Error)
  expect(log_content).to include('Err, timeout, ouch')
end

Tkinter traffic-jamming

I have an expensive function that is called via a Tkinter callback:
def func(event):  # called whenever there is a mouse press in the screen
    print("Busy? " + str(X.busy))  # X.busy is my own variable and is initialized False
    X.busy = True
    do_calculations()  # do_calculations contains several tk.Canvas().update() calls
    X.busy = False
When I click too quickly, the calls to func() appear to pile up, because the print gives "Busy? True", indicating that the function hasn't finished yet and we are starting it on another thread.
However, print(threading.current_thread()) always gives <_MainThread(MainThread, started 123...)>, and the 123... is the same in every print for a given program run. How can the same thread be multiple threads?
It looks to me like you're running into recursive message processing. In particular, tk.Canvas().update() will process any pending messages, including extra button clicks. Further, it will do this on the same thread (at least on Windows).
So your thread ID is constant, but your stack trace will have multiple nested calls to func.
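If the goal is to drop clicks that arrive mid-calculation, the busy flag from the question can serve as a reentrancy guard (a sketch along the lines of the original code):
def func(event):
    if X.busy:
        # re-entered via update() processing a queued click; drop it
        return
    X.busy = True
    try:
        do_calculations()
    finally:
        X.busy = False  # always reset, even if do_calculations raises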
