In Django I am using a thread to read an xlsx file in the background, but after some time it breaks down silently without giving any errors. Is there any way to start an independent thread that does not fail randomly?
thread_obj = threading.Thread(
    target=bulk_xlsx_obj.copy_bulk_xlsx
)
thread_obj.start()
You could mark the thread as a daemon to avoid the silent failure, as follows (note that setDaemon() is deprecated since Python 3.10, so set the daemon attribute or pass daemon=True instead):
thread_obj = threading.Thread(
    target=bulk_xlsx_obj.copy_bulk_xlsx,
    daemon=True
)
thread_obj.start()
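Separately, if the worry is that exceptions raised inside copy_bulk_xlsx vanish, note that an exception in a thread is not propagated to the code that started it. A minimal sketch of one way to surface it (the wrapper and logger are my own additions, not from the question; bulk_xlsx_obj is the object from the question):
import logging
import threading

logger = logging.getLogger(__name__)

def logged_target(func, *args, **kwargs):
    # Run the real target and log anything it raises instead of
    # letting the exception die silently with the thread.
    try:
        func(*args, **kwargs)
    except Exception:
        logger.exception("background xlsx import failed")

thread_obj = threading.Thread(
    target=logged_target,
    args=(bulk_xlsx_obj.copy_bulk_xlsx,),
    daemon=True,
)
thread_obj.start()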
I'm creating a thread manager class that handles executing tasks as threads and passing the results to the next process step. The flow works properly the first time a task is received, but the second execution fails with the following error:
...python3.8/concurrent/futures/thread.py", line 179, in submit
raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown
The tasks come from Cmd.cmdloop user input, so the script is persistent and not meant to shut down. Instead, run will be called multiple times as input is received from the user.
I've implemented a ThreadPoolExecutor to handle the workload, and I'm trying to gather the results with concurrent.futures.as_completed so each item is passed to the next step in order of completion.
The run method below works perfectly for the first execution, but raises the error above upon the second execution of the same task (which succeeded during the first execution).
def run(self, _executor=None, _futures={}) -> bool:
    task = self.pipeline.get()
    with _executor or self.__default_executor as executor:
        _futures = {executor.submit(task.target.execute)}
        for future in concurrent.futures.as_completed(_futures):
            print(future.result())
    return True
So the idea is that each call to run will create and tear down the executor via the context manager. But the error suggests the executor shut down after the first execution and cannot be reopened/recreated when run is called the second time... What is this error pointing to? What am I missing?
Any help would be great - thanks in advance.
Your easiest solution will be to use the multiprocessing library instead of submitting futures to a ThreadPoolExecutor through a context manager:
from multiprocessing.pool import ThreadPool

pool = ThreadPool(50)
pool.starmap(test_function, zip(array1, array2, ...))
pool.close()
pool.join()
Here (array1[0], array2[0]) will be the values passed to test_function in the first thread, (array1[1], array2[1]) in the second thread, and so on.
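For reference, a self-contained sketch of the same pattern; the arrays and test_function here are placeholders of my own, not from the question:
from multiprocessing.pool import ThreadPool

def test_function(a, b):
    # Placeholder work: combine the two values from one zipped pair.
    return a + b

array1 = [1, 2, 3]
array2 = [10, 20, 30]

pool = ThreadPool(3)
results = pool.starmap(test_function, zip(array1, array2))
pool.close()
pool.join()
print(results)  # [11, 22, 33]
Because a new ThreadPool is created and closed on every call, there is no executor left in a shut-down state to trip over on the next run.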
I'm building an application which is intended to do a bulk job, processing data within another piece of software. To control the other software automatically I'm using pyautoit, and everything works fine, except for application errors caused by the external software, which occur from time to time.
To handle those cases, I built a watchdog:
It starts the script with the bulk job within a subprocess
process = subprocess.Popen(['python', job_script, src_path], stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE, shell=True)
It listens to system events using the winevt.EventLog module
EventLog.Subscribe('System', 'Event/System[Level<=2]', handle_event)
In case an error occurs, it shuts everything down and restarts the script.
OK, if a system error event occurs, this event should be handled in such a way that the subprocess gets notified. The notification should then lead to the following action within the subprocess:
Within the subprocess there's an object controlling everything and continuously collecting
generated data. In order not to have to start the whole job from the beginning after restarting the script, this object has to be dumped using pickle (which isn't the problem here!).
Listening to the system events from inside the subprocess didn't work; it results in a continuous loop when calling subprocess.Popen().
So, my question is how I can either subscribe to system events from inside a child process, or communicate between the parent and child process - meaning, send a message like "hey, an error occurred", listen for it within the subprocess and then create the dump?
I'm really sorry that I'm not allowed to post any code in this case, but I hope (and actually think) that my description is understandable. My question is just about which module to use to accomplish this in the best way.
Would be really happy, if somebody could point me into the right direction...
Br,
Mic
I believe the best answer may lie here: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.stdin
These attributes should allow for proper communication between the different processes fairly easily, and without any other dependencies.
Note that Popen.communicate() may suit better if other processes may cause issues.
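As a quick illustration of the communicate() variant, a minimal sketch (the child script name child.py and the message are placeholders of my own); communicate() sends the input, waits for the child to exit, and returns its output, so it fits one-shot exchanges rather than an ongoing dialogue:
import subprocess, sys

# Start a hypothetical child script and send it a single message over stdin.
p = subprocess.Popen([sys.executable, 'child.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
out, _ = p.communicate(input=b'an error occurred\n', timeout=30)
print(out)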
EDIT to add example scripts:
main.py
from subprocess import *
import sys

def check_output(p):
    out = p.stdout.readline()
    return out

def send_data(p, data):
    p.stdin.write(bytes(f'{data}\r\n', 'utf8'))  # auto newline
    p.stdin.flush()

def initiate(p):
    #p.stdin.write(bytes('init\r\n', 'utf8'))  # function to send first communication
    #p.stdin.flush()
    send_data(p, 'init')
    return check_output(p)

def test(p, data):
    send_data(p, data)
    return check_output(p)

def main():
    exe_name = 'Doc2.py'
    p = Popen([sys.executable, exe_name], stdout=PIPE, stderr=STDOUT, stdin=PIPE)
    print(initiate(p))
    print(test(p, 'test'))
    print(test(p, 'test2'))  # testing responses
    print(test(p, 'test3'))

if __name__ == '__main__':
    main()
Doc2.py
import sys, time, random

def recv_data():
    return sys.stdin.readline()

def send_data(data):
    print(data, flush=True)  # flush so the parent's readline() sees the reply immediately

while 1:
    d = recv_data()
    #print(f'd: {d}')
    if d.strip() == 'test':
        send_data('return')
    elif d.strip() == 'init':
        send_data('Acknowledge')
    else:
        send_data('Failed')
This is the best method I could come up with for cross-process communication. Also make sure all requests and responses don't contain newlines, or the code will break.
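If payloads might contain newlines after all, one option (my addition, not part of the original scripts) is to flatten them before writing, since the protocol above is line-delimited:
def send_sanitized(p, data):
    # Replace embedded CR/LF so a single readline() on the other side still sees one record.
    flat = str(data).replace('\r', ' ').replace('\n', ' ')
    send_data(p, flat)  # send_data is the helper defined in main.py above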
I'm developing a web service that will be used as a "database as a service" provider. The goal is to have a small Flask-based web service running on some host, and "worker" processes running on different hosts owned by different teams. Whenever a team member comes and requests a new database, I should create one on their host. Now the problem... The process I start must keep running, but the worker might be restarted; that could happen after 5 minutes or after 5 days. A simple Popen won't do the trick because it creates a child process, and if the worker stops later on, the Popen'd process is destroyed (I tried this).
I have an implementation using multiprocessing which works like a champ; sadly I cannot use multiprocessing with Celery, so I'm out of luck there. I tried to get away from the multiprocessing library with double forking and named pipes. The most minimal sample I could produce:
import os
import select
import struct
import subprocess
import sys

from celery import shared_task


def launcher2(working_directory, cmd, *args):
    command = [cmd]
    command.extend(list(args))
    process = subprocess.Popen(command, cwd=working_directory, shell=False, start_new_session=True,
                               stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    with open(f'{working_directory}/ipc.fifo', 'wb') as wpid:
        wpid.write(struct.pack('I', process.pid))  # pack the pid; the reader unpacks it with struct


@shared_task(bind=True, name="Test")
def run(self, cmd, *args):
    working_directory = '/var/tmp/workdir'
    if not os.path.exists(working_directory):
        os.makedirs(working_directory, mode=0o700)
    ipc = f'{working_directory}/ipc.fifo'
    if os.path.exists(ipc):
        os.remove(ipc)
    os.mkfifo(ipc)
    pid1 = os.fork()
    if pid1 == 0:
        os.setsid()
        os.umask(0)
        pid2 = os.fork()
        if pid2 > 0:
            sys.exit(0)
        os.setsid()
        os.umask(0)
        launcher2(working_directory, cmd, *args)
    else:
        with os.fdopen(os.open(ipc, flags=os.O_NONBLOCK | os.O_RDONLY), 'rb') as ripc:
            readers, _, _ = select.select([ripc], [], [], 15)
            if not readers:
                raise TimeoutError(60, 'Timed out', ipc)
            reader = readers.pop()
            pid = struct.unpack('I', reader.read())[0]
        pid, status = os.waitpid(pid, 0)
        print(status)


if __name__ == '__main__':
    async_result = run.apply_async(('/usr/bin/sleep', '15'), queue='q2')
    print(async_result.get())
My use case is more complex, but I don't think anyone would want to read 200+ lines of bootstrapping, and it fails in exactly the same way. On the other hand, I don't wait for the pid unless that's required, so the idea is to start the process on request and let it do its job. Bootstrapping a database takes roughly a minute with the full setup, and I don't want the clients standing by for a minute. A request comes in, I spawn the process and send back an id for the database instance, and the client can query the status based on the received instance id. However, with the above forking solution I get:
[2020-01-20 18:03:17,760: INFO/MainProcess] Received task: Test[dbebc31c-7929-4b75-ae28-62d3f9810fd9]
[2020-01-20 18:03:20,859: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:16634 exited with 'signal 15 (SIGTERM)'
[2020-01-20 18:03:20,877: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 15 (SIGTERM).')
Traceback (most recent call last):
File "/home/pupsz/PycharmProjects/provider/venv37/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 15 (SIGTERM).
Which leaves me wondering what might be going on. I tried an even simpler task:
@shared_task(bind=True, name="Test")
def run(self, cmd, *args):
    working_directory = '/var/tmp/workdir'
    if not os.path.exists(working_directory):
        os.makedirs(working_directory, mode=0o700)
    command = [cmd]
    command.extend(list(args))
    process = subprocess.Popen(command, cwd=working_directory, shell=False, start_new_session=True,
                               stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return process.wait()


if __name__ == '__main__':
    async_result = run.apply_async(('/usr/bin/sleep', '15'), queue='q2')
    print(async_result.get())
Which again fails with the very same error. Now, I like Celery, but from this it feels like it's not suited to my needs. Did I mess something up? Can what I need be achieved from a worker? Do I have any alternatives, or should I just write my own task queue?
Celery is not multiprocessing-friendly, so try to use billiard instead of multiprocessing (from billiard import Process, etc.). I hope one day the Celery folks do a heavy refactoring of that code, remove billiard, and start using multiprocessing instead...
So, until they move to multiprocessing we are stuck with billiard. My advice is to remove any usage of multiprocessing in your Celery tasks, and start using billiard.context.Process and similar, depending on your use case.
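For illustration, a minimal sketch of that advice; the task name and helper function are placeholders of my own, and it assumes an already configured Celery app. The external command is launched from a billiard Process instead of anything from the multiprocessing module:
import subprocess

from billiard import Process
from celery import shared_task

def launch_detached(cmd, *args):
    # Start the external command in its own session so it is not tied
    # to the lifetime of the worker's child process.
    subprocess.Popen([cmd, *args], start_new_session=True,
                     stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

@shared_task(name="spawn_db_bootstrap")  # placeholder task name
def spawn(cmd, *args):
    p = Process(target=launch_detached, args=(cmd, *args))
    p.start()
    p.join()
    return True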
I am working on a script which needs to spawn an Expect process periodically (every 5 minutes) to do some work. Below is the code I have that spawns an Expect process and does some work. The main process of the script is doing other work at all times, for example it may be waiting for user input; because of that I am calling this function spawn_expect from a thread that keeps calling it every 5 minutes, but the issue is that Expect is not working as expected.
If, however, I replace the thread with another process, that is, if I fork and let one process take care of spawning Expect while the other process does the main work of the script (for example waiting at a prompt), then Expect works fine.
My question is: is it possible to have a thread spawn an Expect process? Do I have to resort to using a separate process to do this work? Thanks!
sub spawn_expect {
    my $expect = Expect->spawn($release_config{kinit_exec});
    my $position = $expect->expect(10,
        [qr/Password.*: /, sub {my $fh = shift; print $fh "password\n";}],
        [timeout => sub {print "Timed out";}]);
    # if this function is run via a process, $position is defined; if it is run via a thread, it is not defined
    ...
}
Create the Expect object beforehand (not inside a thread) and pass it to a thread
my $exp = Expect->spawn( ... );
$exp->raw_pty(1);
$exp->log_stdout(0);

my ($thr) = threads->create(\&login, $exp);
my @res = $thr->join();
# ...

sub login {
    my $exp = shift;
    my $position = $exp->expect( ... );
    # ...
}
I tested with multiple threads, where one uses Expect with a custom test script and returns the script's output to the main thread. Let me know if I should post these (short) programs.
When the Expect object is created inside a thread it fails for me, too. My guess is that in that case it can't set up its pty the way it normally does.
Given the clarification in a comment I'd use fork for the job though.
I need a solution for parallel processing in Tcl (on Windows).
I tried with the Thread package, but am still not able to achieve the desired output.
To simplify my requirement, I am giving a simple example as follows.
Requirement:
I want to run notepad.exe without affecting my current flow of execution. From the main thread, control should go to the called thread, start notepad.exe, and come back to the main thread without closing Notepad.
Tried (Tcl script):
package require Thread
set a 10
proc test_thread {b} {
    puts "in procedure $b"
    set tid [thread::create] ;# Create a thread
    return $tid
}
puts "main thread"
puts [thread::id]
set ttid [test_thread $a]
thread::send $ttid {exec c:/windows/system32/notepad.exe &}
puts "end"
Output I'm getting:
Notepad runs without showing any log output.
When I close the Notepad application I get the following output:
main thread
tid0000000000001214
in procedure 10
end
Desired output:
main thread
tid0000000000001214
in procedure 10
---->> control should go to the thread and run notepad.exe without affecting the main thread's flow.
<<-------
end
So kindly help me solve this issue, and if there is any approach other than threads, let me know.
You're using a synchronous thread::send. It's the version that is most convenient for when you want to get a value back, but it does wait. You probably should be using the asynchronous version:
thread::send -async $ttid {exec c:/windows/system32/notepad.exe &}
# ^^^^^^ This flag here is what you need to add
However it is curious that the exec call is behaving as you describe at all; the & at the end should make it effectively asynchronous anyway. Unless there's some sort of nasty interaction with how Windows is interpreting asynchronous subprocess creation in this case.