Using multiprocessing on Windows with multithreading in Python - python-3.x

I have a script that works like this: a list of elements and a function:
def fct(elm):
    # do work
After that I start the threads (3); at the end of every thread I print the name of the element, like this:
import threading
from queue import Queue

jobs = Queue()

def do_stuff(q):
    while not q.empty():
        value = q.get()
        fct(value)
        q.task_done()

for i in lines:
    jobs.put(i)

for i in range(3):
    worker = threading.Thread(target=do_stuff, args=(jobs,))
    worker.start()

jobs.join()
What I want to do is: whenever a thread is done (a file is saved), start another process that reads the file and applies another function fct2.
Note: I'm using Windows.
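One way to do this is to launch a multiprocessing.Process from the worker thread as soon as its file is saved. A minimal sketch, assuming fct saves a file and returns its path (fct2, the file naming, and the sample lines below are placeholders, not from the original script):

import threading
import multiprocessing
from queue import Queue

def fct(elm):
    # placeholder: do the real work, save a file, return its path
    path = "%s.txt" % elm
    with open(path, "w") as f:
        f.write(str(elm))
    return path

def fct2(path):
    # placeholder follow-up step, run in a separate process
    with open(path) as f:
        print("fct2 read:", f.read())

def do_stuff(q, procs):
    while not q.empty():
        value = q.get()
        path = fct(value)
        # launch fct2 in a new process as soon as this item's file is saved
        p = multiprocessing.Process(target=fct2, args=(path,))
        p.start()
        procs.append(p)
        q.task_done()

if __name__ == '__main__':  # this guard is required on Windows
    lines = ["a", "b", "c", "d"]
    jobs = Queue()
    for line in lines:
        jobs.put(line)
    procs = []
    for _ in range(3):
        threading.Thread(target=do_stuff, args=(jobs, procs)).start()
    jobs.join()
    for p in procs:
        p.join()

On Windows there is no fork: multiprocessing spawns a fresh interpreter that re-imports the script, so everything that starts threads or processes must sit under the if __name__ == '__main__' guard.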

Related

Queue and thread from file customize working threads

I am planning to write a Python script that reads URLs from a file and checks the status code of each URL using requests. To speed up the process my intention is to use multiple threads at the same time.
import threading
import queue

q = queue.Queue()

def CheckUrl():
    while True:
        project = q.get()
        # Do the URL checking here
        q.task_done()

threading.Thread(target=CheckUrl, daemon=True).start()

file = open("TextFile.txt", "r")
while True:
    next_line = file.readline()
    if not next_line:
        break
    q.put(next_line)
file.close()

print('project requests sent')
q.join()
print('projects completed')
My problem: right now the code reads all the text at once, making as many threads as there are lines in the text file, if I understand correctly. I would like to do something like read 20 lines at the same time, check the status codes of those 20 URLs, and move on to the next ones as soon as one or more checks are done.
Is there something like:

threading.Thread(target=CheckUrl, daemon=True, THREADSATSAMETIME=20).start()
Seems I have to stick with this one:

def threads_run():
    for i in range(20):  # create 20 threads
        threading.Thread(target=CheckUrl, daemon=True).start()

threads_run()
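There is no THREADSATSAMETIME argument, but a thread pool gives the same cap. A minimal sketch using the standard library's concurrent.futures (the file name and the 10-second timeout are assumptions):

import concurrent.futures
import requests

def check_url(url):
    # return (url, status code); network errors become (url, None)
    try:
        return url, requests.get(url, timeout=10).status_code
    except requests.RequestException:
        return url, None

with open("TextFile.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

# at most 20 checks run concurrently; as one finishes, the next line starts
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    for url, status in pool.map(check_url, urls):
        print(url, status)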

How to return data from a separate process (like queue) without closing it so you can send it more data later

The reason I am trying to do this is that Windows is terribly slow at opening and closing processes all the time, which is terribly inefficient and defeats the purpose of multiprocessing. What I want is to start, say, 10 processes that each perform a specific operation on some data and can return the data (with a queue or something) without closing, so you can send them more data to operate on and return. Basically a hive of processes holding functions that are always open, ready to process and return data, without dying on the return, so you don't have to keep opening new ones. I guess I could just run Linux (Debian CrunchBang), but I want to make it efficient on Windows as well.
import multiprocessing

def operator_dies(data, q):
    for i in range(2, data):
        if data % i == 0:
            return q.put(0)  # Function and process die here
    return q.put(1)  # And here

def operator_lives(data, q):
    for i in range(2, data):
        if data % i == 0:
            q.put(0)  # I want it to stay open, but send data back with q.put()
    q.put(1)  # Same

def initialize(data):
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=operator_dies, args=(data, q))
    p.daemon = True
    p.start()
    # multiprocessing.Process opens a new process each time.
    # Is there another multiprocessing function that can send new data as
    # arguments to the function of an already open process and reset its loop?
    # I cannot simply do q.put().processid from here to refresh
    # a variable for arguments in an open process, can I?

if __name__ == '__main__':
    initialize(9)
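The usual answer to this is a pool of long-lived workers that loop over a shared task queue, so each process handles many items before dying. A minimal sketch (the divisor check stands in for the real operation, and the sample numbers are made up); the standard library's multiprocessing.Pool packages the same idea:

import multiprocessing

def worker(tasks, results):
    # lives until it receives the None sentinel, handling many items
    for data in iter(tasks.get, None):
        # stand-in for the real operation: 0 if data has a divisor, else 1
        has_divisor = any(data % i == 0 for i in range(2, data))
        results.put((data, 0 if has_divisor else 1))

if __name__ == '__main__':
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    pool = [multiprocessing.Process(target=worker, args=(tasks, results))
            for _ in range(10)]
    for p in pool:
        p.start()
    numbers = [9, 11, 15, 17]
    for n in numbers:          # send more data any time; no new processes
        tasks.put(n)
    for _ in numbers:
        print(results.get())
    for p in pool:
        tasks.put(None)        # one sentinel per worker shuts the pool down
    for p in pool:
        p.join()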

How to change global variables when using parallel programming

I am using multiprocessing in my code to do some things in parallel. Actually, in a simplified version of my goal, I want to change some global variables from two different processes running in parallel.
At the end of the run, the result retrieved from the mp.Queue is correct, but the global variables are not changed.
Here is a simplified version of the code:
import multiprocessing as mp

a = 3
b = 5

# define an example function
def f(length, output):
    global a
    global b
    if length == 5:
        a = length + a
        output.put(a)
    if length == 3:
        b = length + b
        output.put(b)

if __name__ == '__main__':
    # Define an output queue
    output = mp.Queue()
    # Set up a list of processes that we want to run
    processes = []
    processes.append(mp.Process(target=f, args=(5, output)))
    processes.append(mp.Process(target=f, args=(3, output)))
    # Run processes
    for p in processes:
        p.start()
    # Exit the completed processes
    for p in processes:
        p.join()
    # Get process results from the output queue
    results = [output.get() for p in processes]
    print(results)
    print("a:", a)
    print("b:", b)
And below is the output:
[8, 8]
a: 3
b: 5
How can I apply the results of the processes to the global variables? Or how can I run this code with multiprocessing and get the same answer as an equivalent threaded version?
When you use threading, the two (or more) threads are created within the same process and share its memory (globals).
When you use multiprocessing, a whole new process is created and each one gets its own copy of the memory (globals).
You could look at multiprocessing's Value/Array or Manager to allow pseudo-globals, i.e. shared objects.
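A minimal sketch of the Value approach for the example above (passing the shared objects as arguments instead of relying on module-level globals):

import multiprocessing as mp

def f(length, a, b, output):
    # a and b live in shared memory, so changes are visible to all processes
    if length == 5:
        a.value = length + a.value
        output.put(a.value)
    if length == 3:
        b.value = length + b.value
        output.put(b.value)

if __name__ == '__main__':
    a = mp.Value('i', 3)   # shared integer, initial value 3
    b = mp.Value('i', 5)
    output = mp.Queue()
    processes = [mp.Process(target=f, args=(5, a, b, output)),
                 mp.Process(target=f, args=(3, a, b, output))]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print([output.get() for _ in processes])  # [8, 8]
    print("a:", a.value)                      # a: 8
    print("b:", b.value)                      # b: 8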

Multiprocessing - tkinter pipeline communication

I have a question on multiprocessing and tkinter. I am having some problems getting my process to run in parallel with the tkinter GUI. I have created a simple example to practice with and have been reading up to understand the basics of multiprocessing. However, when applying them to tkinter, only one process runs at a time. (Using Multiprocessing module for updating Tkinter GUI) Additionally, when I added the queue to communicate between processes (How to use multiprocessing queue in Python?), the process won't even start.
Goal:
I would like to have one process that counts down and puts the values in the queue, and one that updates tkinter after 1 second and shows me the values.
All advice is kindly appreciated
Kind regards,
S
EDIT: I want the data to be available when the after method is called. So the problem is not with the after function itself, but with the method being called by the after function. It takes 0.5 seconds to complete the calculation each time. Consequently the GUI is unresponsive for half a second, each second.
EDIT2: Corrections were made to the code based on the feedback, but the code is still not running.
import multiprocessing
import time
from queue import Empty
from tkinter import Tk, Label

class Countdown():
    """Countdown prior to changing the settings of the flows"""
    def __init__(self, q):
        self.master = Tk()
        self.label = Label(self.master, text="", width=10)
        self.label.pack()
        self.counting(q)

    def counting(self, q):
        try:
            self.i = q.get_nowait()
        except Empty:
            # nothing in the queue yet: check again in a second
            self.label.after(1000, self.counting, q)
            return
        if int(self.i) <= 0:
            print("Go")
            self.master.destroy()
        else:
            self.label.configure(text="%d" % self.i)
            print(self.i)
            self.label.after(1000, self.counting, q)

def printX(q):
    for i in range(10):
        print("test")
        q.put(9 - i)
        time.sleep(1)
    return

if __name__ == '__main__':
    q = multiprocessing.Queue()
    n = multiprocessing.Process(name='Process2', target=printX, args=(q,))
    n.start()
    GUI = Countdown(q)
    GUI.master.mainloop()
Multiprocessing does not work inside the interactive IPython notebook.
See Multiprocessing working in Python but not in iPython. As an alternative you can use Spyder.
No code will run after you call mainloop until the window has been destroyed. You need to start your other process before you call mainloop.
You are calling the after function incorrectly. The second argument must be the function to call, not a call to the function.
If you call it like

self.label.after(1000, self.counting(q))

it will call counting(q) immediately and try to use its return value as the function to schedule.
To assign a function with arguments the syntax is
self.label.after(1000, self.counting, q)
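An equivalent form wraps the call in a lambda, so the callable is still only invoked when the timer fires:

self.label.after(1000, lambda: self.counting(q))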
Also, start your second process before you create the window and call counting:
n = multiprocessing.Process(name='Process2', target=printX, args = (q,))
n.start()
GUI = Countdown(q)
GUI.master.mainloop()
Also, you only need to call mainloop once. Either position you have works, but you need just one.
Edit: Also, you need to put 9-i in the queue to make it count down:

q.put(9-i)

inside the printX function.

Is this a Python 3 regression in IPython Notebook?

I am attempting to create some simple asynchronously-executing animations based on ipythonblocks, and I am trying to update the cell output area using clear_output() followed by grid.show().
For text output, the basis of the technique is discussed in Per-cell output for threaded IPython Notebooks, so my simplistic assumption was to use the same method to isolate HTML output. Since I want to repeatedly replace a grid with its updated HTML version, I use clear_output() to ensure that only one copy of the grid is displayed.
I verified that this proposed technique works for textual output with the following cells. First, the context manager:
import sys
from contextlib import contextmanager
import threading

stdout_lock = threading.Lock()
n = 0

@contextmanager
def set_stdout_parent(parent):
    """a context manager for setting a particular parent for sys.stdout
    (i.e. redirecting output to a specific cell). The parent determines
    the destination cell of output
    """
    global n
    save_parent = sys.stdout.parent_header
    # we need a lock, so that other threads don't snatch control
    # while we have set a temporary parent
    with stdout_lock:
        sys.stdout.parent_header = parent
        try:
            yield
        finally:
            # the flush is important, because that's when the parent_header actually has its effect
            n += 1; print("Flushing", n)
            sys.stdout.flush()
            sys.stdout.parent_header = save_parent
Then the test code:
import threading
import time

class timedThread(threading.Thread):
    def run(self):
        # record the parent (including the stdout cell) when the thread starts
        thread_parent = sys.stdout.parent_header
        for i in range(3):
            time.sleep(2)
            # then ensure that the parent is the same as when the thread started
            # every time we print
            with set_stdout_parent(thread_parent):
                print(i)

timedThread().start()
This provided the output:
0
Flushing 1
1
Flushing 2
2
Flushing 3
So I modified the code to clear the cell between cycles.
import IPython.core.display

class clearingTimedThread(threading.Thread):
    def run(self):
        # record the parent (including the stdout cell) when the thread starts
        thread_parent = sys.stdout.parent_header
        for i in range(3):
            time.sleep(2)
            # then ensure that the parent is the same as when the thread started
            # every time we print
            with set_stdout_parent(thread_parent):
                IPython.core.display.clear_output()
                print(i)

clearingTimedThread().start()
As expected, the output area of the cell was repeatedly cleared, and ended up reading:
2
Flushing 6
I therefore thought I was on safe ground in using the same technique to clear a cell's output area when using ipythonblocks. Alas no. This code
from ipythonblocks import BlockGrid

w = 10
h = 10

class clearingBlockThread(threading.Thread):
    def run(self):
        grid = BlockGrid(w, h)
        # record the parent (including the stdout cell) when the thread starts
        thread_parent = sys.stdout.parent_header
        for i in range(10):
            # then ensure that the parent is the same as when the thread started
            # every time we print
            with set_stdout_parent(thread_parent):
                block = grid[i, i]
                block.green = 255
                IPython.core.display.clear_output(other=True)
                grid.show()
                time.sleep(0.2)

clearingBlockThread().start()
does indeed produce the desired end state (a black matrix with a green diagonal), but the intermediate steps don't appear in the cell's output area. To complicate things slightly, this example is running on Python 3. In checking before posting here, I discovered that the expected behavior (a simple animation) does in fact occur under Python 2.7. Hence I thought to ask whether this is an issue I need to report.
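As an aside, not from the original post: in current IPython the usual way to animate a cell is clear_output(wait=True), which defers the clear until the next output arrives and so avoids dropped intermediate frames. A minimal sketch:

import time
from IPython.display import clear_output, display

for i in range(10):
    clear_output(wait=True)    # clear only once the replacement is ready
    display("frame %d" % i)    # with ipythonblocks, grid.show() would go here
    time.sleep(0.2)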
