Related
Context: I want to create attributes of an object class in parallel by distributing them in the available cores. This question was answered in this post here by using the python Multiprocessing Pool.
The MRE for my task is the following using Pyomo 6.4.1v:
from pyomo.environ import *
import os
import multiprocessing
from multiprocessing import Pool
from multiprocessing.managers import BaseManager, NamespaceProxy
import types
class ObjProxy(NamespaceProxy):
"""Returns a proxy instance for any user defined data-type. The proxy instance will have the namespace and
functions of the data-type (except private/protected callables/attributes). Furthermore, the proxy will be
pickable and can its state can be shared among different processes. """
def __getattr__(self, name):
result = super().__getattr__(name)
if isinstance(result, types.MethodType):
def wrapper(*args, **kwargs):
return self._callmethod(name, args, kwargs)
return wrapper
return result
#classmethod
def create(cls, *args, **kwargs):
# Register class
class_str = cls.__name__
BaseManager.register(class_str, cls, ObjProxy, exposed=tuple(dir(cls)))
# Start a manager process
manager = BaseManager()
manager.start()
# Create and return this proxy instance. Using this proxy allows sharing of state between processes.
inst = eval("manager.{}(*args, **kwargs)".format(class_str))
return inst
ConcreteModel.create = create
class A:
def __init__(self):
self.model = ConcreteModel.create()
def do_something(self, var):
if var == 'var1':
self.model.var1 = var
elif var == 'var2':
self.model.var2 = var
else:
print('other var.')
def do_something2(self, model, var_name, var_init):
model.add_component(var_name,var_init)
def init_var(self):
print('Sequentially')
self.do_something('var1')
self.do_something('test')
print(self.model.var1)
print(vars(self.model).keys())
# Trying to create the attributes in parallel
print('\nParallel')
self.__sets_list = [(self.model,'time',Set(initialize = [x for x in range(1,13)])),
(self.model,'customers',Set(initialize = ['c1','c2','c3'])),
(self.model,'finish_bulks',Set(initialize = ['b1','b2','b3','b4'])),
(self.model,'fermentation_types',Set(initialize = ['ft1','ft2','ft3','ft4'])),
(self.model,'fermenters',Set(initialize = ['f1','f2','f3'])),
(self.model,'ferm_plants',Set(initialize = ['fp1','fp2','fp3','fp4'])),
(self.model,'plants',Set(initialize = ['p1','p2','p3','p4','p5'])),
(self.model,'gran_plants',Set(initialize = ['gp1','gp2','gp3','gp4','gp4']))]
with Pool(7) as pool:
pool.starmap(self.do_something2,self.__sets_list)
self.model.time.pprint()
self.model.customers.pprint()
def main(): # The main part run from another file
obj = A()
obj.init_var()
# Call other methods to create other attributes and the solver step.
# The other methods are similar to do_something2() just changing the var_init to Var() and Constraint().
if __name__ == '__main__':
multiprocessing.set_start_method("spawn")
main = main()
Ouput
Sequentially
other var.
var1
dict_keys(['_tls', '_idset', '_token', '_id', '_manager', '_serializer', '_Client', '_owned_by_manager', '_authkey', '_close'])
Parallel
WARNING: Element gp4 already exists in Set gran_plants; no action taken
time : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 12 : {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
customers : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 3 : {'c1', 'c2', 'c3'}
I change the number of parallel processes for testing, but it raises different errors, and other times it runs without errors. This is confusing for me, and I did not figure out what is the problem behind it. I did not find another post that had a similar problem, but I saw some posts discussing that pickle does not handle large data. So, the errors that sometimes I gotcha are the following:
Error 1
Unserializable message: Traceback (most recent call last):
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/managers.py", line 300, in serve_client
send(msg)
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
SystemError: <method 'dump' of '_pickle.Pickler' objects> returned NULL without setting an error
Error 2
Unserializable message: Traceback (most recent call last):
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/managers.py", line 300, in serve_client
send(msg)
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
RuntimeError: dictionary changed size during iteration
Error 3
*** Reference count error detected: an attempt was made to deallocate the type 32727 (? ***
*** Reference count error detected: an attempt was made to deallocate the type 32727 (? ***
*** Reference count error detected: an attempt was made to deallocate the type 32727 (? ***
Unserializable message: Traceback (most recent call last):
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/managers.py", line 300, in serve_client
send(msg)
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
numpy.core._exceptions._ArrayMemoryError: <unprintble MemoryError object>
Error 4
Unserializable message: Traceback (most recent call last):
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/managers.py", line 300, in serve_client
send(msg)
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/.../anaconda3/envs/.../lib/python3.9/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'WeakSet.__init__.<locals>._remove'
So, there are different errors, and it looks like it is not stable. I hope that someone has had and solved this problem. Furthermore, if someone has implemented other strategies for this task, please, feel free to post your answer in this issue here
Tkx.
I am learning multithreading in python. can anyone pleae tell me why the thread does not start?
code:
import threading
import time
import logging
class Threads_2:
def __new__(cls):
"""
this will be invoked once the creation procedure of the object begins
"""
instance = super(Threads_2,cls).__new__(cls)
return instance
def __init__(self):
"""
this will be invoked once the initialisation procedure of the object begins
"""
#self.configLogging()
#threadinf.Thread.__init__(self)
#self.spawnThreads()
def spawnThreads(self):
if __name__ == "__main__":
thread1 = threading.Thread(target=self.backgroundTask, args=(10,))
thread1.start()
def backgroundTask(numOfLoops):
for i in numOfLoops:
print(2)
obj = Threads_2()
obj.spawnThreads()
error:
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\threading.py", line 954, in _bootstrap_inner
self.run()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\threading.py", line 892, in run
self._target(*self._args, **self._kwargs)
TypeError: backgroundTask() takes 1 positional argument but 2 were given
PS D:\python workspace>
The function backgroundTask is in the class Threads_2. So, it should be backgroundTask(self, numOfLoops) instead of backgroundTask(numOfLoops).
I am attempting to jit compile a python function, and use a optional argument to change the arguments of another function call.
I think where jit might be tripping up is that the default value of the optional argument is None, and jit doesn't know how to handle that, or at least doesn't know how to handle it when it changes to a numpy array. See below for a rough overview:
#jit(nopython=True)
def foo(otherFunc,arg1, optionalArg=None):
if optionalArg is not None:
out=otherFunc(arg1,optionalArg)
else:
out=otherFunc(arg1)
return out
Where optionalArg is either None, or a numpy array
One solution would be to turn this into three functions as shown below, but this feels kinda janky and I don't like it, especially because speed is very important for this task.
def foo(otherFunc,arg1,optionalArg=None):
if optionalArg is not None:
out=func1(otherFunc,arg1,optionalArg)
else:
out=func2(otherFunc,arg1)
return out
#jit(nopython=True)
def func1(otherFunc,arg1,optionalArg):
out=otherFunc(arg1,optionalArg)
return out
#jit(nopython=True)
def func2(otherFunc,arg1):
out=otherFunc(arg1)
return out
Note that other stuff is happening besides just calling otherFunc that makes using jit worth it, but I'm almost certain that is not where the problem is since this was working before without the optionalArg portion, so I have decided not to include it.
For those of you that are curious its runge-kutta order 4 implementation with optional extra parameters to pass to the differential equation. If you want to see the whole thing just ask.
The traceback is rather long but here is some of it:
inte.rk4(de2,y0,0.001,200,vals=np.ones(4))
Traceback (most recent call last):
File "<ipython-input-38-478197aa6a1a>", line 1, in <module>
inte.rk4(de2,y0,0.001,200,vals=np.ones(4))
File "C:\Users\Alex\Anaconda3\lib\site-packages\numba\dispatcher.py", line 350, in _compile_for_args
error_rewrite(e, 'typing')
File "C:\Users\Alex\Anaconda3\lib\site-packages\numba\dispatcher.py", line 317, in error_rewrite
reraise(type(e), e, None)
File "C:\Users\Alex\Anaconda3\lib\site-packages\numba\six.py", line 658, in reraise
raise value.with_traceback(tb)
TypingError: Internal error at <numba.typeinfer.CallConstraint object at 0x00000258E168C358>:
This continues...
inte.rk4 is the equiavlent of foo, de2 is otherFunc, y0, 0.001 and 200 are just values, that I swaped out for arg1 in my problem description above, and vals is optionalArg.
A similar thing happens when I try to run this with the vals parameter omitted:
ysExp=inte.rk4(deExp,y0,0.001,200)
Traceback (most recent call last):
File "<ipython-input-39-7dde4bcbdc2f>", line 1, in <module>
ysExp=inte.rk4(deExp,y0,0.001,200)
File "C:\Users\Alex\Anaconda3\lib\site-packages\numba\dispatcher.py", line 350, in _compile_for_args
error_rewrite(e, 'typing')
File "C:\Users\Alex\Anaconda3\lib\site-packages\numba\dispatcher.py", line 317, in error_rewrite
reraise(type(e), e, None)
File "C:\Users\Alex\Anaconda3\lib\site-packages\numba\six.py", line 658, in reraise
raise value.with_traceback(tb)
TypingError: Internal error at <numba.typeinfer.CallConstraint object at 0x00000258E048EA90>:
This continues...
If you see the documentation here, you can specify the optional type arguments explicitly in Numba. For example (this is the same example from documentation):
>>> #jit((optional(intp),))
... def f(x):
... return x is not None
...
>>> f(0)
True
>>> f(None)
False
Additionally, based on the conversation going on this Github issue you can use the following workaround to implement optional keyword. I have modified the code from the solution provided in the github issue to suit your example:
from numba import jitclass, int32, njit
from collections import OrderedDict
import numpy as np
np_arr = np.asarray([1,2])
spec = OrderedDict()
spec['x'] = int32
#jitclass(spec)
class Foo(object):
def __init__(self, x):
self.x = x
def otherFunc(self, optionalArg):
if optionalArg is None:
return self.x + 10
else:
return len(optionalArg)
#njit
def useOtherFunc(arg1, optArg):
foo = Foo(arg1)
print(foo.otherFunc(optArg))
arg1 = 5
useOtherFunc(arg1, np_arr) # Output: 2
useOtherFunc(arg1, None) # Output : 15
See this colab notebook for the example shown above.
I am trying to have a simple subclass of OrderedDict that gets created by a Pool then returned.
It seems that the pickling process when returning the created object to the pool tries to re-instantiate the object and fails due to the required additional argument in the __init__ function.
This is a minimal (non) working example:
from collections import OrderedDict
from multiprocessing import Pool
class Obj1(OrderedDict):
def __init__(self, x, *args, **kwargs):
super().__init__(*args, **kwargs)
self.x = x
def task(x):
obj1 = Obj1(x)
return obj1
if __name__ == '__main__':
with Pool(1) as pool:
for x in pool.imap_unordered(task, (1,2,3)):
print(x.x)
If I do this I get the following error.
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 463, in _handle_results
task = get()
File "/usr/lib/python3.6/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
TypeError: init() missing 1 required positional argument: 'x'
Again this fails when the task functions returns to the pool and I guess the object gets pickled?
If I changed OrderedDict by a simple dict it works flawlessly....
I have a workaround to use kwargs and retrieve the attribute of interest but I am stumped about the error to start with. Any ideas?
You can define __getstate__() and __setstate__() methods for your class.
In those functions you can make sure that x is handled as well. For example:
def __getstate__(self):
return self.x, self.items()
def __setstate__(self, state):
self.x = state[0]
self.update(state[1])
BTW, from CPython 3.6 there is no reason to use OrderedDict, since dictionary order is insertion order. This was originally an implementation detail in CPython. In Python 3.7 it was made part of the language.
Here is my code, I'm having issues trying to get this to run. I keep getting a failure when trying to execute:
I am referencing this function the same way I am trying to reference this one? I'm not sure what's going on?
#### PARSING RESULTS
Traceback (most recent call last):
File "masscanner.py", line 49, in <module>
main()
File "masscanner.py", line 43, in main
file = write_file(savefile)
NameError: name 'savefile' is not defined
def write_file(savefile):
print('\n\n########## WRITING FILE ##########\n')
fh = open("endpointslist", "w")
for i in savefile:
fh.write(i[0])
fh.write('\n')
def main():
""" Main program """
results = find_endpoints()
ipportset = parse_results(results)
fh = write_file(savefile)
pprint(ipportset)
return 0
if __name__ == "__main__":
main()
Possible that its just a typo. Shouldn't it be write_file(ipportset). In the context, the variable savefile comes from nowhere and is therefore giving you the error NameError: name 'savefile' is not defined