Pickle can't pickle _thread.lock objects - python-3.x

I'm trying to use pickle to save one of my objects but I face this error when trying to dump it:
TypeError: can't pickle _thread.lock objects
It is not clear to me, because I'm not using any locks inside my code. I tried to reproduce this error:
import threading
from time import sleep
import pickle
class some_class:
    def __init__(self):
        self.a = 1
        thr = threading.Thread(target=self.incr)
        self.lock = threading.Lock()
        thr.start()
    def incr(self):
        while True:
            # with self.lock:
            self.a += 1
            print(self.a)
            sleep(0.5)
if __name__ == "__main__":
    a = some_class()
    val = pickle.dumps(a, pickle.HIGHEST_PROTOCOL)
    print("pickle done!")
pickle_thread.py", line 22, in
    val = pickle.dumps(a, pickle.HIGHEST_PROTOCOL)
TypeError: can't pickle _thread.lock objects
If I define a thread lock inside my object, I can't pickle it, right?
I think the problem here is using threading.Lock, but is there any workaround for this?
Actually, in my main project I can't find any locks, but I use lots of modules that I can't trace. What should I look for?
Thanks.

You can try to customize the pickling method for this class by excluding unpicklable objects from the dictionary:
def __getstate__(self):
    state = self.__dict__.copy()
    del state['lock']
    return state
When unpickling, you can recreate missing objects manually, e.g.:
def __setstate__(self, state):
    self.__dict__.update(state)
    self.lock = threading.Lock() # ???
I don't know enough about the threading module to predict if this is gonna be sufficient.
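To make the idea concrete, here is a minimal round-trip sketch. It assumes a simplified stand-in class (Counter, a hypothetical name) without the background thread, and it adds a small helper, find_unpicklable (also hypothetical), that tries to pickle each instance attribute separately. That can help when the lock lives somewhere you did not create yourself.

```python
import pickle
import threading


class Counter:
    """Hypothetical stand-in for the question's class, without the thread."""

    def __init__(self):
        self.a = 1
        self.lock = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and drop the lock, which pickle rejects.
        state = self.__dict__.copy()
        del state["lock"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and create a fresh, unlocked lock.
        self.__dict__.update(state)
        self.lock = threading.Lock()


def find_unpicklable(obj):
    """Return names of instance attributes that fail to pickle on their own."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad


original = Counter()
restored = pickle.loads(pickle.dumps(original, pickle.HIGHEST_PROTOCOL))
print(restored.a)                  # the plain attribute survives the round trip
print(find_unpicklable(original))  # names of attributes pickle cannot handle
```

The restored object gets a new lock rather than the original one, so any locking state held at dump time is not preserved.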

Related

Is there a way to pickle a PriorityQueue?

This is Python 3.10. I use a PriorityQueue as a way to track Actors' turn order in my game. It's just a simple roguelike. I don't use the synchronization features of the PriorityQueue. My code:
import pickle
from queue import PriorityQueue
class GameEngine():
    def __init__(self):
        self.pqueue = PriorityQueue()
    def save_to_file(self):
        with open('save.pkl', 'wb') as file:
            pickle.dump(self, file, pickle.HIGHEST_PROTOCOL)
class Monster():
    pass
engine = GameEngine()
orc1 = Monster()
orc2 = Monster()
engine.pqueue.put((20,orc1))
engine.pqueue.put((10,orc2))
engine.save_to_file()
It raises TypeError: cannot pickle '_thread.lock' object. From what I understand, PriorityQueue is not picklable. I've read here that Queue.Queue has a picklable alternative in collections.deque if the synchronization stuff is not necessary. Is there such an alternative to PriorityQueue, or is there a way to pickle it anyway, other than implementing my own simplified version of PriorityQueue?
As you don't need the synchronisation features of PriorityQueue, just use the light-weight heapq module. It provides functions (not methods) to work on a plain list:
import pickle
from heapq import heappush, heappop
class GameEngine():
    def __init__(self):
        self.pqueue = []
    def save_to_file(self):
        with open('save.pkl', 'wb') as file:
            pickle.dump(self, file, pickle.HIGHEST_PROTOCOL)
class Monster():
    pass
engine = GameEngine()
orc1 = Monster()
orc2 = Monster()
heappush(engine.pqueue, (20,orc1))
heappush(engine.pqueue, (10,orc2))
engine.save_to_file()
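One thing to watch with plain (priority, actor) tuples: if two actors share a priority, heapq falls back to comparing the Monster objects themselves, which raises a TypeError unless Monster defines ordering. A rough sketch of one way around that, assuming a small wrapper class (TurnQueue is a hypothetical name) that adds an insertion counter as a tie-breaker and still pickles cleanly:

```python
import pickle
from heapq import heappush, heappop


class Monster:
    def __init__(self, name):
        self.name = name


class TurnQueue:
    """Hypothetical wrapper around a heapq-managed list."""

    def __init__(self):
        self._heap = []
        self._next_seq = 0  # plain int, so the whole object stays picklable

    def put(self, priority, actor):
        # The sequence number breaks ties, so actors are never compared.
        heappush(self._heap, (priority, self._next_seq, actor))
        self._next_seq += 1

    def get(self):
        priority, _, actor = heappop(self._heap)
        return priority, actor


queue = TurnQueue()
queue.put(20, Monster("orc1"))
queue.put(10, Monster("orc2"))
queue.put(10, Monster("orc3"))  # same priority as orc2

restored = pickle.loads(pickle.dumps(queue, pickle.HIGHEST_PROTOCOL))
order = [restored.get()[1].name for _ in range(3)]
print(order)
```

Actors with equal priority come out in the order they were inserted.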

Runtime error using concurrent.futures.ProcessPoolExecutor

I have seen many YouTube videos for basic tutorials for concurrent.futures.ProcessPoolExecutor. I have also seen posts in SO here and here, GitHub and GitHubMemory, yet no luck.
Problem:
I'm getting the following runtime error:
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.
    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:
        if __name__ == '__main__':
            freeze_support()
            ...
    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
I admit it, I do not fully understand this error since this is my very first attempt at multiprocessing in my python code.
Here's my pseudocode:
module.py:
import xyz
from multiprocessing import freeze_support
def abc():
    return x
def main():
    xyz
    qwerty
if __name__ == "__main__":
    freeze_support()
    obj = Object()
    main()
classObject.py:
import abcd
class Object(object):
    def __init__(self):
        asdf
        cvbn
        with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
            executor.map(self.function_for_multiprocess, var1, var2)
        # The error points at the code above.
    def function_for_multiprocess(var1, var2):
        doSomething1
        doSomething2
        self.variable = something
My class file (classObject.py) does not have the "main" guard.
Things I have tried:
Tried adding if __name__ == "__main__": and freeze_support in the classObject.py along with renaming __init__() to main()
While doing the above, removed the freeze_support from the module.py
I haven't found a different solution from the link provided above. Any insights would be greatly appreciated!
I'm using a MacBook Pro (16-inch, 2019), Processor 2.3 GHz 8-Core Intel Core i9, OS:Big Sur. I don't think that matters but just declaring it if it does.
You need to pass the arguments as a picklable object, such as a list or a tuple,
and you don't need freeze_support().
Just change executor.map(self.function_for_multiprocess, var1, var2)
to executor.map(self.function_for_multiprocess, (var1, var2)):
from multiprocessing import freeze_support
import concurrent.futures
class Object(object):
    def __init__(self, var1=1, var2=2):
        with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
            executor.map(self.function_for_multiprocess, (var1, var2))
    def function_for_multiprocess(var1, var2):
        print('var1:', var1)
        print('var2:', var2)
def abc(x):
    return x
def main():
    print('abc:', abc(200))
if __name__ == "__main__":
    #freeze_support()
    obj = Object()
    main()
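For reference, executor.map(fn, *iterables) calls fn once per position, taking one argument from each iterable, much like the built-in map. A minimal sketch of that behaviour, assuming a module-level worker (scale and run_jobs are hypothetical names, not from the question) so the function can be pickled by reference and sent to the child processes:

```python
import concurrent.futures


def scale(value, factor):
    """Module-level worker, so child processes can import it by name."""
    return value * factor


def run_jobs(values, factors):
    # executor.map calls scale(values[i], factors[i]) for each index i,
    # taking one positional argument from each iterable.
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(scale, values, factors))


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block when the
    # start method re-imports the main module (spawn is the default on macOS).
    print(run_jobs([1, 2, 3], [10, 10, 10]))
```

Wrapping the call in list() also matters: map returns a lazy iterator, and exceptions raised in a worker only surface when the results are consumed.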

Unable to access class attribute in another function

import rospy
from sensor_msgs.msg import Imu
class ImuData:
    def __init__(self):
        #self.data = None
        pass
    def get_observation(self):
        rospy.Subscriber('/imu', Imu, self.imu_callback)
        imuData = self.data
        print(imuData)
    def imu_callback(self, msg):
        self.data = msg.orientation
        print(self.data)
if __name__ == '__main__':
    rospy.init_node('gett_imu', anonymous=True)
    idd = ImuData()
    idd.get_observation()
In the above code, I would like to access self.data, which is set in imu_callback, from the get_observation function. The problem is that I get an error saying that ImuData has no attribute data.
How do I solve this issue?
Note: I feel that the question has to do with the python classes and not with Ros and rospy.
A couple of things are going on here. One, which was mentioned in the comments, is that you should be initializing your attributes inside __init__. The error you're seeing is partially because of Python and the fact that self.data has not actually been initialized yet when get_observation reads it.
The second issue is where you set up the subscriber. This should also be done in __init__, and only once. Sensors publish at a fairly constant rate, so it takes time to actually receive any data on the topic. Also, if you call get_observation more than once, you would create a new subscription each time, which you do not want.
Take the following code as a fixed example:
def __init__(self):
    # Define the attribute before subscribing, so a callback can never run first.
    self.data = None
    rospy.Subscriber('/imu', Imu, self.imu_callback)
def get_observation(self):
    imuData = self.data
    print(imuData)
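Because the first message takes a moment to arrive, get_observation can still see None right after start-up. A rough, ROS-free sketch of the same pattern, assuming a stand-in class (LatestValue is a hypothetical name) and a timer thread that plays the role of the publisher; it waits for the first callback before reading the value:

```python
import threading


class LatestValue:
    """Hypothetical, ROS-free stand-in for the ImuData class."""

    def __init__(self):
        self.data = None                    # defined before any callback runs
        self._received = threading.Event()  # set once the first message arrives

    def callback(self, msg):
        # Plays the role of imu_callback: store the newest message.
        self.data = msg
        self._received.set()

    def get_observation(self, timeout=1.0):
        # Wait up to `timeout` seconds for the first message, then return it.
        # Returns None if nothing arrived in time.
        self._received.wait(timeout)
        return self.data


holder = LatestValue()
# Simulate a publisher delivering one message from another thread after 0.1 s.
threading.Timer(0.1, holder.callback, args=("orientation sample",)).start()
print(holder.get_observation())
```

In a real node the subscriber callback would take the place of the timer; the point is only that the attribute exists from the start and the reader tolerates a delay.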

Using multiprocessing.Pool in Python with a function returning custom object

I am using multiprocessing.Pool to speed up computation, as I call one function multiple times, and then collate the result. Here is a snippet of my code:
import multiprocessing
from functools import partial
def Foo(id:int, constant_arg1:str, constant_arg2:str):
    custom_class_obj = CustomClass(constant_arg1, constant_arg2)
    custom_class_obj.run() # this changes some attributes of the custom_class_obj
    if(something):
        return None
    else:
        return [custom_class_obj]
def parallel_run(iters:int, a:str, b:str):
    pool = multiprocessing.Pool(processes=k)
    ## create the partial function obj before passing it to pool
    partial_func = partial(Foo, constant_arg1=a, constant_arg2=b)
    ## create the variable id list
    iter_list = list(range(iters))
    all_runs = pool.map(partial_func, iter_list)
    return all_runs
This throws the following error in the multiprocessing module:
multiprocessing.pool.MaybeEncodingError: Error sending result: '[[<CustomClass object at 0x1693c7070>], [<CustomClass object at 0x1693b88e0>], ....]'
Reason: 'TypeError("cannot pickle 'module' object")'
How can I resolve this?
I was able to replicate the error message with a minimal example of an un-picklable class. The error basically states that the instance of your class can't be pickled because it contains a reference to a module, and modules are not picklable. You need to comb through CustomClass to make sure instances don't hold things like open file handles, module references, etc. If you need to have those things, you should use __getstate__ and __setstate__ to customize the pickle and unpickle process.
A distilled example of your error:
from multiprocessing import Pool
from functools import partial
class klass:
    def __init__(self, a):
        self.value = a
        import os
        self.module = os #this fails: can't pickle a module and send it back to main process
def foo(a, b, c):
    return klass(a+b+c)
if __name__ == "__main__":
    with Pool() as p:
        a = 1
        b = 2
        bar = partial(foo, a, b)
        res = p.map(bar, range(10))
        print([r.value for r in res])
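If the instance really does need the module reference, one possible approach (a sketch under assumptions, not the only way) is to store the module's name in the pickled state and re-import it on the receiving side. Klass below is a hypothetical variant of the class above:

```python
import importlib
import pickle


class Klass:
    def __init__(self, a):
        self.value = a
        import os
        self.module = os  # a module reference, which pickle rejects as-is

    def __getstate__(self):
        # Replace the module object with its importable name.
        state = self.__dict__.copy()
        state["module"] = self.module.__name__
        return state

    def __setstate__(self, state):
        # Re-import the module by name when the object is unpickled.
        state = dict(state)
        state["module"] = importlib.import_module(state["module"])
        self.__dict__.update(state)


restored = pickle.loads(pickle.dumps(Klass(3)))
print(restored.value, restored.module.__name__)
```

The same hooks are used when multiprocessing sends results back to the parent, since it pickles return values under the hood.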

How to initialize python watchdog pattern matching event handler

I'm using the Python Watchdog to monitor a directory for new files being created. Several different types of files are created in said directory but I only need to monitor a single file type, hence I use the Watchdog PatternMatchingEventHandler, where I specify the pattern to monitor using the patterns keyword.
To correctly execute the code under the hood (not displayed here) I need to initialize an empty dataframe in my event-handler, and I am having trouble getting this to work. If I remove the __init__ in the code below, everything works just fine btw.
I used the code in this answer as inspiration for my own.
The code I have set up looks as follows:
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
import time
import pandas as pd
import numpy as np
from multiprocessing import Pool
class HandlerEQ54(PatternMatchingEventHandler):
    def __init__(self):
        #Initializing an empty dataframe for storage purposes.
        data_54 = pd.DataFrame(columns = ['Barcode','DUT','Step12','Step11','Np1','Np2','TimestampEQ54'])
        #Converting to INT for later purposes
        data_54[['Barcode','DUT']]=data_54[['Barcode','DUT']].astype(np.int64)
        self.data = data_54
    def on_created(self,event):
        if event.is_directory:
            return True
        elif event.event_type == 'created':
            #Take action here when a file is created.
            print('Found new files:')
            print(event.src_path)
            time.sleep(0.1)
            #Creating process pool to return data
            pool1 = Pool(processes=4)
            #Pass file to parsing function and return parsed result.
            result_54 = pool1.starmap(parse_eq54,[(event.src_path,self.data)])
            #returns the dataframe rather than the list of dataframes returned by starmap
            self.data = result_54[0]
            print('Data read: ')
            print(self.data)
def monitorEquipment(equipment):
    '''Uses the Watchdog package to monitor the data directory for new files.
    See the HandlerEQ54 and HandlerEQ51 classes in multiprocessing_handlers for actual monitoring code. Monitors each equipment.'''
    print('equipment')
    if equipment.upper() == 'EQ54':
        event_handler = HandlerEQ54(patterns=["*.log"])
        filepath = '/path/to/first/file/source/'
    # set up observer
    observer = Observer()
    observer.schedule(event_handler, path=filepath, recursive=True)
    observer.daemon=True
    observer.start()
    print('Observer started')
    # monitor
    try:
        while True:
            time.sleep(5)
    except KeyboardInterrupt:
        observer.unschedule_all()
        observer.stop()
    observer.join()
However, when I execute monitorEquipment I receive the following error message:
TypeError: __init__() got an unexpected keyword argument 'patterns'
Evidently I'm doing something wrong when initializing my handler class, but I'm drawing a blank as to what that is (which probably reflects my less-than-optimal understanding of classes). Can someone advise me on how to correctly initialize the empty dataframe in my HandlerEQ54 class, so that I don't get this error?
It looks like your __init__ method is missing the patterns argument. You'll also need a super() call to the __init__ method of the parent class (PatternMatchingEventHandler), so that the patterns argument is passed upwards.
It should look something like this:
class HandlerEQ54(PatternMatchingEventHandler):
    def __init__(self, patterns=None):
        super(HandlerEQ54, self).__init__(patterns=patterns)
        ...
event_handler = HandlerEQ54(patterns=["*.log"])
or, for a more generic case and to support all of PatternMatchingEventHandler's arguments:
class HandlerEQ54(PatternMatchingEventHandler):
    def __init__(self, *args, **kwargs):
        super(HandlerEQ54, self).__init__(*args, **kwargs)
        ...
event_handler = HandlerEQ54(patterns=["*.log"])
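To show where the handler's own state would go, here is a dependency-free sketch of the same pattern. PatternHandlerStandIn is a hypothetical stand-in for the parent class (it is not watchdog's actual class and only mimics a couple of keyword arguments), and a plain list replaces the DataFrame from the question:

```python
class PatternHandlerStandIn:
    """Hypothetical stand-in for the parent handler; NOT watchdog's class."""

    def __init__(self, patterns=None, ignore_directories=False):
        self.patterns = patterns
        self.ignore_directories = ignore_directories


class HandlerEQ54(PatternHandlerStandIn):
    def __init__(self, *args, **kwargs):
        # Let the parent consume its own arguments first...
        super().__init__(*args, **kwargs)
        # ...then set up this handler's own state. A list is used here
        # instead of the question's DataFrame to keep the sketch small.
        self.data = []


event_handler = HandlerEQ54(patterns=["*.log"])
print(event_handler.patterns, event_handler.data)
```

With the real parent class, the DataFrame initialization from the question would go in the same place, after the super().__init__ call.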

Resources