Cupy get error in multithread.pool if GPU already used

I tried to use cupy in two parts of my program, one of them being parallelized with a pool, and got an error.
I managed to reproduce the error with a simple example:
import cupy
import numpy as np
from multiprocessing import pool
def f(x):
    return cupy.asnumpy(2*cupy.array(x))
input = np.array([1,2,3,4])
print(cupy.asnumpy(cupy.array(input)))
print(np.array(list(map(f, input))))
p = pool.Pool(4)
output = p.map(f, input)
p.close()
p.join()
print(output)
The output is the following:
[1 2 3 4]
[2 4 6 8]
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 489, in _handle_results
task = get()
File "/usr/lib/python3.6/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
File "cupy/cuda/runtime.pyx", line 126, in cupy.cuda.runtime.CUDARuntimeError.__init__
TypeError: an integer is required
Also, the code freezes and doesn't exit, but I think that is not related to cupy.
My configuration is:
CuPy Version : 5.2.0
CUDA Root : /usr/local/cuda-10.0
CUDA Build Version : 10000
CUDA Driver Version : 10000
CUDA Runtime Version : 10000
cuDNN Build Version : 7301
cuDNN Version : 7301
NCCL Build Version : 2307

This issue is not specific to CuPy. Due to a limitation of CUDA, a process cannot be forked after CUDA has been initialized in it.
You need to use multiprocessing.set_start_method('spawn') (or 'forkserver'), or avoid initializing CUDA (i.e., do not call any CuPy API other than import cupy) until you have forked the child processes.
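The second option (forking the workers before CUDA is initialized in the parent) can be sketched as follows. This is a minimal sketch, not a definitive fix: it assumes a POSIX system where the fork start method is available, and the CPU fallback used when CuPy is not installed is an assumption added only so the control flow can be tried anywhere. The key point is the ordering: the pool is created before the parent makes its first CuPy call.

```python
import multiprocessing

try:
    import cupy  # per the answer, importing alone does not initialize CUDA
except ImportError:
    cupy = None  # assumption for illustration: run on the CPU if CuPy is missing


def f(x):
    """Double x on the GPU when CuPy is available, otherwise in plain Python."""
    if cupy is None:
        return 2 * x
    return cupy.asnumpy(2 * cupy.array(x))


def main():
    data = [1, 2, 3, 4]
    # Create the pool first: with the fork start method the workers are
    # forked here, before this process has made any CUDA call.
    ctx = multiprocessing.get_context("fork")  # POSIX only
    with ctx.Pool(4) as p:
        # Only now touch the GPU in the parent process.
        if cupy is not None:
            print(cupy.asnumpy(cupy.array(data)))
        output = p.map(f, data)
    return [int(v) for v in output]


if __name__ == "__main__":
    print(main())
```

Workers that the pool starts later (for example to replace one that died) would again be forked after CUDA initialization, so the spawn start method remains the more robust choice when the parent must use the GPU first.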

When I tried multiprocessing with cupy before, I needed to use the spawn context.
import multiprocessing
ctx = multiprocessing.get_context('spawn')
pool = ctx.Pool(4)
I don't know whether this resolves your problem, but can you try it?

Related

CUDA error on WSL2 using pytorch with multiprocessing

I have a Python script as shown below:
import torch
from torch.multiprocessing import set_start_method, Pipe, Process
def func(conn):
    data = conn.recv()
    print(data)
if __name__ == "__main__":
    set_start_method('spawn')
    a, b = Pipe()
    data = torch.tensor([1, 2, 3], device='cuda')
    proc = Process(target=func, args=(data,))
    proc.start()
    b.send(data)
    proc.join()
When I run this script on WSL2, it fails with:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 121, in rebuild_cuda_tensor
storage = storage_cls._new_shared_cuda(
File "/home/zxc/anaconda3/envs/airctrl/lib/python3.8/site-packages/torch/storage.py", line 807, in _new_shared_cuda
return torch.UntypedStorage._new_shared_cuda(*args, **kwargs)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: invalid device ordinal (function uncheckedSetDevice)
My environment is:
OS: WSL2 Ubuntu 22.04
CUDA: 11.7
Python: 3.8
PyTorch: 1.13.0+cu117
Any idea how to solve this issue?
Thanks.
I've also run this script on Ubuntu 22.04 without WSL2, and it works fine.

Try moving the data to the GPU inside your function instead of creating it on the GPU directly. That worked for me.

Getting this ONNXRuntimeError again and again. Please suggest

I am trying to run this code to remove the background from my image, but I keep getting an error. Please tell me what I am doing wrong.
from rembg.bg import remove
import numpy as np
import io
from PIL import Image
input_path = 'crop.jpeg'
output_path = 'out.png'
f = np.fromfile(input_path)
result = remove(f)
img = Image.open(io.BytesIO(result)).convert("RGBA")
img.save(output_path)
C:\Sauhard\Internships\TEST IMAGES>python -u "c:\Sauhard\Internships\TEST IMAGES\a.py"
Traceback (most recent call last):
File "c:\Sauhard\Internships\TEST IMAGES\a.py", line 10, in <module>
result = remove(f)
File "C:\Users\Sauhard Saini\AppData\Local\Programs\Python\Python39\lib\site-packages\rembg\bg.py", line 133, in remove
session = new_session("u2net")
File "C:\Users\Sauhard Saini\AppData\Local\Programs\Python\Python39\lib\site-packages\rembg\session_factory.py", line 60, in new_session
ort.InferenceSession(
File "C:\Users\Sauhard Saini\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Users\Sauhard Saini\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 395, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: D:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1029 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\Sauhard Saini\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\capi\onnxruntime_providers_tensorrt.dll"

This error is related to the generic graphics card driver provided with the Windows installation and happens mainly with AMD/ATI-based solutions.
Download and install the latest available version of the graphics card driver for your hardware from www.amd.com.
Another option is to install the "Visual C++ Redistributable Packages for Visual Studio".

changing rounding mode via importing libm in python 3

My environment: Ubuntu 18.04, Anaconda, Python 3.6
I am using the following code to load libm in Python via ctypes in order to change the floating-point environment, such as the rounding mode.
import numpy as np
import ctypes
FE_TONEAREST = 0x0000
FE_DOWNWARD = 0x0400
FE_UPWARD = 0x0800
FE_TOWARDZERO = 0x0c00
#libm = ctypes.CDLL("libm.so", ctypes.RTLD_GLOBAL)
libm = ctypes.cdll.LoadLibrary(r'/usr/lib/x86_64-linux-gnu/libm.so')
v = 1. / (1<<23)
print( repr(np.float32(1+v) - np.float32(v/2))) # prints 1.0
#change mode
libm.fesetround(FE_UPWARD)
print( repr(np.float32(1+v) - np.float32(v/2))) # prints 1.0000002
However, I get the following error:
Traceback (most recent call last):
File "mode.py", line 10, in <module>
libm = ctypes.cdll.LoadLibrary(r'/usr/lib/x86_64-linux-gnu/libm.so')
File "/anaconda/envs/phat/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
return self._dlltype(name)
File "/anaconda/envs/phat/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/x86_64-linux-gnu/libm.so: invalid ELF header
libm is the default library that comes with Ubuntu 18.04.
Could you please advise what the best way to load the library is?
Thank you.

I changed the library path
from
/usr/lib/x86_64-linux-gnu/libm.so
to
/lib/x86_64-linux-gnu/libm.so.6
and it worked.
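The path change most likely works because on Ubuntu the file /usr/lib/x86_64-linux-gnu/libm.so is a linker script (plain text) rather than a shared object, which is why the loader reports an invalid ELF header, whereas libm.so.6 is the actual library. A minimal sketch that avoids hard-coding either path by asking ctypes.util.find_library for the math library; the helper name load_libm is hypothetical, and the FE_UPWARD value is the x86 constant from the question, so on other CPU architectures fesetround may simply reject it.

```python
import ctypes
import ctypes.util


def load_libm():
    """Load the C math library without a distribution-specific path."""
    # find_library("m") returns a loadable name such as "libm.so.6" on glibc
    # systems. If it returns None, CDLL(None) falls back to the symbols that
    # are already loaded into the current process (POSIX only).
    return ctypes.CDLL(ctypes.util.find_library("m"))


libm = load_libm()
libm.fegetround.restype = ctypes.c_int
libm.fesetround.argtypes = [ctypes.c_int]
libm.fesetround.restype = ctypes.c_int

FE_UPWARD = 0x0800  # x86/x86-64 value, as in the question; other CPUs differ

saved_mode = libm.fegetround()       # remember the current rounding mode
status = libm.fesetround(FE_UPWARD)  # 0 means the mode was accepted
# ... computations that should round upward would go here ...
libm.fesetround(saved_mode)          # restore the original rounding mode
print("fesetround returned", status)
```

Checking the return value of fesetround and restoring the saved mode keeps the rest of the program unaffected if the requested mode is not supported.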

Unable to run pathos program from spyder IDE

I have the following simple program:
from pathos.core import connect
tunnel = connect('192.168.1.5', port=50004)
print(tunnel)
print(type(tunnel._lport))
print(tunnel._rport)
def sleepy_squared(x):
    from time import sleep
    sleep(1.0)
    return x**2
from pathos.pp import ParallelPythonPool as Pool
p = Pool(8, servers=('192.168.1.5:6260',))
print(p.servers)
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = p.map(sleepy_squared, x)
print(y)
When I try running this program from the Spyder 4 IDE I get the following error:
Tunnel('-q -N -L 4761:192.168.1.5:50004 192.168.1.5')
<class 'int'>
50004
('192.168.1.5:6260',)
Traceback (most recent call last):
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3319, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-1-e89974d31563>", line 20, in <module>
y = p.map(sleepy_squared, x)
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pathos/parallel.py", line 234, in map
return list(self.imap(f, *args))
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pathos/parallel.py", line 247, in imap
return (subproc() for subproc in list(builtins.map(submit, *args)))
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pathos/parallel.py", line 243, in submit
return _pool.submit(f, argz, globals=globals())
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pp/_pp.py", line 499, in submit
sfunc = self.__dumpsfunc((func, ) + depfuncs, modules)
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pp/_pp.py", line 683, in __dumpsfunc
sources = [self.__get_source(func) for func in funcs]
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pp/_pp.py", line 683, in <listcomp>
sources = [self.__get_source(func) for func in funcs]
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/pp/_pp.py", line 750, in __get_source
self.__sourcesHM[hashf] = importable(func)
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/dill/source.py", line 957, in importable
src = _closuredimport(obj, alias=alias, builtin=builtin)
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/dill/source.py", line 876, in _closuredimport
src = getimport(func, alias=alias, builtin=builtin)
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/dill/source.py", line 764, in getimport
return _getimport(head, tail, alias, verify, builtin)
File "/home/mahmoud/anaconda3/envs/trade_fxcm/lib/python3.6/site-packages/dill/source.py", line 713, in _getimport
try: exec(_str) #XXX: check if == obj? (name collision)
File "<string>", line 1
from __main__'> import sleepy_squared
^
SyntaxError: EOL while scanning string literal
When I run this program from the terminal with the command python test_connect.py, it works fine. My question is: why doesn't the program run in the Spyder 4 IDE, and how can I make it run there?

I'm the pathos author. Spyder, Jupyter, and other IDEs add an additional execution layer on top of the interpreter, and in some cases even wrap the execution in a closure to add additional hooks into the rest of the IDE. You are using a ParallelPool, which uses ppft, which uses dill.source to "serialize" by extracting the source code of an object and its dependencies. Since the IDE is adding a closure layer, dill.source has to try to serialize that as well, and it's not successful -- so in short, it's a compatibility issue between dill.source and Spyder.
If you pick one of the other pathos pools, it may succeed. The ProcessPool is essentially the same as the ParallelPool, but serializes by object instead of by source code -- it uses multiprocess, which uses dill. Then there's ThreadPool, which is probably the most likely to succeed, unless Spyder also messes with the main thread -- which most IDEs do.
So, what can you do about it? The easy thing is to not run parallel code from the IDE. Essentially, write your code in the IDE, and then swap out the Pool, and it should run in parallel. IDEs don't generally play well with parallel computing.
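A minimal sketch of the pool swap suggested above, assuming pathos is installed and that only the pool class changes while the map call stays the same. The fallback to the standard library's thread pool is not part of the answer; it is an assumption added only so the sketch still runs where pathos is missing. The sleep is shortened from the question's 1.0 s, and the remote servers from the question are left out.

```python
from time import sleep

try:
    # pathos pools share the same map() interface, so only this import and
    # constructor change when swapping pool types.
    from pathos.pools import ThreadPool  # or: from pathos.pools import ProcessPool
    pool = ThreadPool(nodes=4)
except ImportError:
    # Assumption for illustration only: use the standard library's thread
    # pool when pathos is not installed.
    from multiprocessing.pool import ThreadPool
    pool = ThreadPool(4)


def sleepy_squared(x):
    sleep(0.1)  # stand-in for slow work
    return x ** 2


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = pool.map(sleepy_squared, x)
pool.close()
pool.join()
print(y)  # [1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because a thread pool never serializes the function, it sidesteps the dill.source step that fails in the traceback, at the cost of running everything in one local process.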

Error using Scapy

I am using Python 2.5 and Scapy 2.2.0. When I execute the following code:
from scapy.all import *
a = IP(dst='10.100.95.184')
a.src = "10.100.95.22"
ab = a/ICMP()
sendp(ab)
I get the following error:
WARNING: No route found for IPv6 destination :: (no default route?)
Traceback (most recent call last):
File "C:\Python25\att.py", line 6, in <module>
sendp(ab)
File "C:\Python25\Lib\site-packages\scapy\sendrecv.py", line 259, in sendp
__gen_send(conf.L2socket(iface=iface, *args, **kargs), x, inter=inter, loop=loop, count=count, verbose=verbose, realtime=realtime)
File "C:\Python25\Lib\site-packages\scapy\sendrecv.py", line 237, in __gen_send
os.write(1,".")
OSError: [Errno 9] Bad file descriptor
Any idea how I can correct this?

I had a similar problem (not exactly this error message), and it doesn't look like a problem in your code. I fixed my case by reinstalling the scapy package. Have you tried that? Also try upgrading to the next Python version.
Good luck!
