conda create an environment without yaml or text file [duplicate]


This question already has answers here:
Install everything from a `conda list` output
(2 answers)
Closed 1 year ago.
I deleted a conda environment by mistake, and I do not have a yaml/text file containing the list of libraries. However, I have the following data:
Name Version Build Channel
_tflow_select 2.1.0 gpu anaconda
absl-py 0.13.0 pypi_0 pypi
aiohttp 3.6.3 py37he774522_0 anaconda
alabaster 0.7.12 pypi_0 pypi
anyio 2.0.2 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
argon2-cffi 20.1.0 py37he774522_1 anaconda
ase 3.21.0 pypi_0 pypi
asgiref 3.3.1 pypi_0 pypi
astor 0.8.1 py37_0 anaconda
astroid 2.5.2 pypi_0 pypi
async-timeout 3.0.1 py37_0 anaconda
async_generator 1.10 py37h28b3542_0 anaconda
attrs 20.2.0 py_0 anaconda
azure-core 1.10.0 pypi_0 pypi
azure-eventhub 5.1.0 pypi_0 pypi
azure-storage-blob 12.6.0 pypi_0 pypi
Please note that the table above shows just the first few libraries present in the environment.
Can anyone suggest a way to recreate the environment from this information?
Please do not ask me to install the libraries one by one, as there are many of them. Also, please do not suggest that I create a .yml/.txt file by hand and then use conda/pip to install everything in one go, as putting everything into the correct format would take a lot of time.
Please let me know if those two are the only solutions to this problem.

Merv's comment to my question:
This looks like the output from the conda list. Is that what you have?
I made a script for converting to YAML - see
stackoverflow.com/a/65912328/570918
I referred to that link, and it helped me resolve my problem.
Thanks, Merv.
There is another workaround for this problem, but it is quite tedious, so I recommend that people facing similar issues follow Merv's answer.
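For readers who want to see the general idea, here is a minimal sketch of such a conversion. It is not Merv's script; it only assumes the usual four-column `conda list` layout (name, version, build, channel) with one package per line, and the function name `conda_list_to_yaml` and the default environment name are made up for illustration. Packages whose channel is `pypi` go into the `pip:` section, and everything else is pinned as a conda dependency without the build string.

```python
def conda_list_to_yaml(text, env_name="recovered-env"):
    """Turn pasted `conda list` output into the text of an environment.yml file.

    Hypothetical helper: assumes whitespace-separated columns
    name, version, build, channel, with one package per line.
    """
    conda_pkgs, pip_pkgs, channels = [], [], []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines and a bare "Name Version ..." header.
        if len(fields) < 2 or fields[0].startswith("#") or fields[:2] == ["Name", "Version"]:
            continue
        name, version = fields[0], fields[1]
        # Rows from the default channel often have an empty channel column.
        channel = fields[3] if len(fields) > 3 else "defaults"
        if channel == "pypi":
            pip_pkgs.append(f"{name}=={version}")
        else:
            conda_pkgs.append(f"{name}={version}")
            if channel not in channels:
                channels.append(channel)

    lines = [f"name: {env_name}", "channels:"]
    lines += [f"  - {c}" for c in (channels or ["defaults"])]
    lines.append("dependencies:")
    lines += [f"  - {p}" for p in conda_pkgs]
    if pip_pkgs:
        # pip itself must be a conda dependency for the pip: section to work.
        lines += ["  - pip", "  - pip:"]
        lines += [f"      - {p}" for p in pip_pkgs]
    return "\n".join(lines) + "\n"


sample = """\
# Name Version Build Channel
_tflow_select 2.1.0 gpu anaconda
absl-py 0.13.0 pypi_0 pypi
aiohttp 3.6.3 py37he774522_0 anaconda
"""
print(conda_list_to_yaml(sample))
```

The printed text can be saved as environment.yml and passed to conda env create -f environment.yml. Dropping the build strings is a deliberate choice in this sketch, since exact builds may no longer be available on the channel.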

Related

Is the Python package grpcio from conda-forge or PyPI built using BoringSSL?

I would like to be sure that the Python package grpcio (version >= 1.38.1) that I need to install uses BoringSSL and not OpenSSL. I looked at conda-forge, PyPI, and the gRPC site, but could not find this information. I found a blog post that mentions that BoringSSL is used, but without any reference. Any suggestions on how to find this information? Or will it use the SSL library available on the system?

Yes, BoringSSL is used for all grpcio pre-compiled wheels. However, not all of them enable assembly optimization for encryption: https://github.com/grpc/grpc/blob/master/doc/ssl-performance.md
Building with OpenSSL is an option that people need to explicitly opt into: https://github.com/grpc/grpc/blob/master/setup.py#L138

Impyla is returning values in bytes format

I'm trying to fetch data in JH with Impyla. Everything works fine, except that tables in one DB return data in b'' (bytes) format.
Code:
from impala.dbapi import connect
conn = connect(host=host, port=21050, user={userName}, use_ssl=True, auth_mechanism='GSSAPI', kerberos_service_name='impala', database=db)
cursor = conn.cursor()
cursor.execute(sql)
data = cursor.fetchall()
example output:
b'', b'UK', b'X', b'Hlavn\xc3\xad 51',
It happens on only one DB; the other DBs and tables that I have tested return proper UTF-8 strings (tested on 4 DBs). Also, not every column comes back as b''.
Packages:
impyla 0.17.0 pypi_0 pypi
bitarray 2.1.0 pypi_0 pypi
six 1.14.0 py_1 conda-forge
thrift 0.11.0 pypi_0 pypi
thrift-cpp 0.13.0 h62aa4f2_2 conda-forge
thrift-sasl 0.4.3 pypi_0 pypi
thriftpy 0.3.9 py37h516909a_1001 conda-forge
thriftpy2 0.4.14 py37h5e8e339_0 conda-forge
krb5 1.17.2 h926e7f8_0 conda-forge
However, if I run the same query directly from the server instead of from JH, the output is in the correct encoding, with no bytes.
Packages on server:
impyla 0.16.3 py37hc8dfbb8_0 conda-forge
bitarray 2.0.1 py37h5e8e339_0 conda-forge
thrift 0.13.0 py37hcd2ae1e_2 conda-forge
thrift_sasl 0.4.2 py37h8f50634_0 conda-forge
thriftpy 0.3.9 py37h516909a_1001 conda-forge
thriftpy2 0.4.14 py37h5e8e339_0 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
krb5 1.19.1 hcc1bbae_0 conda-forge
Any clues? :)
Thank you.
EDIT: 07. 06.
The values are bytes because the columns are varchar. String columns are returned as UTF-8 strings, but varchar and char columns are returned as bytes. It appears that this changed with a version upgrade, which matches the different behaviour I described between the server and JH (different versions). I would have solved this by downgrading, but the lower version returns "invalid query handle" when selecting a large number of rows :(
I'm adding this link, which describes the issue, a workaround, and future progress: https://github.com/cloudera/impyla/issues/455
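One client-side option, which is not necessarily the workaround described in the linked issue, is to decode any bytes values after fetching. The sketch below assumes the data is UTF-8 encoded, as the example output suggests; the helper names `decode_value` and `decode_rows` are made up for illustration and are not part of Impyla.

```python
def decode_value(value, encoding="utf-8"):
    """Decode bytes-like values to str; leave every other type untouched."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def decode_rows(rows, encoding="utf-8"):
    """Apply decode_value to every column of every fetched row."""
    return [tuple(decode_value(v, encoding) for v in row) for row in rows]


# Rows shaped like the example output above, plus non-bytes values.
rows = [(b"", b"UK", b"X", b"Hlavn\xc3\xad 51", 42, None)]
print(decode_rows(rows))
```

With a real cursor, the call would be decode_rows(cursor.fetchall()). Columns that are already str, numbers, or None pass through unchanged, so the same call works for the databases that do not show the problem.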

How to solve the famous `unhandled cuda error, NCCL version 2.7.8` error?

I've seen multiple issues about this error:
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1614378083779/work/torch/lib/c10d/ProcessGroupNCCL.cpp:825, unhandled cuda error, NCCL version 2.7.8
ncclUnhandledCudaError: Call to CUDA function failed.
but none seem to fix it for me:
https://github.com/pytorch/pytorch/issues/54550
https://github.com/pytorch/pytorch/issues/47885
https://github.com/pytorch/pytorch/issues/50921
https://github.com/pytorch/pytorch/issues/54823
I've tried calling torch.cuda.set_device(device) manually at the beginning of every script. That didn't seem to work for me. I've tried different GPUs. I've tried downgrading the PyTorch and CUDA versions: different combinations of 1.6.0, 1.7.1, 1.8.0 and CUDA 10.2, 11.0, 11.1. I am unsure what else to do. What did people do to solve this issue?
Possibly closely related:
Pytorch "NCCL error": unhandled system error, NCCL version 2.4.8"
More complete error message:
('jobid', 4852)
('slurm_jobid', -1)
('slurm_array_task_id', -1)
('condor_jobid', 4852)
('current_time', 'Mar25_16-27-35')
('tb_dir', PosixPath('/home/miranda9/data/logs/logs_Mar25_16-27-35_jobid_4852/tb'))
('gpu_name', 'GeForce GTX TITAN X')
('PID', '30688')
torch.cuda.device_count()=2
opts.world_size=2
ABOUT TO SPAWN WORKERS
done setting sharing strategy...next mp.spawn
INFO:root:Added key: store_based_barrier_key:1 to store for rank: 1
INFO:root:Added key: store_based_barrier_key:1 to store for rank: 0
rank=0
mp.current_process()=<SpawnProcess name='SpawnProcess-1' parent=30688 started>
os.getpid()=30704
setting up rank=0 (with world_size=2)
MASTER_ADDR='127.0.0.1'
59264
backend='nccl'
--> done setting up rank=0
setup process done for rank=0
Traceback (most recent call last):
File "/home/miranda9/ML4Coq/ml4coq-proj/embeddings_zoo/tree_nns/main_brando.py", line 279, in <module>
main_distributed()
File "/home/miranda9/ML4Coq/ml4coq-proj/embeddings_zoo/tree_nns/main_brando.py", line 188, in main_distributed
spawn_return = mp.spawn(fn=train, args=(opts,), nprocs=opts.world_size)
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/home/miranda9/ML4Coq/ml4coq-proj/embeddings_zoo/tree_nns/main_brando.py", line 212, in train
tactic_predictor = move_to_ddp(rank, opts, tactic_predictor)
File "/home/miranda9/ultimate-utils/ultimate-utils-project/uutils/torch/distributed.py", line 162, in move_to_ddp
model = DistributedDataParallel(model, find_unused_parameters=True, device_ids=[opts.gpu])
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 446, in __init__
self._sync_params_and_buffers(authoritative_rank=0)
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 457, in _sync_params_and_buffers
self._distributed_broadcast_coalesced(
File "/home/miranda9/miniconda3/envs/metalearning11.1/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1155, in _distributed_broadcast_coalesced
dist._broadcast_coalesced(
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1616554793803/work/torch/lib/c10d/ProcessGroupNCCL.cpp:825, unhandled cuda error, NCCL version 2.7.8
ncclUnhandledCudaError: Call to CUDA function failed.
Bonus 1:
I still have errors:
ncclSystemError: System call (socket, malloc, munmap, etc) failed.
Traceback (most recent call last):
File "/home/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1423, in <module>
main()
File "/home/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1365, in main
train(args=args)
File "/home/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1385, in train
args.opt = move_opt_to_cherry_opt_and_sync_params(args) if is_running_parallel(args.rank) else args.opt
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/distributed.py", line 456, in move_opt_to_cherry_opt_and_sync_params
args.opt = cherry.optim.Distributed(args.model.parameters(), opt=args.opt, sync=syn)
File "/home/miranda9/miniconda3/envs/meta_learning_a100/lib/python3.9/site-packages/cherry/optim.py", line 62, in __init__
self.sync_parameters()
File "/home/miranda9/miniconda3/envs/meta_learning_a100/lib/python3.9/site-packages/cherry/optim.py", line 78, in sync_parameters
dist.broadcast(p.data, src=root)
File "/home/miranda9/miniconda3/envs/meta_learning_a100/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 1090, in broadcast
work = default_pg.broadcast([tensor], opts)
RuntimeError: NCCL error in: ../torch/lib/c10d/ProcessGroupNCCL.cpp:911, unhandled system error, NCCL version 2.7.8
One of the answers suggested that the versions reported by nvcc and torch.version.cuda should match, but they do not:
(meta_learning_a100) [miranda9#hal-dgx ~]$ python -c "import torch;print(torch.version.cuda)"
11.1
(meta_learning_a100) [miranda9#hal-dgx ~]$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
How do I match them?

I had the right CUDA version installed, meaning that the version printed by
python -c "import torch;print(torch.version.cuda)"
was equal to the one printed by
nvcc -V
and
ldconfig -v | grep "libnccl.so" | tail -n1 | sed -r 's/^.*\.so\.//'
printed some version of NCCL (e.g., 2.10.3).
The fix was to remove NCCL:
sudo apt remove libnccl2 libnccl-dev
After that, the libnccl version check no longer printed any version, but DDP training worked fine!
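The comparison between torch.version.cuda and nvcc -V can also be scripted. The following is a rough sketch that only parses the text printed by nvcc; the helper names `parse_nvcc_release` and `cuda_versions_match` are hypothetical, and a match is not a guarantee that NCCL will work, as this answer itself shows.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract the 'release X.Y' version from the text printed by `nvcc -V`."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda_version, nvcc_output):
    """True if the CUDA version PyTorch was built with equals the nvcc release."""
    return (
        torch_cuda_version is not None
        and torch_cuda_version == parse_nvcc_release(nvcc_output)
    )


# Text copied from the question; in practice it could be obtained with
# subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout
nvcc_text = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.0, V11.0.221
"""
# "11.1" stands in for torch.version.cuda from the question.
print(parse_nvcc_release(nvcc_text), cuda_versions_match("11.1", nvcc_text))
```

For the values shown in the question (PyTorch built with CUDA 11.1, nvcc release 11.0) the check reports a mismatch.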

This is not a very satisfactory answer, but it is what ended up working for me. I simply used PyTorch 1.7.1 and its CUDA version 10.2. As long as CUDA 11.0 is loaded, it seems to work. To install that version, run:
conda install -y pytorch==1.7.1 torchvision torchaudio cudatoolkit=10.2 -c pytorch -c conda-forge
If you are on an HPC cluster, run module avail to make sure the right CUDA version is loaded. You may also need to source your bash configuration and other files for the submission job to work. My setup looks as follows:
#!/bin/bash
echo JOB STARTED
# a submission job is usually empty and has the root of the submission so you probably need your HOME env var
export HOME=/home/miranda9
# to have modules work and the conda command work
source /etc/bashrc
source /etc/profile
source /etc/profile.d/modules.sh
source ~/.bashrc
source ~/.bash_profile
conda activate metalearningpy1.7.1c10.2
#conda activate metalearning1.7.1c11.1
#conda activate metalearning11.1
#module load cuda-toolkit/10.2
module load cuda-toolkit/11.1
#nvidia-smi
nvcc --version
#conda list
hostname
echo $PATH
which python
# - run script
python -u ~/ML4Coq/ml4coq-proj/embeddings_zoo/tree_nns/main_brando.py
I also echo other useful things, like the nvcc version, to make sure the module load worked (note that the top of nvidia-smi doesn't show the right CUDA version).
Note that I think this is probably just a bug, since CUDA 11.1 + PyTorch 1.8.1 are new as of this writing. I did try
torch.cuda.set_device(opts.gpu) # https://github.com/pytorch/pytorch/issues/54550
but I can't say that it always works, or why it doesn't. I do have it in my current code, but I think I still get the error with PyTorch 1.8.x + CUDA 11.x.
See my conda list in case it helps:
$ conda list
# packages in environment at /home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
absl-py 0.12.0 py38h06a4308_0
aioconsole 0.3.1 pypi_0 pypi
aiohttp 3.7.4 py38h27cfd23_1
anatome 0.0.1 pypi_0 pypi
argcomplete 1.12.2 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
async-timeout 3.0.1 py38h06a4308_0
attrs 20.3.0 pyhd3eb1b0_0
beautifulsoup4 4.9.3 pyha847dfd_0
blas 1.0 mkl
blinker 1.4 py38h06a4308_0
boto 2.49.0 pypi_0 pypi
brotlipy 0.7.0 py38h27cfd23_1003
bzip2 1.0.8 h7b6447c_0
c-ares 1.17.1 h27cfd23_0
ca-certificates 2021.1.19 h06a4308_1
cachetools 4.2.1 pyhd3eb1b0_0
cairo 1.14.12 h8948797_3
certifi 2020.12.5 py38h06a4308_0
cffi 1.14.0 py38h2e261b9_0
chardet 3.0.4 py38h06a4308_1003
click 7.1.2 pyhd3eb1b0_0
cloudpickle 1.6.0 pypi_0 pypi
conda 4.9.2 py38h06a4308_0
conda-build 3.21.4 py38h06a4308_0
conda-package-handling 1.7.2 py38h03888b9_0
coverage 5.5 py38h27cfd23_2
crcmod 1.7 pypi_0 pypi
cryptography 3.4.7 py38hd23ed53_0
cudatoolkit 10.2.89 hfd86e86_1
cycler 0.10.0 py38_0
cython 0.29.22 py38h2531618_0
dbus 1.13.18 hb2f20db_0
decorator 5.0.3 pyhd3eb1b0_0
dgl-cuda10.2 0.6.0post1 py38_0 dglteam
dill 0.3.3 pyhd3eb1b0_0
expat 2.3.0 h2531618_2
fasteners 0.16 pypi_0 pypi
filelock 3.0.12 pyhd3eb1b0_1
flatbuffers 1.12 pypi_0 pypi
fontconfig 2.13.1 h6c09931_0
freetype 2.10.4 h7ca028e_0 conda-forge
fribidi 1.0.10 h7b6447c_0
future 0.18.2 pypi_0 pypi
gast 0.3.3 pypi_0 pypi
gcs-oauth2-boto-plugin 2.7 pypi_0 pypi
glib 2.63.1 h5a9c865_0
glob2 0.7 pyhd3eb1b0_0
google-apitools 0.5.31 pypi_0 pypi
google-auth 1.28.0 pyhd3eb1b0_0
google-auth-oauthlib 0.4.3 pyhd3eb1b0_0
google-pasta 0.2.0 pypi_0 pypi
google-reauth 0.1.1 pypi_0 pypi
graphite2 1.3.14 h23475e2_0
graphviz 2.40.1 h21bd128_2
grpcio 1.32.0 pypi_0 pypi
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb453b48_1
gsutil 4.60 pypi_0 pypi
gym 0.18.0 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
harfbuzz 1.8.8 hffaf4a1_0
higher 0.2.1 pypi_0 pypi
httplib2 0.19.0 pypi_0 pypi
icu 58.2 he6710b0_3
idna 2.10 pyhd3eb1b0_0
importlib-metadata 3.7.3 py38h06a4308_1
intel-openmp 2020.2 254
jinja2 2.11.3 pyhd3eb1b0_0
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9b h024ee3a_2
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.3.1 py38h2531618_0
lark-parser 0.6.5 pypi_0 pypi
lcms2 2.11 h396b838_0
ld_impl_linux-64 2.33.1 h53a641e_7
learn2learn 0.1.5 pypi_0 pypi
libarchive 3.4.2 h62408e4_0
libffi 3.2.1 hf484d3e_1007
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
liblief 0.10.1 he6710b0_0
libpng 1.6.37 h21135ba_2 conda-forge
libprotobuf 3.14.0 h8c45485_0
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.1.0 h2733197_1
libuuid 1.0.3 h1bed415_2
libuv 1.40.0 h7b6447c_0
libxcb 1.14 h7b6447c_0
libxml2 2.9.10 hb55368b_3
lmdb 0.94 pypi_0 pypi
lz4-c 1.9.2 he1b5a44_3 conda-forge
markdown 3.3.4 py38h06a4308_0
markupsafe 1.1.1 py38h7b6447c_0
matplotlib 3.3.4 py38h06a4308_0
matplotlib-base 3.3.4 py38h62a2d02_0
memory-profiler 0.58.0 pypi_0 pypi
mkl 2020.2 256
mkl-service 2.3.0 py38h1e0a361_2 conda-forge
mkl_fft 1.3.0 py38h54f3939_0
mkl_random 1.2.0 py38hc5bc63f_1 conda-forge
mock 2.0.0 pypi_0 pypi
monotonic 1.5 pypi_0 pypi
multidict 5.1.0 py38h27cfd23_2
ncurses 6.2 he6710b0_1
networkx 2.5 py_0
ninja 1.10.2 py38hff7bd54_0
numpy 1.19.2 py38h54aff64_0
numpy-base 1.19.2 py38hfa32c7d_0
oauth2client 4.1.3 pypi_0 pypi
oauthlib 3.1.0 py_0
olefile 0.46 pyh9f0ad1d_1 conda-forge
openssl 1.1.1k h27cfd23_0
opt-einsum 3.3.0 pypi_0 pypi
ordered-set 4.0.2 pypi_0 pypi
pandas 1.2.3 py38ha9443f7_0
pango 1.42.4 h049681c_0
patchelf 0.12 h2531618_1
pbr 5.5.1 pypi_0 pypi
pcre 8.44 he6710b0_0
pexpect 4.6.0 pypi_0 pypi
pillow 7.2.0 pypi_0 pypi
pip 21.0.1 py38h06a4308_0
pixman 0.40.0 h7b6447c_0
pkginfo 1.7.0 py38h06a4308_0
progressbar2 3.39.3 pypi_0 pypi
protobuf 3.14.0 py38h2531618_1
psutil 5.8.0 py38h27cfd23_1
ptyprocess 0.7.0 pypi_0 pypi
py-lief 0.10.1 py38h403a769_0
pyasn1 0.4.8 py_0
pyasn1-modules 0.2.8 py_0
pycapnp 1.0.0 pypi_0 pypi
pycosat 0.6.3 py38h7b6447c_1
pycparser 2.20 py_2
pyglet 1.5.0 pypi_0 pypi
pyjwt 1.7.1 py38_0
pyopenssl 20.0.1 pyhd3eb1b0_1
pyparsing 2.4.7 pyhd3eb1b0_0
pyqt 5.9.2 py38h05f1152_4
pysocks 1.7.1 py38h06a4308_0
python 3.8.2 hcf32534_0
python-dateutil 2.8.1 pyhd3eb1b0_0
python-libarchive-c 2.9 pyhd3eb1b0_0
python-utils 2.5.6 pypi_0 pypi
python_abi 3.8 1_cp38 conda-forge
pytorch 1.7.1 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
pytz 2021.1 pyhd3eb1b0_0
pyu2f 0.1.5 pypi_0 pypi
pyyaml 5.4.1 py38h27cfd23_1
qt 5.9.7 h5867ecd_1
readline 8.1 h27cfd23_0
requests 2.25.1 pyhd3eb1b0_0
requests-oauthlib 1.3.0 py_0
retry-decorator 1.1.1 pypi_0 pypi
ripgrep 12.1.1 0
rsa 4.7.2 pyhd3eb1b0_1
ruamel_yaml 0.15.100 py38h27cfd23_0
scikit-learn 0.24.1 py38ha9443f7_0
scipy 1.6.2 py38h91f5cce_0
setuptools 52.0.0 py38h06a4308_0
sexpdata 0.0.3 pypi_0 pypi
sip 4.19.13 py38he6710b0_0
six 1.15.0 pyh9f0ad1d_0 conda-forge
soupsieve 2.2.1 pyhd3eb1b0_0
sqlite 3.35.2 hdfb4753_0
tensorboard 2.4.0 pyhc547734_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.4.1 pypi_0 pypi
tensorflow-estimator 2.4.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
threadpoolctl 2.1.0 pyh5ca1d4c_0
tk 8.6.10 hbc83047_0
torchaudio 0.7.2 py38 pytorch
torchmeta 1.7.0 pypi_0 pypi
torchtext 0.8.1 py38 pytorch
torchvision 0.8.2 py38_cu102 pytorch
tornado 6.1 py38h27cfd23_0
tqdm 4.56.0 pypi_0 pypi
typing-extensions 3.7.4.3 0
typing_extensions 3.7.4.3 py_0 conda-forge
urllib3 1.26.4 pyhd3eb1b0_0
werkzeug 1.0.1 pyhd3eb1b0_0
wheel 0.36.2 pyhd3eb1b0_0
wrapt 1.12.1 pypi_0 pypi
xz 5.2.5 h7b6447c_0
yaml 0.2.5 h7b6447c_0
yarl 1.6.3 py38h27cfd23_0
zipp 3.4.1 pyhd3eb1b0_0
zlib 1.2.11 h7b6447c_3
zstd 1.4.5 h9ceee32_0
For a100s this seemed to work at some point:
pip3 install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html

You should get the answer at https://pytorch.org/get-started/locally/
For me it worked by setting this up:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

As discussed in the related question Pytorch "NCCL error": unhandled system error, NCCL version 2.4.8", the message unhandled cuda error, NCCL version ... means something is wrong on the NCCL side. Set the environment variable NCCL_DEBUG=INFO to make NCCL print its log, so you can figure out what exactly the problem is. (Tip: look for the first WARN line in the NCCL log.)
As for OP's problem, it's likely caused by a mismatch between the driver version, the system CUDA version, and the CUDA version PyTorch was compiled with. In that case, the NCCL log will show something like:
[5] transport/p2p.cc:238 NCCL WARN failed to open CUDA IPC handle : 36 API call is not supported in the installed CUDA driver
which clearly identifies the problem. That's why NCCL_DEBUG=INFO is needed when debugging unhandled cuda error.
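When the log is long, the tip about the first WARN line can be automated. This is a small sketch that assumes the NCCL output has been captured as text (for example by redirecting the job's output to a file); the function name `first_nccl_warning` is made up for illustration.

```python
def first_nccl_warning(log_text):
    """Return the first line containing 'NCCL WARN', or None if there is none."""
    for line in log_text.splitlines():
        if "NCCL WARN" in line:
            return line.strip()
    return None


# The WARN line is the example quoted above; the other lines are placeholders.
log = """some earlier NCCL INFO output
[5] transport/p2p.cc:238 NCCL WARN failed to open CUDA IPC handle : 36 API call is not supported in the installed CUDA driver
some later output
"""
print(first_nccl_warning(log))
```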
Update:
Q: How to set NCCL_DEBUG=INFO?
A: Option 1: prepend NCCL_DEBUG=INFO to the commandline. For example NCCL_DEBUG=INFO python yourscript.py.
Option 2: Set it in Python script. For example,
import os
os.environ["NCCL_DEBUG"] = "INFO"
Option 3: Set it in your shell. For example, export NCCL_DEBUG=INFO
Q: How to match the version of CUDA and Pytorch?
A: OP seems to be using CUDA 11.0. That's a bit tricky because PyTorch no longer offers prebuilt packages for CUDA 11.0. So you need to either use an old PyTorch prebuilt package (I think the last version with CUDA 11.0 is PyTorch 1.7.1) or update your system CUDA version. Alternatively, you can try to build PyTorch from source.
If you are OK with an old PyTorch:
conda create --name=tmp pytorch=1.7.1 cudatoolkit=11.0 -c pytorch -c nvidia

"Regionalization.ipynb is not trusted" issue with bw2regional

I managed to install bw2regional thanks to this post here,
but now when I try to import it in my notebook, I still get the error message:
ModuleNotFoundError: No module named 'bw2regional'
In my PowerShell Miniconda prompt, the following line appears at the same time:
1 - Regionalization.ipynb is not trusted
Did I not do the installation correctly?
Thanks in advance for any help.
Edit:
Here are the packages I installed in my environment:
(condashell) PS C:\Users\reim> conda list
# packages in environment at C:\Users\reim\Miniconda3\envs\condashell:
#
# Name Version Build Channel
[I omitted some for clarity's sake here]
brightway2 2.3 py_2 cmutel
brotlipy 0.7.0 py38h294d835_1001 conda-forge
bw2analyzer 0.9.4 py_1 cmutel
bw2calc 1.8.0 py_0 cmutel
bw2data 3.6.2 py_0 cmutel
bw2io 0.7.12 py_0 cmutel
bw2parameters 0.6.6 py_0 cmutel
bw2regional 0.5.2 py_0 cmutel
bw_migrations 0.1 py_0 cmutel
[I omitted some for clarity's sake here]

It seems like there are at least two things happening here.
1. You have Jupyter notebook installed in a different environment than bw2regional.
The reason I think this is that you cannot import bw2regional even though conda list shows it installed. You should be able to fix this by running conda install jupyter in the same environment in which you ran conda install bw2regional. If this sounds confusing, please read a bit about managing conda environments.
2. Your notebook is not trusted.
Not sure why this would be happening; it's hard to tell without knowing your OS, configuration, where you got the notebook, etc. But you can read about the Jupyter security policy, which may help.
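To check the first point from inside the notebook, a cell like the following can help. It is only a diagnostic sketch using the standard library; the helper name `module_available` is made up for illustration. It prints which Python interpreter the kernel is running and whether that interpreter can find bw2regional.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can locate the top-level module."""
    return importlib.util.find_spec(name) is not None


# Path of the interpreter behind this kernel; compare it with the
# environment path printed at the top of `conda list`.
print(sys.executable)
print(module_available("bw2regional"))
```

If the printed path does not point inside the environment where bw2regional was installed, the notebook is running from a different environment.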

Conda install of pytorch fails

I created an environment with conda and I want to install pytorch in it, but it doesn't work. After I get inside my environment with source activate env_name I tried this: conda install pytorch torchvision -c pytorch (I also tried it like this: conda install -c pytorch pytorch torchvision) but I am getting this error:
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ......
Solving package specifications: ......
Error: Could not find some dependencies for pytorch: mkl >=2018, cudatoolkit >=9.0,<9.1, blas * mkl, cudatoolkit >=10.0,<10.1, cudatoolkit >=9.2,<9.3, blas * openblas, cudnn 7.0.*, cudatoolkit 9.*
Did you mean one of these?
pytorch, pytorch-gpu, pytorch-cpu
Did you mean one of these?
cudatoolkit
You can search for this package on anaconda.org with
anaconda search -t conda cudatoolkit 9.*
(and similarly for the other packages)
Here are my installed packages:
backports 1.0 py34_0
backports.shutil-get-terminal-size 1.0.0 <pip>
decorator 4.0.11 py34_0
get_terminal_size 1.0.0 py34_0
ipython 4.2.0 py34_0
ipython-genutils 0.1.0 <pip>
ipython_genutils 0.1.0 py34_0
libgfortran 1.0 0
numpy 1.9.2 py34_0
openssl 1.0.2l 0
path.py 10.0 py34_0
pexpect 4.2.1 py34_0
pickleshare 0.7.4 py34_0
pip 9.0.1 py34_1
ptyprocess 0.5.1 py34_0
python 3.4.5 0
readline 6.2 2
scipy 0.16.0 np19py34_0
setuptools 27.2.0 py34_0
simplegeneric 0.8.1 py34_1
six 1.10.0 py34_0
sqlite 3.13.0 0
tk 8.5.18 0
traitlets 4.3.1 py34_0
wheel 0.29.0 py34_0
xz 5.2.3 0
zlib 1.2.11 0
What should I do? Thank you!

Pytorch's vision package (aka torchvision) was developed post-Python 3.4, and so only has versions supporting Python 2.7, 3.5-7. Please create a new environment with a later Python version. Note it is always better to include the packages you care about in the creation of the environment, e.g.,
conda create -n env_name -c pytorch torchvision
and Conda will figure the rest out. If you need to have a specific version of Python, you can include that as well (e.g., python=3.6).

Please try the following steps. They worked fine for me.
source activate env_name
conda install -c pytorch pytorch
Then open a Python shell and run:
import torch
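The last step can be made a little more informative than a bare import. The sketch below is an assumption-laden convenience, not part of the original answer: the function name `describe_torch_install` is made up, and it only reports what the current interpreter sees.

```python
import importlib


def describe_torch_install():
    """Report whether torch is importable and, if so, its version and CUDA build."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return "torch is not importable from this interpreter"
    return "torch {} (CUDA build: {}, GPU available: {})".format(
        torch.__version__, torch.version.cuda, torch.cuda.is_available()
    )


print(describe_torch_install())
```

A CUDA build of None indicates a CPU-only package.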

I can't give you a definite answer because you didn't provide information about the Python version or the platform you're using.
Go to the official PyTorch website and choose an installation method according to your platform, your Python version, and whether you need CUDA.
