I am trying to run some code on Google Colab TPU.
I am installing pytorch-xla using the following lines of code:
!pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl
When I try to import torch_xla, I get the following error:
ImportError: /usr/local/lib/python3.7/dist-packages/_XLAC.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIN3c107complexIfEEEEPT_v
What is the reason for this error and what is the solution?
This issue occurs even with the example notebooks linked in the pytorch-xla GitHub repo README, yet I got no error yesterday running the same notebooks!
EDIT:
When I run the following code:
!curl https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py
!python pytorch-xla-env-setup.py --version 1.9
import os
os.environ['LD_LIBRARY_PATH']='/usr/local/lib'
!echo $LD_LIBRARY_PATH
!sudo ln -s /usr/local/lib/libmkl_intel_lp64.so /usr/local/lib/libmkl_intel_lp64.so.1
!sudo ln -s /usr/local/lib/libmkl_intel_thread.so /usr/local/lib/libmkl_intel_thread.so.1
!sudo ln -s /usr/local/lib/libmkl_core.so /usr/local/lib/libmkl_core.so.1
!ldconfig
!ldd /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch.so
It uninstalls torch-1.9.0+cu102 and installs torch-1.10.0a0+git88c0ea9.
When importing torch, I get this output:
/usr/local/lib/python3.7/dist-packages/torch/package/_directory_reader.py:17: UserWarning: Failed to initialize NumPy: module compiled against API version 0xe but this version of numpy is 0xd (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:68.)
_dtype_to_storage = {data_type(0).dtype: data_type for data_type in _storages}
What is the reason behind this behavior?
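On the NumPy warning: the two hex values are NumPy C API version numbers. The message says the installed torch build was compiled against API version 0xe, while the NumPy that is actually installed only provides 0xd, i.e. NumPy is older than what torch expects. A small sketch for decoding such a message (the helper name `parse_numpy_api_warning` is made up for illustration, it is not part of NumPy or torch):

```python
import re

# Matches the two hex API versions in the warning text.
_PATTERN = re.compile(
    r"compiled against API version (0x[0-9a-fA-F]+) "
    r"but this version of numpy is (0x[0-9a-fA-F]+)"
)

def parse_numpy_api_warning(message):
    """Return (compiled_against, installed) as integers, or None if no match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)

msg = ("Failed to initialize NumPy: module compiled against API version 0xe "
       "but this version of numpy is 0xd")
compiled, installed = parse_numpy_api_warning(msg)
if installed < compiled:
    print("Installed NumPy is older than the one torch was built against.")
```

If that is the case, upgrading NumPy and restarting the runtime is the usual thing to try.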
%pip install cloud-tpu-client==0.10 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl
%pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchtext==0.10.0 -f https://download.pytorch.org/whl/cu111/torch_stable.html
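The undefined symbol in _XLAC usually means the torch_xla wheel was built against a different torch release than the one installed, which is why the commands above pin both to 1.9. A rough sketch of that check, assuming the two packages should share the same major.minor release (`release_prefix` and `same_release` are hypothetical helpers, not part of either library):

```python
def release_prefix(version):
    """Return (major, minor) from strings like '1.9.0+cu111' or '1.9'."""
    public = version.split("+", 1)[0]  # drop a local tag such as +cu111
    parts = public.split(".")
    return int(parts[0]), int(parts[1])

def same_release(torch_version, xla_version):
    """True when both version strings share the same major.minor pair."""
    return release_prefix(torch_version) == release_prefix(xla_version)

# Versions taken from the messages in this thread.
print(same_release("1.9.0+cu111", "1.9"))          # True
print(same_release("1.10.0a0+git88c0ea9", "1.9"))  # False
```

The second case is the situation after the env-setup script replaced torch 1.9.0 with a 1.10 development build.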
When I run "import torchvision", the following error is displayed:
ImportError: /home/jupyterlab/conda/envs/python/lib/python3.7/site-packages/torchvision/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail36_typeMetaDataInstance_preallocated_7E
This did not help either:
! conda install pytorch=1.11.0 torchvision=0.12.0 -c pytorch -y
pip install torchvision==0.4.1 solves the problem
I am using Python 3.9.13. I installed scikit-learn from the terminal:
pip install scikit-learn
Then I tried to download the mnist dataset using fetch_openml:
from sklearn.datasets import fetch_openml
raw_data = fetch_openml('mnist_784')
That gave me a long error message ending with:
fetch_openml with as_frame=True requires pandas.
However, I had pandas installed. So I looked more deeply inside the error message and I found that the exception causing that error was this:
ModuleNotFoundError: No module named '_bz2'
I looked around and found a solution in this thread.
I only had to add another step to that solution.
After installing libbz2-dev, the only copy on my computer was _bz2.cpython-38-x86_64-linux-gnu.so, which is built for Python 3.8.x, so it did not work with my version of Python.
I changed the file's name to _bz2.cpython-39-x86_64-linux-gnu.so and it worked after that.
sudo apt-get install libbz2-dev
sudo cp /usr/lib/python3.8/lib-dynload/_bz2.cpython-38-x86_64-linux-gnu.so /usr/local/lib/python3.9/
sudo mv /usr/local/lib/python3.9/_bz2.cpython-38-x86_64-linux-gnu.so /usr/local/lib/python3.9/_bz2.cpython-39-x86_64-linux-gnu.so
I had a similar issue with the _lzma library when I wanted to import torchvision.
The issue was solved by running the lines below in the terminal:
sudo apt install liblzma-dev
sudo cp /usr/lib/python3.8/lib-dynload/_lzma.cpython-38-x86_64-linux-gnu.so /usr/local/lib/python3.9/
sudo mv /usr/local/lib/python3.9/_lzma.cpython-38-x86_64-linux-gnu.so /usr/local/lib/python3.9/_lzma.cpython-39-x86_64-linux-gnu.so
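Why the rename matters: CPython only picks up an extension module whose file name ends in a suffix the running interpreter recognises, so a cpython-38 file is ignored by Python 3.9. A small sketch to see which file name your interpreter looks for and whether the module imports (the helper names are illustrative; also keep in mind that a binary built for 3.8 is not guaranteed to behave under 3.9, so rebuilding Python after installing the -dev packages is the cleaner route):

```python
import importlib
import importlib.machinery
import sysconfig

def expected_extension_filename(module):
    """File name this interpreter prefers for a compiled module."""
    return module + sysconfig.get_config_var("EXT_SUFFIX")

def can_import(module):
    """True if the module imports cleanly in this interpreter."""
    try:
        importlib.import_module(module)
        return True
    except ImportError:
        return False

for name in ("_bz2", "_lzma"):
    print(name, "->", expected_extension_filename(name),
          "importable:", can_import(name))

# Every suffix the import system accepts for extension modules.
print(importlib.machinery.EXTENSION_SUFFIXES)
```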
I am training stylegan2-ada-pytorch on Google Colab with my custom images. However, when I try to perform the initial training, I get the above error from TensorBoard.
cmd = f"/usr/bin/python3 /content/stylegan2-ada-pytorch/train.py --snap {SNAP} --outdir {EXPERIMENTS} --data {DATA}"
!{cmd}
I solved it by uninstalling jax and reinstalling the CUDA version of it:
!pip uninstall jax jaxlib -y
!pip install "jax[cuda11_cudnn805]==0.3.10" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
I then changed the torch version to 1.8.1.
I am trying to run a model on a TPU, as given in a Colab notebook. The model was working fine, but today I could not run it.
I used the following code to install pytorch-xla.
VERSION = "nightly"  #@param ["1.5" , "20200325", "nightly"]
!curl https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py
!python pytorch-xla-env-setup.py --version $VERSION
I then install the required libraries as follows:
!pip install -U nlp
!pip install sentencepiece
!pip install numpy --upgrade
However, when I try the following:
import nlp
It gives the following error:
OSError: libmkl_intel_lp64.so.1: cannot open shared object file: No such file or directory
I searched for the error and tried the following, but it still does not work. Any ideas how to fix it? Note: it was working a few days ago, but today it is not.
!pip install mkl
#!export PATH="$PATH:/opt/intel/bin"
#!export LD_LIBRARY_PATH="$PATH:opt/intel/mkl/lib/intel64_lin/"
!export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/intel/mkl/lib/intel64_lin/"
import os
os.environ['LD_LIBRARY_PATH']='/usr/local/lib'
!echo $LD_LIBRARY_PATH
!sudo ln -s /usr/local/lib/libmkl_intel_lp64.so /usr/local/lib/libmkl_intel_lp64.so.1
!sudo ln -s /usr/local/lib/libmkl_intel_thread.so /usr/local/lib/libmkl_intel_thread.so.1
!sudo ln -s /usr/local/lib/libmkl_core.so /usr/local/lib/libmkl_core.so.1
!ldconfig
!ldd /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch.so
This worked for me. We will also try to fix the problem internally.
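For reference, the `ln -s` lines create the versioned names (`.so.1`) that the loader asks for, pointing at the unversioned libraries already in /usr/local/lib. Re-running the cell makes `ln -s` complain that the link already exists; below is a Python sketch of the same step that can be re-run safely (the function name is hypothetical, the directory and library names are the ones from the commands above):

```python
import os
from pathlib import Path

MKL_LIBS = ["libmkl_intel_lp64.so", "libmkl_intel_thread.so", "libmkl_core.so"]

def add_versioned_symlinks(lib_dir, names, suffix=".1"):
    """Create <name><suffix> -> <name> links; skip missing targets and existing links."""
    created = []
    for name in names:
        target = Path(lib_dir) / name
        link = Path(lib_dir) / (name + suffix)
        if not target.exists() or link.exists() or link.is_symlink():
            continue
        os.symlink(target, link)
        created.append(link.name)
    return created

# On Colab this would be called as (needs write access to the directory):
# add_versioned_symlinks("/usr/local/lib", MKL_LIBS)
```

`!ldconfig` still needs to be run afterwards so the loader cache picks up the new names.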
from albumentations.pytorch.transforms import ToTensorV2
I used the above code, and it doesn't work.
Just add a code block with the line
! pip install albumentations==0.4.6
above the block where you do the import. I tried installing it without the specific version and it failed.
When I did not specify the version number in pip install, version 0.1.12 was installed, which does not contain ToTensorV2.
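To confirm which version actually got installed before importing ToTensorV2, something like the following can help. It is a sketch that assumes Python 3.8+ (for `importlib.metadata`) and plain numeric versions such as 0.4.6; `installed_version` and `meets_minimum` are made-up helpers:

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def meets_minimum(version, minimum):
    """Compare dotted numeric versions, e.g. '0.1.12' against '0.4.6'."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(version) >= as_tuple(minimum)

found = installed_version("albumentations")
if found is None:
    print("albumentations is not installed")
elif meets_minimum(found, "0.4.6"):
    print(found, "is at least 0.4.6")
else:
    print(found, "is older than 0.4.6")
```

Comparing tuples of integers rather than raw strings matters here, since as plain text "0.1.12" and "0.4.6" would be compared character by character.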
Ensure that you have the latest version:
!pip install --upgrade --force-reinstall --no-deps albumentations
Get albumentations from GitHub.
Example usage on Colab:
!pip install -U git+https://github.com/albu/albumentations > /dev/null