Running Python code on CUDA

While trying to run this code https://wltrimbl.github.io/2014-06-10-spelman/intermediate/python/04-multiprocessing.html on my GPU system, which has 300 cores, I added with tf.device('/GPU:0') at the beginning of the code, but found that it does not run on the GPU. Then I tried:
import tensorflow as tf
with tf.device('/GPU:0'):  # request placement on the first GPU
    # initializing all variables
    init = tf.initialize_all_variables()
sess = tf.Session(
    config=tf.ConfigProto(
        intra_op_parallelism_threads=1))
Does this code run on the GPU? Or is there another way to run Python code on a GPU?

No, it won't run on the GPU without a GPU-enabled build of TensorFlow.
Python multiprocessing is for the CPU only.
TensorFlow with GPU support is available (see here https://www.nvidia.com/en-us/data-center/gpu-accelerated-applications/tensorflow/). The TensorFlow GPU implementation uses CUDA with cuDNN under the hood.
To run your own Python script on the GPU, you need a library such as PyCUDA or CuPy, which also use the CUDA API under the hood.
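As a rough illustration of the CuPy route, the sketch below runs the same array expression on the GPU when CuPy and a CUDA device are usable, and falls back to NumPy on the CPU otherwise. The helper names load_array_module and sum_of_squares are made up for this example, and it assumes NumPy is installed.

```python
import importlib


def load_array_module():
    """Return ("cupy", module) if a CUDA device is usable, else ("numpy", module)."""
    try:
        cp = importlib.import_module("cupy")
        # getDeviceCount raises if the CUDA driver or runtime cannot be loaded.
        if cp.cuda.runtime.getDeviceCount() > 0:
            return "cupy", cp
    except Exception:
        pass  # CuPy missing or no usable GPU: fall through to the CPU path
    import numpy as np  # assumption: NumPy is available as the CPU fallback
    return "numpy", np


def sum_of_squares(n):
    """Compute 0^2 + 1^2 + ... + (n-1)^2 with whichever array module was found."""
    name, xp = load_array_module()
    x = xp.arange(n, dtype=xp.float64)
    return name, float((x * x).sum())


backend, total = sum_of_squares(10)
print(backend, total)
```

Because CuPy mirrors much of the NumPy API, the arithmetic itself does not change between the two paths; only the module that creates the array does.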

Related

How to run Caffe2 on Macbook Pro M1 GPU

I was able to run PyTorch on the MacBook Pro M1 Max GPU. However, Caffe2 does not use the GPU.
import torch
torch.device("mps")
from caffe2.python import core
WARNING:root:This caffe2 python run failed to load cuda module:No module named 'caffe2.python.caffe2_pybind11_state_gpu',and AMD hip module:No module named 'caffe2.python.caffe2_pybind11_state_hip'.Will run in CPU only mode.
I built PyTorch and Caffe2 from the nightly source using
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
BUILD_CAFFE2=1 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install
Any suggestions on how to solve this?
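One thing worth noting is that torch.device("mps") only constructs a device object; it does not confirm that the Metal backend is usable. A minimal sketch for checking that on the PyTorch side is below. The function name describe_mps_support and its return strings are made up for this example. It says nothing about Caffe2, whose warning above only mentions CUDA and HIP modules.

```python
def describe_mps_support():
    """Report whether the installed PyTorch build can use Apple's Metal (MPS) backend."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    # Older PyTorch releases have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is None:
        return "this torch build has no MPS backend"
    if not mps.is_built():
        return "torch was built without MPS support"
    if not mps.is_available():
        return "MPS is built in but not available on this machine"
    return "MPS available"


print(describe_mps_support())
```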

Problems getting GPU to work with older version of Tensorflow and Keras

I have a project I'm trying to work on but it's based on code that's a few years old and for whatever reason this code tends to fail if Tensorflow or NumPy aren't the correct versions (which means everything I'm using has to be old). This has meant that I've needed to dual-install an older version of Python to then be able to install the correct versions of the dependencies.
I'm running:
Python 3.7.5
NumPy 1.17.4
Pandas 0.25.3
pyyaml 5.1.2
more_itertools 7.2.0
keras 2.3.1
tensorflow 2.0.1
CUDA 10.0
CuDNN 7.4.1
I'm particularly interested in the keras and tensorflow versions. From my research, it seems they should work with GPU (as is?) according to this:
https://www.tensorflow.org/install/source (towards the bottom under tested build configurations for GPU).
However, when I try to detect GPU devices on my build with
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
I get
2022-03-22 19:34:53.410102: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 1722158755601489749
]
This suggests that it isn't recognising my GPU.
Is there something I'm missing in the setup process that I need to enable GPU support? As far as I can tell my versions of Tensorflow and Keras are compatible with GPU processing and have compatible versions of CUDA and CuDNN installed.
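A possible way to narrow this down is to separate "the installed TensorFlow binary has no CUDA support" from "the binary has CUDA support but cannot see a device". The sketch below is only a diagnostic under that assumption; the function name tensorflow_gpu_report and the dictionary keys are made up for this example.

```python
def tensorflow_gpu_report():
    """Summarise whether TensorFlow is installed, CUDA-enabled, and sees any GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"installed": False, "built_with_cuda": False, "gpus": []}
    # Later releases expose list_physical_devices directly on tf.config;
    # TensorFlow 2.0 keeps it under tf.config.experimental.
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return {
        "installed": True,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [device.name for device in lister("GPU")],
    }


print(tensorflow_gpu_report())
```

If built_with_cuda comes back False, the installed package is a CPU-only build and no CUDA or cuDNN setup will change that. If it is True but gpus is empty, the driver or the CUDA and cuDNN libraries are the more likely place to look.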

Segmentation fault error in importing sentence_transformers in Azure Machine Learning Service Nvidia Compute

I would like to use sentence_transformers in AML to run XLM-Roberta model for sentence embedding. I have a script in which I import sentence_transformers:
from sentence_transformers import SentenceTransformer
Once I run my AML pipeline, the run fails on this script with the following error:
AzureMLCompute job failed.
UserProcessKilledBySystemSignal: Job failed since the user script received system termination signal usually due to out-of-memory or segfault.
Cause: segmentation fault
TaskIndex:
NodeIp: #####
NodeId: #####
I'm pretty sure that this import is causing this error, because if I comment out this import, the rest of the script will run.
This is weird because the installation of sentence_transformers succeeded.
These are the details of my compute:
Virtual machine size
STANDARD_NV24 (24 Cores, 224 GB RAM, 1440 GB Disk)
Processing Unit
GPU - 4 x NVIDIA Tesla M60
Agent Pool:
Azure Pipelines
Agent Specification:
ubuntu-16.04
requirements.txt file:
torch==1.4.0
sentence-transformers
Does anyone have a solution for this error?
I fixed the issue by changing the PyTorch version from 1.4.0 to 1.6.0.
So the requirements.txt looks like this:
torch==1.6.0
sentence-transformers
At first I tried one of the older versions of sentence-transformers that was compatible with PyTorch 1.4.0. But that older version doesn't support the "xlm-roberta-base" model, so I upgraded PyTorch instead.
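Since a segfault on import gives no hint about version mismatches, it may help to check the installed PyTorch version before importing sentence_transformers. The sketch below does that without importing torch itself. The names MIN_TORCH, version_tuple and torch_is_new_enough are made up for this example, and the 1.6.0 threshold is simply the version that worked here, not an official requirement.

```python
import re
from importlib import metadata

# Assumption: taken from the fix described above, not from any published requirement.
MIN_TORCH = (1, 6, 0)


def version_tuple(text):
    """Turn a version string such as '1.6.0+cu101' into (1, 6, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) if part else 0 for part in match.groups())


def torch_is_new_enough(installed=None):
    """Compare the installed (or given) torch version against MIN_TORCH."""
    if installed is None:
        try:
            installed = metadata.version("torch")
        except metadata.PackageNotFoundError:
            return False
    return version_tuple(installed) >= MIN_TORCH


print(torch_is_new_enough("1.4.0"), torch_is_new_enough("1.6.0"))
```

Comparing tuples rather than strings matters because "1.10.1" sorts before "1.6.0" as text.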

cuda runtime error (3) : initialization error at /pytorch/aten/src/THC/THCGeneral.cpp:51

I started a TensorFlow application in one terminal. When I then start a PyTorch application in another terminal at the same time, I get this error:
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=51 error=3 : initialization error
terminate called after throwing an instance of 'std::runtime_error'
what(): cuda runtime error (3) : initialization error at /pytorch/aten/src/THC/THCGeneral.cpp:51
PyTorch and TensorFlow are running in different virtualenvs.
My environment:
-Ubuntu 18.04
-GPU GeForce GTX 1060
-Pytorch env (torch==1.1.0, torchvision==0.2.0)
-Tensorflow env (tensorflow-gpu==1.15.0)
The PyTorch application was running smoothly before I started the TensorFlow application.
I stopped the TensorFlow application and checked:
>>> torch.cuda.device_count()
0
>>> torch.cuda.is_available()
False
But the error does not go away.
I had the same problem with a PyTorch training script crashing after waking Ubuntu from sleep mode. Torch was not able to detect the GPU. It seems that the CUDA driver has a problem restoring the active context after wakeup.
Rebooting the system fixed the issue.
I was running PyTorch in a conda env.

Keras tensorflow backend does not detect GPU

I am running Keras with the TensorFlow backend on Linux.
First, I installed the TensorFlow GPU version by itself and ran the following code to check it. It ran on the GPU and showed the GPU it was running on, the device mapping, etc. The TensorFlow I used was from https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl
import tensorflow as tf
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
Then I installed Keras using conda install keras. I checked conda list and now I have two versions of TensorFlow (1.1.0 and 0.11.0). I tried import tensorflow as tf, which results in:
2017-07-18 16:35:59.569535: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 16:35:59.569629: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 16:35:59.569690: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 16:35:59.569707: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-18 16:35:59.569731: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Device mapping: no known devices.
2017-07-18 16:35:59.579959: I tensorflow/core/common_runtime/direct_session.cc:257] Device mapping:
MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-07-18 16:36:14.369948: I tensorflow/core/common_runtime/simple_placer.cc:841] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-07-18 16:36:14.370051: I tensorflow/core/common_runtime/simple_placer.cc:841] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-07-18 16:36:14.370109: I tensorflow/core/common_runtime/simple_placer.cc:841] a: (Const)/job:localhost/replica:0/task:0/cpu:0
I already set CUDA_VISIBLE_DEVICES, which worked before Keras was installed.
Is this because of the TensorFlow version? Can I choose to install 0.11.0 instead of 1.1.0 when installing Keras?
If the problem is due to TensorFlow not detecting a GPU, how can I solve this issue? I read in this link that TensorFlow will automatically run on the GPU if it detects one.
Chances are that Keras, depending on a newer version of TensorFlow, has caused the installation of a CPU-only TensorFlow package (tensorflow) that is hiding the older, GPU-enabled version (tensorflow-gpu).
I would upgrade the GPU-enabled version first. Usually you can just do pip install --upgrade tensorflow-gpu, but there are Anaconda-specific instructions on the TensorFlow installation page. Then you can uninstall the CPU-only TensorFlow package with pip uninstall tensorflow. Now import tensorflow as tf should actually import the GPU-enabled package which, as you suggest, should in turn detect your GPU automatically.
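To confirm whether both packages are installed side by side before uninstalling anything, something like the following sketch could be used. It relies on the standard-library importlib.metadata module (Python 3.8+), so it only sees distributions that ship packaging metadata in the current environment. The function names installed_tensorflow_packages and cpu_package_shadows_gpu are made up for this example.

```python
from importlib import metadata


def installed_tensorflow_packages():
    """Map each installed distribution whose name starts with 'tensorflow' to its version."""
    found = {}
    for dist in metadata.distributions():
        name = (dist.metadata["Name"] or "").lower()
        if name.startswith("tensorflow"):
            found[name] = dist.version
    return found


def cpu_package_shadows_gpu(packages):
    """True when both the CPU-only and the GPU-enabled package names are present."""
    return "tensorflow" in packages and "tensorflow-gpu" in packages


packages = installed_tensorflow_packages()
print(packages)
print("both installed:", cpu_package_shadows_gpu(packages))
```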
