I'm trying to run my deep learning code in Google Colab. I have installed CUDA 10.0.130 and cuDNN 7.6.4 for TensorFlow 1.14.0, but tf.test.is_gpu_available() still returns False. I don't know what to do now; can somebody give me some instructions? Here is the output of !sudo lsb_release -a and !nvidia-smi
The supported and tested configurations for the GPU builds are listed at the link below.
The supported version for CUDA 10.1 and cuDNN 7.6 is tensorflow_gpu-2.3.0.
Also, for TF 1.x versions the CPU and GPU builds are separate packages,
so you should run
!pip install tensorflow_gpu==1.14.0
to use the GPU version of TensorFlow.
Ref: https://www.tensorflow.org/install/gpu#older_versions_of_tensorflow
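To tell whether the installed package is a CPU-only build or a GPU build that cannot see the device, something like the following could be run in a Colab cell. This is only a minimal sketch: tensorflow_gpu_report is a made-up helper name, and it assumes that tf.test.is_built_with_cuda() and tf.test.is_gpu_available() are present in the installed version (the latter is deprecated in TF 2.x).

```python
import importlib


def tensorflow_gpu_report():
    """Collect basic facts about the TensorFlow build in this runtime.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        # No TensorFlow package at all in this environment.
        return {"installed": False}
    return {
        "installed": True,
        "version": tf.__version__,
        # False here means a CPU-only build, whatever CUDA is on the machine.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # The same check used in the question.
        "gpu_available": tf.test.is_gpu_available(),
    }


if __name__ == "__main__":
    for key, value in tensorflow_gpu_report().items():
        print(key, "=", value)
```

If built_with_cuda comes back False, the CPU package is the one being imported, and no CUDA or cuDNN install will change the result until the GPU package replaces it.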
I'm setting up my Conda environment on a remote GPU machine to use PyTorch.
The driver on that machine is only version 396.54 (as reported by nvidia-smi), so I can only use CUDA 9.2.
However, I need a newer version of torch to be able to use some attributes.
I tried
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=9.2
But this results in
print(torch.version.cuda) >> None
torch.cuda.is_available() >> False
There are two things I would check.
You may have unintentionally installed the CPU version of PyTorch, or already had it in your environment before running the above command. Even if you install the GPU version of PyTorch, torch.cuda.is_available() will return False if the CPU version is already present. Therefore I suggest checking out this link:
Forum on why PyTorch is CPU version even after installing cudatoolkit version
Although I am fairly sure the first point is your problem, I suggest looking at the second one as well.
To see how to install previous versions of PyTorch, refer to this link: https://pytorch.org/get-started/previous-versions/
After looking at this, I suggest starting a new conda env and running your conda install command in it first.
Sarthak Jain
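The CPU-build case described above can be recognised from torch.version.cuda, which is None for a CPU-only build (exactly what the question shows). Below is a rough sketch of that check; torch_cuda_report and diagnose are hypothetical helper names, and the messages are only one possible reading of the three values.

```python
import importlib


def torch_cuda_report():
    """Gather the values that distinguish a CPU build from a CUDA build."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # None for a CPU-only build, otherwise a string such as "9.2".
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


def diagnose(report):
    """Turn a report dict into a one-line diagnosis."""
    if not report["installed"]:
        return "torch is not installed in this environment"
    if report["cuda_build"] is None:
        return "a CPU-only build of torch is installed"
    if not report["cuda_available"]:
        return "a CUDA build is installed, but no usable GPU or driver was found"
    return "a CUDA build is installed and a GPU is usable"


if __name__ == "__main__":
    print(diagnose(torch_cuda_report()))
```

Running this in a fresh conda env right after the install command shows whether the solver picked a CPU build before anything else gets installed on top of it.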
Platform: Precision 5820, 32G, rtx4000; Win 10 Pro, Arcgis Pro 2.6 concurrent license;
Issue:
I installed the deep learning tools following the guidelines provided here:
deeplearninginstallation
TensorFlow was not found after installation, so I manually installed version 2.1.0. I now have arcgis 1.8.2, Pro 2.6, fastai 1.0.60, python 3.6.12, pytorch 1.4.0, tensorflow-gpu 2.1.0; the environment check in the ArcGIS Pro Python window seemed fine.
However, after I select Toolbox > Image Analyst > Deep Learning > Train Deep Learning Model, the program appears to hang, with most buttons disabled or unresponsive, and this continues until I force-terminate the program. I also ran into "tool not licensed" twice, which went away after I restarted the program, and "name 'CallBackHandler' is not defined" once, which also went away after a restart.
I tried running the command from the ArcGIS Pro Python prompt:
TrainDeepLearningModel(r"**", r"**", 40, "RETINANET", 16, "# #", None, "RESNET50", None, 10, "STOP_TRAINING", "FREEZE_MODEL")
Executing the command also made the program hang in the same way as before. The resource monitor showed that RAM and GPU usage hadn't changed much, so I left the program running for an hour before forcibly terminating it.
I'd greatly appreciate it if anyone can tell me what the issues are here. I'll post any other env parameters if anyone requires. Cheers.
I have got the tool up and running by running conda install -c pytorch -c fastai fastai=1.0.54 pytorch=1.1.0 torchvision scikit-image and removing all the conflicting specifications in the cloned arcgispro-py3 env that I had. I still don't understand what went wrong. Presumably one or more packages in the env were conflicting, but as I'm not a Python expert, I couldn't identify the exact issue.
Before this I tried the versions stated here deeplearning install guide, but wasn't able to get past tensorflow-gpu because python kept checking for conflicts. I actually don't have tensorflow-gpu in the env now. I have tensorflow 2.1.0, keras-applications 1.0.8/base 2.3.1/preprocessing 1.1.0 (no keras-gpu), scikit-image 0.17.2, pillow 6.2.1, fastai 1.0.54, pytorch 1.1.0, libtiff 4.0.10. Some of these differ from what the guide specifies.
The thing is, when I ran the process, CPU usage went up and GPU usage didn't, even though I specified GPU as the processing core. But I have much more pressing things to do right now, like getting the analysis finished, so I'll probably tweak the env a little after I'm done with this bit and see what happens. Meanwhile, anyone's input is still welcome.
I got tired of all the reminders that TensorFlow is not optimized for my CPU and so finally compiled it from source. In fact I did it twice and made two .whl files, once using
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
and once using
bazel build --config=mkl --config=opt //tensorflow/tools/pip_package:build_pip_package
since I was not sure how much difference the Intel MKL would make. I have now set up two identical anaconda environments, one using each .whl file.
What is the quickest way to determine which of the .whl packages performs better on my system? If someone can point me to a standard benchmark package/command in TensorFlow, that would be great (please note that I do not have GPU support).
You can run the benchmark below and check the performance.
https://github.com/tensorflow/benchmarks.git
Clone the repository above, then run the tf_cnn_benchmarks.py script (in scripts/tf_cnn_benchmarks) in each environment.
Thanks
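As a lighter alternative to the full benchmark suite, the two builds could also be compared with a small timing harness run once in each environment. This is a rough sketch and not part of tensorflow/benchmarks: time_workload is a hypothetical helper, and the placeholder workload is an assumption that should be swapped for a real training or inference step from your own code.

```python
import statistics
import time


def time_workload(workload, warmup=2, repeats=5):
    """Return the median wall-clock seconds of workload() over `repeats` runs."""
    for _ in range(warmup):
        workload()  # warm-up runs are not timed (graph building, caches)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Placeholder only: replace with one training or inference step.
    def placeholder():
        sum(i * i for i in range(100000))

    print("median seconds per run:", time_workload(placeholder))
```

Run the same script with the same workload in both environments and compare the two medians. The median is used rather than the mean so that a single slow run does not skew the comparison.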
I have the following environment on my Windows 10 machine:
Python : 3.6.0
Anaconda:4.3.1
Tensorflow:1.1.0
Screen Shots
OS:Windows 10-64bit
Now, when I try to install Keras on my system, I get a huge list of errors.
Detailed Error Log
Now I have two questions here.
Can I install Keras on my system when I already have the TensorFlow GPU version, which was really hard to install?
If Keras can be installed with this system configuration, will my TensorFlow GPU version still work properly afterwards?