I have successfully installed PyTorch from source (cloned with git clone --recursive https://github.com/pytorch/pytorch.git) on a CPU-only Windows 11 machine, but I cannot run the pretrained DL model. It fails on the line from caffe2.python import workspace, even though workspace exists under pytorch/caffe2/python/workspace. Is there anything else I need to do?
Please enable BUILD_CAFFE2 while building PyTorch from source if not already done.
While trying to learn fairseq, I was following the tutorials on the website and implementing:
https://fairseq.readthedocs.io/en/latest/tutorial_simple_lstm.html#training-the-model
However, after following all the steps, when I try to train the model using the following:
! fairseq-train data-bin/iwslt14.tokenized.de-en \
  --arch tutorial_simple_lstm \
  --encoder-dropout 0.2 --decoder-dropout 0.2 \
  --optimizer adam --lr 0.005 --lr-shrink 0.5 \
  --max-tokens 12000
I receive an error:
fairseq-train: error: argument --arch/-a: invalid choice: 'tutorial_simple_lstm' (choose from 'fconv', 'fconv_iwslt_de_en', 'fconv_wmt_en_ro', 'fconv_wmt_en_de', 'fconv_wmt_en_fr', 'fconv_lm', 'fconv_lm_dauphin_wikitext103', 'fconv_lm_dauphin_gbw', 'transformer', 'transformer_iwslt_de_en', 'transformer_wmt_en_de', 'transformer_vaswani_wmt_en_de_big', 'transformer_vaswani_wmt_en_fr_big', 'transformer_wmt_en_de_big', 'transformer_wmt_en_de_big_t2t', 'bart_large', 'bart_base', 'mbart_large', 'mbart_base', 'mbart_base_wmt20', 'nonautoregressive_transformer', 'nonautoregressive_transformer_wmt_en_de', 'nacrf_transformer', 'iterative_nonautoregressive_transformer', 'iterative_nonautoregressive_transformer_wmt_en_de', 'cmlm_transformer', 'cmlm_transformer_wmt_en_de', 'levenshtein_transformer', 'levenshtein_transformer_wmt_en_de', 'levenshtein_transformer_vaswani_wmt_en_de_big',....
Some additional info: I am using Google Colab. I put all of the code up to the training step into a .py file and upload it to the fairseq/models/... path, as per my interpretation of the instructions. I am following the tutorial in the link exactly.
And, before running it on colab, I am installing fairseq using:
!git clone https://github.com/pytorch/fairseq
%cd fairseq
!pip install --editable ./
I think this error happens because the command-line choice that the tutorial is supposed to create has not been set up properly.
Can anyone please explain whether I need to do something differently at any step?
I would be grateful for your input; for a beginner, such help from the community goes a long way.
It seems you didn't register the SimpleLSTMModel architecture, as follows. Once the model is registered, you can use it with the existing command-line tools.
@register_model('simple_lstm')
class SimpleLSTMModel(FairseqEncoderDecoderModel):
    ...
Please note that copying the .py file does not by itself register the model. The file containing the lines above has to be executed (imported) for the registration to happen. After that, you can run training with the existing command-line tools.
You should put your .py file into:
fairseq/fairseq/models
not into fairseq/models
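Why copying the file is not enough can be seen in a minimal sketch of a decorator-based registry. This is not fairseq's actual implementation: MODEL_REGISTRY, register_model, available_choices, and the placeholder class below are stand-ins invented for illustration. The only point is that the decorator runs when the module containing it is executed, so a file that is never imported never adds anything to the list of choices.

```python
# Minimal stand-in for a decorator-based model registry.
# All names here are hypothetical; this is not fairseq's real code.
MODEL_REGISTRY = {}


def register_model(name):
    """Return a decorator that records the decorated class under `name`."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"Cannot register duplicate model ({name})")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


def available_choices():
    """Names a command-line parser could offer for an --arch style option."""
    return sorted(MODEL_REGISTRY)


# Before the decorated class is defined, the registry is empty.
choices_before = available_choices()


@register_model("simple_lstm")
class SimpleLSTMModel:
    """Placeholder; the real model would subclass a fairseq base class."""


# Executing the class definition above is what triggers the registration.
choices_after = available_choices()

print(choices_before)
print(choices_after)
```

If I recall the tutorial correctly, the name passed to --arch is registered in a second step with register_model_architecture, so both decorated definitions need to live in a file that fairseq actually imports.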
I have been trying to run the Mask R-CNN demo from Matterport (https://github.com/matterport/Mask_RCNN). It works with TensorFlow 1.13.1 and Keras 2.1.0, as suggested by someone here (https://github.com/matterport/Mask_RCNN/issues/1797). I get the error
ModuleNotFoundError: No module named 'astor'
The thing is, astor 0.8.0 is installed in my virtual env, but when I try to import it, Python says it does not exist. I have made sure to install it both normally and with sudo. If you think it is not on $PYTHONPATH, how can I add it? I am out of my depth here, so please be considerate.
EDIT: I am using virtualenv in PyCharm. If I look through my interpreter paths, I get
file:///home/$USER/anaconda3/lib/python3.6
file:///home/$USER/anaconda3/lib/python3.6/lib-dynload
file:///home/$USER/project/Mask_RCNN/venv/lib/python3.6/site-packages
file:///home/$USER/anaconda3/lib/python3.6/site-packages
I have replaced my actual user name with $USER in the above output.
Thanks to @bigbounty; his questions made me understand a bit more about PYTHONPATH. Maybe it is something I should already have known, but you make mistakes and learn.
SOLUTION: in the venv interpreter paths I added
/home/$USER/anaconda3/pkgs/astor-0.8.0-py36_0/lib/python3.6/site-packages
and it worked. I do not know why the package was installed in this folder, which apparently was not on the interpreter path.
I solved it by copying the astor folder from site-packages (in the env) to my project folder and setting it as a source (in PyCharm).
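To see which interpreter is actually running and where it would look for a package, a small diagnostic along these lines can be run from inside the IDE. It is only a sketch: describe_module and report are helper names made up for this example, and it uses nothing beyond the standard library.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def report(name):
    """Collect the interpreter, its search path and the module location."""
    return {
        "interpreter": sys.executable,
        "search_path": list(sys.path),
        "location": describe_module(name),
    }


if __name__ == "__main__":
    info = report("astor")
    print("interpreter:", info["interpreter"])
    for entry in info["search_path"]:
        print("  path entry:", entry)
    print("astor found at:", info["location"])
```

If the location comes back as None while pip reports the package as installed, the package was most likely installed for a different interpreter than the one the project is using.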
I am trying to go through the tutorial published here but get the error below when I run these lines of code:
run = exp.submit(est)
run.wait_for_completion(show_output=True)
ERROR:
"message": "Could not import package \"azureml-dataprep\". Please ensure it is installed by running: pip install \"azureml-dataprep[fuse,pandas]\""
However, I have already installed the required packages:
I am running this through Jupyter Notebooks in an Anaconda Python 3.7 environment.
UPDATE
Tried creating a new conda environment as specified here but still get the same error.
conda create -n aml python=3.7.3
After installing all the required packages, I am able to reproduce the exception by executing the following:
Sorry for this. Take a look at the Jupyter Notebook version of the same tutorial:
https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/ml-frameworks/tensorflow/deployment/train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb
When configuring the estimator, you need to specify the pip packages you want installed on the remote compute, in this case azureml-dataprep[pandas,fuse]. Installing the package on your local computer does not help, because the training script is executed on the remote compute target, which does not have the required package installed yet.
est = TensorFlow(source_directory=script_folder,
                 script_params=script_params,
                 compute_target=compute_target,
                 entry_script='tf_mnist.py',
                 use_gpu=True,
                 pip_packages=['azureml-dataprep[pandas,fuse]'])
Can you please try the fix and let us know whether it solves your issue :) In the meantime, I will update the public documentation to include pip_packages in the estimator config.
Have you gone through the known issues and troubleshooting page? It is mentioned there as one of the known issues.
Error message: ERROR: No matching distribution found for azureml-dataprep-native
Anaconda's Python 3.7.4 distribution has a bug that breaks azureml-sdk install. This issue is discussed in this GitHub Issue. This can be worked around by creating a new Conda Environment using this command:
I am new to Python and PyDev. I have the TensorFlow source and am able to run the example files using python3 /pathtoexamplefile.py. I want to step through the word2vec_basic.py code inside PyDev. The debugger keeps throwing
  File "/Users/me/workspace/tensorflow/tensorflow/python/__init__.py", line 45, in <module>
    from tensorflow.python import pywrap_tensorflow
ImportError: cannot import name 'pywrap_tensorflow'
I think it has something to do with the working directory. I am able to run python3 -c 'import tensorflow' from my home directory. But, once I enter /Users/me/workspace/tensorflow, the command throws the same error, referencing the same line 45.
Can someone help me through this part? Thank you.
Try to do 2 things:
1. Update to PyDev 5.4.0 and enable the support for running with the '-m' flag (in Preferences > PyDev > Run).
2. Go to your launch in Run > Run Configurations > Select the launch and change the working directory to be the project location.
Then, try to run it again. If it still fails, post your full stack trace... also, the screenshot for the tree shouldn't have all the source for tensorflow expanded (i.e.: I'm interested in the icons related to the project and source folders to know about how you made your PYTHONPATH configuration inside PyDev, not the internal contents of the tensorflow module).
OK, the problem is that the entire TensorFlow source tree is inside the Eclipse project. PyDev gets confused about whether to import from the source tree or from the installed tensorflow modules. I created a separate PyDev project containing only the word2vec directory, and it now runs inside Eclipse.
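The shadowing described here can be reproduced without TensorFlow. The sketch below builds two copies of a throwaway package in a temporary directory (the names demo_shadow_pkg, make_package, and import_with_path are made up for this example) and shows that whichever directory comes first on sys.path wins, which is what happens when the working directory contains a source tree named like an installed package.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, marker):
    """Create a tiny package `name` under `root` exposing WHERE = marker."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write(f"WHERE = {marker!r}\n")


def import_with_path(name, search_dirs):
    """Import `name` with `search_dirs` placed at the front of sys.path."""
    saved_path = list(sys.path)
    sys.modules.pop(name, None)      # forget any earlier import of `name`
    importlib.invalidate_caches()    # make sure new directories are scanned
    sys.path[:0] = search_dirs
    try:
        return importlib.import_module(name).WHERE
    finally:
        sys.path[:] = saved_path     # restore the original search path
        sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    source_tree = os.path.join(tmp, "source_tree")
    site_packages = os.path.join(tmp, "site_packages")
    make_package(source_tree, "demo_shadow_pkg", "source tree")
    make_package(site_packages, "demo_shadow_pkg", "installed copy")

    # Source tree listed first, as when it is the working directory.
    from_source_first = import_with_path(
        "demo_shadow_pkg", [source_tree, site_packages])
    # Only the install location on the path.
    from_install_only = import_with_path(
        "demo_shadow_pkg", [site_packages])

print(from_source_first)
print(from_install_only)
```

Moving the script into a project that does not contain the source tree has the same effect as the second call: only the installed copy is left on the search path.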
I have run the tutorials and successfully created my own neural network implementation in TensorFlow. I then decided to go one step further and add my own op, because I needed to do some of my own preprocessing on the data. I followed the tutorial on the TensorFlow site for adding an op, and I successfully built TensorFlow after writing my own C++ file. Then, when I try to use it from my code, I get
'module' object has no attribute 'sec_since_midnight'
My code does get reflected in bazel-genfiles/tensorflow/python/ops/gen_user_ops.py so the wrapper does get generated for it correctly. It just looks like I can't see the tensorflow/python/user_ops/user_ops.py which is what imports that file.
Now, when I run the test for this module, I get the following odd behavior. It should not pass, because the expected vector I give it does not match what the result should be. But maybe the test never gets executed, despite being reported as passed?
INFO: Found 1 test target...
Target //tensorflow/python:sec_since_midnight_op_test up-to-date:
bazel-bin/tensorflow/python/sec_since_midnight_op_test
INFO: Elapsed time: 6.131s, Critical Path: 5.36s
//tensorflow/python:sec_since_midnight_op_test (1/0 cached) PASSED
Executed 0 out of 1 tests: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
Hmmm. Well, I uninstalled TensorFlow and then reinstalled from what I had just built, and what I wrote was suddenly recognized. I have seen this behavior twice in a row now, where an uninstall is necessary. So, to sum up, the steps after adding my own op are:
$ pip uninstall tensorflow
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
# To build with GPU support:
$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
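After reinstalling, it may help to confirm which copy of a module Python loads and whether it exposes the new name. The following is a generic sketch, not part of TensorFlow: find_attribute is a helper invented for this example, and it is demonstrated on the standard library's math module so that it runs anywhere.

```python
import importlib


def find_attribute(module_name, attribute):
    """Import `module_name`; report its file and whether `attribute` exists."""
    module = importlib.import_module(module_name)
    return {
        # Built-in modules may have no __file__, hence the default of None.
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


if __name__ == "__main__":
    print(find_attribute("math", "sqrt"))
    print(find_attribute("math", "sec_since_midnight"))
```

For the case above, one would pass the module that is expected to expose the op together with the name sec_since_midnight; a reported file inside the source tree rather than the installed package would point to a stale or shadowed install.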