Loading local data google colab - python-3.x

I have a npy file, (largeFIle.npy) saved in the same "colab notebooks" folder on my google drive that I have my google colab notebook saved in. I'm trying to load the data into my notebook with the code below but I'm getting the error below. This code works fine when I run it locally on my laptop with the notebook in the same folder as the file. Is there something different I need to do when loading data with notebooks in google colab? I'm very new to colab.
code:
dataset_name = 'largeFIle.npy'
dataset = np.load(dataset_name, encoding='bytes')
Error:
FileNotFoundError Traceback (most recent call last)
<ipython-input-6-db02a0bfcf1d> in <module>()
----> 1 dataset = np.load(dataset_name, encoding='bytes')
/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
370 own_fid = False
371 if isinstance(file, basestring):
--> 372 fid = open(file, "rb")
373 own_fid = True
374 elif is_pathlib_path(file):
FileNotFoundError: [Errno 2] No such file or directory: 'largeFIle.npy'

When you launch a new notebook on colab, it connects you with a remote machine for 12 hours and all you have there is the notebook and preloaded functions. To access your folders on drive, you need to connect the remote instance to your drive and authenticate it.
This thing bugged me for sometime when I was beginning too, so I'm creating a gist and I'll update it as I learn more. For your case, check out section 2 (Connecting with Drive). You don't have to edit or understand anything, just copy the cell and run it. It will run a bunch of functions and then give you an authentication link. You need to go to that link and sign-in with Google, you'll get an access token there. Put it back in the input box and press Enter. If it doesn't work or if there's some error, run the cell again.
In the next part I mount my drive to the folder '/drive'. So now, everything that's on your drive exists in this folder, including your notebook. Next, you can change your working directory. For me, I'm keeping all my notebooks in '/Colab' folder, edit it accordingly.
Hope it helps you. Feel free to suggest me edits to the gist as you learn more. :)

Have you set up your google drive with google colab with this method. After mounting Google drive use below command for your problem (Assuming you have stored largeFIle.npy in Colab Notebook folder.)
dataset = np.load('drive/Colab Notebooks/largeFIle.npy, encoding='bytes')

Related

How to use yolov5 api with flask offline?

I was able to run the Flask app with yolov5 on a PC with an internet connection. I followed the steps mentioned in yolov5 docs and used this file: yolov5/utils/flask_rest_api/restapi.py,
But I need to achieve the same offline(On a particular PC). Now the issue is, when I am using the following:
model = torch.hub.load("ultralytics/yolov5", "yolov5", force_reload=True)
It tries to download model from internet. And throws an error.
Urllib.error.URLError: <urlopen error [Errno - 2] name or service not known>
How to get the same results offline.
Thanks in advance.
If you want to run detection offline, you need to have the model already downloaded.
So, download the model (for example yolov5s.pt) from https://github.com/ultralytics/yolov5/releases and store it for example to the yolov5/models.
After that, replace
# model = torch.hub.load("ultralytics/yolov5", "yolov5s", force_reload=True) # force_reload to recache
with
model = torch.hub.load(r'C:\Users\Milan\Projects\yolov5', 'custom', path=r'C:\Users\Milan\Projects\yolov5\models\yolov5s.pt', source='local')
With this line, you can run detection also offline.
Note: When you start the app for the first time with the updated torch.hub.load, it will download the model if not present (so you do not need to download it from https://github.com/ultralytics/yolov5/releases).
There is one more issue involved here. When this code is run on a machine that has no internet connection at all. Then you may face the following error.
Downloading https://ultralytics.com/assets/Arial.ttf to /home/<local_user>/.config/Ultralytics/Arial.ttf...
Traceback (most recent call last):
File "/home/<local_user>/Py_Prac_WSL/yolov5-flask-master/yolov5/utils/plots.py", line 56, in check_pil_font
return ImageFont.truetype(str(font) if font.exists() else font.name, size)
File "/home/<local_user>/.local/share/virtualenvs/23_Jun-82xb8nrB/lib/python3.8/site-packages/PIL/ImageFont.py", line 836, in truetype
return freetype(font)
File "/home/<local_user>/.local/share/virtualenvs/23_Jun-82xb8nrB/lib/python3.8/site-packages/PIL/ImageFont.py", line 833, in freetype
return FreeTypeFont(font, size, index, encoding, layout_engine)
File "/home/<local_user>/.local/share/virtualenvs/23_Jun-82xb8nrB/lib/python3.8/site-packages/PIL/ImageFont.py", line 193, in __init__
self.font = core.getfont(
OSError: cannot open resource
To overcome this error, you need to download manually, the Arial.ttf file from https://ultralytics.com/assets/Arial.ttf and paste it to the following location, on Linux:
/home/<your_pc_user>/.config/Ultralytics
On windows, paste Arial.ttf here:
C:\Windows\Fonts
The first line of the error message mentions the same thing. After this, the code runs smoothly in offline mode.
Further as mentioned at https://docs.ultralytics.com/tutorials/pytorch-hub/, any custom-trained-model other than the one uploaded at PyTorch-model-hub can be accessed by this code.
path_hubconfig = 'absolute/path/to/yolov5'
path_trained_model = 'absolute/path/to/best.pt'
model = torch.hub.load(path_hubconfig, 'custom', path=path_trained_model, source='local') # local repo
With this code, object detection is carried out by the locally saved custom-trained model. Once, the custom trained model is saved locally this piece of code access it directly avoiding any necessity of the internet.

How to save python code (part of the notebook) to file in GDrive from code

I am using Google Colabs for my research in machine learning.
I do many variations on a network and run them saving the results.
I have a part of my notebook that used to be separate file (network.py) At the start of a training session I used to save this file in a directory that has the results and logs etc. Now that this part of the code is in the notebook it is easier to edit etc, BUT I do not have a file to copy to the output directory that describes the model. how to i take a section of a google colab notebook and save the raw code as a python file?
Things I have tried:
%%writefile "my_file.py" - is able to write the file however the classes are not available to the runtime.

Google Colab - downloads some files, TypeError: Failed to fetch on others

I have a Google Colab notebook with PyTorch code running in it.
At the beginning of the train function, I create, save and download word_to_ix and tag_to_ix dictionaries without a problem, using the following code:
from google.colab import files
torch.save(tag_to_ix, pos_dict_path)
files.download(pos_dict_path)
torch.save(word_to_ix, word_dict_path)
files.download(word_dict_path)
I train the model, and then try to download it with the code:
torch.save(model.state_dict(), model_path)
files.download(model_path)
Then I get a MessageError: TypeError: Failed to fetch.
Obviously, the problem is not with the third party cookies (as suggested here), because the first files are downloaded without a problem. (I actually also tried adding the link in my Allow section, but, surprise surprise, it made no difference.)
I was originally trying to save the model as is (which, to my understanding, saves it as a Pickle), and I thought maybe Colab files doesn't handle downloading Pickles well, but as you can see above, I'm now trying to save a dict object (which is also what word_to_ix and tag_to_ix) are, and it's still not working.
Downloading the file manually with right-click isn't a solution, because sometimes I leave the code running while I do other things, and by the time I get back to it, the runtime has disconnected, and the files are gone.
Any suggestions?

Google Colab: "Unable to connect to the runtime" after uploading Pytorch model from local

I am using a simple (not necessarily efficient) method for Pytorch model saving.
import torch
from google.colab import files
torch.save(model, filename) # save a trained model on the VM
files.download(filename) # download the model to local
best_model = files.upload() # select the model just downloaded
best_model[filename] # access the model
Colab disconnects during execution of the last line, and hitting RECONNECT tab always shows ALLOCATING -> CONNECTING (fails, with "unable to connect to the runtime" message in the left bottom corner) -> RECONNECT. At the same time, executing any one of the cells gives Error message "Failed to execute cell, Could not send execute message to runtime: [object CloseEvent]"
I know it is related to the last line, because I can successfully connect with my other google accounts which doesn't execute that.
Why does it happen? It seems the google accounts which have executed the last line can no longer connect to the runtime.
Edit:
One night later, I can reconnect with the google account after session expiration. I just attempted the approach in the comment, and found that just files.upload() the Pytorch model would lead to the problem. Once the upload completes, Colab disconnects.
Try disabling your ad-blocker. Worked for me
(I wrote this answer before reading your update. Think it may help.)
files.upload() is just for uploading files. We have no reason to expect it to return some pytorch type/model.
When you call a = files.upload(), a is a dictionary of filename - a big bytes array.
{'my_image.png': b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR....' }
type(a['my_image.png'])
Just like when you do open('my_image', 'b').read()
So, I think the next line best_model[filename] try to print the whole huge bytes array, which bugs the colab.

Read NetCDF file from Azure file storage

I have uploaded a file to my Azure file storage account and created a SAS (shared access signature). Let's pretend the file in question is called fileA.nc
Now, with Python3, I am attempting to read fileA.nc:
from netCDF4 import Dataset
url ='https://<my-azure-resource-group>.file.core.windows.net/<some-file-share>/fileA.nc<SAS-token>';
dataset = Dataset(url)
print(dataset.variables.keys())
The above code does not work, instead giving me the following error:
Traceback (most recent call last): File "yadaYadaYada/test.py", line
8, in
dataset = Dataset(url) File "netCDF4/_netCDF4.pyx", line 1848, in netCDF4._netCDF4.Dataset.init (netCDF4/_netCDF4.c:13983)
OSError: NetCDF: Malformed or unexpected Constraint
This is line 8:
dataset = Dataset(url)
I know the URL provided works. If I paste it into the browser, the file downloads...
I have checked the netCDF4 documentation, which says this:
Remote OPeNDAP-hosted datasets can be accessed for reading over
http
if a URL is provided to the Dataset constructor instead of a filename.
However, this requires that the netCDF library be built with OPenDAP
support, via the --enable-dap configure option (added in version
4.0.1).
However, I have no idea how to tell if when Pycharms installed netcdf4, it used the --enable-dap argument, but I cannot imagine why it would not. Besides, if I stick in a url which points to some HTML, I get the HTML in the error dump and so from that I would think netcdf4 is actually trying to load a remote dataset and so the problem is somewhere else.
I'd really appreciate some help here. Maybe someone knows of another Python 3 netCDF library that will allow me to load my datasets from Azure?
UPDATE
Okay, I can now confirm that the python netcdf4 library does come with --OPenDAP enabled:
Hello again, netCDF4 1.0.4 with OpenDAP support is now available in
the conda respoitory on Unix. To install: $ conda install netcdf4
Ilan
I have found a solution. It turns out that you cannot read directly from an Azure File share, even though when you paste the link to a file in the browser, the file begins to download.
What I needed to do was to mount the File Share on my OS. In my case, I was using Windows but this can be done with Linux, too. The following code should be modified accordingly and then put into Command Prompt:
net use <drive-letter>: \\<storage-account-name>.file.core.windows.net\<share-name>
example :
net use z: \\samples.file.core.windows.net\logs
Once the File Share is mounted, you can read from it as if it were an external HDD. You may need to add permission, but I didn't.
Here is the link to the documentation for mounting the File Share: Documentation

Resources