I have uploaded a file to my Azure file storage account and created a SAS (shared access signature). Let's pretend the file in question is called fileA.nc
Now, with Python3, I am attempting to read fileA.nc:
from netCDF4 import Dataset
url ='https://<my-azure-resource-group>.file.core.windows.net/<some-file-share>/fileA.nc<SAS-token>';
dataset = Dataset(url)
print(dataset.variables.keys())
The above code does not work, instead giving me the following error:
Traceback (most recent call last): File "yadaYadaYada/test.py", line
8, in
dataset = Dataset(url) File "netCDF4/_netCDF4.pyx", line 1848, in netCDF4._netCDF4.Dataset.init (netCDF4/_netCDF4.c:13983)
OSError: NetCDF: Malformed or unexpected Constraint
This is line 8:
dataset = Dataset(url)
I know the URL provided works. If I paste it into the browser, the file downloads...
I have checked the netCDF4 documentation, which says this:
Remote OPeNDAP-hosted datasets can be accessed for reading over
http
if a URL is provided to the Dataset constructor instead of a filename.
However, this requires that the netCDF library be built with OPenDAP
support, via the --enable-dap configure option (added in version
4.0.1).
However, I have no idea how to tell if when Pycharms installed netcdf4, it used the --enable-dap argument, but I cannot imagine why it would not. Besides, if I stick in a url which points to some HTML, I get the HTML in the error dump and so from that I would think netcdf4 is actually trying to load a remote dataset and so the problem is somewhere else.
I'd really appreciate some help here. Maybe someone knows of another Python 3 netCDF library that will allow me to load my datasets from Azure?
UPDATE
Okay, I can now confirm that the python netcdf4 library does come with --OPenDAP enabled:
Hello again, netCDF4 1.0.4 with OpenDAP support is now available in
the conda respoitory on Unix. To install: $ conda install netcdf4
Ilan
I have found a solution. It turns out that you cannot read directly from an Azure File share, even though when you paste the link to a file in the browser, the file begins to download.
What I needed to do was to mount the File Share on my OS. In my case, I was using Windows but this can be done with Linux, too. The following code should be modified accordingly and then put into Command Prompt:
net use <drive-letter>: \\<storage-account-name>.file.core.windows.net\<share-name>
example :
net use z: \\samples.file.core.windows.net\logs
Once the File Share is mounted, you can read from it as if it were an external HDD. You may need to add permission, but I didn't.
Here is the link to the documentation for mounting the File Share: Documentation
Related
I was able to run the Flask app with yolov5 on a PC with an internet connection. I followed the steps mentioned in yolov5 docs and used this file: yolov5/utils/flask_rest_api/restapi.py,
But I need to achieve the same offline(On a particular PC). Now the issue is, when I am using the following:
model = torch.hub.load("ultralytics/yolov5", "yolov5", force_reload=True)
It tries to download model from internet. And throws an error.
Urllib.error.URLError: <urlopen error [Errno - 2] name or service not known>
How to get the same results offline.
Thanks in advance.
If you want to run detection offline, you need to have the model already downloaded.
So, download the model (for example yolov5s.pt) from https://github.com/ultralytics/yolov5/releases and store it for example to the yolov5/models.
After that, replace
# model = torch.hub.load("ultralytics/yolov5", "yolov5s", force_reload=True) # force_reload to recache
with
model = torch.hub.load(r'C:\Users\Milan\Projects\yolov5', 'custom', path=r'C:\Users\Milan\Projects\yolov5\models\yolov5s.pt', source='local')
With this line, you can run detection also offline.
Note: When you start the app for the first time with the updated torch.hub.load, it will download the model if not present (so you do not need to download it from https://github.com/ultralytics/yolov5/releases).
There is one more issue involved here. When this code is run on a machine that has no internet connection at all. Then you may face the following error.
Downloading https://ultralytics.com/assets/Arial.ttf to /home/<local_user>/.config/Ultralytics/Arial.ttf...
Traceback (most recent call last):
File "/home/<local_user>/Py_Prac_WSL/yolov5-flask-master/yolov5/utils/plots.py", line 56, in check_pil_font
return ImageFont.truetype(str(font) if font.exists() else font.name, size)
File "/home/<local_user>/.local/share/virtualenvs/23_Jun-82xb8nrB/lib/python3.8/site-packages/PIL/ImageFont.py", line 836, in truetype
return freetype(font)
File "/home/<local_user>/.local/share/virtualenvs/23_Jun-82xb8nrB/lib/python3.8/site-packages/PIL/ImageFont.py", line 833, in freetype
return FreeTypeFont(font, size, index, encoding, layout_engine)
File "/home/<local_user>/.local/share/virtualenvs/23_Jun-82xb8nrB/lib/python3.8/site-packages/PIL/ImageFont.py", line 193, in __init__
self.font = core.getfont(
OSError: cannot open resource
To overcome this error, you need to download manually, the Arial.ttf file from https://ultralytics.com/assets/Arial.ttf and paste it to the following location, on Linux:
/home/<your_pc_user>/.config/Ultralytics
On windows, paste Arial.ttf here:
C:\Windows\Fonts
The first line of the error message mentions the same thing. After this, the code runs smoothly in offline mode.
Further as mentioned at https://docs.ultralytics.com/tutorials/pytorch-hub/, any custom-trained-model other than the one uploaded at PyTorch-model-hub can be accessed by this code.
path_hubconfig = 'absolute/path/to/yolov5'
path_trained_model = 'absolute/path/to/best.pt'
model = torch.hub.load(path_hubconfig, 'custom', path=path_trained_model, source='local') # local repo
With this code, object detection is carried out by the locally saved custom-trained model. Once, the custom trained model is saved locally this piece of code access it directly avoiding any necessity of the internet.
The model was originally created in Keras using a remote Linux environment that I no longer have access to. I now need to load the model on my Windows machine (a couple of years later). The program is written in Python if that's relevant.
The error I get:
Exception has occurred: OSError
Unable to open file (bad superblock version number)
File "Directory/file” line 13, in model = load_model('model1.h5')
I've tried installing h5py-2.10.0 so that I could install h5repack to convert the file to a readable format but when I do that, it says that h5repack is unable to open the file.
h5repack error: < 'model1.h5' >: unable to open file
I’m reluctant to try switching my machine over to Linux unless there’s no other solution.
Thanks in advance for your help!
I am doing PCA on CIFAR 10 image on IBM WATSON Studio Free version so I uploaded the python file for downloading the CIFAR10 on the studio
pic below.
But when I trying to import cache the following error is showing.
pic below-
After spending some time on google I find a solution but I can't understand it.
link
https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/add-script-to-notebook.html
the solution is as follows:-
Click the Add Data icon (Shows the Add Data icon), and then browse the script file or drag it into your notebook sidebar.
Click in an empty code cell in your notebook and then click the Insert to code link below the file. Take the returned string, and write to a file in the file system that comes with the runtime session.
To import the classes to access the methods in a script in your notebook, use the following command:
For Python:
from <python file name> import <class name>
I can't understand this line
` and write to a file in the file system that comes with the runtime session.``
Where can I find the file that comes with runtime session? Where is the file system located?
Can anyone plz help me in this with the details where to find that file
You have the import error because the script that you are trying to import is not available in your Python runtime's local filesystem. The files (cache.py, cifar10.py, etc.) that you uploaded are uploaded to the object storage bucket associated with the Watson Studio project. To use those files you need to make them available to the Python runtime for example by downloading the script to the runtimes local filesystem.
UPDATE: In the meanwhile there is an option to directly insert the StreamingBody objects. This will also have all the required credentials included. You can skip to writing it to a file in the local runtime filesystem section of this answer if you are using insert StreamingBody object option.
Or,
You can use the code snippet below to read the script in a StreamingBody object:
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3
def __iter__(self): return 0
os_client= ibm_boto3.client(service_name='s3',
ibm_api_key_id='<IBM_API_KEY_ID>',
ibm_auth_endpoint="<IBM_AUTH_ENDPOINT>",
config=Config(signature_version='oauth'),
endpoint_url='<ENDPOINT>')
# Your data file was loaded into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about the possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/
streaming_body_1 = os_client.get_object(Bucket='<BUCKET>', Key='cifar.py')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(streaming_body_1, "__iter__"): streaming_body_1.__iter__ = types.MethodType( __iter__, streaming_body_1 )
And then write it to a file in the local runtime filesystem.
f = open('cifar.py', 'wb')
f.write(streaming_body_1.read())
This opens a file with write access and calls the write method to write to the file. You should then be able to simply import the script.
import cifar
Note: You can get the credentials like IBM_API_KEY_ID for the file by clicking on the Insert credentials option on the drop-down menu for your file.
The instructions that op found miss one crucial line of code. I followed them and was able to import modules but wasn't able to use any functions or classes in those modules. This was fixed by closing the files after writing. This part in the instrucitons:
f = open('<myScript>.py', 'wb')
f.write(streaming_body_1.read())
should instead be (at least this works in my case):
f = open('<myScript>.py', 'wb')
f.write(streaming_body_1.read())
f.close()
Hopefully this helps someone.
I'm new to Deep Learning and PyTorch, so please do bear with me if some questions seem silly or I'm not asking in the correct format.
I was watching this video as part of a PyTorch series on Deep Learning: https://www.youtube.com/watch?v=8n-TGaBZnk4 . This video specifically is about ETL (using Fashion-MNIST dataset).
I have a few questions on the video at 7:05.
Question 1: In the Fashion-MNIST subclass constructor we passed it the argument:
‘root’, where the instructor mentioned: this is the location in disk where data is located. Sorry maybe this is a silly question, but is this where the data is located on the source server (from the URL) disk, or is this the path location where you want to save the data on your computer locally?
Question 2: Also for the Fashion-MNIST is the 'root' always the same location path: i.e. './data/FashionMNIST'?
Question 3: If the 'root' defines the location path where the data is located on the source server, then where would it be downloaded on locally? I checked my 'download' folder (I'm using Windows 7 laptop), and couldn't find the files there?
Question 4: The video mentioned that we should check if the data, in subsequent calls, are downloaded already or not (i.e. in the argument we pass download=true).
4(a): What's a good approach to do this? Do we put an if statement in place to check for this? Or is there a smarter way of checking for downloaded data?
4(b): Also what does it mean by "subsequent calls"? Does it mean when we need to call the 'FashionMNIST' constructor again for the test_data download?
Question 5: Finally, I tried running the code below (which is the one in the video) on Spyder IDE (Python 3.5):
import torch
import torchvision
import torchvision.transforms as transforms
train_set = torchvision.datasets.FashionMNIST(
root='./data/FashionMNIST'
,train=True
,download=True
,transform=transforms.Compose([
transforms.ToTensor()
])
)
I got the output:
Traceback (most recent call last):
File "<ipython-input-3-3ac000b9e90a>", line 10, in <module>
transforms.ToTensor()
File "C:\Program Files\Anaconda3\lib\site-packages\torchvision\datasets\mnist.py", line 68, in __init__
self.download()
File "C:\Program Files\Anaconda3\lib\site-packages\torchvision\datasets\mnist.py", line 136, in download
makedir_exist_ok(self.raw_folder)
File "C:\Program Files\Anaconda3\lib\site-packages\torchvision\datasets\utils.py", line 41, in makedir_exist_ok
os.makedirs(dirpath)
File "C:\Program Files\Anaconda3\lib\os.py", line 241, in makedirs
mkdir(name, mode)
FileNotFoundError: [WinError 206] The filename or extension is too long: './data/FashionMNIST\\FashionMNIST\\raw'
Not sure why I got that error at the end. In addition I ran the code on Jupyter Notebook, as per the video, and it worked fine. But I'm wondering why it throws that error in Spyder IDE.
Many thanks in advance.
No genuine question is a silly question, Answering questions one bye one:
Ans 1 & 2 :
root is the path on your local disk where the data will be saved, you can give ny path according to your liking it will not cause an issue.
Ans 3:
The urls etc are defined within the files and the path of the data is all you need to do: in order to look at the urls from where the data is downloaded here is a link.
Ans 4. : download = True merely gives it permission to download if the data doesn't exists the downloader will automatically check if the data already exists, if it exists it will still not download, even if download is set to be true, again it happens in the background you don't have to worry about it.
Ans5 : The issue isn't a torch issue exactly it has more to do with how it is being compiled on in windows, the issue is discussed at length here & here
I have saved some objects with shelve. In an other file I was able to restore these objects. But when I copy the archive to an other computer shelve gives me a _gdbm.error: File read error. The packages which hold the classes of the stored objects are directly accessible on both computers (but they are stored in different locations and added with PYTHONPATH). Both machines run on Ubuntu 13.10, one is 32bit and the other 64bit.
Shouldn't these archives be machine independent?
On the 64bit machine I get
>>> import shelve
>>> shelve.open('arch.db')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.3/shelve.py", line 232, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File "/usr/lib/python3.3/shelve.py", line 216, in __init__
Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
File "/usr/lib/python3.3/dbm/__init__.py", line 94, in open
return mod.open(file, flag, mode)
_gdbm.error: File read error
and on the 32bit machine it works.
When I create an archive on the 64bit machine, it is openable on the 32bit machine, but the interactive python prompt crashes:
>>> import shelve
>>> s = shelve.open('arch.db')
>>> for i in s.items(): print(i)
...
gdbm fatal: lseek error
I don't even get a traceback.
It is really annoying, I intended to work on both computers, but at the moment I am bound to my slow 32bit eeepc, because I've already saved a lot into the archive.
The problem is that shelve uses gdbm (provided as dbm.gnu as default backend to store the serialized objects. The files created with gdbm depend on the architecture of the system and are therefore only usable on 32bit or 64bit systems.
There are some tools (gdbmexport or gdbm_dump) which allow you to convert the gdbm files, however, the workflow becomes error prone if you want to access the files from both systems on a regular basis.
Fortunately, python provides different backends: dbm.gnu, dbm.ndbm and dbm.dumb. The latter two are platform independent.
import shelve
import dbm
dbm._defaultmod = dbm.ndbm
db = shelve.open('somename')
The database with the above code can be used on 64bit and 32bit systems.
The default backend has to be set only when you create the file. dbm checks the file type of your database before opening and uses the correct backend.
Please note that the above code changes the default dbm for the whole python process. Another component might break if it relies on gdbm being the default.