How to load another file in libROSA? - python-3.x

I am new to LibROSA and want to know how to load a file other than the default example.

You're looking for librosa.load. Just provide the path to the file you want to load.
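A minimal sketch (the file path is a placeholder for your own audio file):

import librosa

# Load an arbitrary audio file instead of the bundled example clip.
# sr=None keeps the file's native sample rate; the default resamples to 22050 Hz.
y, sr = librosa.load("my_recording.wav", sr=None)
print(y.shape, sr)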

Related

Confusion regarding joblib.dump()

One way to save sklearn models is to use joblib.dump(model, filename). I am confused about the filename argument. One way to call this function is:
joblib.dump(model,"model.joblib")
This saves the model successfully, and the model is also loaded correctly using:
model=joblib.load("model.joblib")
Another way is to use:
joblib.dump(model,"model")
With no ".joblib" extension this time. This also runs successfully, and the model is loaded correctly using:
model=joblib.load("model")
What confuses me is the file extension in the filename. Is there a particular file extension I should use when saving the model, or is a file extension unnecessary, as in the second example? If it is not necessary, why?
There is no file extension that "must" be used to serialize a model. You can specify the compression method by using one of the supported filename extensions (.z, .gz, .bz2, .xz or .lzma). By default joblib will use zlib to serialize objects.
Therefore you can use any file extension. However, it is good practice to use the library name as the extension so that you know later how to load the file.
I name my serialized model model.pickle when I am using the pickle library and model.joblib when I am using joblib.
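For illustration, a small sketch of both variants; the toy model and filenames here are just examples:

from sklearn.linear_model import LogisticRegression
import joblib

model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])  # toy model for illustration

# Any filename works; a compression method is inferred from a supported extension.
joblib.dump(model, "model.joblib")      # plain dump, extension chosen only for readability
joblib.dump(model, "model.joblib.gz")   # gzip-compressed because of the .gz extension

restored = joblib.load("model.joblib.gz")
print(restored.predict([[0.5]]))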

Load data file pyomo

How can I load a data file for a model to run in Pyomo? I want to run a specific model, but I do not know in which directory I should save the file.dat so that it can be run from the prompt.
The easiest way to do this is to have your data file in the same directory as the .py file containing your model. Please see the Pyomo online documentation for more details (http://pyomo.readthedocs.io/en/latest/data/index.html)
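As a rough sketch, assuming an abstract model whose script lives next to the data file (the component names and file.dat are placeholders):

from pyomo.environ import AbstractModel, Set, Param

model = AbstractModel()
model.I = Set()
model.c = Param(model.I)

# create_instance resolves "file.dat" relative to the current working directory,
# so run the script from the directory that holds both files.
instance = model.create_instance("file.dat")
instance.pprint()

Equivalently, from the command line you can run pyomo solve model.py file.dat --solver=glpk from that same directory.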

NetSuite SuiteScript: How to get around the 10MB limit?

Hi, and thanks for any help. Is there a way to work with files larger than 10MB? I have to check for updates on items in a file that would be uploaded, but the file contains all items in the system and is approximately 20MB. This 10MB limit is killing me. I see streaming for file save and append but not for file reading, so I am open to any suggestions. The provider in this instance doesn't offer the facility to chunk the files. Thanks in advance for your help.
If you are using SuiteScript 2.0 to process a file from the File Cabinet and you read it with file.lines.iterator(), the 10MB size limit applies per line rather than to the whole file.
I believe returning a file object from a map/reduce script's getInputData stage automatically parses the file into lines.
The 10MB file size limit comes into play if you try to create a file larger than 10MB.
If you are trying to read in an external file via script, one approach I've used is to proxy the call via an external service, e.g. query an AWS Lambda function that checks for the file and saves it to S3. Return the file path and size to your SuiteScript. The SuiteScript then asks for "pages" of the file that are less than 10MB and saves those. If you are uploading something like a .csv, the Lambda function can send the header with each paged request.
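If it helps, here is a rough Python sketch of the Lambda side of that proxy; the bucket, key, and page size are hypothetical, and the SuiteScript side would request page N and reassemble the chunks:

import boto3

s3 = boto3.client("s3")
PAGE_BYTES = 9 * 1024 * 1024  # stay safely under the 10MB limit

def handler(event, context):
    # event carries the bucket/key of the file previously saved to S3, plus a page number
    bucket, key = event["bucket"], event["key"]
    page = int(event.get("page", 0))

    start = page * PAGE_BYTES
    end = start + PAGE_BYTES - 1
    chunk = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")["Body"].read()
    total = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]

    return {
        "page": page,
        "size": total,
        "last": end >= total - 1,
        "data": chunk.decode("utf-8", errors="replace"),
    }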

Read NetCDF file from Azure file storage

I have uploaded a file to my Azure file storage account and created a SAS (shared access signature). Let's pretend the file in question is called fileA.nc
Now, with Python3, I am attempting to read fileA.nc:
from netCDF4 import Dataset
url = 'https://<my-azure-resource-group>.file.core.windows.net/<some-file-share>/fileA.nc<SAS-token>'
dataset = Dataset(url)
print(dataset.variables.keys())
The above code does not work, instead giving me the following error:
Traceback (most recent call last):
  File "yadaYadaYada/test.py", line 8, in <module>
    dataset = Dataset(url)
  File "netCDF4/_netCDF4.pyx", line 1848, in netCDF4._netCDF4.Dataset.__init__ (netCDF4/_netCDF4.c:13983)
OSError: NetCDF: Malformed or unexpected Constraint
This is line 8:
dataset = Dataset(url)
I know the URL provided works. If I paste it into the browser, the file downloads...
I have checked the netCDF4 documentation, which says this:
Remote OPeNDAP-hosted datasets can be accessed for reading over http if a URL is provided to the Dataset constructor instead of a filename. However, this requires that the netCDF library be built with OPeNDAP support, via the --enable-dap configure option (added in version 4.0.1).
However, I have no idea how to tell whether PyCharm installed netCDF4 with the --enable-dap option, though I cannot imagine why it would not. Besides, if I point the URL at some HTML, I get the HTML back in the error dump, so netCDF4 does appear to be trying to load a remote dataset, and the problem must lie elsewhere.
I'd really appreciate some help here. Maybe someone knows of another Python 3 netCDF library that will allow me to load my datasets from Azure?
UPDATE
Okay, I can now confirm that the Python netCDF4 library does come with OPeNDAP enabled:
Hello again, netCDF4 1.0.4 with OpenDAP support is now available in the conda repository on Unix. To install: $ conda install netcdf4
Ilan
I have found a solution. It turns out that you cannot read directly from an Azure File share, even though when you paste the link to a file in the browser, the file begins to download.
What I needed to do was to mount the File Share on my OS. In my case I was using Windows, but this can be done on Linux too. The following command should be modified accordingly and then run in Command Prompt:
net use <drive-letter>: \\<storage-account-name>.file.core.windows.net\<share-name>
Example:
net use z: \\samples.file.core.windows.net\logs
Once the File Share is mounted, you can read from it as if it were an external HDD. You may need to add permissions, but I didn't.
Here is the link to the documentation for mounting the File Share: Documentation
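Once the share is mounted (Z: in the example above), the file can be opened like any local NetCDF file. A short sketch, assuming the same fileA.nc and drive letter:

from netCDF4 import Dataset

# Read from the mounted drive instead of the https URL
dataset = Dataset(r"Z:\fileA.nc")
print(dataset.variables.keys())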

Unable to Load Resource Using getResource

I am trying to simply load a file from a package's resources folder. I have the following project structure:
I have tried the following in an attempt to load each of the .txt files from the Populator.groovy script:
File file = new File(Populator.class.getResource("/names/first-names.txt").getFile())
The above results in a FileNotFoundException if any methods are called on the file instance. The path returned is correct, and the file is indeed where the path specifies. I am also using very similar methods of extracting resources in the modules above, and no errors occur there. What's going on here?
Why not
File file = new File(Populator.class.getResource("/names/first-names.txt").toURI())
Not sure why you want it as a file, though. Wouldn't an input stream do?
