I am doing PCA on CIFAR-10 images on the IBM Watson Studio free version, so I uploaded the Python file for downloading CIFAR-10 to the studio (pic below).
But when I try to import cache, the following error shows up (pic below).
After spending some time on Google I found a solution, but I can't understand it.
Link:
https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/add-script-to-notebook.html
The solution is as follows:
Click the Add Data icon, and then browse to the script file or drag it into your notebook sidebar.
Click in an empty code cell in your notebook and then click the Insert to code link below the file. Take the returned string, and write to a file in the file system that comes with the runtime session.
To import the classes to access the methods in a script in your notebook, use the following command:
For Python:
from <python file name> import <class name>
I can't understand this line:
"and write to a file in the file system that comes with the runtime session."
Where can I find the file that comes with the runtime session? Where is the file system located?
Can anyone please help me with this and tell me where to find that file?
You get the import error because the script that you are trying to import is not available in your Python runtime's local filesystem. The files you uploaded (cache.py, cifar10.py, etc.) go to the object storage bucket associated with the Watson Studio project. To use those files you need to make them available to the Python runtime, for example by downloading the script to the runtime's local filesystem.
UPDATE: In the meantime there is an option to insert the StreamingBody objects directly. This will also have all the required credentials included. If you are using the insert StreamingBody object option, you can skip ahead to the part of this answer about writing the file to the local runtime filesystem.
Or,
You can use the code snippet below to read the script in a StreamingBody object:
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

os_client = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='<IBM_API_KEY_ID>',
    ibm_auth_endpoint='<IBM_AUTH_ENDPOINT>',
    config=Config(signature_version='oauth'),
    endpoint_url='<ENDPOINT>')

# Your data file was loaded into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about the possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/
streaming_body_1 = os_client.get_object(Bucket='<BUCKET>', Key='cifar.py')['Body']

# add missing __iter__ method, so pandas accepts the body as a file-like object
if not hasattr(streaming_body_1, "__iter__"):
    streaming_body_1.__iter__ = types.MethodType(__iter__, streaming_body_1)
And then write it to a file in the local runtime filesystem.
f = open('cifar.py', 'wb')
f.write(streaming_body_1.read())
This opens a file with write access and calls the write method to write to the file. You should then be able to simply import the script.
import cifar
Note: You can get the credentials like IBM_API_KEY_ID for the file by clicking on the Insert credentials option on the drop-down menu for your file.
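Alternatively (not from the original answer, just a sketch that reuses the same placeholder credentials as above), the COS client can download the object straight to the runtime's local filesystem in one call:
# Download the script directly to the runtime's local filesystem, then import it.
# '<BUCKET>' is the same placeholder as in the snippet above.
os_client.download_file(Bucket='<BUCKET>', Key='cifar.py', Filename='cifar.py')
import cifar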
The instructions that the OP found miss one crucial line of code. I followed them and was able to import modules, but wasn't able to use any functions or classes in those modules. This was fixed by closing the file after writing. This part of the instructions:
f = open('<myScript>.py', 'wb')
f.write(streaming_body_1.read())
should instead be (at least this works in my case):
f = open('<myScript>.py', 'wb')
f.write(streaming_body_1.read())
f.close()
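Equivalently, a with block closes the file automatically (a minimal sketch using the same placeholder filename):
# Using a context manager guarantees the file is flushed and closed before the import.
with open('<myScript>.py', 'wb') as f:
    f.write(streaming_body_1.read())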
Hopefully this helps someone.
Related
I need to build a solution for a use case, and I am still a bit of a novice with Python 3.9.9's capabilities.
Use Case:
User Billy wants to run a script against a Snowflake database hosted on Azure, call it sandbox, using his own Python script on his local machine.
Billy's Python script, to keep connection settings secure, needs to call a snowflake_conn.py script located in another network folder (\\abs\here\is\snowflake_conn.py) and pass arguments for DB & schema.
The call will return a connection to Snowflake that Billy can use to run his SQL script.
I am envisioning something like:
import pandas as pd
import snowflake_conn  # I need to know how to find this in a network folder, not locally.

# and then call the custom conn function,
# where it returns the snowflake.connector connection as sfconn
sfconn = snowflake_conn.snowflake_connect('database', 'schema')
conn1 = sfconn.cursor()

qry = r'select * from tablename where 1=1'
conn1.execute(qry)
df = conn1.fetch_pandas_all()
I saw something like this, but that was from back in 2016 and likely predates 3.9.9.
import sys
sys.path.insert(0, "/network/modules/location") # OR "\\abs\here\is\" ??
import snowflake_conn
That snowflake_conn.py file uses configparser.ConfigParser().read() to open a config.ini file in the same folder as the snowflake_conn.py script.
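A minimal sketch of how the two pieces could fit together, assuming the UNC path above and that snowflake_conn.py resolves config.ini relative to itself (all names below are illustrative, not an existing setup):
# --- Billy's local script ---
import sys
sys.path.insert(0, r"\\abs\here\is")  # raw string so the backslashes survive
import snowflake_conn

# --- inside \\abs\here\is\snowflake_conn.py ---
import configparser
from pathlib import Path

def snowflake_connect(database, schema):
    config = configparser.ConfigParser()
    # Resolve config.ini next to this file, not the caller's working directory.
    config.read(Path(__file__).parent / "config.ini")
    # ... build and return the snowflake.connector connection here ...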
I am following the instructions in another Stack Overflow question (link below, about 4 years old) to help get the config.ini setup completed:
import my database connection with python
I also found this link, which seems to cover only a local folder structure, not a network folder.
https://blog.finxter.com/python-how-to-import-modules-from-another-folder/
Eventually I want to try to encrypt the .ini file to protect its contents for increased security, but I'm not sure where to start on that yet.
Since read_excel's default engine xlrd has been deprecated in newer pandas releases, how do I make openpyxl the default engine for all my pd.read_excel calls?
Right now, if I update pandas, I have to pass engine="openpyxl" in every pd.read_excel call, which seems unnecessary.
It's easy! You can do it by changing the method's default values in the _base.py file inside your environment's pandas folder. You can find it as follows:
import pandas as pd
print(pd.__file__)
Once in the pandas folder, dive into the folder io > excel > _base.py
Open the file and find
def read_excel(...)
You will find the default value for engine. Change it to 'openpyxl'
If you're using VS Code, simply right-click on a call to .read_excel and press F12 to go to the definition, then change it right there.
If you're using pandas version 1.1.5 or another recent version, this might help:
Run print(pd.__file__) to see where your pandas library is stored. Usually the path ends in "Lib\site-packages\pandas". Then open the "io" folder and the "excel" folder inside it. There you will find the "_base.py" file.
Look for def __init__. There you will find the default engine for reading Excel files (around line 849 in that version). It should read:
if engine is None:
    engine = "openpyxl"
I am using Pyomo in a Jupyter Notebook. I have set keepfiles=True in solve, and I am able to get the location where the .sol file is stored. How can I get the filename of the .sol file created for the current instance?
I have used the following:
from pyomo.opt import SolverFactory
SolverFactory("cbc").options['solu']="solution_file.sol"
But this does not work in creating the desired solution file.
If you add the keepfiles=True option to your call to solve, the temporary files that are used to pass the model to the solver and to read in the results will not be deleted, and the path to them will be printed on the screen. So I would create and call your solver using something like:
from pyomo.opt import SolverFactory
solver = SolverFactory("cbc")
solver.solve(model, keepfiles=True)
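For reference, a minimal end-to-end sketch (the toy model is made up for illustration, and CBC is assumed to be installed):
from pyomo.environ import ConcreteModel, Var, Objective, Constraint, NonNegativeReals, maximize
from pyomo.opt import SolverFactory

# A tiny illustrative model.
model = ConcreteModel()
model.x = Var(within=NonNegativeReals)
model.obj = Objective(expr=model.x, sense=maximize)
model.con = Constraint(expr=model.x <= 10)

solver = SolverFactory("cbc")
# keepfiles=True preserves the temporary problem and solution files (including the .sol file)
# and prints their paths; tee=True also echoes the solver log.
results = solver.solve(model, keepfiles=True, tee=True)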
I'm trying to process images and save the results.
I am able to process them, but the images are not getting saved in the folder.
Code (taken from GitHub):
import cv2, glob, numpy

def scaleRadius(img, scale):
    x = img[int(img.shape[0] / 2), :, :].sum(1)
    r = (x > x.mean() / 10).sum() / 2
    s = scale * 1.0 / r
    return cv2.resize(img, (0, 0), fx=s, fy=s)

scale = 512
for f in glob.glob("pdr/*.jpeg"):
    a = cv2.imread(f)
    a = scaleRadius(a, scale)
    b = numpy.zeros(a.shape)
    cv2.circle(b, (int(a.shape[1] / 2), int(a.shape[0] / 2)), int(scale * 0.9), (1, 1, 1), -1, 8, 0)
    aa = cv2.addWeighted(a, 4, cv2.GaussianBlur(a, (0, 0), scale / 30), -4, 128) * b + 128 * (1 - b)
    cv2.imwrite(str(scale) + "_" + f, aa)
The code executes without errors, but the output does not get saved.
cv2.imwrite() doesn't create the directory for you; make sure you've created the directory 512_pdr before running the script.
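For example (a small addition to the script, not in the original answer), create it up front:
import os
# cv2.imwrite() fails silently (returns False) if the target directory doesn't exist,
# so create "512_pdr" before the loop; exist_ok avoids an error if it is already there.
os.makedirs("512_pdr", exist_ok=True)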
I have uploaded a file to my Azure file storage account and created a SAS (shared access signature). Let's pretend the file in question is called fileA.nc
Now, with Python3, I am attempting to read fileA.nc:
from netCDF4 import Dataset
url = 'https://<my-azure-resource-group>.file.core.windows.net/<some-file-share>/fileA.nc<SAS-token>'
dataset = Dataset(url)
print(dataset.variables.keys())
The above code does not work, instead giving me the following error:
Traceback (most recent call last):
  File "yadaYadaYada/test.py", line 8, in <module>
    dataset = Dataset(url)
  File "netCDF4/_netCDF4.pyx", line 1848, in netCDF4._netCDF4.Dataset.__init__ (netCDF4/_netCDF4.c:13983)
OSError: NetCDF: Malformed or unexpected Constraint
This is line 8:
dataset = Dataset(url)
I know the URL provided works. If I paste it into the browser, the file downloads...
I have checked the netCDF4 documentation, which says this:
Remote OPeNDAP-hosted datasets can be accessed for reading over http if a URL is provided to the Dataset constructor instead of a filename. However, this requires that the netCDF library be built with OPeNDAP support, via the --enable-dap configure option (added in version 4.0.1).
However, I have no idea how to tell whether, when PyCharm installed netCDF4, it used the --enable-dap option, though I cannot imagine why it would not. Besides, if I point it at a URL that returns some HTML, I get the HTML in the error dump, so from that I would think netCDF4 is actually trying to load a remote dataset and the problem lies somewhere else.
I'd really appreciate some help here. Maybe someone knows of another Python 3 netCDF library that will allow me to load my datasets from Azure?
UPDATE
Okay, I can now confirm that the Python netCDF4 library does come with OPeNDAP enabled:
Hello again, netCDF4 1.0.4 with OpenDAP support is now available in the conda repository on Unix. To install: $ conda install netcdf4
Ilan
I have found a solution. It turns out that you cannot read directly from an Azure File share, even though when you paste the link to a file in the browser, the file begins to download.
What I needed to do was mount the File Share on my OS. In my case I was using Windows, but this can be done with Linux too. The following command should be modified accordingly and then run in Command Prompt:
net use <drive-letter>: \\<storage-account-name>.file.core.windows.net\<share-name>
Example:
net use z: \\samples.file.core.windows.net\logs
Once the File Share is mounted, you can read from it as if it were an external HDD. You may need to add permission, but I didn't.
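Once mounted, a minimal sketch of the read (the drive letter and path are illustrative):
from netCDF4 import Dataset

# Open the file via the mounted drive instead of the HTTPS/SAS URL.
dataset = Dataset(r"Z:\fileA.nc")
print(dataset.variables.keys())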
Here is the link to the documentation for mounting the File Share: Documentation