Sharing Python Notebooks in Watson Studio - python-3.x

I have a few notebooks in my Watson Studio project. They use a common set of function definitions. I'm constantly refining these common function definitions, so I want to keep them in one notebook. How can I share this notebook with the other notebooks?

You cannot include one notebook from another notebook in Watson Studio on Cloud.
What you can do is let a notebook write a file with your shared functions to the project storage. Other notebooks can then fetch that file from project storage, save it to the transient local disk storage, and import it from there. The project-lib utility helps with accessing the project storage.
You may use the %save magic to write functions from different cells to a .py file on disk. Then the final cell of the notebook can load that file and write it to the project storage.
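A rough sketch of that flow with project-lib might look like the following; the file name, project id, and access token are placeholders, and the exact Project constructor arguments should be taken from the project token cell that Watson Studio inserts for you.
# Minimal sketch, assuming project-lib is available in the runtime.
from project_lib import Project
project = Project(project_id="<project-id>", project_access_token="<access-token>")

# In the notebook that defines the helpers: after %save has written
# shared_functions.py to the local disk, push it to project storage.
with open("shared_functions.py", "rb") as f:
    project.save_data("shared_functions.py", f.read(), overwrite=True)

# In any other notebook: fetch the file, write it to the transient disk, import it.
with open("shared_functions.py", "wb") as f:
    f.write(project.get_file("shared_functions.py").read())
import shared_functions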

Related

using dbutils (dbutils.fs.rm in a databricks Job) azure databricks

https://docs.databricks.com/dev-tools/databricks-utils.html
I am trying to use dbutils.fs.rm on a DBFS folder in an Azure Databricks job. It's actually a big pain; dbutils.fs.rm resolves all the issues but seems to work only in a notebook.
The issues I am having involve sub-folders that contain files. I want an easy way within Python to delete a whole folder and all of its contents.
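For context, the recursive form of the call being referred to looks like this in a notebook; the path below is a placeholder.
# Remove the folder and everything under it (sub-folders and files) in one call.
dbutils.fs.rm("dbfs:/mnt/some/folder", recurse=True)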

Azure Synapse: Upload directory of py files in Spark job reference files

I am trying to pass a whole directory of Python files that are referenced in the main Python file of an Azure Synapse Spark job definition, but the files do not appear in that location and I get a ModuleNotFoundError. I am trying to upload them like this:
abfss://[directory path in data lake]/*
You have to trick the Spark job definition by exporting it, editing it as a JSON, and importing it back.
After the export, open the file in a text editor and add the following:
"conf": {
"spark.submit.pyFiles":
"path-to-abfss/module1.zip, path-to-abfss/module2.zip"
},
Now, import the JSON back.
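For illustration, a hypothetical sketch of how such a module1.zip could be built locally from a folder of .py files before uploading it to the abfss path yourself; the folder name is a placeholder.
import shutil

# Packs ./module1/ (containing __init__.py and the referenced .py files)
# into module1.zip, keeping the module1/ prefix so `import module1` works
# once Spark adds the zip to sys.path via spark.submit.pyFiles.
shutil.make_archive("module1", "zip", root_dir=".", base_dir="module1")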
The way to achieve this on Synapse is to package your Python files into a wheel package and upload it to a specific location in Azure Data Lake Storage, from which your Spark pool will load it every time it starts. This makes the custom Python packages available to all jobs and notebooks using that Spark pool.
You can find more details on the official documentation: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-manage-python-packages#install-wheel-files
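As a rough illustration of the wheel approach (the package name and layout are placeholders; the upload location is described in the linked documentation):
# setup.py -- a minimal, hypothetical packaging of the shared modules.
from setuptools import setup, find_packages

setup(
    name="my_shared_modules",   # placeholder name
    version="0.1.0",
    packages=find_packages(),
)

# Build the wheel locally with `python setup.py bdist_wheel` (or `python -m build`),
# then upload the resulting .whl from dist/ to the Spark pool's packages
# location in ADLS as described in the documentation above.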

Databricks Access local notebook

I have created some notebooks on Databricks and I want to access them. One notebook has the local path
/Users/test#gmx.de/sel2
If I now try to access the directory via
%fs /Users/test#gmx.de
I am getting an error message saying that the local directory is not found.
What am I doing wrong?
Many thanks!
The notebooks aren't real objects located on the file system. A notebook is an in-memory representation, stored in a database in the Databricks-managed control plane; see the architecture diagram in the documentation.
If you want to export a notebook to the local file system, you can do it via the Databricks CLI or via the UI. You can also include it in another notebook via %run, or execute it from another notebook with a notebook workflow (dbutils.notebook.run). And you can run tests inside it with tools like Nutter.
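A small sketch of the two in-workspace options, using the asker's notebook path as a placeholder:
# Option 1: pull the notebook's code into the current notebook's scope.
# %run must be the only content of its cell:
# %run /Users/test#gmx.de/sel2

# Option 2: run it as a child notebook and collect its exit value
# (arguments are the path, a timeout in seconds, and optionally a dict of parameters):
result = dbutils.notebook.run("/Users/test#gmx.de/sel2", 600)
print(result)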

SAP Commerce Cloud Hot Folder local setup

We are trying to use the cloud hot folder functionality, and in order to do so we are modifying our existing hot-folder implementation, which was not originally implemented for use within the cloud.
Following the steps on this help page:
https://help.sap.com/viewer/0fa6bcf4736c46f78c248512391eb467/SHIP/en-US/4abf9290a64f43b59fbf35a3d8e5ba4d.html
We are trying to test the cloud functionality locally. I have an Azurite Docker container running on my machine and have modified the mentioned properties in the local.properties file, but it seems that the files are not being picked up by hybris in any of the cases we are trying.
First, in our local Azurite storage we have a blob container called hybris. Within this container we have the folders master/hotfolder, and according to the docs, uploading a sample.csv file into this folder should trigger a hot folder upload.
We also have a mapping for our hot-folder import that scans the files within this folder: #{baseDirectory}/${tenantId}/sample/classifications. The baseDirectory is configured using a property like so: ${HYBRIS_DATA_DIR}/sample/import
Can we keep these mappings within our hot folder xml definitions, or do we need to change them?
How should the blob container be named in order for it to be accessible to hybris?
Thank you very much,
I would be very happy to provide any further information.
In the end I did manage to run cloud hot folder imports on a local machine.
It was a matter of correctly configuring a number of properties that are used by the cloudhotfolder and azurecloudhotfolder extensions.
Simply use the following properties to set the desired behaviour of the system:
cluster.node.groups=integration,yHotfolderCandidate
azure.hotfolder.storage.account.connection-string=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:32770/devstoreaccount1;
azure.hotfolder.storage.container.hotfolder=${tenantId}/your/path/here
cloud.hotfolder.default.mapping.file.name.pattern=^(customer|product|url_media|sampleFilePattern|anotherFileNamePattern)-\\d+.*
cloud.hotfolder.default.images.root.url=http://127.0.0.1:32785/devstoreaccount1/${azure.hotfolder.storage.container.name}/master/path/to/media/folder
cloud.hotfolder.default.mapping.header.catalog=YourProductCatalog
And that is it. If there are existing routings for the traditional hot folder import, these can also be used, but their file name mappings should be included in the value of the cloud.hotfolder.default.mapping.file.name.pattern property.
I am trying the same - to set up a local dev environment to test out the cloud hot folder. It seems that you have had some success. Can you share where you located the azurecloudhotfolder extension, which is called out here: https://help.sap.com/viewer/0fa6bcf4736c46f78c248512391eb467/SHIP/en-US/4abf9290a64f43b59fbf35a3d8e5ba4d.html
Thanks

How can we save or upload .py file on dbfs/filestore

We have a few .py files on my local machine that need to be stored/saved on the FileStore path on DBFS. How can I achieve this?
I tried the copy actions of the dbutils.fs module.
I tried the code below but it did not work; I know something is not right with my source path. Or is there a better way of doing this? Please advise.
'''
dbUtils.fs.cp ("c:\\file.py", "dbfs/filestore/file.py")
'''
It sounds like you want to copy a file from your local machine to a DBFS path on the Azure Databricks servers. However, because the Azure Databricks notebook is a browser-based interface to the cloud workspace, code running in it cannot directly operate on files on your local machine.
So here are the solutions you can try.
As #Jon said in the comment, you can follow the official document Databricks CLI to install the Databricks CLI locally via pip install databricks-cli and then copy a file to DBFS (a sketch follows after this list).
Follow the official document Accessing Data to import data via the Drop files into or browse to files option in the Import & Explore Data box on the landing page, although using the CLI is still the recommended approach.
Upload your specified files to Azure Blob Storage, then follow the official document Data sources / Azure Blob Storage to perform the operations, including dbutils.fs.cp.
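For the CLI route, a hypothetical upload might look like the lines below (paths are placeholders), with the result checked from a notebook afterwards:
# On the local machine, after `pip install databricks-cli` and
# `databricks configure --token`:
#
#   databricks fs cp C:\file.py dbfs:/FileStore/file.py
#
# Then, from a notebook, confirm the file landed in FileStore:
display(dbutils.fs.ls("dbfs:/FileStore/"))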
Hope it helps.
