Can I use papermill for AWS SageMaker notebooks to create Excel reports?

I know I can use papermill to run Jupyter notebooks in an automated way. But if I use an AWS SageMaker notebook and my Jupyter notebook creates Excel reports (exporting to Excel), how do I find my Excel files afterwards? As far as I can tell, papermill only lets me specify a notebook as the output file, or am I missing something?
I'm planning to use the solution below:
https://github.com/aws-samples/sagemaker-run-notebook/blob/master/QuickStart.md#using-existing-aws-primitives

Yes, you can save the executed notebook to a specific location such as an S3 bucket.
For the Excel files themselves, you can call rclone from the notebook (or from your papermill driver script) as a subprocess and push them wherever you need; the source and destination locations can be hardcoded or passed in dynamically as papermill parameters.
Please refer to this link for more details about combining papermill and rclone:
https://pbpython.com/papermil-rclone-report-2.html
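A minimal sketch of that pattern, assuming rclone is installed and already configured with a remote named s3; the notebook names, the report.xlsx file name, and the bucket path are placeholders:
import subprocess
import papermill as pm

# run the report notebook; it writes report.xlsx as a side effect
pm.execute_notebook(
    "report_template.ipynb",
    "report_output.ipynb",
    parameters={"report_date": "2021-01-31"},
)

# push the executed notebook and the Excel report to S3 via rclone
for artifact in ["report_output.ipynb", "report.xlsx"]:
    subprocess.run(["rclone", "copy", artifact, "s3:my-report-bucket/reports/"], check=True)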

Related

Databricks Access local notebook

I have created some notebooks on Databricks and I want to access them. One notebook has the local path
/Users/test#gmx.de/sel2
If I now try to access the directory via
%fs /Users/test#gmx.de
I am getting an error message saying that the local directory is not found.
What am I doing wrong?
Many thanks!
The notebooks aren't real objects located on the file system. A notebook is an in-memory representation that is stored in a database in the Databricks-managed control plane (see the architecture diagram in the Databricks documentation).
If you want to export a notebook to the local file system, you can do it via the Databricks CLI or via the UI. Or you can include it in another notebook via %run, or execute it from another notebook with a notebook workflow (dbutils.notebook.run). And you can run tests inside it with tools like Nutter.
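As a rough sketch of those two routes (the path is just the one from the question; the CLI line assumes the databricks-cli is installed and configured, and the parameter name is made up):
# on your local machine, export the notebook source with the Databricks CLI:
#   databricks workspace export /Users/test#gmx.de/sel2 ./sel2.py

# inside another notebook, run it as a child notebook workflow with a 60-second timeout
result = dbutils.notebook.run("/Users/test#gmx.de/sel2", 60, {"some_param": "some_value"})
print(result)  # whatever the child notebook returned via dbutils.notebook.exit(...)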

External Properties File in Azure Databricks

We have a full-fledged Spark application that takes a lot of parameters from a properties file. Now we want to move the application to the Azure Databricks notebook format. The entire code works fine and gives the expected result with hard-coded parameters. But is it possible to use an external properties file in an Azure Databricks notebook as well? If so, where do we need to place the properties file?
You may utilize the Databricks DBFS FileStore; Azure Databricks notebooks can access users' files from there.
To upload the properties file you have, you can use two options:
Using wget:
import subprocess
# download the properties file to the driver's local disk, then copy it into DBFS
subprocess.run(["wget", "-P", "/tmp/", "http://<your-repo>/<path>/app1.properties"], check=True)
dbutils.fs.cp("file:/tmp/app1.properties", "dbfs:/FileStore/configs/app1/")
Using dbutils.fs.put (maybe a one-time activity to create this file):
dbutils.fs.put("dbfs:/FileStore/configs/app1/app1.properties", "prop1=val1\nprop2=val2")
To read the properties file values into a dict:
properties = dict(line.strip().split('=') for line in open('/dbfs/FileStore/configs/app1/app1.properties'))
Hope this helps!!
There's also a possibility of providing/returning arguments with the Databricks Jobs REST API; more information can be found e.g. here: https://docs.databricks.com/dev-tools/api/latest/examples.html#jobs-api-example
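A minimal sketch of that approach, assuming a job already wraps the notebook; the workspace URL, token, job_id, and the config_path parameter name are placeholders, and the notebook would read the parameter with dbutils.widgets.get:
import requests

host = "https://<your-workspace>.azuredatabricks.net"
resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"job_id": 123, "notebook_params": {"config_path": "/dbfs/FileStore/configs/app1/app1.properties"}},
)
resp.raise_for_status()
print(resp.json()["run_id"])

# inside the notebook, pick the parameter up as a widget:
# config_path = dbutils.widgets.get("config_path")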

Sharing Python Notebooks in Watson Studio

I have a few notebooks in my Watson studio project. They use a common set of function definitions. I'm constantly refining these common function definitions so I want them to be in a notebook. How can I share this notebook with the other notebooks?
You cannot include notebooks from other notebooks in Watson Studio on Cloud.
What you can do is let a notebook write a file with your shared functions to the project storage. Other notebooks can fetch that file from project storage, save it to the transient disk storage, and import it from there. The project-lib utility will help you with accessing the project storage.
You may use the %save magic to write functions from different cells to a .py file on disk. Then the final cell of the notebook can load that file and write it to the project storage.
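A rough sketch of that flow with project-lib; the shared_funcs.py asset name is made up, and the project id and access token placeholders would normally come from Watson Studio's "Insert project token" cell:
from project_lib import Project
project = Project(None, "<project-id>", "<project-access-token>")

# in the "library" notebook: publish the generated .py file as a project data asset
with open("shared_funcs.py", "rb") as f:
    project.save_data("shared_funcs.py", f.read(), overwrite=True)

# in a consuming notebook: fetch the asset, write it to transient disk storage, then import it
with open("shared_funcs.py", "wb") as f:
    f.write(project.get_file("shared_funcs.py").read())
import shared_funcs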

How can we save or upload .py file on dbfs/filestore

We have a few .py files on my local machine that need to be stored/saved on the FileStore path on DBFS. How can I achieve this?
I tried the dbutils.fs module copy actions.
I tried the code below but it did not work; I know something is not right with my source path. Or is there a better way of doing this? Please advise.
'''
dbUtils.fs.cp ("c:\\file.py", "dbfs/filestore/file.py")
'''
It sounds like you want to copy a file from your local machine to the DBFS path on the Azure Databricks servers. However, because the Notebook interface of Azure Databricks is browser-based, code running on the cloud side cannot directly operate on files on your local machine.
So here are the solutions you can try.
As @Jon said in the comment, you can follow the official document Databricks CLI to install the Databricks CLI via the Python tool command pip install databricks-cli on your local machine, and then copy a file to DBFS.
Follow the official document Accessing Data to import data via Drop files into or browse to files in the Import & Explore Data box on the landing page, although the CLI is still the recommended approach.
Upload your files to Azure Blob Storage, then follow the official document Data sources / Azure Blob Storage to do the operations, including dbutils.fs.cp.
Hope it helps.
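A minimal sketch of the CLI route, run from the local machine rather than from the notebook; the file names and DBFS path are placeholders, and it assumes the CLI has already been set up with databricks configure --token:
import subprocess

# copy a local .py file into the DBFS FileStore using the Databricks CLI
subprocess.run(
    ["databricks", "fs", "cp", r"C:\file.py", "dbfs:/FileStore/file.py"],
    check=True,
)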

Not able to view files created in Azure Notebooks

I am trying to create a new file using Azure Notebooks (notebooks.azure.com). Executing the Jupyter notebook itself doesn't raise any errors, but the actual file is missing from the path.
After executing the script, the file listing does not contain the test.txt file I expected to see.
Does anyone have any input?
