Import `.csv` files from Google drive into Jupyter notebook - python-3.x

I am doing some work on Covid-19 and need to access .csv files on GitHub (specifically, the URL is https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series).
So I went to this page and downloaded the .csv files that interested me directly onto my hard drive: C:\Users\... .csv
Then I import these files as pandas DataFrames into a Jupyter notebook to work in Python, for example: dataD = pd.read_csv('C:/Users/path_of_my_file_on_my_computer...').
It all works very well.
To make it easier to collaborate with other people, I was told that I should put the .csv files not on my C drive but on Google Drive (https://drive.google.com/drive/my-drive), and also put there the .ipynb files that I created in Jupyter notebook, and then give access to the people concerned.
So I created a folder on my drive (say, Covid-19) to put these .csv files in, but I don't understand what kind of Python code I am supposed to write at the beginning of my Python file to replace the simple previous instruction dataD = pd.read_csv('C:/Users/path_of_my_file_on_my_computer...'), so that the program reads the data directly from my Google Drive and no longer from my C drive.
I have looked at various posts that seem to more or less address this issue, but I don't really understand what to do.
I hope my question is clear enough (I am attaching a picture of the situation in my Google Drive, in case it provides useful information... It's in French).

Given that your files are already hosted in the cloud and you are planning a collaborative scenario, I think the idea proposed by @Eric is actually smarter.
Approach 1:
Otherwise, if you can't rely on that data source, you will have to build an authorization flow for your script to access Google Drive resources. You can find complete documentation here on how to build your Python script and interact with the Google Drive API.
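A rough sketch of what that flow could look like in Python, assuming you use the official google-api-python-client with a service-account credentials file (the credentials file name, scopes and file id below are placeholders, not something from your setup):

```python
# Hedged sketch: assumes a service-account JSON key ("service_account.json")
# that has read access to the file, with google-api-python-client and
# google-auth installed.
from io import BytesIO

import pandas as pd
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service_account.json", scopes=SCOPES)
drive = build("drive", "v3", credentials=creds)

file_id = "YOUR_FILE_ID"  # placeholder: the id from the file's sharing link
request = drive.files().get_media(fileId=file_id)

# Stream the file into memory, then hand it to pandas.
buffer = BytesIO()
downloader = MediaIoBaseDownload(buffer, request)
done = False
while not done:
    _, done = downloader.next_chunk()

buffer.seek(0)
dataD = pd.read_csv(buffer)
```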
Approach 2:
Although the Google Drive API requires authorization to access file URLs, you can build a workaround. Google Drive generates export links that, if your file is publicly available, can be accessed without authorization. In this Stack Overflow answer you can find more details about it.
In your Python script you can then request that URL directly, without touching the file system or going through the Google Drive authorization flow.
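For example, assuming the file has been shared as "anyone with the link can view" (FILE_ID below is a placeholder for the id that appears in that sharing link), the only change in your notebook is the argument you pass to pd.read_csv:

```python
# Hedged sketch: only works if the Drive file is publicly accessible.
import pandas as pd

file_id = "FILE_ID"  # placeholder: copy it from the file's sharing link
url = f"https://drive.google.com/uc?export=download&id={file_id}"

dataD = pd.read_csv(url)  # pandas can read directly from a URL
print(dataD.head())
```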

Related

Output file in Azure-automation script

I'm adapting a PowerShell script I have at work for use in Azure Automation, which outputs 3 different CSV files. I'm trying to avoid having to create a DB and send the information there, since it would require changing the script too much, and it's quite complex.
Does anyone know if there's a way to just send the 3 files to some kind of folder in Azure? Or maybe another solution that wouldn't require messing too much with the script?
Sorry if it is a dumb question, I'm not very familiar with Azure yet.
Probably the easiest option is to continue writing the file as you are now, then after the file is written have your PowerShell code upload it to Blob storage using Set-AzureStorageBlobContent. See https://savilltech.com/2018/03/25/writing-to-files-with-azure-automation/ for an example.
You can read more about using PowerShell to upload to Blob storage, including all the steps you need to create the storage account and container, at https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-powershell.

How do I read/import images, which are stored on my laptop, in Google Colab?

Whenever I specify the path like "C:\Users\Admin\Desktop\tumor", I get a "file not found" error using cv2.imread(). Can anyone explain the correct way to read them?
You'll need to transfer files to the backend VM. Recipes are in the I/O example notebook:
https://colab.research.google.com/notebooks/io.ipynb
Or, you can use a local runtime as described here:
http://research.google.com/colaboratory/local-runtimes.html
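One of the recipes from that notebook, as a minimal sketch (assuming you run this in a Colab cell; the files you pick in the upload dialog end up on the Colab VM's disk):

```python
# Hedged sketch: upload files from your laptop to the Colab backend VM,
# then read them from the VM's local disk.
import cv2
from google.colab import files

uploaded = files.upload()  # opens a file picker in the browser
for name in uploaded:
    img = cv2.imread(name)  # the uploaded file now exists on the VM
    print(name, img.shape)
```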

Share functions across colaboratory files

I'm sharing a colaboratory file with my colleagues and we are having fun with it. But it's getting bigger and bigger, so we want to offload some of the functions to another colaboratory file. How can we load one colaboratory file into another?
There's no way to do this right now, unfortunately: you'll need to move the code into a .py file that you load (say, by cloning from GitHub).
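A minimal sketch of that workflow in a Colab cell (the repository and module names below are made up for illustration):

```python
# Hedged sketch: clone a repo that holds the shared helpers, put it on
# sys.path, then import it like a normal module.
!git clone https://github.com/your-user/shared-helpers.git

import sys
sys.path.append("/content/shared-helpers")

from helpers import my_shared_function  # helpers.py lives in the cloned repo
my_shared_function()
```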

Unable to access a csv file in Google Colaboratory

I am using Google Colaboratory to run my machine learning project. I want to import a .csv into pandas and use it for further processing, but I am getting an error stating that the file is not found. Do I need to provide any authorization to access that file, or is it mandatory to upload the file into Google Colab? The file already exists in the same Google Drive folder as the .ipynb notebook.
Code: pandas read_csv function to read file
Error: Unable to locate
Do I need to provide any authentication or something like that?
See the I/O sample notebook for examples showing how to work with local files and those stored on Drive.
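One approach shown there is to mount your Drive and read the file by path; a minimal sketch, assuming the notebook runs in Colab (the folder and file names below are placeholders):

```python
# Hedged sketch: mount Google Drive inside Colab, then read the csv with pandas.
import pandas as pd
from google.colab import drive

drive.mount("/content/drive")  # prompts once for authorization

path = "/content/drive/My Drive/your_folder/your_file.csv"  # placeholder path
df = pd.read_csv(path)
df.head()
```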

Zip Files In Browser Cache

Hey Guys
At the moment I have a NodeJS webapp in the making which scrapes a website for data. Specifically, this webapp scrapes images for the purpose of downloading them. For example, all the image permalinks are scraped from the reddit front page. They are then sent to the client to download individually. My issue is that, with the website I am scraping, there can be thousands of images.
This provides a horrible user experience if 1000+ images are downloaded to the download folder.
As a result I have two options.
A) Download to a temporary folder on the server. Zip. Send to the client for download. Delete from the server.
B) Download files to the browser cache. Zip. Download to the specified download directory.
My question to you is this; Is option B even possible?
I am relatively new to this entire process and I can't find anything that actually zips files in the browser cache. I can implement option A relatively easily; however, it requires a large amount of bandwidth, something I can get for around $5/mo on DigitalOcean. However, this entire project is a learning experience, and as a result I would love to be able to manage files in the browser cache instead.
I am using the following NPM Modules:
NodeJS
Express
Cheerio
Request
Further Update
I had come across an NPM package called JSZip: https://stuk.github.io/jszip/
However, I was unaware it could be implemented on the client side as well. This was purely an error on my part. This brings up an interesting issue of WebStorage: https://www.w3schools.com/html/html5_webstorage.asp
the maximum storage size for the session is 5MB
From here I will attempt to implement this answer, How do you cache an image in Javascript, in my current code, and will update this answer with the result for anyone else facing this issue.
