How to download zip file from a Hyperlink in python - python-3.x

There is a website at
https://www.fda.gov/drugs/drug-approvals-and-databases/drugsfda-data-files
with one downloadable file,
Drugs@FDA Download File (ZIP - 3.2MB), provided as a hyperlink in the page content.
I have tried the code below:
import urllib.request
import gzip
url = 'https://www.fda.gov/media/89850/download'
with urllib.request.urlopen(url) as response:
    with gzip.GzipFile(fileobj=response) as uncompressed:
        file_header = uncompressed.read()
But I am getting the error: Not a Zipped file

You can use the Python requests library to fetch the data from the URL and then write the contents to a file.
import requests
with open('my_zip_file.zip', 'wb') as my_zip_file:
    data = requests.get('https://www.fda.gov/media/89850/download')
    my_zip_file.write(data.content)
This creates the file in the current working directory. You can of course give the file any name you like.
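As for why the original attempt failed: ZIP and gzip are different formats, so gzip.GzipFile cannot read a .zip archive, while the standard-library zipfile module can. A minimal sketch, using a small archive built in memory as a stand-in for the downloaded bytes (the helper name list_zip_members is made up for illustration):

```python
import gzip
import io
import zipfile

def list_zip_members(data):
    """Return the member names of a ZIP archive held in memory as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()

# Build a tiny ZIP archive in memory to stand in for the downloaded bytes
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.txt", "hello")
zip_bytes = buffer.getvalue()

print(list_zip_members(zip_bytes))

# The gzip module rejects the same bytes because they are not gzip data
try:
    gzip.GzipFile(fileobj=io.BytesIO(zip_bytes)).read()
except OSError as exc:
    print("gzip rejects it:", exc)
```

With the requests answer above, list_zip_members(data.content) would list the files in the archive without writing anything to disk.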

Related

Download file from website directly into Linux directory - Python

If I manually click the button, the browser starts downloading a CSV file (2GB) onto my computer. But I want to automate this.
This is the link to download:
https://data.cityofnewyork.us/api/views/bnx9-e6tj/rows.csv?accessType=DOWNLOAD
Issue: when I use either library (requests or pandas), it just hangs. I have no idea whether the file is being downloaded or not.
My goal is to:
Know whether the file is being downloaded, and
Have the CSV downloaded to a specified directory, i.e.
~/mydirectory
Can someone provide the code to do this?
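Try this...
import os
import requests
URL = "https://data.cityofnewyork.us/api/views/bnx9-e6tj/rows.csv?accessType=DOWNLOAD"
response = requests.get(URL)
# open() does not expand "~", so resolve the target directory explicitly
target = os.path.expanduser("~/mydirectory/downloaded_file.csv")
with open(target, "wb") as f:
    f.write(response.content)
print('Download Complete')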
Or you could do it this way and have a progress bar ...
import wget
wget.download('https://data.cityofnewyork.us/api/views/bnx9-e6tj/rows.csv?accessType=DOWNLOAD')
The output will look like this:
11% [........ ] 73728 / 633847
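If the download seems to hang, it may simply be that requests.get reads the whole 2GB body into memory before returning. A rough sketch of a chunked download using only the standard library, which prints progress as it goes. The names save_stream and download are hypothetical helpers, and the demo at the bottom uses an in-memory stand-in so it runs without a network:

```python
import io
import os
import tempfile
import urllib.request

def save_stream(source, dest_path, total=None, chunk_size=1024 * 1024):
    """Copy a file-like object to dest_path in chunks, printing progress."""
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
            if total:
                print(f"{written} / {total} bytes ({100 * written // total}%)")
            else:
                print(f"{written} bytes")
    return written

def download(url, dest_path):
    """Stream url to dest_path; the server may omit Content-Length."""
    with urllib.request.urlopen(url) as response:
        length = response.headers.get("Content-Length")
        return save_stream(response, dest_path, int(length) if length else None)

# Offline demonstration with an in-memory stand-in for the HTTP response
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
save_stream(io.BytesIO(b"a,b\n1,2\n" * 1000), demo_path, total=8000, chunk_size=4096)
```

For the real file, something like download(URL, os.path.expanduser("~/mydirectory/rows.csv")) should work, assuming the directory already exists; the file name rows.csv is just an example.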

accessing in the sub-folder of a folder in python

I am trying to access the images saved under my path, E:/project/plane,
but I am unable to reach them.
I tried using glob.glob, but all I get is the subfolder itself, not the images inside it.
I also tried taking the name of the subfolder as input and combining it with the path, but I still can't get into the folder.
Can anyone help me achieve this?
Here is my Python code:
import os
import glob
import cv2
path = r"E:\project\%s" % input("filename")
print(path)
for folder in glob.glob(path + '*', recursive=True):
    print(folder)
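One way to reach the files inside the subfolders, as a sketch: with recursive=True, glob only descends into subdirectories when the pattern contains "**", which the pattern above does not. The helper find_images and the extension list are assumptions for illustration, and the demo builds a throwaway directory tree instead of using E:/project/plane:

```python
import glob
import os
import tempfile

def find_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted paths of image files anywhere below root."""
    # "**" matches any number of nested directories when recursive=True
    pattern = os.path.join(glob.escape(root), "**", "*")
    matches = glob.glob(pattern, recursive=True)
    return sorted(p for p in matches if p.lower().endswith(extensions))

# Demonstration on a throwaway directory tree
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "plane", "set1"))
for name in ("a.jpg", "b.png", "notes.txt"):
    open(os.path.join(root, "plane", "set1", name), "w").close()

images = find_images(os.path.join(root, "plane"))
print([os.path.basename(p) for p in images])
```

Each returned path could then be passed to cv2.imread, which the question already imports.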

Python ZipFile module problem when file is encrypted

I have the following short program
from zipfile import ZipFile
procFile1 ="C:\\Temp\\XLFile-Demo.zip"
procFile2 ="C:\\Temp2\\XLFile-Demo-PW123.zip"
# Unencrypted file
print ("Unencrypted file")
myzip1 = ZipFile(procFile1)
print (myzip1.infolist())
myzip1.extractall("C:\\Temp")
# Encrypted File
print ("Encrypted file")
myzip2 = ZipFile(procFile2)
print (myzip2.infolist())
myzip2.setpassword(bytes('123', 'utf-8'))
myzip2.extractall("C:\\Temp2")
At this Amazon Drive link are the two files. They are identical except that one zip is protected with the password 123.
Executing the above code successfully extracts the unencrypted one but raises the error NotImplementedError: That compression method is not supported for the other.
Unencrypted file
[<ZipInfo filename='XLFile-Demo.xlsx' compress_type=deflate external_attr=0x20 file_size=31964 compress_size=29252>]
Encrypted file
[<ZipInfo filename='XLFile-Demo.xlsx' compress_type=99 external_attr=0x20 file_size=31964 compress_size=29280>]
Am I doing anything wrong from my end?
The error came up when the file was zipped using WinRar's ZIP option. I installed 7Zip and it is working.
The .infolist for the 7Zip file is the following:
[<ZipInfo filename='XLFile-Demo.xlsx' compress_type=deflate external_attr=0x20 file_size=31964 compress_size=29340>]
Incidentally WinRar can handle this file and 7Zip can correctly process the encrypted Zip archive created by WinRar.
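For context, compress_type=99 is the marker the ZIP format uses for AES-encrypted entries, and the standard zipfile module only decrypts the older ZipCrypto scheme, which would explain the NotImplementedError. A small sketch that checks for this before extracting. The name aes_members is a hypothetical helper, and the ZipInfo objects below are hand-built stand-ins mirroring the two listings above rather than the real archives:

```python
import zipfile

AES_MARKER = 99  # compression method value reported for AES-encrypted entries

def aes_members(infos):
    """Return filenames of entries that report the AES marker."""
    return [info.filename for info in infos if info.compress_type == AES_MARKER]

# Stand-ins for ZipFile.infolist() results, mirroring the two listings above
plain = zipfile.ZipInfo("XLFile-Demo.xlsx")
plain.compress_type = zipfile.ZIP_DEFLATED
aes = zipfile.ZipInfo("XLFile-Demo.xlsx")
aes.compress_type = AES_MARKER

print(aes_members([plain]))
print(aes_members([aes]))
```

In the program above, a non-empty result from aes_members(myzip2.infolist()) would signal that extractall is going to fail for that archive.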

Download xml file from the server with Python3

I am trying to download an XML file from a public data bank:
http://api.worldbank.org/v2/en/indicator/SP.POP.TOTL?downloadformat=xml
I tried to do it with requests:
import requests
url = 'http://api.worldbank.org/v2/en/indicator/SP.POP.TOTL?downloadformat=xml'
response = requests.get(url)
response.encoding = 'utf-8'  # or response.apparent_encoding
print(response.content)
and wget
import wget
wget.download(url, './my.xml')
But both approaches produce garbage instead of a correct file (it looks like a broken encoding, but I cannot fix it).
If I download the file via a web browser, I get a correct UTF-8 XML file.
What am I doing wrong in the code?
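One thing worth checking, though the question does not confirm it, is whether the server is sending a compressed archive rather than plain XML, which would look like garbage when printed. A small sketch that inspects the first bytes of the payload. The name extract_xml is a hypothetical helper; it assumes the XML is UTF-8 and that a ZIP holds the XML as its first member, and the demo uses in-memory data rather than the real download:

```python
import gzip
import io
import zipfile

def extract_xml(payload):
    """Return XML text from raw bytes that may be ZIP- or gzip-wrapped."""
    if payload[:4] == b"PK\x03\x04":  # ZIP local file header signature
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            first = archive.namelist()[0]
            return archive.read(first).decode("utf-8")
    if payload[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(payload).decode("utf-8")
    return payload.decode("utf-8")

# In-memory stand-ins for the three possible kinds of response body
xml_bytes = b"<root/>"
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("data.xml", xml_bytes)

print(extract_xml(zip_buffer.getvalue()))
print(extract_xml(gzip.compress(xml_bytes)))
print(extract_xml(xml_bytes))
```

With the requests snippet above, this would be called as extract_xml(response.content).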

Download files with Python - "unknown url type"

I need to download a list of RTF files locally with Python3.
I tried with urllib:
import urllib.request
url = "www.calhr.ca.gov/Documents/wfp-recruitment-flyer-bachelor-degree-jobs.rtf"
urllib.request.urlopen(url)
but I get a ValueError:
ValueError: unknown url type: 'www.calhr.ca.gov/Documents/wfp-recruitment-flyer-bachelor-degree-jobs.rtf'
How to deal with this kind of file format?
Try adding http:// in front of the URL:
import urllib.request
url = "http://www.calhr.ca.gov/Documents/wfp-recruitment-flyer-bachelor-degree-jobs.rtf"
urllib.request.urlopen(url)
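Since the question mentions a whole list of files, here is a minimal sketch of applying the same fix to each entry and saving it locally. The helpers ensure_scheme, local_name and download_all are made-up names, and download_all is defined but not called so the example runs offline:

```python
import posixpath
import urllib.parse
import urllib.request

def ensure_scheme(url, default="http"):
    """Prefix url with a scheme when it has none, since urlopen requires one."""
    return url if "://" in url else f"{default}://{url}"

def local_name(url):
    """Take the file name from the last segment of the URL path."""
    return posixpath.basename(urllib.parse.urlparse(url).path)

def download_all(urls):
    """Fetch each URL into the current directory (not called in this sketch)."""
    for raw in urls:
        url = ensure_scheme(raw)
        urllib.request.urlretrieve(url, local_name(url))

url = ensure_scheme("www.calhr.ca.gov/Documents/wfp-recruitment-flyer-bachelor-degree-jobs.rtf")
print(url, local_name(url))
```

The RTF format itself needs no special handling here; the bytes are written to disk exactly as received.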
