Reading blob from database without saving to disk in Python - python-3.x

I am trying to read a set of records from a db table which has a blob field. I am able to read it, but not without saving it to disk first.
cursor.execute("select image ,id,
department from dept_master")
depdata = cursor.fetchall()
for row in depdata:
file_like = io.BytesIO(row[0])
file = PIL.Image.open(file_like)
target = os.path.join("/path-to-save/", 'folder-save')
destination = "/".join([target, file.filename])
file.save(destination)
How can I read it and display it without first saving to disk?

I am planning to use render_template to display the data
To serve the images from a blob field, you need to create a separate route which serves the actual image data specifically, then in a template include links which load the files from this route.
There's no need to use PIL here, as the blob field gives you bytes data. You run that through BytesIO to get a _io.BytesIO object, which can be passed straight to Flask's send_file function.
from io import BytesIO
from flask import send_file
@app.route('/image/<int:ident>')
def image_route(ident):
    # This can only serve one image at a time, so match by id
    cursor.execute("select image from dept_master WHERE id = ?", (ident,))
    result = cursor.fetchone()
    image_bytes = result[0]
    bytes_io = BytesIO(image_bytes)
    return send_file(bytes_io, mimetype='image/jpeg')
At this stage you should be able to hit /image/1 and see the image in your browser for the row with id 1.
Then, somewhere in a template, just include the link for this with:
<img src='{{ url_for("image_route", ident=image_ident) }}' />
This assumes that image_ident is available in the template. You might need to replace this variable name with something which exists (for example, a loop variable which holds the image id to pull), as sketched below.
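For instance, a minimal Jinja sketch, assuming the view passes a list of ids named image_idents (a made-up name for illustration):

{% for image_ident in image_idents %}
    <img src='{{ url_for("image_route", ident=image_ident) }}' />
{% endfor %}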
Let me know if this needs further explaining.

Related

How do I get Django to store an already uploaded cover image to a user without getting it deleted

How do I get Django to keep an already uploaded cover image for a user from being deleted when a new image is uploaded, rather than simply replacing it? I'm having a challenge trying to figure out how to maintain old cover images while adding new ones for a user. What happens is that when I upload a new cover image, it simply deletes the previous one from the database. Here is my cover image model:
class AccountCover(models.Model):
    account = models.ForeignKey(Account, on_delete=models.CASCADE)
    cover_image = models.ImageField(max_length=255, upload_to=get_cover_cover_image_filepath, default=get_default_cover_image,)
My views.py:
cover = AccountCover.objects.filter(account=account.id).first()
if request.user:
    forms = CoverImageForm(request.POST, request.FILES, instance=cover,
                           initial={'cover_image': cover.cover_image})
    if request.method == 'POST':
        f = CoverImageForm(request.POST, request.FILES, instance=cover)
        if f.is_valid():
            data = forms.save()
            data.account = cover.account
            data.save()
            return redirect('account:edit', account.id)
    else:
        f = CoverImageForm()
context['f'] = f
You can keep old instances of AccountCover by forcing the form to save a new one each time. Simply remove the instance argument from the form, and remember to save the form with commit=False so you can set the account foreign key before saving.
forms = CoverImageForm(request.POST, request.FILES, initial={'cover_image': cover.cover_image})
# ...
data = forms.save(commit=False)
data.account = cover.account
data.save()
I am assuming you need to preserve the old database record, not just the image file; an image file with no relation back to the user's record is of little use.
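For completeness, a minimal sketch of how the preserved covers could then be retrieved, assuming the default auto-incrementing primary key:

# Hypothetical: every cover ever uploaded for this account, newest first
old_covers = AccountCover.objects.filter(account=account).order_by('-pk')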

How to show a progess of GridFS downloading a file from the MongoDB?

I'm wondering if it's possible to somehow show the progress of reading a file with gridfs.GridFS() and pymongo. I could not find a callback I could pass to the .read() function.
from pymongo import MongoClient
from gridfs import GridFSBucket

my_db = MongoClient().test
fs = GridFSBucket(my_db)
# get _id of file to read.
file_id = fs.upload_from_stream("huge_test_file", b"i carry lots of data!")
grid_out = fs.open_download_stream(file_id)
contents = grid_out.read()
Is there some way to actually retrieve the bytes that were already downloaded? Considering the file may be 5GB big, I want to give some download status feedback.
I know it's been 11 months, but I think a way to do this is:
Make a new endpoint that returns the last chunk's 'n' value, then compare this with the total number of chunks (the file's 'length' divided by its 'chunkSize').
import math
from bson import ObjectId

@app.route('/uploadProgress/<string:fileID>')
def uploadProgress(fileID):
    # GridFS _id values are ObjectIds by default, so convert the string from the URL
    doc = db.fs.files.find_one({"_id": ObjectId(fileID)})
    # total number of chunks the file is split into
    total_chunks = math.ceil(doc['length'] / doc['chunkSize'])
    # 'n' of the most recently written chunk (zero-based)
    n = db.fs.chunks.find({"files_id": doc['_id']}).sort([('n', -1)]).limit(1)[0]['n']
    progress = str((n + 1) / total_chunks)
    return progress
Now in JavaScript you can make a request to check the progress and update a progress bar.
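For a quick sanity check from Python before wiring up the frontend, you could poll the endpoint in a loop. This sketch assumes the Flask app runs on localhost:5000 and that file_id holds the hex string of the file's _id:

import time
import requests

while True:
    # the endpoint returns the progress as a plain-text float, e.g. "0.75"
    resp = requests.get('http://localhost:5000/uploadProgress/{}'.format(file_id))
    progress = float(resp.text)
    print('progress: {:.0%}'.format(progress))
    if progress >= 1.0:
        break
    time.sleep(1)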

How do I open images stored in GCP in Google datalab?

I have been trying to open an image that I stored in a GCP bucket in my Datalab notebook. When I use Image.open() it fails with "No such file or directory: 'images/00001.jpeg'".
My code is:
nama_bucket = storage.Bucket("sample_bucket")
for obj in nama_bucket.objects():
    Image.open(obj.key)
I just need to open the images stored in the bucket and view them. Thanks for the help!
I was able to reproduce the issue and get the same error as you (No such file or directory).
I will describe the workaround I used to solve it. However, there are a few issues that I can see in the code snippet provided:
Class IPython.display.Image has no method 'open'.
You will need to wrap the Image constructor in a display() method.
With Storage APIs for Google Cloud Datalab, what resolved the issue for me was using the url parameter instead of the filename.
Here is the solution that worked for me:
import google.datalab.storage as storage
from IPython.display import Image

bucket_name = '<my-bucket-name>'
sample_bucket = storage.Bucket(bucket_name)
for obj in sample_bucket.objects():
    display(Image(url='https://storage.googleapis.com/{}/{}'.format(bucket_name, obj.key)))
Let me know if it helps!
EDIT 1:
As you mentioned that you're using the PIL and would like your images to be handled by it, here's the way to achieve that (I have tested it and it worked well for me):
import google.datalab.storage as storage
from PIL import Image
import requests
from io import BytesIO

bucket_name = '<my-bucket-name>'
sample_bucket = storage.Bucket(bucket_name)
for obj in sample_bucket.objects():
    url = 'https://storage.googleapis.com/{}/{}'.format(bucket_name, obj.key)
    response = requests.get(url)
    img = Image.open(BytesIO(response.content))
    print("Filename: {}\nFormat: {}\nSize: {}\nMode: {}".format(obj.key, img.format, img.size, img.mode))
    display(img)
Notice that this way you will not need to use IPython.display.Image at all.
EDIT 2:
Indeed, the error "cannot identify image file <_io.BytesIO object at 0x7f8f33bdbdb0>" is appearing because you have a directory in your bucket. In order to solve this issue, it's important to understand how Google Cloud Storage sub-directories work.
Here's how I organized the files in my bucket to replicate your situation:
my-bucket/
    img/
        test-file-1.png
        test-file-2.png
        test-file-3.jpeg
    test-file-4.png
Even though gsutil achieves the hierarchical file-tree illusion by applying a variety of rules to make naming work the way users would expect, in fact test-files 1-3 just happen to have '/' in their names; there is no actual 'img' directory.
You can still list all images from your bucket. With the structure mentioned above, it can be achieved, for example, by checking the file's extension:
import google.datalab.storage as storage
from PIL import Image
import requests
from io import BytesIO

bucket_name = '<my-bucket-name>'
sample_bucket = storage.Bucket(bucket_name)
for obj in sample_bucket.objects():
    # Check that the object is an image
    if obj.key[-3:].lower() in ('jpg', 'png') or obj.key[-4:].lower() in ('jpeg',):
        url = 'https://storage.googleapis.com/{}/{}'.format(bucket_name, obj.key)
        response = requests.get(url)
        img = Image.open(BytesIO(response.content))
        print("Filename: {}\nFormat: {}\nSize: {}\nMode: {}".format(obj.key, img.format, img.size, img.mode))
        display(img)
If you need to get only the images "stored in a particular sub-directory" of your bucket, you will also need to check the files by name:
import google.datalab.storage as storage
from PIL import Image
import requests
from io import BytesIO

bucket_name = '<my-bucket-name>'
folder = '<name-of-the-directory>'
sample_bucket = storage.Bucket(bucket_name)
for obj in sample_bucket.objects():
    # Check that the object is an image AND that it has the required sub-directory in its name
    if (obj.key[-3:].lower() in ('jpg', 'png') or obj.key[-4:].lower() in ('jpeg',)) and folder in obj.key:
        url = 'https://storage.googleapis.com/{}/{}'.format(bucket_name, obj.key)
        response = requests.get(url)
        img = Image.open(BytesIO(response.content))
        print("Filename: {}\nFormat: {}\nSize: {}\nMode: {}".format(obj.key, img.format, img.size, img.mode))
        display(img)
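Note that the snippets above assume the objects are publicly readable at storage.googleapis.com. If the bucket is private, the Storage APIs for Google Cloud Datalab also expose a read_stream() method on objects (worth double-checking against your pydatalab version), which would avoid the HTTP round-trip. A sketch under that assumption:

import google.datalab.storage as storage
from PIL import Image
from io import BytesIO

bucket_name = '<my-bucket-name>'
sample_bucket = storage.Bucket(bucket_name)
for obj in sample_bucket.objects():
    if obj.key.lower().endswith(('.jpg', '.jpeg', '.png')):
        # read_stream() returns the object's raw bytes via the authenticated API
        img = Image.open(BytesIO(obj.read_stream()))
        display(img)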

Problem exporting Web Url results into CSV using beautifulsoup3

Problem: I tried to export the results (Name, Address, Phone) into a CSV, but the CSV code is not returning the expected results.
#Import the installed modules
import requests
from bs4 import BeautifulSoup
import json
import re
import csv
#To get the data from the web page we will use requests get() method
url = "https://www.lookup.pk/dynamic/search.aspx?searchtype=kl&k=gym&l=lahore"
page = requests.get(url)
# To check the http response status code
print(page.status_code)
#Now I have collected the data from the web page, let's see what we got
print(page.text)
#The above data can be viewed in a pretty format by using beautifulsoup's prettify() method. For this we will create a bs4 object and use the prettify method
soup = BeautifulSoup(page.text, 'lxml')
print(soup.prettify())
#Find all DIVs that contain Companies information
product_name_list = soup.findAll("div",{"class":"CompanyInfo"})
#Find all Companies Name under h2tag
company_name_list_heading = soup.findAll("h2")
#Find all Address on page Name under a tag
company_name_list_items = soup.findAll("a",{"class":"address"})
#Find all Phone numbers on page Name under ul
company_name_list_numbers = soup.findAll("ul",{"class":"submenu"})
# Create for loop to print out all company Data
for company_address in company_name_list_items:
    print(company_address.prettify())
# Create for loop to print out all company Names
for company_name in company_name_list_heading:
    print(company_name.prettify())
# Create for loop to print out all company Numbers
for company_numbers in company_name_list_numbers:
    print(company_numbers.prettify())
Below is the code to export the results (name, address & phone number) into CSV
outfile = open('gymlookup.csv','w', newline='')
writer = csv.writer(outfile)
writer.writerow(["name", "Address", "Phone"])
product_name_list = soup.findAll("div",{"class":"CompanyInfo"})
company_name_list_heading = soup.findAll("h2")
company_name_list_items = soup.findAll("a",{"class":"address"})
company_name_list_numbers = soup.findAll("ul",{"class":"submenu"})
Here is the for loop to loop over data.
for company_name in company_name_list_heading:
    names = company_name.contents[0]
for company_numbers in company_name_list_numbers:
    names = company_numbers.contents[1]
for company_address in company_name_list_items:
    address = company_address.contents[1]
writer.writerow([name, Address, Phone])
outfile.close()
You need to work on understanding how for loops work, and also the difference between strings, variables and other datatypes. You also need to work on using what you have seen from other Stack Overflow questions and learn to apply that. This is essentially the same as your other 2 questions you already posted, but just a different site you're scraping from (I didn't flag it as a duplicate, as you're new to Stack Overflow and web scraping, and I remember what it was like to learn). I'll still answer your question, but eventually you need to be able to find the answers on your own and learn how to adapt and apply them; coding isn't paint-by-numbers. I do see you are adapting some of it, and good job finding the "div",{"class":"CompanyInfo"} tag to get the company info.
The data you are pulling (name, address, phone) needs to be gathered within a loop over each "div",{"class":"CompanyInfo"} element/tag. You could theoretically keep it the way you have it now by collecting everything into lists and then writing to the CSV file from those lists, but there's a risk of missing data, leaving your info misaligned with the corresponding company.
Here's what the full code looks like. Notice that the variables are stored within the loop and then written; it then goes on to the next block of CompanyInfo and continues.
#Import the installed modules
import requests
from bs4 import BeautifulSoup
import csv
#To get the data from the web page we will use requests get() method
url = "https://www.lookup.pk/dynamic/search.aspx?searchtype=kl&k=gym&l=lahore"
page = requests.get(url)
# To check the http response status code
print(page.status_code)
#Now I have collected the data from the web page, let's see what we got
print(page.text)
#The above data can be viewed in a pretty format by using beautifulsoup's prettify() method. For this we will create a bs4 object and use the prettify method
soup = BeautifulSoup(page.text, 'html.parser')
print(soup.prettify())
outfile = open('gymlookup.csv','w', newline='')
writer = csv.writer(outfile)
writer.writerow(["Name", "Address", "Phone"])
#Find all DIVs that contain Companies information
product_name_list = soup.findAll("div",{"class":"CompanyInfo"})
# Now loop through those elements
for element in product_name_list:
    # Takes 1 block of the "div",{"class":"CompanyInfo"} tag and finds/stores name, address, phone
    name = element.find('h2').text
    address = element.find('address').text.strip()
    phone = element.find("ul",{"class":"submenu"}).text.strip()
    # writes the name, address, phone to csv
    writer.writerow([name, address, phone])
    # now will go to the next "div",{"class":"CompanyInfo"} tag and repeat
outfile.close()
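As a side note, the CSV writing can also be done with a with statement so the file is closed automatically even if parsing raises; a minimal sketch reusing soup and csv from the snippet above:

with open('gymlookup.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["Name", "Address", "Phone"])
    for element in soup.findAll("div", {"class": "CompanyInfo"}):
        # one row per CompanyInfo block keeps the fields aligned
        writer.writerow([
            element.find('h2').text,
            element.find('address').text.strip(),
            element.find("ul", {"class": "submenu"}).text.strip(),
        ])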

How to upload an image with flask and store in couchdb?

A previous question asks how to retrieve at attachment from couchdb and display it in a flask application.
This question asks how to perform the opposite, i.e. how can an image be uploaded using flask and saved as a couchdb attachment.
Take a look at this example from the Flask-WTF docs:
from werkzeug.utils import secure_filename
from flask_wtf.file import FileField
class PhotoForm(FlaskForm):
    photo = FileField('Your photo')

@app.route('/upload/', methods=('GET', 'POST'))
def upload():
    form = PhotoForm()
    if form.validate_on_submit():
        filename = secure_filename(form.photo.data.filename)
        form.photo.data.save('uploads/' + filename)
    else:
        filename = None
    return render_template('upload.html', form=form, filename=filename)
Take a look at the FileField API docs; form.photo.data is a werkzeug FileStorage object whose stream attribute gives you access to the uploaded data. Instead of using the save method as in the example, you can take the bytes from the stream and save them as an attachment in couchdb, e.g. using put_attachment (base64 encoding is only needed if you embed the attachment inline in the document JSON). Alternatively, the FileStorage API docs suggest you can use read() to retrieve the data.
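A minimal sketch of that idea, building on the PhotoForm example above and using the python couchdb package; the server URL and the 'photos' database name are placeholders:

import couchdb

couch = couchdb.Server('http://localhost:5984/')
db = couch['photos']  # hypothetical database name

@app.route('/upload/', methods=('GET', 'POST'))
def upload():
    form = PhotoForm()
    filename = None
    if form.validate_on_submit():
        filename = secure_filename(form.photo.data.filename)
        # create a document to hang the attachment on, then store the raw bytes;
        # put_attachment accepts the file-like stream directly
        doc = {'filename': filename}
        db.save(doc)
        db.put_attachment(doc, form.photo.data.stream, filename=filename,
                          content_type=form.photo.data.mimetype)
    return render_template('upload.html', form=form, filename=filename)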
