I am trying to download embedded images in Outlook using Selenium with Python. I ran the following code to download them, but it failed: it downloads attachments but not embedded images. Does anyone have an idea how to download these images to a folder in Python? Thanks so much.
The code I used (it only works for attached images):
import urllib.request

imgs = driver.find_elements_by_xpath(xpath)
for i, img in enumerate(imgs):
    # Read each image's URL and save it locally
    src = img.get_attribute('src')
    urllib.request.urlretrieve(src, f"img {i}.png")
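One possible reason the loop fails for embedded images is that their src is not a plain HTTP URL: it may be a data: URI, or a URL that only works inside the browser's logged-in session. The sketch below covers only the data: URI case, using the standard library. The helper names decode_data_uri and save_data_uri are hypothetical, and whether Outlook serves embedded images this way is an assumption worth checking by printing src first.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(src):
    """Return the raw bytes of a data: URI, or None if src is not one."""
    if not src or not src.startswith("data:"):
        return None
    # A data URI looks like "data:<mime>[;base64],<payload>"
    header, _, payload = src.partition(",")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    # Payloads that are not base64 are percent-encoded
    return unquote_to_bytes(payload)


def save_data_uri(src, path):
    """Write a data: URI to path; return True if a file was written."""
    data = decode_data_uri(src)
    if data is None:
        return False
    with open(path, "wb") as fh:
        fh.write(data)
    return True
```

Inside the loop, save_data_uri(src, f"img {i}.png") could be tried first. If it returns False and urlretrieve also fails, img.screenshot(f"img {i}.png") is a fallback that saves the rendered pixels of the element rather than the original file.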
Related
Hello everyone, I would like to know how to get an image through XPath. I have the following code, which downloads an image using the direct link to the .jpg file:
import requests
url = 'https://www.elesquiu.com/u/portadas/tapas/7349.jpg'
myfile = requests.get(url)
# Use a context manager so the file is closed after writing
with open('ESQUIU.jpg', 'wb') as f:
    f.write(myfile.content)
The problem is that the file name (7349.jpg) changes randomly, so I need to locate the image through XPath instead of a fixed link. Can someone help me with this? Thanks.
Web page: https://www.elesquiu.com
I am trying to render an image into a docx template that has a Jinja hook in it.
I am using the InlineImage method from docxtemplate. With it I was able to render the image: it opens in LibreOffice Writer, but not in MS Word, which is where it actually needs to work.
In MS Word the image is rendered but not shown; only a blank area appears.
I also tried the subdoc method, but it gives the same result.
Please help me understand what I am doing wrong.
I am using python-docx 0.8.7 and docxtemplate 0.5.17.
I am bound to these versions because docxtemplate requires this version of python-docx.
I am posting this as an answer so that it may help others trying to achieve the same thing.
In my code I was using NamedTemporaryFile(delete=None) to create the image before passing it to InlineImage as InlineImage(template, tmpfile.name), and that was causing the issue. I reworked my logic to add the image directly, after resizing it based on its aspect ratio, and passed the image's path, rather than the temporary file's name, to InlineImage. It worked like a charm.
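The resize step mentioned above can be sketched as a small helper that computes a size fitting inside a bounding box while keeping the aspect ratio. This is only an illustration: the function name fit_within and the box dimensions are hypothetical and not taken from the original code.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the box, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Use the tighter of the two scale factors; cap at 1.0 so small
    # images are never enlarged.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 2000x1000 image limited to a 500x500 box keeps its 2:1 ratio
print(fit_within(2000, 1000, 500, 500))  # (500, 250)
```

The resulting size could then be given to an image library's resize call before the saved file's path is handed to InlineImage.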
How can I convert images stored on Cloudinary to PDF from their URLs with Python 3 and Django?
I tried pdfkit, but it does not fetch the image from the URL and creates a blank PDF.
You can just set the format to pdf, like this:
https://res.cloudinary.com/demo/f_pdf/bike.jpg
Python:
CloudinaryImage("bike.jpg").image(fetch_format="pdf")
Thank you all for the helpful answers.
I used urllib.request.urlretrieve(url, filename) to download the images from the URLs to the local system, and then converted them to a PDF file with PyFPDF.
Link: https://pyfpdf.readthedocs.io/en/latest/Tutorial/index.html
For ordinary HTML pages, when pdfkit converts HTML to PDF, the images in the page are included in the PDF.
For WeChat official account articles, however, I found that the images are lost. The following code is an example.
How can I save WeChat official account articles, with their images, to PDF?
import pdfkit
url='https://mp.weixin.qq.com/s?__biz=MzA3NDMyOTcxMQ==&mid=2651249314&idx=1&sn=5338576a80a4145b9808ff06cc980c14'
path_wkthmltopdf = 'C:/Anaconda3/Lib/site-packages/wkhtmltopdf/bin/wkhtmltopdf.exe'
pdfkit.from_url(url=url, output_path='c:/test.pdf', configuration=pdfkit.configuration(wkhtmltopdf=path_wkthmltopdf))
I think one solution is to scroll down the page so that all images load, and then convert it to PDF. How can I scroll down to load all images with pdfkit?
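As far as I know, pdfkit has no scrolling step, since wkhtmltopdf renders the page in one pass. An alternative sketch, assuming the article's lazy-loaded images keep their real address in a data-src attribute (worth confirming in the page source), is to fetch the HTML, copy data-src into src, and convert the modified string. The helpers promote_lazy_src and article_to_pdf are hypothetical names, and requests is an extra third-party dependency.

```python
import re

# Assumption: the real image URL is stored in data-src="..." (or '...')
_DATA_SRC = re.compile(r'\bdata-src=(["\'])(.*?)\1')
_PLAIN_SRC = re.compile(r'\ssrc=(["\']).*?\1')


def promote_lazy_src(html):
    """Rewrite <img data-src="X"> tags so that they carry src="X"."""
    def fix_tag(match):
        tag = match.group(0)
        if not _DATA_SRC.search(tag):
            return tag  # nothing lazy about this image
        # Drop any placeholder src, then rename data-src to src
        tag = _PLAIN_SRC.sub("", tag)
        return _DATA_SRC.sub(
            lambda m: f"src={m.group(1)}{m.group(2)}{m.group(1)}", tag
        )
    return re.sub(r"<img\b[^>]*>", fix_tag, html, flags=re.IGNORECASE)


def article_to_pdf(url, output_path, configuration=None):
    """Fetch the page, un-lazy its images, and convert the HTML string."""
    import pdfkit    # third-party, as in the question
    import requests  # third-party, used here only to fetch the HTML
    html = requests.get(url, timeout=30).text
    pdfkit.from_string(promote_lazy_src(html), output_path,
                       configuration=configuration)
```

Because the HTML is converted from a string rather than from the URL, any relative links in the page would no longer resolve, so this only helps when the image addresses are absolute.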
The following should work without modifying the windows environment variables:
import pdfkit
path_wkthmltopdf = r'C:\Python27\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=path_wkthmltopdf)
url = 'https://mp.weixin.qq.com/s?timestamp=1515570589&src=3&ver=1&signature=xsZdozV1JPS2K8SuXJ8TKeqfuczP2z78*LCVu32ljt1NSa8oF41X88W0JYguTbLUwHHyt0ksUy8l9ljM5*uGOSH-GBjlVipz4a1aIeg9xNQgwlxuCV*9dURcg-U8UvR78C2RV6B5CIeA0n1jIaiFiqrQTIuel5IW-HYAcQsOT0g='
pdfkit.from_url(url, "out.pdf", configuration=config)
This assumes the path is correct (in my case it is r'C:\Program Files (x86)\wkhtmltopdf\bin\wkhtmltopdf.exe').
Result:
Loading page (1/2)
Printing pages (2/2)
Done
PDF Link
I am trying to download all the images from a webpage. The images are included as follows:
How should I parametrize my wget command so that it only fetches the images whose link starts with "https://alwaysSamePart.com/"? What follows that prefix varies every time, so I can't just specify a hardcoded link.
Ideally I would scrape the URLs of all the images out of the HTML code, then visit each link individually and save each image individually.
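The scraping approach described above can be sketched in Python with the standard library instead of wget (a swapped-in technique, not a wget answer). The class ImageCollector and the function find_images are hypothetical names; the prefix is the one from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect <img src> URLs that start with a given prefix."""

    def __init__(self, base_url, prefix):
        super().__init__()
        self.base_url = base_url
        self.prefix = prefix
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if not src:
            return
        # Resolve relative links against the page URL before filtering
        absolute = urljoin(self.base_url, src)
        if absolute.startswith(self.prefix) and absolute not in self.urls:
            self.urls.append(absolute)


def find_images(html, base_url, prefix):
    """Return the unique image URLs in html that start with prefix."""
    parser = ImageCollector(base_url, prefix)
    parser.feed(html)
    return parser.urls
```

Each returned URL could then be saved individually, for example with urllib.request.urlretrieve(url, filename).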