downloading images from webpage with different links - linux

I am trying to download all the images from a webpage. The images are included as follows:
How should I parametrize my wget command so that it only downloads the images whose link starts with "https://alwaysSamePart.com/"? What follows that prefix varies every time, so I can't just specify a hardcoded link.
Ideally I would scrape the URLs of all the images out of the HTML code, then visit each link individually and save each image individually.
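The scrape-then-download idea described above could be sketched in Python with only the standard library, as an alternative to doing everything in wget. This is a minimal sketch, not a tested recipe for any particular site: the names `ImageCollector`, `extract_image_urls`, and `download_images` are made up for illustration, and it assumes the images appear as plain `<img src="...">` tags in the served HTML (images injected by JavaScript would not be found).

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve

# The common prefix named in the question.
PREFIX = "https://alwaysSamePart.com/"


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag whose absolute URL starts with a prefix."""

    def __init__(self, base_url, prefix):
        super().__init__()
        self.base_url = base_url
        self.prefix = prefix
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if not src:
            return
        absolute = urljoin(self.base_url, src)  # resolve relative links
        if absolute.startswith(self.prefix) and absolute not in self.urls:
            self.urls.append(absolute)


def extract_image_urls(html, base_url, prefix=PREFIX):
    """Return the image URLs in html that start with prefix, in page order."""
    collector = ImageCollector(base_url, prefix)
    collector.feed(html)
    return collector.urls


def download_images(page_url, dest_dir="images", prefix=PREFIX):
    """Fetch page_url, then save every matching image into dest_dir."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for index, url in enumerate(extract_image_urls(html, page_url, prefix)):
        # Fall back to a numbered name when the URL path has no file name.
        name = Path(urlparse(url).path).name or f"image_{index}"
        urlretrieve(url, str(dest / name))
```

Calling `download_images("https://example.com/gallery")` (a placeholder address) would save each matching image under `images/`. Filtering happens on the resolved absolute URL, so relative `src` values are also compared against the prefix.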

Related

Nuxtjs including images from assets using the content module

I am trying to use the Nuxt.js content module. Is there a way to display images we refer to in our blogpost.md file? I know that we can put images in the front matter, but I want the creators of the articles to be able to put images inside their own .md files. Let's say we have a file:
-- start of .md file --
# some text
some description
![Image of test](../../assets/images/test.jpg)
![Image of Yaktocat](https://octodex.github.com/images/yaktocat.png)
# some text
some description
-- end of .md file --
I end up seeing the image that is linked using https, but the other image is not displayed. When inspecting the page I see that an <img> tag is created, but no image is shown.
When I check the file in any other markdown editor, I see the image.
Images included by URL are displayed, but I need to include locally stored images.
Any help is greatly appreciated.
This partially solves the issue: some images (such as cover images) can be added using metadata. They cannot be placed within the content itself, only just above or below the actual content.
---
title: Sample
image: test.jpg
---
Here the markdown (without images)
In the view you can render that image after or before the content using something like:
<img :src="require(`~/assets/images/${ page.image }`)">
<nuxt-content :document="page"/>

Node.js how to download webp image

It's very easy to download images via the request module, but this only works for me when the end of the URL contains .jpg or .png.
But how can you download, for example, this image?
https://lh3.googleusercontent.com/VpoWDgQ2I_RlTNM1Srlo5Q0VQglr-gdbzJ48TwYRXM2U4iF75PMrv76rBiu5c3l1UJs=s180-rw
Does anybody know a method to download the image as .jpg?
I found a solution on howtogeek:
"Click the URL bar, delete the last three characters in the address (the “-rw”), and then press “Enter.” The same image will be displayed again, but this time it’s rendered in its original format, usually JPEG or PNG."

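The howtogeek tip quoted above amounts to a small string operation on the URL before requesting it. Here is a minimal sketch in Python (the same logic carries over to Node.js); `strip_webp_suffix` is a hypothetical helper name, and whether the server then returns JPEG or PNG is the quoted article's claim, not something this snippet checks.

```python
def strip_webp_suffix(url, suffix="-rw"):
    """Return url without the trailing suffix, or unchanged if it is absent."""
    if url.endswith(suffix):
        return url[: -len(suffix)]
    return url


original = ("https://lh3.googleusercontent.com/"
            "VpoWDgQ2I_RlTNM1Srlo5Q0VQglr-gdbzJ48TwYRXM2U4iF75PMrv76rBiu5c3l1UJs=s180-rw")
print(strip_webp_suffix(original))  # the same URL, now ending in "=s180"
```

The rewritten URL can then be passed to whatever download call is already in use.
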
how to convert wechat official account articles with images to pdf

For general HTML, when pdfkit is used to convert HTML to PDF, the images in the HTML are saved in the PDF.
For WeChat official account articles, however, I found that the images referenced by URL were lost. The following code is an example.
How can I save WeChat official account articles with images to PDF?
import pdfkit
url='https://mp.weixin.qq.com/s?__biz=MzA3NDMyOTcxMQ==&mid=2651249314&idx=1&sn=5338576a80a4145b9808ff06cc980c14'
path_wkthmltopdf = 'C:/Anaconda3/Lib/site-packages/wkhtmltopdf/bin/wkhtmltopdf.exe'
pdfkit.from_url(url=url,output_path='c:/test.pdf',configuration=pdfkit.configuration(wkhtmltopdf=path_wkthmltopdf))
I think one solution is to scroll down the page so that all images load, then convert it to PDF. How can I scroll down to load all images in pdfkit?
The following should work without modifying the Windows environment variables:
import pdfkit
path_wkthmltopdf = r'C:\Python27\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=path_wkthmltopdf)
url = 'https://mp.weixin.qq.com/s?timestamp=1515570589&src=3&ver=1&signature=xsZdozV1JPS2K8SuXJ8TKeqfuczP2z78*LCVu32ljt1NSa8oF41X88W0JYguTbLUwHHyt0ksUy8l9ljM5*uGOSH-GBjlVipz4a1aIeg9xNQgwlxuCV*9dURcg-U8UvR78C2RV6B5CIeA0n1jIaiFiqrQTIuel5IW-HYAcQsOT0g='
pdfkit.from_url(url, "out.pdf", configuration=config)
This assumes the path is correct (e.g. in my case it is r'C:\Program Files (x86)\wkhtmltopdf\bin\wkhtmltopdf.exe').
Result:
Loading page (1/2)
Printing pages (2/2)
Done

a4j:mediaOutput loads images slowly. What can be the reason?

I am inspecting a portal page where images load very slowly.
We read the image names from a database and the images themselves from the filesystem, build a list, and show the results using the a4j:mediaOutput tag, but the images load very slowly.
http://www.easyrenting.com/list-detail/3bhk-ardee-city-sector-52/6263
The first problem I see is that all your pictures are high-res (1800px x 2400px).
You really should create thumbnails server side to meet your view requirements, and load images according to the size you want to show on the client side.
Have you checked that your web page weighs about 6.5MB including all images? (Check with Firebug.)
I would recommend a custom servlet like this one, FileServlet supporting resume and caching with GZIP, and a URL pattern that serves the full-res image or the thumbnail depending on the requirement.
There is no problem using the a4j:mediaOutput tag.
The images load slowly because they are too large; you need to find a way to optimize the image size. You could probably resize the images before saving them to your file system.
Unless you offer zoom functionality, you do not need images this big.
That should help!
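To make the thumbnail advice concrete, here is a rough sketch of the size arithmetic only, written in Python for brevity. `thumbnail_size` is a hypothetical helper, the 300x300 display box is an assumed example value, and the actual resampling would still need an imaging library on the server.

```python
def thumbnail_size(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


# The 1800px x 2400px pictures mentioned above, fitted into a 300x300 box.
print(thumbnail_size(1800, 2400, 300, 300))  # (225, 300)
```

At that size each picture has 1/64 of the original pixel count, which is where most of the page-weight saving would come from.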

Hide previously viewed images using a Greasemonkey script

I'm trying to write a Greasemonkey script that will hide images in Google Image Search that I've already seen, so that I won't be shown the same image twice when making multiple searches. Is there any way to obtain a list of images from search results using a Greasemonkey script, so that I can prevent the images from loading after they have been viewed?
var blockedImageList = [];
function blockImages(imageList) {
    // prevent images from loading if they are on the blocked image list,
    // which is an array of image URLs
}
function getAllImagesFromPage() {
    // add all of the images on the current page to the blocked image list.
}
