Image url extraction using excel vba - excel

Iam working with excel 2016. I need to extract the link of an image from a website using VBA in excel.
Example, i have a website that shows a product with the link : https://www.hikvision.com/en/products/Turbo-HD-Products/Turbo-HD-Cameras/Value-Series/ds-2ce56d0t-vpir3f/
My image is into a div , like that :
<div class="slide-image" style="background-image: url('/content/dam/hikvision/products/HIKVISION/Turbo_HD_Products/Turbo_HD_Cameras/Value_Series/D0T_Series/DS-2CE56D0T-VPIR3F/images/2CE56D0T-半球11-正视图.png.thumb.1280.1280.png');"></div>
I tried this :
Private Sub btnExtractURL_Click()
Dim sourceString As String
Dim rowIdx As Integer, rowMax As Integer
Dim posFirst As Integer, posLast As Integer, chrLength As Integer
rowMax = Range("A3").End(xlDown).Row
' ---
For rowIdx = 3 To rowMax
If Cells(rowIdx, 1).Value <> "" Then
Cells(rowIdx, 2).Value = ""
sourceString = Cells(rowIdx, 1).Value
posFirst = InStr(sourceString, "http")
posLast = InStr(posFirst, sourceString, """")
chrLength = (posLast - 1) - (posFirst - 1)
Cells(rowIdx, 2).Value = Mid(sourceString, posFirst, chrLength)
End If
Next
' ---
MsgBox "finished"
End Sub
But i have an error with this solution... I tried to extract the text to see another method, and it work's but when i insert the class of that image, it doesn't work !
Sub Get_Web_Data()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
' Website to go to.
website = "https://www.hikvision.com/en/products/Turbo-HD-Products/Turbo-HD-Cameras/Value-Series/ds-2ce56d0t-vpir3f/"
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response
' Get the price from the specified element on the page.
Name = html.getElementsByClassName("prod_name").Item(0).innerText
' Output the price into a message box.
MsgBox Name
End Sub
Can you give an idea to extract this image and copy the link into my excel ?

Related

Print each item in a JSON Object as separate row in excel file using VBA

I'm trying to read the json from an URL, and paste the json object items as a separate row in an excel sheet.
But till now, I'm able to fetch the entire json object into the excel in single column A1.
I tried to iterate through the json objects and print in separate rows.
But getting 424 Error : Object Qualifier
Need your help on the same.
Sample Data : {"data":{"id":3,"name":"true red","year":2002,"color":"#BF1932","pantone_value":"19-1664"},"support":{"url":"https://reqres.in/#support-heading","text":"To keep ReqRes free, contributions towards server costs are appreciated!"}}
Private Sub HTML_VBA_Extract_Data_From_Website_To_Excel()
Dim oXMLHTTP As Object
Dim sPageHTML As String
Dim sURL As String
'Change the URL before executing the code. URL to Extract data from.
sURL = "https://reqres.in/api/products/3"
'Extract data from website to Excel using VBA
Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
oXMLHTTP.Open "GET", sURL, False
oXMLHTTP.send
sPageHTML = oXMLHTTP.responseText
Dim jsonObject As Object
Set jsonObject = JsonConverter.ParseJson(sPageHTML)
Dim i As Integer
Dim Item As Variant
i = 2
For Each Item In jsonObject.Keys
ThisWorkbook.Sheets(1).Cells(i, 1).Value = Item
i = i + 1
Next
MsgBox "XMLHTML Fetch Completed"
End Sub
This outputs:
data
support
--------
id 3
name true red
year 2002
color #BF1932
pantone_value 19-1664
to the Immediate window, as expected.
Private Sub JsonTester()
Dim oXMLHTTP As Object
Dim sURL As String
Dim jsonObject As Object
Dim Item As Variant, data As Object
sURL = "https://reqres.in/api/products/3"
Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
oXMLHTTP.Open "GET", sURL, False
oXMLHTTP.send
Set jsonObject = JsonConverter.ParseJson(oXMLHTTP.responseText)
For Each Item In jsonObject.keys
Debug.Print Item
Next
Debug.Print "--------"
Set data = jsonObject("data")
For Each Item In data.keys
Debug.Print Item, data(Item)
Next
End Sub

Converting weird characters and symbols into normal language in excel

I am using the VBA code to extract information from a website into excel cells, and the numerical information is fine but I have a problem with text strings. I am mostly extracting information from Georgian websites, and the texts with the Georgian language are not properly displayed in excel, so I was wondering if there is any chance (code or something else) I could convert these symbols into proper language.
Sub GetData()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
Dim address As Variant
Dim x As Integer
Dim y As Range
x = 1
Do Until x = 9
Set y = Worksheets(1).Range("A21:A200"). _
Find(x, LookIn:=xlValues, lookat:=xlWhole)
website = "https://www.myhome.ge/ka/pr/11247371/iyideba-Zveli-ashenebuli-bina-veraze-T.-WoveliZis-qucha"
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there.
request.Open "GET", website, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object.
html.body.innerHTML = response
' Get info from the specified element on the page.
address = html.getElementsByClassName("address").Item(0).innerText
price = html.getElementsByClassName("d-block convertable").Item(0).innerText
y.Offset(0, 1).Value = address
y.Offset(0, 5).Value = price
x = x + 1
Loop
End Sub
This is the code that I took from a youtube video (https://www.youtube.com/watch?v=IOzHacoP-u4) and slightly modified, and it works, I just have a problem with how excel displays the characters in text strings.
For your issue in the question
Remove this line response = StrConv(request.responseBody, vbUnicode) as it's not required.
Change html.body.innerHTML = response to html.body.innerHTML = request.responseText.
For your issue in comment
To retrieve the ID of the property, it can be retrieved from the class id-container, you will need to perform some string processing though to remove the extract :
propertyID = Trim$(Replace(html.getElementsByClassName("id-container")(0).innerText, ":", vbNullString))
Note: You should try to avoid declaring variable as Variant. innerText property returns a String datatype so you should declare address and price as String.

I have questions about how to click and search on web using vba

I have questions about how to click and search on web using vba.
I have wrote the code, but cannot find how to click the button in this web
Sub LEISearch()
'dimension (declare or set aside memory for) our variables
Dim objIE As InternetExplorer 'special object variable representing the IE browser
Dim LEI As HTMLLinkElement 'special object variable for an <a> (link) element
Dim y As Integer 'integer variable we'll use as a counter
Dim result As String 'string variable that will hold our result link'
Dim result2 As String
'initiating a new instance of Internet Explorer and asigning it to objIE
Set objIE = New InternetExplorer
'make IE browser visible (False would allow IE to run in the background)
objIE.Visible = True
'navigate IE to this web page (a pretty neat search engine really)
objIE.navigate "https://www.gmeiutility.org/search.jsp?keyWord"
'wait here a few seconds while the browser is busy
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
'in the search box put cell "A2" value, the word "in" and cell "C1" value
objIE.document.getElementById("searchInput").Value = _
Sheets("Macro1").Range("A1").Value
'click the 'go' button
Set LEIButton = objIE.document.getElementsByClassName("hiddenSubmitButton")
LEIButton.Focus
LEIButton.Click
'wait again for the browser
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
End Sub
This case is a really nice and clean example of web scraping so i will take this opportunity to present an educational post.
I highly recommend avoiding using IE to scrape websites whenever it's possible. It's highly inefficient. Especially in a case like this where there can be multiple pages of results. Instead, you can use HTTP requests.
An HTTP request is a structured way to request something from a server. In this case we want to send a keyword to the server and get the corresponding search results.
To find out how this request should look like, you have to inspect the network traffic when the button with the magnifying glass is clicked. You can do that through your browser's developer tools (Ctrl+Shift+E if you're using Firefox):
If you go through the Headers and the Params of the request you will see how the url, the body and the headers should look like. In this particular case, all the parameters are encoded into the url and the headers are not essential to the success of the request, so all you need is the url.
Some of the parameters of the url are the keyword, the number of results per page and the number of page.
The response's payload is in json format. You can inspect its structure using a tool like this. Here's how it looks like:
Basically the JSON response consists of as many results as you have specified that should be displayed per page (or less). To get the next page you need to send a new request with the same keyword but specifying a new page number and so on.
In fact, as you can see the website offers a lot more data than what's displayed on your browser, which could prove to be useful.
The code below searches for the keyword test, while requesting 25 results per page. One first request is sent to find out how many pages of results are there and then the code loops through all pages and prints the results in a worksheet.
TL;DR
Option Explicit
Sub main()
Dim sht As Worksheet
Dim totalNumberOfPages As Long
Dim searchResults As Object
Dim pageNumber As Long
Dim results() As String
Dim entity As Object
Dim i As Long, j As Long
Dim rng As Range
Set sht = ThisWorkbook.Worksheets("Name of your Worksheet")
''''''First request to find out the number of pages''''''
Set searchResults = XHRrequest("test", 25, 1) '
totalNumberOfPages = searchResults("totalPages") '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'''''''''''''''''''Loop through all the pages''''''''''''''''''''''''''''''''
For pageNumber = 1 To totalNumberOfPages Step 1 '
Set searchResults = XHRrequest("test", 25, pageNumber) '
ReDim results(1 To searchResults("entitySearchResult").Count, 1 To 7) '
i = 0 '
'''''''''''write the results in an array'''''''''''''''''''''''''''' '
For Each entity In searchResults("entitySearchResult") ' '
i = i + 1 ' '
results(i, 1) = entity("LEINumber") ' '
results(i, 2) = entity("legalName") ' '
results(i, 3) = entity("city") ' '
results(i, 4) = entity("headquartersCountry") ' '
results(i, 5) = entity("recordStatus") ' '
results(i, 6) = entity("renewalStatus") ' '
results(i, 7) = entity("entityStatus") ' '
Next entity ' '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '
'''''''''''''''write all the results in the worksheet in one go''''' '
With sht ' '
Set rng = .Range("A" & .Rows.Count).End(xlUp).Offset(1, 0) ' '
End With ' '
rng.Resize(UBound(results, 1), UBound(results, 2)) = results ' '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '
Next pageNumber '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
End Sub
Public Function XHRrequest(ByVal key As String, ByVal resultsPerPage As Long, ByVal pageNumber As Long) As Object
Dim req As New WinHttpRequest
Dim url As String
url = "https://www.gmeiutility.org/actions/Search/?isPendingValidationChecked=true&isSearchAllLOUChecked=true&keyWord=" & key & "&page=" & pageNumber & "&resultsPerPage=" & resultsPerPage & "&searchType=baseSearch" 'build the URL according to the parameters
'''''''''Send the HTTP request'''''''''''''''''''''''''''''''
With req '
.Open "POST", url, False '
.send '
Set XHRrequest = JsonConverter.ParseJson(.responseText) '
End With '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
End Function
For demonstration purposes the code above prints all the data in worksheet named Name of your Worksheet.
If you need to perform multiple searches you can easily modify the code to best fit your needs. More specifically you can loop through multiple keywords and call the XHRrequest function using those keywords instead of "test".
Here's a sample of the output:
You will need to add the following references to your project (VBE>Tools>References):
Microsoft WinHTTP Services version 5.1
Microsoft HTML Objects Library
Microsoft Scripting Runtime
You will also need to add this JSON parser to your project. Follow the installation instructions in the link and you should be set to go.

Unable to "click" Span element in VBA/IE

I'm trying to automate a web scrape of the following URL: https://www.forebet.com/en/football-predictions-from-yesterday
I have code that pulls all of the data, however there is a More+ "button" which expands the list, and I can't seem to get a handle on how to send the click via VBA, as it's not a button and so I can't send use .Click
The HTML snippet for the section:
<tr id="mrows" style="height:57px;">
<td colspan="12"><span onclick="ltodrows("1x2","-1")">More [+]</span>
<div class="loader"></div>
</td>
</tr>
I've tried numerous ways of submitting the Click - but one have worked!
Start of my code:
Sub Button_More_Test()
Dim objIE As InternetExplorer
Dim itemEle As Object
'Dim e As Object
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.Navigate "https://www.forebet.com/en/football-predictions-from-yesterday"
Do While objIE.Busy = True Or objIE.ReadyState <> 4: DoEvents: Loop
I see this has been answered but I also see that the total number of games is 586.
Now, don't get me wrong, it's perfectly fine to scrape the data using IE and the HTML document but I can only imagine what a nightmare it must be in terms of efficiency...
That's why I will post a method to circumvent IE and the More + button and all that.
More specifically the webpage provides a very convenient way to get the data in JSON format via an HTTP request.
To find out how this request should look like, you have to inspect the network traffic when the More+ button is clicked. You can do that through your browser's developer tools (Ctrl+Shift+E if you're using Firefox):
If you go through the Headers and the Params of the request you will see how the url, the body and the headers should look like. In this particular case, all the parameters are encoded into the url and the headers are not essential to the success of the request, so all you need is the url.
The response's payload is in json format. You can inspect its structure using a tool like this. Here's how it looks like:
Basically the JSON consists of the 586 games, each one of which consists of 41 parameters with their corresponding values.
TL;DR
Here's how the code should look like:
Option Explicit
Sub forebet()
'''''''Declarations''''''''''''''''''''''''''''''''''''''''''
Dim sht As Worksheet '
Dim req As New WinHttpRequest '
Dim jsonResponse As Object, game As Object '
Dim key As Variant '
Dim url As String '
Dim headers() As String, results() As String '
Dim i As Long, j As Long, lastRow As Long, lastCol As Long '
Dim rng As Range '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Set sht = ThisWorkbook.Worksheets("Name of your Worksheet") 'store your worksheet in a variable
url = "https://www.forebet.com/scripts/getrs.php?ln=en&tp=1x2&in=-1"
'''''''''''HTTP request''''''''''''''''''''''''''''''''''''''''''
With req '
.Open "GET", url, False '
.send '
Set jsonResponse = JsonConverter.ParseJson(.responseText) '
End With '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
ReDim headers(1 To 1, 1 To jsonResponse(1).Count)
ReDim results(1 To jsonResponse.Count, 1 To jsonResponse(1).Count)
''''''''Write the headers to an array''''''''
i = 1 '
For Each key In jsonResponse(1).Keys '
headers(1, i) = key '
i = i + 1 '
Next key '
'''''''''''''''''''''''''''''''''''''''''''''
''''''''''''Write the data to an array'''''''''''''''''''''''''''''''''''''''
j = 1 '
For Each game In jsonResponse '
i = 1 '
For Each key In game.Keys '
If game(key) <> "null" Then '
results(j, i) = game(key) '
i = i + 1 '
Else '
results(j, i) = "NULL" '
i = i + 1 '
End If '
Next key '
j = j + 1 '
Next game '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
''''''''Write the headers and results arrays to the worksheet in one go''''''''''''''''''''''''''
sht.Range("A1").Resize(1, UBound(headers, 2)) = headers '
sht.Range("A2").Resize(UBound(results, 1), UBound(results, 2)) = results '
'sht.Cells.Value = sht.Cells.Value this line causes an out of memory error on 32 bit office ' '
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'''''''''''Prevent numbers form being stored as text''''''''''''
With sht '
lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row '
lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column '
Set rng = Range(.Range("A1"), .Cells(lastRow, lastCol)) '
End With '
Debug.Print rng.Address '
rng.Value = rng.Value '
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
End Sub
In fact, as you can see the website offers a lot more data, which could prove to be useful. For demonstration purposes the code above prints all the data in worksheet named Name of your Worksheet.
Here's a sample of the output:
You will need to add the following references to your project (VBE>Tools>References):
Microsoft WinHTTP Services version 5.1
Microsoft HTML Objects Library
Microsoft Scripting Runtime
You will also need to add this JSON parser to your project. Follow the installation instructions in the link and you should be set to go.
You can loop through elements of the website like so:
For Each ele In objIE.Document.getElementsByTagName("span")
If ele.innerText = "More [+]" Then
ele.Click
Exit For
End If
Next
You'll have to wait until it loads, IE.state won't change but you get body before click, and loop until it changes, for example.

Error 91 Using For Loop with XMLHttpRequest VBA

I am trying to scrape a site for the contents of various HTML tags. I am using an array of html tags against an array of URLs.
Sub pArse(UserForm4_HiddenList_Text)
Dim http As New XMLHTTP60
'Dim html As New HTMLDocument
Dim hero_true_Val As Integer
Dim down_var As Integer
Dim bill_array_redim
Dim element_tag As String
Dim address_count As Integer
hero_true_Val = Application.WorksheetFunction.CountIf(Range("B:B"), True)
down_var = 1
bill_array_redim = Split(UserForm4_HiddenList_Text, Chr(10))
ReDim address_array(hero_true_Val)
For Z = 2 To 6
If Sheets("resource").Cells(Z, 2).Value <> "False" Then
address_count = address_count + 1
address_array(address_count) = Sheets("resource").Cells(Z, 1).Value
End If
Next Z
Sheets("Sheet1").Select
For url_stack = 0 To UBound(bill_array_redim)
Sheets("sheet1").Cells(down_var, 1) = bill_array_redim(url_stack)
For what_to_check = 1 To address_count
With http
.Open "GET", bill_array_redim(url_stack), False
.send
Do While http.readyState <> 4
DoEvents
Loop
End With
Dim html As New HTMLDocument
With html
.body.innerHTML = http.responseText
' the next line is where the error occurs
Sheets("Sheet1").Cells(down_var, 2).Value = .querySelector(address_array(what_to_check)).innerText
End With
down_var = down_var + 1
Next what_to_check
Next url_stack
UserForm4.Hide
End Sub
A user clicks a command button which then feeds in an array of URLs (UserForm4_HiddenList_Text). The code then checks for the existence of the word "true" in the sheet named "resource." This generates an array of terms to check for.
the problem I'm having is that on the second iteration of the what_to_check loop, I'm getting
error 91 : Object or with block variable not set
I'm unsure of what the issue is here and it only seems to occur if I pass in two html tags rather than one.
edit: thanks for replying Cindy. The error location is bolded. The message is "Object or with block variable not set."
Turns out I was being given an error I either didn't fully understand or wasn't very clear. The issue was that the response didn't have the tag inside it that I was looking for as it wasn't present on the site. By using:
With html
.body.innerHTML = http.responseText
If .body.contains(.querySelector(tag_array(tag_array_index_no)))
Then
Sheets(1).Range(Cells(down_var, 2).Address).Value =
tag_array(tag_array_index_no)
Sheets(1).Range(Cells(down_var, 3).Address).Value =
.querySelector(tag_array(tag_array_index_no)).innerText
Else
Sheets(1).Range(Cells(down_var, 2).Address).Value =
tag_array(tag_array_index_no)
Sheets(1).Range(Cells(down_var, 3).Address).Value =
tag_array(tag_array_index_no) + " not set."
End If
End With
I'm able to check if the tag exists in the response text.

Resources