Excel VBA: get error code for invalid URL in hyperlink with WinHttpRequest - excel

In Excel, I have a list with URLs. I need to check if IE (default browser) can open these. They don't have to open actually, it's to check the accessibility.
If they can't open, I need to isolate the error-code and place that in another column.
After searching around here, I started with following the hyperlinks, and used GET to get the data in a MsgBox. This seems to work partially, but of course now I get the MsgBox with every URL also without error. Also I'm looking for a way to extract the error and place that in the active sheet.
What I've got so far:
Sub Request_Data()
' declare
numRow = 2
Dim MyRequest As Object
' activate URLs without Follow
Do While ActiveSheet.Range("C" & numRow).Hyperlinks.Count > 0
numRow = numRow + 1
' create request
Set MyRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
MyRequest.Open "GET", _
ActiveSheet.Range("C" & numRow)
' send request
MyRequest.Send
' outcome
MsgBox MyRequest.ResponseText
' isolate the error code (for example 404)
' place error code in excel sheet in column H next to row URL
Loop
End Sub
Does someone know how I should proceed?
I thought this might be useful but I don't know where to start.
Checking for broken hyperlinks in Excel
and
Bulk Url checker macro excel
Thanks in advance

See the code below - you will need to adapt the Test sub-routine to loop through your cells and call IsValidUrl for each value you want to test:
Option Explicit
Sub Test()
MsgBox IsValidUrl("http://www.thisdoesnotexistxxxxxxxxxxxxx.com/")
MsgBox IsValidUrl("http://www.google.com/")
MsgBox IsValidUrl("http://www.ppppppppppppqqqqqqqqqqqqqqrrrrrrrrrrrrr.com/")
End Sub
Function IsValidUrl(strUrl As String) As Long
Dim objRequest As Object
Dim lngCode As Long
On Error GoTo ErrHandler
Set objRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
With objRequest
.Open "GET", strUrl
.Send
lngCode = 0
End With
GoTo ExitHandler
ErrHandler:
lngCode = Err.Number
ExitHandler:
Set objRequest = Nothing
IsValidUrl = lngCode
End Function
My output is:
-2147012889
0
-2147012889

Related

Open series of links in cells, search website content and return a value

I'm looking to carry out a task in VBA of searching a number of websites for a specified phrase.
I've got a spreadsheet with a dynamic data which changes accordingly creating a link in column F.
I am looking for a macro to open each individual link, search website content for a specifif phrase and then if the phrase is present on the website then to return a value in a cell to the right Offset by 0,1 otherwise to leave the cell blank and move to next row.
Is such task possible to carry out through vba?
I've tried researching similar requests but the result is not what I require.
Code I found online is kind of what I need but it produces the same result if the phrase is present or absent on the website.
Option Explicit
Sub SearchForString()
Dim rngURL As Range
Dim cll As Range
Dim stCheck As String
Dim xmlHttp As Object
On Error Resume Next
Set xmlHttp = CreateObject("Microsoft.XMLHTTP")
If xmlHttp Is Nothing Then
MsgBox "Unable to create XMLHTTP object, it's probably not installed on this machine", vbCritical
Exit Sub
End If
Set rngURL = Application.InputBox("Select the range of URLs to check", "Select Range", Selection, Type:=8)
On Error GoTo 0
If rngURL Is Nothing Then Exit Sub
stCheck = InputBox("Enter the text to search", "", "")
If Len(stCheck) = 0 Then Exit Sub
For Each cll In rngURL.Cells
If CheckURL(xmlHttp, cll.Value, stCheck) Then
cll.Offset(, 1).Value = 1
End If
Next cll
End Sub
Private Function CheckURL(ByRef xmlHttp As Object, ByVal URL As String, ByVal stCheck As String) As Boolean
Dim stResult As String
If Not LCase$(URL) Like "http://*" Then
URL = "http://" & URL
End If
xmlHttp.Open "GET", URL, False
xmlHttp.Send ""
If xmlHttp.readyState = 4 Then
If xmlHttp.Status = 200 Then
stResult = xmlHttp.responseText
If InStr(1, stResult, stCheck, vbBinaryCompare) > 0 Then
CheckURL = True
End If
End If
End If
End Function
Many thanks in advance.

VBA in Excel 365: XMLHTTP - Can't search for presence of multiple strings for URLs?

So I have a long list of URLs in an Excel sheet. Using a VBA macro I'm checking for the presence of a number of strings in each URL's HTML. I need to know which URL has which combination of strings in them.
I'm outputting the presence or absence of the strings in the main table as essentially: URL being looked at - col 1, presence of 1st string in that URL? - col 2, presence of 2nd string in that URL? - col 3
(And if I can work out how to get this code working, I'll be checking for a few more strings and outputting their presence or absence in further columns).
The strings I'm searching for are encoded into the VBA string.
My code works fine if I'm just searching for the 1st string.
When I adjust the code to search for a 2nd string as well, it doesn't output as it should if my 2nd string is present too.
I'm using a PC, MS Excel 2007 (it's the MS 365 suite on a Lenovo brought this year with Windows 10. I think I'm messing up something easy in the code here, but I can't quite work out how to search for multiple strings rather than just one, and output which ones are present in separate columns next to the URL in question.
Here's the whole code:
Option Explicit
Sub SearchForString()
Dim rngURL As Range
Dim cll As Range
Dim stCheck As String
Dim xmlHttp As Object
On Error Resume Next
Set xmlHttp = CreateObject("Microsoft.XMLHTTP")
If xmlHttp Is Nothing Then
MsgBox "Unable to create XMLHTTP object, it's probably not installed on this machine", vbCritical
Exit Sub
End If
Set rngURL = Application.InputBox("Select the range of URLs to check", "Select Range", Selection, Type:=8)
On Error GoTo 0
If rngURL Is Nothing Then Exit Sub
stCheck = "Stringtofind1"
If Len(stCheck) = 0 Then Exit Sub
For Each cll In rngURL.Cells
If CheckURL(xmlHttp, cll.Value, stCheck) Then
cll.Offset(, 1).Value = "1st string found" 'This all works fine
End If
' ** Start of Bit I can't get to work **
' ** I want to check for a second string, and see if that's here. But I can't seem to get this bit to work? **
If CheckURL(xmlHttp, cll.Value, "2ndStringtofind") Then
cll.Offset(, 2).Value = "2nd string also found"
End If
' ** End of Bit I can't get to work **
Next cll
End Sub
Private Function CheckURL(ByRef xmlHttp As Object, ByVal URL As String, ByVal stCheck As String) As Boolean
Dim stResult As String
If Not LCase$(URL) Like "https://*" Then
URL = "https://" & URL
End If
xmlHttp.Open "GET", URL, False
xmlHttp.Send ""
If xmlHttp.readyState = 4 Then
If xmlHttp.Status = 200 Then
stResult = xmlHttp.responseText
If InStr(1, stResult, stCheck, vbBinaryCompare) > 0 Then
CheckURL = True 'Returns position where string is found!
End If
End If
End If
End Function
To make it easier to read, here's the bit where I think that I'm struggling when I add it to the script:
If CheckURL(xmlHttp, cll.Value, "2ndStringtofind") Then
cll.Offset(, 2).Value = "2nd string also found"
End If
2ndStringtofind isn't a variable, but the actual text I'm looking for.
All help gratefully received in debugging this code. Thanks guys.

Excel macro to search a website with excel data and extract specific results and then loop for next value for another webiste

I have replicated the code in Excel macro to search a website with excel data and extract specific results and then loop for next value, although I get a error on the line URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2 stating "object variable or with block variable not set"
So I am just trying to replicate the code for another website.
This code pulls in a certain text and spits out a value from the webiste.
So I would like to enter in MFR SKU in sheet 1 as such:
Name // SKU // Price
WaterSaverFaucet // SS902BC
After I have created a macro button on sheet 2 and clicking it
Then have it spit out the price.
So that it ends up like this below:
Name // SKU // Price
WaterSaverFaucet // SS902BC // 979.08
I would need this in order to look up multiple items on a website.
Sub LoopThroughBusinesses1()
Dim i As Integer
Dim SKU As String
For i = 2 To Sheet1.UsedRange.Rows.Count
SKU = Sheet1.Cells(i, 2)
Sheet1.Cells(i, 3) = URL_Get_SKU_Query1(SKU)
Next i
End Sub
Function URL_Get_SKU_Query1(strSearch As String) As String ' Change it from a Sub to a Function that returns the desired string
' strSearch = Range("a1") ' This is now passed as a parameter into the Function
Dim entityRange As Range
With Sheet2.QueryTables.Add( _
Connection:="URL;https://www.neobits.com/SearchBySKU.aspx?SearchText=" & strSearch & "&safe=active", _
Destination:=Sheet2.Range("A1")) ' Change this destination to Sheet2
.BackgroundQuery = True
.TablesOnlyFromHTML = True
.Refresh BackgroundQuery:=False
.SaveData = True
End With
' Find the Range that has "Price"
Set entityRange = Sheet2.UsedRange.Find("Price")
' Then return the value of the cell to its' right
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
' Clear Sheet2 for the next run
Sheet2.UsedRange.Delete
End Function
Your logic is flawed unfortunately. You cannot simply take the mechanism from one webpage and assume it works for the next. In this case the solution you are trying will not work. When you enter a SKU into search what actually happens is a page re-direct (302). Not the construction of an url as you have tried. You are getting the error you see primarily due to hitting a page not found - though surfaces due to your element not being found on the 404 page.
Instead, you can use the construct the page in question actually uses for initial url and then you can use xmlhttp which will follow the re-direct as follows:
VBA:
Option Explicit
Public Sub GetPrices()
Dim xhr As XMLHTTP60, html As HTMLDocument, ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set xhr = New XMLHTTP60
Set html = New HTMLDocument
Dim allData()
allData = ws.UsedRange.Value
With xhr
For i = 2 To UBound(allData, 1)
.Open "GET", "https://www.neobits.com/search?keywords=" & allData(i, 2), False
.send
Dim price As Object
html.body.innerHTML = .responseText
Set price = html.querySelector("#main_price")
If Not price Is Nothing Then
allData(i, 3) = price.innerText
Else
allData(i, 3) = "No price found"
End If
Set price = Nothing
Next
End With
ws.Cells(1, 1).Resize(UBound(allData, 1), UBound(allData, 2)) = allData
End Sub
I assume your page set-up, in Sheet1, is as follows:
Required project references:
The two references bounded in red are required. Press Alt+F11 to open the VBE and then go Tools > References and add references. You may have a different version number for xml library - in which case reference will need changing as will code references of
Dim xhr As XMLHTTP60
and
New XMLHTTP60
To run this code:
Press Alt+F11 to open the VBE > Right click in project explorer > Add standard module. Paste code into that standard module > Select anywhere inside the code and press F5, or hit the green Run arrow in the ribbon.
You could further develop, for example, to handle non 200 status codes:
Option Explicit
Public Sub GetPrices()
Dim xhr As XMLHTTP60, html As HTMLDocument, ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set xhr = New XMLHTTP60
Set html = New HTMLDocument
Dim allData(), price As Object
allData = ws.UsedRange.Value
With xhr
For i = 2 To UBound(allData, 1)
.Open "GET", "https://www.neobits.com/search?keywords=" & allData(i, 2), False
.send
If .Status <> 200 Then
allData(i, 3) = "Status not succeeded" '<== Little bit loose but you get the idea.
Else
html.body.innerHTML = .responseText
Set price = html.querySelector("#main_price")
If Not price Is Nothing Then
allData(i, 3) = price.innerText
Else
allData(i, 3) = "No price found"
End If
Set price = Nothing
End If
Next
End With
ws.Cells(1, 1).Resize(UBound(allData, 1), UBound(allData, 2)) = allData
End Sub
' Find the Range that has "Entity Type:"
Set entityRange = Sheet2.UsedRange.Find("Lists At:")
' Then return the value of the cell to its' right
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
The problem is that Range.Find may not find what you're looking for, for various reasons. Always specify the optional parameters to that function, since it otherwise "conveniently remembers" the values from the last time it was invoked - either from other VBA code, or through the Excel UI (IOW there's no way to be 100% sure of what values it's going to be running with if you don't specify them). But even then, if Range.Find doesn't find what it's looking for, it will return Nothing - and you can't just assume that will never happen!
But, reading closer...
' Find the Range that has "Entity Type:"
Set entityRange = Sheet2.UsedRange.Find("Lists At:")
Someone's lying. Read the comment. Now read the code. Who's telling the truth? Don't write comments that say "what" - have comments say "why", and let the code say "what". Otherwise you have situations like that, where it's impossible to tell whether the comment is outdated or the code isn't right, at least not without looking at the worksheet.
In any case, you need to make sure entityRange isn't Nothing before you try to make a member call against it:
If Not entityRange Is Nothing Then
URL_Get_SKU_Query1 = entityRange.Offset(0, 1).Value2
End If

Getting HTML Source with Excel-VBA

I would like to direct an excel VBA form to certain URLs, get the HTML source and store that resource in a string. Is this possible, and if so, how do I do it?
Yes. One way to do it is to use the MSXML DLL - and to do that you need to add a reference to the Microsoft XML library via Tools->References.
Here's some code that displays the content of a given URL:
Public Sub ShowHTML(ByVal strURL)
On Error GoTo ErrorHandler
Dim strError As String
strError = ""
Dim oXMLHTTP As MSXML2.XMLHTTP
Set oXMLHTTP = New MSXML2.XMLHTTP
Dim strResponse As String
strResponse = ""
With oXMLHTTP
.Open "GET", strURL, False
.send ""
If .Status <> 200 Then
strError = .statusText
GoTo CleanUpAndExit
Else
If .getResponseHeader("Content-type") <> "text/html" Then
strError = "Not an HTML file"
GoTo CleanUpAndExit
Else
strResponse = .responseText
End If
End If
End With
CleanUpAndExit:
On Error Resume Next ' Avoid recursive call to error handler
' Clean up code goes here
Set oXMLHTTP = Nothing
If Len(strError) > 0 Then ' Report any error
MsgBox strError
Else
MsgBox strResponse
End If
Exit Sub
ErrorHandler:
strError = Err.Description
Resume CleanUpAndExit
End Sub
Just an addition to the above response. The question was how to get the HTML source which the stated answer does not actually provide.
Compare the contents of oXMLHTTP.responseText with the source code in a browser for URL "http://finance.yahoo.com/q/op?s=T+Options". They do not match and even the returned values are different. (This should be executed after hours to avoid changes during the trading day.)
If I find a way to perform this task the basic code will be posted.
Compact getHTTP function
Below is a compact & generic function that will return HTTP response from a specified URL to, for example:
return the HTML Source of a web page,
JSON response from an API URL,
parse a text file at a URL, etc.
This does not require any VBA References since MSXML2 is used as a late-bound object.
Public Function getHTTP(ByVal url As String) As String
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", url, False: .Send
getHTTP = StrConv(.responseBody, vbUnicode)
End With
End Function
Note that this basic function has no validation or error handling, as those are the parts that can vary considerably depending on which URL you're hitting.
If desired, check the value of .Status after the .Send) to check for success codes like 0 or 200, and also you can setup an error trap with On Error Goto... (never Resume Next!)
Example Usage:
This procedure scrapes this Stack Overflow page for the current score of this question.
Sub demo_getVoteCount()
Const answerID$ = 2522760
Const url_SO = "https://stackoverflow.com/a/" & answerID
Dim html As String, startPos As Long, voteCount As Variant
html = getHTTP(url_SO) 'get html from url
startPos = InStr(html, "answerid=""" & answerID) 'locate this answer
startPos = InStr(startPos, html, "vote-count-post") 'locate vote count
startPos = InStr(startPos, html, ">") + 1 'locate value
voteCount=Mid(html,startPos,InStr(startPos,html,"<")-startPos) 'extract score
MsgBox "Answer #" & answerID & " has a score of " & voteCount & "."
End Sub
Of course in reality there are far better ways to get the score of an answer than the example above, such as this way.)

Sort dead hyperlinks in Excel with VBA?

The title says it:
I have an excel Sheet with an column full of hyperlinks. Now I want that an VBA Script checks which hyperlinks are dead or work and makes an entry into the next columns either with the text 404 Error or active.
Hopefully someone can help me because I am not really good at VB.
EDIT:
I found # http://www.utteraccess.com/forums/printthread.php?Cat=&Board=84&main=1037294&type=thread
A solution which is made for word but the Problem is that I need this solution for Excel. Can someone translate this to Excel solution?
Private Sub testHyperlinks()
Dim thisHyperlink As Hyperlink
For Each thisHyperlink In ActiveDocument.Hyperlinks
If thisHyperlink.Address <> "" And Left(thisHyperlink.Address, 6) <> "mailto" Then
If Not IsURLGood(thisHyperlink.Address) Then
Debug.Print thisHyperlink.Address
End If
End If
Next
End Sub
Private Function IsURLGood(url As String) As Boolean
' Test the URL to see if it is good
Dim request As New WinHttpRequest
On Error GoTo IsURLGoodError
request.Open "GET", url
request.Send
If request.Status = 200 Then
IsURLGood = True
Else
IsURLGood = False
End If
Exit Function
IsURLGoodError:
IsURLGood = False
End Function
First add a reference to Microsoft XML V3 (or above), using Tools->References. Then paste this code:
Option Explicit
Sub CheckHyperlinks()
Dim oColumn As Range
Set oColumn = GetColumn() ' replace this with code to get the relevant column
Dim oCell As Range
For Each oCell In oColumn.Cells
If oCell.Hyperlinks.Count > 0 Then
Dim oHyperlink As Hyperlink
Set oHyperlink = oCell.Hyperlinks(1) ' I assume only 1 hyperlink per cell
Dim strResult As String
strResult = GetResult(oHyperlink.Address)
oCell.Offset(0, 1).Value = strResult
End If
Next oCell
End Sub
Private Function GetResult(ByVal strUrl As String) As String
On Error Goto ErrorHandler
Dim oHttp As New MSXML2.XMLHTTP30
oHttp.Open "HEAD", strUrl, False
oHttp.send
GetResult = oHttp.Status & " " & oHttp.statusText
Exit Function
ErrorHandler:
GetResult = "Error: " & Err.Description
End Function
Private Function GetColumn() As Range
Set GetColumn = ActiveWorkbook.Worksheets(1).Range("A:A")
End Function
Gary's code is perfect, but I would rather use a public function in a module and use it in a cell as function. The advantage is that you can use it in a cell of your choice or anyother more complex function.
In the code below I have adjusted Gary's code to return a boolean and you can then use this output in an =IF(CHECKHYPERLINK(A1);"OK";"FAILED"). Alternatively you could return an Integer and return the status itself (eg.: =IF(CHECKHYPERLINK(A1)=200;"OK";"FAILED"))
A1: http://www.whatever.com
A2: =IF(CHECKHYPERLINK(A1);"OK";"FAILED")
To use this code please follow Gary's instructions and additionally add a module to the workbook (right click on the VBAProject --> Insert --> Module) and paste the code into the module.
Option Explicit
Public Function CheckHyperlink(ByVal strUrl As String) As Boolean
Dim oHttp As New MSXML2.XMLHTTP30
On Error GoTo ErrorHandler
oHttp.Open "HEAD", strUrl, False
oHttp.send
If Not oHttp.Status = 200 Then CheckHyperlink = False Else CheckHyperlink = True
Exit Function
ErrorHandler:
CheckHyperlink = False
End Function
Please also be aware that, if the page is down, the timeout can be long.

Resources