Excel VBA Automation Error when scraping web data from site - excel

I have created an application that scrapes data from a website.
The code runs as expected for a small number of iterations.
However, when the code is executed multiple times it crashes with an Automation Error.
My goal is to get daily auction results from this website, for multiple "border directions" and multiple "dates".
Having inspected the HTTP requests that are being sent when browsing the site I have been able to automate them and get the data I'm interested in, as follows:
Sub seecao()
Dim request As New WinHttpRequest
Dim htmlDoc As New MSHTML.HTMLDocument
Dim tableTest As HTMLTable
Dim rowHTML As HTMLTableRow
Dim cellHTML As HTMLTableCell
Dim requestURL As String
Dim responseJSON As Object
Dim reqBody As String
Dim reqResponse As String
Dim seecaoBordersArray As Variant
Dim areaOut As String
Dim areaIn As String
Dim i As Long
Dim j As Long
Dim k As Long
Dim tempRng As Range
seecaoBordersArray = ThisWorkbook.Worksheets("Help").Range("seecaoBordersRng")
requestURL = "http://seecao.com/views/ajax"
Application.ScreenUpdating = False
For k = LBound(seecaoBordersArray, 1) To UBound(seecaoBordersArray, 1) Step 1
areaOut = seecaoBordersArray(k, 1)
areaIn = seecaoBordersArray(k, 2)
Set tempRng = ThisWorkbook.Worksheets("seecao").Cells.Find(What:=areaOut + areaIn, After:=ActiveCell, LookIn:=xlFormulas, LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False)
i = tempRng.Offset(2, 0).Row
j = tempRng.Offset(2, 0).Column
reqBody = ThisWorkbook.Worksheets("Help").Range("A1:A1") + Format(Date + 1, "yyyy-mm-dd") + ThisWorkbook.Worksheets("Help").Range("A2:A2") + areaOut + "+-+" + areaIn + ThisWorkbook.Worksheets("Help").Range("A3:A3")
With request
.Open "POST", requestURL, False
.setRequestHeader "Host", "seecao.com"
.setRequestHeader "Accept", "application/json, text/javascript, */*; q=0.01"
.setRequestHeader "Referer", "http://seecao.com/daily-results"
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
.send reqBody
reqResponse = .responseText
End With
Set responseJSON = JsonConverter.ParseJson(reqResponse)
htmlDoc.body.innerHTML = responseJSON(3)("data")
Set tableTest = htmlDoc.getElementsByTagName("table")(0)
For Each rowHTML In tableTest.Rows
If rowHTML.Cells(0).innerText <> "Date " Then
ThisWorkbook.Worksheets("seecao").Cells(i, j) = rowHTML.Cells(2).innerText
ThisWorkbook.Worksheets("seecao").Cells(i, j + 1) = rowHTML.Cells(5).innerText
i = i + 1
End If
Next rowHTML
Next k
Application.ScreenUpdating = True
End Sub
The body of the HTTP request consists of various parameters which form a huge string. I have split this string in 3 parts to be able to edit the "date" and "border direction" and i've stored these 3 parts in ThisWorkbook.Worksheets("Help").Range("A1:A1"), ThisWorkbook.Worksheets("Help").Range("A2:A2") and ThisWorkbook.Worksheets("Help").Range("A3:A3").
All possible "border directions" are stored in a 2x2 named range seecaoBordersArray. The code loops through this range, produces the corresponding request body, sends the request and gets the response.
The response is in JSON format. The part of the response in which I'm interested is an HTML table located in the JSON string.
The JSON string is then parsed to get the HTML table. Finally the HTML table is parsed and the data of interest are being written on a worksheet.
As I said in the beginning of this post, the code works as expected most of the times, but it randomly crashes with an Automation Error depending on how many iterations are being executed.
For example it never crashes when I only execute it for 2 combinations of "borders directions".
It seems to me like some kind of performance issue rather than a bug.
Any suggestions on how to improve performance and avoid these crashes would be greatly appreciated.

Related

API Loop through vba excel

I'm very new to API coding but I have created an API Code that inputs pressure and outputs temperature for a specific refrigerant on a website. However, I can only call for one refrigerant at a time, I was wondering if it is possible to create a string with all the refrigerant id (refId) numbers, so the API code could run through all of them in one long run?
Option Explicit
Public Function GetPressureFromTemp(ByVal Temperature As Double) As Double
Dim body As String
body = "{""Temperature"":""" & Temperature & """,""refId"":""r13"",""temperatureUnit"":""fahrenheit"",""pressureUnit"":""psi"","
body = body & """pressureReferencePoint"":""gauge"","
body = body & """pressureCalculationPoint"":""bubble"",""gaugeType"":""dry"",""altitudeInMeter"":0}"
With CreateObject("MSXML2.XMLHTTP")
.Open "POST", "https://reftools.danfoss.com/api/ref-slider/Pressure?refId=r13", False
.setRequestHeader "content-type", "application/json; charset=utf-8"
.send body
GetPressureFromTemp = CDbl(.responseText)
End With
End Function
Public Sub test()
Dim n As Double
Dim a As Double
a = 4984
n = -150.8
Do Until n >= 74.93
n = n + 0.1
'Debug.Print GetPressureFromTemp(n)
Range("J" & a) = GetPressureFromTemp(n)
a = a + 1
Loop
End Sub
I'm not sure how to loop through each ref ID, however I have the whole list containing the refrigerant id

Scrape economic data from investing website

I am working on a code to get data from : https://www.investing.com/economic-calendar/core-durable-goods-orders-59
I have got the code for getting this via httprequest: but looking to change this to work for the economic data (link above) is there any way I can get the same for the economic indicators??
code below:
Option Explicit
Sub Export_Table()
'Html Objects---------------------------------------'
Dim htmlDoc As MSHTML.HTMLDocument
Dim htmlBody As MSHTML.htmlBody
Dim ieTable As MSHTML.HTMLTable
Dim Element As MSHTML.HTMLElementCollection
'Workbooks, Worksheets, Ranges, LastRow, Incrementers ----------------'
Dim wb As Workbook
Dim Table As Worksheet
Dim i As Long
Set wb = ThisWorkbook
Set Table = wb.Worksheets("Sheet1")
'-------------------------------------------'
Dim xmlHttpRequest As New MSXML2.XMLHTTP60 '
'-------------------------------------------'
i = 2
'Web Request --------------------------------------------------------------------------'
With xmlHttpRequest
.Open "POST", "https://www.investing.com/instruments/HistoricalDataAjax", False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.setRequestHeader "X-Requested-With", "XMLHttpRequest"
.send "curr_id=951681&smlID=1695217&header=CLNX+Historical+Data&st_date=01%2F01%2F2017&end_date=03%2F01%2F2019&interval_sec=Monthly&sort_col=date&sort_ord=DESC&action=historical_data"
If .Status = 200 Then
Set htmlDoc = CreateHTMLDoc
Set htmlBody = htmlDoc.body
htmlBody.innerHTML = xmlHttpRequest.responseText
Set ieTable = htmlDoc.getElementById("curr_table")
For Each Element In ieTable.getElementsByTagName("tr")
Table.Cells(i, 1) = Element.Children(0).innerText
Table.Cells(i, 2) = Element.Children(1).innerText
Table.Cells(i, 3) = Element.Children(2).innerText
Table.Cells(i, 4) = Element.Children(3).innerText
Table.Cells(i, 5) = Element.Children(4).innerText
Table.Cells(i, 6) = Element.Children(5).innerText
Table.Cells(i, 7) = Element.Children(6).innerText
i = i + 1
DoEvents: Next Element
End If
End With
Set xmlHttpRequest = Nothing
Set htmlDoc = Nothing
Set htmlBody = Nothing
Set ieTable = Nothing
Set Element = Nothing
End Sub
Public Function CreateHTMLDoc() As MSHTML.HTMLDocument
Set CreateHTMLDoc = CreateObject("htmlfile")
End Function
I have used the excel tool Power Query for this very thing. It is also called Get & Transform Data. I don't like using vba for doing this type of thing.
To make it work:
In Excel Go to Data>Get Data>From Other Sources>From Web.
Enter the URL
Wait for the webpage to load and then pick the table that you want.
This website took awhile to load, but it did work for me.
Choose "Load" which goes directly to the sheet, or "Transform Data" to manipulate the data in Power Query. There are many options in power query such as split columns, filter data, Calculate Columns and ...
I would avoid the overhead of setting up a permanent connection and simply continue using XHR. With the data > from web, you cannot grab more rows than are present on the initial landing. If however you go with XHR, you can issue POST requests to get more data. The code below utilizes a loop to retrieve additional results beyond the immediate visible on the page.
When you press the Show more link there is a POST request for an additional 6 rows which uses the latest date from the current set of results as part of the POST body. The response returned is JSON. Rather than bringing in a JSON parser, given the standard nature of the JSON, and that I am already using regex to clean the date format in column 1 to put in the POST body, I use two simple regexes to extract the html for the next results table from the response, and to check whether there are more results.
The format of the JSON is:
{
"historyRows": "<tr>…..</tr>",
"hasMoreHistory": "1"
}
Or
{
"historyRows": "<tr>…..</tr>",
"hasMoreHistory": false
}
So, I do some cleaning of the extracted html in order to not confuse the html parser within MSHTML. Furthermore, I add in an id to identify the table I have constructed, so I can continue to use an id css selector (#) list within my UpdateDateResults function.
I initially oversize an array to store each retrieved table which I update ByRef. I loop requesting more results until either there are no more results, there is an error parsing the maximum date from the last retrieved table column 1, or until my specified earliest date for data retrieval falls within the date range of the latest returned table.
Finally, I write the results array out to the sheet in one go.
N.B. You can target the table by its id. It looks like the number at the end of the id could be the same as for the goods url, lending itself to generalizing the code below to work for other goods.
VBA:
Option Explicit
Public Sub GetInvestingInfo()
'tools > references > Microsoft HTML Object Library
Dim html As MSHTML.HTMLDocument, xhr As Object
Const STARTDATE As Date = "2019-11-25" 'Adjust as required. DateAdd("yyyy", -2, Date) 2 years back. This means may have some earlier months in _
batch that spans the start date but won't issue an additional request after this
Set xhr = CreateObject("MSXML2.XMLHTTP")
Set html = New MSHTML.HTMLDocument
With xhr
.Open "GET", "https://www.investing.com/economic-calendar/core-durable-goods-orders-59", False
.setRequestHeader "User-Agent", "Safari/537.36"
.send
html.body.innerHTML = .responseText
End With
Dim firstTable As Boolean, r As Long, results() As Variant
ReDim results(1 To 100000, 1 To 5)
'process initial table and update results, get cleaned date needed for request for more results
firstTable = True
Dim latestDate As String
UpdateDateResults latestDate, results, firstTable, r, html
Dim re As Object, maxDate As String, hasMoreHistory As Boolean, s As String
Set re = CreateObject("VBScript.RegExp")
With re
.Global = True
.MultiLine = False
End With
maxDate = cleanedDate(latestDate, re)
hasMoreHistory = True
Dim errorDate As Date
errorDate = DateAdd("d", 1, Date)
Do While maxDate >= STARTDATE And maxDate < errorDate 'break loop using pre-defined earliest date, error with date conversion, or when no more rows found
Application.Wait (Now + TimeSerial(0, 0, 1)) 'Pause
s = GetMoreRows(xhr, Format$(maxDate, "YYYY-MM-DD")) 'max a POST request for more data
re.Pattern = "hasMoreHistory"":(""?.*?""?)}" 'Check if there are more rows still available. "1" for yes, false for no
hasMoreHistory = (re.Execute(s)(0).submatches(0) <> False)
If Not hasMoreHistory Then Exit Do
re.Pattern = "historyRows"":""(.*)"","
html.body.innerHTML = "<table id=""me"">" & Replace$(re.Execute(s)(0).submatches(0), "\/", "/") & "</table>" 'fix html and feed into html variable
UpdateDateResults latestDate, results, firstTable, r, html
maxDate = cleanedDate(latestDate, re) 'convert value retrieved from last row in date column of table to an actual date
Loop
With ActiveSheet
.Cells.ClearContents
.Cells(1, 1).Resize(r, 5) = results 'Don't bother to resize results as clear all cells before write ou
End With
End Sub
Public Sub UpdateDateResults(ByRef latestDate As String, ByRef results() As Variant, ByRef firstTable As Boolean, ByRef r As Long, ByVal html As MSHTML.HTMLDocument)
Dim table As MSHTML.HTMLTable 'return latest date from function
Set table = html.querySelector("#eventHistoryTable59, #me")
latestDate = table.Rows(table.Rows.Length - 1).Children(0).innerText
Dim i As Long, n As Long, j As Long
n = IIf(firstTable, 0, 1)
For i = n To table.Rows.Length - 1
r = r + 1
For j = 0 To table.Rows(i).Children.Length - 2
results(r, j + 1) = table.Rows(i).Children(j).innerText
Next
Next
firstTable = False
End Sub
Public Function cleanedDate(ByVal dirtyString As String, ByVal re As Object) As Date
re.Pattern = "(^[A-Z][a-z]{2}).*(\d{2}),.(\d{4})(.*)"
On Error GoTo errhand:
If re.test(dirtyString) Then
cleanedDate = CDate(re.Replace(dirtyString, "$2" & Chr$(32) & "$1" & Chr$(32) & "$3"))
Exit Function
End If
errhand:
cleanedDate = DateAdd("d", 1, Date)
End Function
Public Function GetMoreRows(ByVal xhr As Object, ByVal dateStamp As String) As String
With xhr
.Open "POST", "https://www.investing.com/economic-calendar/more-history", False
.setRequestHeader "User-Agent", "Safari/537.36"
.setRequestHeader "x-requested-with", "XMLHttpRequest"
.setRequestHeader "content-type", "application/x-www-form-urlencoded"
.send "eventID=430865&event_attr_ID=59&event_timestamp=" & dateStamp & "+" & Application.WorksheetFunction.EncodeURL("12:30:00") & "&is_speech=0"
GetMoreRows = .responseText
End With
End Function
Regexes (without the double " escaping for VBA):
hasMoreHistory":("?.*?"?)}
historyRows":"(.*)",

Converting weird characters and symbols into normal language in excel

I am using the VBA code to extract information from a website into excel cells, and the numerical information is fine but I have a problem with text strings. I am mostly extracting information from Georgian websites, and the texts with the Georgian language are not properly displayed in excel, so I was wondering if there is any chance (code or something else) I could convert these symbols into proper language.
Sub GetData()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
Dim address As Variant
Dim x As Integer
Dim y As Range
x = 1
Do Until x = 9
Set y = Worksheets(1).Range("A21:A200"). _
Find(x, LookIn:=xlValues, lookat:=xlWhole)
website = "https://www.myhome.ge/ka/pr/11247371/iyideba-Zveli-ashenebuli-bina-veraze-T.-WoveliZis-qucha"
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there.
request.Open "GET", website, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object.
html.body.innerHTML = response
' Get info from the specified element on the page.
address = html.getElementsByClassName("address").Item(0).innerText
price = html.getElementsByClassName("d-block convertable").Item(0).innerText
y.Offset(0, 1).Value = address
y.Offset(0, 5).Value = price
x = x + 1
Loop
End Sub
This is the code that I took from a youtube video (https://www.youtube.com/watch?v=IOzHacoP-u4) and slightly modified, and it works, I just have a problem with how excel displays the characters in text strings.
For your issue in the question
Remove this line response = StrConv(request.responseBody, vbUnicode) as it's not required.
Change html.body.innerHTML = response to html.body.innerHTML = request.responseText.
For your issue in comment
To retrieve the ID of the property, it can be retrieved from the class id-container, you will need to perform some string processing though to remove the extract :
propertyID = Trim$(Replace(html.getElementsByClassName("id-container")(0).innerText, ":", vbNullString))
Note: You should try to avoid declaring variable as Variant. innerText property returns a String datatype so you should declare address and price as String.

URL Issue retrieving data quotes in Yahoo finance

The URL from Yahoo is not working when I try to retrieve quotes from a particular stock. There are several discussion about it, However, it seems nothing is shown regarding VBA macro
Sub Get_Data()
Dim URL As String
Dim Ticker As String
Dim http As New WinHttpRequest
Dim sCotes As String
Dim Lignes
Dim Valeurs
Dim i As Long
Dim j As Long
Dim sLigne As String
Dim sValeur As String
Ticker = Range("Ticker")
URL = "https://query1.finance.yahoo.com/v7/finance/download/TECK?period1=1540456339&period2=1571992339&interval=1d&events=history&crumb=kjOZLFv6ch2"
http.Send
sCotes = http.ResponseText
MsgBox sCotes
Lignes = Split(sCotes, Chr(10))
For i = 1 To UBound(Lignes) 'until the end of the Lignes variable
sLigne = Lignes(i)
Valeurs = Split(sLigne, ",")
For j = 0 To UBound(Valeurs) - 1
Select Case j
Case 0
sValeur = DateSerial(CLng(Left(Valeurs(0), 4)), CLng(Mid(Valeurs(0), 6, 2)), CLng(Right(Valeurs(0), 2)))
Case 5
sValeur = CLng(Valeurs(5))
Case Else
sValeur = CDbl(Replace(Valeurs(j), ".", ","))
End Select
Range("A1").Offset(i, j) = sValeur
Application.StatusBar = Format(Cells(i, 1), "Short Date")
Next
Next
Application.StatusBar = False
End Sub
Execution error at the step Http.send : "This method cannot be called until the Open method has been called"
You would need to use the "open" method before attempting to send and GET is perfectly fine. However, a few things....
There is an easier way. The headers worth adding are the User-Agent and one to mitigate being served cached results. The following shows you how to get a json response from the server for a specified time period and write to Excel. Note: You need to concatenate the ticker into the url. You should probably also test the response code from server to ensure successful.
I use jsonconverter.bas as the json parser to handle response. Download raw code from here and add to standard module called JsonConverter . You then need to go VBE > Tools > References > Add reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.
The values for startDate and endDate need to be passed as unix timestamps. #TimWilliams wrote a nice function, toUnix, for converting Date to Unix here which I use. I have added my own function to manage the conversion in the opposite direction.
This method avoids using any session based identifiers so avoids your issue with invalid cookie crumb.
VBA:
Option Explicit
Public Sub GetYahooHistoricData()
Dim ticker As String, ws As Worksheet, url As String, s As String
Dim startDate As Long, endDate As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
ticker = ws.Range("ticker") 'Range A1. Above write out range
endDate = toUnix("2019-10-27")
startDate = toUnix("2018-10-25")
url = "https://query1.finance.yahoo.com/v8/finance/chart/" & ticker & "?region=US&lang=en-US&includePrePost=false&interval=1d&period1=" & startDate & "&period2=" & endDate & "&corsDomain=finance.yahoo.com&.tsrc=finance"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", url, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
s = .responseText
End With
Dim json As Object
Set json = JsonConverter.ParseJson(s)("chart")("result")
Dim dates As Object, results(), rows As Object, adjClose As Object, r As Long, headers()
headers = Array("date", "close", "volume", "open", "high", "low", "adjclose")
Set dates = json(1)("timestamp")
ReDim results(1 To dates.Count, 1 To UBound(headers) + 1)
Set rows = json(1)("indicators")("quote")(1)
Set adjClose = json(1)("indicators")("adjclose")(1)("adjclose")
For r = 1 To dates.Count
results(r, 1) = GetDate(dates(r))
results(r, 2) = rows("close")(r)
results(r, 3) = rows("volume")(r)
results(r, 4) = rows("open")(r)
results(r, 5) = rows("high")(r)
results(r, 6) = rows("low")(r)
results(r, 7) = adjClose(r)
Next
With ws
.Cells(3, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(4, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
Public Function GetDate(ByVal t As Variant) As String
GetDate = Format$(t / 86400 + DateValue("1970-01-01"), "yyyy-mm-dd")
End Function
Public Function toUnix(ByVal dt As Variant) As Long
toUnix = DateDiff("s", "1/1/1970", dt)
End Function
Example top 10 rows:
Try replacing this code
URL = "https://query1.finance.yahoo.com/v7/finance/download/TECK?period1=1540456339&period2=1571992339&interval=1d&events=history&crumb=kjOZLFv6ch2"
http.Send
with this code:
set http = Server.Createobject("MSXML2.ServerXMLHTTP.6.0")
URL = "https://query1.finance.yahoo.com/v7/finance/download/TECK?period1=1540456339&period2=1571992339&interval=1d&events=history&crumb=kjOZLFv6ch2"
http.open "POST", URL, False
http.Send
The error is pretty clear: you need to call the open method before the Send method. Also this would be a POST request. You also may need to put these two lines after the open method:
http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
http.setRequestHeader "Content-Length", 0
The question is about 99% duplicate as the one from here - How can I send an HTTP POST request to a server from Excel using VBA?. Anyway, the mistake is obviously, because the .Send() method simply sends a completely empty Dim http As New WinHttpRequest object.
To make the code work, copy the example from the duplcated question and print the http.ResponseText:
Sub TestMe()
Dim http As Object
Dim url As String
Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
url = "https://query1.finance.yahoo.com/v7/finance/download/TECK?period1=1540456339&period2=1571992339&interval=1d&events=history&crumb=kjOZLFv6ch2"
http.Open "POST", url, False
http.Send
MsgBox http.responsetext
End Sub

how to get inner text of html under id?

I am trying to pull data pull inner text under id in excel cell.
This is for XML code.
Sub getelementbyid()
Dim XMLpage As New MSXML2.XMLHTTP60
Dim hdoc As New MSHTML.HTMLDocument
Dim HBEs As MSHTML.IHTMLElementCollection
Dim HBE As MSHTML.IHTMLElement
Dim ha As String
XMLpage.Open "GET","https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=HAL", False
XMLpage.send
hdoc.body.innerHTML = XMLpage.responseText
ha = hdoc.getelementbyid("open").innerText
Range("K11").Value = ha
Debug.Print ha
End Sub
I expect output value, but it shows --.
Examine the response text. There is a difference in the way the page is rendered in the browser versus what is returned in the ResponseText.
I put the URL into a browser went into dev tools (F12), found the element, and noted the numeric value inside the HTML element.
Then I dumped the response text we're getting in VBA into a cell and copied the entire cell value into Notepad++. If you do that you'll see the initial value inside the #open element is indeed "--".
The real value appears to be getting written into the HTML via JavaScript, which is common practice. There is a JSON object at the top of the page, presumably injected into the document from the back-end of the website upon your request.
So you have to parse the JSON, not the HTML. I've provided code doing just that. Now, there may be a better way to do it, I feel this code is kind of "hacky" but it's getting the job done for your example URL.
Sub getelementbyid()
Dim XMLpage As New MSXML2.XMLHTTP60
Dim hdoc As New MSHTML.HTMLDocument
Dim HBEs As MSHTML.IHTMLElementCollection
Dim HBE As MSHTML.IHTMLElement
Dim ha As String
XMLpage.Open "GET", "https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=HAL", False
XMLpage.send
'// sample: ,"open":"681.05",
Dim token As String
token = """open"":"""
Dim startPosition As Integer
startPosition = InStr(1, XMLpage.responseText, token)
Dim endPosition As Integer
endPosition = InStr(startPosition, XMLpage.responseText, ",")
Dim prop As String
prop = Mid(XMLpage.responseText, startPosition, endPosition - startPosition)
prop = Replace(prop, """", vbNullString)
prop = Replace(prop, "open:", vbNullString)
Dim val As Double
val = CDbl(prop)
ha = val
Range("K11").Value = ha
Debug.Print ha
End Sub
Here are two methods. 1) Using regex on the return text. Usually frowned upon but perfectly serviceable here. 2) Traditional extract json string and use json parser to parse out value.
The data you want is stored in a json string found both on the webpage and the xmlhtttp response, under the same element:
This means you can treat the html as a string and target just the pattern for the open price using regex as shown below, or parse the xmlhttp request into an html parser, grab the required element, extract its innerText and trim off the whitespace, then pass to a json parser to extract the open price.
In both methods you want to avoid being served cached results so the following header is an important addition to attempt to mitigate for this:
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
There is no need for addtional cell formatting. Full value comes out for both your tickers.
Regex:
It is present in a json string in the response. You can regex it out easily from return text.
Regex explanation:
VBA:
Option Explicit
Public Sub GetClosePrice()
Dim ws As Worksheet, re As Object, p As String, r As String
Set ws = ThisWorkbook.Worksheets("Sheet1")
p = """open"":""(.*?)"""
Set re = CreateObject("VBScript.RegExp")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=HAL", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
If .Status = 200 Then
r = GetValue(re, .responseText, p)
Else
r = "Failed connection"
End If
End With
ws.Range("K11").Value = r
End Sub
Public Function GetValue(ByVal re As Object, ByVal inputString As String, ByVal pattern As String) As String
With re
.Global = True
.pattern = pattern
If .test(inputString) Then ' returns True if the regex pattern can be matched agaist the provided string
GetValue = .Execute(inputString)(0).submatches(0)
Else
GetValue = "Not found"
End If
End With
End Function
HTML and json parser:
This requires installing code for jsonparser from jsonconverter.bas in a standard module called JsonConverter and then going VBE>Tools>References>Add a reference to Microsoft Scripting Runtime and Microsoft HTML Object Library.
VBA:
Option Explicit
Public Sub GetClosePrice()
Dim ws As Worksheet, re As Object, r As String, json As Object
Set ws = ThisWorkbook.Worksheets("Sheet1")
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=MRF", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
If .Status = 200 Then
Dim html As HTMLDocument
Set html = New HTMLDocument
html.body.innerHTML = .responseText
Set json = JsonConverter.ParseJson(Trim$(html.querySelector("#responseDiv").innerText))
r = json("data")(1)("open")
Else
r = "Failed connection"
End If
End With
ws.Range("K11").Value = r
End Sub

Resources