I used to download data from the "nseindia" website using the attached macro.
The macro does the following:
Takes the inputs (index name, start date and end date) from the "Input" sheet.
Generates the URL from the input data. The dynamically created URL is shown in the second sheet.
In this URL, "NIFTY%20BANK&fromDate=30-09-2017&toDate=31-10-2017" is the part that is generated from the user input.
Downloads the data available at this link, in CSV format, into the "Total Return Index" sheet.
The CSV file is opened in the web browser itself.
Old - visit nseindia site -> go to "product" tab on top -> select "Indices" -> Select "Historic Data" -> Select "View Total Returns" -> Select Index as "Nifty 50" from drop down -> enter start and end date -> click "Get Data" button -> click "Download file in csv format"
old website : https://www1.nseindia.com/products/content/equities/indices/historical_index_data.htm
New - visit niftyindices site -> go to "reports" tab on top -> select "Historical Data" -> select "Total Returns Index Value" from drop down on left top side -> select start date and end date -> press "submit" button -> click on "csv format"
new Website : https://www.niftyindices.com/reports/historical-data
Can someone advise how to adapt the macro to the new website?
You don't seem to have attempted anything.
I am only posting this educational post hoping that it might inspire you to write your own code in the future.
As I said in the comments, the website you're trying to scrape offers a very convenient way to download the data you want: through an HTTP request.
An HTTP request is a structured way to request something from a server. In this case we want to send an index name and two dates to the server and get the corresponding search results.
To find out what this request should look like, inspect the network traffic when the submit button is clicked. You can do that through your browser's developer tools (F12):
If you go through the Headers and the Params of the request, you will see what the URL, the body and the headers should look like. In this particular case, all the parameters are sent in the request's body in JSON format, and most of the headers are not essential to the success of the request.
The body of the request looks like so:
{'name':'NIFTY 50','startDate':'01-Feb-2020','endDate':'29-Feb-2020'}
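As a rough illustration, a body like this could also be assembled by plain string concatenation, without a JSON library. This is only a sketch: the function names BuildPayload and FormatRequestDate are made up for this example, and it assumes the server keeps accepting the single-quoted form shown above. The month abbreviations are hard-coded in English so that the result does not depend on the machine's regional settings.

```vba
Option Explicit

' Hypothetical helper: formats a date as dd-Mmm-yyyy (e.g. 01-Feb-2020)
' using fixed English month abbreviations.
Function FormatRequestDate(ByVal d As Date) As String
    Dim months() As String
    ' Split always returns a zero-based array.
    months = Split("Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec", ",")
    FormatRequestDate = Format$(Day(d), "00") & "-" & months(Month(d) - 1) & "-" & Year(d)
End Function

' Hypothetical helper: builds the request body for one index and a date range.
Function BuildPayload(ByVal indexName As String, ByVal startD As Date, ByVal endD As Date) As String
    BuildPayload = "{'name':'" & indexName & "'," & _
                   "'startDate':'" & FormatRequestDate(startD) & "'," & _
                   "'endDate':'" & FormatRequestDate(endD) & "'}"
End Function
```

Calling BuildPayload("NIFTY 50", DateSerial(2020, 2, 1), DateSerial(2020, 2, 29)) returns the body shown above.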
In this particular case the response's payload is a JSON string inside another JSON string. You can inspect its structure using a tool like this. Here's what the second JSON looks like:
It basically consists of one item per requested date, and each item consists of 7 parameters and their corresponding values.
CODE
Option Explicit
Sub nse()
Dim req As New MSXML2.XMLHTTP60
Dim url As String, defaultPayload As String, requestPayload As String, results() As String
Dim payloadJSON As Object, responseJSON As Object, item As Object
Dim startD As Date, endD As Date
Dim key As Variant
Dim i As Long, j As Long
Dim rng As Range
startD = DateSerial(2020, 2, 1) 'year, month, day: change the date to whichever you want (not affected by regional settings)
endD = DateSerial(2020, 2, 29) 'year, month, day: change the date to whichever you want
url = "https://www.niftyindices.com/Backpage.aspx/getHistoricaldatatabletoString"
defaultPayload = "{'name':'NIFTY 50','startDate':'','endDate':''}"
Set rng = ThisWorkbook.Worksheets("Name of your Worksheet").Range("A2") 'use the name of the worksheet in which you want the results to be printed.
Set payloadJSON = JsonConverter.ParseJson(defaultPayload)
payloadJSON("startDate") = Format$(startD, "dd-mmm-yyyy") '01-Feb-2020 (two-digit day, as in the request body above)
payloadJSON("endDate") = Format$(endD, "dd-mmm-yyyy") '29-Feb-2020
requestPayload = JsonConverter.ConvertToJson(payloadJSON)
With req
.Open "POST", url, False
.setRequestHeader "Content-Type", "application/json; charset=UTF-8"
.setRequestHeader "X-Requested-With", "XMLHttpRequest"
.send requestPayload
Set responseJSON = JsonConverter.ParseJson(.responseText)
End With
Debug.Print responseJSON("d")
Set responseJSON = JsonConverter.ParseJson(responseJSON("d"))
ReDim results(1 To responseJSON.Count, 1 To 7)
i = 0
For Each item In responseJSON
i = i + 1
j = 0
For Each key In item
j = j + 1
results(i, j) = item(key)
Next key
Next item
rng.Resize(UBound(results, 1), UBound(results, 2)) = results
End Sub
The above code for demonstration purposes prints the results starting from cell A2 of an empty excel worksheet. You can modify the code to best fit your needs.
You will need to add the following references to your project (VBE>Tools>References):
Microsoft XML version 6.0
Microsoft Scripting Runtime
You will also need to add this JSON parser to your project. Follow the installation instructions in the link and you should be set to go.
RESULTS
Here's a sample of the results for the period 1/2/2020 to 29/2/2020
I have a list of Twitter URLs in column A from which I am trying to pull some information, but I am having a lot of trouble. I want to pull everything highlighted in yellow.
I am not sure whether this is due to using the wrong classes or to the Twitter URLs not opening from Excel. If I double-click a URL in Excel to open it, I get this error message.
The links work fine when I copy and paste them into the browser. I have read that a registry key (HKEY) on the PC may need changing (LINK). The problem is that the person I am building this for is not PC literate and would struggle to apply any fix.
I have always used the code below for scraping and it has never failed me. When it tries to pull data from Twitter, I get an error message (see image below, columns D + E). I assume it is making some contact with Twitter but cannot access the page to extract the data. I am NOT using IE, as it no longer works with Twitter; I am using MSXML2.ServerXMLHTTP.
This is what I am using to extract the data. It is the same for all the columns; only the class changes, and whether it is a span or a child.
''''Element 3 Column D
If doc.getElementsByClassName("css-1dbjc4n")(0) Is Nothing Then
wsSheet.Cells(StartRow + myCounter, 4).Value = "-"
Else
wsSheet.Cells(StartRow + myCounter, 4).Value = doc.getElementsByClassName("css-1dbjc4n")(0).getElementsByTagName("Span")(0).innerText
End If
Public Function NewHTMLDocument(strURL As String) As Object
Dim objHTTP As Object, objHTML As Object, strTemp As String
Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
objHTTP.setOption(2) = 13056
objHTTP.Open "GET", strURL, False
objHTTP.send
If objHTTP.Status = 200 Then
strTemp = objHTTP.responseText
Set objHTML = CreateObject("htmlfile")
objHTML.body.innerHTML = strTemp
Set NewHTMLDocument = objHTML
Else
'There has been an error
End If
End Function
QUESTION
Is the problem due to the urls not opening in excel, or is it because the data is dynamic and it can not be extracted?
Twitter Link 1
Twitter Link 2
As always, thanks for having a look, and my apologies in advance for NOT adding an HTML snippet. The site would not let me post with it: it reported that a URL had been shortened, but I could not find that URL, so I removed the whole HTML snippet in order to post.
UPDATE
I thought this link was in my post, but I must have removed it when I removed the HTML snippet. I found this on Stack Overflow but could not get it to work for me; nothing would extract. Link
I'm new to VBA in Excel. I managed to write code which scrapes data from a given website and stores it in an Excel worksheet. The code works almost every time I run it, but sometimes I get an error:
Object variable or With block variable not set.
It is very hard to find out why. The error is raised only sometimes (most of the time the sub works fine), on the line marked with this comment: 'This is the line which throws the error. Once the error is raised, if I click End and run the sub again, it runs without any problem. I would also appreciate help with speeding up the code (maybe by not using the clipboard and PasteSpecial for the table, but I don't know how to do it otherwise). I appreciate any kind of help, thank you in advance :).
The sub looks like this:
Sub PaData()
Dim c As Object, D As Object, H As Object, PID$, SD As Date, FC$, cf$
Set c = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
Set D = CreateObject("HTMLFile")
Set H = CreateObject("WinHTTP.WinHTTPRequest.5.1")
FC = "EXA" ' This is used to generate the website url
cf = VBMa ' This is another sub which works fine and i need it to get into the webiste
' Get the page
H.SetAutoLogonPolicy 0
H.SetClientCertificate "CURRENT_USER\MY\" & Environ("USERNAME")
H.Open "GET", "https://confidentialwebsite=" & FC
H.setRequestHeader "Cookie", cf
H.Send
H.waitForResponse
' Put the response into the HTML object
D.body.innerHTML = H.ResponseText
' Copy _only a given Table
c.setText D.getElementByID("giventable").outerHTML 'This is the line which throws the error
c.PutInClipBoard
' Paste into the sheet, remove hyperlinks and unMerge all data
Sheets("Pdata").Cells.Delete
Sheets("Pdata").[A1].PasteSpecial
Sheets("Pdata").Cells.Hyperlinks.Delete
Sheets("Pdata").Cells.UnMerge
'update time
Sheets("SM").Range("B1").Value = Sheets("Pdata").Range("D2").Value + 2 / 24
End Sub
When making an HTTP request to a web server, you should always check the status of the response using .Status (see: this)
An overview of the possible status numbers can be found here: https://httpstatuses.com/ or here: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#1xx_Informational_response
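A minimal sketch of what such a check could look like is given below. The function names GetPage and ExtractElementHtml are hypothetical. The idea is to use the response only when the status is 200, and to test the element for Nothing before touching outerHTML. One plausible cause of the intermittent error is that the server occasionally returns a page that does not contain the table (for example an error or login page), although that is only an assumption here.

```vba
Option Explicit

' Hypothetical wrapper: returns the response text only for HTTP 200,
' otherwise logs the status and returns an empty string.
Function GetPage(ByVal url As String, ByVal cookie As String) As String
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", url, False
    If Len(cookie) > 0 Then http.setRequestHeader "Cookie", cookie
    http.Send
    If http.Status = 200 Then
        GetPage = http.ResponseText
    Else
        Debug.Print "Request failed: " & http.Status & " " & http.StatusText
    End If
End Function

' Hypothetical helper: returns the outer HTML of the element with the given id,
' or an empty string when the element is not present in the markup.
Function ExtractElementHtml(ByVal html As String, ByVal elementId As String) As String
    Dim doc As Object, el As Object
    Set doc = CreateObject("HTMLFile")
    doc.body.innerHTML = html
    Set el = doc.getElementById(elementId)
    If Not el Is Nothing Then ExtractElementHtml = el.outerHTML
End Function
```

In the sub above, the clipboard step would then run only when ExtractElementHtml(GetPage(...), "giventable") returns a non-empty string.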
Hi, I'm having difficulty switching between tabs in Chrome using Selenium from VBA.
I have this website : http://dgftebrc.nic.in:8100/BRCQueryTrade/index.jsp
Where-in i need to input IEC Code : 0906008051
Shipping Bill no :
3929815
3953913
3979509
Then enter the captcha (I can handle this by giving the user 10 seconds).
After all this, I need to click on "Show Details" while holding Ctrl so that it opens in a new tab, copy a specific piece of data from that tab, and then close it.
Then a new shipping bill number is taken from the Excel sheet and the process repeats.
This is the code I have managed so far:
Option Explicit
Public Sub multipletabtest()
Dim bot As WebDriver
Dim keys As New Selenium.keys
Dim count As Long
Set bot = New WebDriver
bot.Start "Chrome"
'count = 1
'While (Len(Range("A" & count)) > 0)
bot.Get "http://dgftebrc.nic.in:8100/BRCQueryTrade/index.jsp"
bot.FindElementByXPath("//input[@type='text'][@name='iec']").SendKeys "0906008051"
bot.FindElementByXPath("//input[@type='text'][@name='sno']").SendKeys "3929815"
bot.Wait 10000 'Time to enter the captcha
bot.FindElementByCss("[value='Show Details']").SendKeys keys.Control, keys.Enter 'Take the value from final result sheet
bot.SwitchToNextWindow
ThisWorkbook.Sheets("Sheet1").Range("B1") = bot.FindElementByXPath("//text()[.='Used']/ancestor::td[1]").Text
'Range("B" & count) = bot.FindElementByXPath("//text()[.='Used']/ancestor::td[1]").Text 'To extract the data
'bot.Window.Close
bot.SwitchToPreviousWindow
bot.FindElementByXPath("//input[@type='text'][@name='sno']").Clear
bot.FindElementByXPath("//input[@type='text'][@name='sno']").SendKeys "3953913"
bot.FindElementByCss("[value='Show Details']").SendKeys keys.Control, keys.Enter
bot.SwitchToNextWindow
ThisWorkbook.Sheets("Sheet1").Range("B2") = bot.FindElementByXPath("//text()[.='Used']/ancestor::td[1]").Text
'Range("B" & count) = bot.FindElementByXPath("//text()[.='Used']/ancestor::td[1]").Text
'count = count + 1
'Wend
bot.Quit
End Sub
Please look into this and help me out.
Thanks .
XMLHTTP request:
I would sidestep this and also avoid the overhead of using a browser.
Make an initial GET XHR request to http://dgftebrc.nic.in:8100/BRCQueryTrade/brcIssuedTrade.jsp and extract the JSESSION cookie (you can probably use .getResponseHeader("Set-Cookie")). Then make a subsequent POST XHR request to the same URL, providing the cookie in the request headers and passing the relevant parameter values in the body.
The param values required are:
data = {
'iec': '0906008051',
'sno': '3929815',
'billid': '',
'brcstat': 'A',
'captext': 'a7m3p',
'B1': 'Show Details'
}
In VBA, the POST body for the .send body would look like:
"iec=0906008051&sno=3929815&billid&brcstat=A&captext=a7m3p&B1=Show Details"
Where iec and sno are dynamic and you would concatenate them into the body of each request, perhaps in a loop:
"iec=" & iec & "&sno=" & sno & "&billid&brcstat=A&captext=" & capText & "&B1=Show Details"
If the captcha changes then you can prompt the user to pass in the value for captext param and pass that in the body as well.
I don't think any additional headers are required, though you might add a User-Agent,
e.g.
.setRequestHeader "User-Agent" , "Mozilla/5.0"
Learn about xmlhttp (XHR) requests here or Google (enter the following in the search bar vba jsession cookie and hit enter).
The response from the POST request will contain the html within which is your desired table(s).
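The GET-then-POST sequence described above could be sketched roughly as follows. This is an untested outline under several assumptions: the function names BuildBrcBody and QueryBrc are made up, WinHttpRequest is used because it exposes the Set-Cookie header, the first name=value pair of that header is assumed to be the session cookie, and the space in "Show Details" is written as "+" as in standard form encoding. How the captcha text is obtained for that session is outside this sketch.

```vba
Option Explicit

' Hypothetical helper: assembles the form-encoded POST body.
' Assumes iec, sno and capText contain only letters and digits.
Function BuildBrcBody(ByVal iec As String, ByVal sno As String, ByVal capText As String) As String
    BuildBrcBody = "iec=" & iec & "&sno=" & sno & "&billid=&brcstat=A" & _
                   "&captext=" & capText & "&B1=Show+Details"
End Function

' Sketch of the two requests; returns the HTML of the result page,
' or an empty string if either request does not return HTTP 200.
Function QueryBrc(ByVal iec As String, ByVal sno As String, ByVal capText As String) As String
    Const PAGE_URL As String = "http://dgftebrc.nic.in:8100/BRCQueryTrade/brcIssuedTrade.jsp"
    Dim http As Object, cookie As String

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", PAGE_URL, False
    http.Send
    If http.Status <> 200 Then Exit Function

    ' GetResponseHeader raises an error when the header is absent,
    ' so the lookup is guarded and cookie simply stays empty in that case.
    On Error Resume Next
    cookie = Split(http.GetResponseHeader("Set-Cookie"), ";")(0)
    On Error GoTo 0

    http.Open "POST", PAGE_URL, False
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.setRequestHeader "User-Agent", "Mozilla/5.0"
    If Len(cookie) > 0 Then http.setRequestHeader "Cookie", cookie
    http.Send BuildBrcBody(iec, sno, capText)
    If http.Status = 200 Then QueryBrc = http.ResponseText
End Function
```

The returned HTML can then be loaded into an HTML document object to read the table, and the call repeated in a loop over the shipping bill numbers.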
Selenium:
If you wish to continue with Selenium, and assuming you have enabled the Show Details button with your prior actions, you can use the following attribute = value selector:
bot.FindElementByCss("[value='Show Details']").click
I copied code to get stock data and adapted it for HSBC derivatives. (https://www.youtube.com/watch?v=IOzHacoP-u4)
I changed the URL (to HSBC) and changed the lookup so that the value is found by ID, not by class name.
I changed the ID name.
I get
"Run-time error 91:
Object variable or With block variable not set".
Sub Get_Web_Data()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
' Website to go to.
website = "https://www.hsbc-zertifikate.de/home/details#!/isin:DE000TR8S293"
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False
' Get fresh data.
'request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response
' Get the price from the specified element on the page.
price = html.getElementById("kursdaten20").innerText
' Output the price into a message box.
MsgBox price
End Sub
You are searching for an element with the id kursdaten20, which does not exist on the page.
html.getElementById("kursdaten20") therefore returns Nothing, and accessing the innerText property of Nothing raises the error.
When searching for element, you could add a check if the element exists:
'query the document
Dim element As Object
Set element = html.getElementById("kursdaten20")
If Not element Is Nothing Then
' Get the price from the specified element on the page.
price = element.innerText
' Output the price into a message box.
MsgBox price
Else
' no price
MsgBox "no price"
End If
I'm afraid it's more complicated than you expected.
I will assume that the info you're after is this:
Geldkurs (1 Stück): 4,01 EUR
Briefkurs (1 Stück): 4,11 EUR
These fields are not static. They are dynamically updated by scripts (I guess whenever a transaction is made). That's why you will not find their IDs in the source code of the HTML page.
There is, however, a way to get the info you need: replicate the HTTP request that is sent to the server whenever these fields are updated.
To find this request and its parameters, inspect the network traffic when you load the page, using your browser's developer tools.
This request returns a (quite poorly structured, IMHO) JSON response containing another JSON (!!), which contains the info you want in HTML format (!!). Here's what the second JSON looks like:
To make things even worse, the names that you can see under state change with each request you send.
So, first you need to parse the JSON response. Then you need to parse the JSON within the initial JSON response to get your hands on the HTML code. Then, using an HTML document object, you can easily access the HTML table containing the desired information.
Here's the way to do it:
Option Explicit
Sub hsbc()
Dim req As New WinHttpRequest
Dim doc As New HTMLDocument
Dim table As HTMLTable
Dim cell As HTMLTableCell
Dim parsedJSON As Object
Dim key As Variant
Dim htmlCode As String
Dim url As String, reqBody As String, resp As String
url = "https://www.hsbc-zertifikate.de/web-htde-tip-zertifikate-main/?components=YW1wZWw6UnRQdWxsQ29tcG9uZW50KCdhbmltQ3NzLGMtaGlnaGxpZ2h0LXVwLGMtaGlnaGxpZ2h0LWRvd24sYy1oaWdobGlnaHQtY2hhbmdlZCcpO3NlYXJjaGhpbnRfbW9iaWxlOlNlYXJjaEhpbnRNb2JpbGVDb21wb25lbnQoJ3VsU2VhcmNoU21hbGwvc2VhcmNoSW5wdXRNb2JpbGUnKTtzZWFyY2hoaW50OlNlYXJjaEhpbnRDb21wb25lbnQoJ3VsU2VhcmNoRnVsbC9zZWFyY2gtaGVhZGVyJyk7aXNpbjpSZXNwb25zaXZlU25hcHNob3RDb21wb25lbnQoJ2ZhbHNlJyk%3D&pagepath=https%3A%2F%2Fwww.hsbc-zertifikate.de%2Fhome%2Fdetails%23!%2Fisin%3ADE000TR8S293&magnoliaSessionId=B22F70D76986AB6BACDF110E4E7A724C.public7a&v-1566551332455"
reqBody = "v-browserDetails=1&theme=hsbc&v-appId=myApp&v-sh=1080&v-sw=1920&v-cw=1920&v-ch=550&v-curdate=1566551332455&v-tzo=-180&v-dstd=60&v-rtzo=-120&v-dston=true&v-vw=50&v-vh=50&v-loc=https%3A%2F%2Fwww.hsbc-zertifikate.de%2Fhome%2Fdetails%23!%2Fisin%3ADE000TR8S293&v-wn=myApp-0.5436432044490654"
With req
.Open "POST", url, False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded"
.send reqBody
resp = .responseText
End With
Set parsedJSON = JsonConverter.ParseJson(resp)
Set parsedJSON = JsonConverter.ParseJson(parsedJSON("uidl"))
For Each key In parsedJSON("state").Keys
If parsedJSON("state")(key)("contentMode") = "HTML" Then
htmlCode = htmlCode & parsedJSON("state")(key)("text")
End If
Next key
doc.body.innerHTML = htmlCode
Set table = doc.getElementsByTagName("table")(0)
Debug.Print table.Rows(2).innerText
Debug.Print table.Rows(3).innerText
End Sub
For demonstration purposes the result will be printed in your immediate window.
You will need to add the following references to your project (VBE>Tools>References):
Microsoft WinHTTP Services version 5.1
Microsoft HTML Objects Library
Microsoft Scripting Runtime
You will also need to add this JSON parser to your project. Follow the installation instructions in the link and you should be set to go.
First of all, thank you for all the help here! It's a nice place to be and to learn a lot!
My problem: I want to return geo coordinates from an address, like here: https://www.myengineeringworld.net/2014/06/geocoding-using-vba-google-api.html
I created an API key and this code works fine. But we have a firewall, and it blocks Excel's communication with the Google Maps server at the "Request.send" point (and I couldn't find an IP range or anything else that could be put on the whitelist). Is there a way to get the XML through the browser instead? If I paste the link from Request.Open "GET" into Internet Explorer, I can see the XML content.
Is there any other way than Request.send?
Thank you guys!
Code:
Function GetCoordinates(Address As String) As String
'-----------------------------------------------------------------------------------------------------
'This function returns the latitude and longitude of a given address using the Google Geocoding API.
'The function uses the "simplest" form of Google Geocoding API (sending only the address parameter),
'so, optional parameters such as bounds, language, region and components are NOT used.
'In case of multiple results (for example two cities sharing the same name), the function
'returns the FIRST OCCURRENCE, so be careful in the input address (tip: use the city name and the
'postal code if they are available).
'NOTE: As Google points out, the use of the Google Geocoding API is subject to a limit of 40,000
'requests per month, so be careful not to exceed this limit. For more info check:
'https://cloud.google.com/maps-platform/pricing/sheet
'In order to use this function you must enable the XML, v3.0 library from VBA editor:
'Go to Tools -> References -> check the Microsoft XML, v3.0.
'If you don't have the v3.0 use any other version of it (e.g. v6.0).
'2018 Update: In order to use this function you will now need a valid API key.
'Check the next link that guides you on how to acquire a free API key:
'https://www.myengineeringworld.net/2018/02/how-to-get-free-google-api-key.html
'2018 Update 2 (July): The EncodeURL function was added to avoid problems with special characters.
'This is a common problem with addresses that are from Greece, Serbia, Germany and other countries.
'Written By: Christos Samaras
'Date: 12/06/2014
'Last Updated: 09/08/2018
'E-mail: xristos.samaras@gmail.com
'Site: https://www.myengineeringworld.net
'-----------------------------------------------------------------------------------------------------
'Declaring the necessary variables.
'The first 2 variables using 30 at the end, corresponding to the "Microsoft XML, v3.0" library
'in VBA (msxml3.dll). If you use any other version of it (e.g. v6.0), then declare these variables
'as XMLHTTP60 and DOMDocument60 respectively.
Dim ApiKey As String
Dim Request As New XMLHTTP30
Dim Results As New DOMDocument30
Dim StatusNode As IXMLDOMNode
Dim LatitudeNode As IXMLDOMNode
Dim LongitudeNode As IXMLDOMNode
'Set your API key in this variable. Check this link for more info:
'https://www.myengineeringworld.net/2018/02/how-to-get-free-google-api-key.html
ApiKey = "Your API Key goes here!"
'Check that an API key has been provided.
If ApiKey = vbNullString Or ApiKey = "Your API Key goes here!" Then
GetCoordinates = "Invalid API Key"
Exit Function
End If
'Generic error handling.
On Error GoTo errorHandler
'Create the request based on Google Geocoding API. Parameters (from Google page):
'- Address: The address that you want to geocode.
'Note: The EncodeURL function was added to allow users from Greece, Poland, Germany, France and other countries
'geocode address from their home countries without a problem. The particular function (EncodeURL),
'returns a URL-encoded string without the special characters.
Request.Open "GET", "https://maps.googleapis.com/maps/api/geocode/xml?" _
& "&address=" & Application.EncodeURL(Address) & "&key=" & ApiKey, False
'Send the request to the Google server.
Request.send
'Read the results from the request.
Results.LoadXML Request.responseText
'Get the status node value.
Set StatusNode = Results.SelectSingleNode("//status")
'Based on the status node result, proceed accordingly.
Select Case UCase(StatusNode.Text)
Case "OK" 'The API request was successful. At least one geocode was returned.
'Get the latitude and longitude node values of the first geocode.
Set LatitudeNode = Results.SelectSingleNode("//result/geometry/location/lat")
Set LongitudeNode = Results.SelectSingleNode("//result/geometry/location/lng")
'Return the coordinates as a string (latitude, longitude).
GetCoordinates = LatitudeNode.Text & ", " & LongitudeNode.Text
Case "ZERO_RESULTS" 'The geocode was successful but returned no results.
GetCoordinates = "The address probably not exists"
Case "OVER_QUERY_LIMIT" 'The requestor has exceeded the limit of 2500 request/day.
GetCoordinates = "Requestor has exceeded the server limit"
Case "REQUEST_DENIED" 'The API did not complete the request.
GetCoordinates = "Server denied the request"
Case "INVALID_REQUEST" 'The API request is empty or is malformed.
GetCoordinates = "Request was empty or malformed"
Case "UNKNOWN_ERROR" 'Indicates that the request could not be processed due to a server error.
GetCoordinates = "Unknown error"
Case Else 'Just in case...
GetCoordinates = "Error"
End Select
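'Label required by the "On Error GoTo errorHandler" statement above.
'Execution also reaches this point after normal completion, so it only releases the objects.
errorHandler:
Set StatusNode = Nothing
Set LatitudeNode = Nothing
Set LongitudeNode = Nothing
End Function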
You can try to detect the proxy settings and add these to the XMLHTTP request.
See this link for a good starting point (too much code to paste here): http://www.tech-archive.net/Archive/VB/microsoft.public.vb.winapi.networks/2004-11/0005.html
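As a rough sketch of the second half of that suggestion (passing proxy settings to the request), the code below routes the call through an explicitly named proxy using WinHttpRequest instead of XMLHTTP. Detecting the proxy automatically is not shown. The function names BuildGeocodeUrl and FetchViaProxy are made up for this example, the proxy address in the usage note is a placeholder, and whether this gets through a particular firewall is an assumption that has to be checked with the network administrator.

```vba
Option Explicit

' Hypothetical helper: builds the geocoding URL used in the question.
' Application.EncodeURL requires Excel 2013 or later.
Function BuildGeocodeUrl(ByVal Address As String, ByVal ApiKey As String) As String
    BuildGeocodeUrl = "https://maps.googleapis.com/maps/api/geocode/xml?" & _
                      "address=" & Application.EncodeURL(Address) & "&key=" & ApiKey
End Function

' Hypothetical helper: sends a GET request through an explicit proxy and
' returns the response text, or an empty string if the status is not 200.
Function FetchViaProxy(ByVal url As String, ByVal proxyServer As String, _
                       Optional ByVal proxyUser As String = "", _
                       Optional ByVal proxyPassword As String = "") As String
    Const HTTPREQUEST_PROXYSETTING_PROXY As Long = 2
    Const HTTPREQUEST_SETCREDENTIALS_FOR_PROXY As Long = 1
    Dim http As Object

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.SetProxy HTTPREQUEST_PROXYSETTING_PROXY, proxyServer
    http.Open "GET", url, False
    ' Credentials can only be set after Open and before Send.
    If Len(proxyUser) > 0 Then
        http.SetCredentials proxyUser, proxyPassword, HTTPREQUEST_SETCREDENTIALS_FOR_PROXY
    End If
    http.Send
    If http.Status = 200 Then FetchViaProxy = http.ResponseText
End Function
```

Inside GetCoordinates, the response could then be loaded with Results.LoadXML FetchViaProxy(BuildGeocodeUrl(Address, ApiKey), "proxy.example.local:8080") in place of the Request.Open / Request.send pair.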