I have a list of Twitter urls in Column A, for which I am trying to pull some information off, however I am having a lot of trouble. I want to pull off everything in yellow
I am not sure if it is due to having the wrong classes or due to the Twitter Urls NOT opening in excel. If I double click a url in excel and try to open it I get this error message.
The link works fine when I copy and paste them into the browser. I have read some information on the web that states that a HKEY on the PC may need changing LINK. The problem I have the person I am building this for is not pc literate and will struggle, to do any fix.
I have always used the below code for scraping and it has never failed me. When it does pull data off Twitter, I get an error message, see image below columns D + E. I am assuming this is making some contact to Twitter but can not access the page to extract the data. I am NOT using IE as it no longer works with twitter, I am using a MSXML2.ServerXMLHTTP.
This is what i am using to extract the data, it is the same for all the columns, just the class changes and if it is a Span or a child.
''''Element 3 Column D
If doc.getElementsByClassName("css-1dbjc4n")(0) Is Nothing Then
wsSheet.Cells(StartRow + myCounter, 4).Value = "-"
Else
wsSheet.Cells(StartRow + myCounter, 4).Value = doc.getElementsByClassName("css-1dbjc4n")(0).getElementsByTagName("Span")(0).innerText
End If
Public Function NewHTMLDocument(strURL As String) As Object
Dim objHTTP As Object, objHTML As Object, strTemp As String
Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
objHTTP.setOption(2) = 13056
objHTTP.Open "GET", strURL, False
objHTTP.send
If objHTTP.Status = 200 Then
strTemp = objHTTP.responseText
Set objHTML = CreateObject("htmlfile")
objHTML.body.innerHTML = strTemp
Set NewHTMLDocument = objHTML
Else
'There has been an error
End If
End Function
QUESTION
Is the problem due to the urls not opening in excel, or is it because the data is dynamic and it can not be extracted?
Twitter Link 1
Twitter Link 2
As always thanks for having a look and my apologies in advance for NOT adding HTML snippet as it would not let me post, I could not find the error so removed the html, it was stating that a URL had been shortened, but could not find it so removed the whole html snippet in order to post.
UPDATE
I thought this link was in my post, but I must have removed it when I removed the HTML Snippet. I found this on Stackoverflow but could not get it to work form me, nothing would extract Link
Related
I am stuck with this. I am trying to learn how to use xmlhttprequest with VBA. My intention is to access the following url: "https://micuil.net/".
Once there, I can send values to the following fields, as seen in the image:
image1
After pressing the button, the page displays the following information with the data I want to extract:
image2 (return value)
I am able to complete what you see in image 1 by code, but I don't know how to get the result (image 2). Any help please?
Function CuitEstimado2(sDni As Variant, sSexo As String) As String
Set oDoc = New HTMLDocument
Set oHTTP = New XMLHTTP60
sSexo = IIf(sSexo = "f", "Mujeres", "Varones")
sUrl = "https://micuil.net/"
oHTTP.Open "GET", sUrl, False
oHTTP.send
sRespuesta = oHTTP.responseText
oDoc.body.innerHTML = sRespuesta
oDoc.getElementById("dni").Value = sDni
oDoc.getElementsByName("sexo")(0).setAttribute("checked") = True
oDoc.getElementById("btn").parentElement.Click
End Function
Without having done heavy research for the specific website you are using, it may still be possible that the result you are looking for can be found within the response text (assuming I am understanding your dilemma properly).
First, it is recommended to perform a loop through the innerHTML of each element contained in the HTMLDoc. During your loop, use the InStr function to locate the result code as a string. It is a good idea to store each element that contains that result code into a collection for easy access after the loop.
It does get a bit more complicated from here, because the innerHTML of the corresponding elements may differ when pasting them into Notepad vs. trying to utilize in the VBE. However, if you can identify any unique JS (or other language) characters that will consistently indicate the location of the result code for each request, you may be able to use the Mid function to return the desired result into a string variable.
I'am new to vba in excel. I managed to write a code which scrapes data from a given website and stores it in an excel worksheet. The code works almost every time i run it but sometimes i get an error:
Object variable or With block variable not set.
So it is very challenging to find out why. Also if you could help me out speeding the code (maybe not using clipboard to pastspecial the table, but I don't know how to use otherwise...). Also for you to know, once the error is promted if I click end and run the sub again, it runs without any problem. The error is promted (sometimes only, most of the time the sub works fine) in the specified line with this comment: 'This is the line which throws the error. I appreciate any kind of help guys, thank you in advance :).
The sub looks like this:
Sub PaData()
Dim c As Object, D As Object, H As Object, PID$, SD As Date, FC$, cf$
Set c = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
Set D = CreateObject("HTMLFile")
Set H = CreateObject("WinHTTP.WinHTTPRequest.5.1")
FC = "EXA" ' This is used to generate the website url
cf = VBMa ' This is another sub which works fine and i need it to get into the webiste
' Get the page
H.SetAutoLogonPolicy 0
H.SetClientCertificate "CURRENT_USER\MY\" & Environ("USERNAME")
H.Open "GET", "https://confidentialwebsite=" & FC
H.setRequestHeader "Cookie", cf
H.Send
H.waitForResponse
' Put the response into the HTML object
D.body.innerHTML = H.ResponseText
' Copy _only a given Table
c.setText D.getElementByID("giventable").outerHTML 'This is the line which throws the error
c.PutInClipBoard
' Paste into the sheet, remove hyperlinks and unMerge all data
Sheets("Pdata").Cells.Delete
Sheets("Pdata").[A1].PasteSpecial
Sheets("Pdata").Cells.Hyperlinks.Delete
Sheets("Pdata").Cells.UnMerge
'update time
Sheets("SM").Range("B1").Value = Sheets("Pdata").Range("D2").Value + 2 / 24
End Sub
When doing an HTTPRequest to a webserver, you should always verify the return status of this call using .Status (see: this )
An overview of the possible status numbers can be found here: https://httpstatuses.com/ or here: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#1xx_Informational_response
I want to enter data into a web page field.
There are 2 data entry fields on the web page.
I entered data in the first section.
However, I cannot enter data in the other field.
Information you need to review the site :
Site : http://splan.byethost7.com/mesaj_yaz.php?fno=1&kip=yeni
user :kurucu password :a11111
I entered the data in the "BAŞLIK" field.
However I am unable to write data to the field named "İÇERİK"
I want to enter data in this field using an Excel macro. But I can't enter data using the code:
Sub deneme()
Dim URL As String
On Error Resume Next
URL = "http://splan.byethost7.com/mesaj_yaz.php?fno=1&kip=yeni"
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = 1
For i = 1 To Range("A" & Rows.Count).End(3).row
If Cells(i, 1) <> Empty Then
ie.navigate URL
Call bekle
ie.Document.getElementById("mesaj_icerik").Value = "TEST"
ie.document.getElementsByName("mesaj_baslik").Item(0).Value = Cells(i, 1)
'IE.Document.getElementsByClassName("submitButton")(0).Click
Call bekle
End If
Next i
' IE.Quit
Set ie = Nothing
End Sub
Sub bekle()
With ie
Do Until .readyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
End With
Application.Wait (Now + TimeValue("00:00:02"))
End Sub
As I said in my comments, there are several issues with your code, although the overall effort is good.
Firstly, this ie.document.getElementsid("mesaj_baslik") is not a valid method. If what you want is to access a single HTML element with a unique ID, then the method you need to use is ie.Document.getElementById("the element's ID").
Assuming that what I wrote above is what you were trying to achieve, you have to keep in mind that the .getElementById() method, returns only one single element.
So this ie.Document.getElementById("the element's ID").item(0) would give you an error saying:
Object doesn't support this property or method.
Even if all the aforementioned mistakes were corrected, I still don't see any elements with an ID equal to "mesaj_baslik", in the HTML snippet that you have provided. In fact this particular string is nowhere to be found in the HTML.
So even if the method was correct, this ie.Document.getElementById("mesaj_baslik"), would return Nothing.
Secondly, although your usage of the method ie.document.getElementsByName() is correct, there is no element with a Name attribute being equal to "formlar_mesajyaz", in the HTML snippet you have provided.
In fact this string seems to be a Class name rather than anything else. In this case you would have to use this method: ie.document.getElementsByClassName().
Now, from the info you have provided, the best I can do is assume that, what you want to do is enter some text in the textArea element. To do that, you can use the element's ID like so:
ie.Document.getElementById("mesaj_icerik").Value = "TEST"
I copied code to get stock data from hsbc derivatives. (https://www.youtube.com/watch?v=IOzHacoP-u4)
I changed the URL (to hsbc) and that I want to find the value based on the ID, not the class name.
I changed the ID name.
I get
"Run Time Error-91:
Object variable or With block variable not set".
Sub Get_Web_Data()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
' Website to go to.
website = "https://www.hsbc-zertifikate.de/home/details#!/isin:DE000TR8S293"
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False
' Get fresh data.
'request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response
' Get the price from the specified element on the page.
price = html.getElementById("kursdaten20").innerText
' Output the price into a message box.
MsgBox price
End Sub
You are searching for element id kursdaten20 that does not exist on the page.
html.getElementById("kursdaten20") returns Nothing and you are accessing the innerText property with Nothing/Null reference.
When searching for element, you could add a check if the element exists:
'query the document
Set element = html.getElementById("kursdaten20")
If Not element Is Nothing Then
' Get the price from the specified element on the page.
price = element.innerText
' Output the price into a message box.
MsgBox price
Else
' no price
MsgBox "no price"
End If
I'm afraid it's more complicated than what you expected it to be.
I will assume that the info you're after is this:
Geldkurs (1 Stuck)4,01 EUR
Briefkurs (1 Stuck)4,11 EUR
These fields are not static. They are dynamically updated (I guess whenever a transaction is made) by scripts. That's why you will not find their ID's in the source code of the HTML page.
There is however a way to get the info you need by replicating the HTTP request that is being sent to the server whenever these fields are updated.
To find this request and its parameters you need to inspect the network traffic, when you load the page, using your browser's developer tools.
This request returns a (quite poorly structured IMHO) JSON response containing another JSON (!!) which contains the info you want, in HTML format(!!). Here's how the second JSON looks like:
To make things even worse, the names that you can see under state, change with each request you send.
So, firstly you need to parse the json response. Then you need to parse the json within the initial json response to get your hands on the HTML code. Then, using an HTML document object, you can easily get access to the HTML table, containing the desired information.
Here's the way to do it:
Option Explicit
Sub hsbc()
Dim req As New WinHttpRequest
Dim doc As New HTMLDocument
Dim table As HTMLTable
Dim cell As HTMLTableCell
Dim parsedJSON As Object
Dim key As Variant
Dim htmlCode As String
Dim url As String, reqBody As String, resp As String
url = "https://www.hsbc-zertifikate.de/web-htde-tip-zertifikate-main/?components=YW1wZWw6UnRQdWxsQ29tcG9uZW50KCdhbmltQ3NzLGMtaGlnaGxpZ2h0LXVwLGMtaGlnaGxpZ2h0LWRvd24sYy1oaWdobGlnaHQtY2hhbmdlZCcpO3NlYXJjaGhpbnRfbW9iaWxlOlNlYXJjaEhpbnRNb2JpbGVDb21wb25lbnQoJ3VsU2VhcmNoU21hbGwvc2VhcmNoSW5wdXRNb2JpbGUnKTtzZWFyY2hoaW50OlNlYXJjaEhpbnRDb21wb25lbnQoJ3VsU2VhcmNoRnVsbC9zZWFyY2gtaGVhZGVyJyk7aXNpbjpSZXNwb25zaXZlU25hcHNob3RDb21wb25lbnQoJ2ZhbHNlJyk%3D&pagepath=https%3A%2F%2Fwww.hsbc-zertifikate.de%2Fhome%2Fdetails%23!%2Fisin%3ADE000TR8S293&magnoliaSessionId=B22F70D76986AB6BACDF110E4E7A724C.public7a&v-1566551332455"
reqBody = "v-browserDetails=1&theme=hsbc&v-appId=myApp&v-sh=1080&v-sw=1920&v-cw=1920&v-ch=550&v-curdate=1566551332455&v-tzo=-180&v-dstd=60&v-rtzo=-120&v-dston=true&v-vw=50&v-vh=50&v-loc=https%3A%2F%2Fwww.hsbc-zertifikate.de%2Fhome%2Fdetails%23!%2Fisin%3ADE000TR8S293&v-wn=myApp-0.5436432044490654"
With req
.Open "POST", url, False
.setRequestHeader "Content-type", "application/x-www-form-urlencoded"
.send reqBody
resp = .responseText
End With
Set parsedJSON = JsonConverter.ParseJson(resp)
Set parsedJSON = JsonConverter.ParseJson(parsedJSON("uidl"))
For Each key In parsedJSON("state").Keys
If parsedJSON("state")(key)("contentMode") = "HTML" Then
htmlCode = htmlCode & parsedJSON("state")(key)("text")
End If
Next key
doc.body.innerHTML = htmlCode
Set table = doc.getElementsByTagName("table")(0)
Debug.Print table.Rows(2).innerText
Debug.Print table.Rows(3).innerText
End Sub
For demonstration purposes the result will be printed in your immediate window.
You will need to add the following references to your project (VBE>Tools>References):
Microsoft WinHTTP Services version 5.1
Microsoft HTML Objects Library
Microsoft Scripting Runtime
You will also need to add this JSON parser to your project. Follow the installation instructions in the link and you should be set to go.
I am a self-taught, amateur programmer, and I am new to this forum. Please bear with me.
About two years ago, I wrote a simple Excel vba program to login in to a website and grab a customer statement in the form of a .csv file. My program utilizes GET and POST requests. This program worked perfectly (for my needs) until about three weeks ago, when it unfortunately broke on me. The program could not get through the initial GET request. Specifically, it would break on the getReq.send line.
I came across this post:
Login into website using MSXML2.XMLHTTP instead of InternetExplorer.Application with VBA
Here, I learned that you can use "Msxml2.XMLHTTP.6.0" instead of "Msxml2.ServerXMLHTTP.6.0". I modified my code accordingly, eliminating the need to parse cookies after the Get request, and it worked! But I have no idea. Even though I got it to work, I do not feel like I have learned much in the process.
Some information to note:
My original program broke on my work computer (WindowsXP).
Figuring that it may be an XP issue, and in the market for a new machine anyway, I updated to a new computer running Windows7. The program still did not work, though I received a different error message.
I ran my code on a Windows10 computer and it worked fine.
I use identical code to connect to various other websites and it works fine, regardless of what operating system.
So, my specific questions:
Why might the code work with Msxml2.XMLHTTP.6.0 but not Msxml2.ServerXMLHTTP.6.0?
And why might the code have broken in the first place?
Why would the code work on one particular website, but no another?
Any insight would be greatly appreciated. I have attached my code (with login info X'd out).
Sub RCGInquiry()
Dim postReq, getReq, cookies
Dim p0 As Integer, p1 As Integer, temp As String
Dim result As String, respHead As String
Set getReq = CreateObject("Msxml2.ServerXMLHTTP.6.0")
'Set getReq = CreateObject("Msxml2.XMLHTTP.6.0")
' Visit homepage so we can find the cookies
getReq.Open "GET", "https://www.rcginquiry.com/sfs/Entry", False
getReq.send
respHead = getReq.getAllResponseHeaders
Debug.Print respHead
' Need to parse the cookie from Respone Headers
cookies = ""
p0 = 1
Do While InStr(p0, respHead, "Set-Cookie:") > 0
p0 = InStr(p0, respHead, "Set-Cookie:") + 11
p1 = InStr(p0, respHead, Chr(10))
temp = Trim(Mid(respHead, p0, p1 - p0))
cookies = cookies & temp & "; "
Loop
cookies = Left(cookies, Len(cookies) - 2)
' Debug.Print cookies
' Login
Set postReq = CreateObject("Msxml2.ServerXMLHTTP.6.0")
'Set postReq = CreateObject("Msxml2.XMLHTTP.6.0")
postReq.Open "POST", "https://www.rcginquiry.com/sfs/Entry", False
postReq.setRequestHeader "Cookie", cookies
postReq.setRequestHeader "Content-type", "application/x-www-form-urlencoded" 'send appropriate Headers
postReq.send "Usrid=XXXX&Psswd=XXXX" ' send login info
'-------------------------------------------------------------------------------
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim FSO As Object
Dim myFile As Object
Dim path As String
Dim y As Integer
curDate = Format(Date, "mm_dd_yy")
' Download CSV
postReq.Open "POST", "https://www.rcginquiry.com/sfs/Downloads/tmp.csv?filetype=POS&format=MFA20&heading=true&allaccts=true&junk=tmp.csv", False
postReq.setRequestHeader "Cookie", cookies 'must resend cookies so it knows i am logged in
postReq.setRequestHeader "Content-type", "application/x-www-form-urlencoded"
postReq.send "filetype=POS&format=MFA20&heading=true&allaccts=true&junk=temp.csv" 'url query parameters
' Writes responseText to a .csv file
Set FSO = CreateObject("Scripting.FileSystemObject")
Set myFile = FSO.createtextfile("C:\Users\Adam\Desktop\POSITION\" & curDate & ".csv", True)
myFile.write (postReq.responseText)
myFile.Close
Set FSO = Nothing
Set myFile = Nothing
End Sub
This blog post says that ServerXLMHTTP will not work through a proxy on the client, but XMLHTTP will. In Excel VBA I was using ServerXLMHTTP.6.0, and it was failing for some clients inside corporate networks, whereas XMLHTTP worked.
Welcome to VBA and StackOverflow. Your notes are thorough and so the only thing I can suggest is that you check your proxy settings.
https://support.microsoft.com/en-us/help/289481/you-may-need-to-run-the-proxycfg-tool-for-serverxmlhttp-to-work
That link was buried in this link
https://support.microsoft.com/en-us/help/290761/frequently-asked-questions-about-serverxmlhttp
which you were referred to by ComIntern