Web scrape innertext vba - excel

I have been trying to scrape a value for a long time, but I've run into a problem.
I tried reading the value with getAttribute("value") and with getElementById/getElementsByName/getElementsByClassName, but nothing works.
I need help scraping the inner text '0606' from the following HTML:
<td width="100%" nowrap="" colspan="3">
<input name="pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityQualifier"
type="HIDDEN" value="CPR">CPR-nr:
<input name="pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue"
type="HIDDEN" value="0606">0606</td>
My code for now is:
Dim CPR As String
CPR = IE.Document.getElementById("pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue").innerText
Range("A2").Value = CPR
I also tried this, but this returns the very first input way above my wanted input, and no matter which value I change (1) to, it errors with 91:
CPR= Trim(Doc.getElementsByTagName("input")(1).getAttribute("value"))
Range("A2").Value = CPR
Can anybody help me?
Any suggestions for code would help me immensely.

Try getting the element by its name property, as below. getElementsByName returns a collection, so index into it and read the value:
Debug.Print IE.document.getElementsByName("pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue")(0).Value

Try selecting the element and then parse the OuterHTML to retrieve the value.
Dim s As String
s = IE.document.querySelector("input[name=pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue]").outerHTML
Debug.Print Split(Split(s, "value=" & Chr$(34))(1), Chr$(34))(0)
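The parsing step above raises a "subscript out of range" error if the outerHTML string does not contain value=" at all. A slightly more defensive variant is sketched below. The helper name AttrFromOuterHtml is made up for illustration, and the sketch assumes the attribute value is wrapped in double quotes.

```vba
' Hypothetical helper (not part of the original answer).
' Returns the text between attrName=" and the next double quote,
' or an empty string when the attribute or its closing quote is missing.
' Limitation: it matches the first occurrence of attrName=" anywhere in the
' string, so an attribute such as data-value="..." would also match "value".
Function AttrFromOuterHtml(ByVal html As String, ByVal attrName As String) As String
    Dim marker As String
    Dim startPos As Long
    Dim endPos As Long

    marker = attrName & "=" & Chr$(34)
    startPos = InStr(1, html, marker, vbTextCompare)
    If startPos = 0 Then Exit Function      ' attribute not found

    startPos = startPos + Len(marker)
    endPos = InStr(startPos, html, Chr$(34))
    If endPos = 0 Then Exit Function        ' no closing quote

    AttrFromOuterHtml = Mid$(html, startPos, endPos - startPos)
End Function
```

With the string s from the snippet above, the call would be Debug.Print AttrFromOuterHtml(s, "value").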

Related

VBA how to fill in search bar with ID?

Web page has the following HTML without an ID - how do I set the value of the form?
<input name="loanxFindBorrower" onkeypress="if((window.event&&window.event.keyCode==13) ||
(event.which&&event.which==13 )){ findByBorrowerAction()}" type="text" size="25" value="">
I've tried
IE.Document.getElementByTagName("loanxFindBorrower").Value = "New Value"
As well as
IE.Document.getElementByTagName("loanxFindBorrower").Focus
IE.Document.getElementByTagName("loanxFindBorrower").Value = "NewValue"
Any help would be much appreciated.
You should use getElementsByName() to get the element. It returns a collection, so you'll need to specify the index of the element you want.
For example, if it's the first element with name "loanxFindBorrower" in the page, the index is 0:
IE.Document.getElementsByName("loanxFindBorrower")(0).Value = "New Value"
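Indexing with (0) fails with a run-time error when no element on the page has that name, for example if the page has not finished loading. The sketch below checks the collection's Length first. SetValueByName is a hypothetical helper name, and doc is assumed to be an already loaded document such as IE.Document.

```vba
' Hypothetical helper (not part of the original answer).
' Sets the value of the first element with the given name.
' Returns False, without raising an error, when nothing matches.
Function SetValueByName(ByVal doc As Object, ByVal elementName As String, _
                        ByVal newValue As String) As Boolean
    Dim matches As Object

    Set matches = doc.getElementsByName(elementName)
    If matches.Length = 0 Then Exit Function    ' nothing matched

    matches.Item(0).Value = newValue
    SetValueByName = True
End Function

' Example call, assuming IE is the InternetExplorer object from the question.
Sub FillBorrower()
    If Not SetValueByName(IE.Document, "loanxFindBorrower", "New Value") Then
        Debug.Print "No element named loanxFindBorrower was found"
    End If
End Sub
```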

Excel VBA: Working with iFrame via IE Automation

I have a project that I am working on where I am trying to automate a site's behavior via Excel's VBA. So far, I know how to initialize a web browser from VBA, navigate to a website, and perform a simple task such as clicking on an item using the getElementById function and click method. However, I wanted to know how can I go about working with an embedded object(s) that is inside of an iframe.
For example, here is an overview of what the tree structure looks like via HTML source code. Of course, there are a lot more tags, but at least you can get an idea of what it is that I am trying to do.
<html>
<body>
<div>
<iframe class="abs0 hw100" scrolling="no" allowtransparency="true" id="Ifrm1568521068980" src="xxxxx" title="mailboxes - Microsoft Exchange" ldhdlr="1" cf="t" frameborder="0"> <<< ----- This is where I am stuck
<div>
<tbody>
<tr>
etc.....
etc.....
etc.....
</tr>
<tbody>
<div>
----------- close tags
I guess the biggest problem for me is to learn how to manipulate an embedded object(s) that is enclosed inside of an iframe because all of this is still new to me and I am not an expert in writing programs in VBA. Any help or guidance in the right direction will help me out a lot. Also, if you need more information or clarification, please let me know.
Here is the code that I have written so far:
Option Explicit
'Added: Microsoft HTML Object Library
'Added: Microsoft Internet Controls
'Added: Global Variable
Dim URL As String
Dim iE As InternetExplorer
Sub Main()
'Declaring local variables
Dim objCollection As Object
Dim objElement As Object
Dim counterClock As Long
Dim iframeDoc As MSHTML.HTMLDocument 'this variable is from stackoverflow (https://stackoverflow.com/questions/44902558/accessing-object-in-iframe-using-vba)
'Predefining URL
URL = ""
'Created InternetExplorer Object
Set iE = CreateObject("InternetExplorer.Application")
'Launching InternetExplorer and navigating to the URL.
With iE
.Visible = True
.Navigate URL
'Waiting for the site to load.
loadingSite
End With
'Navigating to the page I need help with that contains the iFrame structure.
iE.Document.getElementById("Menu_UsersGroups").Click
'Waiting for the site to load.
loadingSite
'Set iframeDoc = iE.frames("iframename").Document '<<-- this is where the error message happens: 438 - object doesn't support this property or method.
'The iFrame of the page does not have a name. Instead "Ifrm1" is the ID of the iFrame.
End Sub
'Created a function that will be used frequently.
Function loadingSite()
Application.StatusBar = URL & " is loading. Please wait..."
Do While iE.Busy = True Or iE.ReadyState <> 4: Debug.Print "loading": Loop
Application.StatusBar = URL & " Loaded."
End Function
Please note: My knowledge of programming in VBA is on an entry-level. So, please bear with me if I don't understand your answer the first time around. Plus, any nifty documentation or videos about this topic will help me a lot as well. Either way, I'm very determined to learn this language as it is becoming very fun and interesting to me especially when I can get a program to do exactly what it was designed to do. :)
You can try the following code to get elements from the iframe:
IE.Document.getElementsbyTagName("iframe")(0).contentDocument.getElementbyId("txtcontentinput").Value = "BBB"
IE.Document.getElementsbyTagName("iframe")(0).contentDocument.getElementbyId("btncontentSayHello").Click
Sample code as below:
index page:
<input id="txtinput" type="text" /><br />
<input id="btnSayHello" type="button" value="Say Hello" onclick="document.getElementById('result').innerText = 'Hello ' + document.getElementById('txtinput').value" /><br />
<div id="result"></div><br />
<iframe width="500px" height="300px" src="vbaiframecontent.html">
</iframe>
vbaiframecontent.html:
<input id="txtcontentinput" type="text" /><br />
<input id="btncontentSayHello" type="button" value="Say Hello" onclick="document.getElementById('content_result').innerText = 'Hello ' + document.getElementById('txtcontentinput').value" /><br />
<div id="content_result"></div>
The VBA script as below:
Sub extractTablesData1()
    'Late-bound Internet Explorer instance
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    With IE
        .Visible = True
        .navigate "<your website url>"
        'Wait for the page to finish loading
        While .Busy Or .ReadyState <> 4
            DoEvents
        Wend
        'Fill the text box on the index page and click its button
        .Document.getElementById("txtinput").Value = "AAA"
        .Document.getElementById("btnSayHello").Click
        'Do the same for the controls inside the first iframe
        .Document.getElementsByTagName("iframe")(0).contentDocument.getElementById("txtcontentinput").Value = "BBB"
        .Document.getElementsByTagName("iframe")(0).contentDocument.getElementById("btncontentSayHello").Click
    End With
    Set IE = Nothing
End Sub
After running the script, the index page shows "Hello AAA" and the page inside the iframe shows "Hello BBB".
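The iframe in the question has an id rather than a name, so it can also be looked up with getElementById instead of by position. The following is only a sketch: GetFrameDocument is a hypothetical helper name, and it assumes the iframe content comes from the same origin as the parent page, since Internet Explorer otherwise refuses access to contentDocument.

```vba
' Hypothetical helper (not part of the original answer).
' Returns the document inside the iframe with the given id,
' or Nothing when the parent page has no element with that id.
Function GetFrameDocument(ByVal doc As Object, ByVal frameId As String) As Object
    Dim frame As Object

    Set frame = doc.getElementById(frameId)
    If frame Is Nothing Then Exit Function

    Set GetFrameDocument = frame.contentDocument
End Function

' Example call, assuming iE is the InternetExplorer object from the question
' and the id is the one shown in the question's HTML.
Sub UseFrame()
    Dim frameDoc As Object

    Set frameDoc = GetFrameDocument(iE.Document, "Ifrm1568521068980")
    If frameDoc Is Nothing Then
        Debug.Print "iframe not found"
    Else
        Debug.Print frameDoc.body.innerText
    End If
End Sub
```

If the numeric part of the id changes between page loads, the getElementsByTagName("iframe")(0) lookup shown in the answer may be the more reliable choice.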

Tick Checkbox IE Automation Excel VBA

I'm trying to export a file from a website, but to do that I have to tick a checkbox to select the elements I want to export.
The HTML code for the checkbox I want to tick is:
<input id="MainReport_ctl04_ctl09_divDropDown_ctl00" type="checkbox" name="MainReport$ctl04$ctl09$divDropDown$ctl00" onclick="$get('MainReport_ctl04_ctl09').control.OnSelectAllClick(this);" class="">
I've tried:
With IE.document.getElementsByName("MainReport$ctl04$ctl09$divDropDown$ctl00")
.Item.Click
end with
And
With IE.document.getElementsByName("MainReport$ctl04$ctl09$divDropDown$ctl00")
.Item(0).Checked = True
End With
And
Set obj = IE.document.getElementsByName("MainReport$ctl04$ctl09$divDropDown$ctl00")
obj.FireEvent ("onclick")
But nothing seems to work.
Could someone help me, please? Thank you in advance!

VBA web scrape innertext

I have been trying for quite some time to scrape this inner text.
I want the value 0606 copied to an Excel sheet:
<TABLE class="group">
<td width="100%" nowrap="" colspan="3">
<input name="pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityQualifier"
type="HIDDEN" value="CPR">CPR-nr:
<input name="pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue"
type="HIDDEN" value="0606">0606</td>
I have tried getAttribute, getElementsByClassName, value and innerText, but now I need some fresh eyes on it.
Do any of you have a good idea?
Something like this should work, however without your code I don't know how you're obtaining your HTMLDocument:
Dim oHTMLDocument As Object
Dim ele As Object
Set oHTMLDocument = ... 'No code provided so I'm unsure how you obtained the HTMLDocument
For Each ele in oHTMLDocument.getElementsByTagName("input")
If ele.Name = "pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue" Then
Debug.Print ele.Value 'an input element has no inner text; 0606 is stored in its value attribute
Exit For
End If
Next ele
You can use a CSS selector and avoid the need for a loop.
input[name="pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue"]
VBA
.querySelector is called on the HTML document you obtain once the page has loaded (that step is not shown in your question). With Internet Explorer, for example:
IE.document.querySelector("input[name='pg41_PolicyHolder_FogP_PolicyHolderId_FogP_IdentityValue']").Value
Further info:
HTML DOM querySelector() Method

getelementsbyID inner dt id values

I am extracting data from HTML using VBScript. This is the HTML from which I am trying to extract the data.
<dl id="overview">
<dt id="overview-summary-current-title" class="summary-current" style="display:block">
Current
</dt>
<dd class="summary-current" style="display:block">
<ul class="current">
<li>
Software Engineer
<span class="at">at </span>
<a class="company-profile-public" href="/company/ABC Systems?trk=ppro_cprof">
<span class="org summary">ABC Systems</span></a>
</li>
</ul>
</dd>
In my previous question, I asked about a similar problem. The link is Excel getElementById extract the span class information.
However, in that case, I wanted to extract the information corresponding to the dl id, and it also had a span id. In this case, I need to extract the information corresponding to the dt id.
In my VBScript, I tried something like this:
Dim openedpage as String
openedpage = iedoc1.getElementById("overview").getElementById("overview-summary-current-title").innerHTML
However, I am getting no output.
I want the output as Software Engineer at ABC systems.
Kindly help me out.
The object returned by getElementById() doesn't have a method .getElementById(), so the following line fails:
.getElementById("overview").getElementById("overview-summary-current-title")
If you don't get any output, not even an error message, you probably have On Error Resume Next somewhere in your script. Please don't use that unless you know exactly what you're doing and have sensible error handling code in place.
Also, the element with the ID "overview-summary-current-title" is this:
<dt id="overview-summary-current-title" class="summary-current" style="display:block">
Current
</dt>
So you couldn't possibly extract the text "Software Engineer at ABC systems" from that element.
Try selecting the first <ul> tag from the element with the ID "overview", and then use the innerText property instead of the innerHTML property:
Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate "..."
While ie.Busy : WScript.Sleep 100 : Wend
Set e1 = ie.document.getElementById("overview")
Set e2 = e1.getElementsByTagName("ul")(0)
WScript.Echo e2.innerText

Resources