GetAttribute in Selenium VBA for style - excel

I am working with Selenium in VBA, and I have a variable "post" that stores all the occurrences of a specific element, like this:
Dim post As Object
Set post = .FindElementsByCss("#DetailSection1")
Dim i As Long
For i = 1 To post.Count
Debug.Print post.Item(i).getAttribute("style")
Next i
I need to extract the style value from the elements:
<div id="DetailSection1" style="z-index:3;clip:rect(0px,746px,32px,0px);top:228px;left:0px;width:746px;height:32px;">
</div>
I also need to print the innerHTML in the Immediate window, but when I use getAttribute("innerHTML") it doesn't work for me.
Any ideas?

getAttribute("style") should work, but you have to wait until the element is present/visible in the HTML DOM before reading it.
Debug.Print post.Item(i).getAttribute("style")
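More precisely, to extract the individual values of the style attribute from the elements, you can use the getCssValue() method as follows (if your Selenium wrapper for VBA does not recognize that name, look for CssValue() instead, which is the name SeleniumBasic uses):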
Debug.Print post.Item(i).getCssValue("z-index")
Debug.Print post.Item(i).getCssValue("top")
Debug.Print post.Item(i).getCssValue("left")
Debug.Print post.Item(i).getCssValue("width")
Debug.Print post.Item(i).getCssValue("height")
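If you already have the raw style string, you can also split it yourself without going back to the browser. The following is only a minimal sketch: ParseInlineStyle and DemoParseInlineStyle are hypothetical helper names, it assumes Windows (it uses Scripting.Dictionary), and it assumes no property value contains a semicolon.

```vba
Option Explicit

' Split an inline style string such as "top:228px;left:0px;" into a
' Scripting.Dictionary that maps each property name to its value.
' Hypothetical helper, not part of Selenium.
Public Function ParseInlineStyle(ByVal styleText As String) As Object
    Dim result As Object
    Set result = CreateObject("Scripting.Dictionary")

    Dim declarations() As String
    declarations = Split(styleText, ";")

    Dim i As Long
    Dim colonPos As Long
    Dim declaration As String
    For i = LBound(declarations) To UBound(declarations)
        declaration = Trim$(declarations(i))
        ' Only the first colon separates the name from the value
        colonPos = InStr(declaration, ":")
        ' Skip empty pieces (for example after the trailing semicolon)
        If colonPos > 1 Then
            result(LCase$(Trim$(Left$(declaration, colonPos - 1)))) = _
                Trim$(Mid$(declaration, colonPos + 1))
        End If
    Next i

    Set ParseInlineStyle = result
End Function

Public Sub DemoParseInlineStyle()
    Dim styles As Object
    Set styles = ParseInlineStyle("z-index:3;clip:rect(0px,746px,32px,0px);top:228px;left:0px;width:746px;height:32px;")
    Debug.Print styles("top")    ' prints 228px
    Debug.Print styles("clip")   ' prints rect(0px,746px,32px,0px)
End Sub
```

In the loop from the question you would pass the style string read from each element to ParseInlineStyle. Note that this returns the values exactly as written in the attribute, whereas getCssValue returns the computed values.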
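getAttribute("innerHTML")
getAttribute("innerHTML") can be used to read the innerHTML, that is, the markup and text within any node / WebElement. (get_attribute("innerHTML") is the Python spelling; in SeleniumBasic for VBA, elements expose the same call as Attribute("innerHTML").)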
You can find a detailed discussion in Difference between text and innerHTML using Selenium
References
You can find a couple of relevant discussions in:
How to get child property value of a element property using selenium webdriver, NUnit and C#
How can I verify text is bold using selenium on an angular website with C#

Web scraping DEEPL.com using VBA Excel and Selenium

I'm trying to write a function that translates sentences in Excel using DEEPL.com.
My approach is to use Selenium to scrape the site with Chrome (as Internet Explorer is not supported by the site).
Public Function deepL(txt As String, inputLang As String, outputLang As String)
Dim url As String
Dim driver As New WebDriver
url = "https://www.deepl.com/translator#" & inputLang & "/" & outputLang & "/" & txt
driver.Start "Chrome"
driver.Timeouts.ImplicitWait = 5000
driver.Get url
deepL = driver.FindElementById("target-dummydiv").Text
driver.Close
End Function
----
Sub translating()
'test for word "probando" from "es" to "en"
'url: https://www.deepl.com/translator#es/en/probando
'it should return: "testing"
MsgBox (deepL("probando", "es", "en"))
End Sub
The problem occurs while the page is loading: the div containing the translation is empty on load, so reading its text right after the Get call returns an empty string.
But after about 1 second, the page refreshes with the correct result:
<div id="target-dummydiv" aria-hidden="true" class="lmt__textarea lmt__textarea_dummydiv" lang="en-US">testing</div>
I tried adding an implicit wait of 5 seconds to give the page time to load, but the result is the same.
What am I doing wrong?
EDIT: I found that the div with the translation has visibility: hidden. If I make it visible, the results are correct, but I don't know how to do that in my code.
OK, I found a solution:
Select the textarea where the translation is located and read the translation with .Attribute("value") instead of .Text (.Text returns only visible text, which is why the hidden div came back empty):
deepL = driver.FindElementByCss("textarea.lmt__textarea.lmt__target_textarea.lmt__textarea_base_style").Attribute("value")
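Because the translation arrives shortly after the page loads, it may help to poll the textarea instead of reading it once. The following is a rough sketch, not a definitive implementation: WaitForTranslation is a hypothetical helper name, the driver is assumed to be a started SeleniumBasic WebDriver, and the selector is the one from the solution above, which may stop matching if the site changes.

```vba
Option Explicit

' Poll the target textarea until it holds a non-empty value or the timeout expires.
' Hypothetical helper; "driver" is assumed to be a started SeleniumBasic WebDriver.
Public Function WaitForTranslation(ByVal driver As Object, _
        Optional ByVal timeoutSeconds As Long = 10) As String
    ' Selector taken from the solution above (assumed to still match the page)
    Const TARGET_CSS As String = _
        "textarea.lmt__textarea.lmt__target_textarea.lmt__textarea_base_style"

    Dim deadline As Date
    deadline = DateAdd("s", timeoutSeconds, Now)

    Dim currentValue As String
    Do
        ' & "" guards against a Null attribute value
        currentValue = driver.FindElementByCss(TARGET_CSS).Attribute("value") & ""
        If Len(Trim$(currentValue)) > 0 Then Exit Do
        driver.Wait 250   ' pause 250 ms before trying again
    Loop While Now < deadline

    ' Returns an empty (or whitespace-only) string if nothing arrived in time
    WaitForTranslation = currentValue
End Function
```

Inside the deepL function, the last assignment would then become deepL = WaitForTranslation(driver). The caller should still treat an empty result as "no translation received".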

Scrape Data From A HTML Table [duplicate]

I am really struggling to pull some data from a web table. I have scraped web data in the past, but never from a table, and I cannot work it out.
I have tried several variations but nothing seems to work. I have changed the class several times, and the child node number to reflect each item, but I cannot extract anything from the table.
Q) Can someone advise on the table class and how to extract from a td?
I have read several posts on this forum and other forums on scraping from a table, but none helped, hence this post.
''''Data 1
On Error Resume Next
If doc.getElementsByClassName("content")(0).getElementsByTagName("td").Children(0) Is Nothing Then
wsSheet.Cells(StartRow + myCounter, 1).Value = "-"
Else
On Error Resume Next
wsSheet.Cells(StartRow + myCounter, 1).Value = doc.getElementsByClassName("content")(0).getElementsByTagName("td").Children(0).innerText
End If
I have tried the following variations:
doc.getElementsByClassName("content")(0)
doc.getElementsByClassName("content")(0)).Children(0)
doc.getElementsByClassName("content")(0).getElementsByTagName("th").getElementsByTagName("td").Children(0)
doc.getElementsByClassName("content")(0).getElementsByTagName("td").Children(0)
This is an image of the html, I tried to put in the html code, but could not get it to look right
As always thanks in advance
First, a piece of advice: split those statements into pieces and save the results in intermediate variables.
Then an observation: the <td> tags have no children, so Children(0) will return Nothing (the <th> on that page has a child, the <span> tag). You probably want to read the content of the cell; you can do this with the innerHTML property.
Remove the On Error Resume Next statement. While you are developing your routine, let the code run into errors so you can easily debug and see where it fails. Once it is ready, it's better to check for errors yourself.
Not sure if the following fits, but it should give you the idea:
' Fetch the "Content"-DIV
Dim content As Object
Set content = HtmlDoc.getElementsByClassName("content")(0)
' Fetch the first table within that div
Dim table As Object
Set table = content.getElementsByTagName("table")(0)
' Loop over all <td>-Tags and print the content
Dim td As Object
For Each td In table.getElementsByTagName("td")
Debug.Print td.innerHTML
If td.Children.Length > 0 Then
' If <td> has children, fetch the first child and show the content
Dim child As Object
Set child = td.Children(0)
Debug.Print " We found a child: " & child.tagName, child.innerHTML
End If
Next
When you debug the code, remember to use the "Locals Window" of the VBA (View->Locals Window). There you can inspect all the details of the objects.
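Once the loop above prints what you expect, the cells can be written to the worksheet row by row. This is a minimal sketch under stated assumptions: WriteTableToSheet is a hypothetical helper name, and "table" is assumed to be an MSHTML table element such as the one fetched above, whose Rows and Cells collections are zero-based.

```vba
Option Explicit

' Copy every cell of an HTML table to a worksheet, one table row per sheet row.
' Hypothetical helper; "table" is assumed to be an MSHTML <table> element.
Public Sub WriteTableToSheet(ByVal table As Object, ByVal ws As Worksheet, _
        Optional ByVal startRow As Long = 1, Optional ByVal startCol As Long = 1)
    Dim r As Long
    Dim c As Long
    Dim tableRow As Object

    For r = 0 To table.Rows.Length - 1
        Set tableRow = table.Rows(r)
        ' Cells contains both <th> and <td> elements of the row
        For c = 0 To tableRow.Cells.Length - 1
            ' & "" guards against a Null innerText
            ws.Cells(startRow + r, startCol + c).Value = _
                Trim$(tableRow.Cells(c).innerText & "")
        Next c
    Next r
End Sub
```

For example, WriteTableToSheet table, wsSheet, StartRow, 1 would place the first table row in row StartRow of wsSheet, using the variable names from the question.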

Scraping data from a specific table

I'm trying to scrape each of the symbol codes and names from here (about 1/4 of the way down the page): https://uk.finance.yahoo.com/quote/MSFT?p=MSFT&.tsrc=fin-srch
If I inspect the HTML of the first row with the symbol AAPL I am given the following
<tr class="Va(t) Bdc($seperatorColor) TapHc(h) Fw(500) Bgc($hoverBgColor):h H(44px) BdT"</tr>
So in my VBA I navigate to the webpage by creating an InternetExplorer object, and the first piece of code that actually begins the scraping is the following:
Dim allRowOfData As Variant
Set allRowOfData = appIE.document.getElementsByClassName("Va(t)")
Dim myValue As String
myValue = allRowOfData.Cells(1).innerHTML
If I look in the immediate window I am then presented with so many HTMLElements (20) plus all of their children that I have no idea where to begin, to be able to get the data that I want.
Is there an easier way to do this?
Also, how do we know what to put in the getElementsByClassName? Initially I had the entire string after the <tr class= and this returned nothing at all.

Element doesn't exist although it has ID attribute

I am using Selenium in Excel VBA and trying to learn more about how to work with CSS selectors.
I am puzzled because the element I inspected has an ID, yet when I run the code I get a message that the element was not found.
Here's the code so far:
Private bot As New selenium.ChromeDriver
Sub Test()
Dim win, mainWin As selenium.Window, sCode As String, i As Long
Dim urlImage As String, urlPost As String
Dim sCase As String
sCase = "192160470"
Set bot = New ChromeDriver
With bot
.Start "Chrome"
'First Window (Main Window)
.Get "https://www.kuwaitcourts.gov.kw/searchPages/searchCases.jsp"
'.FindElementById("txtCaseNo").SendKeys sCase
.FindElementByCss("input[type=text][name='txtCaseNo']").SendKeys sCase
'MsgBox "Click OK After Entering Captcha", 64
Stop
.Quit
End With
End Sub
and here's the HTML part for that element
<td><input type="text" name="txtCaseNo" id="txtCaseNo" maxlength="9" class="inputTextBox" onkeypress="return onlyNumbers(event);"></td>
I am stuck at this line
.FindElementByCss("input[type=text][name='txtCaseNo']").SendKeys sCase
Thanks in advance for any help or ideas.
The desired element is inside an <iframe>, so to send a character sequence to the txtCaseNo field you have to:
Wait for the desired frame to be available and switch to it.
Wait for the desired element to be clickable.
You can use the following solution:
With bot
.Start "Chrome"
.Get "https://www.kuwaitcourts.gov.kw/searchPages/searchCases.jsp"
' The input lives inside the iframe, so switch to that frame first
.SwitchToFrame "searchCaseDiv"
.FindElementByCss("input[type=text][name='txtCaseNo']").SendKeys sCase
End With
You can find a relevant discussion in How to send text with in the username field within an iframe using Selenium VBA
tl; dr
Ways to deal with #document under iframe
The element is inside of an iframe with id searchCaseDiv. You have to switch to that iframe to be able to access the element.
Use .SwitchToFrame to switch frame.
For Java, it would look like this:
driver.switchTo().frame("searchCaseDiv");
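After working inside the frame, it is usually worth switching back to the top-level document so that later lookups on the main page still work. A small sketch, assuming SeleniumBasic and an already started driver that has loaded the page; SendKeysInsideFrame is a hypothetical helper name.

```vba
Option Explicit

' Type into a field that lives inside an iframe, then return to the top-level page.
' Hypothetical helper; "bot" is assumed to be a started SeleniumBasic driver.
Public Sub SendKeysInsideFrame(ByVal bot As Object, ByVal frameId As String, _
        ByVal css As String, ByVal textToSend As String)
    ' Enter the iframe that contains the element
    bot.SwitchToFrame frameId
    bot.FindElementByCss(css).SendKeys textToSend
    ' Leave the iframe again so later lookups target the main document
    bot.SwitchToDefaultContent
End Sub
```

With the names from the question, the call would be SendKeysInsideFrame bot, "searchCaseDiv", "input[name='txtCaseNo']", sCase.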

How to get my VBA scraper to find the above row?

I have some experience with VBA but I am very new to web scraping with VBA. However, I am very enthusiastic about it and have thought of a thousand ways I could use it to make my job easier. :)
My problem is that I have a website with two input fields and one button. I can write in the input fields (they have ID so I can easily find them)
My code for the input fields:
.Document.getElementById("header_keyword").Value = my_first
.Document.getElementById("header_location").Value = my_last
But I am really stuck with clicking the button.
Here is the html code for the buttons:
<span class="p2_button_outer p2_button_outer_big"><input class="p2_button_inner" type="submit" value="Keresés" /></span>
<span class="p2_button_outer p2_button_outer_big light hide_floating"><a id="tour_det_search" class="p2_button_inner" href="http://www.profession.hu/kereses">Részletes keresés</a></span>
As you can see there are two different buttons near each other, and they share the same class. I am looking for the first/upper one. My problem is that it has no ID, only class, type and value. But I was not able to find getelementsbytype or getelementsbyvalue method.
Is there any solution to find the button by type or value (or both)?
Sorry if I am asking something stupid but as I said previously I am new in scraping...:)
Thank you in advance and have a nice weekend!
Fortunately, I have worked out the solution. :)
What I did is the following: I fetched all elements with the relevant class, then looped through them using the getAttribute() method to look for the specific value, and clicked the element once I found it.
Below is the working code:
Set my_classes = .Document.getElementsByClassName("p2_button_inner")
For Each class In my_classes
If class.getAttribute("value") = "Keresés" Then
Range("c4") = "Clicked"
class.Click
Exit For
End If
Next class
Thank you!
You can use the following function. It looks for the first HTML element with the given caption. You can also limit the search to a given HTML tag.
(The code is compatible with IE versions before 9, which do not have the getElementsByClassName method.)
Public Function FindElementByCaption(dom As Object, Caption As String, _
Optional Tag As String, Optional Nested As Boolean = True) As Object
Dim ControlsSet As Variant
Dim Controls As Variant
Dim Control As Object
'------------------------------------------------------------------------------------
Set ControlsSet = VBA.IIf(Nested, dom.all, dom.childNodes)
If VBA.Len(Tag) Then
Set Controls = ControlsSet.tags(VBA.LCase(Tag))
Else
Set Controls = ControlsSet
End If
For Each Control In Controls
If VBA.StrComp(Control.InnerHtml, Caption, vbTextCompare) = 0 Then
Set FindElementByCaption = Control
Exit For
End If
Next Control
End Function
Here is how to apply it in your code:
Dim button As Object
Set button = FindElementByCaption(.Document, "Keresés", "INPUT", True)
If Not button Is Nothing Then
Call button.Click
Else
Call MsgBox("Button has not been found")
End If
CSS selector:
Use a CSS selector to target the element:
input[value='Keresés']
This matches an element with the input tag whose value attribute equals 'Keresés'.
VBA:
You apply the selector via the querySelector method of document.
ie.document.querySelector("input[value='Keresés']").Click
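The one-liner above raises a run-time error if nothing matches the selector, so it can help to check the result before clicking. A minimal sketch: ClickFirstMatch is a hypothetical helper name, and "doc" is assumed to be the Internet Explorer document, rendered in a document mode that supports querySelector.

```vba
Option Explicit

' Click the first element matching a CSS selector.
' Returns True if an element was found and clicked, False otherwise.
' Hypothetical helper; "doc" is assumed to be an MSHTML document object.
Public Function ClickFirstMatch(ByVal doc As Object, ByVal selector As String) As Boolean
    Dim element As Object
    Set element = doc.querySelector(selector)

    If element Is Nothing Then
        ' No match: report it instead of failing on .Click
        ClickFirstMatch = False
    Else
        element.Click
        ClickFirstMatch = True
    End If
End Function
```

Usage with the selector from this answer: If Not ClickFirstMatch(ie.document, "input[value='Keresés']") Then MsgBox "Button has not been found".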

Resources