Selenium WebDriver: Get value of DIV without a class name but with category name - python-3.x

I am trying to use Selenium WebDriver in Python 3.
My HTML source looks like this, for example:
<div the-category="Cat1"...></div>
<div the-category="Cat2"...></div>
I know that if I had a class instead of the-category, for example:
<div class="Cat1"...></div>
<div class="Cat2"...></div>
I could find the first div by:
driver.find_element_by_class_name('Cat1')
But how can I find the first div in this markup:
<div the-category="Cat1"...></div>
<div the-category="Cat2"...></div>

Try a CSS attribute selector:
driver.find_element_by_css_selector('div[the-category="Cat1"]')
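The same pattern works for any attribute, not only the-category. As a minimal sketch, the helper below builds the selector string; attribute_css is a name made up for this example, not a Selenium function. In Selenium 4 the result would be passed to driver.find_element(By.CSS_SELECTOR, selector), because the find_element_by_* helpers were deprecated and later removed.

```python
def attribute_css(tag, attribute, value):
    """Build a CSS selector that matches <tag attribute="value">.

    Hypothetical helper for illustration only; it assumes the value
    contains no double quotes.
    """
    return f'{tag}[{attribute}="{value}"]'


selector = attribute_css("div", "the-category", "Cat1")
print(selector)
```

The equivalent XPath would be //div[@the-category='Cat1'].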

Related

Python Selenium, find certain elements under a certain element

<section id='browse-search'>
<div>
<div>
<div>
<div class='product-pod'>
<div class='product-pod>'>
</div>
</div>
</div>
</section>
<div class='product-pod>'>
<div class='product-pod>'>
My webpage has a structure like this, and I need a cleaner way to locate the elements with class='product-pod'. driver.find_elements(By.XPATH,"div[@class='product-pod']") will not work, because there are a few matching elements outside the section element.
Please advise on the most appropriate way to locate those elements.
Based on what you have provided, the following strategy could be used.
If the extra > in your markup is a typo, then:
For the first div element:
driver.find_element(By.XPATH, "(//section[@id='browse-search']//div[@class='product-pod'])[1]")
For the second div element:
driver.find_element(By.XPATH, "(//section[@id='browse-search']//div[@class='product-pod'])[2]")
If the > is not a typo, then the structure changes, and the strategies below would work:
For the main div element:
driver.find_element(By.XPATH, "//section[@id='browse-search']//div[@class='product-pod']")
For the inner div element:
driver.find_element(By.XPATH, "//section[@id='browse-search']//div[@class='product-pod']/div")
You can try an XPath like
//section[@id='browse-search']//div[contains(@class,'product-pod')]
which collects every div whose class contains product-pod inside the section with id = browse-search.
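The effect of scoping the search to the section can be checked offline. The sketch below uses Python's xml.etree.ElementTree as a stand-in for the browser; the markup is a simplified, well-formed version of the page made up for this example, and ElementTree only supports a subset of XPath (no contains()), so it uses the exact @class match.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the page: two pods inside the section, one outside.
PAGE = """
<html>
  <body>
    <section id="browse-search">
      <div>
        <div class="product-pod">inside 1</div>
        <div class="product-pod">inside 2</div>
      </div>
    </section>
    <div class="product-pod">outside</div>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# Unscoped search: matches every product-pod div in the document.
unscoped = root.findall(".//div[@class='product-pod']")

# Scoped search: only product-pod divs that are descendants of the section.
scoped = root.findall(".//section[@id='browse-search']//div[@class='product-pod']")

print(len(unscoped))
print([div.text for div in scoped])
```

With a real driver, the same scoping could also be done in two steps: locate the section with driver.find_element(By.ID, "browse-search"), then call find_elements(By.XPATH, ".//div[@class='product-pod']") on that element.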

How to extract the text content from multiple span elements using Selenium and Python

How to select text content from multiple DIV elements using selenium?
The website I want to collect information from contains div and span elements that share the same class.
How can I collect this information separately?
I need the contents of the span elements inside the panel-body div of each block:
driver.find_element_by_xpath(".//div[@class='panel-body'][1]/span[1]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][1]/span[2]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][1]/span[3]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][2]/span[1]").text
driver.find_element_by_xpath(".//div[@class='panel-body'][2]/span[2]").text
HTML:
<div class="panel-heading">
<h3 class="panel-title">Identificação</h3>
</div>
<div class="panel-body">
<span class="spanValorVerde">TEXT</span><br>
<span style="font-size:small;color:gray">TEXT</span><br>
<br>
<span class="spanValorVerde">TEXT</span>
</div>
</div>
<div class="panel panel-success">
<div class="panel-heading">
<h3 class="panel-title">Situação Atual</h3>
</div>
<div class="panel-body">
<span class="spanValorVerde">TEXT</span> <br>
<span class="spanValorVerde">TEXT</span>
</div>
</div>
I am assuming that "select text" means "get text".
First for loop, over the panel-body divs:
count = driver.find_elements_by_xpath(".//div[@class='panel-body'][i]")
Second for loop, over the spans inside each div:
driver.find_element_by_xpath(".//div[@class='panel-body'][i]/span[j]").text
Searching with find_elements and ".//div[@class='panel-body'][i]" gives you the matching elements, so you know how many there are; then add another loop over .//div[@class='panel-body'][i]/span[j] and read the text of each one. Hope it helps!
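The two loops can also be written without numeric indices. This matters because an XPath position predicate such as [2] counts among siblings under the same parent, so in markup where each panel-body sits in its own panel it may not select the second panel-body on the page. The sketch below is one way to do it, assuming a Selenium 4 style driver; collect_span_texts is a name made up for this example, and the string "xpath" is the value of By.XPATH, used here so the snippet needs no imports.

```python
def collect_span_texts(driver):
    """Return one list of span texts per panel-body div.

    Hypothetical helper for illustration. `driver` is assumed to be a
    Selenium WebDriver (or any object with the same find_elements method).
    """
    # Outer loop: every panel-body div on the page.
    panels = driver.find_elements("xpath", "//div[@class='panel-body']")
    return [
        # Inner loop: the span children of this particular div.
        [span.text for span in panel.find_elements("xpath", "./span")]
        for panel in panels
    ]
```

For the HTML in the question this would be expected to give one list with three texts and one with two, which can then be handled separately.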
To extract the text, e.g. TEXT, from each <span> using Selenium and Python, wait for visibility_of_all_elements_located() with WebDriverWait; you can then use either of the following locator strategies:
Using CSS_SELECTOR and get_attribute("innerHTML"):
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.panel.panel-success div.panel-body span.spanValorVerde")))])
Using XPATH and text attribute:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='panel panel-success']//div[@class='panel-body']//span[@class='spanValorVerde']")))])
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Outro
Link to useful documentation:
get_attribute() method Gets the given attribute or property of the element.
text attribute returns The text of the element.
Difference between text and innerHTML using Selenium

Selenium Can't Find Element Returning None or []

I'm having trouble accessing an element. Here is my code:
driver.get(url)
desc = driver.find_elements_by_xpath('//p[@class="somethingcss xxx"]')
I also tried another method, like this:
desc = driver.find_elements_by_class_name('somethingcss xxx')
The element I am trying to find looks like this:
<div data-testid="descContainer">
<div class="abc1123">
<h2 class="xxx">The Description<span data-tid="prodTitle">The Description</span></h2>
<p data-id="paragraphxx" class="somethingcss xxx">sometext here
<br>text
<br>
<br>text
<br> and several text with
<br> tag below
</p>
</div>
<!--and another div tag below-->
I want to extract the p tag inside div class="abc1123", but I get no result: the call only returns [] when I try to use get_attribute or extract the text.
When I extract another element with a different class using this method, it works perfectly.
Does anyone know why I can't access these elements?
Try the following CSS selector to locate the p tag.
print(driver.find_element_by_css_selector("p[data-id^='paragraph'][class^='somethingcss']").text)
Or use get_attribute("textContent"):
print(driver.find_element_by_css_selector("p[data-id^='paragraph'][class^='somethingcss']").get_attribute("textContent"))
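One possible contributing factor for the class-name attempt, offered as an assumption rather than a diagnosis: a class-name locator expects a single class, while 'somethingcss xxx' is two classes separated by a space. A CSS selector that chains the classes avoids that. The helper below is a minimal sketch; css_for_classes is a name made up for this example.

```python
def css_for_classes(tag, class_attribute):
    """Turn a space-separated class attribute into a chained CSS selector.

    Hypothetical helper for illustration; it assumes plain class names
    that need no CSS escaping.
    """
    classes = class_attribute.split()
    return tag + "".join(f".{name}" for name in classes)


selector = css_for_classes("p", "somethingcss xxx")
print(selector)
```

The resulting selector could be used with find_elements and a CSS selector locator. If the list is still empty, the element may simply not be in the DOM yet when the search runs, in which case an explicit wait is worth trying.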

Why does attribute splitting happen in BeautifulSoup?

I am trying to get the class attribute of the parent element:
<div class="detailMS__incidentRow incidentRow--away odd">
<div class="time-box">45'</div>
<div class="icon-box soccer-ball-own"><span class="icon soccer-ball-own"> </span></div>
<span class=" note-name">(Autogoal)</span><span class="participant-name">
Reynaldo
</span>
</div>
span_autogoal = soup.find('span', class_='note-name')
print(span_autogoal)
print(span_autogoal.find_parent('div')['class'])
# print(span_autogoal.find_parent('div').get('class'))
Output:
<span class="note-name">(Autogoal)</span>
['detailMS__incidentRow', 'incidentRow--away', 'odd']
I know I can do something like this:
print(' '.join(span_autogoal.find_parent('div')['class']))
But I want to know why this is happening, and whether there is a more correct way to do it.
The other answer is correct. However, if you want a multi-valued attribute returned as a single string, try re-parsing the parent element with the xml parser after you have found it.
from bs4 import BeautifulSoup
data='''<div class="detailMS__incidentRow incidentRow--away odd">
<div class="time-box">45'</div>
<div class="icon-box soccer-ball-own"><span class="icon soccer-ball-own"> </span></div>
<span class=" note-name">(Autogoal)</span><span class="participant-name">
Reynaldo
</span>
</div>'''
soup=BeautifulSoup(data,'lxml')
span_autogoal = soup.find('span', class_='note-name')
print(span_autogoal)
parentdiv=span_autogoal.find_parent('div')
data=str(parentdiv)
soup=BeautifulSoup(data,'xml')
print(soup.div['class'])
Output on console:
<span class="note-name">(Autogoal)</span>
detailMS__incidentRow incidentRow--away odd
According to the BeautifulSoup documentation:
HTML 4 defines a few attributes that can have multiple values. HTML 5
removes a couple of them, but defines a few more. The most common
multi-valued attribute is class (that is, a tag can have more than one
CSS class). Others include rel, rev, accept-charset, headers, and
accesskey. Beautiful Soup presents the value(s) of a multi-valued
attribute as a list:
css_soup = BeautifulSoup('<p class="body"></p>')
css_soup.p['class']
# ["body"]
css_soup = BeautifulSoup('<p class="body strikeout"></p>')
css_soup.p['class']
# ["body", "strikeout"]
So in your case, in <div class="detailMS__incidentRow incidentRow--away odd">, the class attribute is multi-valued.
That's why span_autogoal.find_parent('div')['class'] returns a list.
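If a plain string is preferred, the splitting can also be switched off when the soup is built: the BeautifulSoup constructor accepts multi_valued_attributes=None. The sketch below compares both behaviours on a shortened version of the markup; the class names and the id="note" attribute are made up for this example.

```python
from bs4 import BeautifulSoup

html = '<div class="row away odd"><span id="note">(Autogoal)</span></div>'

# Default behaviour: class is treated as multi-valued and split into a list.
default_soup = BeautifulSoup(html, "html.parser")
as_list = default_soup.find("span", id="note").find_parent("div")["class"]

# multi_valued_attributes=None keeps every attribute as the original string.
plain_soup = BeautifulSoup(html, "html.parser", multi_valued_attributes=None)
as_string = plain_soup.find("span", id="note").find_parent("div")["class"]

print(as_list)
print(as_string)
```

This avoids the second parse with the xml parser, at the cost of changing how every attribute in the document is returned.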

Python + Selenium - Select Drop Down Option using Stored Variable

I have written a python selenium script that selects a state value from a drop down. The HTML for the drop down element is copied below:
<div class="hQSHyh4QFG0Xh0d-6pxTF" tabindex="0" style="height: 238px; display: none;">
<div class="SD_7vnwWhO0KG80czzPb3 option-0 al-option">AL</div>
<div class="SD_7vnwWhO0KG80czzPb3 option-1 ak-option">AK</div>
<div class="SD_7vnwWhO0KG80czzPb3 option-2 as-option">AS</div>
<div class="SD_7vnwWhO0KG80czzPb3 option-3 az-option">AZ</div>
<div class="SD_7vnwWhO0KG80czzPb3 option-4 ar-option">AR</div>
<div class="SD_7vnwWhO0KG80czzPb3 option-5 ca-option">CA</div>
<div class="SD_7vnwWhO0KG80czzPb3 option-59 um-option">UM</div>
</div>
Problem: the automation script locates the same state value ("CA") using a hard-coded xpath statement (See code snippet from script below). Instead, I would like to select the state value using a stored variable called "state".
state_selection = self.driver.find_element_by_xpath("/html/body/div[2]/div/div[2]/div/div/div[2]/div[1]/form/div/div[2]/div[2]/div[3]/div[2]/div/div[3]/div[6]")
state_selection.click()
Additional Notes: I have tried using other methods to locate the state value (see below) but, so far, I have only been successful using the hard-coded xpath above.
I also tried to locate the drop down element using the Selenium Select Method but I got messages telling me that "Select only works on <select> elements, not on 'div' "
driver.findElement(by.xpath("//select[@SD_7vnwWhO0KG80czzPb3='']/option[@value='CA']")).click()
Try to select required option by its text content:
state = "CA"
state_selection = self.driver.find_element_by_xpath("//div[.='%s']" % state)
state_selection.click()
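As a minimal sketch of building the locator from the stored variable, the helper below returns two candidate locator strings. option_locators is a name made up for this example, and the CSS variant assumes every option follows the lowercase "xx-option" class pattern visible in the snippet (al-option, ca-option, and so on), which may not hold for the whole list.

```python
def option_locators(state):
    """Return (xpath, css) locator strings for a state abbreviation.

    Hypothetical helper for illustration. The CSS form relies on the
    "<abbreviation>-option" class pattern seen in the question's HTML.
    """
    # Match a div whose text, ignoring surrounding whitespace, is the state.
    xpath = f"//div[normalize-space(.)='{state}']"
    # Match a div carrying the per-state class, e.g. div.ca-option.
    css = f"div.{state.lower()}-option"
    return xpath, css


xpath, css = option_locators("CA")
print(xpath)
print(css)
```

Since the container in the snippet has style="display: none;", the dropdown probably has to be opened (clicked) before the option can be clicked.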
