How to get the text from an anchor tag using Selenium in Python (print the text "helloworld") - python-3.x

<div class="someclass">
<p class="name"><a>helloworld</a></p>
</div>
I want to print the text "helloworld" from the anchor tag using Python Selenium code.

You can do it using a CSS selector:
.find_element_by_css_selector("p.name a")
or using XPath:
.find_element_by_xpath("//p[@class='name']/a")
Example:
element = self.browser.find_element_by_css_selector("p.name a")
print(element.get_attribute("text"))
I hope this helps; if not, let me know :)

One-step solution:
browser.find_element_by_xpath('//p[@class="name"]/a').get_attribute('text')
This gives you the text of the anchor tag.

To get the text of an anchor tag using Selenium in Python, you can use .get_attribute('text'); for other elements, the .text property or .get_attribute('textContent') is the more general choice.
In this case:
a_tag = self.driver.find_element_by_css_selector("p.name a")
print(a_tag.get_attribute('text'))
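The find_element_by_* helpers used in the answers above were removed in Selenium 4.3 in favour of find_element(by, value). A minimal sketch of the same lookup in that style follows. The function name first_anchor_text and its default selector are illustrative assumptions, and the locator string is written out literally (it is the value of By.CSS_SELECTOR) so the snippet has no import-time dependency on Selenium.

```python
# Value of selenium.webdriver.common.by.By.CSS_SELECTOR, written out literally
# so this sketch can be read and tested without Selenium installed.
CSS_SELECTOR = "css selector"


def first_anchor_text(driver, css="p.name a"):
    """Return the stripped visible text of the first element matching `css`.

    `driver` is assumed to be a Selenium WebDriver, or any object exposing a
    compatible `find_element(by, value)` method.
    """
    element = driver.find_element(CSS_SELECTOR, css)
    return element.text.strip()
```

With a live browser session this would be called as print(first_anchor_text(driver)) after the page has been loaded.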

Related

Selenium Can't Find Element Returning None or []

I'm having trouble accessing an element. Here is my code:
driver.get(url)
desc = driver.find_elements_by_xpath('//p[@class="somethingcss xxx"]')
I also tried another method:
desc = driver.find_elements_by_class_name('somethingcss xxx')
The element I am trying to find looks like this:
<div data-testid="descContainer">
<div class="abc1123">
<h2 class="xxx">The Description<span data-tid="prodTitle">The Description</span></h2>
<p data-id="paragraphxx" class="somethingcss xxx">sometext here
<br>text
<br>
<br>text
<br> and several text with
<br> tag below
</p>
</div>
<!--and another div tag below-->
I want to extract the p tag inside the div with class="abc1123", but I get no result: the lookup returns [] when I try get_attribute or try to extract the text.
When I extract another element with another class using the same method, it works perfectly.
Does anyone know why I can't access these elements?
Try the following CSS selector to locate the p tag:
print(driver.find_element_by_css_selector("p[data-id^='paragraph'][class^='somethingcss']").text)
Or use get_attribute("textContent"):
print(driver.find_element_by_css_selector("p[data-id^='paragraph'][class^='somethingcss']").get_attribute("textContent"))
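One likely reason the class-name lookup in the question comes back empty is that a class-name locator expects a single class, whereas 'somethingcss xxx' is two classes separated by a space. A small helper (the name css_for_classes is hypothetical) can turn a tag name and a class attribute value into a CSS selector that requires every listed class:

```python
def css_for_classes(tag, class_attr):
    """Build a CSS selector matching `tag` elements that carry every class
    listed in `class_attr` (a space-separated class attribute value)."""
    return tag + "".join("." + name for name in class_attr.split())


selector = css_for_classes("p", "somethingcss xxx")
print(selector)  # p.somethingcss.xxx
```

The resulting string could then be passed to a CSS-selector lookup in place of the class-name call.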

How to extract only the text which is not inside any tag using xpath with selenium and python binding

Link to the page is: "https://www.members.agta.org/assnfe/CompanySearch.asp?MODE=DETAIL&COID=1026706&COMPNAME=&CITYNAME=&STATENAME=&CITYID=0&STATEID=0&CTRYID=181&SEARCHIDENTIFIER=81.145.145.150_12/24/2019%203:31:24%20AM&RETAILMBRS=0&ORGTYPE=0&GEMSTONEID=-1&PRODUCTSID=-1&COMPANYDATA=&TID=2&GEMCOLORID=-1&GEMCUTID=-1&GEMQUALID=A"
Here is the HTML I am targeting:
<p><strong>Contact:</strong>
Garmendia, Diane
<br>
<strong>Email:</strong> Diane33jewels@gmail.com<br>
<strong>P:</strong> 805-957-9100<br>
<strong>F:</strong> 805-957-4191<br>
http://www.33jewels.com
<!-- <b>Email Link:</b> $MC:EMAILLINKTOFORM$ -->
</p>
I need to extract "Garmendia, Diane" using the xpath expression.
I have tried using:
cname=driver.find_element_by_xpath("//*[contains(text(), 'Contact:')]//following-sibling::text()[1]")
But the error I am getting is:
Message: invalid selector: The result of the xpath expression "//*[contains(text(), 'Contact:')]//following-sibling::text()[1]" is: [object Text]. It should be an element.
To extract "Garmendia, Diane", use the JavaScript executor and childNodes.
Use WebDriverWait() to wait for element_to_be_clickable() with the following XPath. The code assumes WebDriverWait, By, and expected_conditions (as EC) have been imported from Selenium.
Code:
element=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//p[contains(.,'Contact:')]")))
print(driver.execute_script('return arguments[0].childNodes[1].textContent;', element))
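When the page source is already available as a string (for example from driver.page_source), the text node after a label can also be picked out without JavaScript. The sketch below swaps in the standard library's html.parser instead of Selenium. The class name TextAfterLabel and the trimmed sample markup are assumptions made for illustration, and the logic simply takes the first non-blank text after the matching label.

```python
from html.parser import HTMLParser


class TextAfterLabel(HTMLParser):
    """Capture the first non-blank text that follows <strong>label</strong>."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.in_strong = False  # currently inside a <strong> element
        self.armed = False      # the last <strong> held the wanted label
        self.value = None       # captured text, once found

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self.in_strong = True

    def handle_endtag(self, tag):
        if tag == "strong":
            self.in_strong = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_strong:
            # Arm the parser only when the label text matches exactly.
            self.armed = text == self.label
        elif self.armed and text and self.value is None:
            self.value = text
            self.armed = False


# Trimmed, illustrative version of the markup from the question.
sample = """<p><strong>Contact:</strong>
Garmendia, Diane
<br>
<strong>P:</strong> 805-957-9100<br>
</p>"""

parser = TextAfterLabel("Contact:")
parser.feed(sample)
parser.close()
print(parser.value)  # Garmendia, Diane
```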

How to extract a field from xpath which is not present in an element tag?

<div class="info">
<span class="label">Establishment year</span>
"2008"
</div>
I want to extract 2008 using XPath, but my expression selects only the "Establishment year" text.
driver.find_element_by_xpath("//*[text()='Establishment year']")
As the text 2008 is within a text node, you can extract it with the following solution:
print(driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='info']/span[@class='label' and text()='Establishment year']/..")))).strip())
Unfortunately, WebDriver does not allow the result of find_element to be a text node, so you will have to use execute_script, like this:
driver.execute_script(
"return document.evaluate(\"//div[@class='info']/node()[3]\", document, null, XPathResult.STRING_TYPE, null).stringValue;")
More information:
XPath Tutorial
XPath Axes
XPath Operators & Functions
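The text-node behaviour described in the answers above can be tried offline. In tree APIs such as xml.etree.ElementTree, text that follows a child element is exposed as that child's tail, which is one way to see why the year is not part of the span. This sketch assumes the markup parses as well-formed XML and that the quotes around 2008 in the question are only how the browser inspector displays a text node.

```python
import xml.etree.ElementTree as ET

# Markup from the question, with the inspector's quotes around 2008 dropped.
snippet = """<div class="info">
<span class="label">Establishment year</span>
2008
</div>"""

div = ET.fromstring(snippet)
label = div.find("span[@class='label']")

# The year lives in the text that trails the span, not inside it.
year = (label.tail or "").strip()
print(year)  # 2008
```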

How to get substring from string using xpath 1.0 in lxml

This is the example HTML.
<html>
<a href="HarryPotter:Chamber of Secrets">
text
</a>
<a href="HarryPotter:Prisoners in Azkabahn">
text
</a>
</html>
I need to extract:
Chamber of Secrets
Prisoners in Azkabahn
I am using lxml 4.2.1 in Python, which uses XPath 1.0.
I have tried to extract them using the XPath
'substring-after(//a/@href,"HarryPotter:")'
which returns only "Chamber of Secrets",
and with the XPath
'//a/@href[substring-after(.,"HarryPotter:")]'
which returns
'HarryPotter:Chamber of Secrets'
'HarryPotter:Prisoners in Azkabahn'
I have researched this and learned a few things, but I haven't found a fix for my problem.
I have tried different XPath expressions using substring-after by trial and error.
In my research, I learned that this could also be done with regex; I tried that and failed.
I found that it is easy to manipulate strings with regex in XPath 2.0 and above, but regex can also be used in XPath 1.0 through XSLT extensions.
Can this be done with the substring-after function? If yes, what is the XPath, and if not, what is the best approach to get the desired output?
And how can I get the desired output using regex in XPath while sticking to lxml?
Try this approach to get both text values:
from lxml import html
raw_source = """<html>
<a href="HarryPotter:Chamber of Secrets">
text
</a>
<a href="HarryPotter:Prisoners in Azkabahn">
text
</a>
</html>"""
source = html.fromstring(raw_source)
for link in source.xpath('//a'):
    print(link.xpath('substring-after(@href, "HarryPotter:")'))
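The first attempt in the question returns only one value because XPath 1.0 converts a node-set passed to a string function into the string value of its first node, which is why the answer applies substring-after() once per link. The same per-link idea can be sketched with the standard library alone. This swaps in xml.etree.ElementTree and str.partition for lxml and assumes the markup is well-formed.

```python
import xml.etree.ElementTree as ET

raw_source = """<html>
<a href="HarryPotter:Chamber of Secrets">
text
</a>
<a href="HarryPotter:Prisoners in Azkabahn">
text
</a>
</html>"""

root = ET.fromstring(raw_source)

# partition() splits around the first occurrence of the prefix; index 2 is
# everything after it (an empty string when the prefix is absent).
titles = [a.get("href", "").partition("HarryPotter:")[2] for a in root.iter("a")]
print(titles)  # ['Chamber of Secrets', 'Prisoners in Azkabahn']
```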
If you want to use substring-after() and substring-before() together,
here is an example:
from lxml import html
f_html = """<html><body><table><tbody><tr><td class="df9" width="20%">
<a class="nodec1" href="javascript:reqDl(1254);" onmouseout="status='';" onmouseover="return dspSt();">
<u>
2014-2
</u>
</a>
</td></tr></tbody></table></body></html>"""
tree_html = html.fromstring(f_html)
deal_id = tree_html.xpath("//td/a/@href")
print(tree_html.xpath('substring-after(//td/a/@href, "javascript:reqDl(")'))
print(tree_html.xpath('substring-before(//td/a/@href, ")")'))
print(tree_html.xpath('substring-after(substring-before(//td/a/@href, ")"), "javascript:reqDl(")'))
Result:
1254);
javascript:reqDl(1254
1254
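Since the question also asks about regex, here is a minimal sketch that does the extraction with Python's re module on the attribute value after it has been read, rather than inside XPath. The pattern is an assumption based on the single href shown above.

```python
import re

href = "javascript:reqDl(1254);"

# Capture the digits between "reqDl(" and the closing parenthesis.
match = re.search(r"reqDl\((\d+)\)", href)
deal_id = match.group(1) if match else None
print(deal_id)  # 1254
```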

How can I click the third href link?

<ul id='pairSublinksLevel1' class='arial_14 bold newBigTabs'>...</ul>
<ul id='pairSublinksLevel2' class='arial_12 newBigTabs'>
<li>...</li>
<li>...</li>
<li>
<a href='/equities/...'> last data </a> #<-- HERE
</li>
<li>...</li>
The question is: how can I click the link in the third li tag?
In my code:
xpath = "//ul[@id='pairSublinksLevel2']"
element = driver.find_element_by_xpath(xpath)
actions = element.find_element_by_css_selector('a').click()
The code works partially, but I want to click the link in the third li tag.
The code keeps clicking the one in the second li tag.
Try
driver.find_element_by_xpath("//ul[@id='pairSublinksLevel2']/li[3]/a").click()
EDIT:
Thanks @DebanjanB for the suggestion:
When you get the element with the XPath //ul[@id='pairSublinksLevel2'] and search for an a tag among its descendants, the first match is returned (in your case, it could be inside the second li tag). You can use indexing as shown above to get a specific numbered match. Note that XPath indexing starts from 1, not 0.
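The first-match behaviour and the 1-based index can be checked offline. This sketch uses the standard library's xml.etree.ElementTree, which supports a subset of XPath, instead of Selenium. The list items and link texts are made-up stand-ins for the elided content in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the list in the question; the texts are made up.
markup = """<ul id="pairSublinksLevel2">
<li>overview</li>
<li><a href="/first">first link</a></li>
<li><a href="/equities/example">last data</a></li>
<li>other</li>
</ul>"""

ul = ET.fromstring(markup)

# Searching all descendants returns the first anchor in document order,
# which here sits in the second list item.
first_anchor = ul.find(".//a")

# A positional predicate selects the third list item (positions start at 1).
third_item_anchor = ul.find("li[3]/a")

print(first_anchor.text)       # first link
print(third_item_anchor.text)  # last data
```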
As per the HTML you have shared, you can use any of the following solutions:
Using link_text:
driver.find_element_by_link_text("last data").click()
Using partial_link_text:
driver.find_element_by_partial_link_text("last data").click()
Using css_selector:
driver.find_element_by_css_selector("ul.newBigTabs#pairSublinksLevel2 a[href*='equities']").click()
Using xpath:
driver.find_element_by_xpath("//ul[@class='arial_12 newBigTabs' and @id='pairSublinksLevel2']//a[contains(@href,'equities') and contains(.,'last data')]").click()
Reference: Official locator strategies for the webdriver
