Returning hrefs using Selenium

Returning hrefs using Selenium - python-3.x

I'm working with HTML loosely structured like this:
...
<div class='TL-dsdf2323...'>
<a href='/link1/'>
(more stuff)
</a>
<a href='/link2/'>
(more stuff)
</a>
</div>
...
I want to return all of the hrefs contained within this particular div. So far I seem to be able to locate the proper div:
div = driver.find_elements_by_xpath("//div[starts-with(@class, 'TL')]")
This is where I'm hitting a wall, though. I've gone through other posts and tried several options, such as
links = div.find_elements_by_xpath("//a[starts-with(@href,'/link')]")
and
div.find_element_by_partial_link_text('/link')
but I keep getting empty lists back. Any idea where I'm going wrong here?
Edit:
In the actual HTML, the div class name is ThumbnailLayout (simplified to TL above) and the hrefs start with /listing (simplified to /link above).

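As @mr_mooo_cow pointed out in a comment, a delay was needed before the links could be extracted. Here is the final working code:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Wait up to 10 seconds for the matching <a> elements to be present in the DOM
a_tags = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[starts-with(@href,'/listing')]")))
links = []
for link in a_tags:
    links.append(link.get_attribute('href'))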

Can you try something like this:
links = driver.find_elements_by_xpath("//a[starts-with(@href,'/link') and parent::div[starts-with(@class, 'TL')]]")
parent::div (or ../) refers to the parent element in XPath, whereas ./ refers to the current node. I haven't tested this, so let me know if it doesn't work.
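To see how `./` and `..` behave without launching a browser, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is only an illustration, not Selenium code: ElementTree implements a small XPath subset (there is no `starts-with()`), so the class prefix is checked in Python, and the sample markup is an assumption modelled on the simplified HTML in the question.

```python
import xml.etree.ElementTree as ET

# Sample markup modelled on the question (assumed, not the real page).
doc = ET.fromstring(
    "<body>"
    "<div class='TL-dsdf2323'><a href='/link1/'>one</a><a href='/link2/'>two</a></div>"
    "<div class='other'><a href='/link3/'>three</a></div>"
    "</body>"
)

# ElementTree lacks starts-with(), so filter on the class prefix in Python.
target_divs = [d for d in doc.iter("div") if d.get("class", "").startswith("TL")]

# "./a" is resolved relative to the current node (each div), so it matches
# only that div's own <a> children.
hrefs = [a.get("href") for d in target_divs for a in d.findall("./a")]

# ".." steps up to the parent: this selects every element that has an <a> child.
parent_classes = [p.get("class") for p in doc.findall(".//a/..")]

print(hrefs)
print(parent_classes)
```

In Selenium the same idea is usually expressed by calling the find method on the div element with an XPath that begins with `./` or `.//`. A path that begins with `//` searches the whole document, regardless of which element the call is made on.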

Related

Selenium: Stale Element Reference Exception Error

I am trying to loop through all the pages of a website, but I am getting a "stale element reference: element is not attached to the page document" error. This happens when the script tries to click the third page; the error is raised at page.click(). Any suggestions?
while driver.find_element_by_id('jsGrid_vgAllCases').find_elements_by_tag_name('a')[-1].text=='...':
    links=driver.find_element_by_id('jsGrid_vgAllCases').find_elements_by_tag_name('a')
    for link in links:
        if ((link.text !='...') and (link.text !='ADD DOCUMENTS')):
            print('Page Number: '+ link.text)
            print('Page Position: '+str(links.index(link)))
            position=links.index(link)
            page=driver.find_element_by_id('jsGrid_vgAllCases').find_elements_by_tag_name('a')[position]
            page.click()
            time.sleep(5)
    driver.find_element_by_id('jsGrid_vgAllCases').find_elements_by_tag_name('a')[-1].click()

Instead of reusing the elements found initially, you can locate each link again by its index every time you need it.
Something like this:
amount = len(driver.find_element_by_id('jsGrid_vgAllCases').find_elements_by_tag_name('a'))
for i in range(1, amount + 1):
    link = driver.find_element_by_xpath("(//*[@id='jsGrid_vgAllCases']//a)[" + str(i) + "]")
From here you can continue within your for loop using this link, like this:
amount = len(driver.find_element_by_id('jsGrid_vgAllCases').find_elements_by_tag_name('a'))
for i in range(1, amount + 1):
    # Re-locate the link by its 1-based XPath index on every iteration
    link = driver.find_element_by_xpath("(//*[@id='jsGrid_vgAllCases']//a)[" + str(i) + "]")
    if link.text != '...' and link.text != 'ADD DOCUMENTS':
        print('Page Number: ' + link.text)
        print('Page Position: ' + str(i - 1))
        link.click()
        time.sleep(5)
(I'm not sure about the correctness of the rest of your code; I mostly copy-pasted it.)
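The re-locate-by-index idea can be tried without a browser. The sketch below is a rough model under stated assumptions: `FakePage`, `FakeLink`, `StaleElement` and `click_each` are hypothetical stand-ins written for this illustration (they are not Selenium classes), and the fake page simply invalidates every previously returned link whenever one is clicked, which imitates a page that re-renders its pager.

```python
class StaleElement(Exception):
    """Stand-in for Selenium's StaleElementReferenceException."""


class FakePage:
    """Hypothetical page whose links are re-created after every click."""

    def __init__(self, texts):
        self.texts = texts
        self.generation = 0

    def find_links(self):
        # Every lookup returns fresh handles bound to the current render.
        return [FakeLink(self, t, self.generation) for t in self.texts]


class FakeLink:
    def __init__(self, page, text, generation):
        self.page, self.text, self.generation = page, text, generation

    def click(self):
        if self.generation != self.page.generation:
            raise StaleElement(self.text)
        self.page.generation += 1  # the click re-renders the page


def click_each(find_links, skip=("...", "ADD DOCUMENTS")):
    """Click every link, looking it up again by index before each click."""
    clicked = []
    for i in range(len(find_links())):
        link = find_links()[i]  # fresh lookup, never a cached handle
        if link.text in skip:
            continue
        link.click()
        clicked.append(link.text)
    return clicked


page = FakePage(["1", "2", "3", "..."])
print(click_each(page.find_links))

# Reusing handles collected before the first click fails on the second one.
old_links = page.find_links()
old_links[0].click()
try:
    old_links[1].click()
except StaleElement:
    print("stale handle")
```

With a real driver, `find_links` would be a small function that runs the lookup on `jsGrid_vgAllCases` again; the point is only that the lookup happens inside the loop.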

I'm running into the Stale Element exception too. Interestingly, Firefox has no problem, while Chrome and Edge both fail randomly. In general I have two generic find methods with retry logic; their signatures look like this:
// Yes, C#, but it should be relevant for any WebDriver...
public static IWebElement( this IWebDriver driver, By locator)
public static IWebElement( this IWebElement element, By locator)
The WebDriver variant seems to work fine for my other fetches, as the search is always "fresh", but the WebElement variant is the one causing grief. Unfortunately, the app forces me to use the WebElement version. The page HTML will be something like:
<node id='Best closest ID Possible'>
<span>
<div>text i want</div>
<div>meh ignore this </div>
<div>More text i want</div>
</span>
<span>
<!-- same pattern ... -->
So the code gets the closest element possible by id, and its child spans, i.e. "//*[@id='...']/span", give all the nodes of interest. This is where I run into issues: while enumerating the elements, I do two XPath selects, "./div[1]" and "./div[3]", to pull out the desired text. It is only when fetching the text nodes under these elements that a StaleElement is randomly thrown. Sometimes the very first XPath fails; sometimes I get through a few pages first. As the site might have 10,000 or more pages, all with the same structure, I spot-check random pages. At most I've gotten through 20 consecutive pages with Chrome (ver 92.0.4515.107) or Edge (ver 94.0.986), both of which seem to be the latest as of now.
One solution that should work is to get all the span elements first, i.e. "//*[@id='x']/span", and then query each one from the driver:
var nodeList = driver.FindElements(By.XPath("//*[@id='x']/span"));
for (int idx = 0; idx < nodeList.Count; idx++)
{
    // XPath positions are 1-based, so span number idx + 1 is the current node
    string str1 = driver.FindElement(By.XPath($"//*[@id='x']/span[{idx + 1}]/div[1]")).GetAttribute("innerText");
    string str2 = driver.FindElement(By.XPath($"//*[@id='x']/span[{idx + 1}]/div[3]")).GetAttribute("innerText");
}
I think it would work, but yuck! This is somewhat simplified, and being able to run an XPath from the node located by its ID would be preferable.
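The "find with retry" idea mentioned above could look roughly like the following. This is a sketch in Python rather than C#, and `retry`, `Flaky` and `read_text` are hypothetical names invented for the demonstration. With Selenium one would typically pass `StaleElementReferenceException` as `retry_on`, and the action should locate the element again from the driver on every call so that each attempt works with a fresh handle.

```python
import time


def retry(action, retry_on, attempts=3, delay=0.0):
    """Call action() until it succeeds or attempts run out.

    retry_on is the exception type (or tuple of types) worth retrying;
    with Selenium this would typically be StaleElementReferenceException.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)


class Flaky(Exception):
    """Hypothetical transient error used only for this demonstration."""


calls = {"count": 0}


def read_text():
    # Fails twice, then succeeds, imitating an element that settles down.
    calls["count"] += 1
    if calls["count"] < 3:
        raise Flaky()
    return "text i want"


result = retry(read_text, retry_on=Flaky, attempts=5)
print(result, calls["count"])
```

Retrying only helps if the lookup is part of the retried action; retrying a call on an element handle that is already stale will keep failing.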

How to get href values from a class - Python - Selenium

<a class="link__f5415c25" href="/profiles/people/1515754-andrea-jung" title="Andrea Jung">
I have the above HTML element and tried using
driver.find_elements_by_class_name('link__f5415c25')
and
driver.get_attribute('href')
but it doesn't work at all. I expected to extract the value of href.
How can I do that? Thanks!

You have to first locate the element, then retrieve its href attribute, like so:
href = driver.find_element_by_class_name('link__f5415c25').get_attribute('href')
If there are multiple links associated with that class name, you can try something like:
eList = driver.find_elements_by_class_name('link__f5415c25')
hrefList = []
for e in eList:
    hrefList.append(e.get_attribute('href'))
for href in hrefList:
    print(href)
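The collection loop can also be written as a comprehension. The sketch below runs without a browser by using a hypothetical `FakeElement` stand-in that models only `get_attribute()`; `collect_hrefs` is likewise a name made up for this example. It additionally skips elements that have no href, on the assumption that those are not wanted.

```python
class FakeElement:
    """Stand-in for a Selenium WebElement; only get_attribute() is modelled."""

    def __init__(self, **attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        # Like WebElement.get_attribute, return None for a missing attribute.
        return self._attrs.get(name)


def collect_hrefs(elements):
    """Return the href of every element that has one."""
    hrefs = (e.get_attribute("href") for e in elements)
    return [h for h in hrefs if h]


elements = [
    FakeElement(href="/profiles/people/1515754-andrea-jung", title="Andrea Jung"),
    FakeElement(title="no link here"),
]
print(collect_hrefs(elements))
```

With a real driver, the list returned by the find-elements call would be passed to `collect_hrefs` in place of the stand-in objects.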

Show custom message without time prefix in <p:schedule>?

I would like to show a custom message in <p:schedule>. It shows the message with time prefix, e.g. "12a CycDemo".
How can I show only "CycDemo" without time prefix? I'm adding it as follows:
model.addEvent(new DefaultScheduleEvent("CycDemo", fromDate, toDate));

I found one solution. It may not be a good one, but I have not been able to find another way.
When I inspected the HTML content in the browser's developer tools, I found the following code.
....
<span class="fc-event-time">12a</span>
<span class=".....">CycDemo</span>
...
So I solved it with CSS display: none.
.fc-event-time {
    display: none;
}
Now I see only my expected message, without the time prefix :)

You have to add an empty timeFormat attribute to <p:schedule>; then the prefix will disappear.
<p:schedule id="schedule" value="#{scheduleView.eventModel}" widgetVar="myschedule" timeFormat="">

Set allDay to true on the DefaultScheduleEvent:
DefaultScheduleEvent d = new DefaultScheduleEvent();
d.setAllDay(true);
eventModel.addEvent(d);

Alloy user interface (access a tag value)

I'm working with Liferay Portal 6.2, and I want to get the value of the text in a tag with AlloyUI.
Example:
<div>
<p> Paragraph </p>
"value"
</div>
The desired result is: value
Please help.

AlloyUI, being an extension of YUI3, uses get/set methods to access and manipulate the properties/attributes of the object (YUI3 Node / AlloyUI Node) that is returned when looking up elements from the page.
Some examples can be reviewed in this documentation as well as this documentation.
In general you'll need something unique (i.e. id, css class) to the div in order to fetch only that element. Once you have that element, divNode.get('text') will give you all of the text within the element. There is not a means to easily "skip" the paragraph contents within the div without the value being contained within some other markup. If you have control over the markup and can do this, that would be the best option. Otherwise you are left to using the replace function to strip out the paragraph contents from the text.
<script>
AUI().use('aui-base', function(A) {
    var paragraphText = A.one('#myDiv>p').get('text');
    var divText = A.one('#myDiv').get('text');
    var onlyValue = divText.replace(paragraphText, "").trim();
    console.log(onlyValue);
});
</script>

How can I know an object is not visible when any of parent object is not visible in an HTML page using java script?

Let's say I have a text box in an HTML page as follows.
<DIV style = "display:none;">
<DIV style = "display:inline;">
<INPUT type = "text" style = "display:inline;">
</DIV>
</DIV>
In this case, the text box will not be visible to the user. How can I detect that the text box is not currently visible to the user?
Please don't say that I should travel up through the parent objects to find out whether they are set to not visible. I have a bunch of fields to validate like this, and doing so would reduce the application's performance.
Is there any other way to find out that this object is not visible to the user?
Thanks in advance.

If you don't need it to be pure JavaScript I would suggest using jQuery. Using the :visible or :hidden selector will accomplish what you want:
if ( $('yourElement').is(":hidden") ) {
    // The element is not visible
}
http://api.jquery.com/visible-selector/
http://api.jquery.com/hidden-selector/
If you need pure JavaScript and you don't want to travel up through every ancestor element, you could try checking the element's offsetWidth and offsetHeight. If the element is hidden because of an ancestor element, they should both be 0. Note: I've always used jQuery for this, so I don't know how reliable this is.
var yourElement = document.getElementById('yourElementsId');
if ( yourElement.offsetWidth == 0 && yourElement.offsetHeight == 0) {
    // The element is not visible
}
https://developer.mozilla.org/en-US/docs/DOM/element.offsetWidth
https://developer.mozilla.org/en-US/docs/DOM/element.offsetHeight
