Selenium clicking loop ignored some values - python-3.x

I'm trying to run this code:
for j in range(1, 13):
    driver.find_element_by_xpath('//*[@id="gateway-page"]/body/table/tbody/tr[3]/td[2]/table/tbody/tr[2]/td[2]/table/tbody/tr/td/table/tbody/tr[2]/td/div/div[2]/ul/li[' + str(j) + ']').click()
    time.sleep(3)
to click every matching element on this website. But it skips some elements every time, even though each click works when I run it separately outside the for loop. Any idea why this happens?

The problem seems to be with /ul/li['+str(j)+']: you are clicking the <li> tag, while the actual link sits inside it. That's why the link sometimes doesn't receive the click, and no error is raised, because the link is wrapped inside the <li> tag.
Try locating the actual link tag instead. Use the code below; I have tested it on my system. Hope this helps.
driver.get('http://catalog.sps.cuny.edu/preview_program.php?catoid=2&poid=607')
driver.implicitly_wait(10)
links = driver.find_elements_by_xpath("//div//h2[contains(.,'Electives')]/..//ul/li//span/a")
for link in links:
    link.click()
    time.sleep(3)

Looking at the XPath, I see that you are trying to click the Electives options on that website. I think you have stored the text of all electives in a str array and are using the loop to click each elective.
I suggest another approach: store all the elective elements in a list, then iterate over them and click each one, e.g.
elements = driver.find_elements_by_xpath('//*[@id="gateway-page"]/body/table/tbody/tr[3]/td[2]/table/tbody/tr[2]/td[2]/table/tbody/tr/td/table/tbody/tr[2]/td/div/div[2]/ul/li')
for element in elements:
    element.click()
    time.sleep(3)
Probable problems in your solution:
You are storing the names of the electives in an array. If there is any typo, the XPath becomes invalid.
Your loop runs from 1 to 12. XPath positions are 1-based, so li[1] is already the first elective and starting at 1 is correct; the hardcoded upper bound, however, misses any items beyond the twelfth.
Also, each elective expands after it is clicked, so consider scrolling if an element is not found.
Suggestion:
Use relative XPaths instead of absolute ones. Relative XPaths are more stable.
Happy coding!
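A quick way to check how positional predicates such as li[1] are counted is to run them against a small document. The sketch below uses Python's standard-library xml.etree.ElementTree, which supports only a limited subset of XPath; the markup is made up for illustration and is not the real catalog page.

```python
import xml.etree.ElementTree as ET

# Made-up markup standing in for the electives list (an assumption, not the real page).
MARKUP = """
<div>
  <ul>
    <li><span><a href="#a">Elective A</a></span></li>
    <li><span><a href="#b">Elective B</a></span></li>
    <li><span><a href="#c">Elective C</a></span></li>
  </ul>
</div>
""".strip()

root = ET.fromstring(MARKUP)

# Positional predicates are 1-based: li[1] is the first list item.
first_item = root.find(".//ul/li[1]")
first_link_text = first_item.find("span/a").text

# Select the links themselves rather than the <li> wrappers around them.
all_link_texts = [a.text for a in root.findall(".//ul/li/span/a")]

print(first_link_text)
print(all_link_texts)
```

Collecting all matching links in one query also removes the need to hardcode how many items the list contains.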

Related

click on first search auto suggestion on website via VBA macro with selenium

I would like to know how to use Selenium from VBA to click the first autosuggestion on a website, for example the one shown in the screenshot from amazon.es.
Do you have any suggestions?
Tom
Since your situation is reproducible, I was able to test these two approaches. A reminder: it is always good to show some code you've tried in your question, so it doesn't feel like we are helping you from scratch. Also try not to rely on images to show your situation, unless it is something that does not change much (like the Amazon.es page in this case). With that said, let's get to the good part:
1) Advanced 1:
a. Spaces in the class name are replaced with dots (if there are any)
b. Requires understanding what a tag is (a tag is like an object)
'Example:
'Clicking the first "div" element directly under the container ("div" alone is a tag)
Selenium.FindElementByCss("div.autocomplete-results-container > div").Click
2) Advanced 2:
a. Requires understanding what ":nth-child" (a CSS selector) is
'Example:
'Clicking the first child of the container "div" (everything inside "div" is a child; numbering starts at 1)
Selenium.FindElementByCss("div.autocomplete-results-container > div:nth-child(1)").Click
I used Firefox to get the xPath property of that suggestion made by the web page. To quickly compare, I copied and pasted, one at a time, the xPath of the first 3 suggestions shown:
/html/body/div[1]/header/div/div[2]/div/div[2]/div[1]/div/div[1]/span[1]
/html/body/div[1]/header/div/div[2]/div/div[2]/div[2]/div/div[1]/span[1]
/html/body/div[1]/header/div/div[2]/div/div[2]/div[3]/div/div[1]/span[1]
So, for the first item, just use the first XPath. To select, for example, the second, change the index of the sixth div, as the samples above show. Assuming you already have the code that navigates to the page, use this, adapting the name of the WebDriver object:
objSeleniumDriver.FindElementByXPath("/html/body/div[1]/header/div/div[2]/div/div[2]/div[1]/div/div[1]/span[1]").Click
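The index substitution described here can be sketched as a small helper. It is written in Python for illustration (the same string concatenation works in VBA). The function name suggestion_xpath is hypothetical, and the path is the one copied from Firefox in this answer, which may stop matching whenever the page layout changes.

```python
def suggestion_xpath(position):
    """Return the XPath of the suggestion at the given 1-based position."""
    if position < 1:
        raise ValueError("XPath positions start at 1")
    # Only the sixth div index varies between suggestions.
    return (
        "/html/body/div[1]/header/div/div[2]/div/div[2]"
        f"/div[{position}]/div/div[1]/span[1]"
    )


print(suggestion_xpath(1))
print(suggestion_xpath(2))
```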

How to find elements in containers that open when buttons are pressed

I am using headless Firefox on Selenium and XPath Helper to identify insanely long paths to elements.
When the page initially loads, I can use XPath Helper to find the xpath of any element of interest, and selenium can find the element when given the xpath.
However, several buttons that I need to interact with on the page open menus when pressed that are either small or take up the whole "screen". No matter their size, these containers are overlaid on the original page, and although I can find their xpaths using XPath Helper, when I try to use those xpaths to find the elements using selenium, they can't be found.
I've checked, and there's no iframe funny business happening. I'm a bit stumped as to what could be happening. My guess is that the page's source code is being dynamically changed after I press the buttons that open the menu containers and when I call find_element_by_xpath on new elements in the containers, the original source is being searched, instead of the new source. Could that be it?
Any other ideas?
As a workaround, I can send keystrokes to the body of the page, but this feels brittle and likely to fail. Locating the elements explicitly would be much more robust.
EDIT:
With selenium I can find the export button, but not the menu it opens.
Here is the code for the export button itself:
The element of interest for me is "Customize Export" which I have not been able to find using selenium. Here is the code for this element:
Notice the very top line of this last image (cdk-overlay-container)
Now, when I refresh the page and do NOT click the export button, the cdk-overlay-container section of the code is empty:
This suggests that my hypothesis is correct: when the page loads initially, the "Customize Export" button is nowhere in the source code and appears only after "Export" is clicked, and Selenium is using only the original source code, not the dynamically generated code that appears after clicking "Export", to find elements.
Selenium could find the dynamic content after doing
driver.execute_script("return document.body.innerHTML")
WebDriverWait is what you need: it waits for a certain condition on an element. Here is an example that waits up to 5 seconds for each element to be clickable before clicking it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
wait = WebDriverWait(driver, 5)
button = wait.until(EC.element_to_be_clickable((By.XPATH, 'button xpath')))
button.click()
wait.until(EC.element_to_be_clickable((By.XPATH, 'menu xpath'))).click()
"identify insanely long paths" is an anti-pattern. Try not to rely on XPath Helper and write the XPath or CSS selector yourself.
Update:
wait = WebDriverWait(driver, 10)
export_buttons = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//button[contains(@class, "mat-menu-trigger") and contains(., "Export")]')))
print("Export button count: ", len(export_buttons))
export_button = wait.until(EC.element_to_be_clickable((By.XPATH, '//button[contains(@class, "mat-menu-trigger") and contains(., "Export")]')))
export_button.click()
cus_export_buttons = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//button[contains(@class, "mat-menu-item") and contains(., "Customize Export")]')))
print("Customize Export button count: ", len(cus_export_buttons))
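On the hypothesis in the question: Selenium queries the live DOM rather than the original page source, so an element that cannot be found right after a click has often just not been rendered yet. An explicit wait handles this by retrying the lookup. The sketch below shows the polling idea in plain Python; it is not Selenium's implementation, and wait_until and its parameters are hypothetical names.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for a menu item that only exists from the third lookup onwards.
attempts = {"count": 0}


def find_menu_item():
    attempts["count"] += 1
    return "Customize Export" if attempts["count"] >= 3 else None


found = wait_until(find_menu_item, timeout=2.0, poll=0.01)
print(found, "after", attempts["count"], "attempts")
```

WebDriverWait follows the same pattern: it re-evaluates the expected condition at a fixed polling interval and raises a timeout exception if the condition never holds.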

Selenium Webdriver not finding XPATH despite seemingly identical strings

This question is related to my previous two: Inducing WebDriverWait for specific elements and Inconsistency in scraping through <div>'s in Selenium.
I am scraping all of the Air Jordan sneakers off of https://www.grailed.com/. The feed is an infinitely scrolling list of sneakers and I am using Selenium webdriver to scrape the data. My problem is that the images for the shoes seem to take a while to load, so it throws a lot of errors. I have found the pattern in the xpath's of the images. The xpath to the first image is
/html/body/div[3]/div[6]/div[3]/div[3]/div[2]/div[2]/div[1]/a/div[2]/img, and the second is /html/body/div[3]/div[6]/div[3]/div[3]/div[2]/div[2]/div[2]/a/div[2]/img etc.
It follows a linear sequence where the second-to-last div index increases by one each time. To handle this, I put the following in my loop (only the relevant code is included).
i = 1
while len(sneakers) < sneaker_count:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Get sneakers currently on page and add to sneakers list
    feed = driver.find_elements_by_class_name('feed-item')
    for item in feed:
        xpath = "/html/body/div[3]/div[6]/div[3]/div[3]/div[2]/div[2]/div[" + str(i) + "]/a/div[2]/img"
        img = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, xpath)))
        i += 1
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
The issue is that after about the 5th pair of shoes, the wait statement times out; it seems the XPath passed in after that pair is not recognized. I used the copy XPath feature in Firefox Developer to check the XPath, and it seems identical to the one I pass in when I print it. I use ChromeDriver with Selenium, but I don't think that's relevant. Does anyone know why the XPaths stop being recognized even though they seem identical?
UPDATE: An XPath checker add-on for Chrome detects the XPaths for items 1-4, but often stops detecting them after 6. When I check the XPath (in both Chrome and Firefox Developer mode), it still looks identical, but the "CSS and XPath checker" add-on doesn't find it. This is a huge mystery to me.
I found the problem. The xpath was fine, but after the first 4-5 elements, the images are lazy-loaded. This means that a different solution must be reached in order to scrape these images. It's not that they take too long to load, it's that they just load placeholders in the HTML.
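One common way to deal with lazy-loaded images is to scroll each item into view before reading its image URL. The sketch below assumes Selenium 4 style calls (find_element with a locator strategy string; "css selector" is the value of By.CSS_SELECTOR) and assumes that scrolling is what triggers loading on this site, which the thread does not confirm. collect_image_urls is a hypothetical helper name.

```python
import time


def collect_image_urls(driver, items, pause=0.5):
    """Scroll each feed item into view, then read the src of the image inside it."""
    urls = []
    for item in items:
        # Bring the item into the viewport so a lazy loader has a chance to run.
        driver.execute_script("arguments[0].scrollIntoView({block: 'center'});", item)
        # Crude fixed pause; an explicit wait on the src attribute would be more robust.
        time.sleep(pause)
        img = item.find_element("css selector", "img")
        urls.append(img.get_attribute("src"))
    return urls
```

With a live driver it could be called as collect_image_urls(driver, driver.find_elements("class name", "feed-item")). Locating each image relative to its feed item also avoids the long absolute XPath.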

How to locate the element as per the HTML through FindElementByXPath in Selenium Basic

I'm writing VBA code to log in to a webpage and load some information I have in an Excel worksheet.
I'm new to Selenium. I already got the login part right, but now I need to click an element and I keep getting errors.
I need to click the Company 2 button.
This is what I've got so far:
bot.FindElementByXPath("//input[@value=""Company 1""]").Click
Outputs NoSuchElementError
bot.FindElementByXPath("//input[@value=""Company 2""]").Click
Outputs ElementNotVisible
I don't know what I'm doing wrong; I think the first input being hidden has something to do with it. I hope someone can help me.
It might help to know that you can also use ByCss in most circumstances, in which case you can use:
bot.FindElementByCss("input[value='Company 1']").Click
That is nice and short.
The CSS selector is input[value='Company 1']. It says: find an element with an input tag whose value attribute is 'Company 1'.
The XPath might be incorrect. Please try the following syntax:
FindElementByXPath("//input[@value='Company 1']")
First of all, use CSS selectors whenever possible. They are much easier to handle.
Now if you are using CSS selectors try to find the second button using something like
input[value="Company 2"]
For more info on this selector, look at https://www.w3schools.com/cssref/sel_attribute_value.asp
You can use any XPath. At first glance, your XPath looks incorrect; try one of these:
//input[@type='button'][@value='Company 2']
//input[@type='button' and @value='Company 2']
//input[@role='button'][@value='Company 2']
You can also use findelements() to collect all the buttons and pick out the Company 2 button with an if/else check.
As per the HTML you have shared, to invoke click() on the desired elements you can use the following solution:
To click on the element with text as Company 1:
bot.FindElementByXPath("//input[@class='btn_empresa ui-button ui-widget ui-state-default ui-corner-all' and @value='Company 1']").Click
To click on the element with text as Company 2:
bot.FindElementByXPath("//input[@class='btn_empresa ui-button ui-widget ui-state-default ui-corner-all' and @value='Company 2']").Click
Have you tried right-clicking the HTML in inspect and going to Copy>Copy XPath? That might give you something different. Maybe the buttons are created from Javascript and so the WebDriver can't actually see them?
Or try
Company_1 = bot.find_element_by_xpath("//input[@value='Company 1']")
Company_1.click()
Company_2 = bot.find_element_by_xpath("//input[@value='Company 2']")
Company_2.click()
And change the syntax to use ' ' quotes, as someone else mentioned.
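Attribute predicates use @, as in [@value='Company 2'], and they can be tried out offline against a small document. This sketch uses Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath (chained predicates work, the and operator does not). The markup is invented, since the original HTML is not shown in the thread; in particular, the hidden first input is an assumption based on the question.

```python
import xml.etree.ElementTree as ET

# Invented markup (an assumption): a hidden input shares its value with a visible button.
MARKUP = """
<form>
  <input type="hidden" value="Company 1" id="hidden-1" />
  <input type="button" value="Company 1" id="btn-1" />
  <input type="button" value="Company 2" id="btn-2" />
</form>
""".strip()

root = ET.fromstring(MARKUP)

# A value-only predicate matches every input with that value, hidden ones included.
by_value = [el.get("id") for el in root.findall(".//input[@value='Company 1']")]

# Chaining a second predicate narrows the match to the visible button.
button_2 = root.find(".//input[@type='button'][@value='Company 2']")

print(by_value)
print(button_2.get("id"))
```

If the real page has a hidden input like this, a locator that also checks the type attribute would avoid picking up the hidden element first.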

Python: finding elements of a webpage to scrape when the page content is loaded using JavaScript

I am trying to scrape content of a page.
Let's say this is the page:
http://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL
I know I need to use Selenium to get the data I want.
I found this example from Stackoverflow that shows how to do it:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
driver = webdriver.Chrome()
driver.maximize_window()
driver.get("http://finance.yahoo.com/quote/AAPL/profile?p=AAPL")
# wait for the Full Time Employees to be visible
wait = WebDriverWait(driver, 10)
employees = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[. = 'Full Time Employees']/following-sibling::strong")))
print(employees.text)
driver.close()
My question is this:
In the above example to find Full Time Employees the code that has been used is:
employees = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[. = 'Full Time Employees']/following-sibling::strong")))
How did the author work out that s/he needs to use:
"//span[. = 'Full Time Employees']/following-sibling::strong"
To find the number of employees.
For my example page: http://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL how can I find for example Trailing P/E?
Can you please tell me the steps you took to find this? I do right click and choose Inspect, but then what shall I do?
A picture is worth a thousand words.
In the web developer tools (F12), follow these steps:
Choose the Elements tab.
Press the Element Selector button.
With that button pressed, click an element in the main browser window.
In the DOM elements window, right-click the highlighted element.
The context menu appears; choose Copy.
Choose Copy XPath in the submenu. Now you have that element's XPath on the clipboard.
NOTE!
The browser composes an element's XPath using its own algorithm. The result might not be what you expect, or might not suit your code, so you need to understand how XPath works and get familiar with it.
See what xpath the Chrome browser has issued for Trailing P/E:
//*[@id="main-0-Quote-Proxy"]/section/div[2]/section/div/section/div[2]/div[1]/div[1]/div/table/tbody/tr[3]/td[1]/span
'//h3[contains(., "Valuation Measures")]/following-sibling::div[1]//tr[3]'
Here is an answer that should clear up your confusion.
It is better to go through some XPath tutorials and practice yourself; then you will be able to decide what to use.
There are many sites to learn from. You can start Here or Here
Now, coming to your query:
Suppose I use the following XPath to locate the element:
//h3/span[text()='Financial Highlights']/../preceding-sibling::div//tr[3]/td/span
To find Trailing P/E on your page, you definitely want a unique XPath that won't change. If you look it up with FirePath, it shows a lengthy XPath.
So you look for an alternative: find another element (maybe a sibling, child, or ancestor of your element) and locate your element relative to it.
In my case, I first find the Financial Highlights text, which I can do using //h3/span[text()='Financial Highlights']
Then I move to its parent tag, which is h3, using /..
The Trailing P/E element is in the node just above the current one, so I move up to it using /preceding-sibling::div
And finally I find the element inside that <div> using //tr[3]/td/span
See the screens as well -
Step 1 :
Step 2 :
Step 3 :
Step 4 :

Resources