How to extract all href from a class in Python Selenium?

I am trying to extract people's href from the URL https://www.dx3canada.com/agenda/speakers.
I tried:
elems = driver.find_elements_by_css_selector('.display-flex card vancouver')
href_output = []
for ele in elems:
    href_output.append(ele.get_attribute("href"))
print(href_output)
But the output list is empty.
The hrefs I expect are the speaker links on that page, and I would like the output to be a list of hrefs.
I really appreciate the help!

The desired elements are within an <iframe>, so to extract the people's href attributes from https://www.dx3canada.com/agenda/speakers you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the visibility of all elements located.
You can use the following Locator Strategies:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.dx3canada.com/agenda/speakers')
WebDriverWait(driver, 30).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#whovaIframeSpeaker")))
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.display-flex.card.vancouver")))])
Console Output:
['https://whova.com/embedded/speaker_detail/dcrma_202003/9942778/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907682/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907688/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907676/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907696/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907690/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907670/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907693/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9942779/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9908087/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907671/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907681/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907673/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907678/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907689/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907674/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907684/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907685/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907686/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9942780/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907695/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907687/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907683/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907692/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907672/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907697/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907680/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907679/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907675/', 
'https://whova.com/embedded/speaker_detail/dcrma_202003/9907677/', 'https://whova.com/embedded/speaker_detail/dcrma_202003/9907694/']
A relevant discussion is "Ways to deal with #document under iframe".

Your images are in an iframe, so you will need to switch to it with frame_to_be_available_and_switch_to_it before you can scrape the href attributes.
Then, to get the list of all href attributes, you may need to run some JavaScript to scroll each image into view, which handles the case where the href is lazy-loaded:
# first, switch to iframe
WebDriverWait(driver, 30).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[@id='whovaIframeSpeaker']")))
# then collect the speaker links inside the iframe
elements_list = driver.find_elements_by_xpath("//div[contains(@class, 'template-section-body')]/a[contains(@class, 'display-flex card vancouver')]")
for element in elements_list:
    # scroll each link into view in case its href is lazy-loaded
    driver.execute_script("arguments[0].scrollIntoView(true);", element)
    print(element.get_attribute("href"))

For your css selector use .display-flex.card.vancouver instead.
elems = driver.find_elements_by_css_selector('.display-flex.card.vancouver')
Each word is a class, so you need to place a dot in the front of each one.
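The dot rule above can be sketched as a small helper that turns a class attribute value into a compound CSS selector. The function name classes_to_css is hypothetical (it is not part of Selenium), and the sketch assumes plain class names that need no CSS escaping.

```python
def classes_to_css(class_attr, tag=""):
    """Turn an HTML class attribute value into a compound CSS selector.

    Each whitespace-separated word is one class, so each gets its own dot.
    Assumes the class names contain no characters that need CSS escaping.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("class attribute is empty")
    return tag + "".join("." + name for name in names)


# Selector matching any element that carries all three classes.
any_tag_selector = classes_to_css("display-flex card vancouver")
# Same, restricted to <a> elements.
anchor_selector = classes_to_css("display-flex card vancouver", tag="a")

print(any_tag_selector)
print(anchor_selector)
```

The original selector '.display-flex card vancouver' is read by CSS as "an element with class display-flex, containing a <card> element, containing a <vancouver> element", which is why it matched nothing.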

Related

Selenium TimeoutException Error on clicking a button despite using EC.visibility_of_element_located().click() method?

I am trying to create an account on Walmart using Selenium with Python. I successfully opened https://www.walmart.com/ and reached the Create an account button under the Sign In tab. I also entered the first name, last name, email address and password. However, when I click the Create account button, I get a TimeoutException even though I use EC.visibility_of_element_located(...).click().
Can anyone tell me what is wrong with my approach? Thanks in advance.
The source code of the website for Create Account button is as follows:
<button class="button m-margin-top text-inherit" type="submit" data-automation-id="signup-submit-btn" data-tl-id="signup-submit-btn" aria-label="Create Account, By clicking Create Account, the user is acknowledging that they have read and agreed to the Terms of Use and Privacy Policy">Create account</button>
My Python code is as follows:
import time
import requests
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
url = "https://www.walmart.com/"
first_name = "chuza"
last_name = "123"
email_id = "chuza123@gmail.com"
password = "Eureka1#"
options = Options()
s=Service('C:/Users/Samiullah/.wdm/drivers/chromedriver/win32/96.0.4664.45/chromedriver.exe')
driver = webdriver.Chrome(service=s, options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source":
        "const newProto = navigator.__proto__;"
        "delete newProto.webdriver;"
        "navigator.__proto__ = newProto;"
})
wait = WebDriverWait(driver, 20)
actions = ActionChains(driver)
driver.get(url)
sign_in_btn = wait.until(EC.visibility_of_element_located((By.XPATH, "//div[text()='Sign In']")))
actions.move_to_element(sign_in_btn).perform()
time.sleep(0.5)
wait.until(EC.visibility_of_element_located((By.XPATH, '//button[normalize-space()="Create an account"]'))).click()
f_name = driver.find_element(By.ID, 'first-name-su')
l_name = driver.find_element(By.ID, 'last-name-su')
email = driver.find_element(By.ID, 'email-su')
pswd = driver.find_element(By.ID, 'password-su')
f_name.send_keys(first_name)
driver.implicitly_wait(2)
l_name.send_keys(last_name)
driver.implicitly_wait(1.5)
email.send_keys(email_id)
driver.implicitly_wait(2)
pswd.send_keys(password)
driver.implicitly_wait(0.5)
###
wait.until(EC.visibility_of_element_located((By.XPATH, '//button[normalize-space()="Create account"]'))).click()

This CSS selector represents the desired web element:
button[data-automation-id='signup-submit-btn']
and the XPath would be:
//button[@data-automation-id='signup-submit-btn']
Both the CSS selector and the XPath match 3 nodes. Selenium acts on the first match, and for these locators the first matching node is the desired one.
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[data-automation-id='signup-submit-btn']"))).click()
or
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@data-automation-id='signup-submit-btn']"))).click()
It makes more sense to use element_to_be_clickable than visibility_of_element_located when you intend to click a web element. Also, CSS selectors are generally preferable to XPath as locators.
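To illustrate the difference between the two conditions, here is a minimal sketch of what an explicit wait does: it polls a condition until it returns something truthy or a timeout expires. This is not Selenium's implementation. FakeButton, wait_until, visible and clickable are hypothetical names that stand in for a WebElement, WebDriverWait.until and the two expected conditions. The only Selenium behaviour assumed is that a clickable element is one that is both displayed and enabled.

```python
import time


class FakeButton:
    """Hypothetical stand-in for a WebElement that is visible at once
    but only becomes enabled after a short delay."""

    def __init__(self, enabled_after):
        self._ready_at = time.monotonic() + enabled_after

    def is_displayed(self):
        return True

    def is_enabled(self):
        return time.monotonic() >= self._ready_at


def wait_until(condition, timeout=2.0, poll=0.05):
    """Call condition() repeatedly until it is truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)


def visible(element):
    # Passes as soon as the element is displayed, even if it is disabled.
    return element if element.is_displayed() else False


def clickable(element):
    # Passes only when the element is displayed and enabled.
    return element if element.is_displayed() and element.is_enabled() else False


button = FakeButton(enabled_after=0.3)

first_visible = wait_until(lambda: visible(button))
enabled_when_visible = button.is_enabled()      # still disabled at this point

first_clickable = wait_until(lambda: clickable(button))
enabled_when_clickable = button.is_enabled()    # now safe to click

print(enabled_when_visible, enabled_when_clickable)
```

In this sketch the visibility wait returns while the button is still disabled, which is the situation where a click issued straight after a visibility wait can fail.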

The locator //button[normalize-space()="Create account"] matches 3 elements on that page, so you need a more precise locator.
This locator is unique: //form[@id='sign-up-form']//button[@data-tl-id='signup-submit-btn']
So, this should work:
wait.until(EC.visibility_of_element_located((By.XPATH, "//form[@id='sign-up-form']//button[@data-tl-id='signup-submit-btn']"))).click()
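The effect of scoping a locator to the form can be tried offline with the standard library. The markup below is made up for illustration and is not Walmart's real DOM. The only things taken from the answer are the form id sign-up-form and the data-tl-id value. xml.etree.ElementTree supports only a subset of XPath, which is enough for these two expressions.

```python
import xml.etree.ElementTree as ET

# Made-up page with three buttons that share the same attribute and text.
page = ET.fromstring("""
<body>
  <form id="sign-in-form">
    <button data-tl-id="signup-submit-btn">Create account</button>
  </form>
  <form id="sign-up-form">
    <button data-tl-id="signup-submit-btn">Create account</button>
  </form>
  <div>
    <button data-tl-id="signup-submit-btn">Create account</button>
  </div>
</body>
""")

# Broad locator: every button with the attribute, anywhere in the page.
broad = page.findall(".//button[@data-tl-id='signup-submit-btn']")

# Scoped locator: only buttons inside the sign-up form.
scoped = page.findall(
    ".//form[@id='sign-up-form']//button[@data-tl-id='signup-submit-btn']"
)

print(len(broad), len(scoped))
```

With the broad locator a single-element lookup would pick whichever node comes first in document order, which need not be the one inside the sign-up form.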

This xpath based Locator Strategy...
//button[normalize-space()="Create account"]
...identifies three(3) elements within the DOM Tree and your desired element is the second in the list.
Solution
The desired element is a dynamic element, so instead of visibility_of_element_located() you need to induce WebDriverWait for element_to_be_clickable() before clicking. You can use the following Locator Strategy:
Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//form[@id='sign-up-form']//button[normalize-space()='Create account']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

This issue is caused by Selenium. You can work around it by creating the account manually. Follow these steps:
Create an account on the Walmart IO platform by clicking on the man icon just before the search box.
Log in to the account and accept "Terms of Use".
Click on "Create Your Application" to create a new application and fill in the appropriate details.
Generate two sets of public/private keys: one set will be used for production and the other for stage.
Upload both public keys using this: https://walmart.io/key-upload?app_name=<your app name>
A Consumer ID will be generated for each set (prod and stage) and can be seen on the dashboard.
Click on "Request Access" for OPD APIs and fill out the form.

Press button using selenium on yahoo finance doesn't work

I am trying to get the top stocks for the day, so I go to https://finance.yahoo.com/gainers, but I then want to edit the filters by pressing Edit.
driver = webdriver.Chrome()
driver.get("https://finance.yahoo.com/gainers")
element = driver.find_element_by_class_name("Bgc($linkColor).Bgc($linkActiveColor):h.C(white).Fw(500).Px(20px).Py(9px).Bdrs(3px).Bd(0).Fz(s).D(ib).Whs(nw).Miw(110px)")
element.click()
This doesn't work. How can I fix it?

To click on the element Edit you can use either of the following Locator Strategies:
Using xpath:
driver.get("https://finance.yahoo.com/gainers")
driver.find_element_by_xpath("//span[text()='Edit']").click()
Ideally, to click on the element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
Using XPATH:
driver.get("https://finance.yahoo.com/gainers")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Edit']"))).click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

The Java code below seems to work.
WebDriver driver = new ChromeDriver();
driver.get("https://finance.yahoo.com/gainers");
driver.manage().window().maximize();
WebDriverWait wait = new WebDriverWait(driver, 30);
WebElement editButton = wait
.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector("button[data-reactid=\"23\"]")));
editButton.click();
A cleaner locator:
WebElement editButton = wait
.until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//button/span[contains(text(),'Edit')]")));

unable to click the dropdown element in selenium

I tried many XPaths in Selenium, but none of them could click the element. I always get an error such as element not found or element not interactable.
How can I solve this? Any help would be appreciated.
Here is the XPath of the element:
(//a[@href='javascript:void(0)' and @class='select2-choice select2-default'])[1]

Try with ".//*[@id='s2id_search_input']/a"

Wait for the element to be clickable before clicking it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# ...
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#s2id_search_input a.select2-choice'))).click()
With scroll:
element = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#s2id_search_input a.select2-choice')))
driver.execute_script('arguments[0].scrollIntoView()', element)
element.click()
Click using JavaScript:
element = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#s2id_search_input a.select2-choice')))
driver.execute_script('arguments[0].click()', element)

find aria-label in html page using soup python

I have HTML pages with this code:
<span itemprop="title" data-andiallelmwithtext="15" aria-current="page" aria-label="you in page
number 452">page 452</span>
I want to find the aria-label, so I tried this:
is_452 = soup.find("span", {"aria-label": "you in page number 452"})
print(is_452)
I want to get the result:
is_452 = page 452
but I am getting:
is_452 = None
How can I do this?

The attribute value has a line break in it, so an exact text match fails. Try the following:
from simplified_scrapy.simplified_doc import SimplifiedDoc
html='''<span itemprop="title" data-andiallelmwithtext="15" aria-current="page" aria-label="you in page
number 452">page 452</span>'''
doc = SimplifiedDoc(html)
is_452 = doc.getElementByReg(r'aria-label="you in page[\s]*number 452"', tag="span")
print(is_452.text)

Possibly the desired element is a dynamic element, in which case you can use Selenium to extract the value of the aria-label attribute by inducing WebDriverWait for visibility_of_element_located(). You can use either of the following Locator Strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "section#header a.cart-heading[href='/cart']"))).get_attribute("aria-label"))
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//section[@id='header']//a[@class='cart-heading' and @href='/cart']"))).get_attribute("aria-label"))
Note : You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

soup.find fails here because of the line break inside the attribute value. I have a simpler solution which doesn't use any separate library, just BeautifulSoup. I know this question is old, but it has 1k views, so clearly many people search for it.
You can use triple-quote strings to take into account the newline.
This:
is_452 = soup.find("span", {"aria-label": "you in page number 452"})
print(is_452)
Would become:
search_label = """you in page
number 452"""
is_452 = soup.find("span", {"aria-label": search_label})
print(is_452)
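If the exact position of the line break is not known in advance, one alternative is to compare the attribute after collapsing whitespace. The sketch below swaps in the standard library's html.parser so it runs without third-party packages. AriaLabelFinder and normalize are hypothetical names, and the class assumes the matching <span> elements are not nested. With BeautifulSoup, the same comparison could presumably be passed as a function in place of the attribute string.

```python
from html.parser import HTMLParser


def normalize(text):
    """Collapse every run of whitespace, including line breaks, to one space."""
    return " ".join(text.split())


class AriaLabelFinder(HTMLParser):
    """Collects the text of <span> elements whose aria-label matches
    the wanted label once whitespace is normalized on both sides."""

    def __init__(self, wanted_label):
        super().__init__()
        self.wanted = normalize(wanted_label)
        self.capturing = False
        self.matches = []

    def handle_starttag(self, tag, attrs):
        label = dict(attrs).get("aria-label")
        if tag == "span" and label is not None and normalize(label) == self.wanted:
            self.capturing = True
            self.matches.append("")

    def handle_data(self, data):
        if self.capturing:
            self.matches[-1] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self.capturing = False


html = '''<span itemprop="title" data-andiallelmwithtext="15" aria-current="page" aria-label="you in page
number 452">page 452</span>'''

finder = AriaLabelFinder("you in page number 452")
finder.feed(html)
print(finder.matches)
```

Because both sides are normalized, the search string can be written on one line regardless of where the page breaks the attribute value.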

ElementNotInteractableException: Message: Element could not be scrolled into view using GeckoDriver Firefox with Selenium WebDriver

How can I fix the error:
selenium.common.exceptions.ElementNotInteractableException: Message: Element <> could not be scrolled into view
when working with Firefox via Selenium?
None of the tips from the site helped me. I tried all the solutions I could find, including WebDriverWait and JS. One of the solutions gave:
selenium.common.exceptions.MoveTargetOutOfBoundsException: Message: (568, 1215) is out of bounds of viewport width (1283) and height (699)
I tried resizing the browser window, which also didn't work.
My code:
webdriverDir = "/Users/Admin/Desktop/MyVersion/geckodriver.exe"
home_url = 'https://appleid.apple.com/account/'
browser = webdriver.Firefox(executable_path=webdriverDir)
browser.get(home_url)
browser.find_element_by_css_selector("captcha-input").click()
A solution that throws a window size error:
actions = ActionChains(browser)
wait = WebDriverWait(browser, 10)
element = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "captcha-input")))
actions.move_to_element(element).perform()
element.click()
By the way, this same code works perfectly in Chrome. But it's obvious enough.

To send a character sequence to the <captcha-input> field you have to induce WebDriverWait for element_to_be_clickable(). You can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get('https://appleid.apple.com/account#!&page=create')
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.captcha-input input.generic-input-field"))).send_keys("JohnTit")
Using XPATH:
driver.get('https://appleid.apple.com/account#!&page=create')
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//captcha-input//input[@class='generic-input-field form-textbox form-textbox-text ']"))).send_keys("JohnTit")
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
