Hi, I am scraping a website using Selenium with Python. The code is below:
url ='https://www.hkexnews.hk/'
options = webdriver.ChromeOptions()
browser = webdriver.Chrome(chrome_options=options, executable_path=r'chromedriver.exe')
browser.get(url)
tier1 = browser.find_element_by_id('tier1-select')
tier1.click()
tier12 = browser.find_element_by_xpath('//*[@data-value="rbAfter2006"]')
tier12.click()
time.sleep(1)
tier2 = browser.find_element_by_id('rbAfter2006')
tier2.click()
tier22 = browser.find_element_by_xpath("//*[@id='rbAfter2006']//*[@class='droplist-item droplist-item-level-1']//*[text()='Circulars']")
tier22.click()
tier23 = browser.find_element_by_xpath("//*[@id='rbAfter2006']//*[@class='droplist-item droplist-item-level-2']//*[text()='Securities/Share Capital']")
tier23.click()
tier24 = browser.find_element_by_xpath("//*[@id='rbAfter2006']//*[@class='droplist-group droplist-submenu level3']//*[text()='Issue of Shares']")
tier24.click()
The code stops at tier23, raising ElementNotVisibleException. I have tried different classes, yet it still does not work. Thank you for your help.
There are two elements that match your XPath. The first one is hidden. Try the below to select the required element:
tier23 = browser.find_element_by_xpath("//*[@id='rbAfter2006']//li[@aria-expanded='true']//*[@class='droplist-item droplist-item-level-2']//*[text()='Securities/Share Capital']")
or, shorter:
tier23 = browser.find_element_by_xpath("//li[#aria-expanded='true']//a[.='Securities/Share Capital']")
tier23.location_once_scrolled_into_view  # accessing this property scrolls the element into view
tier23.click()
P.S. Note that the option will still not be visible until you scroll the list down first; I used tier23.location_once_scrolled_into_view for this purpose.
Also, it's better to use Selenium's built-in waits instead of time.sleep.
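To make that last suggestion concrete: Selenium's built-in waits poll a condition instead of sleeping for a fixed time. The commented lines show the real WebDriverWait API; `wait_until` below is only my illustrative stand-in for what such a wait does internally, not part of Selenium:

```python
import time

# Real usage with Selenium's built-in wait:
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#   from selenium.webdriver.common.by import By
#   WebDriverWait(browser, 10).until(
#       EC.element_to_be_clickable((By.XPATH, "//a[.='Securities/Share Capital']")))

def wait_until(condition, timeout=10, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout`
    seconds elapse; this is essentially what WebDriverWait.until does."""
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() > deadline:
            raise TimeoutError("condition not met within %ss" % timeout)
        time.sleep(poll)
```

Unlike time.sleep(1), this returns as soon as the condition holds and fails loudly when it never does.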
I recently started botting to try to get my hands on a better graphics card. My issue is this button (button html), which has no name or anything useful that would let my bot detect it. I have tried using its XPath but to no success: my XPath attempt (I used the real XPath, don't worry). Any help would be much appreciated :)
driver.execute_script('return document.querySelector("core-button").shadowRoot.querySelector("a")').click()
Use execute_script and then return the element inside the shadowRoot.
I used the following code to fix my issue:
browser = webdriver.Chrome()
def expand_shadow_element(element):
    shadow_root = browser.execute_script('return arguments[0].shadowRoot', element)
    return shadow_root
root1 = browser.find_element_by_tag_name('app-shell')
shadow_root1 = expand_shadow_element(root1)
root2 = shadow_root1.find_element_by_css_selector('request-page')
shadow_root2 = expand_shadow_element(root2)
root3 = shadow_root2.find_element_by_css_selector('box-content')
shadow_root3 = expand_shadow_element(root3)
pay_button = shadow_root3.find_element_by_css_selector("core-button")
pay_button.click()
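As an alternative to chaining expand_shadow_element calls, the whole descent can be done in one execute_script, as in the accepted one-liner above. Here is a small helper (my own naming, not part of Selenium) that builds the piercing JS expression for any chain of shadow hosts:

```python
def shadow_piercing_js(hosts, target):
    """Build the JS expression that walks each shadow host in `hosts`
    and returns `target` from the innermost shadow root."""
    expr = "document"
    for sel in hosts:
        expr += ".querySelector('%s').shadowRoot" % sel
    return "return %s.querySelector('%s')" % (expr, target)

# The four-hop traversal above collapses to a single script:
js = shadow_piercing_js(["app-shell", "request-page", "box-content"], "core-button")
# pay_button = browser.execute_script(js)
# pay_button.click()
```

Recent Selenium 4 releases also expose a WebElement.shadow_root property, which removes the need for execute_script in most cases.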
I am trying web scraping using Python Selenium and am getting the following error: Message: element not interactable (Session info: chrome=78.0.3904.108). I am trying to access the option element by id, value, or text, and it gives me this error. I am using Python 3. Can someone explain where I am going wrong? I selected the select tag using XPath and also tried css_selector; the select tag is found and I can print it. Here is my code for a better understanding:
Code 1:
path = r'D:\chromedriver_win32\chromedriver.exe'
browser = webdriver.Chrome(executable_path = path)
website = browser.get("https://publicplansdata.org/resources/download-avs-cafrs/")
el = browser.find_element_by_xpath('//*[@id="ppd-download-state"]/select')
for option in el.find_elements_by_tag_name('option'):
    if option.text is not None:
        option.click()
        break
Code 2:
select_element = Select(browser.find_element_by_xpath('//*[@id="ppd-download-state"]/select'))
# this will print out strings available for selection on select_element, used in visible text below
print([o.value for o in select_element.options])
select_element.select_by_value('AK')
Both codes give the same error. How can I select values from the drop-down on the website?
Same as this question:
Python selenium select element from drop down. Element Not Visible Exception
But the error is different. I have tried the methods in the comments.
State, Plan, and Year:
browser.find_element_by_xpath("//span[text()='State']").click()
browser.find_element_by_xpath("//a[text()='West Virginia']").click()
time.sleep(2) # wait for Plan list to be populated
browser.find_element_by_xpath("//span[text()='Plan']").click()
browser.find_element_by_xpath("//a[text()='West Virginia PERS']").click()
time.sleep(2) # wait for Year list to be populated
browser.find_element_by_xpath("//span[text()='Year']").click()
browser.find_element_by_xpath("//a[text()='2007']").click()
Don't forget to import time
This question already has answers here:
How to get text with Selenium WebDriver in Python (9 answers)
Closed 3 years ago.
I am trying to get the content of
<span class="noactive">0 Jours restants</span>
(which is the expiration date of the warranty), but I don't know how to get it (I need to print it to a file).
My code:
def scrapper_lenovo(file, line):
    CHROME_PATH = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
    CHROMEDRIVER_PATH = r'C:\webdriver\chromedriver'
    WINDOW_SIZE = "1920,1080"
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--window-size=%s" % WINDOW_SIZE)
    chrome_options.binary_location = CHROME_PATH
    d = webdriver.Chrome(executable_path=CHROMEDRIVER_PATH,
                         chrome_options=chrome_options)
    d.get("https://pcsupport.lenovo.com/fr/fr/warrantylookup")
    search_bar = d.find_element_by_xpath('//*[@id="input_sn"]')
    search_bar.send_keys(line[19])
    search_bar.send_keys(Keys.RETURN)
    time.sleep(4)
    try:
        warrant = d.find_element_by_xpath('//*[@id="W-Warranties"]/section/div/div/div[1]/div[1]/div[1]/p[1]/span')
        file.write(warrant)
    except:
        print("test")
        pass
    if "In Warranty" not in d.page_source:
        file.write(line[3])
        file.write("\n")
    d.close()
I tried, as you can see, to print the content of 'warrant' but couldn't find any function allowing it (I saw some examples using .text or .getText(), but for whatever reason I couldn't get them to work).
You could try explicitly matching the required tag, the relevant XPath expression would be:
//span[#class='noactive']
I would also recommend getting rid of this time.sleep() function; it's a form of performance anti-pattern. If you need to wait for the presence/visibility/invisibility/absence of a certain element, you should rather go for an Explicit Wait.
So remove these lines:
time.sleep(4)
warrant = d.find_element_by_xpath('//*[#id="W-Warranties"]/section/div/div/div[1]/div[1]/div[1]/p[1]/span')
and use this one instead:
warrant = WebDriverWait(driver, 10).until(expected_conditions.presence_of_element_located((By.XPATH, "//span[#class='noactive']")))
More information: How to use Selenium to test web applications using AJAX technology
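One detail the answer leaves implicit: file.write() expects a string, and warrant is a WebElement, so the question's file.write(warrant) will raise a TypeError. The element's visible text is in its .text property (there is no .gettext()). A sketch with an in-memory file and a stand-in element, since no browser is running here:

```python
import io

class FakeElement:
    """Stand-in for a Selenium WebElement; the real one exposes .text
    the same way, as a plain string of the element's visible text."""
    text = "0 Jours restants"

warrant = FakeElement()          # in the real code: the element located above
out = io.StringIO()              # in the real code: the output file
out.write(warrant.text + "\n")   # .text, not the element itself
```

With the real file object, file.write(warrant.text) is the one-line fix.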
I have tried the code given below, but every time I run it some links are missing. I want to get all the links on the page into a list, so that I can go to any link I want using slicing.
links = []
eles = driver.find_elements_by_xpath("//*[@href]")
for elem in eles:
    url = elem.get_attribute('href')
    print(url)
    links.append(url)
Is there any way to get all the elements without missing any?
Sometimes the links reside inside frames.
Search for the frames on your website using inspect.
You need to switch to the frame first:
browser.switch_to.frame("x1")
links = []
eles = driver.find_elements_by_xpath("//*[@href]")
for elem in eles:
    url = elem.get_attribute('href')
    print(url)
    links.append(url)
browser.switch_to.default_content()
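Putting the two pieces together: collect the hrefs from the default content, then from each frame, and merge the lists. The Selenium calls are shown as comments; merge_links and the sample URLs are mine, for illustration only:

```python
def merge_links(*link_lists):
    """Merge link lists gathered from the main page and from each frame,
    dropping None values and duplicates while preserving first-seen order."""
    seen, merged = set(), []
    for links in link_lists:
        for url in links:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# main_links  = [e.get_attribute('href') for e in driver.find_elements_by_xpath("//*[@href]")]
# driver.switch_to.frame("x1")
# frame_links = [e.get_attribute('href') for e in driver.find_elements_by_xpath("//*[@href]")]
# driver.switch_to.default_content()
all_links = merge_links(["https://a.example", "https://b.example"],
                        ["https://b.example", "https://c.example", None])
```

Deduplicating matters because the same navigation links often appear both in the top document and inside a frame.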
I am trying to pull out the names of all courses offered by Lynda.com together with the subject, so that an entry on my list appears as '2D Drawing -- Project Soane: Recover a Lost Monument with BIM with Paul F. Aubin'. So I am trying to write a script that will go to each subject on http://www.lynda.com/sitemap/categories and pull out the list of courses. I already managed to get Selenium to go from one subject to another and pull the courses. My only problem is that there is a button 'See X more courses' to reveal the rest of the courses. Sometimes you have to click it a couple of times; that's why I used a while loop. But Selenium doesn't seem to execute this click. Does anyone know why?
This is my code:
from selenium import webdriver
url = 'http://www.lynda.com/sitemap/categories'
mydriver = webdriver.Chrome()
mydriver.get(url)
course_list = []
for a in [1,2,3]:
    for b in range(1,73):
        mydriver.find_element_by_xpath('//*[@id="main-content"]/div[2]/div[3]/div[%d]/ul/li[%d]/a' % (a,b)).click()
        while True:
            # click the button 'See more results' as long as it's available
            try:
                mydriver.find_element_by_xpath('//*[@id="main-content"]/div[1]/div[3]/button').click()
            except:
                break
        subject = mydriver.find_element_by_tag_name('h1')   # pull out the subject
        courses = mydriver.find_elements_by_tag_name('h3')  # pull out the courses
        for course in courses:
            course_list.append(str(subject.text) + " -- " + str(course.text))
        # go back to the initial site
        mydriver.get(url)
Scroll to element before clicking:
see_more_results = browser.find_element_by_css_selector('button[class*=see-more-results]')
browser.execute_script('return arguments[0].scrollIntoView()', see_more_results)
see_more_results.click()
One way to repeat these actions could be:
def get_number_of_courses():
    return len(browser.find_elements_by_css_selector('.course-list > li'))

number_of_courses = get_number_of_courses()
while True:
    try:
        button = browser.find_element_by_css_selector(CSS_SELECTOR)
        browser.execute_script('return arguments[0].scrollIntoView()', button)
        button.click()
        while True:
            new_number_of_courses = get_number_of_courses()
            if new_number_of_courses > number_of_courses:
                number_of_courses = new_number_of_courses
                break
    except:
        break
Caveat: it's always better to use a built-in explicit wait than while True:
http://www.seleniumhq.org/docs/04_webdriver_advanced.jsp#explicit-waits
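Combining the scroll-and-click with the wait-for-growth idea, the retry loop can also be made bounded so it cannot spin forever. This sketch is browser-agnostic: click_button and count_items are hypothetical callables of my own that would wrap the Selenium calls shown above:

```python
import time

def click_until_exhausted(click_button, count_items, timeout=5, poll=0.1):
    """Click 'See more' until the button is gone, or until a click stops
    producing new items within `timeout` seconds.

    click_button() -> bool : clicks the button, False if it no longer exists
    count_items()  -> int  : number of items currently rendered
    """
    while True:
        before = count_items()
        if not click_button():           # button gone: everything is loaded
            return before
        deadline = time.time() + timeout
        while count_items() <= before:   # wait for the new batch to render
            if time.time() > deadline:
                return count_items()     # clicked, but nothing new appeared
            time.sleep(poll)
```

With Selenium, click_button would wrap the find/scrollIntoView/click sequence in a try/except that returns False when the button is missing, and count_items would be get_number_of_courses from above.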
The problem is that you're calling the method that finds an element by class name, but you're passing it an XPath. If you're sure this is the correct XPath, you simply need to change the method to find_element_by_xpath.
A recommendation, if you allow: try to stay away from these long XPaths and go through some tutorials on how to write efficient XPath expressions.
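To illustrate that recommendation: a position-based path breaks as soon as the page layout shifts, while a content-based one survives. A quick demonstration using the standard library's limited XPath support (the HTML snippet is made up; Selenium's XPath engine accepts the same relative expressions):

```python
import xml.etree.ElementTree as ET

page = ET.fromstring(
    "<div><ul>"
    "<li><a href='/courses'>Courses</a></li>"
    "<li><a href='/pricing'>Pricing</a></li>"
    "</ul></div>"
)
brittle = page.find("./ul/li[2]/a")      # breaks if an <li> is inserted above
robust = page.find(".//a[.='Pricing']")  # keyed on stable, visible content
```

Both currently find the same link, but only the second keeps working after any sibling is added, which is exactly why long auto-generated paths like //*[@id="main-content"]/div[2]/div[3]/... are fragile.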