my code not working when reaching the second loop. when I hover over the first category it's showing the second category and I need to hover the second category to see the third category. here is my code:
driver.get("https://www.daraz.com.bd/")
main_category = driver.find_elements(By.CSS_SELECTOR , '.lzd-site-menu-root-item span')
for i in main_category:
hover = ActionChains(driver).move_to_element(i)
hover.perform()
time.sleep(1)
sub_category_one = driver.find_elements(By.CSS_SELECTOR , ".Level_1_Category_No1 [data-spm-anchor-id] span")
for i in sub_category_one:
hover = ActionChains(driver).move_to_element(i)
hover.perform()
To Mouse Hover over the first category which shows the second categories and then to Hover over the second categories to see the third categories you need to induce WebDriverWait for the visibility_of_all_elements_located() and you can use the following locator strategies:
Code block:
driver.get('https://www.daraz.com.bd/')
main_category = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".lzd-site-menu-root-item span")))
for i in main_category:
ActionChains(driver).move_to_element(i).perform()
time.sleep(1)
sub_category_one = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "ul.Level_1_Category_No1 li.lzd-site-menu-sub-item")))
for j in sub_category_one:
ActionChains(driver).move_to_element(j).perform()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
First of to scrape the site, bs4 and iterating over the lists seems like a much better approach.
Now find_elements returns a list. You are iterating over a list with only one value in your second for loop. When I inspected the page I noticed that a submenu or subsubmenu that is active is assigned the class lzd-site-menu-sub-active and lzd-site-menu-grand-active.
main_category = driver.find_elements(By.CSS_SELECTOR, ".lzd-site-menu-root-item span")
for main in main_category:
ActionChains(driver).move_to_element(main).perform()
sub_category = WebDriverWait(driver, 3).until(
lambda x: x.find_elements(By.CSS_SELECTOR, ".lzd-site-menu-sub-item span")
)
for sub in sub_category:
ActionChains(driver).move_to_element(sub).perform()
subsub_category = WebDriverWait(driver, 3).until(
lambda x: x.find_elements(By.CSS_SELECTOR, ".lzd-site-menu-grand-item span")
)
for subsub in subsub_category:
ActionChains(driver).move_to_element(subsub).perform()
This code manages to go hover over the third level as you will see. However because of the bad CSS_Selector it's somewhat useless.
I hope this could be of help.
Related
I wrote a python script that goes to a site, and interacts with some dropdowns. It works perfectly fine if after I run the script, quickly make the browser instance full screen so that the elements are in view. If I don't do that, I get the error "Element could not be scrolled into view".
Here is my script:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://example.com")
driver.implicitly_wait(5)
yearbtn = driver.find_element("id", "dropdown_year")
yearbtn.click()
year = driver.find_element("css selector", '#dropdown_ul_year li:nth-child(5)')
year.click()
makebtn = driver.find_element("id", "dropdown_make")
makebtn.click()
make = driver.find_element("css selector", '#dropdown_ul_make li:nth-child(2)')
make.click()
modelbtn = driver.find_element("id", "dropdown_model")
modelbtn.click()
model = driver.find_element("css selector", '#dropdown_ul_model li:nth-child(2)')
model.click()
trimbtn = driver.find_element("id", "dropdown_trim")
trimbtn.click()
trim = driver.find_element("css selector", '#dropdown_ul_trim li:nth-child(2)')
trim.click()
vehicle = driver.find_element("css selector", '#vehiclecontainer > div > p')
vdata = driver.find_element("css selector", '.top-sect .tow-row:nth-child(2)')
print("--------------")
print("Your Vehicle: " + vehicle.text)
print("Vehicle Data: " + vdata.text)
print("--------------")
print("")
driver.close()
Like I said, it works fine if I make the browser full-screen (or manually scroll) so that the elements in question are in view. It finds the element, so what's the issue here? I've tried both Firefox and Chrome.
Generally firefox opens in maximized mode. Incase for specific version of os it doesn't, the best practice is to open it in maximized mode as follows:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument("start-maximized")
driver = webdriver.Firefox(options=options)
However in somecases the desired element may not get rendered within the due coarse, so you need to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following locator strategies:
vehicle = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#vehiclecontainer > div > p")))
vdata = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".top-sect .tow-row:nth-child(2)")))
print("--------------")
print("Your Vehicle: " + vehicle.text)
print("Vehicle Data: " + vdata.text)
print("--------------")
print("")
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
I'm not sure what all is happening on the page in question. I use a generic method to take care of common actions like click(), find(), etc. that contain a wait, scroll, and whatever else is useful for that action. I've put an example below. It waits for the element to be clickable (always a best practice), scrolls to the element, and then clicks it. You may want to tweak it to your specific use but it will get you pointed in the right direction.
def click(self, locator):
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable(locator))
driver.execute_script("arguments[0].scrollIntoView();", element)
element.click()
I would suggest you try this and then convert all of your clicks to use this method and see if that helps, e.g.
click((By.ID, "dropdown_year"))
or
locator = (By.ID, "dropdown_year")
click(locator)
I am trying to scrape data from this website using Selenium.
There are three features in data, "Value", "Net change" and "percent change", including values for net and percentage changes for 1, 3, 6, and 12 months. I want to fetch 1 month's net change and percent change. For that, I need to click on the check boxes and click on the update button.
Now, I performed these actions using selenium's find element by XPath method but for percent change, I needed to use the ActionChains command, as I was getting "Element not clickable error".
When I execute the code, all three features should occur in the downloaded csv. But that's not happening. I am just able to fetch "Value" and "1 Month Net change". If anyone knows, may I know, why the is not getting updated and or how to fix it? Thanks
My code:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
driver.get("https://beta.bls.gov/dataViewer/view/timeseries/CUUR0000SA0")
soup = BeautifulSoup(driver.page_source, "html.parser", from_encoding='utf-8')
driver.find_element(By.XPATH, '/html/body/div[2]/div/div/div[4]/div/div[1]/form/div[2]/fieldset/div[1]/table/tbody/tr[1]/td[1]/label/input').click() //1 month net change
element = WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.XPATH, '// [#id="percent_monthly_changes_div"]/table/tbody/tr[1]/td[1]/label/input')))
ActionChains(driver).move_to_element(element).click().perform() //1 month percent change
driver.find_element(By.XPATH, '/html/body/div[2]/div/div/div[4]/div/div[1]/form/div[4]/input').click() //update button
driver.find_element(By.XPATH, '//*[#id="csvclickCU"]').click() //download csv button
The website is showing N/A in the column of 1 Month Net change.
if you still not getting 1 month % change value you can do
driver.execute_script('document.querySelector("#percent_monthly_changes_div > table > tbody > tr:nth-child(1) > td:nth-child(1) > label > input").click()')
instead of:
element = WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.XPATH, '// [#id="percent_monthly_changes_div"]/table/tbody/tr[1]/td[1]/label/input')))
ActionChains(driver).move_to_element(element).click().perform()
this might not be the optimal solution, but it works fine.
and 1 month net change value is not given from the website itself.
To click on the elements with text as 1-Month Net Change and 1-Month % Change using ActionChains will be an overhead and you can avoid it easily.
Ideally, you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following locator strategies:
Using CSS_SELECTOR:
driver.get("https://beta.bls.gov/dataViewer/view/timeseries/CUUR0000SA0")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[value='1N']"))).click()
driver.find_element(By.CSS_SELECTOR, "input[value='1P']").click()
Using XPATH:
driver.get("https://beta.bls.gov/dataViewer/view/timeseries/CUUR0000SA0")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[#value='1N']"))).click()
driver.find_element(By.XPATH, "//input[#value='1P']").click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Browser Snapshot:
I want to select an option that displays only after you've clicked on the dropdown (see attached image). I have been able to click on the dropdown to get the list, but have not been able to figure out how to click on an option, say option 1, 'Last Day' after the list comes into picture.
Here's what I have so far:
from selenium import webdriver
binary = FirefoxBinary('C:\\Program Files\\Firefox Developer Edition\\firefox.exe')
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = True
url='https://www.glassdoor.com/Job/jobs.htm?suggestCount=0&suggestChosen=false&clickSource=searchBtn&typedKeyword=data+sc&sc.keyword=data+scientist&locT=C&locId=1154532&jobType='
driver = webdriver.Firefox(firefox_binary=binary, capabilities=cap, executable_path=GeckoDriverManager().install())
driver.get(url=url)
driver.implicitly_wait(10)
driver.maximize_window()
# clicking on dropdown
d = driver.find_element_by_id('filter_fromAge')
d.click()
I also tried using the following code (found on another SO answer) but it did not work either:
element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"ul#css-1dv4b0s ew8xong0")))
I'm new to web scraping and not really familiar with XPATHs and how to deal with actions. Help appreciated!
You can use Java script to click on your element. As element is present in DOM but its only appears when click on drop down, so normal click method mau work or may not. But with JS it will always click. Can use below code:
day = driver.find_element_by_xpath("//span[contains(text(),'Last Day')]") #Identify your element
driver.execute_script("arguments[0].click();", day) # CLick it with help of JS
out Put:
You can try the following approach:
driver.get('https://www.glassdoor.com/Job/boston-data-scientist-jobs-SRCH_IL.0,6_IC1154532_KO7,21.htm')
driver.maximize_window()
expand_element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, 'filter_fromAge')))
expand_element.click()
target_text = 'Last 3 Days'
target_element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//ul[#class="css-1dv4b0s ew8xong0"]/li/span[text()="{}"]'.format(target_text))))
target_element.click()
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//div[#id="filter_fromAge"]/span[text()="{}"]'.format(target_text))))
To select the option with text as Last Day that displays only after clicking on the dropdown you need to induce WebDriverWait for the element_to_be_clickable() and you can use the following Locator Strategy:
Using XPATH:
driver.get('https://www.glassdoor.com/Job/jobs.htm?suggestCount=0&suggestChosen=false&clickSource=searchBtn&typedKeyword=data+sc&sc.keyword=data+scientist&locT=C&locId=1154532&jobType=')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div#filter_fromAge>span"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[#id='PrimaryDropdown']/ul//li//span[#class='label' and contains(., 'Last Day')]"))).click()
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Browser Snapshot:
Hey I am trying to make a simple Insta bot using selenium.I reached up to my following list by automating but now I don't know how I can scroll down the list of my following/followers. I want to grab the account from my following and followers list and compare it and make a list of the accounts who havenot followed me back.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from time import sleep
from secrets import password,username
class Instabot:
def __init__(self,username,password):
self.driver = webdriver.Chrome(executable_path="C:\\Users\\user\\PycharmProjects\\chromedriver_win32\\chromedriver.exe")
self.driver.get("https://www.instagram.com")
sleep(5)
self.driver.find_element_by_xpath('//*[#id="react-root"]/section/main/article/div[2]/div[1]/div/form/div[2]/div/label/input').send_keys(username) #searching the username box and giving it username
self.driver.find_element_by_xpath('//*[#id="react-root"]/section/main/article/div[2]/div[1]/div/form/div[3]/div/label/input').send_keys(password)#searching the password box and giving it password
self.driver.find_element_by_xpath('//*[#id="react-root"]/section/main/article/div[2]/div[1]/div/form/div[4]/button').click() #clicking login button
wait=WebDriverWait(self.driver,10)
notNowButton = wait.until(
lambda d: d.find_element_by_xpath('//button[text()="Not Now"]'))#it will click on first notnow button
notNowButton.click()
next_not_now=wait.until(lambda notnow:notnow.find_element_by_xpath('//button[text()="Not Now"]'))
next_not_now.click() #it will click on second not now button
def get_unfollowers(self):
wait = WebDriverWait(self.driver, 10)
clickprofile= wait.until(lambda a: a.find_element_by_xpath('//*[#id="react-root"]/section/nav/div[2]/div/div/div[3]/div/div[5]/a'))
clickprofile.click()
following_list = wait.until(EC.element_to_be_clickable((By.XPATH, "//a[contains(#href,'/following')]")))
following_list.click()#IT clicks the following and gives window of following listy
print("CLicked following list")
my_bot=Instabot(username,password)
my_bot.get_unfollowers()
I saw about execute_script() But I dont know what to put under those brackets.
Well I figure it out. This code works for me now!
fBody = self.driver.find_element_by_css_selector("div[class='isgrP']")
scrolling_times=(numoffollowers/4)
scroll=0
scroll_count = scrolling_times+5 # You can use your own logic to scroll down till the bottom
while scroll < scroll_count:
self.driver.execute_script(
'arguments[0].scrollTop = arguments[0].scrollTop + arguments[0].offsetHeight;',
fBody)
sleep(2)
scroll += 1
You can use something like that :
self.driver.execute_script("window.scrollTo(0, Y)")
with Y : how much you want to scroll in pixel.
It's up to you to see how far down you have to go down.
The code is working itself, but when thrown to loop not working anymore.
driver.get("https://www.esky.pl/okazje/16578/WMI-EDI-FR")
i = 1
departure_date_clickable = False
while departure_date_clickable == False:
try:
time.sleep(5)
xpath ="/html/body/div[4]/div[3]/div/div[1]/div[1]/div[2]/div/div[4]/div/div[{}]".format(i)
find_ele = driver.find_element_by_xpath(xpath)
find_ele.click()
print("Departure:Found clickable date on " + str(i))
departure_date_clickable = True
except WebDriverException:
print("Departure date not clickable, checking next day")
i += 1
continue
I expect to click first element able to be clicked from the calendar. But for some reason it's problem for selenium when in loop.
Code that's working:
xpath = "/html/body/div[4]/div[3]/div/div[1]/div[1]/div[2]/div/div[4]/div/div[{}]".format("4")
find_ele = driver.find_element_by_xpath(xpath)
time.sleep(2)
find_ele.click()
I recommend identifying clickable divs by a class attribute unique to those divs. It looks like all the "clickable" links have a class called "offer" so you could add an if else condition to check each element for that class. I also added a termination condition to your loop since there are 35 divs in the block.
driver.get("https://www.esky.pl/okazje/16578/WMI-EDI-FR")
i = 1
departure_date_clickable = False
while departure_date_clickable == False and i <= 35:
try:
time.sleep(1)
xpath ="/html/body/div[4]/div[3]/div/div[1]/div[1]/div[2]/div/div[4]/div/div[{}]".format(i)
find_ele = driver.find_element_by_xpath(xpath)
if "offer" in find_ele.get_attribute("class").split(" "):
find_ele.click()
print("Departure:Found clickable date on " + str(i))
departure_date_clickable = True
else:
raise error()
except:
print("Departure date not clickable, checking next day")
i += 1
continue
Please find below solution, There are few issues observed while iterating through a list element. Kindly note if you want to select any specific departure date then you need to put that condition in the for loop. At the moment as per your above code we are just click all departure dates
from selenium.common.exceptions import WebDriverException
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.support.ui import WebDriverWait as Wait
from selenium.webdriver.common.action_chains import ActionChains
driver = webdriver.Chrome('C:\New folder\chromedriver.exe')
driver.maximize_window()
driver.get("https://www.esky.pl/okazje/16578/WMI-EDI-FR")
i = 1
departure_date_clickable = False
while departure_date_clickable == False:
try:
xpath = WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.XPATH, "//div[#id='departure-calendar']//div[#class='month-days']/div/div")))
for value in xpath:
value.click()
departure_date_clickable = True
except WebDriverException:
I had the same problem in R while using selenium, I found that running the command to click twice sequentially within the loop works.
When the click was only present once it seemed to highlight the element, but not click on it.
While the language is different, the logic might hold.
I have provided the code below in a kind of pseudo form so the logic can be seen.
The loop would iterate over different webpages within a list (represented by 'list') with the same element (represented by '[xpath]') to be clicked in each webpage in turn.
library(RSelenium) # load package
# set up the driver
fprof <-getFirefoxProfile('[pathtoprofile]', useBase = TRUE)
rd <- rsDriver(browser = "firefox", port = 4567L,
extraCapabilities = fprof
)
ffd <- rd$client
for (i in 1:length(list)){
ffd$open() # open the browser
ffd$navigate(list[i]) # navigate to page
Sys.sleep(10) # wait
ffd$findElement("xpath", '[xpath]')$clickElement() # click element first time
ffd$findElement("xpath", '[xpath]')$clickElement() # click element second time
Sys.sleep(5) # wait
ffd$close() # close the browser
}
You can get all departure days with offer using:
driver.find_elements_by_css_selector("#departure-calendar .cell-day.number.offer")
For arrival calendar:
driver.find_elements_by_css_selector("#arrival-calendar .cell-day.number.offer")
Code below click on each day with offer in departure calendar and print the date:
driver.get("https://www.esky.pl/okazje/16578/WMI-EDI-FR")
departure_days = driver.find_elements_by_css_selector("#departure-calendar .cell-day.number.offer")
for departure_day in departure_days:
departure_day.click()
# print selected date
print(departure_day.get_attribute("data-qa-date"))
If you are just looking for the first available date, you can use the CSS selector below
div.cell-day.offer
Your code would look like
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver.get("https://www.esky.pl/okazje/16578/WMI-EDI-FR")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.cell-day.offer"))).click()