Webscraping with selenium, click() line only works after x number of tries - python-3.x

Hi, I have a script that scrapes a website based on filters. It was working fine until the latest Chrome and chromedriver update; since then, one of the filters fails to click.
I have a try/except loop that repeats up to 10 times. Sometimes this is enough, and other times I need more than 15 attempts at the same line of code before it eventually works.
Here is my code to open the webpage:
from selenium import webdriver
url='https://www.marketscreener.com/stock-exchange/calendar/finance/'
options=webdriver.ChromeOptions()
chrome_prefs = {}
options.experimental_options["prefs"]=chrome_prefs
chrome_prefs["profile.default_content_settings"] = {"popups":1}
options.add_argument('--disable-browser-side-navigation')
options.add_argument('--disable-infobars')
options.add_argument('--disable-extensions')
options.add_argument('--disable-popup-blocking')
options.add_argument('--disable-notifications')
driver = webdriver.Chrome(executable_path=r'foo.exe',options=options)
After I log in with my credentials, I try this line of code:
driver.find_element_by_id("selCountry").click()
which opens the country filter drop-down menu. Most of the time it takes an arbitrary number of tries (more than 10) before it works.
I am using Chrome version 87.0.4280.141 and its chromedriver with Python 3.7.
Can anyone provide a better solution to my problem, or explain why it succeeds only after x number of tries?
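One common cause of a click that only works after several tries is that the element is already in the DOM but not yet clickable (still rendering, or covered by an overlay). Instead of a fixed number of attempts, the retry can be bounded by time. The sketch below is only an illustration: `click_when_ready` and its parameters are hypothetical names, and it assumes the failure shows up as an exception raised by `click()`.

```python
import time


def click_when_ready(find_element, timeout=10.0, poll=0.25, sleep=time.sleep):
    """Call find_element() and click the result, retrying until timeout seconds pass.

    find_element is any zero-argument callable returning an object with .click(),
    for example: lambda: driver.find_element_by_id("selCountry")
    Returns the number of attempts that were needed.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            # Look the element up again on every attempt so a re-rendered
            # element is not clicked through a stale reference.
            find_element().click()
            return attempts
        except Exception:
            # In real use, narrow this to the Selenium exceptions you actually
            # see (for example ElementClickInterceptedException).
            if time.monotonic() >= deadline:
                raise
            sleep(poll)
```

Selenium's own explicit wait does much the same thing and is probably the first thing to try: `WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "selCountry"))).click()`.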

Related

Selenium Python - Internet Explorer - Problem

I'm new to Selenium and I'm trying to do a task using Internet Explorer (I'm using Python 3.8).
To understand Selenium's commands, I tried to run the simple code below:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver=webdriver.Ie(executable_path="C:\\Users\\usuario\\Downloads\\IEDriverServer.exe")
driver.get("https://www.google.com.br/")
driver.set_page_load_timeout(10)
driver.find_element_by_name("q").send_keys("Ronaldinho Gaucho")
driver.find_element_by_name("btnK").send_keys(Keys.ENTER)
driver.maximize_window()
driver.refresh()
The page opens, but nothing is typed into the search bar on the Google website. I saw this code in a YouTube video where it worked well, but when I try it on my computer it does not work (and it does not raise any error in my terminal).
Can anyone help? What should I be looking for?
Thanks in advance

Internet Clicks using Python Module

PROBLEM:
Power BI, under the pro-license, only allows data sources to be refreshed at most 8 times in a day and in 30 minute increments (using AM/PM timing).
This takes away my ability to make near real time decisions. So currently, my visualization is only updating every hour starting at 9:30 and the update happens 8 times, i.e. the total update times are 9:30, 10:30, 11:30, 12:30, 1:30, 2:30, 3:30, 4:30.
WORK AROUND:
So, to bypass the Pro license limit on refreshes, I wrote code using PyAutoGUI that logs in to my Power BI server and clicks the refresh button for me, and I run it every 5 minutes.
WORK AROUND ISSUE:
The problem is that PyAutoGUI only works while the computer is active, i.e. while I am logged in.
REQUEST:
What modules exist that would let me perform the same task in the background (without the computer needing to be logged in or awake)?
NOTE:
I have done a search for packages from the command prompt using the code pip search mouse, pip search click, etc. but this is not the best use of time.
This is very much possible using selenium.
Firstly, ensure you have selenium installed in your environment. Run the following command:
pip install selenium
Then you'll need the chromedriver executable that matches your Chrome version, either on your PATH or referenced explicitly in your code (the latter option means you can use a config file, as I have below, to avoid posting credentials). You can download chromedriver for your version of Chrome here:
https://chromedriver.chromium.org/downloads
(You can do this with any browser; my choice is Chrome, and you will have to adjust the code for another one.)
Now the code below should work for you.
NOTE: The time module is essential here because, if your code runs faster than the DOM loads, Selenium raises an error, usually one saying it cannot find the elements you are looking for (because they haven't loaded yet).
'''REQUIRED PACKAGES'''
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
import time
'''CONFIG FILE FOR CHROME EXECUTABLE & LOGIN INFO'''
from _config import chrome_executable, user_name, user_pass
'''LINK TO SIGN IN PAGE FOR POWER BI'''
powerbi = 'https://app.powerbi.com/home?redirectedFromSignup=1&noSignUpCheck=1&response=AlreadyAssignedLicense'
'''SET CHROME OPTIONS TO RUN IN INCOGNITO AND HEADLESS(no browser window)'''
option = webdriver.ChromeOptions()
option.headless = True
option.add_argument('--incognito')  # ChromeOptions has no "incognito" attribute; pass the Chrome switch instead
'''CREATE BROWSER OBJECT / NAVIGATE TO POWER BI / MAXIMIZE WINDOW'''
browser = webdriver.Chrome(executable_path=chrome_executable, options=option)
browser.get(powerbi)
browser.maximize_window()
'''WAIT / FIND AND FILL IN EMAIL FIELD / FIND AND CLICK NEXT BUTTON'''
time.sleep(4)
bi_email = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.ID, 'i0116')))
bi_email.send_keys(user_name)
bi_email.send_keys(Keys.ENTER)
'''WAIT / FIND AND FILL IN PASSWORD FIELD / FIND AND CLICK ON LOGIN BUTTON'''
time.sleep(4)
bi_pass = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.ID, 'i0118')))
bi_pass.send_keys(user_pass)
bi_pass.send_keys(Keys.ENTER)
'''THIS IS THE PROMPT THAT ASKS IF YOU WANT TO REMAIN LOGGED IN'''
'''WAIT / STAY ON CURRENT SELECTION / MOVE AND CLICK ON ADJACENT SELECTION'''
time.sleep(4)
yes_choice = browser.find_element_by_id('idSIButton9')
yes_choice.send_keys(Keys.SHIFT, Keys.TAB, Keys.ENTER)
'''WAIT / FIND AND CLICK ON MY WORKSPACE'''
time.sleep(7)
my_workspace_button = browser.find_element_by_class_name('workspaceName')
my_workspace_button.click()
'''CREATE MOUSE OBJECT'''
action = ActionChains(browser)
'''WAIT / FIND CORRECT ROW / CLICK ON REFRESH'''
time.sleep(4)
stock_alerts = browser.find_elements_by_xpath("//*[ text() = 'NAME OF YOUR DATA SOURCE' ]")
action.move_to_element(stock_alerts[-1]).click().send_keys(Keys.TAB).send_keys(Keys.ENTER).perform()
'''WAIT / CLOSE BROWSER'''
time.sleep(3)
browser.close()
This is the poor man's way of bypassing Power BI's strict limit of 8 refreshes for a basic account. It also helps avoid having to buy a costly plan just to increase the number of scheduled data source updates.
Hope it helps
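Since the question mentions running the refresh every 5 minutes, here is a minimal sketch of a loop that could drive it. `run_every` is a hypothetical helper, not part of Selenium, and it assumes the script above has been wrapped in a function (called `refresh_power_bi` in the comment below, also a made-up name). An OS scheduler such as Task Scheduler or cron would be an alternative to keeping a Python process alive.

```python
import time


def run_every(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Run task() repeatedly, pausing interval_seconds between runs.

    A failing run is reported and skipped so the loop keeps going.
    Returns the number of runs that completed without an exception.
    """
    succeeded = 0
    runs = 0
    while max_runs is None or runs < max_runs:
        runs += 1
        try:
            task()
            succeeded += 1
        except Exception as exc:
            print(f"run {runs} failed: {exc!r}")
        # Do not sleep after the final run.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return succeeded


# Hypothetical usage: run_every(refresh_power_bi, 5 * 60)
```

Because the browser runs headless, this does not need an active desktop session the way PyAutoGUI does, though the machine itself still has to be switched on and awake.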

Selenium webdriver python element screenshot not working properly

I looked up Selenium python documentation and it allows one to take screenshots of an element. I tried the following code and it worked for small pages (around 3-4 actual A4 pages when you print them):
from selenium import webdriver
from selenium.webdriver import FirefoxOptions
firefox_profile = webdriver.FirefoxProfile()
firefox_profile.set_preference("browser.privatebrowsing.autostart", True)
# Configure options for Firefox webdriver
options = FirefoxOptions()
options.add_argument('--headless')
# Initialise Firefox webdriver
driver = webdriver.Firefox(firefox_profile=firefox_profile, options=options)
driver.maximize_window()
driver.get(url)
driver.find_element_by_tag_name("body").screenshot("career.png")
driver.close()
When I try it with url="https://waitbutwhy.com/2020/03/my-morning.html", it gives the screenshot of the entire page, as expected. But when I try it with url="https://waitbutwhy.com/2018/04/picking-career.html", almost half of the page is not rendered in the screenshot (the image is too large to upload here), even though the "body" tag does extend all the way down in the original HTML.
I have tried using both implicit and explicit waits (set to 10s, which is more than enough for a browser to load all contents, comments and discussion section included), but that has not improved the screenshot capability. Just to be sure that selenium was in fact loading the web page properly, I tried loading without the headless flag, and once the webpage was completely loaded, I ran driver.find_element_by_tag_name("body").screenshot("career.png"). The screenshot was again half-blank.
It seems that there might be some memory constraints put on the screenshot method (although I couldn't find any), or the logic behind the screenshot method itself is flawed. I can't figure it out though. I simply want to take the screenshot of the entire "body" element (preferably in a headless environment).
You can try the code below; you first need to install a package from the command prompt with pip install Selenium-Screenshot
import time
from selenium import webdriver
from Screenshot import Screenshot_Clipping
driver = webdriver.Chrome()
driver.maximize_window()
driver.implicitly_wait(10)
driver.get("https://waitbutwhy.com/2020/03/my-morning.html")
obj=Screenshot_Clipping.Screenshot()
img_loc=obj.full_Screenshot(driver, save_path=r'.', image_name='capture.png')
print(img_loc)
time.sleep(5)
driver.close()
The saved screenshot covers the full page; you just need to zoom in on it.
Hope this works for you!
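For pages too tall to capture in one go, a general alternative is to take viewport-sized screenshots at successive scroll positions and stitch them together afterwards. The sketch below covers only the planning step, i.e. which scroll offsets are needed. `scroll_offsets` is a hypothetical helper, and nothing here is claimed about how the package above works internally.

```python
def scroll_offsets(page_height, viewport_height):
    """Return the vertical scroll positions needed to cover the whole page.

    The last offset is clamped so the final viewport ends exactly at the page
    bottom, which means the last two captures may overlap.
    """
    if page_height <= 0 or viewport_height <= 0:
        raise ValueError("heights must be positive")
    if page_height <= viewport_height:
        return [0]
    # Full viewport steps, then one final capture aligned to the bottom.
    offsets = list(range(0, page_height - viewport_height, viewport_height))
    offsets.append(page_height - viewport_height)
    return offsets
```

With Selenium, the page height could be read with `driver.execute_script("return document.body.scrollHeight")`, each offset applied with `driver.execute_script("window.scrollTo(0, arguments[0])", y)`, and each piece saved with `driver.save_screenshot(...)`.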

Make python chromedriver script run faster (change send_keys *too slow*)

Hello, I have built a script that goes onto a website, selects a size, and automatically checks out an item for me. It works very well, but I have 2 concerns:
1. I want this script to run faster. Originally the script ran so fast that it added to cart and went to the checkout page before the item could even load into the cart, which resulted in errors, so I added these lines to my code:
wait = WebDriverWait(driver, 10)
and this one, which I mainly use to wait until the item has loaded into the cart and all the "add to cart" buttons have shown up:
wait.until(EC.presence_of_element_located((By.NAME, 'commit')))
I tried changing
wait = WebDriverWait(driver, 10)
into something like
wait = WebDriverWait(driver, 1)
and
wait = WebDriverWait(driver, 100)
but I see no difference. Is there anything I can do to make the script run faster? It doesn't have to involve the wait; I'll take anything I can get to shave off even milliseconds.
2. I am currently using send_keys for autofill, which is PAINFULLY SLOW. Is there anything I can use that will fill all the fields instantly, all together? I know there are some JavaScript snippets that can do this, but I'm not sure how to write JavaScript or, more importantly, how to combine it with my script.
Can anyone help me out? I just want my Selenium Python chromedriver script to run as fast as possible.
Thank you.
(In my script I'm using Select for the size, plain .click() calls, a couple of if statements that depend on how many items to add to the cart, and lots of function definitions and calls.)
For sending values with JS you can do this:
js= "document.getElementById('YOURELEMENT').value = '" + str(YOURVALUE) + "';"
driver.execute_script(js)
Hope this helps.
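To fill several fields in one go, the same idea can be extended so that all values are set in a single execute_script call, with the values passed as script arguments rather than concatenated into the JavaScript string (which breaks as soon as a value contains a quote). This is a sketch under assumptions: `fill_fields` and `FILL_SCRIPT` are made-up names, and the fields are assumed to be addressable by id.

```python
FILL_SCRIPT = """
const values = arguments[0];
for (const [id, value] of Object.entries(values)) {
    const el = document.getElementById(id);
    if (el) { el.value = value; }
}
return Object.keys(values).filter(id => !document.getElementById(id));
"""


def fill_fields(driver, values):
    """Set several input values in one execute_script round trip.

    values maps element ids to the text to enter. Returns the ids that were
    not found on the page, as reported by the script.
    """
    # Convert keys and values to strings so they map cleanly to JS strings.
    as_text = {str(k): str(v) for k, v in values.items()}
    missing = driver.execute_script(FILL_SCRIPT, as_text)
    return list(missing or [])
```

One caveat: setting .value directly does not fire the input or change events that typing does, so a page that validates fields through those events may not notice the new values.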

Gtk Window Non-Responsive During Selenium Automation

Good Afternoon :) Having a problem with my Python3 Gtk3 application and Selenium WebDriver (ChromeDriver). Also, using Linux if it matters.
Basically, the user presses a button to start the Selenium webdriver automation and then as the automation process is going, it 'SHOULD' give feedback to the user in the GUI (See Content.content_liststore.append(list(item)) and LogBox.log_text_buffer).
However, it's not adding anything into the content_liststore until after fb_driver.close() is done. In the meantime, the Gtk window just "hangs".
Now, I've been looking into multithreading in the hope of making the GUI responsive to this feedback, but I've also been reading that Selenium doesn't like multithreading (though I presume that's about running multiple browsers/tabs, which this is not).
So, my question is: is multithreading the go-to fix for getting this to work?
# ELSE IF, FACEBOOK COOKIES DO NOT EXIST, PROCEED TO LOGIN PAGE
elif os.stat('facebook_cookies').st_size == 0:
    while True:
        try: # look for element, if not found, refresh the webpage
            assert "Facebook" in fb_driver.title
            login_elem = fb_driver.find_element_by_id("m_login_email")
            login_elem.send_keys(facebook_username)
            login_elem = fb_driver.find_element_by_id("m_login_password")
            login_elem.send_keys(facebook_password)
            login_elem.send_keys(Keys.RETURN)
        except ElementNotVisibleException:
            fb_driver.refresh()
            StatusBar.status_bar.push(StatusBar.context_id, "m_login_password element not found, trying again...")
            ProblemsLog.text_buffer.set_text("Facebook has hidden the password field, refreshing page...")
        else:
            query_elem = fb_driver.find_element_by_name("query")
            query_elem.send_keys(target)
            query_elem.send_keys(Keys.RETURN)
            break
    m_facebook_url_remove = "query="
    m_facebook_url = fb_driver.current_url.split(m_facebook_url_remove, 1)[1] # Remove string before "query="
    facebook_url = "https://www.facebook.com/search/top/?q=" + m_facebook_url # Merge left-over string with the desktop url
    StatusBar.status_bar.push(StatusBar.context_id, "Facebook elements found")
    fb_title = fb_driver.title
    fb_contents = [(target_name.title(), "Facebook", facebook_url)]
    for item in fb_contents:
        Content.content_liststore.append(list(item))
    #progress_bar.set_fraction(0.10)
    LogBox.log_text_buffer.set_text("Facebook Search Complete")
    with open('facebook_cookies', 'wb') as filehandler:
        pickle.dump(fb_driver.get_cookies(), filehandler)
    fb_driver.close()
I've considered that it isn't working because of the 'while' loop, but another piece of code doesn't have a loop and does the exact same thing: it waits for Selenium to finish before adding content to the GUI.
Additionally, the user can select multiple websites to do this with, so the application can first go to Facebook (do its business, then close), go to LinkedIn (do its business, then close), and so forth. It still waits for all the Selenium code to finish before adding anything to the Gtk GUI.
I really hope that makes sense! Thank you :)
Your question contains the answer you are looking for: yes, move the Selenium work to a separate thread. Have a read here: https://wiki.gnome.org/Projects/PyGObject/Threading
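The pattern described on that page is roughly: do the slow work in a worker thread, and hand every GUI update back to the main loop instead of touching widgets from the worker. A minimal sketch of that hand-off follows. `run_in_background` and its parameter names are hypothetical; in a GTK application `schedule` would be `GLib.idle_add` (from `gi.repository`), which is left out here so the snippet has no GTK dependency.

```python
import threading


def run_in_background(work, on_update, schedule):
    """Run work(report) in a worker thread and forward each report to the UI.

    work      -- callable taking one argument, report(value)
    on_update -- callable that touches the widgets; must run on the main thread
    schedule  -- hands a callback to the main loop, e.g. GLib.idle_add in GTK
    """
    def deliver(value):
        on_update(value)
        return False  # tell an idle-style scheduler not to call us again

    def report(value):
        # Called from the worker thread: never update widgets here directly.
        schedule(deliver, value)

    thread = threading.Thread(target=work, args=(report,), daemon=True)
    thread.start()
    return thread
```

In the code from the question, `work` would hold the Selenium calls and call `report(list(item))` where it currently calls `Content.content_liststore.append(list(item))`, with `on_update` doing the actual append.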
