Selenium and Python 3: selecting a search box on otto.de - python-3.x

I am new to Python 3 as well as web scraping and I am a bit stuck right now.
I want to:
1. Select the search box on otto.de.
2. Insert the product number I want to search for.
3. Press enter or click the search button.
4. Download the following page.
The search field on otto.de has the following source code:
<form class="p_form js_searchForm focus" action="/suche" data-article-number-search="/p/search/" autocomplete="off" autocorrect="off" spellcheck="false" role="search">
<input placeholder="Suchbegriff / Artikelnr. eingeben" data-error="Bitte mind. ein Zeichen eingeben" class="p_form__input js_searchField sanSearchInput" type="text" autocomplete="off" autocorrect="off" maxlength="50" disabled>
<button class="sanSearchDelBtn p_symbolBtn100--4th" type="reset"><i>X</i></button>
<button class="js_submitButton sanSearchButton" type="submit" title="Suche" disabled ><span>»</span></button>
</form>
What I tried to do:
browser = webdriver.Firefox()
browser.get('http://www.otto.de')
elem = browser.find_element_by_xpath("//form[input/@class='p_form__input js_searchField sanSearchInput']")
elem.send_keys('538707' + Keys.RETURN)
with open("Productpage.txt", "w") as outfile:
    outfile.write(browser.page_source)
browser.quit()
It gives me the following error message:
selenium.common.exceptions.InvalidElementStateException: Message: Unable to clear element that cannot be edited: <form class="p_form js_searchForm focus">
I tried many different commands, but I just can't get to the page I need. Does anyone have an idea how to solve this problem?

Here is working code:
from selenium import webdriver
from selenium.webdriver.support import ui
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
browser = webdriver.Firefox()
browser.get("http://www.otto.de")
# Target the <input> itself (not the surrounding <form>) and wait until it is visible.
ui.WebDriverWait(browser, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".p_form__input.js_searchField.sanSearchInput"))).send_keys("538707")
ui.WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".js_submitButton.sanSearchButton"))).click()
# Give the results page time to load before saving its source.
time.sleep(5)
with open("Productpage.txt", "w", encoding="utf-8") as outfile:
    outfile.write(browser.page_source)
browser.quit()
Hope it helps you!
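One likely reason the original attempt failed: the XPath //form[input/@class='...'] selects the <form> that contains the input, not the <input> itself, which matches the error message naming <form class="p_form js_searchForm focus">. The sketch below illustrates the difference with the standard library's ElementTree on a simplified, hypothetical copy of the markup (the disabled attributes are dropped so that it parses as XML). ElementTree supports only a subset of XPath, so the predicate is shortened to form[input].

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the otto.de search form (an assumption, not the live page).
MARKUP = """
<body>
  <form class="p_form js_searchForm focus" action="/suche" role="search">
    <input class="p_form__input js_searchField sanSearchInput" type="text" />
    <button class="js_submitButton sanSearchButton" type="submit">go</button>
  </form>
</body>
"""

body = ET.fromstring(MARKUP)

# A predicate filters the step it is attached to, so this returns the <form>.
form_match = body.find(".//form[input]")

# Stepping into the child returns the <input>, which is the element that accepts typing.
input_match = body.find(
    ".//form/input[@class='p_form__input js_searchField sanSearchInput']"
)

print(form_match.tag)
print(input_match.tag)
```

In Selenium terms, the corresponding fix would be a locator that ends at the input, such as //form/input[@class='p_form__input js_searchField sanSearchInput'] or the CSS selector used in the answer above.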

Related

Website login loading endlessly when I use find_element()

I'm having a problem with Selenium when locating an input on a login page. Whenever I store the element in a variable, the site keeps loading endlessly when I try to log in. If I remove that part of the code and enter the login credentials manually, the login works normally. Details below:
This is what the page inputs I am trying to access look like:
<input id="txtUsername" name="txtUsername" class="random-name" placeholder="User" type="text" required="required" aria-required="true" autocomplete="off" autocorrect="off" autocapitalize="off">
<input id="txtPassword" name="txtUsername" class="random-name" placeholder="User" type="text" required="required" aria-required="true" autocomplete="off" autocorrect="off" autocapitalize="off">
And this is my Python code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver import chrome
from time import sleep
options = webdriver.ChromeOptions()
browser = webdriver.Chrome(options=options)
site = "www.google.com"
browser.get(site)
def make_login(browser):
    sleep(1)
    login = browser.find_element(By.ID, "txtUsername")
    login.click()
    login_text = "User"
    for x in login_text:
        sleep(0.15)
        login.send_keys(x)
    senha = browser.find_element(By.ID, "txtPassword")
    senha.click()
    senha_text = "Password"
    for x in senha_text:
        sleep(0.15)
        senha.send_keys(x)
if __name__ == "__main__":
    make_login(browser)
When I run it, it clicks each input and types the credentials, as it should. However, when I click log in, the website keeps loading endlessly.
If I remove this part:
login = browser.find_element(By.ID, "txtUsername")
login.click()
senha = browser.find_element(By.ID, "txtPassword")
senha.click()
and click the inputs manually, the login works normally.
I appreciate any help.
Have you tried using only send_keys()? Your code would then look like this:
def make_login(browser):
    sleep(1)
    login = browser.find_element(By.ID, "txtUsername")
    login_text = "User"
    login.send_keys(login_text)
    senha = browser.find_element(By.ID, "txtPassword")
    senha_text = "Password"
    senha.send_keys(senha_text)
As shown in the Selenium documentation, the send_keys() method is enough to fill input fields.
But if you're trying to access sites like bet365, it's almost certainly the website blocking your script. Take a look at this post and its answers.

submit button in selenium without id and value

I have a button
<button type="submit" class="btn btn-default waves-effect waves-light"><i class="fa fa-search" aria-hidden="true"></i></button>
I have already tried all of these, without success:
# browser.find_element_by_class_name('fa fa-search').click()
# browser.find_element_by_xpath('/html/body/div[1]/div[1]/div/div[4]/div/form/div/button').click()
# WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="RechAvFormRadio"]/div/button'))).click()
Try a CSS selector like this:
button.btn.btn-default.waves-effect.waves-light
or an XPath:
//button[contains(@class,'btn btn-default waves-effect waves-light')]
or
//i[contains(@class,'fa fa-search')]/parent::button
PS: Please check in the dev tools (Google Chrome) whether the locator matches a unique entry in the HTML DOM.
Steps to check:
Press F12 in Chrome -> go to the Elements tab -> press CTRL + F -> paste the XPath and see whether your desired element is highlighted with 1/1 matching node.
Code trial 1:
time.sleep(5)
driver.find_element_by_xpath("//i[contains(@class,'fa fa-search')]/parent::button").click()
Code trial 2:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//i[contains(@class,'fa fa-search')]/parent::button"))).click()
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
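A note on why find_element_by_class_name('fa fa-search') fails: a class-name locator expects a single class, while fa fa-search is two classes separated by a space. The small helper below (css_from_class_attr is an illustrative name, not part of Selenium) turns a class attribute value into the dotted CSS form used in the answer.

```python
def css_from_class_attr(tag, class_attr):
    """Build a CSS selector such as 'button.btn.btn-default' from a class attribute value."""
    # split() with no argument also drops stray leading or trailing spaces.
    classes = class_attr.split()
    return tag + "".join("." + name for name in classes)


print(css_from_class_attr("button", "btn btn-default waves-effect waves-light"))
print(css_from_class_attr("i", "fa fa-search"))
```

The resulting string can be passed as (By.CSS_SELECTOR, selector) to find_element or to an expected condition such as EC.element_to_be_clickable.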

Contains text in Selenium Python

I am trying to detect an error so that I can restart my program and change the proxy, but I am unable to catch it because it is stored like this and the classes are dynamically named:
<p class="g4Vm4">By signing up, you agree to our <a target="_blank" href="https://help.instagram.com/581066165581870">Terms</a> . Learn how we collect, use and share your data in our <a target="_blank" href="https://help.instagram.com/519522125107875">Data Policy</a> and how we use cookies and similar technology in our <a target="_blank" href="/legal/cookies/">Cookies Policy</a> .</p>
So I am trying to match it by XPath with this function, but I am unable to do so.
def has_error(browser):
    try:  # /*[contains(text(), 'technology')]/html/body/span/section/main/div/article/div/div[1]/div/form/p"
        browser.find_element_by_xpath("/html/body//*[contains(text(),'technology')]")
        return False
    except:
        return True
if not has_error(browser):
    print('Error found! , aborted!')
    browser.quit()
    os.execv(sys.executable, ['python'] + sys.argv)
To handle a dynamic element, use WebDriverWait and the following XPath strategy.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
element=WebDriverWait(driver,30).until(expected_conditions.element_to_be_clickable((By.XPATH,'//p[contains(.,"technology")]')))
print(element.text)
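A possible explanation for why contains(text(),'technology') finds nothing: in XPath 1.0, contains() converts a node-set argument to the string value of its first node only, and the first text node of the <p> ends before the first link. contains(.,"technology") tests the full text of the element instead. The sketch below mirrors that difference with ElementTree on a shortened copy of the paragraph from the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <p> element from the question.
P_MARKUP = (
    '<p class="g4Vm4">By signing up, you agree to our '
    '<a href="https://help.instagram.com/581066165581870">Terms</a> . '
    'Learn how we use cookies and similar technology in our '
    '<a href="/legal/cookies/">Cookies Policy</a> .</p>'
)

p = ET.fromstring(P_MARKUP)

# Text before the first child element: roughly what contains(text(), ...) inspects.
first_text_node = p.text
# Text of the element and all its descendants: roughly what contains(., ...) inspects.
full_text = "".join(p.itertext())

print("technology" in first_text_node)
print("technology" in full_text)
```

The first check is false and the second is true, which is consistent with the XPath in the answer matching where the one in the question did not.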
You can check whether the page source contains specific text.
if 'By signing up, you agree to our ' in browser.page_source:
    pass
    # TODO Exception

How to click on a hidden button with selenium through Python

I’m trying to click an Upload from my Computer button on a page that has the source below.
I’m using selenium and tried several different approaches. The past failed approaches are commented out below, along with the current failed approach. The error that’s returned with the current approach is below.
Can anyone see what the issue might be and suggest how to solve it? I’m new to selenium so if someone can provide some explanation of what the html is doing and how their code solves the issue as well it would be really helpful for my understanding.
HTML code of the button:
<div class="hidden-xs">
<label for="fuUploadFromMyComputer" class="hidden">
Upload from my Computer
</label>
<input id="fuUploadFromMyComputer" type="file" name="upload">
<button id="btnUploadFromMyComputer"
class="center-block btn btn-white-fill btn-block "
data-resume-type="COMPUTER" type="submit">
<i class="zmdi zmdi-desktop-mac"></i>
Upload from my Computer
</button>
</div>
attempts:
# clicking upload button
# upload_btn = driver.find_element_by_id("fuUploadFromMyComputer")
# upload_btn = driver.find_element_by_css_selector(
# '.center-block.btn.btn-white-fill.btn-block')
# upload_btn = driver.find_element_by_link_text('Upload from my Computer')
# upload_btn.click()
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
(By.CSS_SELECTOR, "div.center-block btn.btn-white-fill.btn-block"))).click()
error:
---------------------------------------------------------------------------
TimeoutException Traceback (most recent call last)
<ipython-input-43-8fd80ff3c690> in <module>()
14 from selenium.webdriver.support import expected_conditions as EC
15
---> 16 WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.center-block btn.btn-white-fill.btn-block"))).click()
17
18 time.sleep(3)
~/anaconda/envs/py36/lib/python3.6/site-packages/selenium/webdriver/support/wait.py in until(self, method, message)
78 if time.time() > end_time:
79 break
---> 80 raise TimeoutException(message, screen, stacktrace)
81
82 def until_not(self, method, message=''):
TimeoutException: Message:
To click the element with the text Upload from my Computer, you need to use WebDriverWait for the element to be clickable. You can use either of the following solutions:
CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.center-block.btn.btn-white-fill.btn-block#btnUploadFromMyComputer"))).click()
XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@class='center-block btn btn-white-fill btn-block ' and @id='btnUploadFromMyComputer']"))).click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Selenium's click() does not operate on invisible elements, so first confirm that the button is actually visible at the moment your code tries to click it.
If the button is not visible, how would you click it by hand? Change your script to follow the same steps a human would take to make the button visible before clicking it.
Back to the failure of the code below:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
    (By.CSS_SELECTOR, "div.center-block btn.btn-white-fill.btn-block"))).click()
The reason is that the CSS selector is wrong: it looks for a <btn> element inside a div with class center-block, which matches nothing on the page, so the wait runs until the timeout is reached.
A correct CSS selector for the button can be either of the following:
button.center-block.btn.btn-white-fill.btn-block
button#btnUploadFromMyComputer
For C#, I used IJavaScriptExecutor to click on the element. You can search for the Python syntax of this solution:
public static void scrollElementToClick(IWebDriver driver, IWebElement element)
{
    IJavaScriptExecutor ex = (IJavaScriptExecutor)driver;
    ex.ExecuteScript("arguments[0].click();", element);
}
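A Python counterpart of the C# helper above might look like the following sketch. execute_script is the Selenium Python method for running JavaScript in the page; the helper name js_click is made up here, and driver and element are assumed to be an existing WebDriver and an already located WebElement.

```python
def js_click(driver, element):
    """Click an element through JavaScript instead of WebElement.click().

    `driver` is assumed to be a Selenium WebDriver and `element` a WebElement
    located beforehand, e.g. with driver.find_element(By.ID, "btnUploadFromMyComputer").
    """
    # arguments[0] refers to the first extra argument passed to execute_script.
    return driver.execute_script("arguments[0].click();", element)
```

A JavaScript click skips the visibility checks that click() performs, so it can behave differently from a real user interaction and is probably best kept as a fallback.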

Scraping dynamic website (CSS?) with python

I want to know if an item is available in the local library. I can see this in the catalog with a green icon (available), or a red icon (loaned out/not available).
First I tried just BeautifulSoup; this is the Python code I tried:
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
from bs4 import BeautifulSoup
page = urllib2.urlopen("http://zoeken.mol.bibliotheek.be/?itemid=|library/marc/vlacc|9394694&undup=false")
soup = BeautifulSoup(page)
bal = soup.find(class_="avail-icon")
print(bal)
But while an element inspection in firefox gives:
<span class="avail-icon">
<i class="circle-icon avail-icon-none"></i>
<span class="hidden-text"></span>
</span>
class="circle-icon avail-icon-none" means the item is available (shows green icon on webpage),
class="circle-icon avail-icon-loanedout" means the item is loaned out (shows red icon on webpage).
I got:
<span class="avail-icon">
<i class="circle-icon avail-icon-loading"></i>
<span class="hidden-text">Toon beschikbaarheid voor</span>
</span>
class="circle-icon avail-icon-loading" means the content is loaded dynamically, I assume, so after some searching I found Selenium.
I tried the following code:
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("http://zoeken.mol.bibliotheek.be/?itemid=|library/marc/vlacc|9394694&undup=false")
html = driver.page_source
soup = BeautifulSoup(html)
bal = soup.find(class_="avail-icon")
print(bal)
Sadly, this gives me:
<span class="avail-icon">
<i class="circle-icon avail-icon-unknown"></i>
<span class="hidden-text">Toon beschikbaarheid voor</span>
</span>
Maybe I wasn't waiting enough before Selenium grabbed the webpage, so after some searching I changed the code to:
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Firefox()
driver.implicitly_wait(10) # seconds
driver.get("http://zoeken.mol.bibliotheek.be/?itemid=|library/marc/vlacc|9394694&undup=false")
html = driver.page_source
soup = BeautifulSoup(html)
bal = soup.find(class_="avail-icon")
print(bal)
Still the same result, class="circle-icon avail-icon-unknown" isn't what I'm looking for and I'm now out of ideas. Can someone throw me a hint?
PS: Maybe an idea, but I don't know how to do it:
In Firefox, in the element inspector, the right pane has a column called rules (dutch: regels). The red and green icon are loaded as one .png file (icon-sprite.png). Select the red/green icon to see what I mean.
background-position: -48px -16px; means available (green icon)
background-position: 0px -32px; means not available (red icon)
Can I somehow test for this?
PS2: I'm a novice programmer (skill level = low).
You can do this entirely with Selenium:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
driver.get("http://zoeken.mol.bibliotheek.be/?itemid=|library/marc/vlacc|9394694&undup=false")
# Wait until the icon element is present in the DOM.
WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.XPATH, """//*[@id="availabilityStatic"]/div/div/ul/li/ul/li/span/i""")))
circle_icon = driver.find_element(By.XPATH, """//*[@id="availabilityStatic"]/div/div/ul/li/ul/li/span/i""")
icon_class = circle_icon.get_attribute("class")
if "loanedout" in icon_class:
    print("item not available")
else:
    print("item available")
driver.quit()
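As a possible refinement: implicitly_wait only affects element lookups and does not delay page_source, and presence_of_element_located returns as soon as the <i> exists, even while it still carries the avail-icon-loading class. WebDriverWait.until accepts any callable that takes the driver, so a custom condition can wait for a final state. In the sketch below, availability_from_class and icon_resolved are hypothetical helper names; the class names come from the question.

```python
def availability_from_class(icon_class):
    """Map the icon's class attribute to True/False, or None while undecided."""
    classes = icon_class.split()
    if "avail-icon-none" in classes:
        return True    # green icon: available, per the question
    if "avail-icon-loanedout" in classes:
        return False   # red icon: loaned out
    return None        # e.g. avail-icon-loading or avail-icon-unknown


def icon_resolved(by, value):
    """Return a wait condition that yields the icon's class once it is final."""
    def condition(driver):
        icon_class = driver.find_element(by, value).get_attribute("class") or ""
        # Returning False keeps WebDriverWait polling until the timeout.
        if availability_from_class(icon_class) is None:
            return False
        return icon_class
    return condition
```

With Selenium this could be used as icon_class = WebDriverWait(driver, 20).until(icon_resolved(By.CSS_SELECTOR, "span.avail-icon i")), where the selector is an assumption based on the markup shown in the question.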
