Google search next pages using selenium - python-3.x

I'm trying to automate clicking the next page in a Google search after visiting the links on the 1st and 2nd results pages.
I've so far been able to do the following:
Spin up the chrome browser
Go to the Google webpage
Type in the search words
Click on the search icon
Visit the links on the 1st and 2nd Google results pages
See my code below:
from time import sleep
from selenium import webdriver
from parsel import Selector
from selenium.webdriver.common.keys import Keys

# path to the chromedriver
driver = webdriver.Chrome('/Users/my_path/chromedriver')
driver.get('https://www.google.com')

# locate the search form by name
search_query = driver.find_element_by_name('q')
# input the search words
search_query.send_keys('X-Men')
# simulate the return key
search_query.send_keys(Keys.RETURN)

Xmen_urls = driver.find_elements_by_class_name('iUh30')
for page in range(0, 3):
    Xmen_urls = [url.text for url in Xmen_urls]
    # loop to iterate through all links in the google search query
    for Xmen_url in Xmen_urls:
        driver.get(Xmen_url)
        sel = Selector(text=driver.page_source)
    # go back to the google search
    driver.get('https://www.google.com')
    # locate the search form by name
    search_query = driver.find_element_by_name('q')
    # input the search words
    search_query.send_keys('X-Men')
    # simulate the return key
    search_query.send_keys(Keys.RETURN)
    # find the next page icon in the google search results
    Next_Google_page = driver.find_element_by_link_text("Next").click()
    page += 1
When I'm done collecting the links on the 2nd search page, how do I tell the algorithm to start from the 2nd search page rather than the 1st? (This is what will let me go beyond 2 pages.)
I know it's a for-loop and some syntax rearranging I'm missing somewhere, but my brain is frozen at this point.
I saw this page: How to click the next link in google search results? but it only helps if I'm not navigating away from the Google search page.
What am I doing wrong?

There are two ways I can see:
Open each X-Men URL in a separate window using window_handles, collect page_source, close the window, and switch back to the original window.
driver.execute_script("window.open(arguments[0], '_blank');", Xmen_url)
driver.switch_to.window(driver.window_handles[1])
sel = Selector(text=driver.page_source)
driver.close()
driver.switch_to.window(driver.window_handles[0])
The code above may not work exactly, but something to that effect.
The other way is to simulate the needed number of clicks on NEXT at the beginning of your for loop:
a = 0
while a <= page:
    driver.find_element_by_xpath("//*[contains(local-name(), 'span') and contains(text(), 'Next')]").click()
    a = a + 1
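Putting both ideas together, a minimal sketch of the whole flow might look like this (the iUh30 class name and the "Next" link text come from the question; both depend on Google's current markup, so treat them as fragile):
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from parsel import Selector

driver = webdriver.Chrome('/Users/my_path/chromedriver')
driver.get('https://www.google.com')
search_query = driver.find_element_by_name('q')
search_query.send_keys('X-Men')
search_query.send_keys(Keys.RETURN)

for page in range(3):
    # collect the result URLs on the current page before navigating anywhere
    urls = [el.text for el in driver.find_elements_by_class_name('iUh30')]
    for url in urls:
        # open each result in a new window so the results page itself never unloads
        driver.execute_script("window.open(arguments[0], '_blank');", url)
        driver.switch_to.window(driver.window_handles[1])
        sel = Selector(text=driver.page_source)
        driver.close()
        driver.switch_to.window(driver.window_handles[0])
    # since we never left the results page, "Next" is still there to click
    driver.find_element_by_link_text("Next").click()
Because the search results page never unloads, there is no need to re-run the search, and clicking Next always advances from the page you just scraped.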

Related

How to copy to clipboard text between h2 tags in Selenium (Python)

What I'm trying to do here is get an email code for verification: I log in to the email, select and copy the 6-digit code from the email, and paste it into the other tab. Everything works except that I can't double-click to select the 6-digit code and copy it to the clipboard. The code sits between h2 tags and nothing else, like this: 639094, where 639094 is the code I need copied. How can I find the code element and copy it? I have a screenshot of the email and the Chrome inspect element, if that helps.
This is the code that I use to copy it:
codeID = driver.find_element(By.XPATH, '//table[@class="main"]//tr//td//p//h2').text
ActionChains = ActionChains(driver)
ActionChains.double_click(codeID).perform()
time.sleep(2)
codeID.send_keys(Keys.CONTROL + 'c')
text = pyperclip.paste()
print(text)
The element is found, but apparently it can't be copied; the error is "Element is not reachable by keyboard". If I run everything automatically up until the element is selected with the double-click, and then copy it with my actual keyboard, the element is copied fine; but when Selenium tries to copy, I get the error above. The code I use to double-click the element is:
codeID = driver.find_element(By.XPATH, '//*[@id="message-htmlpart1"]/div/table/tbody/tr/td[2]/div/table/tbody/tr/td/table/tbody/tr/td/h2')
ActionChains = ActionChains(driver)
ActionChains.double_click(codeID).perform()
time.sleep(2)
and to do the copy:
codeID.send_keys(Keys.CONTROL + 'c')
text = pyperclip.paste()
print(text)
The codeID.send_keys(Keys.CONTROL + 'c') line is where the error occurs.
For some reason it says "Element is not reachable by keyboard", even though the element's code digits are selected; if I use print(text), they are also printed in the console.
driver.find_element_by_xpath('//table[@class="main"]//tr//td//h2').text — this will give you the text/code directly, without touching the clipboard.
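In other words, you can skip the clipboard entirely; a sketch (the verification field's name here is hypothetical, substitute your own locator):
# read the 6-digit code straight from the element, no double-click/Ctrl+C needed
code = driver.find_element_by_xpath('//table[@class="main"]//tr//td//h2').text
# switch back to the tab with the verification form and type the code in
driver.switch_to.window(driver.window_handles[0])
driver.find_element_by_name('verification_code').send_keys(code)  # hypothetical field name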
Hey, I'll analyse this problem with you.
For the first part:
Take the XPath you have and paste it into XPath Helper (a Google Chrome extension).
=> If it finds the element, the problem is in your code.
=> If it doesn't, the element is inside a frame or a table.
The solution is to switch your driver to that frame and relocate the element inside the frame.
Example:
iframe = driver.find_element_by_xpath('//iframe')
driver.switch_to.frame(iframe)  # pass the element itself, not a string
Now try to relocate the element starting from the iframe.
For the second part:
You say it's a table, so you need to give the /tr[j] and /td[i] indices of the cell where the number is located so you can get it.
Example:
d = driver.find_element_by_xpath("//tr[j]/td[i]").text  # replace j and i with the actual row/column numbers
I hope that helps.
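A compact sketch of that frame dance (the h2 locator comes from the question; the rest is standard Selenium):
iframe = driver.find_element_by_xpath('//iframe')
driver.switch_to.frame(iframe)  # enter the frame that holds the email body
code = driver.find_element_by_xpath('//table[@class="main"]//tr//td//h2').text
driver.switch_to.default_content()  # step back out to the main document
print(code)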

Adding str to every item in list python3

Introduction
Since I'm starting to get familiar with Scrapy, I'm trying to crawl some links out of random webpages.
Problem
The links I'm saving to my items.py file are written without "https://", but I need them as hyperlinks.
So I want to add "https://" before the actual links, so they're formatted as hyperlinks.
My Code
def parse_target_page(self, response):
    card = response.xpath('//div[@class="text-center artikelbox"]')
    for a in card:
        items = LinkcollectItem()
        link = ('a/@href')
        items['Title'] = a.xpath('.//h5[@class="title"]/a/text()').get()
        items['Link'] = a.xpath('.//h5[@class="title"]/a/@href').get()
        yield items
I tried inserting my string at index 0, but it didn't work.
My output should print all links as hyperlinks in the CSV file.
If you only need to add https:// to each link, you can do the following:
link = a.xpath('.//h5[@class="title"]/a/@href').get()
items['Link'] = "https://" + link if link else link
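If the hrefs can also be relative paths or scheme-relative URLs, Scrapy's built-in response.urljoin resolves them against the page URL; a small sketch:
link = a.xpath('.//h5[@class="title"]/a/@href').get()
if link:
    # handles relative, scheme-relative and already-absolute hrefs alike
    items['Link'] = response.urljoin(link)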

Store and access elements between pages navigation

I'm new to Watir and page objects.
I have defined two pages, SearchPage and ReportPage. On SearchPage I have a reports button; I click it and ReportPage opens. I want to be able to use an element value from SearchPage on ReportPage for making some assertions.
How can I store a value from SearchPage and use it on ReportPage?
I have SearchPage defined like this:
class SearchPage < Browser
  URL = "http://localhost:3000/"

  def open
    @browser.goto URL
    self
  end

  def reports_link
    @browser.link(href: '/reports')
  end

  def name_field
    @browser.div(class: 'M0Z8m _1_jJ7')
  end
end
And ReportPage:
class ReportPage < Browser
  URL = "http://localhost:3000/reports"

  def report_value
    @browser.element(class: '_1CV_I')
  end
end
And the Browser base class defined like this:
class Browser
  def initialize(browser)
    @browser = browser
  end
end
I click on the reports link like this, in a step:
Given (/^I navigate to application$/) do
  @search_page = SearchPage.new(@browser)
  @search_page.open
  sleep 1
end

When (/^I click on reports$/) do
  @search_page.reports_link.click
end
ReportPage is then loaded, and in the Then step I'll need the value of name_field from SearchPage. How can I access that value? Using @search_page doesn't work as expected, because that page is no longer loaded.
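One common pattern is to read the value into a plain instance variable before clicking away, because a page object can't read the DOM after the browser has navigated. A sketch in the question's own step style (the Then step wording and the RSpec-style assertion are assumptions):
When (/^I click on reports$/) do
  # capture the value while the search page is still loaded
  @saved_name = @search_page.name_field.text
  @search_page.reports_link.click
end

Then (/^the report shows the searched name$/) do
  @report_page = ReportPage.new(@browser)
  expect(@report_page.report_value.text).to eq(@saved_name)
end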

Using chromedriver to click on dropdown menu and get table of new webpage

I am writing code to click on each option of a dropdown menu and then get the content of the new webpage, which has a table. I want to save one file for each option of the dropdown menu.
My code doesn't capture that information yet, and I'm not sure this is possible with chromedriver and Python. Could you help?
The website is: http://www2.camara.leg.br/deputados/pesquisa
On the first dropdown menu (below "Legislatura Atual - Deputados em Exercício") you have the names of 513 politicians in Brazil. I should choose one name at a time, then select "presença em plenário", and then click on "pesquisar". The table shown on the new webpage should be saved to a file named after the politician.
The same applies for the other names.
Below is what I have working so far.
from selenium import webdriver
from selenium.webdriver.support.ui import Select

path_to_chromedriver = 'path_to_chromedriver'
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications": 2}
chrome_options.add_experimental_option("prefs", prefs)
chrome_options.add_argument("start-maximized")
# options must be set before the driver is created, or they have no effect
browser = webdriver.Chrome(chrome_options=chrome_options, executable_path=path_to_chromedriver)
browser.get('http://www2.camara.leg.br/deputados/pesquisa')

Contact_data = browser.find_element_by_class_name('form-control')
listing = Contact_data.find_elements_by_tag_name("option")
listing1 = []
for element in listing:
    listing1.append(element.text)  # append returns None, so don't assign its result

for i in range(1, len(listing1)):  # start at 1 to skip the placeholder option
    Selection = Select(browser.find_element_by_class_name('form-control'))
    Selection.select_by_visible_text(listing1[i])
    browser.find_element_by_id('rbDeputado7').click()
    browser.find_element_by_name('Pesquisa').click()
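To actually capture each table, one approach is to save the results page to a file and navigate back before selecting the next name; a minimal sketch continuing from the code above (saving the raw page_source and relying on browser.back() to return to the form are assumptions about the site's behaviour):
for name in listing1[1:]:  # skip the placeholder entry in the dropdown
    Selection = Select(browser.find_element_by_class_name('form-control'))
    Selection.select_by_visible_text(name)
    browser.find_element_by_id('rbDeputado7').click()  # "presença em plenário"
    browser.find_element_by_name('Pesquisa').click()
    # save the results page under the politician's name
    with open(name + '.html', 'w', encoding='utf-8') as f:
        f.write(browser.page_source)
    browser.back()  # return to the search form for the next name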

Pulling Excel data across multiple firefox pages in Python, Selenium

Goal: Take a list of first and last names from Excel and put them into an online registration form, using multiple Firefox pages, with only one first name and one last name per page.
Tasks:
Open a Firefox page
Fill in the "First Name" text box from Excel, cell 'A2' = "Bob"
Fill in the "Last Name" text box from Excel, cell 'B2' = "Apple"
Click 'Submit'. -- End of Registration 1 --
Open a new Firefox page
Fill in the "First Name" text box from Excel, cell 'A3' = "Linda"
Fill in the "Last Name" text box from Excel, cell 'B3' = "Orange"
Click 'Submit'.
from selenium import webdriver
import openpyxl

wb = openpyxl.load_workbook('Names.xlsx')
sheet = wb.get_sheet_by_name('Full Names')

for x in range(2):
    browser = webdriver.Firefox()
    browser.get('The Website')
    tuple(sheet['A2':'B3'])
    # I'm guessing about this next part:
    for rowOfCellObjects in sheet['A2':'B3']:
        for cellObj in rowOfCellObjects:
            browser.find_element_by_id('first_name').send_keys(????)
            browser.find_element_by_id('last_name').send_keys(????)
Using Python 3.6.2. Excel 2016. Windows 10 x64. Selenium.
Please dumb it down in the answers, I'm very new to coding :). Thanks!!
This is my usual format:
import pandas as pd
from selenium import webdriver

driver = webdriver.Firefox()
headers = ['first_name', 'last_name']
data = pd.read_csv('Names.csv', names=headers)  # you'll want to save the .xlsx as a .csv first
depth = len(data['first_name'])  # this finds how deep the columns are
url = "www.website.com"

for i in range(0, depth):
    driver.get(url)  # reload the form so each registration starts on a fresh page
    driver.find_element_by_xpath('//input[@name="first_name"]').send_keys(data['first_name'][i])
    driver.find_element_by_xpath('//input[@name="last_name"]').send_keys(data['last_name'][i])
    driver.find_element_by_xpath('//input[@name="submit"]').click()
Also note that with find_element_by_xpath, the format is:
driver.find_element_by_xpath('//input[@name="first_name"]')
or similar. You'll need Ctrl+Shift+I, or right-click → Inspect, to find the XPath.
input is the tag name, and name is whichever attribute of the input carries the literal "first_name" string.
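If you'd rather keep the .xlsx file instead of converting to CSV, openpyxl (which the question already imports) can walk the rows directly. A sketch, assuming the sheet is named 'Full Names', names start in row 2, and the form fields have the ids from the question ('submit' is a guess):
import openpyxl
from selenium import webdriver

wb = openpyxl.load_workbook('Names.xlsx')
sheet = wb['Full Names']

for row in range(2, sheet.max_row + 1):
    first = sheet.cell(row=row, column=1).value  # column A: first name
    last = sheet.cell(row=row, column=2).value   # column B: last name
    browser = webdriver.Firefox()                # one fresh page per registration
    browser.get('The Website')                   # placeholder URL from the question
    browser.find_element_by_id('first_name').send_keys(first)
    browser.find_element_by_id('last_name').send_keys(last)
    browser.find_element_by_id('submit').click() # hypothetical id for the submit button
    browser.quit()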
