Objective:
Automate the following process.
1. Open a particular web page, fill in the search box, and submit.
2. From the search results, click the first result and download the PDF.
Work done:
As a first step I have written the code below. It works, but it opens a download pop-up. Until I can get rid of that pop-up, I cannot automate the rest of the process. I have searched for many solutions, but none has worked.
For instance, this solution is hard for me to understand, and I think it has more to do with Java than with Python. I changed the Firefox profile, as many suggested. This one matches my case, though not exactly; I haven't tried it because there is not much difference. Even this one talks about changing the Firefox profile, but that doesn't work.
My code is below:
import selenium.webdriver as webdriver
import selenium.webdriver.support.ui as ui
from time import sleep
import time
import wget
from wget import download
import os
#set firefox Profile
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2)
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference("browser.download.manager.showAlertOnComplete", False)
profile.set_preference('browser.download.dir', os.getcwd())
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/pdf')
#set variable driver to open firefox
driver = webdriver.Firefox(profile)
#set variable webpage to open the expected URL
webpage = r"https://documents.un.org/prod/ods.nsf/home.xsp" # edit me
#set variable to enter in search box
searchterm = "A/HRC/41/23" # edit me
#open the webpage with get command
driver.get(webpage)
#find the element "symbol", insert data and click submit.
symbolBox = driver.find_element_by_id("view:_id1:_id2:txtSymbol")
symbolBox.send_keys(searchterm)
submit = driver.find_element_by_id("view:_id1:_id2:btnRefine")
submit.click()
#the list of search results opens; click the first occurrence, located by its element id
downloadPage = driver.find_element_by_id("view:_id1:_id2:cbMain:_id135:rptResults:0:linkURL")
downloadPage.click()
#give the new window time to open before reading its handle, then switch to it
time.sleep(10)
window_before = driver.window_handles[0]
window_after = driver.window_handles[1]
driver.switch_to.window(window_after)
#the actual download of the pdf page
theDownload = driver.find_element_by_id("download")
theDownload.click()
Please guide.
The "Selections" popup is not a different window/tab, it's just an HTML popup. You can tell this because if you right click on the dialog, you will see the normal context menu. You just need to make your "Language" and "File type(s)" selections and click the "Download selected" button.
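Once "Download selected" has been clicked, the script still has no direct signal that the PDF has landed in the folder set through browser.download.dir. A minimal sketch of one way to check, using only the standard library: take a snapshot of the folder before the click and poll for a new file afterwards. The helper names are hypothetical, and the ".part" check assumes Firefox's habit of keeping a temporary "<name>.part" file next to an unfinished download.

```python
import time
from pathlib import Path


def snapshot(directory, pattern="*.pdf"):
    """Return the set of files in `directory` that match `pattern` right now."""
    return set(Path(directory).glob(pattern))


def wait_for_new_file(directory, before, pattern="*.pdf", timeout=30.0, poll=0.5):
    """Wait for a file matching `pattern` that is not in the `before` snapshot.

    Returns the Path of the new file, or raises TimeoutError.
    """
    folder = Path(directory)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        fresh = set(folder.glob(pattern)) - set(before)
        # Assumption: an unfinished Firefox download has a "<name>.part" sibling.
        finished = [f for f in fresh if not Path(str(f) + ".part").exists()]
        if finished:
            return sorted(finished)[0]
        time.sleep(poll)
    raise TimeoutError(f"no new {pattern} file in {folder} after {timeout}s")
```

In the script from the question this would be called as before = snapshot(os.getcwd()) just ahead of the final click, and wait_for_new_file(os.getcwd(), before) right after it.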
I'm trying to automate a process within the OpenSea Create page after having logged in with Metamask, and so far, I have managed to develop a simple program that chooses a particular image file using a path which passes to the Open File dialog "implicitly", here's the code:
import pyautogui
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
def wait_xpath(code): #function to wait for the xpath of an element to be located
    WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.XPATH, code)))
opt = Options() #the variable that will store the selenium options
opt.add_experimental_option("debuggerAddress", "localhost:9222") #this allows bulk-dozer to take control of your Chrome Browser in DevTools mode.
s = Service(r'C:\Users\ResetStoreX\AppData\Local\Programs\Python\Python39\Scripts\chromedriver.exe') #Use the chrome driver located at the corresponding path
driver = webdriver.Chrome(service=s, options=opt) #execute the chromedriver.exe with the previous conditions
nft_folder_path = r'C:\Users\ResetStoreX\Pictures\Cryptobote\Cryptobote NFTs\Crypto Cangrejos\SANDwich\Crabs'
start_number = 3
if driver.current_url == 'https://opensea.io/asset/create':
    print('all right')
    print('')
    print(driver.current_window_handle)
    print(driver.window_handles)
    print(driver.title)
    print('')
    nft_to_be_selected = nft_folder_path+"\\"+str(start_number)+".png"
    wait_xpath('//*[@id="main"]/div/div/section/div/form/div[1]/div/div[2]')
    imageUpload = driver.find_element(By.XPATH, '//*[@id="main"]/div/div/section/div/form/div[1]/div/div[2]').click() #click on the upload image button
    print(driver.current_window_handle)
    print(driver.window_handles)
    time.sleep(2)
    pyautogui.write(nft_to_be_selected)
    pyautogui.press('enter', presses = 2)
Output:
After checking the URL, the program clicks on the corresponding button to upload a file
Then it waits 2 seconds before typing the image path into the Name textbox, and then presses Enter
So the file ends up being correctly uploaded to this page.
The thing is, the program above works because the following conditions are met before execution:
The current window open is the Chrome Browser tab (instead of the Python program itself, i.e. Spyder environment in my case)
After clicking the button to upload a file, the Name textbox is selected by default, regardless of the path the dialog opens with.
So, I'm a bit of a perfectionist, and I would like to know if there's a method (using Selenium or another Python module) to check whether an Open File dialog is open before doing the rest of the work.
I tried print(driver.window_handles) right after clicking that button, but Selenium did not recognize the Open File dialog as another Chrome Window, it just printed the tab ID of this page, so it seems to me that Selenium can't do what I want, but I'm not sure, so I would like to hear what other methods could be used in this case.
PS: I had to do this process this way because send_keys() method did not work in this page
The dialog you are trying to interact with is a native OS dialog; it is not a browser window, dialog, or tab. So Selenium cannot detect it and cannot handle it. There are several approaches to working with such native OS dialogs. I do not want to copy and paste existing solutions; you can try, for example, this solution. It is highly detailed and looks good.
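Whatever OS-level check ends up being used, the fixed time.sleep(2) in the question can be replaced by polling that check until it succeeds. Below is a minimal sketch of such a polling helper. It is deliberately independent of any GUI library: the predicate is supplied by the caller, and wait_until is a hypothetical name. As far as I know, pyautogui exposes getActiveWindowTitle() on Windows, which could serve as the basis of a predicate, but treat both that call and the dialog title "Open" as assumptions to verify on your system.

```python
import time


def wait_until(predicate, timeout=10.0, poll=0.2):
    """Call `predicate` repeatedly until it returns a truthy value.

    Returns that value, or None if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


# Hypothetical usage (not executed here; the title "Open" is an assumption):
#   dialog_open = wait_until(lambda: pyautogui.getActiveWindowTitle() == "Open")
#   if dialog_open:
#       pyautogui.write(nft_to_be_selected)
```

Returning None on timeout rather than raising keeps the caller free to decide whether a missing dialog is an error or simply a reason to retry the click.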
PROBLEM:
Power BI, under the pro-license, only allows data sources to be refreshed at most 8 times in a day and in 30 minute increments (using AM/PM timing).
This takes away my ability to make near real time decisions. So currently, my visualization is only updating every hour starting at 9:30 and the update happens 8 times, i.e. the total update times are 9:30, 10:30, 11:30, 12:30, 1:30, 2:30, 3:30, 4:30.
WORK AROUND:
So, in order to bypass the Pro license limitation on refreshes, I wrote code using PYAUTOGUI that logs in to my Power BI server and clicks the refresh button for me, and I run it every 5 minutes.
WORK AROUND ISSUE:
The problem is that the PYAUTOGUI script only works if the computer is active, i.e. I am logged in.
REQUEST:
What modules exist so that I can perform this same functionality in the background (without needing to have computer logged in or awake)?
NOTE:
I have done a search for packages from the command prompt using the code pip search mouse, pip search click, etc. but this is not the best use of time.
This is very much possible using selenium.
Firstly, ensure you have selenium installed in your environment. Run the following command:
pip install selenium
Then you'll need to make sure the correct ChromeDriver executable is installed, either on your PATH or at a location written explicitly in your code (with the latter option you may use a config file, as I have below, which also avoids posting credentials). You can download the ChromeDriver executable for your version of Chrome here:
https://chromedriver.chromium.org/downloads
(You can do this with any browser, my choice is Chrome, though you will have to adjust the code as necessary)
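The config file mentioned above is not shown in this answer. One possible shape for it, sketched under the assumption that the three values come from environment variables; the variable names CHROMEDRIVER_PATH, POWERBI_USER and POWERBI_PASS and the load_config helper are hypothetical:

```python
import os

# Hypothetical environment variable names, in the order the script expects:
# chrome_executable, user_name, user_pass.
_REQUIRED = ("CHROMEDRIVER_PATH", "POWERBI_USER", "POWERBI_PASS")


def load_config(env=None):
    """Return (chrome_executable, user_name, user_pass) read from `env`.

    `env` defaults to os.environ; a KeyError lists every missing variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in _REQUIRED if name not in env]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return tuple(env[name] for name in _REQUIRED)


# In a file named _config.py, the last line could then be:
#   chrome_executable, user_name, user_pass = load_config()
```

Keeping the credentials outside the source file means the script can be shared or scheduled without editing it.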
Now the below code should work for you:
NOTE: The time module is essential here. If your code executes faster than the DOM loads, selenium/python will raise an error, usually one saying that it cannot find the elements you are looking for (because they haven't loaded yet).
'''REQUIRED PACKAGES'''
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
import time
'''CONFIG FILE FOR CHROME EXECUTABLE & LOGIN INFO'''
from _config import chrome_executable, user_name, user_pass
'''LINK TO SIGN IN PAGE FOR POWER BI'''
powerbi = 'https://app.powerbi.com/home?redirectedFromSignup=1&noSignUpCheck=1&response=AlreadyAssignedLicense'
'''SET CHROME OPTIONS TO RUN IN INCOGNITO AND HEADLESS(no browser window)'''
option = webdriver.ChromeOptions()
option.headless = True
option.add_argument('--incognito')
'''CREATE BROWSER OBJECT / NAVIGATE TO POWER BI / MAXIMIZE WINDOW'''
browser = webdriver.Chrome(executable_path=chrome_executable, options=option)
browser.get(powerbi)
browser.maximize_window()
'''WAIT / FIND AND FILL IN EMAIL FIELD / FIND AND CLICK NEXT BUTTON'''
time.sleep(4)
bi_email = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.ID, 'i0116')))
bi_email.send_keys(user_name)
bi_email.send_keys(Keys.ENTER)
'''WAIT / FIND AND FILL IN PASSWORD FIELD / FIND AND CLICK ON LOGIN BUTTON'''
time.sleep(4)
bi_pass = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.ID, 'i0118')))
bi_pass.send_keys(user_pass)
bi_pass.send_keys(Keys.ENTER)
'''THIS IS THE PROMPT THAT ASKS IF YOU WANT TO REMAIN LOGGED IN'''
'''WAIT / STAY ON CURRENT SELECTION / MOVE AND CLICK ON ADJACENT SELECTION'''
time.sleep(4)
yes_choice = browser.find_element_by_id('idSIButton9')
yes_choice.send_keys(Keys.SHIFT, Keys.TAB, Keys.ENTER)
'''WAIT / FIND AND CLICK ON MY WORKSPACE'''
time.sleep(7)
my_workspace_button = browser.find_element_by_class_name('workspaceName')
my_workspace_button.click()
'''CREATE MOUSE OBJECT'''
action = ActionChains(browser)
'''WAIT / FIND CORRECT ROW / CLICK ON REFRESH'''
time.sleep(4)
stock_alerts = browser.find_elements_by_xpath("//*[ text() = 'NAME OF YOUR DATA SOURCE' ]")
action.move_to_element(stock_alerts[-1]).click().send_keys(Keys.TAB).send_keys(Keys.ENTER).perform()
'''WAIT / CLOSE BROWSER'''
time.sleep(3)
browser.quit()
This is the poor man's way of bypassing Power BI's strict limit of 8 refreshes for a basic account. It also helps avoid having to buy a costly plan just to increase the number of scheduled data source updates.
Hope it helps
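The question mentions running the refresh every 5 minutes. If no external scheduler (cron, Task Scheduler) is at hand, a plain loop can do it. The sketch below is one way to write that loop; run_every is a hypothetical helper, and the job passed to it is assumed to be a function wrapping the Selenium steps above.

```python
import time


def run_every(job, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call `job()` repeatedly, pausing `interval_seconds` between calls.

    A failing run is reported and does not stop the loop.
    Runs forever when `max_runs` is None; returns the number of successful runs.
    """
    runs = 0
    succeeded = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
            succeeded += 1
        except Exception as exc:  # keep the schedule alive after a bad run
            print(f"refresh failed: {exc!r}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return succeeded
```

With the script wrapped in a function, say a hypothetical refresh_power_bi(), the call would be run_every(refresh_power_bi, 5 * 60). Catching exceptions per run matters here because a single slow page load would otherwise end the whole schedule.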
I have a bunch of links in a list and I want to open each link in a different tab (only one window). I know how to open a new tab in Selenium, but for some reason, when I iterate over the list, all the links get opened in the same tab and I don't know what I am missing. Could anyone explain to me what the error is and how I can fix it? I would really appreciate it.
from selenium import webdriver as wd
from selenium.webdriver.common.keys import Keys
url_list = ["https://www.kdnuggets.com/2017/06/text-clustering-unstructured-data.html", "https://github.com/vivekkalyanarangan30/Text-Clustering-API/", "https://machinelearningblogs.com/2017/01/26/text-clustering-get-quick-insights-from-unstructured-data/", "https://machinelearningblogs.com/2017/06/23/text-clustering-get-quick-insights-unstructured-data-2/", "https://machinelearningblogs.com/2017/06/23/text-clustering-get-quick-insights-unstructured-data-2/"]
driver = wd.Firefox(executable_path="/usr/local/bin/geckodriver")
for url in url_list:
    body = driver.find_element_by_tag_name("body")
    body.send_keys(Keys.COMMAND + "t")
    driver.get(url)
Currently using python3.7, Firefox 65.0.1 and Selenium 3.141 on a Mac
When you open a new tab, webdriver treats it as a new window with its own unique handle. driver.window_handles holds the list of active windows; you can use it to switch to the newly created window and perform tasks on it.
for url in url_list:
    body = driver.find_element_by_tag_name("body")
    body.send_keys(Keys.COMMAND + "t")
    driver.switch_to.window(driver.window_handles[-1])
    driver.get(url)
Note that you will be using the same variable driver to refer to the newly switched window, so if you close that window then you need to switch to an active window again to perform further tasks.
UPDATE:
If a new tab is not opening with your code, you can also try this.
for url in url_list:
    driver.execute_script("window.open()")
    driver.switch_to.window(driver.window_handles[-1])
    driver.get(url)
Use window switching with these commands:
one = driver.window_handles[0]  # handle of the first window
two = driver.window_handles[1]  # handle of the second window (after opening it)
driver.switch_to.window(two)  # switch to the desired window
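Putting the answers above together, here is a minimal sketch of a helper that loads the first URL in the current tab and every later URL in a fresh one, recording which handle belongs to which URL. open_each_in_new_tab is a hypothetical name; it only uses execute_script, window_handles, switch_to.window, get and current_window_handle, and it assumes the new handle is already listed right after window.open returns (an explicit wait on the number of windows could be added if that does not hold for your browser).

```python
def open_each_in_new_tab(driver, urls):
    """Open each URL in its own tab and return a list of (url, handle) pairs.

    `driver` is an already started Selenium WebDriver.
    """
    opened = []
    for index, url in enumerate(urls):
        if index > 0:
            known = set(driver.window_handles)
            driver.execute_script("window.open('about:blank');")
            # The new tab is the one handle that was not known before.
            new_handle = next(h for h in driver.window_handles if h not in known)
            driver.switch_to.window(new_handle)
        driver.get(url)
        opened.append((url, driver.current_window_handle))
    return opened
```

A list of pairs is returned rather than a dict because the url_list in the question contains the same URL twice. Any recorded handle can later be passed to driver.switch_to.window() to go back to that tab.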
I want to learn WINDOW handling in Python Selenium.
My Task is:
First open 'Google.com'.
Second open 'Yahoo.com' in New Window.
Third switch back to First Window and click on Gmail Link.
Fourth switch to Second Window and click on Finance Link.
Following Code works for me:
browser.get("http://www.google.co.in")
browser.execute_script("window.open('https://www.yahoo.com')")
browser.switch_to_window(browser.window_handles[0])
print(browser.title)
gmail=browser.find_element_by_class_name("gb_P")
gmail.click()
browser.switch_to_window(browser.window_handles[1])
print(browser.title)
fin=browser.find_element_by_link_text("Finance")
fin.click()
But when I change the sequence of the task to:
First open 'Google.com'.
Second open 'Yahoo.com' in New Window.
Third switch to the New Window and click on Finance Link.
Fourth switch to First Window and click on Gmail Link.
The altered code below doesn't work. It opens yahoo.com in a new window, clicks on the Finance link there, then switches to the main window containing Google.com and clicks on the Gmail link:
browser.get("http://www.google.co.in")
browser.execute_script("window.open('https://www.yahoo.com')")
browser.switch_to_window(browser.window_handles[1])
print(browser.title)
fin=browser.find_element_by_link_text("Finance")
fin.click()
browser.switch_to_window(browser.window_handles[0])
print(browser.title)
gmail=browser.find_element_by_class_name("gb_P")
gmail.click()
But if I refresh the page after switching to the Yahoo tab this works only in Chrome Driver and not in Firefox Driver.
browser.get("http://www.google.co.in")
print(browser.current_window_handle)
browser.execute_script("window.open('https://www.yahoo.com')")
print(browser.current_window_handle)
WebDriverWait(browser, 10).until(EC.number_of_windows_to_be(2))
browser.switch_to_window(browser.window_handles[1])
print(browser.current_window_handle)
print(browser.title)
browser.refresh()
fin=browser.find_element_by_link_text("Finance")
fin.click()
print(browser.window_handles)
WebDriverWait(browser,10000)
browser.switch_to_window(browser.window_handles[0])
print(browser.title)
print(browser.current_window_handle)
gmail=browser.find_element_by_class_name("gb_P")
gmail.click()
As per your updated question, a few words about tab/window switching and handling:
Always keep track of the parent window handle so you can traverse back for the rest of your use cases.
Always use WebDriverWait with the expected condition number_of_windows_to_be(num_windows) before reading the new handle.
Always keep track of the child window handles so you can traverse back if required.
Here is your own code with the minor tweaks mentioned above:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
#other lines of code
browser.get("http://www.google.co.in")
print("Initial Page Title is : %s" %browser.title)
windows_before = browser.current_window_handle
print("First Window Handle is : %s" %windows_before)
browser.execute_script("window.open('https://www.yahoo.com')")
WebDriverWait(browser, 10).until(EC.number_of_windows_to_be(2))
windows_after = browser.window_handles
new_window = [x for x in windows_after if x != windows_before][0]
# browser.switch_to_window(new_window)  # deprecated
browser.switch_to.window(new_window)
print("Page Title after Tab Switching is : %s" %browser.title)
print("Second Window Handle is : %s" %new_window)
Console Output:
Initial Page Title is : Google
First Window Handle is : CDwindow-34D74AB1BB2F0A1A8B7426BF25B86F52
Page Title after Tab Switching is : Yahoo
Second Window Handle is : CDwindow-F3ABFEBE4907CFBB3CD09CEB75ED570E
Now you have got both the Window Handles so you can easily switch to any of the TABs to perform any action.
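The handle bookkeeping described in this answer boils down to comparing the handle list before and after a window is opened. A small sketch of that comparison as a pure function (new_handles is a hypothetical name), so it can be checked without a browser:

```python
def new_handles(before, after):
    """Return the handles in `after` that are not in `before`, in `after` order."""
    seen = set(before)
    return [handle for handle in after if handle not in seen]


# The two handles from the console output above:
parent = "CDwindow-34D74AB1BB2F0A1A8B7426BF25B86F52"
child = "CDwindow-F3ABFEBE4907CFBB3CD09CEB75ED570E"
print(new_handles([parent], [parent, child]))
```

In the script this would be called with the list saved before window.open and browser.window_handles after the wait. Unlike indexing with [1] or [-1], the result does not depend on the order in which the driver happens to report the handles.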
Running my script, I get items such as "javascript:getDetail(19978)" as the href. The number in parentheses, concatenated with "https://www.aopa.org/airports/4M3/business/", produces valid links. However, when I follow these newly created links, I land on a page that is different from the one I reach by clicking the original link on the page. How can I get the original links instead of "javascript:getDetail(19978)"? The search should be made by typing "All" in the search box.
The code I've tried with:
from selenium import webdriver
import time
link = "https://www.aopa.org/airports/4M3/business/"
driver = webdriver.Chrome()
driver.get("https://www.aopa.org/learntofly/school/")
driver.find_element_by_id('searchTerm').send_keys('All')
driver.find_element_by_id('btnSearch').click()
time.sleep(5)
for pro in driver.find_elements_by_xpath('//td/a'):
    print(pro.get_attribute("href"))
driver.quit()
Code to create new links with the base url I pasted in my description:
from selenium import webdriver
import time
link = "https://www.aopa.org/airports/4M3/business/"
driver = webdriver.Chrome()
driver.get("https://www.aopa.org/learntofly/school/")
driver.find_element_by_id('searchTerm').send_keys('All')
driver.find_element_by_id('btnSearch').click()
time.sleep(5)
for item in driver.find_elements_by_xpath('//td/a'):
    fresh = item.get_attribute("href").replace("javascript:getDetail(","")
    print(link + fresh.replace(")",""))
driver.quit()
However, these newly created links lead me to different destinations.
FYC, original links are embedded within elements like the below one:
<td>GOLD DUST FLYING SERVICE, INC.</td>
Clicking a link issues an XHR. The page itself stays the same, but the data received as JSON is rendered in place of the previous content.
If you want to open the raw data inside an HTML page, you might try something like
# To get list of entries as ["19978", "30360", ... ]
entries = [a.get_attribute('href').split("(")[1].split(")")[0] for a in driver.find_elements_by_xpath('//td/a')]
url = "https://www.aopa.org/learntofly/school/wsSearch.cfm?method=schoolDetail&businessId="
for entry in entries:
    driver.get(url + entry)
    print(driver.page_source)
You also might use requests to get each JSON response as
import requests
for entry in entries:
    print(requests.get(url + entry).json())
without rendering data in browser
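The split("(")[1].split(")")[0] expression above raises an IndexError on any href that does not have the getDetail(...) shape. A slightly more defensive sketch with a regular expression, using the same detail URL as above; the helper names business_id and detail_urls are hypothetical:

```python
import re

DETAIL_URL = "https://www.aopa.org/learntofly/school/wsSearch.cfm?method=schoolDetail&businessId="
_ID_PATTERN = re.compile(r"getDetail\((\d+)\)")


def business_id(href):
    """Return the numeric id inside 'javascript:getDetail(<id>)', or None."""
    match = _ID_PATTERN.search(href or "")
    return match.group(1) if match else None


def detail_urls(hrefs):
    """Build one detail URL per href that carries an id; other hrefs are skipped."""
    ids = (business_id(href) for href in hrefs)
    return [DETAIL_URL + found for found in ids if found is not None]
```

The hrefs would come from a.get_attribute('href') for each link found with driver.find_elements_by_xpath('//td/a'), and the resulting URLs can be fed to either driver.get or requests.get as shown above.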
If you look at how getDetail() is implemented in the source code and explore the "network" tab when you click each of the search result links, you may see that there are multiple XHR requests issued for a result and there is some client-side logic executed to form a search result page.
If you don't want to dive into replicating all the logic happening to form each of the search result pages - simply go back and forth between the search results page and a single search result page:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get("https://www.aopa.org/learntofly/school/")
driver.find_element_by_id('searchTerm').send_keys('All')
driver.find_element_by_id('btnSearch').click()
# wait for search results to be visible
table = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#searchResults table")))
for index in range(len(table.find_elements_by_css_selector('td a[href*=getDetail]'))):
    # re-locate the results table on every pass, since the old reference may be stale
    table = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#searchResults table")))
    # get the next link
    link = table.find_elements_by_css_selector('td a[href*=getDetail]')[index]
    link_text = link.text
    link.click()
    print(link_text)
    # TODO: get details
    # go back
    back_link = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#schoolDetail a[href*=backToList]")))
    driver.execute_script("arguments[0].click();", back_link)
driver.quit()
Note the use of Explicit Waits instead of hardcoded "sleeps".
It may actually make sense to avoid using selenium here altogether and approach the problem "headlessly" - doing HTTP requests (via requests module) to the website's API.