Attempting to open browser with Specific Profile in Incognito Mode - browser

Attempting to open browser with Specific Profile in Incognito Mode.
After the browser is opened, need to verify that the URL opened the proper page
I am using the following code
browserName = set()
profile_name = 'Person 1
rpa_string = 'https://www.google.ca'
path_br_ch = '/Google/Chrome/Application/chrome.exe'
browserName.add("chrome")
browser_profile = f"--profile-directory={profile_name}"
browser_path = path_br_ch
browser_privacy = "--incognito"
rpa_command = [browser_path, browser_privacy, browser_profile, rpa_string]
browser_process = subprocess.Popen(rpa_command)
# Next step need to verify that the google page is opened
# Please help

Related

Selenium anticaptcha submission

I am trying to fill recaptcha using anticaptcha api.
But I am unable to figure out how to submit response.
Here is what I am trying to do:
driver.switch_to.frame(driver.find_element_by_xpath('//iframe'))
url_key = driver.find_element_by_xpath('//*[#id="captcha-submit"]/div/div/iframe').get_attribute('src')
#site_key = re.search('k=([^&]+)',url_key).group(1)
site_key = '6Ldd2doaAAAAAFhvJxqgQ0OKnYEld82b9FKDBnRE'
api_key = 'api_keys'
url = driver.current_url
client = AnticaptchaClient(api_key)
task = NoCaptchaTaskProxylessTask(url, site_key)
job = client.createTask(task)
job.join()
driver.execute_script("document.getElementById('g-recaptcha-response').innerHTML='{}';".format(job.get_solution_response()))
driver.refresh()
Above code snippet only refreshes the same page and not redirecting to input url.
Then I see that there is a variable in script on the same page and I tried to execute that variable too to submit form just like that
driver.execute_script("var captchaSubmitEl = document.getElementById('captcha-submit');")
driver.refresh()
Which also fails.The webpage is here.

why does my default browser didn't open, instead microsoft edge is opened

If i pass full web address like: https://gaana.com
it opens through my default browser chrome.
But when i pass, ganna.com , the web page is perfectly open through microsoft edge.
Any help about why the browser changes?
import webbrowser
alink = input('Enter the name: ')
new = 2
webbrowser.open(alink, new=new)
There is no error in general, but browser changes from default(chrome) to microsoft edge browser
That's most likely because your system doesn't know how to open the URL properly when no protocol was specified. Why don't you just prepend the protocol when the user has not passed any?
Example:
import webbrowser
alink = input('Enter the name: ')
new = 2
# protocol always comes before "://"
temp = alink.split('://')
if len(temp) > 2: # no protocol was specified
# let's use HTTP then
alink = "http://" + alink
webbrowser.open(alink, new=new)

Login to a website then open it in browser

I am trying to write a Python 3 code that logins in to a website and then opens it in a web browser to be able to take a screenshot of it.
Looking online I found that I could do webbrowser.open('example.com')
This opens the website, but cannot login.
Then I found that it is possible to login to a website using the request library, or urllib.
But the problem with both it that they do not seem to provide the option of opening a web page.
So how is it possible to login to a web page then display it, so that a screenshot of that page could be taken
Thanks
Have you considered Selenium? It drives a browser natively as a user would, and its Python client is pretty easy to use.
Here is one of my latest works with Selenium. It is a script to scrape multiple pages from a certain website and save their data into a csv file:
import os
import time
import csv
from selenium import webdriver
cols = [
'ies', 'campus', 'curso', 'grau_turno', 'modalidade',
'classificacao', 'nome', 'inscricao', 'nota'
]
codigos = [
96518, 96519, 96520, 96521, 96522, 96523, 96524, 96525, 96527, 96528
]
if not os.path.exists('arquivos_csv'):
os.makedirs('arquivos_csv')
options = webdriver.ChromeOptions()
prefs = {
'profile.default_content_setting_values.automatic_downloads': 1,
'profile.managed_default_content_settings.images': 2
}
options.add_experimental_option('prefs', prefs)
# Here you choose a webdriver ("the browser")
browser = webdriver.Chrome('chromedriver', chrome_options=options)
for codigo in codigos:
time.sleep(0.1)
# Here is where I set the URL
browser.get(f'http://www.sisu.mec.gov.br/selecionados?co_oferta={codigo}')
with open(f'arquivos_csv/sisu_resultados_usp_final.csv', 'a') as file:
dw = csv.DictWriter(file, fieldnames=cols, lineterminator='\n')
dw.writeheader()
ies = browser.find_element_by_xpath('//div[#class ="nome_ies_p"]').text.strip()
campus = browser.find_element_by_xpath('//div[#class ="nome_campus_p"]').text.strip()
curso = browser.find_element_by_xpath('//div[#class ="nome_curso_p"]').text.strip()
grau_turno = browser.find_element_by_xpath('//div[#class = "grau_turno_p"]').text.strip()
tabelas = browser.find_elements_by_xpath('//table[#class = "resultado_selecionados"]')
for t in tabelas:
modalidade = t.find_element_by_xpath('tbody//tr//th[#colspan = "4"]').text.strip()
aprovados = t.find_elements_by_xpath('tbody//tr')
for a in aprovados[2:]:
linha = a.find_elements_by_class_name('no_candidato')
classificacao = linha[0].text.strip()
nome = linha[1].text.strip()
inscricao = linha[2].text.strip()
nota = linha[3].text.strip().replace(',', '.')
dw.writerow({
'ies': ies, 'campus': campus, 'curso': curso,
'grau_turno': grau_turno, 'modalidade': modalidade,
'classificacao': classificacao, 'nome': nome,
'inscricao': inscricao, 'nota': nota
})
browser.quit()
In short, you set preferences, choose a webdriver (I recommend Chrome), point to the URL and that's it. The browser is automatically opened and start executing your instructions.
I have tested using it to log in and it works fine, but never tried to take screenshot. It theoretically should do.

Python selenium - "No modal dialog is currently open exception in selenium"

I am getting input url's and trying to load it in the browser (Firefox - Updated version). The script has been working for all the url's. Except few URLs return this error
"No modal dialog is currently open"
I tried adding driver.switch_to_alert().accept() but no help. I usually end up closing browser and restart it all over again.
Please find the code :
capabilities = DesiredCapabilities.FIREFOX.copy()
capabilities['marionette'] = True
capabilities['acceptSslCerts'] = True
driver = webdriver.Firefox(capabilities=capabilities)
driver.implicitly_wait(3)
print ("Number of rows to be processed is:" + str(ws1.max_row))
for row in range(2,ws1.max_row+1):
try:
region = ws1["E"+str(row)].value
# Reading Row 'row'
print("Reading row #"+str(row))
url = ws1["C"+str(row)].value
regex = df[df['Country'] == region]['Regex'].values[0]
#Load mainpage or input URL
i = 0
while i < 3:
try:
driver.get("http://"+url)
driver.switch_to_alert().accept()
time.sleep(5)
break
except (TimeoutException,WebDriverException,NoSuchElementException) as e:
print(e,'Retrying...',i+1)
i += 1
Sample URLs :
afilias.info
www.nuwavenow.com
www.IntoTomorrow.com
www.picobrew.com

scrapy + splash : not rendering full page javascript data

I am just exploring scrapy with splash and I am trying to scrape all the product (pants) data with productid,name and price from one of the e-commerce site
gap but I didn't see all the dynamic product data loaded when I see from splash web UI splash web UI (only 16 items are loading though for every request - no clue why)
I tried with the following options but no luck
Increasing wait time upto 20 sec
By starting the docker with "--disable-private-mode"
By using lua_script for page scrolling
With view report full option splash:set_viewport_full()
lua_script2 = """ function main(splash)
local num_scrolls = 10
local scroll_delay = 2.0
local scroll_to = splash:jsfunc("window.scrollTo")
local get_body_height = splash:jsfunc(
"function() {return document.body.scrollHeight;}"
)
assert(splash:go(splash.args.url))
splash:wait(splash.args.wait)
for _ = 1, num_scrolls do
scroll_to(0, get_body_height())
splash:wait(scroll_delay)
end
return splash:html()
end"""
yield SplashRequest(
url,
self.parse_product_contents,
endpoint='execute',
args={
'lua_source': lua_script2,
'wait': 5,
}
)
Can anyone please shed some light on this behavior?
p.s : I am using scrapy framework and I am able to parse the product information (itemid,name and price) from the render.html (but render.html has only 16 items information)
I updated the script to below
function main(splash)
local num_scrolls = 10
local scroll_delay = 2.0
splash:set_viewport_size(1980, 8020)
local scroll_to = splash:jsfunc("window.scrollTo")
local get_body_height = splash:jsfunc(
"function() {return document.body.scrollHeight;}"
)
assert(splash:go(splash.args.url))
-- splash:set_viewport_full()
splash:wait(10)
splash:runjs("jQuery('span.icon-x').click();")
splash:wait(1)
for _ = 1, num_scrolls do
scroll_to(0, get_body_height())
splash:wait(scroll_delay)
end
splash:wait(30)
return {
png = splash:png(),
html = splash:html(),
har = splash:har()
}
end
And ran it in my local splash, the png doesn't work fine but the HTML has the last product
The only issue was when the email subscribe popup is there it won't scroll, so I added code to close it

Resources