Deploying Selenium on Heroku (Status code -6) - python-3.x

I'm trying to run a script via scheduler which uses selenium but it shows the following error - Message: Service /app/.apt/opt/google/chrome/chrome unexpectedly exited. Status code was: -6
I've used both the buildpacks -
https://github.com/heroku/heroku-buildpack-chromedriver.git
https://github.com/heroku/heroku-buildpack-xvfb-google-chrome
The script is:
chrome_exec_shim = "/app/.apt/opt/google/chrome/chrome"
opts = webdriver.ChromeOptions()
opts.binary_location = chrome_exec_shim
driver = webdriver.Chrome(executable_path=chrome_exec_shim, chrome_options=opts)

What you should do is download the Chrome driver here.
You can either put it in the Chrome package, so that you don't need to set the path at all (in my experience it is better to put it on the PATH), or give the path to the downloaded driver explicitly; it can live in the project folder (recommended).
Just change the variable chrome_exec_shim to the path of the driver.

chrome_exec_shim = "/app/.apt/opt/google/chrome/chrome"
opts = webdriver.ChromeOptions()
opts.binary_location = chrome_exec_shim
opts.add_argument("--no-sandbox")
opts.add_argument("--disable-gpu")
driver = webdriver.Chrome(executable_path=chrome_exec_shim, chrome_options=opts)
Try this code.
You must add these arguments to the Chrome options for it to work.
I tried this and it's working for me.

After I downloaded chromedriver, it gave an error that the binary was not found.
I then gave the address of Chrome in the executable path and the path of chromedriver in the Chrome options.
That too resulted in an error; adding the --disable-gpu and --no-sandbox arguments to the Chrome options resolved it.
Thanks for the help... :)
The code that ran, at last, is below -
from selenium import webdriver
import os
chrome_exec_shim = os.environ.get("GOOGLE_CHROME_BIN", "chromedriver")
opts = webdriver.ChromeOptions()
opts.binary_location = chrome_exec_shim
opts.add_argument('--disable-gpu')
driver = webdriver.Chrome(executable_path='/app/development/chromedriver', chrome_options=opts)
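A minimal sketch of the arrangement that ended up working, assuming the Selenium 3 style keyword arguments used in this thread (executable_path, chrome_options). GOOGLE_CHROME_BIN and the two fallback paths come from the posts above; CHROMEDRIVER_PATH, resolve_paths and make_driver are hypothetical names introduced only for illustration. The key point is that the Chrome binary goes into binary_location while the chromedriver path goes into executable_path.

```python
import os

# Flags mentioned in the accepted fix above.
CHROME_FLAGS = ["--no-sandbox", "--disable-gpu"]


def resolve_paths(env=None):
    """Return (chrome_binary, chromedriver_path) from the environment.

    CHROMEDRIVER_PATH is an assumed config var, not something the thread
    confirms the buildpacks set; the defaults are the paths quoted above.
    """
    env = os.environ if env is None else env
    chrome_bin = env.get("GOOGLE_CHROME_BIN", "/app/.apt/opt/google/chrome/chrome")
    driver_path = env.get("CHROMEDRIVER_PATH", "/app/development/chromedriver")
    return chrome_bin, driver_path


def make_driver(env=None):
    """Start Chrome with the browser binary and the driver kept separate."""
    # Imported lazily so the helper above can be used without Selenium installed.
    from selenium import webdriver

    chrome_bin, driver_path = resolve_paths(env)
    opts = webdriver.ChromeOptions()
    opts.binary_location = chrome_bin          # the browser itself
    for flag in CHROME_FLAGS:
        opts.add_argument(flag)
    # executable_path must point at chromedriver, not at Chrome.
    return webdriver.Chrome(executable_path=driver_path, chrome_options=opts)
```

In the scheduled script you would call make_driver() once and quit the driver when the job finishes.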

Related

--headless is not an option in Chrome WebDriver for Selenium

I would like to have Selenium run a headless instance of Google Chrome to mine data from certain websites without the UI overhead. I downloaded the ChromeDriver executable from here and copied it to my current scripting directory.
The driver appears to work fine with Selenium and is able to browse automatically, however I cannot seem to find the headless option. Most online examples of using Selenium with headless Chrome go something along the lines of:
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.binary_location = '/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary'
driver = webdriver.Chrome(executable_path=os.path.abspath("chromedriver"), chrome_options=chrome_options)
driver.get("http://www.duo.com")
However when I inspect the possible arguments for the Selenium WebDriver using the command chromedriver -h this is what I get:
D:\Jobs\scripts>chromedriver -h
Usage: chromedriver [OPTIONS]
Options
--port=PORT port to listen on
--adb-port=PORT adb server port
--log-path=FILE write server log to file instead of stderr, increases log level to INFO
--log-level=LEVEL set log level: ALL, DEBUG, INFO, WARNING, SEVERE, OFF
--verbose log verbosely (equivalent to --log-level=ALL)
--silent log nothing (equivalent to --log-level=OFF)
--append-log append log file instead of rewriting
--replayable (experimental) log verbosely and don't truncate long strings so that the log can be replayed.
--version print the version number and exit
--url-base base URL path prefix for commands, e.g. wd/url
--whitelisted-ips comma-separated whitelist of remote IP addresses which are allowed to connect to ChromeDriver
No --headless option is available.
Does the ChromeDriver obtained from the link above allow for headless browsing?
--headless is not an argument for chromedriver but for Chrome. The --headless switch runs Chrome in headless mode, i.e., without a UI or display server dependencies. ChromeDriver is a separate executable that WebDriver uses to control Chrome, and WebDriver is a collection of language-specific bindings to drive a browser.
I am able to run in headless mode with this set of options. I hope this will help:
from bs4 import BeautifulSoup, NavigableString
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
import requests
import re
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
browser = webdriver.Chrome(chrome_options=options) # see edit for recent code change.
browser.implicitly_wait(20)
Update 12 Aug 2019:
old : browser = webdriver.Chrome(chrome_options=options)
new : browser = webdriver.Chrome(options=options)
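The update above swaps chrome_options= for options=. If a script has to run against both older and newer Selenium releases, one possible approach (a sketch, not something the answer itself proposes) is to inspect the constructor's signature and use whichever keyword it accepts. options_keyword and the two stand-in functions are hypothetical names for illustration; with Selenium installed you would pass webdriver.Chrome itself.

```python
import inspect


def options_keyword(driver_factory):
    """Return 'options' if the factory accepts it, otherwise 'chrome_options'."""
    params = inspect.signature(driver_factory).parameters
    return "options" if "options" in params else "chrome_options"


# Stand-ins that mimic the old and new constructor signatures.
def old_style_chrome(executable_path="chromedriver", chrome_options=None):
    return ("old", chrome_options)


def new_style_chrome(options=None, service=None):
    return ("new", options)


for factory in (old_style_chrome, new_style_chrome):
    keyword = options_keyword(factory)
    # Pass the options object under whichever name this factory understands.
    result = factory(**{keyword: "my-options"})
    print(factory.__name__, "->", keyword, result)
```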
Try
options.headless=True
The following is how I set up my headless chrome
options = webdriver.ChromeOptions()
options.headless=True
options.add_argument('window-size=1920x1080')
prefs = {
"download.default_directory": r"C:\FilePath\Download",
"download.prompt_for_download": False,
"download.directory_upgrade": True}
options.add_experimental_option('prefs', prefs)
chromedriver = (r"C:\Filepath\chromedriver.exe")
--headless is not an argument for chromedriver but for Chrome; you can see more arguments (command-line switches) for Chrome here.

Running Headless Chrome using Python 3.6 on AWS Lambda - Permissions Error

I have struggled to get Headless Chrome running on AWS Lambda for days. It works fine on EC2, but when I try it on Lambda, I just get "Message: 'chromedriver' executable may have wrong permissions."
The modules are zipped with the chromedriver and headless-chromium executables in the root directory of the zip file. The total zipped file I upload to S3 is 52 MB, but extracted it is below the 250 MB limit, so I don't think that is the issue.
from selenium import webdriver
def lambda_handler(event, context):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1280x1696")
    options.add_argument("--disable-application-cache")
    options.add_argument("--disable-infobars")
    options.add_argument("--no-sandbox")
    options.add_argument("--hide-scrollbars")
    options.add_argument("--enable-logging")
    options.add_argument("--log-level=0")
    options.add_argument("--v=99")
    options.add_argument("--single-process")
    options.add_argument("--ignore-certificate-errors")
    options.add_argument("--homedir=/tmp")
    options.binary_location = "/var/task/headless-chromium"
    driver = webdriver.Chrome("/var/task/chromedriver", chrome_options=options)
    driver.get("https://www.google.co.uk")
    title = driver.title
    driver.close()
    return title
if __name__ == "__main__":
    title = lambda_handler(None, None)
    print("title:", title)
A few posts on the web have reported compatibility issues that may cause problems, so I used the specific executable versions of Chrome and ChromeDriver that others seem to have had success with on EC2 and by other means.
DOWNLOAD SOURCES FOR HEADLESS CHROME AND CHROMEDRIVER
(stable) https://github.com/adieuadieu/serverless-chrome/releases/tag/v1.0.0-37
(https://sites.google.com/a/chromium.org/chromedriver/downloads) Download unavailable so retrieved from the source below
https://chromedriver.storage.googleapis.com/index.html?path=2.37/
Can anyone help me crack this?
I found a solution to this problem a few minutes ago.
When chromedriver is used in a Lambda function, it (I think) needs write permission, but when the chromedriver file is in the 'task' folder or the 'opt' folder, the user has only read permission.
The only folder whose permissions can be changed in a Lambda function is the 'tmp' folder.
So I moved the chromedriver file to the 'tmp' folder, and it works.
Like this:
os.system("cp ./chromedriver /tmp/chromedriver")
os.system("cp ./headless-chromium /tmp/headless-chromium")
os.chmod("/tmp/chromedriver", 0o777)
os.chmod("/tmp/headless-chromium", 0o777)
chrome_options.binary_location = "/tmp/headless-chromium"
driver = webdriver.Chrome(executable_path=r"/tmp/chromedriver",chrome_options=chrome_options)
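The same copy-then-chmod step can be done without shelling out to cp. The following is a sketch under assumptions: stage_executable is a hypothetical helper name, and it uses mode 0o755 rather than the 0o777 from the answer, on the assumption that only execute permission is actually needed. It skips the copy when the destination already exists, since a reused execution environment may still have the file in /tmp.

```python
import os
import shutil


def stage_executable(src, dest_dir="/tmp"):
    """Copy src into dest_dir, mark it executable, and return the new path."""
    dest = os.path.join(dest_dir, os.path.basename(src))
    if not os.path.exists(dest):
        # Copy only once; a previous invocation may already have staged it.
        shutil.copy2(src, dest)
    # rwx for the owner, r-x for everyone else.
    os.chmod(dest, 0o755)
    return dest
```

Inside the handler this would replace the os.system calls, for example chrome_options.binary_location = stage_executable("/var/task/headless-chromium"), with the chromedriver path staged the same way and passed as executable_path.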

Where to place PhantomJS exe?

I am trying to use PhantomJS with Selenium and Python.
My understanding is:
I will have to write a Python script that uses the Selenium package to drive the PhantomJS WebDriver in order to automate web application testing.
I have installed the following:
Python v3.5.1.
Selenium using pip install selenium v3.7.0.
PhantomJS v2.1.1
In the meantime I tested using the Chrome WebDriver by placing it in PATH, and it executes without errors. The following is my script to open google.com using the Chrome WebDriver.
from selenium import webdriver
driver = webdriver.Chrome() # or add to your PATH
driver.get('https://google.com/')
Using PhantomJS:
from selenium import webdriver
url = "http://www.google.com"
path_phantom = r'H:\phantomjs\bin\phantomjs.exe'
driver = webdriver.PhantomJS(executable_path=path_phantom)
driver.get(url)
driver.save_screenshot(r'H:\out.png')
driver.quit()
Errors:
Traceback (most recent call last):
  File "C:\Users\acer\Desktop\testing\openYoutube.py", line 5, in <module>
    driver = webdriver.PhantomJS()
  File "C:\Users\acer\AppData\Local\Programs\Python\Python35-32\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py", line 51, in __init__
    log_path=service_log_path)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python35-32\lib\site-packages\selenium\webdriver\phantomjs\service.py", line 50, in __init__
    service.Service.__init__(self, executable_path, port=port, log_file=open(log_path, 'w'))
PermissionError: [Errno 13] Permission denied: 'ghostdriver.log'
Am I misplacing PhantomJS exe or missing any step ?
You can place the PhantomJS v2.1.1 binary at any location within your system and use the following code block :
from selenium import webdriver
url = "http://www.url.com.br/contact.asp"
path_phantom = r'C:\your_path\phantomjs-2.1.1-windows\bin\phantomjs.exe'
driver = webdriver.PhantomJS(executable_path=path_phantom)
driver.set_window_size(1400,1000)
driver.get(url)
Update :
Please consider the following points and try the following code block with debug messages:
Run CCleaner tool to wipe off all the OS chores from your system.
You can opt for a System Reboot.
Try to keep the Python Application, WebBrowser binaries and the WebDriver binaries i.e. phantomjs.exe on the same drive.
from selenium import webdriver
url = "http://www.google.com"
path_phantom = r'C:\Utility\phantomjs-2.1.1-windows\bin\phantomjs.exe'
driver = webdriver.PhantomJS(executable_path=path_phantom)
print("PhantomJS browser invoked")
driver.get(url)
print("Browser Initialized")
driver.save_screenshot("C://Utility//out.png")
driver.quit()
print("Browser Closed")
Problem seems to be with the log file.
Changing path of log file solved this problem.
path_phantom = r'H:\phantomjs\bin\phantomjs.exe'
log_path=r'H:\ghostdriver.log' #changed path to a temporary file.
# service_log_path is required to change path of log file.
driver = webdriver.PhantomJS(executable_path=path_phantom,service_log_path=log_path)
From your error:
PermissionError: [Errno 13] Permission denied: 'ghostdriver.log'
It seems that it tries to create the file ghostdriver.log in the current working directory but fails because of the permissions.
As suggested in this answer, try to add the argument
service_log_path=os.path.devnull
to the function webdriver.PhantomJS().
Or make sure it is able to create the file.
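Both suggestions above (moving the log file, or sending it to os.devnull) can be combined in one helper. This is only a sketch: pick_log_path is a hypothetical name, and os.access is a best-effort check whose result may not reflect every permission rule on Windows.

```python
import os
import tempfile


def pick_log_path(preferred_dir=None, filename="ghostdriver.log"):
    """Return a log path inside a writable directory, or os.devnull as a last resort."""
    candidates = [preferred_dir or os.getcwd(), tempfile.gettempdir()]
    for directory in candidates:
        # Use the first directory that exists and appears writable.
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            return os.path.join(directory, filename)
    # Nothing writable was found: discard the log instead of failing.
    return os.devnull


print(pick_log_path())
```

The result would then be passed as service_log_path=pick_log_path() to webdriver.PhantomJS(), as in the answers above.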

How to suppress console error/warning/info messages when executing selenium python scripts using chrome canary

I am running python script (complete script link below) for selenium test using Chrome Canary. The test seems to be running fine, however, there are lots of error/warning/info messages displayed on the console.
Is there a way to suppress these messages? I have tried:
chrome_options.add_argument("--silent"), but does not help. I am not able to find the right solution. Appreciate any help.
Python script : Example script provided here
Python: 3.6.3
Selenium: 3.6.0
Chrome Canary: 63.0.3239.5 (64 bit)
ChromeDriver : 2.33
Try options.add_argument('log-level=3').
log-level:
Sets the minimum log level.
Valid values are from 0 to 3:
INFO = 0,
WARNING = 1,
LOG_ERROR = 2,
LOG_FATAL = 3.
default is 0.
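To avoid hard-coding the number, the level names listed above can be mapped to the switch with a small helper. This is an illustrative sketch; LOG_LEVELS and log_level_argument are hypothetical names, and only the four values quoted above are accepted.

```python
# Names and values as listed in the answer above.
LOG_LEVELS = {"INFO": 0, "WARNING": 1, "LOG_ERROR": 2, "LOG_FATAL": 3}


def log_level_argument(level):
    """Build a --log-level=N switch from a level name or a number 0-3."""
    if isinstance(level, str):
        name = level.upper()
        if name not in LOG_LEVELS:
            raise ValueError("unknown log level name: %r" % level)
        value = LOG_LEVELS[name]
    else:
        value = int(level)
        if value not in LOG_LEVELS.values():
            raise ValueError("log level must be between 0 and 3, got %r" % level)
    return "--log-level=%d" % value


print(log_level_argument("LOG_FATAL"))
```

The returned string would be passed to options.add_argument().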
If "--log-level" doesn't work for you (as of 75.0.3770.100 it didn't for me), this should:
options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging'])
driver = webdriver.Chrome(executable_path='<path-to-chromedriver>', options=options)
See https://bugs.chromium.org/p/chromedriver/issues/detail?id=2907#c3
Code copied from Python selenium: DevTools listening on ws://127.0.0.1
Works for me in Python/Chrome...
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--log-level=3')
You can take help of below link.
List of Chromium Command Line Switches
"--log-level" sets the minimum log level. Valid values are from 0 to 3: INFO = 0, WARNING = 1, LOG_ERROR = 2, LOG_FATAL = 3.
I have just tested this one, it works for me (C#):
ChromeOptions options = new ChromeOptions();
options.AddArguments("--headless", "--log-level=3");
RemoteWebDriver driver = new ChromeDriver(options);
import os
os.environ['WDM_LOG_LEVEL'] = '0'
That code hides the console output produced by webdriver_manager (from webdriver_manager.chrome import ChromeDriverManager).

Set Accepted-Lang for Chrome Headless with Selenium (Python)

Setting the Accepted-Lang header works fine with regular Chrome via ChromeOptions
options.add_experimental_option('prefs', {'intl.accept_languages': 'en,en_US'})
I'm trying to switch to the new headless Chrome, but apparently this option has no effect when checking headers on validator.w3.org. Can I change them in another way? Does anybody know whether support for this feature is coming?
Using Chrome 60, Chromedriver 2.30, Selenium 3.4.3, Python 3.6.1 on MacOS
Using this code:
from selenium import webdriver
print('Start')
options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_experimental_option('prefs', {'intl.accept_languages':'en,en_US'})
driver = webdriver.Chrome(chrome_options=options)
driver.get('http://validator.w3.org/i18n-checker/check?uri=google.com#validate-by-uri+')
print('Loaded')
# Check headers in output.html file. Search for 'Request headers'
html_source = driver.page_source
file = open('output.html', 'w')
file.write(html_source)
file.close()
driver.implicitly_wait(5)
# Or check headers with select
# WARNING: This fails with 'headless' chrome option!
element = driver.find_element_by_xpath("//code[@class='headers_accept_language_0']").get_attribute('textContent')
print('Element:', element)
driver.close()
print('Finish')
Thanks!
That should be possible using the Chrome DevTools Protocol (CDP).
You can execute CDP commands using driver.execute_cdp_cmd().
I implemented this in Selenium-Profiles.
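A sketch of what the CDP route might look like, assuming the Network.setExtraHTTPHeaders command is an acceptable way to set the header (the answer does not say which CDP command it has in mind, and whether headless Chrome honours it for every request should be verified against the validator page). set_accept_language and RecordingDriver are hypothetical names; the recording class only stands in for a real Chrome driver so the example runs without a browser.

```python
def set_accept_language(driver, value="en-US,en"):
    """Ask Chrome, via CDP, to send `value` as the Accept-Language header."""
    driver.execute_cdp_cmd("Network.enable", {})
    driver.execute_cdp_cmd(
        "Network.setExtraHTTPHeaders",
        {"headers": {"Accept-Language": value}},
    )


class RecordingDriver:
    """Stand-in for a Chrome driver: it only records the CDP calls it receives."""

    def __init__(self):
        self.calls = []

    def execute_cdp_cmd(self, cmd, cmd_args):
        self.calls.append((cmd, cmd_args))
        return {}


fake = RecordingDriver()
set_accept_language(fake)
for command, params in fake.calls:
    print(command, params)
```

With a real driver, set_accept_language(driver) would be called after webdriver.Chrome(...) is created and before driver.get(...).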

Resources