How to run headless firefox browser on remote server over SSH connection? - python-3.x

I have a remote server on which I want to run a headless Firefox session. I log in to the remote server and execute the script. Even though it is configured to run headless, it still opens Firefox on my own machine and performs the actions there. Any idea what the reason could be? I want these actions to run remotely, without my local machine (such as my laptop) being connected.
from selenium.webdriver import Firefox
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support import expected_conditions as expected
from selenium.webdriver.support.wait import WebDriverWait
if __name__ == "__main__":
    options = Options()
    options.add_argument('-headless')
    driver = Firefox(executable_path='/path/to/geckodriver', firefox_options=options)
    wait = WebDriverWait(driver, timeout=10)
    driver.get('http://www.google.com')
    wait.until(expected.visibility_of_element_located((By.NAME, 'q'))).send_keys('headless firefox' + Keys.ENTER)
    wait.until(expected.visibility_of_element_located((By.CSS_SELECTOR, '#ires a'))).click()
    print(driver.page_source)
    driver.quit()

I resolved it myself as follows:
First run this in a terminal:
sudo apt-get install xvfb
sudo pip3 install pyvirtualdisplay
Then add the following lines to your code:
from pyvirtualdisplay import Display
display = Display(visible=0,size=(1024,768))
display.start()
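And my browser configuration looks like this:
# Imports needed in addition to those in the question's script
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = False
# Start the virtual X display before launching the browser
display = Display(visible=0, size=(1024, 768))
display.start()
options = Options()
options.set_headless(headless=True)
binary = FirefoxBinary("/home/ubuntu/firefox/firefox")
options.add_argument("-headless")
browser = Firefox(firefox_options=options, executable_path='/home/ubuntu/Documents/sourcecode/geckodriver', firefox_binary=binary, capabilities=cap)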
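A lighter-weight alternative to a virtual display can be sketched as follows, assuming the stray local window comes from X11 forwarding on the SSH session (the question does not confirm this). Firefox honours the MOZ_HEADLESS environment variable, so the script can clean up its environment before the browser is started. The helper name apply_headless_env and the sample values are hypothetical.

```python
import os


def apply_headless_env(environ=os.environ):
    """Adjust an environment mapping in place for a headless Firefox run."""
    # Assumption: DISPLAY was set by SSH X11 forwarding and points at the local machine.
    environ.pop("DISPLAY", None)
    # Firefox starts without a visible window when MOZ_HEADLESS is set.
    environ["MOZ_HEADLESS"] = "1"
    return environ


# Demonstration on a plain dict rather than the real process environment.
example = apply_headless_env({"DISPLAY": "localhost:10.0", "HOME": "/home/ubuntu"})
print(example)
```

Calling apply_headless_env() with no arguments before creating the driver should let geckodriver and Firefox inherit the adjusted environment.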

Related

Why does opt.add_argument('--user-data-dir='+r'path') work but opt.add_argument('--user-data-dir='+fr'"{path}"') doesn't as an option to Selenium?

I noticed something odd when trying to launch a Chrome driver with the user's --user-data-dir and --profile-directory on Python 3.9.7; see below.
If you run the following code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
opt = Options() #the variable that will store the selenium options
opt.add_argument('--user-data-dir='+r'C:\Users\ResetStoreX\AppData\Local\Google\Chrome\User Data') #Add the user data path as an argument in selenium Options
opt.add_argument('--profile-directory=Default') #Add the profile directory as an argument in selenium Options
s = Service('C:/Users/ResetStoreX/AppData/Local/Programs/Python/Python39/Scripts/chromedriver.exe')
driver = webdriver.Chrome(service=s, options=opt)
driver.get('https://opensea.io/login?referrer=%2Faccount')
You successfully get a Chrome driver instance using the corresponding --user-data-dir and --profile-directory.
Now, after killing all chromedriver instances with the following command in cmd:
taskkill /F /IM chromedriver.exe
and then running this other code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
opt = Options() #the variable that will store the selenium options
path = input('Introduce YOUR profile path:')
opt.add_argument('--user-data-dir='+fr'"{path}"') #Add the user data path as an argument in selenium Options
opt.add_argument('--profile-directory=Default') #Add the profile directory as an argument in selenium Options
s = Service('C:/Users/ResetStoreX/AppData/Local/Programs/Python/Python39/Scripts/chromedriver.exe')
driver = webdriver.Chrome(service=s, options=opt)
driver.get('https://opensea.io/login?referrer=%2Faccount')
and finally typing C:\Users\ResetStoreX\AppData\Local\Google\Chrome\User Data as input,
you get this error:
WebDriverException: unknown error: Could not remove old devtools port
file. Perhaps the given user-data-dir at
"C:\Users\ResetStoreX\AppData\Local\Google\Chrome\User Data"
is still attached to a running Chrome or Chromium process
Why does that happen?
Isn't opt.add_argument('--user-data-dir='+fr'"{path}"') a valid way of passing this user data path:
path = C:\Users\ResetStoreX\AppData\Local\Google\Chrome\User Data ?
I figured it out: opt.add_argument('--user-data-dir='+fr'"{path}"') wraps the path in literal double quotes, which end up as part of the directory name (the error message above shows the path with its quotes). I changed it to opt.add_argument('--user-data-dir='+fr'{path}'). The improved code is the following:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
opt = Options() #the variable that will store the selenium options
path = input('Introduce YOUR profile path:')
opt.add_argument('--user-data-dir='+fr'{path}') #Add the user data path as an argument in selenium Options
opt.add_argument('--profile-directory=Default') #Add the profile directory as an argument in selenium Options
s = Service('C:/Users/ResetStoreX/AppData/Local/Programs/Python/Python39/Scripts/chromedriver.exe')
driver = webdriver.Chrome(service=s, options=opt)
driver.get('https://opensea.io/login?referrer=%2Faccount')
Running this code completes without errors and gives the same result as the first code shown in this post.
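Since the quotes in the error message appear to be kept as part of the path, one way to guard against users pasting a quoted path is to strip them before building the argument. This is a minimal sketch; the helper name user_data_dir_arg is hypothetical and not part of Selenium.

```python
def user_data_dir_arg(raw_path):
    """Build a --user-data-dir argument from user input, dropping stray quotes."""
    # Remove surrounding whitespace first, then any enclosing double or single quotes.
    cleaned = raw_path.strip().strip('"').strip("'")
    return "--user-data-dir=" + cleaned


# The same path as in the question, typed with enclosing double quotes.
typed = r'"C:\Users\ResetStoreX\AppData\Local\Google\Chrome\User Data"'
arg = user_data_dir_arg(typed)
print(arg)
```

The result can be passed straight to opt.add_argument(arg); the space in "User Data" needs no quoting because the value is handed over as a single argument.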

--headless is not an option in Chrome WebDriver for Selenium

I would like to have Selenium run a headless instance of Google Chrome to mine data from certain websites without the UI overhead. I downloaded the ChromeDriver executable from here and copied it to my current scripting directory.
The driver appears to work fine with Selenium and is able to browse automatically, however I cannot seem to find the headless option. Most online examples of using Selenium with headless Chrome go something along the lines of:
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.binary_location = '/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary'
driver = webdriver.Chrome(executable_path=os.path.abspath("chromedriver"), chrome_options=chrome_options)
driver.get("http://www.duo.com")
However when I inspect the possible arguments for the Selenium WebDriver using the command chromedriver -h this is what I get:
D:\Jobs\scripts>chromedriver -h
Usage: chromedriver [OPTIONS]
Options
--port=PORT port to listen on
--adb-port=PORT adb server port
--log-path=FILE write server log to file instead of stderr, increases log level to INFO
--log-level=LEVEL set log level: ALL, DEBUG, INFO, WARNING, SEVERE, OFF
--verbose log verbosely (equivalent to --log-level=ALL)
--silent log nothing (equivalent to --log-level=OFF)
--append-log append log file instead of rewriting
--replayable (experimental) log verbosely and don't truncate long strings so that the log can be replayed.
--version print the version number and exit
--url-base base URL path prefix for commands, e.g. wd/url
--whitelisted-ips comma-separated whitelist of remote IP addresses which are allowed to connect to ChromeDriver
No --headless option is available.
Does the ChromeDriver obtained from the link above allow for headless browsing?
--headless is not an argument for chromedriver but for Chrome. The --headless switch runs Chrome in headless mode, i.e., without a UI or display server dependencies. ChromeDriver is a separate executable that WebDriver uses to control Chrome, and WebDriver is a collection of language-specific bindings to drive a browser.
I am able to run in headless mode with this set of options. I hope this will help:
from bs4 import BeautifulSoup, NavigableString
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
import requests
import re
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
browser = webdriver.Chrome(chrome_options=options) # see edit for recent code change.
browser.implicitly_wait(20)
Update 12 Aug 2019:
old : browser = webdriver.Chrome(chrome_options=options)
new : browser = webdriver.Chrome(options=options)
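Because --headless belongs to Chrome rather than to chromedriver, it travels in the browser options. The following is a minimal sketch, assuming a Selenium version that accepts the options= keyword (as in the update above) and a chromedriver that Selenium can locate. The names chrome_arguments and make_headless_chrome are hypothetical helpers, and Selenium is imported lazily so the switch list can be inspected without it.

```python
def chrome_arguments(headless=True):
    """Return the command-line switches intended for Chrome (not chromedriver)."""
    args = ["--no-sandbox", "--disable-gpu"]
    if headless:
        args.insert(0, "--headless")
    return args


def make_headless_chrome():
    """Start Chrome with the switches above; needs Selenium and chromedriver."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_arguments():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


print(chrome_arguments())
```

Calling make_headless_chrome() returns a driver that should be closed with quit() when the work is done.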
Try:
options.headless=True
The following is how I set up my headless Chrome:
options = webdriver.ChromeOptions()
options.headless=True
options.add_argument('window-size=1920x1080')
prefs = {
    "download.default_directory": r"C:\FilePath\Download",
    "download.prompt_for_download": False,
    "download.directory_upgrade": True}
options.add_experimental_option('prefs', prefs)
chromedriver = (r"C:\Filepath\chromedriver.exe")
--headless is an argument for Chrome, not for chromedriver. More arguments are available in the list of Chrome command-line switches.

proxy server refusing connections when trying to run tor browser with selenium (using TorBrowserDriver not profile and binary) [duplicate]

I am trying to connect to a Tor browser but get a "proxyConnectFailure" error. I have made multiple attempts to get even a basic Tor browser connection working, all in vain. Any help would be greatly appreciated:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary(r"C:\Users\Admin\Desktop\Tor Browser\Browser\firefox.exe")
profile = FirefoxProfile(r"C:\Users\Admin\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default")
# Configured profile settings.
proxyIP = "127.0.0.1"
proxyPort = 9150
proxy_settings = {"network.proxy.type": 1,
                  "network.proxy.socks": proxyIP,
                  "network.proxy.socks_port": proxyPort,
                  "network.proxy.socks_remote_dns": True,
                  }
driver = webdriver.Firefox(firefox_binary=binary, proxy=proxy_settings)
def interactWithSite(driver):
    driver.get("https://www.google.com")
    driver.save_screenshot("screenshot.png")
interactWithSite(driver)
To connect to a Tor Browser through a FirefoxProfile you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os
torexe = os.popen(r'C:\Users\AtechM_03\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile = FirefoxProfile(r'C:\Users\AtechM_03\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
driver = webdriver.Firefox(firefox_profile= profile, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("http://check.torproject.org")
You can find a relevant discussion in How to use Tor with Chrome browser through Selenium
I would like to expand on @DebanjanB's answer by adding the Linux counterpart:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os
torexe = os.popen('some/path/tor-browser_en-US/Browser/start-tor-browser')
# in my case, I installed it under a folder tor-browser_en-US after
# downloading and extracting it from
# https://www.torproject.org/download/ for linux
profile = FirefoxProfile(
    'some/path/tor-browser_en-US/Browser/TorBrowser/Data/Browser/profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
firefox_options = webdriver.FirefoxOptions()
firefox_options.binary_location = '/usr/bin/firefox'
# /usr/bin/firefox is default location of firefox - for me anyway
driver = webdriver.Firefox(
    firefox_profile=profile, options=firefox_options,
    executable_path='wherever/you/installed/geckodriver')
# I keep my geckodriver(s) in a special folder sorted by versions.
# Geckodriver downloadable here:
# https://github.com/mozilla/geckodriver/releases/
driver.get("http://check.torproject.org")
The verified answer does not work for opening .onion sites (I believe this has something to do with the Tor network not allowing access from a normal Firefox).
With the latest Tor Browser (from the Tor Browser Bundle), starting it through Selenium causes an error that prevents the browser from starting the Tor proxy itself, which leads to proxy and timeout errors (regardless of whether the Tor proxy was started by Python, started manually, or not started at all). This could also be due to port 9050 or 9150 being used by the Tor proxy and not being available to the browser's Tor instance, but that does not explain the error when no Tor proxy instance is running.
The solution I found is to start the Tor proxy as normal, manually or using os.popen("tor.exe"), and to configure Tor Browser not to start its own Tor proxy.
Here's the code:
import os
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
os.popen(r'e:\\bla\\bla\\bla\\tor\\Tor\\tor.exe')
binary=FirefoxBinary(r'e:\\bla\\bla\\bla\\Tor Browser\\Browser\\firefox.exe')
fp=FirefoxProfile(r'e:\\foo\\bar\\bla\\Tor Browser\\Browser\\TorBrowser\\Data\\Browser\\profile.default')
fp.set_preference('extensions.torlauncher.start_tor',False)#note this
fp.set_preference('network.proxy.type',1)
fp.set_preference('network.proxy.socks', '127.0.0.1')
fp.set_preference('network.proxy.socks_port', 9050)
fp.set_preference("network.proxy.socks_remote_dns", True)
fp.update_preferences()
driver = webdriver.Firefox(firefox_profile=fp,firefox_binary=binary)
driver.get("http://check.torproject.org")
driver.get('https://www.bbcnewsv2vjtpsuy.onion/')
Note: fp.set_preference('extensions.torlauncher.start_tor',False) (marked "note this" in the code) configures Tor Browser not to start its own Tor instance, so that it uses the proxy configuration and the Tor instance started above.
With that, the Tor Browser Bundle (TBB) works like a normal Firefox browser driven by a bot.
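The question and the answers above use different SOCKS ports (9150 and 9050), and a refused proxy connection is what one would expect when nothing is listening on the configured port. A small sketch for checking which port accepts connections before starting the driver follows; the helper names and the candidate list are assumptions for illustration, not part of Selenium or Tor.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_open_port(host="127.0.0.1", candidates=(9050, 9150)):
    """Return the first candidate port accepting connections, or None."""
    for port in candidates:
        if port_is_open(host, port):
            return port
    return None


# Prints the port a local Tor proxy is listening on, or None if neither answers.
print(first_open_port())
```

The returned port can then be used for the network.proxy.socks_port preference instead of a hard-coded value.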

Chrome driver not working headlessly with Python and Selenium?

I am new to Selenium programming with Python. The code works fine without the --headless argument, but does not execute at all when I try to run it headlessly. Could someone help me out with it?
Below is my sample code, I am using Python 3.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import time
import requests
from selenium.webdriver.chrome.options import Options
options = Options()
options.set_headless(headless=True)
driver = webdriver.Chrome(chrome_options=options, executable_path=r'E:/Python/[FreeTutorials.Us] Udemy - python-master-web-scraping-course-doing-20-real-projects/03 Step _ Download HTML Content/chromedriver.exe')
driver.get("https://www.tirerack.com/survey/ValidationServlet?autoYear=2006&autoMake=Porsche&autoModel=911%20Carrera%20S%20Cabriolet&newDesktop=true")
print ("Headless Chrome Initialized")
html_doc=driver.page_source
soup= BeautifulSoup(html_doc,'lxml')
print(soup)
driver.quit()
I added headless as an argument instead of using the option, and it worked for me. Give it a try.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path='path to your chromedriver')
driver.get('https://stackoverflow.com/')
Note: I keep chromedriver in Python's home directory (e.g. C:\Python34); if that directory is on your PATH, Selenium can find the driver without an explicit executable_path.

selenium - Can't work out how to get FireFox WebDriver to work

This should be easy, but I'm missing something. I'm just trying to get some selenium python tests running on firefox, which work perfectly in chrome.
The problem is just trying to get the ff webdriver up and running!
I have the following code, all the paths are correct:
import selenium
from selenium.webdriver.firefox import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('C:\\Program Files\\Mozilla Firefox\\firefox.exe')
profile = webdriver.FirefoxProfile()
geckopath = r'C:\source\web_deploy_tests\geckodriver.exe'
browser = selenium.webdriver.Firefox(
    capabilities={},
    executable_path=geckopath,
    firefox_profile=profile,
    firefox_binary=binary
)
browser.get("http://google.com")
I'm using Python 3.6.2, selenium 3.6.0 and have v0.19.0 of geckodriver.exe and FF is v56.0.1.
When I run the above code, firefox appears but just sits there for about 30 secs then crashes with:
selenium.common.exceptions.WebDriverException: Message: Can't load the
profile. Possible firefox version mismatch. You must use GeckoDriver
instead for Firefox 48+. Profile Dir:
C:\Users\ADMINI~1\AppData\Local\Temp\3\tmpkx5dau8h If you specified a
log_file in the FirefoxBinary constructor, check it for details.
I've tried various combinations of args but I am failing.
Any ideas?
TIA
I got it working by passing in DesiredCapabilities.FIREFOX
import selenium
from selenium.webdriver import DesiredCapabilities
from selenium.webdriver.firefox import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('C:\\Program Files\\Mozilla Firefox\\firefox.exe')
profile = webdriver.FirefoxProfile()
geckopath = r'C:\source\web_deploy_tests\geckodriver.exe'
browser = selenium.webdriver.Firefox(
    capabilities=DesiredCapabilities.FIREFOX,
    executable_path=geckopath,
    firefox_profile=profile,
    firefox_binary=binary
)
browser.get("http://google.com")
Now I can at least bring up the browser window!

Resources