How to execute chrome-webdriver in an Azure Cloud Service?

I am trying to run chrome-webdriver on an Azure cloud service, but the following error occurs.
chrome-webdriver sample code:
from selenium import webdriver
driver = webdriver.Chrome('/.chromedriver.exe')
url = "https://www.instagram.com/"
driver.get(url)
As the screenshot showed, I put the chromedriver file in the same directory as the .ipynb file. The path was set to './chromedriver.exe', but an error still occurred.
Other attempted methods
1. driver = webdriver.Chrome(r"\chromedriver.exe")
2. driver = webdriver.Chrome("\\chromedriver.exe")
3. driver = webdriver.Chrome("/chromedriver.exe")
-> Each of these also raised an error.
How can I run chrome-webdriver in an Azure cloud service?
Update (for Vova Bilyachat's comments):
1. driver = webdriver.Chrome('./Users/admin/chromedriver.exe')
-> Message: 'chromedriver.exe' executable needs to be in PATH.
2. driver = webdriver.Chrome('./chromedriver.exe')
-> OSError: [Errno 8] Exec format error: './chromedriver.exe'
Update 2:

I think your error is that you set the path to "/chromedriver.exe", which looks for the file in the root folder. Change it to "./chromedriver.exe", where "./" means "start from the folder you execute the script from".
You must also be sure that you deployed the right driver for the right operating system.
And since it's Linux, use:
driver = webdriver.Chrome('./chromedriver')
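For reference, a minimal sketch of the Linux setup is below. It assumes a Linux chromedriver binary (no .exe) sits next to the notebook and that Chrome has to run headless on the cloud host; the extra options are common assumptions for headless Linux environments, not something stated in the original post.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# cloud hosts usually have no display, so run Chrome headless
options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")

# point Selenium at the Linux chromedriver binary in the working directory
driver = webdriver.Chrome("./chromedriver", options=options)
driver.get("https://www.instagram.com/")
print(driver.title)
driver.quit()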

Related

winIDEA Python script called from Jenkins

I am trying to launch a winIDEA configuration stored on the U: drive. I am able to execute this script from the command prompt. However, when I try to execute it from Jenkins, it gives an error related to a DLL. My code is:
import isystem.connect as ic
import time
print('isystem.connect version: ' + ic.getModuleVersion())
# 1. connect to winIDEA Application
pathTowinIDEA = 'C:/winIDEA/iConnect.dll'
cmgr_APPL = ic.ConnectionMgr(pathTowinIDEA)
cmgr_APPL.connectMRU('U:/winIDEA/myconfig.xjrf')
debug_APPL = ic.CDebugFacade(cmgr_APPL)
ec = ic.CExecutionController(cmgr_APPL)
The error occurs at the line:
cmgr_APPL.connectMRU('U:/winIDEA/myconfig.xjrf')
The error (shown only as a screenshot in the original post) is a DLL-related failure.
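Since the actual exception text lives only in that screenshot, one way to surface it in Jenkins' console log is to wrap the failing call. This is only a diagnostic sketch built from the calls already shown above, not part of the original post:
import traceback
import isystem.connect as ic

pathTowinIDEA = 'C:/winIDEA/iConnect.dll'
cmgr_APPL = ic.ConnectionMgr(pathTowinIDEA)
try:
    cmgr_APPL.connectMRU('U:/winIDEA/myconfig.xjrf')
except Exception:
    # print the full traceback so the Jenkins build log captures the real error
    traceback.print_exc()
    raise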

User program failed with ValueError: ZIP does not support timestamps before 1980

Running the pipeline fails with the following error:
User program failed with ValueError: ZIP does not support timestamps before 1980
I created an Azure ML pipeline that calls several child runs. See the attached code.
from azureml.core import Run, ScriptRunConfig

# start parent Run
run = Run.get_context()
workspace = run.experiment.workspace

runconfig = ScriptRunConfig(source_directory=".", script="simple-for-bug-check.py")
runconfig.run_config.target = "cpu-cluster"

# Submit the child runs
for i in range(10):
    print("child run ...")
    run.submit_child(runconfig)
It seems the timestamp of the Python script (simple-for-bug-check.py) is invalid.
My Python SDK version is 1.0.83.
Is there any workaround for this?
Regards,
Keita
One workaround to the issue is setting the source_directory_data_store to a datastore pointing to a file share. Every workspace comes with a datastore pointing to a file share by default, so you can change the parent run submission code to:
# workspacefilestore is the datastore that is created with every workspace that points to a file share
run_config.source_directory_data_store = 'workspacefilestore'
The line above applies when you are using a RunConfiguration directly; if you are using an estimator, you can do the following:
datastore = Datastore(workspace, 'workspacefilestore')
est = Estimator(..., source_directory_data_store=datastore, ...)
The cause of the issue is that the current working directory in a run is a blobfuse-mounted directory, and in the current (1.2.4) as well as prior versions of blobfuse, the last modified date of every directory is set to the Unix epoch (1970/01/01). Changing the source_directory_data_store to a file share changes the current working directory to a CIFS-mounted file share, which has the correct last modified time for directories and therefore avoids this issue.
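Put together with the parent-run code from the question, the workaround might look like the sketch below; 'workspacefilestore' is assumed to be the default file-share datastore name described above.
from azureml.core import Run, ScriptRunConfig

run = Run.get_context()
workspace = run.experiment.workspace

runconfig = ScriptRunConfig(source_directory=".", script="simple-for-bug-check.py")
runconfig.run_config.target = "cpu-cluster"
# route the snapshot through the file-share datastore instead of blobfuse
runconfig.run_config.source_directory_data_store = "workspacefilestore"

for i in range(10):
    print("child run ...")
    run.submit_child(runconfig)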

How to fix "Cannot connect to server" exception?

I was setting up chromedriver with Selenium, using the test script provided on the chromedriver website. Everything worked fine until I switched to a different WiFi network. Now I'm getting an error message when running my script.
I have searched the web for solutions, and I've tried the following things:
Made sure the chromedriver version matches my Chrome version.
Tried to whitelist the IP address.
Checked for "127.0.0.1 localhost" in /etc/hosts.
The test code I'm running (/path/to/my/chromedriver is correct):
import time
from selenium import webdriver
driver = webdriver.Chrome("/path/to/my/chromedriver") # Optional argument, if not specified will search path.
driver.get('http://www.google.com/xhtml');
time.sleep(5) # Let the user actually see something!
search_box = driver.find_element_by_name('q')
search_box.send_keys('ChromeDriver')
search_box.submit()
time.sleep(5) # Let the user actually see something!
driver.quit()
I'm expecting the program to run fine, and the browser should pop up. However, the browser is not popping up and I'm getting the following error message:
File "test.py", line 4, in
driver = webdriver.Chrome("/path/to/my/chromedriver") # Optional argument, if not specified will search path.
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in init
self.service.start()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 104, in start
raise WebDriverException("Can not connect to the Service %s" % self.path)
selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service /path/to/my/chromedriver
When running the chromedriver in the terminal I'm getting the following message (and the browser is also not popping up as supposed to):
Only local connections are allowed.
Please protect ports used by ChromeDriver and related test frameworks to prevent access by malicious code.
EDIT: I have the same problem with the geckodriver for Firefox, so it is not specific to Chrome.
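One way to narrow this down is to start chromedriver by hand in a terminal (./chromedriver --port=9515) and attach to it directly, which separates "the driver process will not start" from "the client cannot reach it". The sketch below is only a diagnostic idea, not part of the original post; 9515 is chromedriver's default port.
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# attach to an already-running chromedriver instead of letting Selenium spawn one
driver = webdriver.Remote(
    command_executor="http://127.0.0.1:9515",
    desired_capabilities=DesiredCapabilities.CHROME)
driver.get("http://www.google.com/xhtml")
print(driver.title)
driver.quit()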

Why am I getting : Unable to import module 'handler': No module named 'paramiko'?

I needed to move files with an AWS Lambda from an SFTP server to my AWS account, and I found this article:
https://aws.amazon.com/blogs/compute/scheduling-ssh-jobs-using-aws-lambda/
It talks about paramiko as an SSH client candidate to move files over SSH.
I then wrote this class wrapper in Python to be used from my serverless handler file:
import paramiko
import sys

class FTPClient(object):
    def __init__(self, hostname, username, password):
        """
        creates ftp connection
        Args:
            hostname (string): endpoint of the ftp server
            username (string): username for logging in on the ftp server
            password (string): password for logging in on the ftp server
        """
        try:
            self._host = hostname
            self._port = 22
            # lets you save results of the download into a log file.
            # paramiko.util.log_to_file("path/to/log/file.txt")
            self._sftpTransport = paramiko.Transport((self._host, self._port))
            self._sftpTransport.connect(username=username, password=password)
            self._sftp = paramiko.SFTPClient.from_transport(self._sftpTransport)
        except:
            print("Unexpected error", sys.exc_info())
            raise

    def get(self, sftpPath):
        """
        downloads a file from the ftp server and returns its contents
        Args:
            sftpPath = "path/to/file/on/sftp/to/be/downloaded"
        """
        localPath = "/tmp/temp-download.txt"
        self._sftp.get(sftpPath, localPath)
        self._sftp.close()
        tmpfile = open(localPath, 'r')
        return tmpfile.read()

    def close(self):
        self._sftpTransport.close()
On my local machine it works as expected (test.py):
import ftp_client

sftp = ftp_client.FTPClient(
    "host",
    "myuser",
    "password")
file = sftp.get('/testFile.txt')
print(file)
But when I deploy it with Serverless and run the handler.py function (same as the test.py above), I get back the error:
Unable to import module 'handler': No module named 'paramiko'
It looks like the deployment is unable to import paramiko (from the article above it seemed it should be available for the Python 3 Lambda runtime on AWS), shouldn't it be?
If not, what's the best practice for this case? Should I include the library in my local project and package/deploy it to AWS?
A comprehensive guide/tutorial exists at:
https://serverless.com/blog/serverless-python-packaging/
It uses the serverless-python-requirements package as a Serverless node plugin.
A virtualenv and the Docker daemon are required to package up your serverless project before deploying it to AWS Lambda.
If you use
custom:
  pythonRequirements:
    zip: true
in your serverless.yml, you have to use this code snippet at the start of your handler:
try:
    import unzip_requirements
except ImportError:
    pass
All the details can be found in the Serverless Python Requirements documentation.
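As a rough illustration of how this fits together, a packaged handler might look like the sketch below; the function name main and the SFTP host are placeholders, and the paramiko calls mirror the wrapper class from the question:
try:
    import unzip_requirements  # shim generated by the plugin; only present in the packaged artifact
except ImportError:
    pass

import paramiko  # importable once the zipped requirements have been extracted

def main(event, context):
    # download one file from the SFTP server into Lambda's writable /tmp
    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="myuser", password="password")
    sftp = paramiko.SFTPClient.from_transport(transport)
    sftp.get("/testFile.txt", "/tmp/testFile.txt")
    sftp.close()
    transport.close()
    return {"status": "downloaded"}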
You have to create a virtualenv, install your dependencies and then zip all the files under site-packages/:
sudo pip install virtualenv
virtualenv -p python3 myvirtualenv
source myvirtualenv/bin/activate
pip install paramiko
cp handler.py myvirtualenv/lib/python3.6/site-packages/
cd myvirtualenv/lib/python3.6/site-packages/
zip -r ../../../../package.zip .
then upload package.zip to Lambda.
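As a quick local sanity check (a sketch, not part of the original answer), you can confirm the resulting archive has the layout Lambda expects, with handler.py and the paramiko package at the root of the zip:
import zipfile

names = zipfile.ZipFile("package.zip").namelist()
print("handler.py at root:", "handler.py" in names)
print("paramiko at root:", any(n.startswith("paramiko/") for n in names))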
You have to provide all dependencies that are not installed in AWS' Python runtime.
Take a look at Step 7 in the tutorial: the author adds the dependencies from the virtual environment to the zip file. So I'd expect your ZIP file to contain the following:
your worker_function.py on the top level
a folder paramiko with the files installed in the virtual env
Please let me know if this helps.
I tried various blogs and guides like:
web scraping with lambda
AWS Layers for Pandas
I spent hours trying things out, facing size issues like that or being unable to import modules, etc.
I nearly reached the end (that is, invoking my handler function LOCALLY), but even though my function was fully deployed correctly and could even be invoked LOCALLY with no problems, it was impossible to invoke it on AWS.
The most comprehensive and by far the best guide, the one that is ACTUALLY working, is the one mentioned above by @koalaok. Thanks buddy!
actual link

Bad SSL Key When Trying to Use spark-ec2 script to launch cluster on EC2?

Version of Apache Spark: spark-1.2.1-bin-hadoop2.4
Platform: Ubuntu
I have been using the spark-1.2.1-bin-hadoop2.4/ec2/spark-ec2 script to create temporary clusters on ec2 for testing. All was working well.
Then I started to get the following error when trying to launch the cluster:
[Errno 185090050] _ssl.c:344: error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib
I have traced this back to the following line in the spark_ec2.py script:
conn = ec2.connect_to_region(opts.region)
Thus, the first time the script interacts with ec2, it is throwing this error. Spark is using the Python boto library (included with the Spark download) to make this call.
I assume the error I am getting is because of a bad cacert.pem file somewhere.
My question: which cacert.pem file gets used when I try to invoke the spark-ec2 script, and why is it not working?
I also had this error with spark-1.2.0-bin-hadoop2.4
SOLVED: the embedded boto library that comes with Spark found a ~/.boto config file I had for another, non-Spark project (it was actually for Google Cloud Services; GCS installed it and I had forgotten about it). That was screwing everything up.
As soon as I deleted the ~/.boto config file GCS installed, everything started working again for Spark!
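As a small illustration of that resolution, the sketch below (an assumption, not from the original post) lists the config files the embedded boto library typically picks up, so a stray ~/.boto can be spotted before running spark-ec2:
import os

# boto reads /etc/boto.cfg and ~/.boto by default; a leftover file from
# another tool (for example one installed for GCS) can override the AWS settings
for path in ("/etc/boto.cfg", os.path.expanduser("~/.boto")):
    print(path, "->", "present" if os.path.exists(path) else "not found")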
