How to use paralleldots api in app engine? - python-3.x

I want to check text similarity using the paralleldots API in App Engine, but setting the API key in App Engine with:
paralleldots.set_api_key("XXXXXXXXXXXXXXXXXXXXXXXXXXX")
causes App Engine to give this error:
with open('settings.cfg', 'w') as configfile:
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/stubs.py", line 278, in __init__
raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'settings.cfg'
INFO 2019-03-17 10:43:59,852 module.py:835] default: "GET / HTTP/1.1" 500 -
INFO 2019-03-17 10:46:47,548 client.py:777] Refreshing access_token
ERROR 2019-03-17 10:46:50,931 wsgi.py:263]
Traceback (most recent call last):
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/ulti72/Desktop/koda/main.py", line 26, in <module>
paralleldots.set_api_key("7PR8iwo42DGFB8qpLjpUGJPqEQHU322lqTDkgaMrX7I")
File "/home/ulti72/Desktop/koda/lib/paralleldots/config.py", line 13, in set_api_key
with open('settings.cfg', 'w') as configfile:
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/stubs.py", line 278, in __init__
raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'settings.cfg'

The paralleldots API seems to want to save a settings.cfg file to the local filesystem in response to that call. That is not allowed in the 1st generation standard environment, and in the 2nd generation it is only allowed for files under the /tmp filesystem.
The local development server was designed for the 1st generation standard environment and enforces the restriction with that error. It has only limited support for the 2nd generation environment; see Python 3.7 Local Development Server Options for new app engine apps.
Things to try:
check if specifying the location of settings.cfg is supported and, if so, make it reside under /tmp (see the sketch after this list). Maybe the local development server allows that, or you could switch to some local development method other than the development server.
check if saving the settings using an already open file handle is supported and, if so, use one obtained from the Cloud Storage client library, something along these lines: How to zip or tar a static folder without writing anything to the filesystem in python?
check if set_api_key() supports some other method of persisting the API key than saving the settings to a file
check if it's possible to specify the API key for every subsequent call so you don't have to persist it using set_api_key() (maybe using a common wrapper function for convenience)
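For the first item, a minimal sketch of the /tmp workaround, assuming (as the traceback suggests, since config.py opens settings.cfg with a relative path) that the library both writes and reads the file relative to the current working directory:
import os
import paralleldots

# Hypothetical workaround: make the relative settings.cfg path land in
# /tmp, the only writable filesystem in the 2nd generation standard env.
os.chdir('/tmp')
paralleldots.set_api_key("XXXXXXXXXXXXXXXXXXXXXXXXXXX")
Whether this works depends on the library also reading settings.cfg from the current working directory on all later calls.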

Related

UndetectedChromedriver Extension Issue

I can't use extensions with the UndetectedChromedriver PyPI package (Python). With normal Selenium it works, but not with this package. I tried to install extensions directly from the Web Store, but the Chrome Web Store prompt is not an alert that Selenium can handle; it's a window event, so you need AutoIT, PyAutoGUI, etc. to handle it.
The only thing that works is loading profiles. I'm running multiple windows in parallel, and that works, but I need to create hundreds of windows and then delete them, and I can't clone profiles because UndetectedChromedriver doesn't work with cloned profiles; I have to create them manually.
Finally I tried the Google Chrome Enterprise Bundle and used the extensions policy to force-install the extension for all profiles. That works, but with the policy enabled Selenium doesn't work properly.
The error traceback log is:
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\threading.py", line 950, in _bootstrap_inner
self.run()
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\threading.py", line 888, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\andre\OneDrive\Documentos\(A1)_Inicio\(A2)_CyberEspacio\LAB\(A1)_Programador123\(A1)_Programming_(Section)\VSCode Snippets\python\selenium\app.py", line 72, in test
seleniumCaptchaSolver.reCaptchaServiceLogin(apiKey='MYAPIKEY', solverType = SeleniumCaptchaSolverType().Capmonster)
File "C:\Users\andre\OneDrive\Documentos\(A1)_Inicio\(A2)_CyberEspacio\LAB\(A1)_Programador123\(A1)_Programming_(Section)\VSCode Snippets\python\selenium\modules\seleniumCaptchaSolver.py", line 103, in reCaptchaServiceLogin
self.__driver.get('chrome-extension://pabjfbciaedomjjfelfafejkppknjleh/popup.html')
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\undetected_chromedriver\__init__.py", line 535, in get
return super().get(url)
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 447, in get
self.execute(Command.GET, {'url': url})
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 435, in execute
self.error_handler.check_response(response)
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot determine loading status
from disconnected: received Inspector.detached event
(Session info: chrome=103.0.5060.134)
This happens only when chrome-extension://pabjfbciaedomjjfelfafejkppknjleh/popup.html is opened to log in (send the API key). I can log in etc., but when the policy is activated I can't, because of this issue.
Does anyone here know how to fix that, or how to properly use extensions with UndetectedChromedriver?
Note: this error only happens when I load the chrome-extension://pabjfbciaedomjjfelfafejkppknjleh/popup.html link; other links work.
I found this solution:
import undetected_chromedriver as uc
import os

working_dir = os.getcwd()
# I'm using a proxy extension; proxy_plugin is the path to the unpacked
# extension folder. I tried importing a .zip file and that doesn't work;
# maybe you can try a .crx file instead.
proxy_plugin = f'{working_dir}/proxy_plugin'

options = uc.ChromeOptions()
options.add_argument(f'--load-extension={proxy_plugin}')
# I also enable the extensions.ui.developer_mode preference
options.add_experimental_option('prefs', {'extensions.ui.developer_mode': True})

driver = uc.Chrome(options=options)
Also, I've seen these pages:
Load unpacked Chrome extension programmatically
Installing extension into V2

FileNotFoundError: [WinError 2] The system cannot find the file specified while loading model from s3

I recently saved a model to S3 using joblib (model_doc is the model object):
import subprocess
import joblib

def save_d2v_to_s3_current_doc2vec_model(model, fname):
    model_name = fname
    joblib.dump(model, model_name)
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(model_name, path).split()
    print('saving...' + model_name)
    subprocess.call(command)

save_d2v_to_s3_current_doc2vec_model(model_doc, "doc2vec_model")
It was successful, but when I try to load the model back from S3 it gives me an error:
def load_d2v(fname):
    model_name = fname
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(path, model_name).split()
    print('loading...' + model_name)
    subprocess.call(command)
    model = joblib.load(model_name)
    return model

model = load_d2v("doc2vec_model")
This is the error I get:
loading...doc2vec_model
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in load_d2v
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
restore_signals, start_new_session)
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
I don't even understand why it says file not found; this is the path I used to save the model, but now I'm unable to get the model back from S3. Please help me!!
I suggest that rather than your generic print() lines, which only show your intent, you print the actual command you've composed, to verify that it makes sense on inspection.
If it does, then also try that exact same aws ... command directly, at the command prompt where you've been launching your Python code, to make sure it runs that way. If it doesn't, you may get a clearer error.
Note that the error you're getting doesn't particularly look like it's coming from the aws command, or from the S3 service, which might talk about 'paths' or 'objects'. Rather, it's from Python's subprocess module and its Popen call. I think those are reached via your call to subprocess.call(), but for some reason your line-of-code isn't shown. (How are you running the block of code with the load_d2v()?)
That suggests the file that's not found might be the aws command itself. Are you sure it's installed and runnable from the exact working-directory/environment that your Python is running in and invoking via subprocess.call()?
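A minimal sketch of both checks, reusing the bucket path from the question (shutil.which simply tests whether an executable is on the PATH this Python process sees):
import shutil
import subprocess

s3_path = 's3://sd-flikku/datalake/current_doc2vec_model/doc2vec_model'
command = "aws s3 cp {} {}".format(s3_path, 'doc2vec_model').split()
print('about to run:', command)  # inspect the composed command first

# A FileNotFoundError from Popen usually means the executable itself
# was not found, not the S3 object.
if shutil.which('aws') is None:
    raise RuntimeError("the 'aws' CLI is not on PATH for this Python process")

subprocess.call(command)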
(BTW, if my previous answer got you over your sklearn.externals.joblib problem, it'd be good for you to mark the answer as accepted, to save other potential answerers from thinking that's still an unsolved question that's blocking you.)
Try adding your model file's extension to fname if you're confident the model file is there,
e.g. doc2vec_model.h3

Registering and downloading a fastText .bin model fails with Azure Machine Learning Service

I have a simple RegisterModel.py script that uses the Azure ML Service SDK to register a fastText .bin model. This completes successfully and I can see the model in the Azure Portal UI (though I cannot see what model files are in it). I then want to download the model (DownloadModel.py) and use it (for testing purposes); however, model.download throws an error (tarfile.ReadError: file could not be opened successfully) and produces a 0-byte rjtestmodel8.tar.gz file.
I then use the Azure Portal's Add Model and select the same .bin model file, and it uploads fine. Downloading it with the DownloadModel.py script below also works fine, so I am assuming something is not correct with the Register script.
Here are the 2 scripts and the stacktrace - let me know if you can see anything wrong:
RegisterModel.py
import azureml.core
from azureml.core import Workspace, Model

ws = Workspace.from_config()
model = Model.register(workspace=ws,
                       model_name='rjSDKmodel10',
                       model_path='riskModel.bin')
DownloadModel.py
# Works when downloading the UI Uploaded .bin file, but not the SDK registered .bin file
import os
import azureml.core
from azureml.core import Workspace, Model
ws = Workspace.from_config()
model = Model(workspace=ws, name='rjSDKmodel10')
model.download(target_dir=os.getcwd(), exist_ok=True)
Stacktrace
Traceback (most recent call last):
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\ptvsd_launcher.py", line 43, in <module>
main(ptvsdArgs)
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\lib\python\ptvsd\__main__.py", line 432, in main
run()
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\lib\python\ptvsd\__main__.py", line 316, in run_file
runpy.run_path(target, run_name='__main__')
File "...\.conda\envs\DoC\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "...\.conda\envs\DoC\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "...\.conda\envs\DoC\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "...\\DownloadModel.py", line 21, in <module>
model.download(target_dir=os.getcwd(), exist_ok=True)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 712, in download
file_paths = self._download_model_files(sas_to_relative_download_path, target_dir, exist_ok)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 658, in _download_model_files
file_paths = self._handle_packed_model_file(tar_path, target_dir, exist_ok)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 670, in _handle_packed_model_file
with tarfile.open(tar_path) as tar:
File "...\.conda\envs\DoC\lib\tarfile.py", line 1578, in open
raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
Environment
riskModel.bin is 6 megs
AMLS 1.0.60
Python 3.7
Working locally with Visual Code
The Azure Machine Learning service SDK has a bug in how it interacts with Azure Storage, which causes it to upload corrupted files if it has to retry uploading.
A couple of workarounds:
The bug was introduced in the 1.0.60 release. If you downgrade to AzureML-SDK 1.0.55, the code should fail when there are issues uploading, instead of silently corrupting data.
It's possible that the retry is being triggered by the low timeout values that the AzureML-SDK defaults to. You could investigate changing the timeout in site-packages/azureml/_restclient/artifacts_client.py.
This bug should be fixed in the next release of the AzureML-SDK.
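For the first workaround, a quick way to confirm which SDK version is actually in use (azureml.core.VERSION is the SDK's version string):
import azureml.core

# Confirm the installed SDK version before/after downgrading,
# e.g. with: pip install azureml-sdk==1.0.55
print(azureml.core.VERSION)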

Localhost: how to get credentials to connect GAE Python 3 app and Datastore Emulator?

I'd like to use the new Datastore Emulator together with a GAE Flask app on localhost. I want to run it in the Docker environment, but the error I get (DefaultCredentialsError) happens with or without Docker.
My Flask file looks like this (see the whole repository here on GitHub):
main.py:
from flask import Flask
from google.cloud import datastore

app = Flask(__name__)

@app.route("/")
def index():
    return "App Engine with Python 3"

@app.route("/message")
def message():
    # auth
    db = datastore.Client()
    # add object to db
    entity = datastore.Entity(key=db.key("Message"))
    message = {"message": "hello world"}
    entity.update(message)
    db.put(entity)
    # query from db
    obj = db.get(key=db.key("Message", entity.id))
    return "Message for you: {}".format(obj["message"])
The index() handler works fine, but the message() handler throws this error:
[2019-02-03 20:00:46,246] ERROR in app: Exception on /message [GET]
Traceback (most recent call last):
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/app/main.py", line 16, in message
db = datastore.Client()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 210, in __init__
project=project, credentials=credentials, _http=_http
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/client.py", line 223, in __init__
_ClientProjectMixin.__init__(self, project=project)
INFO 2019-02-03 20:00:46,260 module.py:861] default: "GET /message HTTP/1.1" 500 291
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/client.py", line 175, in __init__
project = self._determine_default(project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 228, in _determine_default
return _determine_default_project(project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 75, in _determine_default_project
project = _base_default_project(project=project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/_helpers.py", line 186, in _determine_default_project
_, project = google.auth.default()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/auth/_default.py", line 306, in default
raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
I checked the website in the error log and tried the JSON auth file (GOOGLE_APPLICATION_CREDENTIALS), but the result was that my app then connected to the production Datastore on Google Cloud instead of the local Datastore Emulator.
Any idea how to resolve this?
I managed to solve this problem by setting the environment variables directly in the Python code (in this case in main.py) and using the mock library:
import os
import mock
from flask import Flask, render_template, request
from google.cloud import datastore
import google.auth.credentials

app = Flask(__name__)

if os.getenv('GAE_ENV', '').startswith('standard'):
    # production
    db = datastore.Client()
else:
    # localhost
    os.environ["DATASTORE_DATASET"] = "test"
    os.environ["DATASTORE_EMULATOR_HOST"] = "localhost:8001"
    os.environ["DATASTORE_EMULATOR_HOST_PATH"] = "localhost:8001/datastore"
    os.environ["DATASTORE_HOST"] = "http://localhost:8001"
    os.environ["DATASTORE_PROJECT_ID"] = "test"

    credentials = mock.Mock(spec=google.auth.credentials.Credentials)
    db = datastore.Client(project="test", credentials=credentials)
The Datastore Emulator is then run like this:
gcloud beta emulators datastore start --no-legacy --data-dir=. --project test --host-port "localhost:8001"
Requirements needed:
Flask
google-cloud-datastore
mock
google-auth
GitHub example here: https://github.com/smartninja/gae-2nd-gen-examples/tree/master/simple-app-datastore
The fact that credentials are required indicates you're reaching the actual Datastore, not the datastore emulator (which neither needs nor requests credentials).
To reach the emulator the client applications (that support it) need to figure out where the emulator is listening and, for that, you need to set the DATASTORE_EMULATOR_HOST environment variable for them. From Setting environment variables:
After you start the emulator, you need to set environment variables so that your application connects to the emulator instead of the production Datastore mode environment. Set these environment variables on the same machine that you use to run your application.
You need to set the environment variables each time you start the emulator. The environment variables depend on dynamically assigned port numbers that could change when you restart the emulator.
See the rest of that section on details about setting the environment and maybe peek at Is it possible to start two dev_appserver.py connecting to the same google cloud datastore emulator?
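For completeness, a minimal sketch of pointing the Python client at the emulator via that variable (port 8001 matches the emulator command shown above; depending on the client library version you may still need the mock credentials from the first answer):
import os

# Must be set before the client is created; the port must match what the
# emulator actually reports when it starts.
os.environ["DATASTORE_EMULATOR_HOST"] = "localhost:8001"
os.environ["DATASTORE_PROJECT_ID"] = "test"

from google.cloud import datastore
db = datastore.Client(project="test")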

Mercurial largefiles not working on Windows Server 2008

I'm trying to get the largefiles extension working on a mercurial server under Windows Server 2008 / IIS 7.5 with the hgweb.wsgi script.
When I clone a repo with largefiles locally (but using https://domain/, not a file system path) everything gets cloned fine, but when I try it on a different machine I get abort: remotestore: largefile XXXXX is missing
Here's the verbose output:
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 177 changes to 177 files
calling hook changegroup.lfiles: <function checkrequireslfiles at 0x0000000002E00358>
updating to branch default
resolving manifests
getting .hglf/path/to.file
...
177 files updated, 0 files merged, 0 files removed, 0 files unresolved
getting changed largefiles
getting path/to.file:c0c81df934cd72ca980dd156984fa15987e3881d
abort: remotestore: largefile c0c81df934cd72ca980dd156984fa15987e3881d is missing
Both machines have the extension working. I've tried disabling the firewall but that didn't help. Do I have to do anything to set up the extension besides adding it to mercurial.ini?
Edit: If I delete the files from the server's AppData\Local\largefiles\ directory, I get the same error when cloning on the server, unless I use a filesystem path to clone, in which case the files are added back to AppData\Local\largefiles\
Edit 2: Here's the debug output and traceback:
177 files updated, 0 files merged, 0 files removed, 0 files unresolved
getting changed largefiles
using http://domain
sending capabilities command
getting largefiles: 0/75 lfile (0.00%)
getting path/to.file:64f2c341fb3b1adc7caec0dc9c51a97e51ca6034
sending statlfile command
Traceback (most recent call last):
File "mercurial\dispatch.pyo", line 87, in _runcatch
File "mercurial\dispatch.pyo", line 685, in _dispatch
File "mercurial\dispatch.pyo", line 467, in runcommand
File "mercurial\dispatch.pyo", line 775, in _runcommand
File "mercurial\dispatch.pyo", line 746, in checkargs
File "mercurial\dispatch.pyo", line 682, in <lambda>
File "mercurial\util.pyo", line 463, in check
File "mercurial\commands.pyo", line 1167, in clone
File "mercurial\hg.pyo", line 400, in clone
File "mercurial\extensions.pyo", line 184, in wrap
File "hgext\largefiles\overrides.pyo", line 629, in hgupdate
File "hgext\largefiles\lfcommands.pyo", line 416, in updatelfiles
File "hgext\largefiles\lfcommands.pyo", line 398, in cachelfiles
File "hgext\largefiles\basestore.pyo", line 80, in get
File "hgext\largefiles\remotestore.pyo", line 56, in _getfile
Abort: remotestore: largefile 64f2c341fb3b1adc7caec0dc9c51a97e51ca6034 is missing
The _getfile function throws an exception because the statlfile command returns that the file wasn't found.
I've never used python myself, so I don't know what I'm doing while trying to debug this :D
AFAIK the statlfile command gets executed on the server so I can't debug it from my local machine. I've tried running python -m win32traceutil on the server, but it doesn't show anything. I also tried setting accesslog and errorlog in the server's mercurial config file, but it doesn't generate them.
I run hg through the hgweb.wsgi script, and I have no idea if/how I can get into the python debugger using that, but if I could get the debugger running on the server I could narrow down the problem...
Finally figured it out: the extension tries to write temporary files to %windir%\System32\config\systemprofile\AppData\Local, which was causing permission errors. The call was wrapped in a try/except block that ended up returning the "file not found" error.
I'm just posting this for anyone else coming into the thread from a search.
There's currently an issue using the largefiles extension in the mercurial python module when hosted via IIS. See this post if you're encountering issues pushing large changesets (or large files) to IIS via TortoiseHg.
The problem ultimately turns out to be a bug in SSL processing introduced in Python 2.7.3 (probably explaining why there are so many unresolved posts of people looking for problems with Mercurial). Rolling back to Python 2.7.2 let me get a little further ahead (blocked at 30 MB pushes instead of 15 MB), but to properly solve the problem I had to install the IISCrypto utility to completely disable transfers over SSLv2.
