Localhost: how to get credentials to connect GAE Python 3 app and Datastore Emulator? - python-3.x

I'd like to use the new Datastore Emulator together with a GAE Flask app on localhost. I want to run it in the Docker environment, but the error I get (DefaultCredentialsError) happens with or without Docker.
My Flask file looks like this (see the whole repository here on GitHub):
main.py:
from flask import Flask
from google.cloud import datastore
app = Flask(__name__)
#app.route("/")
def index():
return "App Engine with Python 3"
#app.route("/message")
def message():
# auth
db = datastore.Client()
# add object to db
entity = datastore.Entity(key=db.key("Message"))
message = {"message": "hello world"}
entity.update(message)
db.put(entity)
# query from db
obj = db.get(key=db.key("Message", entity.id))
return "Message for you: {}".format(obj["message"])
The index() handler works fine, but the message() handler throws this error:
[2019-02-03 20:00:46,246] ERROR in app: Exception on /message [GET]
Traceback (most recent call last):
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/app/main.py", line 16, in message
db = datastore.Client()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 210, in __init__
project=project, credentials=credentials, _http=_http
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/client.py", line 223, in __init__
_ClientProjectMixin.__init__(self, project=project)
INFO 2019-02-03 20:00:46,260 module.py:861] default: "GET /message HTTP/1.1" 500 291
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/client.py", line 175, in __init__
project = self._determine_default(project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 228, in _determine_default
return _determine_default_project(project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 75, in _determine_default_project
project = _base_default_project(project=project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/_helpers.py", line 186, in _determine_default_project
_, project = google.auth.default()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/auth/_default.py", line 306, in default
raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
I checked the website in the error log and tried the JSON auth file (GOOGLE_APPLICATION_CREDENTIALS), but the result was that my app then connected with a production Datastore on Google Cloud, instead of the local Datastore Emulator.
Any idea how to resolve this?

I managed to solve this problem by adding env vars directly into the Python code (in this case in main.py) and using the Mock library:
import os
import mock
from flask import Flask, render_template, request
from google.cloud import datastore
import google.auth.credentials
app = Flask(__name__)
if os.getenv('GAE_ENV', '').startswith('standard'):
# production
db = datastore.Client()
else:
# localhost
os.environ["DATASTORE_DATASET"] = "test"
os.environ["DATASTORE_EMULATOR_HOST"] = "localhost:8001"
os.environ["DATASTORE_EMULATOR_HOST_PATH"] = "localhost:8001/datastore"
os.environ["DATASTORE_HOST"] = "http://localhost:8001"
os.environ["DATASTORE_PROJECT_ID"] = "test"
credentials = mock.Mock(spec=google.auth.credentials.Credentials)
db = datastore.Client(project="test", credentials=credentials)
The Datastore Emulator is then run like this:
gcloud beta emulators datastore start --no-legacy --data-dir=. --project test --host-port "localhost:8001"
Requirements needed:
Flask
google-cloud-datastore
mock
google-auth
GitHub example here: https://github.com/smartninja/gae-2nd-gen-examples/tree/master/simple-app-datastore

The fact that credentials are required indicates you're reaching to the actual Datastore, not to the datastore emulator (which neither needs nor requests credentials).
To reach the emulator the client applications (that support it) need to figure out where the emulator is listening and, for that, you need to set the DATASTORE_EMULATOR_HOST environment variable for them. From Setting environment variables:
After you start the emulator, you need to set environment variables so
that your application connects to the emulator instead of the
production Datastore mode environment. Set these environment variables
on the same machine that you use to run your application.
You need to set the environment variables each time you start the
emulator. The environment variables depend on dynamically assigned
port numbers that could change when you restart the emulator.
See the rest of that section on details about setting the environment and maybe peek at Is it possible to start two dev_appserver.py connecting to the same google cloud datastore emulator?

Related

403 permission error when executing from command line client on Bigquery

I have set-up gcloud in my local system. I am using Python 3.7 to insert records in big-query dataset situated in projectA. So I try it from command line client with the project set to projectA. The first command I give is to get authenticated
gcloud auth login
Then I use Python 3 and get into Python mode, and I give the following commands:
from googleapiclient.discovery import build
from google.cloud import bigquery
import json
body={json input} //pass the json string here
bigquery = build('bigquery', 'v2', cache_discovery=False)
bigquery.tabledata().insertAll(projectId="projectA",datasetId="compute_reports",tableId="compute_snapshot",body=body).execute()
I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
return wrapped(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/googleapiclient/http.py", line 915, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://bigquery.googleapis.com/bigquery/v2/projects/projectA/datasets/compute_reports/tables/compute_snapshot/insertAll?alt=json returned "Access Denied: Table projectA:compute_reports.compute_snapshot: User does not have bigquery.tables.updateData permission for table projectA:compute_reports.compute_snapshot."
I am executing it as a user with role/Owner and BigQueryDataOwner permissions for the project and also added DataEditor to the dataset also, which has these permissions including:
bigquery.tables.update
bigquery.datasets.update
Still I am getting this error.
Why with my credentials am I still not able to execute insert in the big-query?
The error lies in the permissions, so the service account which was used by the python run-time, which is the default service account as set in the bash profile did not have the Bigquery dataeditor access for projectA. Once I gave the access it started working

Roles for queue.yaml deployment with service account

I'm trying to deploy our app.yaml and queue.yaml using the following command:
gcloud --verbosity=debug --project PROJECT_ID app deploy app.yaml queue.yaml
I created a new service account with the roles
App Engine Deployer
App Engine Service Admin
Cloud Build Service Account
for deploying the app.yaml, which works by itself. When trying to deploy the queue.yaml, I get the following error:
DEBUG: Running [gcloud.app.deploy] with arguments: [--project: "PROJECT_ID", --verbosity: "debug", DEPLOYABLES:1: "[u'queue.yaml']"]
DEBUG: Loading runtimes experiment config from [gs://runtime-builders/experiments.yaml]
INFO: Reading [<googlecloudsdk.api_lib.storage.storage_util.ObjectReference object at 0x7fcc7dba0dd0>]
DEBUG: API endpoint: [https://appengine.googleapis.com/], API version: [v1]
Configurations to update:
descriptor: [/home/dominic/workspace/PROJECT/api/queue.yaml]
type: [task queues]
target project: [PROJECT_ID]
DEBUG: (gcloud.app.deploy) PERMISSION_DENIED: The caller does not have permission
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py", line 983, in Execute
resources = calliope_command.Run(cli=self, args=args)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py", line 807, in Run
resources = command_instance.Run(args)
File "/usr/lib/google-cloud-sdk/lib/surface/app/deploy.py", line 117, in Run
default_strategy=flex_image_build_option_default))
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/command_lib/app/deploy_util.py", line 606, in RunDeploy
app, project, services, configs, version_id, deploy_options.promote)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/command_lib/app/output_helpers.py", line 111, in DisplayProposedDeployment
DisplayProposedConfigDeployments(project, configs)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/command_lib/app/output_helpers.py", line 134, in DisplayProposedConfigDeployments
project, 'cloudtasks.googleapis.com')
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/services/enable_api.py", line 43, in IsServiceEnabled
service = serviceusage.GetService(project_id, service_name)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/services/serviceusage.py", line 168, in GetService
exceptions.ReraiseError(e, exceptions.GetServicePermissionDeniedException)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/services/exceptions.py", line 96, in ReraiseError
core_exceptions.reraise(klass(api_lib_exceptions.HttpException(err)))
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/core/exceptions.py", line 146, in reraise
six.reraise(type(exc_value), exc_value, tb)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/services/serviceusage.py", line 165, in GetService
return client.services.Get(request)
File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/third_party/apis/serviceusage/v1/serviceusage_v1_client.py", line 297, in Get
config, request, global_params=global_params)
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/base_api.py", line 731, in _RunMethod
return self.ProcessHttpResponse(method_config, http_response, request)
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/base_api.py", line 737, in ProcessHttpResponse
self.__ProcessHttpResponse(method_config, http_response, request))
File "/usr/bin/../lib/google-cloud-sdk/lib/third_party/apitools/base/py/base_api.py", line 604, in __ProcessHttpResponse
http_response, method_config=method_config, request=request)
GetServicePermissionDeniedException: PERMISSION_DENIED: The caller does not have permission
ERROR: (gcloud.app.deploy) PERMISSION_DENIED: The caller does not have permission
I've also tried the following roles:
Cloud Tasks Admin
Cloud Tasks Queue Admin
Cloud Tasks Service Agent
I'm using the Project Editor role for now, which works but I would like to only permit the roles which are actually required.
In addition of Cloud Tasks Queue Admin role, you have to add Service Account User to allow the service account of Cloud Task to generate token on behalf the service account.
Was banging my head against the wall for awhile with this myself, it seems "intuitively" you also need "serviceusage.services.list" perms, so Service Usage Viewer role
found via this issue
https://issuetracker.google.com/issues/137078982

Registering and downloading a fastText .bin model fails with Azure Machine Learning Service

I have a simple RegisterModel.py script that uses the Azure ML Service SDK to register a fastText .bin model. This completes successfully and I can see the model in the Azure Portal UI (I cannot see what model files are in it). I then want to download the model (DownloadModel.py) and use it (for testing purposes), however it throws an error on the model.download method (tarfile.ReadError: file could not be opened successfully) and makes a 0 byte rjtestmodel8.tar.gz file.
I then use the Azure Portal and Add Model and select the same bin model file and it uploads fine. Downloading it with the download.py script below works fine, so I am assuming something is not correct with the Register script.
Here are the 2 scripts and the stacktrace - let me know if you can see anything wrong:
RegisterModel.py
import azureml.core
from azureml.core import Workspace, Model
ws = Workspace.from_config()
model = Model.register(workspace=ws,
model_name='rjSDKmodel10',
model_path='riskModel.bin')
DownloadModel.py
# Works when downloading the UI Uploaded .bin file, but not the SDK registered .bin file
import os
import azureml.core
from azureml.core import Workspace, Model
ws = Workspace.from_config()
model = Model(workspace=ws, name='rjSDKmodel10')
model.download(target_dir=os.getcwd(), exist_ok=True)
Stacktrace
Traceback (most recent call last):
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\ptvsd_launcher.py", line 43, in <module>
main(ptvsdArgs)
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\lib\python\ptvsd\__main__.py", line 432, in main
run()
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\lib\python\ptvsd\__main__.py", line 316, in run_file
runpy.run_path(target, run_name='__main__')
File "...\.conda\envs\DoC\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "...\.conda\envs\DoC\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "...\.conda\envs\DoC\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "...\\DownloadModel.py", line 21, in <module>
model.download(target_dir=os.getcwd(), exist_ok=True)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 712, in download
file_paths = self._download_model_files(sas_to_relative_download_path, target_dir, exist_ok)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 658, in _download_model_files
file_paths = self._handle_packed_model_file(tar_path, target_dir, exist_ok)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 670, in _handle_packed_model_file
with tarfile.open(tar_path) as tar:
File "...\.conda\envs\DoC\lib\tarfile.py", line 1578, in open
raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
Environment
riskModel.bin is 6 megs
AMLS 1.0.60
Python 3.7
Working locally with Visual Code
The Azure Machine Learning service SDK has a bug with how it interacts with Azure Storage, which causes it to upload corrupted files if it has to retry uploading.
A couple workarounds:
The bug was introduced in 1.0.60 release. If you downgrade to AzureML-SDK 1.0.55, the code should fail when there are issue uploading instead of silently corrupting data.
It's possible that the retry is being triggered by the low timeout values that the AzureML-SDK defaults to. You could investigate changing the timeout in site-packages/azureml/_restclient/artifacts_client.py
This bug should be fixed in the next release of the AzureML-SDK.

How to use paralledots api in app engine?

I want to check text similarity using paralleldots api in app engine, but when setting the api key in app engine using.
paralleldots.set_api_key("XXXXXXXXXXXXXXXXXXXXXXXXXXX")
App engine giving Error:
with open('settings.cfg', 'w') as configfile:
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/stubs.py", line 278, in __init__
raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'settings.cfg'
INFO 2019-03-17 10:43:59,852 module.py:835] default: "GET / HTTP/1.1" 500 -
INFO 2019-03-17 10:46:47,548 client.py:777] Refreshing access_token
ERROR 2019-03-17 10:46:50,931 wsgi.py:263]
Traceback (most recent call last):
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/ulti72/Desktop/koda/main.py", line 26, in <module>
paralleldots.set_api_key("7PR8iwo42DGFB8qpLjpUGJPqEQHU322lqTDkgaMrX7I")
File "/home/ulti72/Desktop/koda/lib/paralleldots/config.py", line 13, in set_api_key
with open('settings.cfg', 'w') as configfile:
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/stubs.py", line 278, in __init__
raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'settings.cfg'
The paralleldots api, seems to want to save a settings.cfg file to the local filesystem in response to that call. Which is not allowed in the 1st generation standard environment and only allowed for files in the /tmp filesystem in the 2nd generation.
The local development server was designed for the 1st generation standard env and enforces the restriction with that error. It has limited support for the 2nd generation env, see Python 3.7 Local Development Server Options for new app engine apps.
Things to try:
check if specifying the location of the settings.cfg is supported and if so make it reside under /tmp. Maybe the local development server allows that or you switch to some other local development method than the development server.
check if saving the settings using an already open file handler is supported and, if so, use one obtained from Cloud Storage client library, something along these lines: How to zip or tar a static folder without writing anything to the filesystem in python?
check if set_api_key() supports some other method of persisting the API key than saving the settings to a file
check if it's possible to specify the API key for every subsequent call so you don't have to persist it using set_api_key() (maybe using a common wrapper function for convenience)

ws4py under cherrypy under WSGI: exception AttributeError: 'mod_wsgi.Input' object has no attribute 'rfile'

I am trying to implement websockets on a openshift.com server (which should support them).
openshift.com provides me a WSGI, so I embed my cherrypy to it, so that my wsgi.pyscript define an application object.
Also, cherrypy has a websocket tool, as defined by ws4py.
This is a minimal cherrypy application that works under WSGI in OpenShift, and that should use websockets too!
import cherrypy
from ws4py.server.cherrypyserver import WebSocketPlugin, WebSocketTool
from ws4py.websocket import EchoWebSocket
import atexit
import logging
# see http://tools.cherrypy.org/wiki/ModWSGI
cherrypy.config.update({'environment': 'embedded'})
if cherrypy.__version__.startswith('3.0') and cherrypy.engine.state == 0:
cherrypy.engine.start(blocking=False)
atexit.register(cherrypy.engine.stop)
class Root(object):
def index(self): return 'I work!'
def ws(self): print('THIS IS NEVER PRINTED :(')
index.exposed=True
ws.exposed=True
# registering the websocket
conf={'/ws':{'tools.websocket.on': True,'tools.websocket.handler_cls': EchoWebSocket}}
WebSocketPlugin(cherrypy.engine).subscribe()
cherrypy.tools.websocket = WebSocketTool()
#show stacktraces in console (for some reason this is not default in cherrypy+WSGI)
logger = logging.getLogger()
logger.setLevel(logging.INFO)
stream = logging.StreamHandler()
stream.setLevel(logging.INFO)
logger.addHandler(stream)
application = cherrypy.Application(Root(), script_name='', config=conf)
Everything work wonderfully, except when I create a websocket ( connecting to ws://myserver:8000/ws ), this is the stacktrace I get:
cherrypy/_cplogging.py, 214, HTTP Traceback (most recent call last):
File "cherrypy/_cprequest.py", line 661, in respond
self.hooks.run('before_request_body')
File "cherrypy/_cprequest.py", line 114, in run
raise exc
File "cherrypy/_cprequest.py", line 104, in run
hook()
File "cherrypy/_cprequest.py", line 63, in __call__
return self.callback(**self.kwargs)
File "ws4py/server/cherrypyserver.py", line 200, in upgrade
ws_conn = get_connection(request.rfile.rfile)
AttributeError: 'mod_wsgi.Input' object has no attribute 'rfile'
(I manually deleted the absolute path from the filenames)
PS: I use python3.3, cherrypy==3.5.0, ws4py==0.3.4.
It is not clear to me:
if this is a lack of compatibility between cherrypy and ws4py when in an WSGI environment.
if it is a problem of ws4py when in a WSGI environment
if it is because Openshift websockets have a different port than the http one
PPS: this is a complete OpenShift project, that you can run and try this yourself: https://github.com/spocchio/wsgi-cherrypy-ws4py
I don't think it is possible at all. WSGI is a synchronous protocol (1, 2), WebSocket protocol is asynchronous. Wiki states that for a Python application interface OpenShift uses WSGI (3). Alas.
However I've recently played with ws4py in pub/sub scenario and it works really well on top of CherryPy standard HTTP-server deployment. So it shouldn't be a problem on a generic virtual server with no application interface constraints.

Resources