403 permission error when executing from command line client on Bigquery - python-3.x

I have set up gcloud on my local system. I am using Python 3.7 to insert records into a BigQuery dataset located in projectA, so I try it from the command-line client with the project set to projectA. The first command I run is to authenticate:
gcloud auth login
Then I start the Python 3 interpreter and run the following commands:
from googleapiclient.discovery import build
from google.cloud import bigquery
import json
body={json input}  # pass the JSON string here
bigquery = build('bigquery', 'v2', cache_discovery=False)
bigquery.tabledata().insertAll(projectId="projectA",datasetId="compute_reports",tableId="compute_snapshot",body=body).execute()
I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
return wrapped(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/googleapiclient/http.py", line 915, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://bigquery.googleapis.com/bigquery/v2/projects/projectA/datasets/compute_reports/tables/compute_snapshot/insertAll?alt=json returned "Access Denied: Table projectA:compute_reports.compute_snapshot: User does not have bigquery.tables.updateData permission for table projectA:compute_reports.compute_snapshot."
I am executing it as a user with the Owner and BigQuery Data Owner roles on the project, and I also added the Data Editor role on the dataset, which includes these permissions among others:
bigquery.tables.update
bigquery.datasets.update
I still get this error.
Why can't I insert into BigQuery with my credentials?

The error lies in the permissions: the Python runtime was using the default service account set in the bash profile, and that service account did not have BigQuery Data Editor access for projectA. Once I granted the access, it started working.
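To see which account the Python runtime would pick up, a minimal sketch follows. It assumes that "the default service account as set in the bash profile" means the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing at a service account key file; the function name describe_default_credentials is made up for this illustration.

```python
import json
import os


def describe_default_credentials(environ=None):
    """Report which key file GOOGLE_APPLICATION_CREDENTIALS points at.

    Assumption: the variable, if set, names a JSON key file that carries
    "type" and "client_email" fields, as service account keys do.
    """
    environ = os.environ if environ is None else environ
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    with open(key_path) as key_file:
        info = json.load(key_file)
    account = info.get("client_email", "unknown account")
    kind = info.get("type", "unknown type")
    return "{} ({})".format(account, kind)


if __name__ == "__main__":
    print(describe_default_credentials())
```

If this prints a service account rather than your own user, that account is the one that needs the dataset permissions.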

Related

UndetectedChromedriver Extension Issue

I can't use extensions with the UndetectedChromedriver PyPI package (Python). With plain Selenium it works, but not with this package. I tried to install extensions directly from the Chrome Web Store, but the Web Store confirmation prompt is not an alert that Selenium can handle; it is a window event, so it has to be handled with AutoIt, PyAutoGUI, or similar.
The only thing that works is loading profiles. That works for my multi-process windows, but I need to create hundreds of windows and then delete them, and I can't clone profiles, because UndetectedChromedriver then doesn't work, so I have to create them manually.
Finally, I tried the Google Chrome Enterprise Bundle and used the extensions policy to force-install the extension for all profiles. That works, but with the policy enabled, Selenium doesn't work properly.
The error traceback log is:
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\threading.py", line 950, in _bootstrap_inner
self.run()
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\threading.py", line 888, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\andre\OneDrive\Documentos\(A1)_Inicio\(A2)_CyberEspacio\LAB\(A1)_Programador123\(A1)_Programming_(Section)\VSCode Snippets\python\selenium\app.py", line 72, in test
seleniumCaptchaSolver.reCaptchaServiceLogin(apiKey='MYAPIKEY', solverType = SeleniumCaptchaSolverType().Capmonster)
File "C:\Users\andre\OneDrive\Documentos\(A1)_Inicio\(A2)_CyberEspacio\LAB\(A1)_Programador123\(A1)_Programming_(Section)\VSCode Snippets\python\selenium\modules\seleniumCaptchaSolver.py", line 103, in reCaptchaServiceLogin
self.__driver.get('chrome-extension://pabjfbciaedomjjfelfafejkppknjleh/popup.html')
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\undetected_chromedriver\__init__.py", line 535, in get
return super().get(url)
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 447, in get
self.execute(Command.GET, {'url': url})
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 435, in execute
self.error_handler.check_response(response)
File "C:\Users\andre\anaconda3\envs\selenium-env\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot determine loading status
from disconnected: received Inspector.detached event
(Session info: chrome=103.0.5060.134)
This happens only when chrome-extension://pabjfbciaedomjjfelfafejkppknjleh/popup.html is opened to log in (send the API key). Without the policy I can log in, but when the policy is activated I can't because of this issue.
Does anyone know how to fix this, or how to properly use extensions in UndetectedChromedriver?
Note: This error only happens if I load the chrome-extension://pabjfbciaedomjjfelfafejkppknjleh/popup.html link; other links work.
I found this solution:
import undetected_chromedriver as uc
import os
working_dir = os.getcwd()
# I'm using a proxy extension
proxy_plugin = f'{working_dir}/proxy_plugin'
options = uc.ChromeOptions()
options.add_argument(f'--load-extension={proxy_plugin}')
# proxy_plugin is the path to the extension folder. I tried loading a .zip file
# and that didn't work; you could try loading a .crx file instead.
# I also set extensions.ui.developer_mode
options.add_experimental_option('prefs', { 'extensions.ui.developer_mode': True })
driver = uc.Chrome(options = options)
extensions.ui.developer_mode
I have also seen these pages:
Load unpacked Chrome extension programmatically
Installing extension into V2
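Since the solution above depends on pointing --load-extension at an unpacked extension folder, a small helper can catch a wrong path before Chrome starts. This is only a sketch: load_extension_argument is a made-up name, and it relies on two things about Chrome itself, namely that an unpacked extension has a manifest.json at its root and that --load-extension accepts a comma-separated list of folders.

```python
import os


def load_extension_argument(*extension_dirs):
    """Build a --load-extension argument from unpacked extension folders."""
    if not extension_dirs:
        raise ValueError("at least one extension folder is required")
    for directory in extension_dirs:
        manifest = os.path.join(directory, "manifest.json")
        # An unpacked extension must have manifest.json in its root folder.
        if not os.path.isfile(manifest):
            raise FileNotFoundError("no manifest.json in {!r}".format(directory))
    absolute = [os.path.abspath(directory) for directory in extension_dirs]
    return "--load-extension=" + ",".join(absolute)
```

The returned string would then be passed to options.add_argument(...) in place of the hand-built f-string.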

FileNotFoundError: [WinError 2] The system cannot find the file specified while loading model from s3

I recently saved a model to S3 using joblib.
model_doc is the model object.
import subprocess
import joblib
def save_d2v_to_s3_current_doc2vec_model(model, fname):
    model_name = fname
    # dump the model to a local file, then copy that file to S3
    joblib.dump(model, model_name)
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(model_name, path).split()
    print('saving...' + model_name)
    subprocess.call(command)
save_d2v_to_s3_current_doc2vec_model(model_doc, "doc2vec_model")
That was successful, but when I then try to load the model back from S3, it gives me an error:
def load_d2v(fname):
    model_name = fname
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    # copy the file from S3 to the working directory, then load it
    command = "aws s3 cp {} {}".format(path, model_name).split()
    print('loading...' + model_name)
    subprocess.call(command)
    model = joblib.load(model_name)
    return model
model = load_d2v("doc2vec_model")
This is the error I get:
loading...doc2vec_model
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in load_d2v
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
restore_signals, start_new_session)
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
I don't understand why it says the file was not found: this is the same path I used to save the model, but now I'm unable to get the model back from S3. Please help!
I suggest that, rather than generic print() lines showing your intent, you print the actual command you've composed, to verify that it makes sense.
If it does, then also try that exact same aws ... command directly at the command prompt where you had been launching your Python code, to make sure it runs that way. If it doesn't, you may get a clearer error.
Note that the error you're getting doesn't look like it's coming from the aws command or from the S3 service, which might talk about 'paths' or 'objects'. Rather, it's from the Python subprocess system and its Popen call. I think those are reached via your call to subprocess.call(), but for some reason your line of code isn't shown. (How are you running the block of code with load_d2v()?)
That suggests the file that's not found might be the aws command itself. Are you sure it's installed and runnable from the exact working directory/environment that your Python is running in and invoking via subprocess.call()?
(BTW, if my previous answer got you over your sklearn.externals.joblib problem, it'd be good for you to mark the answer as accepted, to save other potential answerers from thinking that's still an unsolved question that's blocking you.)
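The checks suggested above could be sketched like this: print the exact command, and confirm the executable can be found before handing it to subprocess. The helper names (s3_copy_command, resolve_command, run_checked) are invented for this example; shutil.which is the standard-library lookup, and on Windows it also honors PATHEXT, so it can find a launcher such as aws.cmd if that is how the CLI happens to be installed.

```python
import shutil
import subprocess


def s3_copy_command(source, destination):
    """Compose the same 'aws s3 cp' command as the question, as a list."""
    return ["aws", "s3", "cp", source, destination]


def resolve_command(command):
    """Replace the executable name with its full path, or fail clearly."""
    executable = shutil.which(command[0])
    if executable is None:
        raise FileNotFoundError(
            "{!r} was not found on PATH for this Python process".format(command[0])
        )
    return [executable] + list(command[1:])


def run_checked(command):
    """Print the actual command being run, then run it."""
    resolved = resolve_command(command)
    print("running:", " ".join(resolved))
    return subprocess.call(resolved)
```

If resolve_command raises here, the missing file is the aws executable itself rather than anything on S3.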
If you are confident the model file is there, try adding the file extension of your model file to fname,
e.g. doc2vec_model.h3

How to use paralleldots api in app engine?

I want to check text similarity using the paralleldots API in App Engine, but the problem appears when I set the API key with:
paralleldots.set_api_key("XXXXXXXXXXXXXXXXXXXXXXXXXXX")
App Engine gives this error:
with open('settings.cfg', 'w') as configfile:
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/stubs.py", line 278, in __init__
raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'settings.cfg'
INFO 2019-03-17 10:43:59,852 module.py:835] default: "GET / HTTP/1.1" 500 -
INFO 2019-03-17 10:46:47,548 client.py:777] Refreshing access_token
ERROR 2019-03-17 10:46:50,931 wsgi.py:263]
Traceback (most recent call last):
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/ulti72/Desktop/koda/main.py", line 26, in <module>
paralleldots.set_api_key("7PR8iwo42DGFB8qpLjpUGJPqEQHU322lqTDkgaMrX7I")
File "/home/ulti72/Desktop/koda/lib/paralleldots/config.py", line 13, in set_api_key
with open('settings.cfg', 'w') as configfile:
File "/home/ulti72/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/stubs.py", line 278, in __init__
raise IOError(errno.EROFS, 'Read-only file system', filename)
IOError: [Errno 30] Read-only file system: 'settings.cfg'
The paralleldots API seems to want to save a settings.cfg file to the local filesystem in response to that call. That is not allowed in the 1st generation standard environment, and in the 2nd generation it is only allowed for files in the /tmp filesystem.
The local development server was designed for the 1st generation standard env and enforces the restriction with that error. It has limited support for the 2nd generation env, see Python 3.7 Local Development Server Options for new app engine apps.
Things to try:
Check if specifying the location of settings.cfg is supported and, if so, make it reside under /tmp. Maybe the local development server allows that, or you could switch to some local development method other than the development server.
Check if saving the settings using an already open file handle is supported and, if so, use one obtained from the Cloud Storage client library, something along these lines: How to zip or tar a static folder without writing anything to the filesystem in python?
Check if set_api_key() supports some other method of persisting the API key than saving the settings to a file.
Check if it's possible to specify the API key for every subsequent call so you don't have to persist it using set_api_key() (maybe using a common wrapper function for convenience).
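To illustrate the first suggestion, here is a minimal sketch of keeping a settings file under the temporary directory instead of the application directory. It does not use the paralleldots package at all: the "settings" section, the "api_key" option, and both function names are assumptions made for this example, and whether the library can be pointed at such a path is exactly what would need checking.

```python
import configparser
import os
import tempfile


def save_api_key(api_key, directory=None):
    """Write the key to settings.cfg under the temp directory (e.g. /tmp)."""
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, "settings.cfg")
    # interpolation=None so a '%' in the key is stored literally
    config = configparser.ConfigParser(interpolation=None)
    config["settings"] = {"api_key": api_key}  # hypothetical layout
    with open(path, "w") as configfile:
        config.write(configfile)
    return path


def load_api_key(path):
    """Read the key back from the file written by save_api_key()."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)
    return config["settings"]["api_key"]
```

Files under /tmp are not durable across instances, so the key would have to be written again at startup.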

Localhost: how to get credentials to connect GAE Python 3 app and Datastore Emulator?

I'd like to use the new Datastore Emulator together with a GAE Flask app on localhost. I want to run it in the Docker environment, but the error I get (DefaultCredentialsError) happens with or without Docker.
My Flask file looks like this (see the whole repository here on GitHub):
main.py:
from flask import Flask
from google.cloud import datastore
app = Flask(__name__)
@app.route("/")
def index():
    return "App Engine with Python 3"
@app.route("/message")
def message():
    # auth
    db = datastore.Client()
    # add object to db
    entity = datastore.Entity(key=db.key("Message"))
    message = {"message": "hello world"}
    entity.update(message)
    db.put(entity)
    # query from db
    obj = db.get(key=db.key("Message", entity.id))
    return "Message for you: {}".format(obj["message"])
The index() handler works fine, but the message() handler throws this error:
[2019-02-03 20:00:46,246] ERROR in app: Exception on /message [GET]
Traceback (most recent call last):
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/app/main.py", line 16, in message
db = datastore.Client()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 210, in __init__
project=project, credentials=credentials, _http=_http
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/client.py", line 223, in __init__
_ClientProjectMixin.__init__(self, project=project)
INFO 2019-02-03 20:00:46,260 module.py:861] default: "GET /message HTTP/1.1" 500 291
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/client.py", line 175, in __init__
project = self._determine_default(project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 228, in _determine_default
return _determine_default_project(project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 75, in _determine_default_project
project = _base_default_project(project=project)
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/cloud/_helpers.py", line 186, in _determine_default_project
_, project = google.auth.default()
File "/tmp/tmpJcIw2U/lib/python3.5/site-packages/google/auth/_default.py", line 306, in default
raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
I checked the website in the error log and tried the JSON auth file (GOOGLE_APPLICATION_CREDENTIALS), but the result was that my app then connected with a production Datastore on Google Cloud, instead of the local Datastore Emulator.
Any idea how to resolve this?
I managed to solve this problem by adding env vars directly into the Python code (in this case in main.py) and using the Mock library:
import os
import mock
from flask import Flask, render_template, request
from google.cloud import datastore
import google.auth.credentials
app = Flask(__name__)
if os.getenv('GAE_ENV', '').startswith('standard'):
    # production
    db = datastore.Client()
else:
    # localhost
    os.environ["DATASTORE_DATASET"] = "test"
    os.environ["DATASTORE_EMULATOR_HOST"] = "localhost:8001"
    os.environ["DATASTORE_EMULATOR_HOST_PATH"] = "localhost:8001/datastore"
    os.environ["DATASTORE_HOST"] = "http://localhost:8001"
    os.environ["DATASTORE_PROJECT_ID"] = "test"
    credentials = mock.Mock(spec=google.auth.credentials.Credentials)
    db = datastore.Client(project="test", credentials=credentials)
The Datastore Emulator is then run like this:
gcloud beta emulators datastore start --no-legacy --data-dir=. --project test --host-port "localhost:8001"
Requirements needed:
Flask
google-cloud-datastore
mock
google-auth
GitHub example here: https://github.com/smartninja/gae-2nd-gen-examples/tree/master/simple-app-datastore
The fact that credentials are required indicates you're reaching the actual Datastore, not the Datastore emulator (which neither needs nor requests credentials).
To reach the emulator, the client applications (that support it) need to figure out where the emulator is listening, and for that you need to set the DATASTORE_EMULATOR_HOST environment variable for them. From Setting environment variables:
After you start the emulator, you need to set environment variables so
that your application connects to the emulator instead of the
production Datastore mode environment. Set these environment variables
on the same machine that you use to run your application.
You need to set the environment variables each time you start the
emulator. The environment variables depend on dynamically assigned
port numbers that could change when you restart the emulator.
See the rest of that section for details about setting the environment, and maybe peek at Is it possible to start two dev_appserver.py connecting to the same google cloud datastore emulator?
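A quick way to confirm which backend a process would talk to is to inspect DATASTORE_EMULATOR_HOST before creating the client. This is a small sketch using only the standard library; datastore_target is a made-up helper name, and it only reports what the environment says rather than contacting anything.

```python
import os


def datastore_target(environ=None):
    """Say whether the environment points at the emulator or at production."""
    environ = os.environ if environ is None else environ
    host = environ.get("DATASTORE_EMULATOR_HOST")
    if host:
        return "emulator at {}".format(host)
    return "production Datastore"


if __name__ == "__main__":
    print("This process would connect to:", datastore_target())
```

Running it in the same shell (or container) as the app shows whether the variable actually reached that process.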

How to avoid this ssl.SSLError, or simply ignore?

The program should run several HTTPS GET requests with one aiohttp.ClientSession, as the documentation suggests. It is intended to run a Telegram bot.
I was not able to catch the exception with try ... except, so the program hangs when exiting. During extended sessions the error is printed in the command window (but not in the error log).
SSL error in data received
protocol: <asyncio.sslproto.SSLProtocol object at 0x0000016A581E4400>
transport: <_SelectorSocketTransport fd=644 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
File "C:\Users\annet\Anaconda3\lib\asyncio\sslproto.py", line 526, in data_received
ssldata, appdata = self._sslpipe.feed_ssldata(data)
File "C:\Users\annet\Anaconda3\lib\asyncio\sslproto.py", line 207, in feed_ssldata
self._sslobj.unwrap()
File "C:\Users\annet\Anaconda3\lib\ssl.py", line 767, in unwrap
return self._sslobj.shutdown()
ssl.SSLError: [SSL: KRB5_S_INIT] application data after close notify (_ssl.c:2592)
^C
As the error information is very unspecific, I could not isolate the source or produce a short code sample that reproduces the error.
Sample code is on GitHub at https://github.com/fhag/telegram2.git
In order to run the code you will need an API token from telegram of your own bot.
This error showed up the first time when I upgraded to python 3.7.1.
Python is running on Windows 10.
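One possible way to ignore this particular error, offered as a sketch rather than a confirmed fix: as far as I can tell, asyncio reports errors raised inside a protocol callback such as data_received through the event loop's exception handler instead of raising them into the calling coroutine, which would explain why try ... except never sees it. Assuming that is what happens here, a custom handler can drop this one message and pass everything else on. The function names below are made up for the example.

```python
import asyncio
import ssl

CLOSE_NOTIFY_TEXT = "application data after close notify"


def is_close_notify_error(context):
    """True if the loop error context carries the SSL error from the question."""
    exc = context.get("exception")
    return isinstance(exc, ssl.SSLError) and CLOSE_NOTIFY_TEXT in str(exc)


def quiet_close_notify_handler(loop, context):
    """Drop the close-notify SSL error; defer everything else to asyncio."""
    if is_close_notify_error(context):
        return
    loop.default_exception_handler(context)


def install_handler(loop):
    """Register the handler on the loop that runs the aiohttp session."""
    loop.set_exception_handler(quiet_close_notify_handler)
```

This only hides the message; it does not address whatever makes the program hang on exit.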

Resources