How can you connect to Google Cloud Datastore from Cloud Run? - python-3.x

I'm planning to deploy a service to Google Cloud Run. It would be a Python Flask app that connects to Datastore (Firestore in Datastore mode) to either write or read a small blob.
The problem is that the docs (Accessing your Database) don't explain how to reach Datastore from within GCP when you're not on GCE or App Engine. Is there a fancy/seamless way to achieve this, or should I provide service account credentials as if it were an external platform?
Thank you in advance for your help and answers.

When your Cloud Run logic executes, it executes with the identity of a GCP Service Account. You can configure which service account it runs as at configuration time. You can create and configure a Service Account that has the correct roles to allow/provide access to your datastore. This means that when your Cloud Run logic executes, it will have the correct authority to perform the desired operations. This story is documented here:
Using per-service identity
If for some reason you don't find this sufficient, an alternative is to fetch the identity and access tokens you need from the compute metadata server and use them explicitly within your Cloud Run logic. This is described here:
Fetching identity and access tokens
Hopefully this covers the fundamentals of what you are looking for. If after reading these areas new questions arise, feel very free to create new questions which are more specific and detailed and we'll follow up there.
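As a hedged sketch of the per-service identity approach (service, image, and project names below are placeholders, not from the question):

```shell
# Create a dedicated service account for the Cloud Run service
gcloud iam service-accounts create datastore-client \
    --display-name "Cloud Run Datastore client"

# Grant it only the Datastore access it needs
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member serviceAccount:datastore-client@MY_PROJECT.iam.gserviceaccount.com \
    --role roles/datastore.user

# Deploy the Cloud Run service with that identity attached
gcloud run deploy my-service \
    --image gcr.io/MY_PROJECT/my-image \
    --service-account datastore-client@MY_PROJECT.iam.gserviceaccount.com
```

With the identity attached, the client libraries pick up credentials automatically; no key file is involved.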

To connect to Cloud Datastore from your Flask app deployed to Cloud Run...
Ensure you've got both services enabled in a project with an active billing account.
Ensure you've got at least both Flask & Datastore packages in your requirements.txt file (w/any desired versioning):
flask
google-cloud-datastore
Integrate Datastore usage into your app... here's some sample usage in my demo main.py (Flask code dropped for simplicity):
from google.cloud import datastore

ds_client = datastore.Client()
KEY_TYPE = 'Record'

def insert(**data):
    entity = datastore.Entity(key=ds_client.key(KEY_TYPE))
    entity.update(**data)  # where data = dict/JSON of key-value pairs
    ds_client.put(entity)

def query(limit):
    return ds_client.query(kind=KEY_TYPE).fetch(limit=limit)
You can have a Dockerfile (minimal one below), but better yet, skip it and let Google (Cloud Buildpacks) build your container for you so you don't have extra stuff like this to worry about.
FROM python:3-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "main.py"]
Come up with an app/service name SVC_NAME then build & deploy your prototype container with gcloud beta run deploy SVC_NAME --source . --platform managed --allow-unauthenticated. (Think docker build followed by docker push and then docker run, all from 1 command!) If you have a Dockerfile, Buildpacks will use it, but if not, it'll introspect your code and dependencies to build the most efficient container it can.
That's it. Some of you will get distracted by service accounts and making a public/private key-pair, both of which are fine. However to keep things simple, especially during prototyping, just use the default service account you get for free on Cloud Run. The snippet above works without any service account or IAM code present.
BTW, the above is for a prototype to get you going. If you were deploying to production, you wouldn't use the Flask dev server. You'd probably add gunicorn to your requirements.txt and Dockerfile, and you'd probably create a unique service account key w/specific IAM permissions, perhaps adding other requirements like IAP, VPC, and/or a load-balancer.
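As a sketch of that production direction (assuming your Flask app object is named app in main.py, and that gunicorn has been added to requirements.txt):

```Dockerfile
FROM python:3-slim
WORKDIR /app
COPY . .
# requirements.txt would now also list gunicorn
RUN pip install -r requirements.txt
# Bind to the port Cloud Run provides via $PORT instead of the Flask dev server
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app
```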

Related

GOOGLE_APPLICATION_CREDENTIALS and GOOGLE_CLOUD_PROJECT setup in prod

I'd like to know the best practice for storing or referencing GOOGLE_APPLICATION_CREDENTIALS and GOOGLE_CLOUD_PROJECT in prod.
export GOOGLE_APPLICATION_CREDENTIALS="~/Download/key.json"
export GOOGLE_CLOUD_PROJECT=`gcloud config get-value project`
In dev these are likely to be in .zshrc, so when running:
poetry run python3 samples/snippets/quickstart/pub.py $PROJECT hello_topic
to publish a message works fine and we can see messages inside the console portal; however:
poetry run pytest samples/snippets/quickstart/quickstart_test.py
ERROR samples/snippets/quickstart/quickstart_test.py::test_pub - google.api_core.exceptions.PermissionDenied: 403 User not authorized to perform this...
.zshrc already has:
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/Documents/python-pubsub/key.json"
export GOOGLE_CLOUD_PROJECT=`gcloud config get-value project`
source .zshrc
The environment variables are intended for development. In production do not use them.
Each compute service offers the ability to attach a service account. By default, compute services have a Google created service account (aka Default Service Account) attached. Best practices recommend creating a User-managed service account with only the required IAM roles and attaching that to the service.
User-managed service accounts
Compute Engine Service accounts
Best practices for securing service accounts
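The recommendation above might look like this with gcloud (all names are placeholders; grant only the roles your app actually needs, Pub/Sub publisher is just an example):

```shell
# Create a user-managed service account
gcloud iam service-accounts create my-app-sa

# Grant it only the required IAM roles
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member serviceAccount:my-app-sa@MY_PROJECT.iam.gserviceaccount.com \
    --role roles/pubsub.publisher

# Attach it to the compute service, e.g. a GCE instance at creation time
gcloud compute instances create my-vm \
    --service-account my-app-sa@MY_PROJECT.iam.gserviceaccount.com \
    --scopes cloud-platform
```

Code running on that VM then authenticates as my-app-sa automatically, with no GOOGLE_APPLICATION_CREDENTIALS variable or key file.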

How to run a Node app (NextJS) on gcloud from github?

I have followed these steps:
I installed the Google Cloud Build app on GitHub, linked it to Cloud Build and configured it to use a certain (private) repository
I set up a trigger at Cloud Build: Push to any branch
the project has no app instances after deploying (App Engine -> Dashboard)
My cloudbuild.yaml looks like this:
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['app', 'deploy', '--project=project-name', '--version=$SHORT_SHA']
If I try to run the trigger manually, I get this error in Google Cloud:
unable to get credentials for cloud build robot
I have also tried to set IAM roles based on this article, but using @cloudbuild.gserviceaccount.com doesn't seem to be a valid "member" (perhaps I need two projects, one for running and one for building the app?)
How do I fill the gaps / fix the errors mentioned?
It seems the error message indicates that Cloud Build is looking for credentials with the required permissions. From the article you are following, in step #4, don't add the Cloud Build service account manually. Check whether the Cloud Build API is enabled in your project; if it is disabled, enable it. That will automatically create the Cloud Build service account, which looks like this:
[PROJECT_NUMBER]@cloudbuild.gserviceaccount.com
Once the service account is created, go to the Cloud Build > Settings page and enable the required roles for your application.
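If you prefer the CLI over the Settings page, the equivalent grants for an App Engine deploy might look like this (the project ID and number below are placeholders):

```shell
# Allow the Cloud Build service account to deploy to App Engine
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member serviceAccount:123456789@cloudbuild.gserviceaccount.com \
    --role roles/appengine.appAdmin

# Cloud Build also needs to act as the App Engine default service account
gcloud iam service-accounts add-iam-policy-binding \
    MY_PROJECT@appspot.gserviceaccount.com \
    --member serviceAccount:123456789@cloudbuild.gserviceaccount.com \
    --role roles/iam.serviceAccountUser
```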

Authenticating to Google Cloud Firestore from GKE with Workload Identity

I'm trying to write a simple backend that will access my Google Cloud Firestore, it lives in the Google Kubernetes Engine. On my local I'm using the following code to authenticate to Firestore as detailed in the Google Documentation.
if (process.env.NODE_ENV !== 'production') {
    const result = require('dotenv').config()
    // Additional error handling here
}
This loads my .env file, which populates the GOOGLE_APPLICATION_CREDENTIALS environment variable with the path to my google-application-credentials.json, which I got from creating a service account with the "Cloud Datastore User" role.
So, locally, my code runs fine. I can reach my Firestore and do everything I need to. However, the problem arises once I deploy to GKE.
I followed this Google Documentation to set up a Workload Identity for my cluster, I've created a deployment and verified that the pods all are using the correct IAM Service Account by running:
kubectl exec -it POD_NAME -c CONTAINER_NAME -n NAMESPACE sh
> gcloud auth list
I was under the impression from the documentation that authentication would be handled for my service as long as the above held true. I'm really not sure why but my Firestore() instance is behaving as if it does not have the necessary credentials to access the Firestore.
In case it helps below is my declaration and implementation of the instance:
const firestore = new Firestore()

const server = new ApolloServer({
    schema: schema,
    dataSources: () => {
        return {
            userDatasource: new UserDatasource(firestore)
        }
    }
})
UPDATE:
UPDATE:
In a bout of desperation I decided to tear down everything and re-build it. Following every step again, I appear to have either encountered a bug or (more likely) done something mildly wrong the first time. I'm now able to connect to my backend service. However, I'm now getting a different error. Upon sending any request (I'm using GraphQL, but in essence it's any REST call) I get back a 404.
Inspecting the logs yields the following:
'Getting metadata from plugin failed with error: Could not refresh access token: A Not Found error was returned while attempting to retrieve an accesstoken for the Compute Engine built-in service account. This may be because the Compute Engine instance does not have any permission scopes specified: Could not refresh access token: Unsuccessful response status code. Request failed with status code 404'
A cursory search for this issue doesn't seem to return anything related to what I'm trying to accomplish, and so I'm back to square one.
I think your initial assumption was correct! Workload Identity is not functioning properly if you still have to specify scopes. In the Workload article you have linked, scopes are not used.
I've been struggling with the same issue and have identified three ways to get authenticated credentials in the pod.
1. Workload Identity (basically the Workload Identity article above with some deployment details added)
This method is preferred because it allows each pod deployment in a cluster to be granted only the permissions it needs.
Create cluster (note: no scopes or service account defined)
gcloud beta container clusters create {cluster-name} \
--release-channel regular \
--identity-namespace {projectID}.svc.id.goog
Then create the k8sServiceAccount, assign roles, and annotate.
gcloud container clusters get-credentials {cluster-name}
kubectl create serviceaccount --namespace default {k8sServiceAccount}
gcloud iam service-accounts add-iam-policy-binding \
--member serviceAccount:{projectID}.svc.id.goog[default/{k8sServiceAccount}] \
--role roles/iam.workloadIdentityUser \
{googleServiceAccount}
kubectl annotate serviceaccount \
--namespace default \
{k8sServiceAccount} \
iam.gke.io/gcp-service-account={googleServiceAccount}
Then I create my deployment, and set the k8sServiceAccount.
(Setting the service account was the part that I was missing)
kubectl create deployment {deployment-name} --image={containerImageURL}
kubectl set serviceaccount deployment {deployment-name} {k8sServiceAccount}
Then expose with a target of 8080
kubectl expose deployment {deployment-name} --name={service-name} --type=LoadBalancer --port 80 --target-port 8080
The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).
2. Cluster Service Account
This method is not preferred, because all VMs and pods in the cluster will have permissions based on the defined service account.
Create cluster with assigned service account
gcloud beta container clusters create [cluster-name] \
--release-channel regular \
--service-account {googleServiceAccount}
The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).
Then deploy and expose as above, but without setting the k8sServiceAccount
3. Scopes
This method is not preferred, because all VMs and pods in the cluster will have permissions based on the scopes defined.
Create cluster with assigned scopes (firestore only requires "cloud-platform", realtime database also requires "userinfo.email")
gcloud beta container clusters create $2 \
--release-channel regular \
--scopes https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email
Then deploy and expose as above, but without setting the k8sServiceAccount
The first two methods require a Google Service Account with the appropriate IAM roles assigned. Here are the roles I assigned to get a few Firebase products working:
FireStore: Cloud Datastore User (Datastore)
Realtime Database: Firebase Realtime Database Admin (Firebase Products)
Storage: Storage Object Admin (Cloud Storage)
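For example, the Firestore role above could be granted with gcloud like so (the project and service account names are placeholders):

```shell
# Grant the Google service account the Cloud Datastore User role
gcloud projects add-iam-policy-binding MY_PROJECT \
    --member serviceAccount:my-gsa@MY_PROJECT.iam.gserviceaccount.com \
    --role roles/datastore.user
```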
Going to close this question.
Just in case anyone stumbles onto it here's what fixed it for me.
1.) I re-followed the steps in the Google Documentation link above; this fixed the issue of my pods not launching.
2.) As for my update, I re-created my cluster and gave it the Cloud Datastore permission. I had assumed that the permissions were separate from what Workload Identity needed to function. I was wrong.
I hope this helps someone.

Google Cloud Platform Authentication: Recognized as end user authentication despite using a service account

Can anyone help? This one is really driving me crazy... Thank you!
I tried to use the Google Cloud Platform Speech-to-Text API.
Tools: Windows 10 && GCP && Python (PyCharm IDE)
I've created a service account as an owner of my speech-to-text project and generated a key in JSON from the GCP console, then set the environment variables.
Code I ran on WIN10 Powershell && CMD:
$env:GOOGLE_APPLICATION_CREDENTIALS="D:\GCloud speech-to-text\Speech To Text Series-93e03f36bc9d.json"
set GOOGLE_APPLICATION_CREDENTIALS=D:\GCloud speech-to-text\Speech To Text Series-93e03f36bc9d.json
PS: the added environment variables disappear in CMD and PowerShell after I reboot my laptop, but do show in the env list if added again.
I've enabled the google storage api and google speech-to-text api in GCP console.
I've tried explicitly providing the credentials in Python; same problem.
I've installed the Google Cloud SDK shell and initialized it by logging in to my account.
PYTHON SPEECH-TO-TEXT CODE(from GCP demo)
import io
import os

# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

# Instantiates a client
client = speech.SpeechClient()

# The name of the audio file to transcribe
file_name = os.path.join(
    os.path.dirname(__file__),
    'test_cre.m4a')

# Loads the audio into memory
with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()
    audio = types.RecognitionAudio(content=content)

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

# Detects speech in the audio file
response = client.recognize(config, audio)

for result in response.results:
    print('Transcript: {}'.format(result.alternatives[0].transcript))
----Expected to receive a 200 OK and the transcribed text when running the code above (a demo of the short Speech-to-Text API from the GCP documentation)
----But got:
D:\Python\main program\lib\site-packages\google\auth_default.py:66: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded for quota metric 'speech.googleapis.com/default_requests' and limit 'DefaultRequestsPerMinutePerProject' of service 'speech.googleapis.com' for consumer 'project_number:764086051850'.
ANOTHER WEIRD THING: the error info shows 'project_number:764086051850', which is different from my speech-to-text project number on GCP (I do distinguish project number and project ID), and the project number shown in the error info also varies every time the code runs. It seems I was sending requests to the wrong project?
My GOOGLE_APPLICATION_CREDENTIALS system environment variable disappears after I restart my laptop. After adding it again, it appears in the env list but isn't retained after another reboot.
Appreciate it if someone can help, thank you!
try to do this:
Run gcloud init -> authenticate with your account and choose your project
Run gcloud auth activate-service-account <service account email> --key-file=<JSON key file>
Run gcloud config list to validate your configuration.
Run your script and see if it's better.
Otherwise, try the same thing on a micro VM to validate your code, service account and environment (and to confirm the problem is Windows-specific)
For the Windows issues: I'm on a Chromebook, so I can't test and help you with this. However, I checked how environment variables work on Windows, and setting one updates the registry. Check that you don't have anything protecting against registry updates (antivirus, ...).
D:\Python\main program\lib\site-packages\google\auth_default.py:66:
UserWarning: Your application has authenticated using end user
credentials from Google Cloud SDK. We recommend that most server
applications use service accounts instead. If your application
continues to use end user credentials from Cloud SDK, you might
receive a "quota exceeded" or "API not enabled" error. For more
information about service accounts, see
https://cloud.google.com/docs/authentication/
warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
This error means that your code is not using a service account. Your code is configured to use ADC (Application Default Credentials). Most likely your code is using the Google Cloud SDK credentials configured and stored by the CLI gcloud.
To determine what credentials the Cloud SDK is using, execute this command:
gcloud auth list
The IAM Member ID, displayed as ACCOUNT, with the asterisk is the account used by the CLI and any applications that do not specify credentials.
To learn more about ADC, read this article that I wrote:
Google Cloud Application Default Credentials
google.api_core.exceptions.ResourceExhausted: 429 Quota exceeded for
quota metric 'speech.googleapis.com/default_requests' and limit
'DefaultRequestsPerMinutePerProject' of service
'speech.googleapis.com' for consumer 'project_number:764086051850'.
The Cloud SDK has the concept of default values. Execute gcloud config list. This will display various items. Look for project. Most likely this project does not have the Cloud Speech-to-Text API enabled.
ANOTHER WEIRD THING: the error info shows that
'project_number:764086051850', which is different from my
speech-to-text project_number on GCP (I do distinguish project number
and project ID), the project_number shown in the error info also
varies every time the code runs. It seems I was sending cloud
requirement of the wrong project?
To see the list of projects, Project IDs and Project Numbers that your current credentials can see (access) execute:
gcloud projects list
This command will display the Project Number given a Project ID:
gcloud projects list --filter="REPLACE_WITH_PROJECT_ID" --format="value(PROJECT_NUMBER)"
My GOOGLE_APPLICATION_CREDENTIALS system environment variables
disappear after I restart my laptop next time. After adding again, it
will appear in the env list but can't be stored after reboot again.
When you execute this command in a Command Prompt, it only persists for the life of the Command Prompt: set GOOGLE_APPLICATION_CREDENTIALS=D:\GCloud speech-to-text\Speech To Text Series-93e03f36bc9d.json. When you exit the Command Prompt, reboot, etc., the environment variable is destroyed.
To create persistent environment variables on Windows, edit the System Properties -> Environment Variables. You can launch this command as follows from a Command Prompt:
SystemPropertiesAdvanced.exe
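Alternatively, the setx command writes the variable to the registry from a Command Prompt, so it persists across reboots (note it only takes effect in newly opened Command Prompt windows; the path below follows the C:\Config suggestion that comes next):

```shell
setx GOOGLE_APPLICATION_CREDENTIALS "C:\Config\service-account.json"
```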
Suggestions to make your life easier:
Do NOT use long path names with spaces for your service account files. Create a directory such as C:\Config and place the file there with no spaces in the file name.
Do NOT use ADC (Application Default Credentials) when developing on your desktop. Specify the actual credentials that you want to use.
Change this line:
client = speech.SpeechClient()
To this:
client = speech.SpeechClient.from_service_account_json('c:/config/service-account.json')
Service Accounts have a Project ID inside them. Create the service account in the same project that you intend to use them (until you understand IAM and Service Accounts well).

Find the app which created a VM using python azure-sdk

How can I find the app whose credentials were used to launch a VM in Azure? I am able to use the compute client to get the admin_username attached to a VM, but that does not solve my use case, since a user can give any username when launching it.
compute_client = ComputeManagementClient(credentials, subscription_id)
vm_details = compute_client.virtual_machines.get(resource_group_name= <resource_group>, vm_name=<vm_name>, expand='instanceView')
username = vm_details.os_profile.admin_username
Is the app_name stored as a vm property anywhere that can be accessed via azure-sdk for python?
First, please clarify "launch". Do you mean initial deployment, or starting an already existing VM which was off? Or both :)?
I believe that this information is not part of the VM itself, but is recorded as an ARM event. It should therefore be available in the Activity Log:
https://learn.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-overview-activity-logs
Activity Log is available in the Monitor SDK:
https://learn.microsoft.com/en-us/python/api/overview/azure/monitoring?view=azure-python
If you want to test this quickly, try the CLI:
https://learn.microsoft.com/en-us/cli/azure/monitor/activity-log?view=azure-cli-latest#az-monitor-activity-log-list
Since the CLI uses the same SDK, if you can find your information with the CLI, you can definitely get it with the SDK.
(I work at MS in the Python team, but not in the VM or Monitor team, it's why I start my post with "believe", but I really think it's accurate based on my knowledge of Azure)
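A minimal Python sketch of that Monitor SDK approach, assuming the azure-mgmt-monitor package and the credentials/subscription_id objects from the question's snippet (the function names and the operation-name check are my own illustration, not a confirmed recipe):

```python
def activity_log_filter(resource_uri, start_iso):
    # Build the OData filter string the Activity Log API expects
    return ("eventTimestamp ge '{}' and resourceUri eq '{}'"
            .format(start_iso, resource_uri))

def find_vm_creator(credentials, subscription_id, vm_resource_uri, start_iso):
    # Requires the azure-mgmt-monitor package
    from azure.mgmt.monitor import MonitorManagementClient
    monitor_client = MonitorManagementClient(credentials, subscription_id)
    for event in monitor_client.activity_logs.list(
            filter=activity_log_filter(vm_resource_uri, start_iso),
            select='eventTimestamp,caller,operationName'):
        # When the caller was an app (service principal), 'caller' holds
        # its application ID rather than a user email
        if 'virtualMachines/write' in event.operation_name.value:
            return event.caller
    return None
```

The idea is to filter the Activity Log to the VM's resource URI and look at the caller of the write (create/update) operation.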