Authenticating to Google Cloud Firestore from GKE with Workload Identity - node.js

I'm trying to write a simple backend that will access my Google Cloud Firestore, it lives in the Google Kubernetes Engine. On my local I'm using the following code to authenticate to Firestore as detailed in the Google Documentation.
if (process.env.NODE_ENV !== 'production') {
const result = require('dotenv').config()
//Additional error handling here
}
This pulls the GOOGLE_APPLICATION_CREDENTIALS environment variable and populates it with my google-application-credentals.json which I got from creating a service account with the "Cloud Datastore User" role.
So, locally, my code runs fine. I can reach my Firestore and do everything I need to. However, the problem arises once I deploy to GKE.
I followed this Google Documentation to set up a Workload Identity for my cluster, I've created a deployment and verified that the pods all are using the correct IAM Service Account by running:
kubectl exec -it POD_NAME -c CONTAINER_NAME -n NAMESPACE sh
> gcloud auth list
I was under the impression from the documentation that authentication would be handled for my service as long as the above held true. I'm really not sure why but my Firestore() instance is behaving as if it does not have the necessary credentials to access the Firestore.
In case it helps below is my declaration and implementation of the instance:
const firestore = new Firestore()
const server = new ApolloServer({
schema: schema,
dataSources: () => {
return {
userDatasource: new UserDatasource(firestore)
}
}
})
UPDATE:
In a bout of desperation I decided to tear down everything and re-build it. Following everything over step by step I appear to have either encountered a bug or (more likely) I did something mildly wrong the first time. I'm now able to connect to my backend service. However, I'm now getting a different error. Upon sending any request (I'm using GraphQL, but in essence it's any REST call) I get back a 404.
Inspecting the logs yields the following:
'Getting metadata from plugin failed with error: Could not refresh access token: A Not Found error was returned while attempting to retrieve an accesstoken for the Compute Engine built-in service account. This may be because the Compute Engine instance does not have any permission scopes specified: Could not refresh access token: Unsuccessful response status code. Request failed with status code 404'
A cursory search for this issue doesn't seem to return anything related to what I'm trying to accomplish, and so I'm back to square one.

I think your initial assumption was correct! Workload Identity is not functioning properly if you still have to specify scopes. In the Workload article you have linked, scopes are not used.
I've been struggling with the same issue and have identified three ways to get authenticated credentials in the pod.
1. Workload Identity (basically the Workload Identity article above with some deployment details added)
This method is preferred because it allows each pod deployment in a cluster to be granted only the permissions it needs.
Create cluster (note: no scopes or service account defined)
gcloud beta container clusters create {cluster-name} \
--release-channel regular \
--identity-namespace {projectID}.svc.id.goog
Then create the k8sServiceAccount, assign roles, and annotate.
gcloud container clusters get-credentials {cluster-name}
kubectl create serviceaccount --namespace default {k8sServiceAccount}
gcloud iam service-accounts add-iam-policy-binding \
--member serviceAccount:{projectID}.svc.id.goog[default/{k8sServiceAccount}] \
--role roles/iam.workloadIdentityUser \
{googleServiceAccount}
kubectl annotate serviceaccount \
--namespace default \
{k8sServiceAccount} \
iam.gke.io/gcp-service-account={googleServiceAccount}
Then I create my deployment, and set the k8sServiceAccount.
(Setting the service account was the part that I was missing)
kubectl create deployment {deployment-name} --image={containerImageURL}
kubectl set serviceaccount deployment {deployment-name} {k8sServiceAccount}
Then expose with a target of 8080
kubectl expose deployment {deployment-name} --name={service-name} --type=LoadBalancer --port 80 --target-port 8080
The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).
2. Cluster Service Account
This method is not preferred, because all VMs and pods in the cluster will have permissions based on the defined service account.
Create cluster with assigned service account
gcloud beta container clusters create [cluster-name] \
--release-channel regular \
--service-account {googleServiceAccount}
The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).
Then deploy and expose as above, but without setting the k8sServiceAccount
3. Scopes
This method is not preferred, because all VMs and pods in the cluster will have permisions based on the scopes defined.
Create cluster with assigned scopes (firestore only requires "cloud-platform", realtime database also requires "userinfo.email")
gcloud beta container clusters create $2 \
--release-channel regular \
--scopes https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email
Then deploy and expose as above, but without setting the k8sServiceAccount
The first two methods require a Google Service Account with the appropriate IAM roles assigned. Here are the roles I assigned to get a few Firebase products working:
FireStore: Cloud Datastore User (Datastore)
Realtime Database: Firebase Realtime Database Admin (Firebase Products)
Storage: Storage Object Admin (Cloud Storage)

Going to close this question.
Just in case anyone stumbles onto it here's what fixed it for me.
1.) I re-followed the steps in the Google Documentation link above, this fixed the issue of my pods not launching.
2.) As for my update, I re-created my cluster and gave it the Cloud Datasource permission. I had assumed that the permissions were seperate from what Workload Identity needed to function. I was wrong.
I hope this helps someone.

Related

Azure DevOps Release Pipeline || To sign in, use a web browser to open

I created the aks cluster with azure service principal id and i provided the contributer role according to the subscription and resource group.
For each and every time when i executed the pipeline the sign-in is asking and after i authenticated it is getting the data.
Also the "kubectl get" task is taking more than 30 min and is getting "Kubectl Server Version: Could not find kubectl server version"
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code CRA2XssWEXUUA to authenticate
Thanks in advance
What is the version of the created cluster?
I'm assuming from your snapshot that you are using az in order to get credentials for it.
Old azure auth plugin is deprecated in V1.22+. If you are using V1.22 or above you should use kubelogin in order authenticate.
You will also need to update your kube config accordingly:
kubelogin convert-kubeconfig
and specifically if you're logging via az:
kubelogin convert-kubeconfig -l azurecli
Note that the flag -l azurecli is important here: the default value is "devicecode" which will not consider your az as a logging method - and you will still be requested a browser authentication.
Alternatively, you can set environment variable:
AAD_LOGIN_METHOD=azurecli
Because you are getting sign in request and not the deprecation warning for the auth plugin, I suspect that you already have kubelogin installed on your agent, and you just need to update the kube config file
What task are you using? There is official kubectl task: https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/deploy/kubernetes?view=azure-devops
It requires the service connection.
If you still want to execute kubectl directly, you should run the following before the kubectl inside the AzureCLI task:
az aks get-credentials --resource-group "$(resourceGroup)" --name "$(k8sName)" --overwrite-existing
Please use Selfhosted agents for executing your commands. looks like you have private endpoints for your AKS and requests are only allowed from trusted devices.
I ran into the same issue and for me the fix was to change the Connection Type in the stage definition from Azure Resource Manager to Kubernetes Service Connection - check on the screenshot below.
Then you should be able to also specify the connection type in each of the tasks where you are running kubectl or helm commands. For example, in a kubectl task, under Kubernetes Cluster --> Service connection type use the Kubernetes Service Connection:
As mentioned by #DevOpsEngg, the problem could be related to private endpoints but I wouldn't say that it is regarding selfhosted agents, because I'm using these. As an extra comment - this started happening when I added more than one user to the cluster, so you might want to check user permissions and authentication. Unfortunately, I'm still getting used to K8s so I don't have more info about that.

How to use Cloud Trace with Nodejs on GKE with workload identity enabled?

I'm trying to set up Cloud Trace on a GKE cluster with workload identity enabled. My pod uses a service account, which has the Cloud Trace Agent role. (I also tried giving it the Owner role, to rule out permission issues, but that didn't change the error.)
I followed the Node.js quickstart, which says to add the following snippet to my code:
require('#google-cloud/trace-agent').start();
When I try to add a trace, I get the following error:
#google-cloud/trace-agent DEBUG TraceWriter#publish: Received error while publishing traces to cloudtrace.googleapis.com: Error: Could not refresh access token: A Forbidden error was returned while attempting to retrieve an access token for the Compute Engine built-in service account. This may be because the Compute Engine instance does not have the correct permission scopes specified: Could not refresh access token: Unsuccessful response status code. Request failed with status code 403
(How) can I configure the library to work in this scenario?
In order to answer your question on comments above: correct me if I'm wrong - workload identity is a cluster feature, not connected to a namespace?
And seeing that you have fixed your problem by configuring the binding between KSA/K8s Namespace and GCP SA I will add a response to add more context that I believe could help clarify this.
Yes you are right, Workload identity is a GKE cluster feature that lets you bind an identity from K8s (Kubernetes Service Account (KSA)) with a GCP identity (Google Service Account(GSA)) in order to have your workloads authenticated with an specific GCP identity and with enough permissions to be able to reach certain APIs (depending on the permissions that your GCP service account has). k8s namespaces and KSA take a critical role here, as KSA are Namespaced resources.
Therefore, in order to authenticate correctly your workloads (containers) and with a GCP Service account, you need to create them in the configured k8s Namespace and with the configured KSA, as mentioned in this doc
If you create your workloads on a different k8s Namespace (meaning using a different KSA), you will not be able to get an authenticated identity for your workloads, instead of that, your workloads will be authenticated with the Workload Identity Pool/Workload Identity Namespace, which is: PROJECT_ID.svc.id.goog. Meaning that when you create a container with the GCP SDK installed and run a glcoud auth list you will get PROJECT_ID.svc.id.goog as the authenticated identity, which is an IAM object but not an identity with permission in IAM. So your workloads will be lacking of permissions.
Then you need to create your containers in the configured namespace and with the configured service account to be able to have a correct identity in your containers and with IAM permissions.
I'm assuming that above (authentication with lack of permission and lack of an actual IAM Identity) is what happened here, as you mentioned in your response, you just added the needed binding between GSA and the KSA, meaning that your container was lacking of an identity with actual IAM permissions.
Just to be clear on this, Workload Identity allows you to authenticate your workloads with a service account different from the one on your GKE nodes. If your application runs inside a Google Cloud environment that has a default service account, your application can retrieve the service account credentials to call Google Cloud APIs. Such environments include Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run, and Cloud Functions, here.
With above comment I want to say that even if you do not use Workload Identity, your containers will be authenticated as they are running on GKE, which by default use a service account, and this service account is inherited from the nodes to your containers, the default service account (Compute service Account) and its scopes are enough to write from containers to Cloud Trace and that is why you were able to see traces with a GKE cluster with Workload Identity disabled, because the default service account was used on your containers and nodes.
If you test this on both environments:
GKE cluster with Workload Identity: You will be able to see, with the correct config, a service account different than the default, authenticating your workloads/containers.
GKE cluster with Workloads Identity disabled: You will be able to see the same service account used by your nodes (by default the compute engine service account with Editor role and scopes applied on your nodes when using default service account) on your Containers.
These tests can be performed by spinning the same container you used on your response, which is:
kubectl run -it \
--image google/cloud-sdk:slim \
--serviceaccount KSA_NAME \ ##If needed
--namespace K8S_NAMESPACE \ ##If needed
workload-identity-test
And running `glcoud auth list to see the identity you are authenticated with on your containers.
Hope this can help somehow!
It turned out I had misconfigured the IAM service account.
I managed to get a more meaningful error message by running a new pod in my namespace with the gcloud cli installed:
kubectl run -it \
--image gcr.io/google.com/cloudsdktool/cloud-sdk \
--serviceaccount $GKE_SERVICE_ACCOUNT test \
-- bash
after that, just running any gcloud command gave an error message containing (emphasis mine):
Unable to generate access token; IAM returned 403 Forbidden: The caller does not have permission
This error could be caused by a missing IAM policy binding on the target IAM service account.
Running
gcloud iam service-accounts get-iam-policy $SERVICE_ACCOUNT
indeed showed that the binding to the Kubernetes service account was missing.
Adding it manually fixed the issue:
gcloud iam service-accounts add-iam-policy-binding \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:$PROJECT.svc.id.goog[$NAMESPACE/$GKE_SERVICE_ACCOUNT]" \
$SERVICE_ACCOUNT
After more research, the underlying problem was that I created my service accounts using Config Connector but hadn't properly annotated the Kubernetes namespace with the Google Cloud project to deploy the resources in:
kubectl annotate namespace "$NAMESPACE" cnrm.cloud.google.com/project-id="$PROJECT"
Therefore, Cloud Connector could not add the IAM policy binding.

How to access Files in Google Cloud Storage through GKE pods

I'm trying get image files of Google Cloud Storage (GCS) in my Node.js application using Axios client. On develop mode using my PC I pass a Bearer Token and all works properly.
But, I need to use this in production in a cluster hosted on Google Kubernetes Engine (GKE).
I made recommended tuturials to create a service account (GSA), then I vinculed with kubernetes account (KSA), via Workload identity approach, but when I try get files througt one endpoint on my app, I'm receiving:
{"statusCode":401,"message":"Unauthorized"}
What is missing to make?
Update: What I've done:
Create Google Service Account
https://cloud.google.com/iam/docs/creating-managing-service-accounts
Create Kubernetes Service Account
# gke-access-gcs.ksa.yaml file
apiVersion: v1
kind: ServiceAccount
metadata:
name: gke-access-gcs
kubectl apply -f gke-access-gcs.ksa.yaml
Relate KSAs and GSAs
gcloud iam service-accounts add-iam-policy-binding \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:cluster_project.svc.id.goog[k8s_namespace/ksa_name]" \
gsa_name#gsa_project.iam.gserviceaccount.com
Note the KSA and complete the link between KSA and GSA
kubectl annotate serviceaccount \
--namespace k8s_namespace \
ksa_name \
iam.gke.io/gcp-service-account=gsa_name#gsa_project.iam.gserviceaccount.com
Set Read and Write role:
gcloud projects add-iam-policy-binding project-id \
--member=serviceAccount:gsa-account#project-id.iam.gserviceaccount.com \
--role=roles/storage.objectAdmin
Test access:
kubectl run -it \
--image google/cloud-sdk:slim \
--serviceaccount ksa-name \
--namespace k8s-namespace \
workload-identity-test
The above command works correctly. Note that was passed --serviceaccount and workload-identity. Is this necessary to GKE?
PS: I don't know if this influences, but I am using SQL Cloud with proxy in the project.
EDIT
Issue portrayed in the question is related to the fact that axios client does not use the Application Default Credentials (as official Google libraries) mechanism that Workload Identity takes advantage of. The ADC checks:
If the environment variable GOOGLE_APPLICATION_CREDENTIALS is set, ADC uses the service account file that the variable points to.
If the environment variable GOOGLE_APPLICATION_CREDENTIALS isn't set, ADC uses the default service account that Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run, and Cloud Functions provide.
-- Cloud.google.com: Authentication: Production
This means that axios client will need to fall back to the Bearer token authentication method to authenticate against Google Cloud Storage.
The authentication with Bearer token is described in the official documentation as following:
API authentication
To make requests using OAuth 2.0 to either the Cloud Storage XML API or JSON API, include your application's access token in the Authorization header in every request that requires authentication. You can generate an access token from the OAuth 2.0 Playground.
Authorization: Bearer OAUTH2_TOKEN
The following is an example of a request that lists objects in a bucket.
JSON API
Use the list method of the Objects resource.
GET /storage/v1/b/example-bucket/o HTTP/1.1
Host: www.googleapis.com
Authorization: Bearer ya29.AHES6ZRVmB7fkLtd1XTmq6mo0S1wqZZi3-Lh_s-6Uw7p8vtgSwg
-- Cloud.google.com: Storage: Docs: Api authentication
I've included basic example of a code snippet using Axios to query the Cloud Storage (requires $ npm install axios):
const Axios = require('axios');
const config = {
headers: { Authorization: 'Bearer ${OAUTH2_TOKEN}' }
};
Axios.get(
'https://storage.googleapis.com/storage/v1/b/BUCKET-NAME/o/',
config
).then(
(response) => {
console.log(response.data.items);
},
(err) => {
console.log('Oh no. Something went wrong :(');
// console.log(err) <-- Get the full output!
}
);
I left below example of Workload Identity setup with a node.js official library code snippet as it could be useful to other community members.
Posting this answer as I've managed to use Workload Identity and a simple nodejs app to send and retrieve data from GCP bucket.
I included some bullet points for troubleshooting potential issues.
Steps:
Check if GKE cluster has Workload Identity enabled.
Check if your Kubernetes service account is associated with your Google Service account.
Check if example workload is using correct Google Service account when connecting to the API's.
Check if your Google Service account is having correct permissions to access your bucket.
You can also follow the official documentation:
Cloud.google.com: Kubernetes Engine: Workload Identity
Assuming that:
Project (ID) named: awesome-project <- it's only example
Kubernetes namespace named: bucket-namespace
Kubernetes service account named: bucket-service-account
Google service account named: google-bucket-service-account
Cloud storage bucket named: workload-bucket-example <- it's only example
I've included the commands:
$ kubectl create namespace bucket-namespace
$ kubectl create serviceaccount --namespace bucket-namespace bucket-service-account
$ gcloud iam service-accounts create google-bucket-service-account
$ gcloud iam service-accounts add-iam-policy-binding --role roles/iam.workloadIdentityUser --member "serviceAccount:awesome-project.svc.id.goog[bucket-namespace/bucket-service-account]" google-bucket-service-account#awesome-project.iam.gserviceaccount.com
$ kubectl annotate serviceaccount --namespace bucket-namespace bucket-service-account iam.gke.io/gcp-service-account=google-bucket-service-account#awesome-project-ID.iam.gserviceaccount.com
Using the guide linked above check the service account authenticating to API's:
$ kubectl run -it --image google/cloud-sdk:slim --serviceaccount bucket-service-account --namespace bucket-namespace workload-identity-test
The output of $ gcloud auth list should show:
Credentialed Accounts
ACTIVE ACCOUNT
* google-bucket-service-account#AWESOME-PROJECT.iam.gserviceaccount.com
To set the active account, run:
$ gcloud config set account `ACCOUNT`
Google service account created earlier should be present in the output!
Also it's required to add the permissions for the service account to the bucket. You can either:
Use Cloud Console
Run: $ gsutil iam ch serviceAccount:google-bucket-service-account#awesome-project.iam.gserviceaccount.com:roles/storage.admin gs://workload-bucket-example
To download the file from the workload-bucket-example following code can be used:
// Copyright 2020 Google LLC
/**
* This application demonstrates how to perform basic operations on files with
* the Google Cloud Storage API.
*
* For more information, see the README.md under /storage and the documentation
* at https://cloud.google.com/storage/docs.
*/
const path = require('path');
const cwd = path.join(__dirname, '..');
function main(
bucketName = 'workload-bucket-example',
srcFilename = 'hello.txt',
destFilename = path.join(cwd, 'hello.txt')
) {
const {Storage} = require('#google-cloud/storage');
// Creates a client
const storage = new Storage();
async function downloadFile() {
const options = {
// The path to which the file should be downloaded, e.g. "./file.txt"
destination: destFilename,
};
// Downloads the file
await storage.bucket(bucketName).file(srcFilename).download(options);
console.log(
`gs://${bucketName}/${srcFilename} downloaded to ${destFilename}.`
);
}
downloadFile().catch(console.error);
// [END storage_download_file]
}
main(...process.argv.slice(2));
The code is exact copy from:
Googleapis.dev: NodeJS: Storage
Github.com: Googleapis: Nodejs-storage: downloadFile.js
Running this code should produce an output:
root#ubuntu:/# nodejs app.js
gs://workload-bucket-example/hello.txt downloaded to /hello.txt.
root#ubuntu:/# cat hello.txt
Hello there!

How to get all running PODs on Kubernetes cluster

This simple Node.js program works fine on local because it pulls the kubernetes config from my local /root/.kube/config file
const Client = require('kubernetes-client').Client;
const Config = require('kubernetes-client/backends/request').config;
const client = new K8sClient({ config: Config.fromKubeconfig(), version: '1.13' });
const pods = await client.api.v1.namespaces('xxxxx').pods.get({ qs: { labelSelector: 'application=test' } });
console.log('Pods: ', JSON.stringify(pods));
Now I want to run it as a Docker container on cluster and get all current cluster's running PODs (for same/current namespace). Now of course it fails:
Error: { Error: ENOENT: no such file or directory, open '/root/.kube/config'
So how make it work when deployed as a Docker container to cluster?
This little service needs to scan all running PODs... Assume it doesn't need pull config data since it's already deployed.. So it needs to access PODs on current cluster
Couple of concepts to grab your head around first:
Service account
Role
Role binding
To perform you end goal (which if i understand correct): Containerize Node js application
Step 1: Put application in a container
Step 2: Create a deployment/statefulset/daemonset as per you requirement using the container created above in step 1
Explanation:
In step 2 above {by default} if you do not (explicitly) mention a serviceaccount (custom) then it will be the default account the credentials of which are mounted inside the container (by default) here
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-xxxx
readOnly: true
which can be verified by this command after (successful) pod creation
kubectl get pod -n {yournamespace(by default is default)} POD_NAME -o yaml
Now (Gotchas!!) if you cannot access the cluster with those credentials then depending on which service account you are using and the rights of that serviceaccount needs to be accessed. For example if you are using abc serviceaccount which does not have rolebinding to it then you will not be able to view the cluster. In that case you need to create (first) a role (to read pods) and a rolebinding (for that role) to the serviceaccount.
UPDATE:The problem got resolved by changing Config.fromKubeconfig() to Config.getInCluster() Ref
Clarification: fromKubeconfig() function is good if you are running your application on a node which is a part of kubernetes cluster and has cluster accessing token saved here: /$USER/.kube/config but if you want to run the nodeJS appilcation in a container in a pod then you need this Config.getInCluster() to load the token.
if you are nosy enough then check the comments of this answer! :P
Note: here the nodejs library in discussion is this

"insufficient authentication scopes" from Google API when calling from K8S cluster

I'm trying to report Node.js errors to Google Error Reporting, from one of our kubernetes deployments running on a GCP/GKE cluster with RBAC. (i.e. permissions defined in a service account associated to the cluster)
const googleCloud = require('#google-cloud/error-reporting');
const googleCloudErrorReporting = new googleCloud.ErrorReporting();
googleCloudErrorReporting.report('[test] dummy error message');
This works only in certain environments:
it works when run on my laptop, using a service account that has the "Errors Writer" role
it works when running in my cluster as a K8S job, after having added the "Errors Writer" role to that cluster's service account
it causes the following error when called from my Node.js application running in one of my K8S deployments:
ERROR:#google-cloud/error-reporting: Encountered an error while attempting to transmit an error to the Stackdriver Error Reporting API.
Error: Request had insufficient authentication scopes.
It feels like the job did pick up the permission changes of the cluster's service account, whereas my deployment did not.
I did try to re-create the deployment to make it refresh its auth token, but the error is still happening...
Any ideas?
UPDATE: I ended up following Jérémie Girault's suggestion: create a service account and bind it to my deployment. It works!
The error message has to do with the access scopes set on the cluster when using the default service account. You must enable access to the appropriate API.
As you mentioned, creating a separate service account, providing it the appropriate IAM permissions and linking it to your cluster or workload will bypass this error as well.

Resources