Get the list of application packages available for an Azure Batch account

I'm writing a Python app that launches an Azure Batch job.
I want to create a pool based on user inputs.
For simplicity, I'll just add all the applications present in the Batch account to the pool.
However, I'm not able to get the list of available application packages.
This is the relevant portion of the code:
import azure.batch.batch_service_client as batch
from azure.common.credentials import ServicePrincipalCredentials

credentials = ServicePrincipalCredentials(
    client_id='xxxxx',
    secret='xxxxx',
    tenant='xxxx',
    resource="https://batch.core.windows.net/"
)

batch_client = batch.BatchServiceClient(
    credentials,
    base_url=self.AppData['CloudSettings'][0]['BatchAccountURL'])

# Get list of applications
batchApps = batch_client.application.list()
I can create a pool, so the credentials are good, and there are applications in the account, but the returned list is empty.
Can anybody help me with this?
Thank you,
Guido
Update:
I tried:
import azure.batch.batch_service_client as batch
batchApps = batch.ApplicationOperations.list(batch_client)
and
import azure.batch.operations as batch_operations
batchApps = batch_operations.ApplicationOperations.list(batch_client)
but they don't seem to work; batchApps is always empty.
I don't think it's an authentication issue, since I'd get an error otherwise.
At this point I wonder if it's just a bug in the Python SDK?
The SDK versions I'm using are:
azure.batch: 4.1.3
azure: 4.0.0
(Screenshot of the empty batchApps variable omitted.)

Are these the links you are looking for?
Understanding the application package concept: https://learn.microsoft.com/en-us/azure/batch/batch-application-packages
Since it's the Python SDK in action here, the ApplicationOperations class documents both the list and get operations: https://learn.microsoft.com/en-us/python/api/azure-batch/azure.batch.operations.applicationoperations?view=azure-python
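A detail worth checking in those docs: the list operation only returns applications and versions that are available for use on compute nodes, so packages that were never activated won't show up. Also, the SDK returns a lazy paged collection that can look empty in a debugger until you iterate it. A minimal sketch, reusing batch_client from the question:
# batch_client is the BatchServiceClient built in the question.
# application.list() returns a paged collection; iterate it to fetch results.
for app in batch_client.application.list():
    print(app.id, app.versions)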
Hope this helps.

I haven't tried the Azure Python SDK lately, but the way I solved this was to use the Azure REST API:
https://learn.microsoft.com/en-us/rest/api/batchservice/application/list
For the authorization, I had to create an application and give it access to the Batch service, and then I programmatically generated the token with the following request:
import requests

data = {'grant_type': 'client_credentials',
        'client_id': clientId,
        'client_secret': clientSecret,
        'resource': 'https://batch.core.windows.net/'}
postReply = requests.post('https://login.microsoftonline.com/' + tenantId + '/oauth2/token', data)
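From there, a hedged sketch of using the returned access_token against the application list endpoint; the api-version value and batchAccountUrl are assumptions (check the REST docs for the current version, and use your account endpoint, e.g. https://<account>.<region>.batch.azure.com):
# Extract the bearer token from the OAuth reply above.
token = postReply.json()['access_token']
headers = {'Authorization': 'Bearer ' + token}
# List the applications; batchAccountUrl is a placeholder for your endpoint.
r = requests.get(batchAccountUrl + '/applications',
                 params={'api-version': '2018-08-01.7.0'},
                 headers=headers)
print(r.json())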

Related

How to get reference to AzureML Workspace Class in scoring script?

My scoring function needs to refer to an Azure ML Registered Dataset, for which I need a reference to the AzureML Workspace object. When I include this in the init() function of the scoring script, it gives the following error:
"code": "ScoreInitRestart",
"message": "Your scoring file's init() function restarts frequently. You can address the error by increasing the value of memory_gb in deployment_config."
On debugging, the issue is:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code [REDACTED] to authenticate.
How can I resolve this issue without exposing Service Principal Credentials in the scoring script?
I found a workaround to reference the workspace in the scoring script. Below is a code snippet of how one can do that.
My deploy script looks like this:
from azureml.core import Environment
from azureml.core.model import InferenceConfig

# Add Python dependencies for the models
scoringenv = Environment.from_conda_specification(
    name="scoringenv",
    file_path="config_files/scoring_env.yml"
)

# Create a dictionary to set up the env variables
env_variables = {'tenant_id': tenant_id,
                 'subscription_id': subscription_id,
                 'resource_group': resource_group,
                 'client_id': client_id,
                 'client_secret': client_secret
                 }
scoringenv.environment_variables = env_variables

# Configure the scoring environment
inference_config = InferenceConfig(
    entry_script='score.py',
    source_directory='scripts/',
    environment=scoringenv
)
What I am doing here is creating an image with the Python dependencies (in scoring_env.yml) and passing a dictionary of the secrets as environment variables. I have the secrets stored in the key vault.
You may define and pass any native Python datatypes as variables.
Now, in my score.py, I reference these environment variables in init() like this:
import os

tenant_id = os.environ.get('tenant_id')
client_id = os.environ.get('client_id')
client_secret = os.environ.get('client_secret')
subscription_id = os.environ.get('subscription_id')
resource_group = os.environ.get('resource_group')
Once you have these variables, you may create a workspace object using Service Principal authentication, as @Anders Swanson mentioned in his reply.
Another way to resolve this may be by using managed identities for AKS. I did not explore that option.
Hope this helps! Please let me know if you found a better way of solving this.
Thanks!
Does your score.py include a Workspace.get() call with auth=InteractiveLoginAuthentication? You should swap it to ServicePrincipalAuthentication (docs), to which you pass your credentials, ideally through environment variables.
import os
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

svc_pr_password = os.environ.get("AZUREML_PASSWORD")

svc_pr = ServicePrincipalAuthentication(
    tenant_id="my-tenant-id",
    service_principal_id="my-application-id",
    service_principal_password=svc_pr_password)

ws = Workspace(
    subscription_id="my-subscription-id",
    resource_group="my-ml-rg",
    workspace_name="my-ml-workspace",
    auth=svc_pr
)
print("Found workspace {} at location {}".format(ws.name, ws.location))
You can get the workspace object directly from your run.
from azureml.core.run import Run
ws = Run.get_context().experiment.workspace
I came across the same challenge. As you are mentioning AML Datasets, I assume an AML Batch Endpoint suits your scenario. The scoring script for a batch endpoint is meant to receive a list of files as input. When invoking the batch endpoint, you can pass (among other things) AML Datasets (consider that an endpoint is deployed in the context of an AML workspace). Have a look at this.
When running on an AML compute cluster, use the following code:
from azureml.pipeline.core import PipelineRun as Run

run = Run.get_context()
ws = run.experiment.workspace
Note: this works only when you run on an AML cluster.
Run.get_context() gets the context of the current run on the AML cluster; from that object we can extract the workspace, which allows you to authenticate to the AML workspace from within the AML cluster.

How to use google-api-client for Google Cloud Logging

I want to access Google Cloud Platform Logging from a Python script.
I was able to access these logs from https://cloud.google.com/logging/docs/reference/v2/rest/v2/entries/list (using the "Try this API" panel).
Now I want to do the same from a Python script. I saw that in the previous step, an authorization token is created automatically.
I am trying with this code sample, but then I don't know how to POST https://logging.googleapis.com/v2/entries:list using discovery:
from google.oauth2 import service_account
import googleapiclient.discovery
credentials = service_account.Credentials.from_service_account_file(service_account_file)
logging = googleapiclient.discovery.build('logging', 'v2', credentials=credentials)
Then I have tried with this code sample:
import requests

payload = {
    "projectIds": ["my-project"],
    "resourceNames": [],
    "filter": "resource.type=cloudiot_device",
    "orderBy": "timestamp desc",
    "pageSize": 1
}
headers = {"Authorization": "Bearer AAAAAAA"}
r = requests.post("https://logging.googleapis.com/v2/entries:list", params=payload, headers=headers)
That code sample works correctly, but for the AAAAAAA token I copied and pasted the one shown at https://cloud.google.com/logging/docs/reference/v2/rest/v2/entries/list; I don't know how to generate this token from a Python script.
Thanks!
This is less easy to find because many of Google's Cloud (!) services now prefer Cloud Client libraries.
However...
import google.auth
from googleapiclient import discovery
credentials, project = google.auth.default()
service = discovery.build("logging", "v2", credentials=credentials)
Auth: https://pypi.org/project/google-auth/
Now, this uses Google Application Default credentials and I recommend you create a service account, generate a key and grant the account the permission needed. You will then need to export GOOGLE_APPLICATION_CREDENTIALS before running your code.
PROJECT=[[YOUR-PROJECT]]
BILLING=[[YOUR-BILLING]]
ACCOUNT=[[YOUR-ACCOUNT]]

gcloud projects create ${PROJECT}

gcloud beta billing projects link ${PROJECT} \
  --billing-account=${BILLING}

gcloud iam service-accounts create ${ACCOUNT} \
  --project=${PROJECT}

EMAIL="${ACCOUNT}@${PROJECT}.iam.gserviceaccount.com"

gcloud iam service-accounts keys create ${PWD}/${ACCOUNT}.json \
  --iam-account=${EMAIL} \
  --project=${PROJECT}

# See: https://cloud.google.com/iam/docs/understanding-roles#logging-roles
gcloud projects add-iam-policy-binding ${PROJECT} \
  --member=serviceAccount:${EMAIL} \
  --role=roles/logging.viewer

export GOOGLE_APPLICATION_CREDENTIALS=${PWD}/${ACCOUNT}.json

python3 your-code.py
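With the service built, here is a hedged sketch of the entries:list call through the discovery client; the body mirrors the payload from the question, and the project ID is a placeholder:
# POST https://logging.googleapis.com/v2/entries:list with this body.
body = {
    "resourceNames": ["projects/my-project"],
    "filter": "resource.type=cloudiot_device",
    "orderBy": "timestamp desc",
    "pageSize": 1
}
response = service.entries().list(body=body).execute()
for entry in response.get("entries", []):
    print(entry.get("timestamp"), entry.get("jsonPayload"))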
Ok, thanks to the Google Engineer, the first part of the solution is to disable the SDK's use of gRPC and force HTTP so that page_size is respected:
client = logging.Client(_use_grpc=0)
Alternatively, you can set GOOGLE_CLOUD_DISABLE_GRPC="{{anything}}" in the environment.
And the second part of the solution is to only iterate over the first page of page_size results:
iterator = logger.list_entries(
    order_by=DESCENDING,
    page_size=page_size,
)
print(type(iterator))
for entry in next(iterator.pages):
    timestamp = entry.timestamp.isoformat()
    print("{}".format(timestamp))
NOTE: forcing HTTP means logger.list_entries returns an HTTPIterator instead of a (gRPC) generator, hence the ability to use next() and the pages property.
NOTE: the 'trick' is to enumerate only the first page of page_size results. There may be multiple pages, but we ignore the subsequent ones.
I am using the following code sample to extract log information from Google Cloud Logging.
import os
from google.cloud import logging
from google.cloud.logging import DESCENDING

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "my-service-account-file"

def list_entries(logger_name):
    """Lists the most recent entries for a given logger."""
    logging_client = logging.Client()
    logger = logging_client.logger(logger_name)
    print("Listing entries for logger {}:".format(logger.name))
    filter_str = "resource.type=cloudiot_device AND resource.labels.device_num_id=00000000000 AND jsonPayload.eventType=PUBLISH"
    for entry in logger.list_entries(filter_=filter_str, order_by=DESCENDING, page_size=10):
        timestamp = entry.timestamp.isoformat()
        print("  {}: {}".format(timestamp, entry.payload))

list_entries("cloudiot.googleapis.com%2Fdevice_activity")
My goal is to run this Python script every 5 minutes and get the last 5 entries from Logging. My problem is that this code sample starts extracting entries but never stops. How can I limit the number of entries?
Thanks!

How to delete GKE (Google Kubernetes Engine) cluster using python?

I'm new to GKE with Python. I would like to delete my GKE (Google Kubernetes Engine) cluster using a Python script.
I found the delete_cluster() API in the google-cloud-container Python library for deleting a GKE cluster.
https://googleapis.dev/python/container/latest/index.html
But I'm not sure how to call that API with the required parameters in Python. Can anyone explain it with an example?
Or is there any other way to delete a GKE cluster in Python?
Thanks in advance.
First, you'd need to configure the Python client for Google Kubernetes Engine as explained in this section of the link you shared. Basically, set up a virtual environment and install the library with pip install google-cloud-container.
If you are running the script within an environment such as Cloud Shell, with a user that has enough access to manage the GKE resources (at least the Kubernetes Engine Cluster Admin role assigned), the client library will handle the necessary authentication from the script automatically and the following script will most likely work:
from google.cloud import container_v1

project_id = "YOUR-PROJECT-NAME"    # Change me.
zone = "ZONE-OF-THE-CLUSTER"        # Change me.
cluster_id = "NAME-OF-THE-CLUSTER"  # Change me.
name = "projects/" + project_id + "/locations/" + zone + "/clusters/" + cluster_id

client = container_v1.ClusterManagerClient()
response = client.delete_cluster(name=name)
print(response)
Notice that, as per the delete_cluster method documentation, you only need to pass the name parameter. If instead you are just provided the credentials (generally in the form of a JSON key file) of a service account with enough permissions to delete the cluster, you'd need to build the client with the credentials parameter to get it correctly authenticated, in a similar fashion to:
...
client = container_v1.ClusterManagerClient(credentials=credentials)
...
Where the credentials variable is a service account credentials object loaded from the JSON key file that was provided (not just the filename).
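For example, a minimal sketch of building such a credentials object with google-auth; the key file path is a placeholder:
from google.oauth2 import service_account
from google.cloud import container_v1

# Load a credentials object from the service account JSON key file.
credentials = service_account.Credentials.from_service_account_file(
    "path/to/service-account-key.json")
client = container_v1.ClusterManagerClient(credentials=credentials)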
Finally, notice that the response variable returned by the delete_cluster method is of the Operation class, which can be used to monitor the long-running operation in a similar fashion to what is explained here, with the self_link attribute corresponding to the long-running operation.
After running the script you could use a curl command in a similar fashion to:
curl -X GET \
  -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
  https://container.googleapis.com/v1/projects/[PROJECT-NUMBER]/zones/[ZONE-WHERE-THE-CLUSTER-WAS-LOCATED]/operations/operation-[OPERATION-NUMBER]
by checking the status field of the response (it will be in the RUNNING state while the deletion is in progress). Or you could use the requests library or any equivalent to automate this checking of the long-running operation within your script.
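As a sketch of that automation (assuming Application Default Credentials are configured, response is the Operation returned by delete_cluster, and the project number and zone are placeholders):
import time
import requests
import google.auth
from google.auth.transport.requests import Request

# Obtain a token from Application Default Credentials.
credentials, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(Request())

# response.name is the operation identifier returned by delete_cluster.
url = ("https://container.googleapis.com/v1/projects/[PROJECT-NUMBER]"
       "/zones/[ZONE]/operations/" + response.name)
while True:
    op = requests.get(url, headers={"Authorization": "Bearer " + credentials.token}).json()
    if op.get("status") == "DONE":
        break
    time.sleep(10)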
This page contains an example for the command you are trying to perform.
To give some more details that are required for the command to succeed:
Your environment needs to contain the authentication environment variables; this page contains instructions on how to set that up.
Once your environment is successfully authenticated we can run the delete cluster command like so -
from google.cloud import container_v1

client = container_v1.ClusterManagerClient()
response = client.delete_cluster(name="projects/<project>/locations/<location>/clusters/<cluster>")

How to connect Google Datastore from a script in Python 3

We want to do some work with the data that is in Google Datastore. We already have a database, and we would like to use Python 3 to handle the data and make queries from a script on our development machines. What would be the easiest way to accomplish this?
From the Official Documentation:
You will need to install the Cloud Datastore client library for Python:
pip install --upgrade google-cloud-datastore
Set up authentication by creating a service account and setting an environment variable. It will be easier if you see it; please take a look at the official documentation for more info. You can perform this step using either the GCP console or the command line.
Then you will be able to connect to your Cloud Datastore client and use it, as in the example below:
# Imports the Google Cloud client library
from google.cloud import datastore
# Instantiates a client
datastore_client = datastore.Client()
# The kind for the new entity
kind = 'Task'
# The name/ID for the new entity
name = 'sampletask1'
# The Cloud Datastore key for the new entity
task_key = datastore_client.key(kind, name)
# Prepares the new entity
task = datastore.Entity(key=task_key)
task['description'] = 'Buy milk'
# Saves the entity
datastore_client.put(task)
print('Saved {}: {}'.format(task.key.name, task['description']))
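Since the goal includes making queries, here is a minimal sketch of querying the same kind back; the filter is an assumption for illustration:
# Query Task entities filtered on the description property.
query = datastore_client.query(kind='Task')
query.add_filter('description', '=', 'Buy milk')
for task in query.fetch():
    print(task.key.name, task['description'])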
As @JohnHanley mentioned, you will find a good example in this Bookshelf app tutorial that uses Cloud Datastore to store its persistent data and metadata for books.
You can create a service account and download the credentials as JSON and then set an environment variable called GOOGLE_APPLICATION_CREDENTIALS pointing to the json file. You can see the details at the link below.
https://googleapis.dev/python/google-api-core/latest/auth.html

What is suggested method to get service versions

What is the best way to get the list of service versions for Google App Engine in the flex environment (from a service instance, in Python 3)? I want to authenticate using a service account JSON key file. I need to find the current default version (the one receiving most of the traffic).
Is there any lib I can use, like googleapiclient.discovery or google.appengine.api.modules? Or should I build it from scratch and request the REST API on apps.services.versions.list using OAuth? I couldn't find any information in the Google docs.
https://cloud.google.com/appengine/docs/standard/python3/python-differences#cloud_client_libraries
Finally, I was able to solve it. Simple things on GAE can become big problems.
SOLUTION:
I have the path to service_account.json set in the GOOGLE_APPLICATION_CREDENTIALS env variable. Then you can use google.auth.default:
from googleapiclient.discovery import build
import google.auth

creds, project = google.auth.default(scopes=['https://www.googleapis.com/auth/cloud-platform.read-only'])
service = build('appengine', 'v1', credentials=creds, cache_discovery=False)
data = service.apps().services().get(appsId=APPLICATION_ID, servicesId=SERVICE_ID).execute()
print(data['split']['allocations'])
The return value is an allocations dictionary with version IDs as keys and traffic shares as values.
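To pick the current default version (the one with the largest traffic share) from that dictionary:
# allocations maps version id -> traffic share, e.g. {'v1': 0.2, 'v2': 0.8}.
allocations = data['split']['allocations']
default_version = max(allocations, key=allocations.get)
print(default_version)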
All the best!
You can use Google's Python Client Library to interact with the Google App Engine Admin API in order to get the list of a GAE service's versions.
Once you have google-api-python-client installed, you might want to use the list method to list all services in your application:
list(appsId, pageSize=None, pageToken=None, x__xgafv=None)
The arguments of the method should include the following:
appsId: string, Part of `name`. Name of the resource requested. Example: apps/myapp. (required)
pageSize: integer, Maximum results to return per page.
pageToken: string, Continuation token for fetching the next page of results.
x__xgafv: string, V1 error format. Allowed values: v1 error format, v2 error format
You can find more information on this method in the link mentioned above.
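As a hedged sketch of putting that method to work with the discovery client (the application ID is a placeholder, and credentials are assumed to come from the environment, e.g. GOOGLE_APPLICATION_CREDENTIALS):
from googleapiclient import discovery

# Build the App Engine Admin API client; credentials come from the environment.
service = discovery.build('appengine', 'v1')
result = service.apps().services().list(appsId='my-app').execute()
for svc in result.get('services', []):
    # Each service carries its traffic split across versions.
    print(svc['id'], svc.get('split', {}).get('allocations'))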
