How to update the existing web service with a new docker image on Azure Machine Learning Services? - azure-machine-learning-service

I am currently working on a machine learning project with Azure Machine Learning Services, but I found that I can't update the existing web service with a new docker image (I want to keep the same URL as the running web service).
I have read the documentation but it doesn't really tell me how to update (documentation link: https://learn.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where).
The documentation said that we have to use update() with image = new-image.
from azureml.core.webservice import Webservice
service_name = 'aci-mnist-3'
# Retrieve existing service
service = Webservice(name = service_name, workspace = ws)
# Update the image used by the service
service.update(image = new-image)
print(service.state)
But the new-image isn't described where it comes from.
Does anyone know how to figure out this problem?
Thank you

The documentation could be a little more clear on this part, I agree. The new-image is an image object that you should pass into the update() function. If you just created the image, you might already have the object in a variable; in that case, just pass it. If not, you can obtain it from your workspace using
from azureml.core.image.image import Image
new_image = Image(ws, image_name)
where ws is your workspace object and image_name is a string with the name of the image you want to obtain. Then you go on calling update() as
from azureml.core.webservice import Webservice
service_name = 'aci-mnist-3'
# Retrieve existing service
service = Webservice(name = service_name, workspace = ws)
# Update the image used by the service
service.update(image = new_image) # Note that dash isn't supported in variable names
print(service.state)
You can find more information in the SDK documentation
EDIT:
Both the Image and the Webservice classes above are abstract parent classes.
For the Image object, you should really use one of these classes, depending on your case:
ContainerImage
UnknownImage
(see Image package in the documentation).
For the Webservice object, you should use one of these classes, depending on your case:
AciWebservice
AksWebservice
UnknownWebservice
(see Webservice package in the documentation).

Related

Python SDK v2 for Azure Machine Learning SDK (preview) - how to retrieve workspace from WorkspaceOperations

I’m following this article to create ML pipelines with the new SDK.
So I started by loading the first class
from azure.ai.ml import MLClient
and then I used it to authenticate on my workspace
ml_client = MLClient(
credential=credential,
subscription_id=subscription_id,
resource_group_name=resource_group_name,
workspace_name="mmAmlsWksp01",
)
However, I can’t understand how I can retrieve the objects it refers to. For example, it contains a “workspaces” member, but if I run
ml_client.workspaces["mmAmlsWksp01"]
, I get the error “'WorkspaceOperations' object is not subscriptable”.
So I tried to run
for w in ml_client.workspaces.list():
    print(w)
and it returns the workspace details (name, displayName, id…) for a SINGLE workspace, but not the workspace object.
In fact, the ml_client.workspaces object is a
<azure.ai.ml._operations.workspace_operations.WorkspaceOperations at 0x7f3526e45d60>
, but I don’t want a WorkspaceOperation, I want the Workspace itself. How can I retrieve it?
You need to use the get method on workspaces e.g.
ml_client.workspaces.get("mmAmlsWksp01")

How to get reference to AzureML Workspace Class in scoring script?

My scoring function needs to refer to an Azure ML Registered Dataset for which I need a reference to the AzureML Workspace object. When including this in the init() function of the scoring script it gives the following error:
"code": "ScoreInitRestart",
"message": "Your scoring file's init() function restarts frequently. You can address the error by increasing the value of memory_gb in deployment_config."
On debugging the issue is:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code [REDACTED] to authenticate.
How can I resolve this issue without exposing Service Principal Credentials in the scoring script?
I found a workaround to reference the workspace in the scoring script. Below is a code snippet of how one can do that -
My deploy script looks like this :
from azureml.core import Environment
from azureml.core.model import InferenceConfig
#Add python dependencies for the models
scoringenv = Environment.from_conda_specification(
name = "scoringenv",
file_path="config_files/scoring_env.yml"
)
#Create a dictionary to set-up the env variables
env_variables={'tenant_id':tenant_id,
'subscription_id':subscription_id,
'resource_group':resource_group,
'client_id':client_id,
'client_secret':client_secret
}
scoringenv.environment_variables=env_variables
# Configure the scoring environment
inference_config = InferenceConfig(
entry_script='score.py',
source_directory='scripts/',
environment=scoringenv
)
What I am doing here is creating an image with the python dependencies (in the scoring_env.yml) and passing a dictionary of the secrets as environment variables. I have the secrets stored in the key-vault.
You may define and pass native Python datatype variables this way.
Now, In my score.py, I reference these environment variables in the init() like this -
tenant_id = os.environ.get('tenant_id')
client_id = os.environ.get('client_id')
client_secret = os.environ.get('client_secret')
subscription_id = os.environ.get('subscription_id')
resource_group = os.environ.get('resource_group')
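As a sketch (using the same variable names as above, and a hypothetical helper name), the lookups can be wrapped so init() fails fast with a clear message when a secret was not passed to the container:

```python
import os

# The variable names match the env_variables dictionary in the deploy script.
REQUIRED_VARS = ('tenant_id', 'client_id', 'client_secret',
                 'subscription_id', 'resource_group')

def load_credentials(required=REQUIRED_VARS):
    # Collect every missing variable first, so the error names all of them.
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return {name: os.environ[name] for name in required}
```

This keeps the secret handling in one place and makes a misconfigured deployment fail at startup instead of on the first request.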
Once you have these variables, you may create a workspace object using Service Principal authentication like @Anders Swanson mentioned in his reply.
Another way to resolve this may be by using managed identities for AKS. I did not explore that option.
Hope this helps! Please let me know if you found a better way of solving this.
Thanks!
Does your score.py include a Workspace.get() call with auth=InteractiveAuthentication? You should swap it to ServicePrincipalAuthentication (docs), to which you pass your credentials, ideally through environment variables.
import os
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication
svc_pr_password = os.environ.get("AZUREML_PASSWORD")
svc_pr = ServicePrincipalAuthentication(
tenant_id="my-tenant-id",
service_principal_id="my-application-id",
service_principal_password=svc_pr_password)
ws = Workspace(
subscription_id="my-subscription-id",
resource_group="my-ml-rg",
workspace_name="my-ml-workspace",
auth=svc_pr
)
print("Found workspace {} at location {}".format(ws.name, ws.location))
You can get the workspace object directly from your run.
from azureml.core.run import Run
ws = Run.get_context().experiment.workspace
I came across the same challenge. As you are mentioning AML Datasets, I assume an AML Batch Endpoint suits your scenario. The scoring script for a batch endpoint is meant to receive a list of files as input. When invoking the batch endpoint, you can pass (among the others) AML Datasets (consider that an endpoint is deployed in the context of an AML workspace). Have a look at this.
When running on an AML Compute Cluster, use the following code:
from azureml.core import Run
run = Run.get_context()
ws = run.experiment.workspace
Note: this works only when you run on an AML cluster.
Run.get_context() gets the context of the current AML cluster run; from that object we can extract the workspace, which lets you authenticate to the AML workspace from the AML cluster.

How to pass in the model name during init in Azure Machine Learning Service?

I am deploying 50 NLP models on Azure Container Instances via the Azure Machine Learning service. All 50 models are quite similar and have the same input/output format with just the model implementation changing slightly.
I want to write a generic score.py entry file and pass in the model name as a parameter. The interface method signature does not allow a parameter in the init() method of score.py, so I moved the model loading into the run method. I am assuming the init() method gets run once, whereas run(raw_data) will get executed on every invocation, so this is possibly not ideal (the models are 1 GB in size).
So how can I pass in some value to the init() method of my container to tell it what model to load?
Here is my current, working code:
def init():
    # nothing to load up front; the model is loaded per request in run()
    pass

def loadModel(model_name):
    model_path = Model.get_model_path(model_name)
    return fasttext.load_model(model_path)

def run(raw_data):
    # extract model_name from raw_data omitted...
    model = loadModel(model_name)
    ...
but this is what I would like to do (which breaks the interface)
def init(model_name):
    model = loadModel(model_name)

def loadModel(model_name):
    model_path = Model.get_model_path(model_name)
    return fasttext.load_model(model_path)

def run(raw_data):
    ...
If you're looking to use the same deployed container and switch models between requests, that's not the preferred design for the Azure Machine Learning service; the model name to load has to be specified at build/deploy time.
Ideally, each deployed web-service endpoint should serve inference for a single model, with the model name fixed before the container image is built and deployed.
It is mandatory that the entry script has both init() and run(raw_data) with those exact signatures.
At the moment, we can't change the signature of init() method to take a parameter like in init(model_name).
The only dynamic user input you'd ever get to pass into this web service is via the run(raw_data) method. As you have tried, given the size of your models, loading one on every run() call is not feasible.
init() is run first and only once after your web service deploys. Even if init() took a model_name parameter, there isn't a straightforward way to call this method directly and pass your desired model name.
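To make the contract concrete, here is a minimal self-contained sketch of how init() and run() interact (the loader is a stand-in for Model.get_model_path plus fasttext.load_model, and the model name is hard-coded the way a baked-in deployment fixes it):

```python
# init() runs once at container start and stashes state in a module-level
# global; run() is then called once per scoring request and reuses it.
model = None

def loadModel(model_name):
    # placeholder for the real model loading logic
    return 'loaded:' + model_name

def init():
    global model
    model = loadModel('my-model')  # the name is fixed at this point

def run(raw_data):
    return '{} scored {}'.format(model, raw_data)

init()
print(run('sample'))  # -> loaded:my-model scored sample
```

This is why the workarounds below all focus on getting the model name to init() from somewhere outside the request payload.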
But, one possible solution is:
You can create params file like below and store the file in azure blob storage.
Example runtime parameters generation script:
import pickle
params = {'model_name': 'YOUR_MODEL_NAME_TO_USE'}
with open('runtime_params.pkl', 'wb') as file:
    pickle.dump(params, file)
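Before uploading runtime_params.pkl to blob storage, you can sanity-check locally that the params survive a pickle round trip (a temp directory stands in for the blob container here):

```python
import os
import pickle
import tempfile

params = {'model_name': 'YOUR_MODEL_NAME_TO_USE'}  # same placeholder as above

# Write the file, then read it back, exactly as score.py will.
path = os.path.join(tempfile.mkdtemp(), 'runtime_params.pkl')
with open(path, 'wb') as f:
    pickle.dump(params, f)
with open(path, 'rb') as f:
    loaded = pickle.load(f)

print(loaded['model_name'])  # -> YOUR_MODEL_NAME_TO_USE
```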
You'll need to use the Azure Storage Python SDK to write code that can read from your blob storage account. This is also mentioned in the official docs here.
Then you can access this from init() function in your score script.
Example score.py script:
from azure.storage.blob import BlockBlobService
import pickle
def init():
    global model
    block_blob_service = BlockBlobService(connection_string='your_connection_string')
    blob_item = block_blob_service.get_blob_to_bytes('your-container-name', 'runtime_params.pkl')
    params = pickle.loads(blob_item.content)
    model = loadModel(params['model_name'])
You can store connection strings in Azure Key Vault for secure access. Azure ML workspaces come with built-in Key Vault integration. More info here.
With this approach, you're abstracting runtime params config to another cloud location rather than the container itself. So you wouldn't need to re-build the image or deploy the web-service again. Simply restarting the container will work.
If you're looking to simply re-use score.py (not changing code) for multiple model deployments in multiple containers then here's another possible solution.
You can define your model name to use in web-service in a text file and read it in score.py. You'll need to pass this text file as a dependency when setting up the image config.
This would, however, need a separate params file for each container deployment.
Passing 'runtime_params.pkl' in dependencies to your image config (More detail example here):
image_config = ContainerImage.image_configuration(execution_script="score.py",
runtime="python",
conda_file="myenv.yml",
dependencies=["runtime_params.pkl"],
docker_file="Dockerfile")
Reading this in your score.py init() function:
def init():
    global model
    with open('runtime_params.pkl', 'rb') as file:
        params = pickle.load(file)
    model = loadModel(params['model_name'])
Since you're creating a new image config with this approach, you'll need to build the image and re-deploy the service.

How to connect Google Datastore from a script in Python 3

We want to do some work with the data that is in Google Datastore. We already have a database, and we would like to use Python 3 to handle the data and make queries from a script on our development machines. What would be the easiest way to accomplish this?
From the Official Documentation:
You will need to install the Cloud Datastore client library for Python:
pip install --upgrade google-cloud-datastore
Set up authentication by creating a service account and setting an environment variable. Please take a look at the official documentation for more info about this; you can perform this step using either the GCP console or the command line.
Then you will be able to connect to your Cloud Datastore client and use it, as in the example below:
# Imports the Google Cloud client library
from google.cloud import datastore
# Instantiates a client
datastore_client = datastore.Client()
# The kind for the new entity
kind = 'Task'
# The name/ID for the new entity
name = 'sampletask1'
# The Cloud Datastore key for the new entity
task_key = datastore_client.key(kind, name)
# Prepares the new entity
task = datastore.Entity(key=task_key)
task['description'] = 'Buy milk'
# Saves the entity
datastore_client.put(task)
print('Saved {}: {}'.format(task.key.name, task['description']))
As @JohnHanley mentioned, you will find a good example in this Bookshelf app tutorial that uses Cloud Datastore to store its persistent data and metadata for books.
You can create a service account and download the credentials as JSON and then set an environment variable called GOOGLE_APPLICATION_CREDENTIALS pointing to the json file. You can see the details at the link below.
https://googleapis.dev/python/google-api-core/latest/auth.html

What is suggested method to get service versions

What is the best way to get the list of service versions in Google App Engine in the flex environment (from a service instance, in Python 3)? I want to authenticate using a service account JSON key file. I need to find the currently default version (the one with most of the traffic).
Is there any library I can use, like googleapiclient.discovery or google.appengine.api.modules? Or should I build it from scratch and request the REST API apps.services.versions.list using OAuth? I couldn't find any information in the Google docs.
https://cloud.google.com/appengine/docs/standard/python3/python-differences#cloud_client_libraries
Finally I was able to solve it. Simple things on GAE can become big problems.
SOLUTION:
I have path to service_account.json set in GOOGLE_APPLICATION_CREDENTIALS env variable. Then you can use google.auth.default
from googleapiclient.discovery import build
import google.auth
creds, project = google.auth.default(scopes=['https://www.googleapis.com/auth/cloud-platform.read-only'])
service = build('appengine', 'v1', credentials=creds, cache_discovery=False)
data = service.apps().services().get(appsId=APPLICATION_ID, servicesId=SERVICE_ID).execute()
print(data['split']['allocations'])
Return value is allocations dictionary with versions as keys and traffic percents in values.
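Given that dictionary, picking the default version is a one-liner; here is a sketch with sample values standing in for a real data['split']['allocations'] response:

```python
# Sample allocations dict: version id -> fraction of traffic.
allocations = {'20190101t000000': 0.25, '20190201t000000': 0.75}

# The default version is simply the one with the largest traffic share.
default_version = max(allocations, key=allocations.get)
print(default_version)  # -> 20190201t000000
```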
All the best!
You can use Google's Python Client Library to interact with the Google App Engine Admin API in order to get the list of versions of a GAE service.
Once you have google-api-python-client installed, you might want to use the list method to list all services in your application:
list(appsId, pageSize=None, pageToken=None, x__xgafv=None)
The arguments of the method should include the following:
appsId: string, Part of `name`. Name of the resource requested. Example: apps/myapp. (required)
pageSize: integer, Maximum results to return per page.
pageToken: string, Continuation token for fetching the next page of results.
x__xgafv: string, V1 error format. Allowed values: v1 error format, v2 error format
You can find more information on this method in the link mentioned above.
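Since the response is paginated via pageToken, you may need to walk the pages and merge the results. A small helper can do that; this sketch uses stubbed page dicts in place of real service.apps().services().list(...).execute() calls, and the fetch_page callable is an assumption for illustration:

```python
def list_all(fetch_page):
    """Accumulate items across pages. fetch_page(page_token) must return a
    dict shaped like the Admin API response, e.g. {'services': [...],
    'nextPageToken': '...'}, with the token key absent on the last page."""
    items, token = [], None
    while True:
        page = fetch_page(token)
        items.extend(page.get('services', []))
        token = page.get('nextPageToken')
        if not token:
            return items

# Stubbed pages standing in for real API responses.
pages = {None: {'services': [{'id': 'default'}], 'nextPageToken': 'p2'},
         'p2': {'services': [{'id': 'worker'}]}}
print(list_all(pages.__getitem__))  # -> [{'id': 'default'}, {'id': 'worker'}]
```

With the real client, fetch_page would wrap the list call, passing pageToken=token and calling .execute() on the request.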
