How can I load the latest model version from MLflow model registry? - mlflow

I can load a specific version of a model using the mlflow client:
import mlflow

model_version = 1
# "models:/<registered model name>/<version>" loads a specific registered version
model = mlflow.pyfunc.load_model(
    model_uri=f"models:/c3760a15e6ac48f88ad7e5af940047d4/{model_version}"
)
But is there a way to load the latest model version?

There is no built-in "load latest" operation, but:
You can specify the stage (Staging, Production) in the model URI - see docs
You can find the latest version using the get_latest_versions function, but note that it returns the latest version per stage
So you need to define what "latest" means for you. Both approaches are sketched below.
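For example, here is a minimal sketch of both approaches (the registered model name is a placeholder):

import mlflow
from mlflow.tracking import MlflowClient

model_name = "my-registered-model"  # placeholder, substitute your registered model name

# Option 1: load the latest version promoted to a given stage
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/Production")

# Option 2: ask the registry for the latest version in each stage
client = MlflowClient()
for mv in client.get_latest_versions(model_name, stages=["Staging", "Production"]):
    print(mv.name, mv.version, mv.current_stage)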

Related

Form Recognizer labeling tool - what is the API version?

I have used the Form Recognizer labeling tool to train and create models. When I started, I used the 2.1 preview, and the API version was v2.1-preview.3.
To use the models created with the labeling tool via the analyze REST API, I used the URL endpoint and API version as:
https://{endpoint}/formrecognizer/v2.1-preview.3/custom/models/...
Now the 2.1 GA version of the labeling tool is available.
Is the version of the API used by the form labeling tool fixed based on which version of the labeling tool Docker image we run?
If I use the docker image {mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool:latest-2.1}, the API endpoint will be
https://{endpoint}/formrecognizer/v2.1/custom/models/...
AND
If I use the docker image {mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool:latest-preview}, then it is the latest preview and the API endpoint will be
https://{endpoint}/formrecognizer/v2.1-preview.3/custom/models/...
Or is there some way I can explicitly set the API version, so I can be sure which version is being used by the labeling tool?
Are there any settings that I am missing where I can set, or at least confirm, the version being used by the tool?
The .fott file has a version property which is set to "2.1.0" in the project created with the preview version and "2.1" with the GA version. Does this property indicate anything?
Thanks
The project (.fott) file has an optional property "apiVersion"; you can set it to "v2.1" or "v2.1-preview.3" based on your needs :)
If this property is not set in the project file, the labeling tool will use the default version, which is the same as the tool version.
Note: the "version" property in the .fott file doesn't represent the API version; it just reflects the source control version.
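For example, a pinned project file might contain something like this minimal sketch (fields other than "version" and "apiVersion" are omitted, and the values are illustrative):

{
  "version": "2.1.0",
  "apiVersion": "v2.1-preview.3"
}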
Thanks!

Cannot configure a GCP project when using DataProcPySparkOperator

I am using a Cloud Composer environment to run workflows in a GCP project. One of my workflows creates a Dataproc cluster in a different project using the DataprocClusterCreateOperator, and then attempts to submit a PySpark job to that cluster using the DataProcPySparkOperator from the airflow.contrib.operators.dataproc_operator module.
To create the cluster, I can specify a project_id parameter to create it in another project, but it seems like DataProcPySparkOperator ignores this parameter. For example, I expect to be able to pass a project_id, but I end up with a 404 error when the task runs:
from airflow.contrib.operators.dataproc_operator import DataProcPySparkOperator

t1 = DataProcPySparkOperator(
    task_id='submit_pyspark',  # every Airflow operator requires a task_id
    project_id='my-gcp-project',
    main='...',
    arguments=[...],
)
How can I use DataProcPySparkOperator to submit a job in another project?
The DataProcPySparkOperator from the airflow.contrib.operators.dataproc_operator module doesn't accept a project_id kwarg in its constructor, so it always submits Dataproc jobs to the project the Cloud Composer environment is in. If the argument is passed, it is ignored, which results in a 404 error when the task runs, because the operator polls for the job using an incorrect cluster path.
One workaround is to copy the operator and hook, and modify it to accept a project ID. However, an easier solution is to use the newer operators from the airflow.providers packages if you are using a version of Airflow that supports them, because many airflow.contrib operators are deprecated in newer Airflow releases.
Below is an example. Note that there is a newer DataprocSubmitPySparkJobOperator in this module, but it is deprecated in favor of DataprocSubmitJobOperator. So, you should use the latter, which accepts a project ID.
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator

t1 = DataprocSubmitJobOperator(
    task_id='submit_pyspark_job',  # every Airflow operator requires a task_id
    project_id='my-gcp-project-id',
    location='us-central1',
    job={...},
)
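For reference, the job argument follows the shape of the Dataproc Job resource; a minimal PySpark payload might look like this sketch (the cluster name and bucket path are hypothetical):

# Sketch of a Dataproc PySpark job payload; names and paths are placeholders.
job = {
    "reference": {"project_id": "my-gcp-project-id"},
    "placement": {"cluster_name": "my-cluster"},
    "pyspark_job": {"main_python_file_uri": "gs://my-bucket/my_job.py"},
}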
If you are running an environment with Composer 1.10.5+, Airflow version 1.10.6+, and Python 3, the providers are preinstalled and can be used immediately.

Version error trying to use custom scenario from GitHub

I am taking this one straight from the repo.
I am using this scenario from Azure Samples, and when I try to upload the base policy I get the following error (cut for brevity):
The specified page contract 'urn:com:microsoft:aad:b2c:elements:contract:unifiedssp' has invalid version '2.0.0'. The available versions are: '["1.0.0","1.1.0","1.2.0"]'.
Any thoughts on this?
The currently available versions for the page layout are ["1.0.0","1.1.0","1.2.0"]. You can find the version change log here.
Try changing the version from 2.0.0 to one of the available versions in the PasswordlessEmailAndPhoneBase.xml file.
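In practice, that means editing the trailing version on the page contract DataUri in the base policy. A sketch of the fix, assuming the standard ContentDefinition layout of B2C custom policies (the Id shown is illustrative):

<ContentDefinition Id="api.signuporsignin">
  <!-- change the contract version from 2.0.0 to an available one, e.g. 1.2.0 -->
  <DataUri>urn:com:microsoft:aad:b2c:elements:contract:unifiedssp:1.2.0</DataUri>
</ContentDefinition>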

Migrating Runs to MLFlow 0.9

We have been using MLflow 0.8.2 (with a local file store) for a while, and I was happy to see the release of MLflow 0.9. After upgrading to the new version, I realized that pointing the MLflow server at the old file store leads to a non-working web UI (I just see some image of a waterfall).
Is there a recommendation for proper migration of data when upgrading?
Thanks a lot in advance,
Da

Yahoo - caffe on Spark library dependency?

Yahoo just released a version of Caffe that uses the latest version of Apache Spark yesterday; the Git repo is not well documented yet: git link
There is a Scala test file which is supposed to run an example: Scala Example
but it requires the dependency com.yahoo.ml.caffe.{Config, CaffeOnSpark, DataSource}, which I assume contains basically the data, the config, and the API. Has this been made into a library yet? How could I build this using sbt?
Our CaffeOnSpark release contains all the code that you need to run. com.yahoo.ml.caffe.* is a collection of Scala classes in the caffe-grid folder. Please follow our guides on the CaffeOnSpark wiki page, and ask questions on our mailing list.
