adbazureml not supported by mlflow

We've been following the latest Microsoft webinar on deploying an ML model from Azure Databricks to Azure ML using MLflow. When we try to run the experiment from a Databricks notebook with the following code, we get an error:
experimentName="someExperimentName"
mlflow.set_experiment(experimentName)
The error message:
UnsupportedModelRegistryStoreURIException: Unsupported URI
'adbazureml://westus.experiments.azureml.net/history/v1.0/subscriptions/cemrecdsap-t10us-20180830/resourceGroups/2f5a718e-7c56-4dd3-aa7b-03a19b70667/providers/Microsoft.MachineLearningServices/workspaces/cemrecdsap-mlservice'
for model registry store. Supported schemes are: ['', 'file',
'sqlite', 'https', 'databricks', 'postgresql', 'mysql', 'http',
'mssql']
The init script we use, as suggested in the Microsoft MLflow webinar
(it was available here but has since been removed: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/azure-databricks/linking/azureml-cluster-init.sh):
#!/bin/bash
############## START CONFIGURATION #################
# Provide the required *AzureML* workspace information
region="westus"
subscriptionId="bcb65f42-f234-4bff-91cf-9ef816cd9936"
resourceGroupName="dev-rg"
workspaceName="myazuremlws"
# Optional config directory
configLocation="/databricks/config.json"
############### END CONFIGURATION #################
# Drop the workspace configuration on the cluster
sudo touch $configLocation
sudo echo {\\"subscription_id\\": \\"${subscriptionId}\\", \\"resource_group\\": \\"${resourceGroupName}\\", \\"workspace_name\\": \\"${workspaceName}\\"} > $configLocation
# Set the MLflow Tracking URI
trackingUri="adbazureml://${region}.experiments.azureml.net/history/v1.0/subscriptions/${subscriptionId}/resourceGroups/${resourceGroupName}/providers/Microsoft.MachineLearningServices/workspaces/${workspaceName}"
sudo echo export MLFLOW_TRACKING_URI=${trackingUri} >> /databricks/spark/conf/spark-env.sh
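The escaped echo in the init script is easy to get wrong, so here is a rough Python sketch of the same two values it produces: the workspace config JSON and the tracking URI. The function names are illustrative only (they are not part of any Azure or MLflow API), and the values are the placeholders from the script above.

```python
import json


def build_workspace_config(subscription_id, resource_group, workspace_name):
    """Return the JSON text that the init script writes to config.json."""
    return json.dumps(
        {
            "subscription_id": subscription_id,
            "resource_group": resource_group,
            "workspace_name": workspace_name,
        }
    )


def build_tracking_uri(region, subscription_id, resource_group, workspace_name):
    """Assemble the adbazureml:// tracking URI in the same shape as the script."""
    return (
        f"adbazureml://{region}.experiments.azureml.net/history/v1.0"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace_name}"
    )


# Placeholder values taken from the init script's configuration block.
config_text = build_workspace_config(
    "bcb65f42-f234-4bff-91cf-9ef816cd9936", "dev-rg", "myazuremlws"
)
tracking_uri = build_tracking_uri(
    "westus", "bcb65f42-f234-4bff-91cf-9ef816cd9936", "dev-rg", "myazuremlws"
)
print(config_text)
print(tracking_uri)
```

Letting json.dumps do the quoting avoids hand-escaping every double quote in the shell.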
We use the latest MLflow version, 1.4.
Is it possible that the adbazureml scheme used in the webinar is not yet supported by MLflow?
Or did we miss something else?
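For what it's worth, the error message itself lists the schemes the model registry store accepts, and the failing check is on the URI scheme. A minimal sketch of that comparison, using only the list quoted in the error (this is not MLflow's own code, and the helper name is made up for illustration):

```python
from urllib.parse import urlparse

# Schemes copied from the error message above (model registry store, MLflow 1.4).
SUPPORTED_SCHEMES = ["", "file", "sqlite", "https", "databricks",
                     "postgresql", "mysql", "http", "mssql"]


def registry_scheme_supported(uri):
    """Return (scheme, supported) for a store URI, judged against the list above."""
    scheme = urlparse(uri).scheme
    return scheme, scheme in SUPPORTED_SCHEMES


print(registry_scheme_supported(
    "adbazureml://westus.experiments.azureml.net/history/v1.0"
))  # ('adbazureml', False)
print(registry_scheme_supported("http://localhost:5000"))  # ('http', True)
```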

Related

MLflow - Serving model by reference to model registry

I'm having trouble serving a model by reference to the model registry. According to the help text, the path should look like this:
models:/model_name/stage
When I type in terminal:
mlflow models serve -m models:/ml_test_model1/Staging --no-conda -h 0.0.0.0 -p 5003
I got the error:
mlflow.exceptions.MlflowException: Not a proper models:/ URI: models:/ml_test_model1/Staging/MLmodel. Models URIs must be of the form 'models:/<model_name>/<version or stage>'.
The model is registered and visible in both the database and the server.
If I pass the absolute path instead, it works (experiment_id/run_id/artifacts/model_name).
mlflow version: 1.4
Python version: 3.7.3
Is it a matter of some environment setting, or something else?
Referencing model artifacts in that style was fixed in mlflow v1.5 (listed as a bug fix).
You'll need to run mlflow db upgrade <db uri> to refresh your schemas before restarting your mlflow server.
You may find listing registered models helpful:
<server>:<port>/api/2.0/preview/mlflow/registered-models/list
Setting the tracking URI environment variable solved this for me:
export MLFLOW_TRACKING_URI=http://localhost:5000
mlflow models serve -m models:/my_clf_model/Staging -p 1234 -h 0.0.0.0 --no-conda
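To see why the URI in the error is rejected, note that it has three segments after models:/ (the trailing /MLmodel was appended before validation), while the documented form has two. The following is only a sketch of that two-segment rule, not MLflow's actual parser; the function name and regex are my own.

```python
import re

# Pattern for the form quoted in the error: models:/<model_name>/<version or stage>
_MODELS_URI = re.compile(r"^models:/(?P<name>[^/]+)/(?P<ref>[^/]+)$")


def parse_models_uri(uri):
    """Split a models:/ URI into (name, version_or_stage); raise ValueError otherwise."""
    match = _MODELS_URI.match(uri)
    if match is None:
        raise ValueError(f"Not a proper models:/ URI: {uri}")
    return match.group("name"), match.group("ref")


print(parse_models_uri("models:/ml_test_model1/Staging"))

try:
    parse_models_uri("models:/ml_test_model1/Staging/MLmodel")
except ValueError as err:
    print(err)
```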

Installing Istio in Kubernetes with automatic sidecar injection: istio-initializer.yaml Validation Failure

I'm trying to install Istio with automatic sidecar injection into Kubernetes. My environment consists of three masters and two nodes and was built on Azure using the Azure Container Service marketplace product.
Following the documentation located here, I have so far enabled RBAC and DynamicAdmissionControl. I did this by modifying /etc/kubernetes/istio-inializer.yaml on the Kubernetes master, adding the following content outlined in red, and then restarting the master with the reboot command.
The next step in the documentation is to apply the yaml using kubectl. I assume the documentation intends for the user to clone the Istio repository and cd into it before this step, but that is not mentioned.
git clone https://github.com/istio/istio.git
cd istio
kubectl apply -f install/kubernetes/istio-initializer.yaml
After which the following error occurs:
user@hostname:~/istio$ kubectl apply -f install/kubernetes/istio-initializer.yaml
configmap "istio-inject" configured
serviceaccount "istio-initializer-service-account" configured
error: error validating "install/kubernetes/istio-initializer.yaml": error validating data: found invalid field initializers for v1.ObjectMeta; if you choose to ignore these errors, turn validation off with --validate=false
If I run kubectl apply with the suggested flag, --validate=false, this error is generated instead:
user@hostname:~/istio$ kubectl apply -f install/kubernetes/istio-initializer.yaml --validate=false
configmap "istio-inject" configured
serviceaccount "istio-initializer-service-account" configured
deployment "istio-initializer" configured
error: unable to recognize "install/kubernetes/istio-initializer.yaml": no matches for admissionregistration.k8s.io/, Kind=InitializerConfiguration
I'm not sure where to go from here. The problem appears to be related to the admissionregistration.k8s.io/v1alpha1 block in the yaml but I'm unsure what specifically is incorrect in this block.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: InitializerConfiguration
metadata:
  name: istio-sidecar
initializers:
  - name: sidecar.initializer.istio.io
    rules:
      - apiGroups:
          - "*"
        apiVersions:
          - "*"
        resources:
          - deployments
          - statefulsets
          - jobs
          - daemonsets
Installed version of Kubernetes:
user@hostname:~/istio$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
I suspect this is a versioning mismatch. As a follow up question, is it possible to deploy a version of kubernetes >= 1.7.4 to Azure using ACS?
I'm fairly new to working with Kubernetes so if anyone could help I would greatly appreciate it. Thank you for your time.
This seems to be a versioning problem: the alpha feature is only supported on k8s versions >= 1.7, as mentioned here (https://kubernetes.io/docs/admin/extensible-admission-controllers/#what-are-initializers).
1.7 introduces two alpha features, Initializers and External Admission
Webhooks, that address these limitations. These features allow admission
controllers to be developed out-of-tree and configured at runtime.
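As a quick sanity check, the GitVersion reported by kubectl version can be compared against that 1.7 lower bound. This is a small sketch with helper names of my own choosing; it only tests the lower bound from the quote and says nothing about whether the alpha feature is actually enabled on the API server.

```python
import re


def parse_git_version(git_version):
    """Turn a GitVersion string such as 'v1.6.6' into (major, minor, patch)."""
    match = re.match(r"^v(\d+)\.(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognised version: {git_version}")
    return tuple(int(part) for part in match.groups())


def meets_initializer_minimum(git_version):
    """True if the version is at least 1.7, where Initializers appeared as alpha."""
    return parse_git_version(git_version)[:2] >= (1, 7)


print(meets_initializer_minimum("v1.6.6"))  # False: the cluster in the question
print(meets_initializer_minimum("v1.7.5"))  # True
```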
And it is possible to deploy a version of kubernetes >= 1.7.4 to Azure. I'm not sure which version the portal deploys, but if you use acs-engine to generate the ARM template, you can deploy a cluster with version 1.7.5.
You can refer to https://github.com/Azure/acs-engine for the procedure. Basically it involves three steps. First, create the JSON file by referring to the clusterDefinition section. To use version 1.7.5, set the attribute "orchestratorRelease" to "1.7" and enable RBAC by setting the attribute "enableRbac" to true. Second, use acs-engine (version >= 0.6.0) to turn the JSON file into an ARM template (azuredeploy.json and azuredeploy.parameters.json should be created). Lastly, use the "New-AzureRmResourceGroupDeployment" command in PowerShell to deploy the cluster to Azure.
Hope this helps :)
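To make the first step a little more concrete, here is a hedged sketch that emits the orchestrator part of a cluster definition with the two attributes mentioned above. The nesting (apiVersion "vlabs", orchestratorProfile, kubernetesConfig) is my assumption about the acs-engine layout, so check it against the clusterDefinition documentation; the other profiles that acs-engine needs are deliberately left out.

```python
import json

# Partial cluster definition: only the orchestrator settings discussed above.
# ASSUMPTION: the nesting follows the acs-engine "vlabs" API model; verify it
# against the clusterDefinition docs before use.
cluster_definition = {
    "apiVersion": "vlabs",
    "properties": {
        "orchestratorProfile": {
            "orchestratorType": "Kubernetes",
            "orchestratorRelease": "1.7",
            "kubernetesConfig": {"enableRbac": True},
        },
        # Master, agent pool, Linux and service principal profiles are
        # omitted here and must be added for a deployable definition.
    },
}

cluster_json = json.dumps(cluster_definition, indent=2)
print(cluster_json)
```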

Puppet Enterprise error while running "puppet agent -t" command, unable to get User/Group data from Hiera

I have Puppet enterprise installed on my VM, running in Virtualbox.
The installation went fine, but when I try to run the command puppet agent -t I get the following error:
[root@puppetmaster ~]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, Could not find data item role in any Hiera data file and no default supplied at /etc/puppetlabs/code/environments/production/manifests/site.pp:32:10 on node puppetmaster.localdomain
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Here is my site.pp file, where the error is coming from:
## site.pp ##
# This file (/etc/puppetlabs/puppet/manifests/site.pp) is the main entry point
# used when an agent connects to a master and asks for an updated configuration.
#
# Global objects like filebuckets and resource defaults should go in this file,
# as should the default node definition. (The default node can be omitted
# if you use the console and don't define any other nodes in site.pp. See
# http://docs.puppetlabs.com/guides/language_guide.html#nodes for more on
# node definitions.)
## Active Configurations ##
# Disable filebucket by default for all File resources:
#http://docs.puppetlabs.com/pe/latest/release_notes.html#filebucket-resource-no-longer-created-by-default
File { backup => false }
# DEFAULT NODE
# Node definitions in this file are merged with node data from the console. See
# http://docs.puppetlabs.com/guides/language_guide.html#nodes for more on
# node definitions.
# The default node definition matches any node lacking a more specific node
# definition. If there are no other nodes in this file, classes declared here
# will be included in every node's catalog, *in addition* to any classes
# specified in the console for that node.
node default {
  # This is where you can declare classes for all nodes.
  # Example:
  #   class { 'my_class': }
  $role = hiera('role')
  $location = hiera('location')
  notify{"in the top level site.pp : role is '${role}', location is '${location}'": }
  include "::roles::${role}"
}
If you look at the error, it can't find the hiera key that you've asked for in your site.pp:
Could not find data item role in any Hiera data file and no default supplied at /etc/puppetlabs/code/environments/production/manifests/site.pp:32:10 on node puppetmaster.localdomain
In your code, you have the following:
$role = hiera('role')
$location = hiera('location')
Both of these are hiera calls, which require that hiera is set up and that the relevant key exists in a hieradata folder.
A useful tool for diagnosing hiera issues is hiera_explain, which shows how your hiera hierarchy is set up and configured, and might help explain what is wrong with your code.
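If the lookup behaviour is unclear, this toy model may help: the key is searched through the hierarchy's data layers in order, and the call fails only when no layer has it and no default was given. It is a simplified illustration in Python, not Hiera's implementation, and the data layers are hypothetical.

```python
_MISSING = object()


def hiera_lookup(key, hierarchy, default=_MISSING):
    """Return the first value found for key in an ordered list of data layers."""
    for layer in hierarchy:
        if key in layer:
            return layer[key]
    if default is _MISSING:
        raise KeyError(
            f"Could not find data item {key} in any Hiera data file "
            "and no default supplied"
        )
    return default


# Hypothetical data layers, most specific first (e.g. a node file, then common).
hierarchy = [{"location": "lab"}, {}]

print(hiera_lookup("location", hierarchy))              # found in the first layer
print(hiera_lookup("role", hierarchy, default="base"))  # falls back to the default

try:
    hiera_lookup("role", hierarchy)
except KeyError as err:
    print(err)
```

In Puppet itself, hiera() accepts a default as its second argument, so hiera('role', 'some_default') avoids the hard failure while you sort out the data files.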

Trying to get basic Nodejs example working on GAE

I'm using Windows 7 x64 with the following gcloud components installed:
Google Cloud SDK 0.9.71
app 2015.07.24
app-engine-java 1.9.24
app-engine-python 1.9.24
app-engine-python-extras 1.9.21
bq 2.0.18
bq-win 2.0.18
core 2015.07.24
core-win 2015.07.24
gcloud 2015.07.24
gsutil 4.13
gsutil-win 4.13
preview 2015.07.24
windows-ssh-tools 2015.06.02
I'm trying to run on preview and deploy the tutorial example from here. Note that app.yaml from this example has "nodejs" set as runtime.
After running command
gcloud preview app run --host localhost:8080 app.yaml
I get
RuntimeError: Unknown runtime 'nodejs'; supported runtimes are 'custom', 'go', 'java', 'java7', 'php', 'php55', 'python, 'python27', 'vm'.
If I put "vm" for runtime it wants to use docker, which doesn't work for me either and I wanted to use the option to do this without docker anyhow.
If I put "custom" for runtime in yaml file I get:
ValueError: The --custom_entrypoint flag must be set for custom runtimes
Example given in the help output for this switch is the following
--custom_entrypoint="gunicorn -b localhost:{port} mymodule:application"
I tried this as a best guess:
gcloud preview app run --custom_entrypoint="nodejs -b localhost:{8080} mymodule:application" app.yaml
and got this:
ERROR: Argument [--custom_entrypoint=nodejs -b localhost:{8080} mymodule:application] is not a valid deployable file.
ERROR: (gcloud.preview.app.run) Errors occurred while parsing the App Engine app configuration.
Thanks for your time.
The gcloud command has been changing, so this question is probably no longer valid: dev server processes are now meant to be run with dev_appserver.py instead of gcloud. You can also run the Node server directly, or use Docker to build the image from your Dockerfile and run it as a container.
If you run from dev_appserver.py, make sure you have runtime: custom and a Dockerfile starting with FROM gcr.io/google_appengine/nodejs, since dev_appserver.py currently raises:
RuntimeError: Unknown runtime 'nodejs'; supported runtimes are 'custom', 'go', 'java', 'java-compat', 'java7', 'php55', 'python', 'python-compat', 'python27'.

How to change location of Influxdb storage folder?

I've installed the package from the official site, following the instructions. By default the database folder is located at /opt/influxdb/shared.
I tried changing this setting in the config file, and I believe I wrote it correctly, but after that I can't start the influxdb service.
[storage]
dir = "/media/alex/Second/InfluxStorage/data/db" //my settings
How can I change the default database directory?
EDIT: This is for InfluxDB v1.x only. It has been reported to not work for InfluxDB v2.x.
Make a new directory where you want to put your data and set the appropriate permissions, e.g.:
mkdir /new/path/to/influxdb
sudo chown influxdb:influxdb /new/path/to/influxdb
Edit the following three lines of your /etc/influxdb/influxdb.conf (/usr/local/etc/influxdb.conf on macOS) so that they point to your new location:
# under [meta]
dir = "/new/path/to/influxdb/meta"
# under [data]
dir = "/new/path/to/influxdb/data"
wal-dir = "/new/path/to/influxdb/wal"
Restart the InfluxDB daemon.
sudo service influxdb restart # Ubuntu/Debian
brew services restart influxdb # macOS/homebrew
Done!
If you want to move existing data, copy it (its location can be found in influxdb.conf; /var/lib/influxdb on Ubuntu/Debian) to the new location before editing influxdb.conf, and make sure the new folder has the appropriate permissions/ownership.
There is some information about backups/restores in the official docs, but plain copying worked for me.
The above was tested on InfluxDB v1.2 on macOS/Ubuntu/Raspbian.
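If you would rather script the config edit than do it by hand, something along these lines could work for a v1.x influxdb.conf. It is a sketch under the assumption that the file uses plain [meta] and [data] section headers with one setting per line, as in the default config; the function name is my own, and you should keep a backup of the original file.

```python
import re


def relocate_influxdb_dirs(conf_text, new_base):
    """Rewrite dir/wal-dir under [meta] and [data] to live below new_base."""
    targets = {
        ("meta", "dir"): f"{new_base}/meta",
        ("data", "dir"): f"{new_base}/data",
        ("data", "wal-dir"): f"{new_base}/wal",
    }
    section = None
    out = []
    for line in conf_text.splitlines():
        # Track which [section] the current line belongs to.
        header = re.match(r"^\s*\[([^\]]+)\]\s*$", line)
        if header:
            section = header.group(1)
        # Replace only uncommented "key = value" lines we care about.
        setting = re.match(r"^(\s*)([\w-]+)\s*=", line)
        if setting and (section, setting.group(2)) in targets:
            indent, key = setting.group(1), setting.group(2)
            line = f'{indent}{key} = "{targets[(section, key)]}"'
        out.append(line)
    return "\n".join(out) + "\n"


sample = """[meta]
  dir = "/var/lib/influxdb/meta"

[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"
"""

print(relocate_influxdb_dirs(sample, "/new/path/to/influxdb"))
```

Commented-out settings and keys in other sections are left untouched, so the rest of the file passes through unchanged.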
For InfluxDB 2.0:
In InfluxDB 2.0 the data directories are below ~/.influxdbv2 by default.
There are actually two data stores: bolt (various key-value configuration data) and engine (the TSM database).
From the documentation, to change the location of the bolt database:
Default: ~/.influxdbv2/influxd.bolt
influxd flag: influxd --bolt-path=~/.influxdbv2/influxd.bolt
Environment variable: export INFLUXD_BOLT_PATH=~/.influxdbv2/influxd.bolt
Configuration file: bolt-path: /users/user/.influxdbv2/influxd.bolt
From the documentation, to change the location of the engine database:
Default: ~/.influxdbv2/engine
influxd flag: influxd --engine-path=~/.influxdbv2/engine
Environment variable: export INFLUXD_ENGINE_PATH=~/.influxdbv2/engine
Configuration file: engine-path: /users/user/.influxdbv2/engine
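To keep the two paths consistent when relocating both, a tiny helper can derive the environment variables and influxd flags listed above from a single base directory. This is only a convenience sketch; the helper name and the /data/influxdbv2 base directory are made up for illustration.

```python
import posixpath


def influxd_v2_settings(base_dir):
    """Return (env, flags) pointing the bolt file and engine dir below base_dir."""
    bolt = posixpath.join(base_dir, "influxd.bolt")
    engine = posixpath.join(base_dir, "engine")
    env = {"INFLUXD_BOLT_PATH": bolt, "INFLUXD_ENGINE_PATH": engine}
    flags = [f"--bolt-path={bolt}", f"--engine-path={engine}"]
    return env, flags


# Hypothetical target directory for the relocated data.
env, flags = influxd_v2_settings("/data/influxdbv2")
for name, value in env.items():
    print(f"export {name}={value}")
print("influxd " + " ".join(flags))
```

Use either the environment variables or the flags, not necessarily both; they are printed together here only for comparison.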

Resources