Azure Machine learning fails when trying to deploy model

I'm currently trying to deploy a model on Azure and expose its endpoint to my application, but I keep running into errors.
DEPLOYMENT CODE
model = run.register_model(model_name='pytorch-modeloldage', model_path="outputs/model")
print("Starting.........")
inference_config = InferenceConfig(runtime="python",
                                   entry_script="pytorchscore.py",
                                   conda_file="myenv.yml")
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
                                               auth_enabled=True,
                                               memory_gb=1,
                                               tags={'name': 'oldageml', 'framework': 'pytorch'},
                                               description='oldageml training')
service = Model.deploy(workspace=ws,
                       name='pytorch-olageml-run',
                       models=[model],
                       inference_config=inference_config,
                       overwrite=True,
                       deployment_config=aciconfig)
service.wait_for_deployment(True)
# print(service.get_logs())
print("bruh did you run", service.scoring_uri)
print(service.state)
ERROR
ERROR - Service deployment polling reached non-successful terminal state, current service state: Transitioning
More information can be found here:
Error:
{
"code": "EnvironmentBuildFailed",
"statusCode": 400,
"message": "Failed Building the Environment."
}

I had this error, too, and I was convinced it was working a few days ago!
Anyway, I realised that I was using python 3.5 in my environment definition.
I changed that to 3.6 and it works! I notice that there was a new release of azureml-code on 9 Dec 2019.
This is my code for changing the environment. I define the environment in a variable rather than in a file as you do, so that's a bit different.
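from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig

# ws is an existing azureml.core.Workspace object
myenv = Environment(name="env-keras")
conda_packages = ['numpy']
pip_packages = ['tensorflow==2.0.0', 'keras==2.3.1', 'azureml-sdk', 'azureml-defaults']
# Pin Python to 3.6.x; the build failed with 3.5
mycondaenv = CondaDependencies.create(conda_packages=conda_packages, pip_packages=pip_packages, python_version='3.6.2')
myenv.python.conda_dependencies = mycondaenv
myenv.register(workspace=ws)
inference_config = InferenceConfig(entry_script='score.py', source_directory='.', environment=myenv)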
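Since the fix above came down to the pinned Python version, a quick pre-check of the conda file before deploying may save a failed build. The following is only a minimal sketch using the standard library: the helper names, the sample YAML, and the 3.6 threshold (taken from this answer, not from any Azure documentation) are assumptions, and it only recognises exact numeric pins such as python=3.5 or python==3.6.2.

```python
import re

# Assumed minimum, based only on the answer above (3.5 failed, 3.6 worked).
MIN_PYTHON = (3, 6)


def find_python_pin(conda_yaml_text):
    """Return the pinned Python version as a tuple of ints, or None if no exact pin is found."""
    # Matches dependency lines such as "- python=3.5" or "- python==3.6.2".
    match = re.search(
        r"^\s*-\s*python\s*=+\s*([0-9]+(?:\.[0-9]+)*)\s*$",
        conda_yaml_text,
        flags=re.MULTILINE,
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def python_pin_is_too_old(conda_yaml_text, minimum=MIN_PYTHON):
    """Return True if the file pins a Python version below `minimum` (major, minor)."""
    pin = find_python_pin(conda_yaml_text)
    return pin is not None and pin[:2] < minimum


# Hypothetical conda file content, for illustration only.
sample_env = """\
name: project_environment
dependencies:
  - python=3.5
  - pip:
    - azureml-defaults
"""

print(python_pin_is_too_old(sample_env))  # True
```

To check a real file, read it first, for example python_pin_is_too_old(open("myenv.yml").read()).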

Related

Getting error while deploying using flyctl (flyio)

The free dyno hours on my Heroku app have run out, so I am trying to deploy the web app on https://fly.io/
I have installed flyctl and signed up, but now I am getting this error: Incorrect function.
👉 Complete error below:
$ flyctl launch
Creating app in C:\Users\Abhinav\Desktop\todolist-abhinavkashyap061
Scanning source code
Detected a NodeJS app
Using the following build configuration:
Builder: heroku/buildpacks:20
? App Name (leave blank to use an auto-generated name): Error Incorrect function.

Had the same problem. Switched to PowerShell and it worked.
https://community.fly.io/t/launching-node-app-error-incorrect-function/6963

How to submit local jobs with dsl.pipeline

I am trying to run and debug a pipeline locally. The pipeline is implemented with azure.ml.component.dsl.pipeline. When I try to set default_compute_target='local', the compute target cannot be found:
local not found in workspace, assume this is an AmlCompute
...
File "/home/amirabdi/miniconda3/envs/stm/lib/python3.8/site-packages/azure/ml/component/run_settings.py", line 596, in _get_compute_type
raise InvalidTargetSpecifiedError(message="Cannot find compute '{}' in workspace.".format(compute_name))
azure.ml.component._util._exceptions.InvalidTargetSpecifiedError: InvalidTargetSpecifiedError:
Message: Cannot find compute 'local' in workspace.
InnerException None
ErrorResponse
{
"error": {
"code": "UserError",
"message": "Cannot find compute 'local' in workspace."
}
}
The local run, for example, can be achieved with azureml.core.ScriptRunConfig.
src = ScriptRunConfig(script="train.py", compute_target="local", environment=myenv)
run = exp.submit(src)

There are different types of compute targets, and one of them is the local computer.
Create an experiment
from azureml.core import Experiment
experiment_name = 'my_experiment'
experiment = Experiment(workspace=ws, name=experiment_name)
Select the compute target where we need to run
compute_target='local'
If compute_target is not specified, or no ScriptRunConfig is given, Azure ML runs the script locally.
from azureml.core import Environment
myenv = Environment("user-managed-env")
myenv.python.user_managed_dependencies = True
Create the script job, based on the procedure mentioned in link
Submit the experiment
run = experiment.submit(config=src)
run.wait_for_completion(show_output=True)
For troubleshooting the procedure, check the link.
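Put together, the steps above might look like the following sketch. It assumes azureml-core is installed and that ws is an existing Workspace object. The function name submit_local_run and its default arguments are hypothetical. It follows the ScriptRunConfig route described in this answer and does not cover azure.ml.component.dsl.pipeline.

```python
def submit_local_run(ws, script="train.py", source_directory=".",
                     experiment_name="my_experiment"):
    """Submit `script` to run on the local machine and wait for it to finish."""
    # Imported lazily so this module can be loaded without azureml-core installed.
    from azureml.core import Environment, Experiment, ScriptRunConfig

    experiment = Experiment(workspace=ws, name=experiment_name)

    # Reuse the Python environment that is already active on this machine.
    myenv = Environment("user-managed-env")
    myenv.python.user_managed_dependencies = True

    # compute_target="local" asks Azure ML to run the script on this computer.
    src = ScriptRunConfig(source_directory=source_directory,
                          script=script,
                          compute_target="local",
                          environment=myenv)

    run = experiment.submit(config=src)
    run.wait_for_completion(show_output=True)
    return run
```

With a workspace loaded, for instance via Workspace.from_config(), the call would be run = submit_local_run(ws).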

Terraform Vcloud provider is crashing when using terraform plan

I am trying to automate the deployment of VMs in vCloud using Terraform.
The server that I am using doesn't have an internet connection, so I had to install Terraform and the VCD provider offline.
terraform init worked, but terraform plan crashes.
Terraform version: 1.0.11
VCD provider version: 3.2.0 (I am using this version because we have vCloud 9.7).
This is a test script to see whether Terraform works:
terraform {
  required_providers {
    vcd = {
      source  = "vmware/vcd"
      version = "3.2.0"
    }
  }
}
provider "vcd" {
  user                 = "test"
  password             = "test"
  url                  = "https://test/api"
  auth_type            = "integrated"
  vdc                  = "Org1VDC"
  org                  = "System"
  max_retry_timeout    = "60"
  allow_unverified_ssl = "true"
}
resource "vcd_org_user" "my-org-admin" {
  org         = "my-org"
  name        = "my-org-admin"
  description = "a new org admin"
  role        = "Organization Administrator"
  password    = "change-me"
}
When I run terraform plan I get the following error:
Error: Plugin did not respond
...
The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ConfigureProvider call. The plugin logs may contain more details
Stack trace from the terraform-provider-vcd_v3.2.0 plugin:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0xaf3b75]
...
Error: The terraform-provider-vcd_v3.2.0 plugin crashed!
In the logs I can see a lot of DEBUG messages where the provider is trying to connect to github. provider.terraform-provider-vcd_v3.2.0: github.com/vmware/go-vcloud-director/v2/govcd.(*VCDClient).Authenticate(...)
And for ERROR messages I only saw 2:
plugin.(*GRPCProvider).ConfigureProvider: error="rpc error: code = Unavailable desc = transport is closing"
Failed to read plugin lock file .terraform/plugins/linux_amd64/lock.json: open .terraform/plugins/linux_amd64/lock.json: no such file or directory
This is the first time I am configuring Terraform offline and using the VCD provider.
Did I miss something?

I have found the issue.
In the URL I was using the IP address of the vCloud API, and for some reason Terraform didn't like that and crashed. After changing it to the FQDN, Terraform started working again.
Kind regards
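If you want to catch this before running terraform plan, a small check on the provider URL can flag an IP-address host. This is only a sketch with the Python standard library. The function name and the example URLs are made up for illustration, and it says nothing about why the provider crashes on an IP address.

```python
import ipaddress
from urllib.parse import urlsplit


def url_host_is_ip(url):
    """Return True if the host part of `url` is an IPv4 or IPv6 literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not a valid IP literal, so treat it as a host name.
        return False
    return True


# Example URLs are placeholders, not real endpoints.
print(url_host_is_ip("https://192.0.2.10/api"))       # True
print(url_host_is_ip("https://vcd.example.com/api"))  # False
```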

Docker and AzureKeyVault: unable to load shared library 'libsecret-1.so.0'

I have ASP.NET Core xUnit integration tests that connect to MongoDB to test basic repositories on collections. The tests are built and run in a container in AKS. I have set up the test fixture to connect to Azure Key Vault to retrieve the connection string for MongoDB.
var pathToSetting = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
var configuration = new ConfigurationBuilder()
    .SetBasePath(pathToSetting)
    .AddJsonFile("appsettings.json")
    .AddEnvironmentVariables();
var secretClient = new SecretClient(
    new Uri("url_to_Azure_keyVault"),
    new DefaultAzureCredential(),
    new SecretClientOptions()
    {
        Retry =
        {
            Delay = TimeSpan.FromSeconds(2),
            MaxDelay = TimeSpan.FromSeconds(4),
            MaxRetries = 2,
            Mode = RetryMode.Exponential
        }
    });
configuration.AddAzureKeyVault(secretClient, new KeyVaultSecretManager());
I am using the following Dockerfile for the integration tests:
#Grab an OS image made to run the .Net Core SDK
FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS build
#copy files for build
WORKDIR /testProject
COPY . .
RUN dotnet build tests/integrationTest.csproj --output /testProject/artifacts
FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS final
COPY --from=build ["/testProject/artifacts", "/testProject/artifacts"]
ENTRYPOINT dotnet test /testProject/artifacts/integrationTest.dll
The tests run fine locally from Visual Studio but fail with exception below when run in container both locally and in AKS.
[xUnit.net 00:00:03.10] IntegrationTest1 [FAIL]
X Test1 [1ms]
Error Message:
System.AggregateException : One or more errors occurred. (SharedTokenCacheCredential authentication failed: Persistence check failed. Inspect inner exception for details) (The following constructor parameters did not have matching fixture data: TestFixture testFixture)
---- Azure.Identity.AuthenticationFailedException : SharedTokenCacheCredential authentication failed: Persistence check failed. Inspect inner exception for details
-------- Microsoft.Identity.Client.Extensions.Msal.MsalCachePersistenceException : Persistence check failed. Inspect inner exception for details
------------ System.DllNotFoundException : Unable to load shared library 'libsecret-1.so.0' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: liblibsecret-1.so.0: cannot open shared object file: No such file or directory
Any ideas on how to troubleshoot this error?

I came across this potential fix while working on my own issue:
Wherever you create new DefaultAzureCredentialOptions, you should also set the property ExcludeSharedTokenCacheCredential to true.
In your WSL environment install libsecret-1-dev. In Ubuntu for example, run the command sudo apt install libsecret-1-dev. This will add libsecret-1.so.0 to your system so that MSAL can find it.
https://hungyi.net/posts/wsl2-msal-extensions/
It didn't work for me, but I am using a docker container that doesn't have full access to apt. I can't install libsecret-1-dev.
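To see whether the library is visible at all inside a given environment, a small diagnostic along these lines may help. It is a sketch that assumes a Python interpreter is available there, which is not a given in a .NET SDK image. It relies on ctypes.util.find_library, which on Linux depends on tools such as ldconfig or gcc, so a None result in a minimal container does not prove the library is absent.

```python
import ctypes.util


def find_libsecret():
    """Return the name the system resolves for libsecret-1, or None if it is not found."""
    # find_library takes the name without the "lib" prefix and the ".so" suffix.
    return ctypes.util.find_library("secret-1")


found = find_libsecret()
if found is None:
    print("libsecret-1 was not found by the library lookup")
else:
    print("libsecret-1 resolved as", found)
```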

This isn't a root-cause fix, but the same error popped up for me this morning. Rolling the Microsoft.Identity.Web package back from 1.7.0 to 1.6.0 did the trick.
Judging from the GitHub issues on other Azure packages, wrapping these exceptions is a commonly reported bug.

Switching Azure.Identity 1.2.3 to 1.2.2 did the trick for me (this page helped me https://hungyi.net/posts/wsl2-msal-extensions/).

How do I resolve Access Denied error when trying to deploy Echobot in Azure?

I had a working bot that I tried to push an update to, and I got a failure response. I tried building and deploying in Kudu, with no luck either. Just as a sanity check, I also made a brand-new EchoBot on Azure and tried to run the build and deploy commands in the Kudu console.
EDIT: Meant to mention I've seen a few other mentions of similar issues including:
Error - Access is denied - deployment to Azure App Services
https://github.com/projectkudu/kudu/issues/3177
https://medium.com/rare-crew/hot-issue-on-azure-and-deployment-of-apps-by-kudu-scripts-dotnet-sdk-v3-1-301-92d6e336756a
MSBUILD : error MSB1025:Unhandled exception. An internal failure occurred while running MSBuild.
System.ComponentModel.Win32Exception (5): Access is denied.
at System.Diagnostics.Process.set_PriorityClassCore(ProcessPriorityClass value)
at System.Diagnostics.Process.set_PriorityClass(ProcessPriorityClass value)
at Microsoft.Build.CommandLine.MSBuildApp.Execute(String[] commandLine)
System.ComponentModel.Win32Exception (5): Access is denied.
at System.Diagnostics.Process.set_PriorityClassCore(ProcessPriorityClass value)
at System.Diagnostics.Process.set_PriorityClass(ProcessPriorityClass value)
at Microsoft.Build.CommandLine.MSBuildApp.Execute(String[] commandLine)
at Microsoft.Build.CommandLine.MSBuildApp.Main(String[] args)
Failed exitCode=-532462766, command=dotnet restore "EchoBot.sln"
An error has occurred during web site deployment.

We arrived at an answer in this thread: Microsoft Help Link
For the default Echobot project generated by Azure you need a global.json file in "D:\home\site\wwwroot" with the following code. You can get to this folder by using the Kudu debug console.
{
  "sdk": {
    "version": "3.1.202"
  }
}

The issue was recently introduced by the latest dotnet SDK versions (2.1.515 and 3.1.301). It affects projects whose custom deployment script still uses dotnet restore and publish to build. Could you please try the workaround below in deploy.cmd to fix it?
SET MSBUILD_PATH=%ProgramFiles(x86)%\MSBuild-16.4\MSBuild\Current\Bin\MSBuild.exe
call :ExecuteCmd "%MSBUILD_PATH%" -t:Restore "%DEPLOYMENT_SOURCE%\my-solution.sln"
call :ExecuteCmd "%MSBUILD_PATH%" -t:Publish "%DEPLOYMENT_SOURCE%\vstar-next\my-proj.csproj" -p:OutputPath="%DEPLOYMENT_TEMP%" -p:Configuration=Dev
