IotEdge - Error calling Create module image-classifier-service - azure

I'm very new to Azure IoT Edge and I'm trying to deploy "Image Recognition with Azure IoT Edge and Cognitive Services" to my Raspberry Pi.
After "Build & Push IoT Edge Solution" and deploying to a single device ID, neither of the two modules shows up in docker ps -a or iotedge list.
When I check the edgeAgent logs there is an error message: edgeAgent fails while creating those modules (camera-capture and image-classifier-service).
I've tried:
1. Re-building the solution from a fresh folder/package
2. Pulling the image manually from the Azure Portal and running it with a script
I've been stuck on this for days.
In deployment.arm32v7.json I define the image for those modules with the registered registry URL:
"modules": {
"camera-capture": {
"version": "1.0",
"type": "docker",
"status": "running",
"restartPolicy": "always",
"settings": {
"image": "zzzz.azurecr.io/camera-capture-opencv:1.1.12-arm32v7",
"createOptions": "{\"Env\":[\"Video=0\",\"azureSpeechServicesKey=2f57f2d9f1074faaa0e9484e1f1c08c1\",\"AiEndpoint=http://image-classifier-service:80/image\"],\"HostConfig\":{\"PortBindings\":{\"5678/tcp\":[{\"HostPort\":\"5678\"}]},\"Devices\":[{\"PathOnHost\":\"/dev/video0\",\"PathInContainer\":\"/dev/video0\",\"CgroupPermissions\":\"mrw\"},{\"PathOnHost\":\"/dev/snd\",\"PathInContainer\":\"/dev/snd\",\"CgroupPermissions\":\"mrw\"}]}}"
}
},
"image-classifier-service": {
"version": "1.0",
"type": "docker",
"status": "running",
"restartPolicy": "always",
"settings": {
"image": "zzzz.azurecr.io/image-classifier-service:1.1.5-arm32v7",
"createOptions": "{\"HostConfig\":{\"Binds\":[\"/home/pi/images:/images\"],\"PortBindings\":{\"8000/tcp\":[{\"HostPort\":\"80\"}],\"5679/tcp\":[{\"HostPort\":\"5679\"}]}}}"
}
Error message from the edgeAgent logs:
(Inner Exception #0) Microsoft.Azure.Devices.Edge.Agent.Edgelet.EdgeletCommunicationException- Message:Error calling Create module
image-classifier-service: Could not create module image-classifier-service
caused by: Could not pull image zzzzz.azurecr.io/image-classifier-service:1.1.5-arm32v7
caused by: Get https://zzzzz.azurecr.io/v2/image-classifier-service/manifests/1.1.5-arm32v7: unauthorized: authentication required
When I try to run the pulled image manually:
sudo docker run --rm --name testName -it zzzz.azurecr.io/camera-capture-opencv:1.1.12-arm32v7
I get this error:
Camera Capture Azure IoT Edge Module. Press Ctrl-C to exit.
Error: Time:Fri May 24 10:01:09 2019 File:/usr/sdk/src/c/iothub_client/src/iothub_client_core_ll.c Func:retrieve_edge_environment_variabes Line:191 Environment IOTEDGE_AUTHSCHEME not set
Error: Time:Fri May 24 10:01:09 2019 File:/usr/sdk/src/c/iothub_client/src/iothub_client_core_ll.c Func:IoTHubClientCore_LL_CreateFromEnvironment Line:1572 retrieve_edge_environment_variabes failed
Error: Time:Fri May 24 10:01:09 2019 File:/usr/sdk/src/c/iothub_client/src/iothub_client_core.c Func:create_iothub_instance Line:941 Failure creating iothub handle
Unexpected error IoTHubClient.create_from_environment, IoTHubClientResult.ERROR from IoTHub

When you pulled the image directly with docker run, the pull succeeded but the container then failed to run outside the IoT Edge runtime, which is expected. When the edge agent tried to pull the same image, however, the pull itself failed because it was not authorized: no credentials were supplied to the runtime, so it tried to access the registry anonymously.
Make sure you add your container registry credentials to the deployment so that the edge runtime can pull images. The deployment should contain something like the following under the edgeAgent runtime settings (registryCredentials):
"MyRegistry" :{
"username": "<username>",
"password": "<password>",
"address": "<registry-name>.azurecr.io"
}
As @silent pointed out in the comments, the documentation covers this, including an example deployment that contains container registry credentials.
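If you want to rule out bad credentials before redeploying, you can test them from the Raspberry Pi itself. This is only a diagnostic sketch; the registry name and image tag are the ones from the question, and the username/password are whatever ACR credentials (admin user or service principal) you put into the deployment:
# Log in to the registry with the same credentials used in the deployment
sudo docker login zzzz.azurecr.io --username <username> --password <password>
# If the login succeeds, a manual pull of the failing image should now work too
sudo docker pull zzzz.azurecr.io/image-classifier-service:1.1.5-arm32v7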

Related

How do I access npm log files in GKE?

I'm running several Node.js microservices on Google Kubernetes Engine.
Sometimes these services crash, and according to Cloud Logging I should find detailed information in a log file. For example, the log message says
{
  "textPayload": "npm ERR! /root/.npm/_logs/2021-10-27T11_26_28_534Z-debug.log\n",
  "insertId": "zoqxk8wvkuofhslm",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "pod_name": "client-depl-7f679c6b49-5d9tz",
      "container_name": "client",
      "namespace_name": "production",
      "cluster_name": "cluster-1",
      "location": "europe-west3-a",
      "project_id": "XXX"
    }
  },
  "timestamp": "2021-10-27T11:26:28.701252670Z",
  "severity": "ERROR",
  "labels": {
    "k8s-pod/app": "client",
    "k8s-pod/skaffold_dev/run-id": "b5518659-05d6-4c08-9b55-9d58fdd5807f",
    "k8s-pod/pod-template-hash": "7f679c6b49",
    "compute.googleapis.com/resource_name": "gke-cluster-1-pool-1-8bfc60b2-ag86",
    "k8s-pod/app_kubernetes_io/managed-by": "skaffold"
  },
  "logName": "projects/xxx-productive/logs/stderr",
  "receiveTimestamp": "xxx"
}
Where do I find these logs on Google Cloud Platform?
---------------- Edit 2021.10.28 ---------------------------
I should clarify that I am already using the Logs Explorer. This is what I see there:
The logs show 7 consecutive error entries about npm failing. The last two entries indicate that there is more information in a log file, "/root/.npm/_logs/2021-10-27T11_26_28_534Z-debug.log".
Does this log file have more info about the failure, or is everything I can get already in these 7 error log entries?
Thanks
kubectl logs <your_pod>
You can use GCP Logs Explorer
Assuming you have already enabled Logging and Monitoring, you can view the logs as follows:
a. Go to the Logs explorer in the Cloud Console.
b. Click Resource. Under ALL_RESOURCE_TYPES, select Kubernetes Container.
c. Under CLUSTER_NAME, select the name of your user cluster.
d. Under NAMESPACE_NAME, select default.
e. Click Add and then click Run Query.
f. Under Query results, you can see log entries from the monitoring-example Deployment. For example:
{
  "textPayload": "2020/11/14 01:24:24 Starting to listen on :9090\n",
  "insertId": "1oa4vhg3qfxidt",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "pod_name": "monitoring-example-7685d96496-xqfsf",
      "cluster_name": ...,
      "namespace_name": "default",
      "project_id": ...,
      "location": "us-west1",
      "container_name": "prometheus-example-exporter"
    }
  },
  "timestamp": "2020-11-14T01:24:24.358600252Z",
  "labels": {
    "k8s-pod/pod-template-hash": "7685d96496",
    "k8s-pod/app": "monitoring-example"
  },
  "logName": "projects/.../logs/stdout",
  "receiveTimestamp": "2020-11-14T01:24:39.562864735Z"
}
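If you prefer the command line, roughly the same query can be run with gcloud. This is a sketch only, using the namespace and severity from the log entry in the question; add --project <project-id> if it is not your default project:
# List recent ERROR entries from containers in the "production" namespace
gcloud logging read \
  'resource.type="k8s_container" AND resource.labels.namespace_name="production" AND severity>=ERROR' \
  --limit 20 \
  --format json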
How about logging into the pod while it is alive:
kubectl exec -it your-pod -- sh
then wait for it to crash and watch the crash file in real time, while the pod has not been restarted yet :)
How to log in to a GKE pod from the console:
From the Google Cloud Platform main menu, go to Kubernetes Engine -> Workloads.
Click on the workload you're interested in.
Find the Managed Pods section and click on the Pod you want to access.
Click on KUBECTL -> Exec -> [name of workload/namespace].
A terminal should appear at the bottom of the browser page, giving you a shell inside the pod. You can look around for your log file from there.
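If the container is still up (a crash-looping pod often keeps its container running for a short while between restarts), you can also copy the npm debug log out before it disappears. A minimal sketch using the namespace, pod, container, and log path from the log entry in the question; note that kubectl cp needs tar available inside the container:
# Copy the debug log out of the running container to the local machine
kubectl cp -c client \
  production/client-depl-7f679c6b49-5d9tz:/root/.npm/_logs/2021-10-27T11_26_28_534Z-debug.log \
  ./npm-debug.log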

Azure Container Service (AKS) kubeconfig file outdated

I am learning about Kubernetes and set up a release pipeline with a kubectl apply task. I set up the AKS cluster via Terraform, and on the first run everything seemed fine. After I destroyed the cluster and reran the pipeline, I get issues which I believe are related to the kubeconfig file mentioned in the exception. I tried the Cloud Shell etc. to get to the file or reset it, but I wasn't successful. How can I get back to a clean state?
2020-12-09T09:08:51.7047177Z ##[section]Starting: kubectl apply
2020-12-09T09:08:51.7482440Z ==============================================================================
2020-12-09T09:08:51.7483217Z Task : Kubectl
2020-12-09T09:08:51.7483729Z Description : Deploy, configure, update a Kubernetes cluster in Azure Container Service by running kubectl commands
2020-12-09T09:08:51.7484058Z Version : 0.177.0
2020-12-09T09:08:51.7484996Z Author : Microsoft Corporation
2020-12-09T09:08:51.7485587Z Help : https://learn.microsoft.com/azure/devops/pipelines/tasks/deploy/kubernetes
2020-12-09T09:08:51.7485955Z ==============================================================================
2020-12-09T09:08:52.7640528Z [command]C:\ProgramData\Chocolatey\bin\kubectl.exe --kubeconfig D:\a\_temp\kubectlTask\1607504932712\config apply -f D:\a\r1\a/medquality-cordapp/k8s
2020-12-09T09:08:54.1555570Z Unable to connect to the server: dial tcp: lookup mq-k8s-dfee38f6.hcp.switzerlandnorth.azmk8s.io: no such host
2020-12-09T09:08:54.1798118Z ##[error]The process 'C:\ProgramData\Chocolatey\bin\kubectl.exe' failed with exit code 1
2020-12-09T09:08:54.1853710Z ##[section]Finishing: kubectl apply
Update: these are the workflow tasks of the release pipeline.
Initially I get the artifact (a clone of the repo containing the k8s YAMLs), then the stage does a kubectl apply.
"workflowTasks": [
{
"environment": {},
"taskId": "cbc316a2-586f-4def-be79-488a1f503564",
"version": "0.*",
"name": "kubectl apply",
"refName": "",
"enabled": true,
"alwaysRun": false,
"continueOnError": false,
"timeoutInMinutes": 0,
"definitionType": null,
"overrideInputs": {},
"condition": "succeeded()",
"inputs": {
"kubernetesServiceEndpoint": "82e5971b-9ac6-42c6-ac43-211d2f6b60e4",
"namespace": "",
"command": "apply",
"useConfigurationFile": "false",
"configuration": "",
"arguments": "-f $(System.DefaultWorkingDirectory)/medquality-cordapp/k8s",
"secretType": "dockerRegistry",
"secretArguments": "",
"containerRegistryType": "Azure Container Registry",
"dockerRegistryEndpoint": "",
"azureSubscriptionEndpoint": "",
"azureContainerRegistry": "",
"secretName": "",
"forceUpdate": "true",
"configMapName": "",
"forceUpdateConfigMap": "false",
"useConfigMapFile": "false",
"configMapFile": "",
"configMapArguments": "",
"versionOrLocation": "version",
"versionSpec": "1.7.0",
"checkLatest": "false",
"specifyLocation": "",
"cwd": "$(System.DefaultWorkingDirectory)",
"outputFormat": "json",
"kubectlOutput": ""
}
}
]
I can see you are using kubernetesServiceEndpoint as the service connection type in the Kubectl task.
"Once I destroyed the cluster I reran the pipeline, I get issues...."
If the cluster was destroyed, the kubernetesServiceEndpoint in Azure DevOps still points to the original cluster. A Kubectl task that uses that endpoint keeps looking for the old cluster and fails with the above error, because the old cluster no longer exists.
You can fix this by updating the kubernetesServiceEndpoint in Azure DevOps to point to the newly created cluster:
Go to Azure DevOps Project settings --> Service connections --> find your Kubernetes service connection --> click Edit to update the configuration.
But if your Kubernetes cluster gets destroyed and recreated frequently, I would suggest using Azure Resource Manager as the service connection type in the Kubectl task.
With azureSubscriptionEndpoint and azureResourceGroup specified, it does not matter how many times the cluster is recreated, as long as the cluster's name does not change.
See the documentation on creating an Azure Resource Manager service connection.
When you destroy and reprovision an AKS cluster, the Kubernetes API URL and some other things change, but as you found out, nothing updates this automatically on your configured clients.
What I do to get access to new and reprovisioned AKS clusters is:
az aks get-credentials --subscription <sub> -g <rg> -n <aksname> -a --overwrite-existing
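A quick sanity check that the refreshed kubeconfig now points at the new cluster (assuming kubectl reads the default kubeconfig that az aks get-credentials writes to):
# Confirm the current context and that the new API server answers
kubectl config current-context
kubectl get nodes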

Azure VM: can't install Qualys extension

I run the same code snippet as for other extensions:
az vm extension set \
--resource-group "azure-vm-arm-rg" \
--vm-name "azure-vm" \
--name "WindowsAgent.AzureSecurityCenter" \
--publisher "Qualys"
...and I'm getting:
The handler for VM extension type 'Qualys.WindowsAgent.AzureSecurityCenter'
has reported terminal failure for VM extension 'WindowsAgent.AzureSecurityCenter'
with error message: 'Enable failed for plugin (name: Qualys.WindowsAgent.AzureSecurityCenter,
version 1.0.0.10) with exception Command
C:\Packages\Plugins\Qualys.WindowsAgent.AzureSecurityCenter\1.0.0.10\enableCommandHndlr.cmd
of Qualys.WindowsAgent.AzureSecurityCenter has exited with Exit code: 4306'.
I have no issues installing this extension via the Azure UI in Security Center.
I suspect licensing to be the root cause, but I don't have any dedicated licenses; I believe Security Center manages them automatically.
Any ideas how to install the Qualys extension automatically?
I encountered the same issue. It was because the extension was added too soon after the VM had started. The prerequisite is that the Azure Virtual Machine agent must be running on the VM before the extension is added.
For my solution, I added dependencies on other extensions before running this extension (see the ARM template snippet below). That gave the machine enough time to start and have the Azure Virtual Machine agent running before the Qualys extension was added.
{
  "type": "microsoft.compute/virtualmachines/providers/serverVulnerabilityAssessments",
  "apiVersion": "2015-06-01-preview",
  "name": "[concat(parameters('virtualMachineName'), '/Microsoft.Security/Default')]",
  "dependsOn": [
    "[concat('Microsoft.Compute/virtualMachines/', parameters('virtualMachineName'))]",
    "[concat('Microsoft.Compute/virtualMachines/', parameters('virtualMachineName'), '/extensions/AzurePolicyforWindows')]",
    "[concat('Microsoft.Compute/virtualMachines/', parameters('virtualMachineName'), '/extensions/Microsoft.Insights.VMDiagnosticsSettings')]",
    "[concat('Microsoft.Compute/virtualMachines/', parameters('virtualMachineName'), '/extensions/AzureNetworkWatcherExtension')]"
  ]
}
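If you are scripting the install with the Azure CLI instead of an ARM template, a hedged way to get the same effect is to wait until the VM agent reports Ready before calling az vm extension set. A sketch using the resource group and VM name from the question:
# Check that the Azure VM agent is up; repeat until it prints "Ready"
az vm get-instance-view \
  --resource-group "azure-vm-arm-rg" \
  --name "azure-vm" \
  --query "instanceView.vmAgent.statuses[0].displayStatus" \
  --output tsv
# Then retry the extension install from the question
az vm extension set \
  --resource-group "azure-vm-arm-rg" \
  --vm-name "azure-vm" \
  --name "WindowsAgent.AzureSecurityCenter" \
  --publisher "Qualys"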
Make sure you have no Azure Policies configured which do things like require tags, as this can block the extension installation and only give the error message "The resource operation completed with terminal provisioning state 'Failed'."

How to deploy a Linux Azure Function using the Github Docker Registry

I cannot get a deployment of an Azure Function from a private repository, using the new GitHub artifact registry for Docker (https://github.com/features/packages), to work.
My linux_fx_version is:
'linux_fx_version': 'DOCKER|{}'.format(self.docker_image_id)
with docker_image_id having the value organisation/project-name/container-name:latest
For the other settings, I am using
{ "name": "DOCKER_REGISTRY_SERVER_PASSWORD", "value": self.docker_password },
{ "name": "DOCKER_REGISTRY_SERVER_USERNAME", "value": self.docker_username },
{ "name": "DOCKER_REGISTRY_SERVER_URL", "value": self.docker_url },
with docker_url being https://docker.pkg.github.com/ and the password being a personal access token with the read:packages scope.
Things look good, and yet I get the following (I am not able to fetch any deployment logs as the runtime is unreachable).
Error:
Azure Functions Runtime is unreachable. Click here for details on storage configuration.
Solution found.
Use https://docker.pkg.github.com/ as the Docker registry URL,
and docker.pkg.github.com/<org>/<project-name>/<container-name>:<version> as the image in linux_fx_version (the image reference must include the registry host).
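For reference, a rough az CLI equivalent of those settings. This is a sketch only: the function app and resource group names are placeholders, and the image reference follows the pattern from the solution above:
# Registry credentials for the Function App (the GitHub token needs the read:packages scope)
az functionapp config appsettings set \
  --name <function-app> \
  --resource-group <resource-group> \
  --settings \
    DOCKER_REGISTRY_SERVER_URL=https://docker.pkg.github.com/ \
    DOCKER_REGISTRY_SERVER_USERNAME=<github-username> \
    DOCKER_REGISTRY_SERVER_PASSWORD=<github-token>
# Point the app at the image, keeping the registry host in the image name
az functionapp config container set \
  --name <function-app> \
  --resource-group <resource-group> \
  --docker-custom-image-name docker.pkg.github.com/<org>/<project-name>/<container-name>:<version> \
  --docker-registry-server-url https://docker.pkg.github.com/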

Azure Function blobTrigger not registered

As the title says, when I try to run my Node.js based Azure Function I come across the following error:
The following 1 functions are in error: [7/2/19 1:41:17 AM] ***: The binding type(s) 'blobTrigger' are not registered. Please ensure the type is correct and the binding extension is installed.
I tried func extensions install --force with no luck, any idea? My development environment is macOS and I tried both the npm (Node.js) based azure-functions-core-tools and the brew based install; neither works.
The scariest part is that this used to work fine on the same machine; all of a sudden it just stopped working.
Basically, you can refer to the official tutorial for Linux, Create your first function hosted on Linux using Core Tools and the Azure CLI (preview), to get started.
Since macOS and Linux use the same bash shell, I walk through my sample demo on Linux and avoid any incompatible operations. First, it is assumed that there is a usable Node.js runtime in your environment; my versions of node and npm are v10.16.0 and 6.9.0.
First install azure-functions-core-tools via npm and check the installation.
Next, init a project MyFunctionProj via func.
Then create a new function with a blob trigger, roughly as sketched below.
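A minimal sketch of those CLI steps (the project and function names follow the answer; the exact template name may vary slightly between Core Tools versions):
# Install Core Tools, scaffold a Node.js project, and add a blob-triggered function
npm install -g azure-functions-core-tools
func init MyFunctionProj --worker-runtime node
cd MyFunctionProj
func new --template "Azure Blob Storage trigger" --name MyBlobTrigger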
There was an issue about the .NET Core SDK being required, so I went to https://www.microsoft.com/net/download to install it. My steps there are Linux-specific rather than macOS, but you can easily adapt them yourself; I simply followed the official installation instructions.
After installing the .NET Core SDK, try func new again.
It then completes successfully.
Then change two configuration files, MyFunctionProj/local.settings.json and MyFunctionProj/MyBlobTrigger/function.json, as below.
MyFunctionProj/local.settings.json
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "AzureWebJobsStorage": "<your real storage connection string, like DefaultEndpointsProtocol=https;AccountName=<your account name>;AccountKey=<your account key>;EndpointSuffix=core.windows.net>"
  }
}
MyFunctionProj/MyBlobTrigger/function.json
{
  "bindings": [
    {
      "name": "myBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "<the container name you want to monitor>/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
Then run func host start --build to start it up without any error.
Upload a test file named test.txt via Azure Storage Explorer to the container <the container name you want to monitor> configured in function.json, and you will see that MyBlobTrigger is triggered and works fine.
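If you prefer the CLI over Storage Explorer, the same test upload can be done with az (a sketch; the connection string and container name are the placeholders used above):
# Upload a test blob so the blobTrigger fires
az storage blob upload \
  --connection-string "<your real storage connection string>" \
  --container-name "<the container name you want to monitor>" \
  --name test.txt \
  --file ./test.txt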
Hope it helps.
