I've set up the Azure external metrics adapter following this document: https://github.com/Azure/azure-k8s-metrics-adapter/tree/master/samples/servicebus-queue
After the Helm installation using a service principal, running kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq should return the output shown in the document, but instead I get an error: Error from server (ServiceUnavailable): the server is currently unable to handle the request
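As I understand it, the external metrics API is served through an APIService registration that points at the adapter's Service, so a ServiceUnavailable error should mean that backend cannot be reached. Something like the following should show the state of that registration (commands only, as a sketch; I have not pasted the output):
kubectl get apiservice v1beta1.external.metrics.k8s.io
kubectl describe apiservice v1beta1.external.metrics.k8s.io   # shows which namespace/Service it points to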
The Helm installation was successful, and below are the adapter logs:
I0116 12:49:36.216094 1 controller.go:40] Setting up external metric event handlers
I0116 12:49:36.216148 1 controller.go:52] Setting up custom metric event handlers
I0116 12:49:36.216528 1 controller.go:69] initializing controller
I0116 12:49:36.353905 1 main.go:104] Looking up subscription ID via instance metadata
I0116 12:49:36.359887 1 instancemetadata.go:40] connected to sub: *********************
I0116 12:49:36.416858 1 controller.go:77] starting 2 workers with 1000000000 interval
I0116 12:49:36.417062 1 controller.go:88] Worker starting
I0116 12:49:36.417068 1 controller.go:88] Worker starting
I0116 12:49:36.417074 1 controller.go:98] processing item
I0116 12:49:36.417078 1 controller.go:98] processing item
I0116 12:49:36.680065 1 serving.go:312] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
I0116 12:49:37.197936 1 secure_serving.go:116] Serving securely on [::]:6443
When I execute kubectl api-versions, external.metrics.k8s.io/v1beta1 is displayed in the list, so the installation appears to have been successful. Why am I not able to hit the API?
Solved it. Initially I was installing it in my own custom namespace. It looks like the Azure metrics adapter only works if it is installed in the "custom-metrics" namespace. They should probably mention that somewhere in the documentation; it cost me two days of troubleshooting to figure this out :-(
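For reference, a minimal sketch of the install that ended up working for me, assuming Helm 3 syntax; the release name and chart path are placeholders from my setup, and the important part is the namespace flag:
kubectl create namespace custom-metrics
helm install azure-k8s-metrics-adapter ./azure-k8s-metrics-adapter \
    --namespace custom-metrics
kubectl get pods -n custom-metrics   # the adapter pod should be running here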
I am using ServiceBusProcessorClient to consume events from a topic:
ServiceBusProcessorClient serviceBusProcessorClient = new ServiceBusClientBuilder()
        .connectionString(busConnectionString)
        .processor()
        .disableAutoComplete()
        .topicName(topicName)
        .subscriptionName(subscriptionName)
        .processMessage(processMessage)
        .processError(context -> processError(context, countdownLatch))
        .maxConcurrentCalls(maxConcurrentCalls)
        .buildProcessorClient();
serviceBusProcessorClient.start();
But after I kill the app, the message count in Azure Service Bus keeps decreasing until it reaches 0.
I cannot understand what is going wrong in my implementation.
The topic configuration: (screenshot: topic config)
The subscription configuration: (screenshot: subscription config)
It looks like Helm deletes using the background propagation policy, which lets the garbage collector delete dependents in the background. This is probably why your service is still processing messages even after you run helm uninstall.
You would have to kill the process directly, in addition to running helm uninstall, to stop any more messages from being processed.
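If the goal is for the consumer to stop cleanly whenever its process is terminated (for example when the pod is deleted), one option is to register a JVM shutdown hook that closes the processor built in your question. This is only a sketch on my side, not taken from your code:
// Close the processor when the JVM shuts down (e.g. on SIGTERM from a pod
// deletion), so no further messages are pulled from the subscription.
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    serviceBusProcessorClient.close();   // stops receiving and releases the connection
}));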
We have set up our web application on a new Windows Server 2019 Datacenter edition.
Previously it ran on a Windows Server 2008 machine.
The application runs smoothly, but the Event Log is being flooded with these errors:
The loghttp module in the worker process with id '13104' could not obtain custom log data for '1' requests. The data field contains the error code.
The loghttp module in the worker process with id '11500' could not obtain custom log data for '1' requests. The data field contains the error code.
The loghttp module in the worker process with id '13536' could not obtain custom log data for '1' requests. The data field contains the error code.
We have a bunch of AppPools running, but the errors only occur for 3 of them.
The relevant ones are identified via the worker process ID; see the screenshot below.
(screenshot: pattern, logs every minute)
If I stop the IIS logging module, the Event Log entries also stop.
There are no custom fields being logged.
The logging module is using the W3C format with only standard fields added.
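For reference, the per-site logging settings (including any leftover customFields definitions) can be listed with appcmd; a sketch of one way to double-check that, assuming the default inetsrv path:
%windir%\system32\inetsrv\appcmd.exe list config -section:system.applicationHost/sites
rem look for a <customFields> element under each site's <logFile> node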
C:\Windows\System32\LogFiles\HTTPERR shows nothing besides lines like this:
2021-11-02 09:10:58 10.217.24.201 34241 10.217.10.240 443 - - - - - - Timer_ConnectionIdle -
2021-11-02 09:10:58 10.217.24.201 5601 10.217.10.240 443 - - - - - - Timer_ConnectionIdle -
IIS Logs are being written to file OK.
There is a pattern in the log: every minute the same 5-6 Event Log lines are written.
(screenshot: pattern with logs every minute)
I have a Node server running on Google Cloud Run. Now I want to enable Stackdriver tracing. When I run the service locally, I am able to see the traces in GCP. However, when I run the service on Cloud Run, I am getting an error:
"@google-cloud/trace-agent ERROR TraceWriter#publish: Received error with status code 403 while publishing traces to cloudtrace.googleapis.com: Error: The request is missing a valid API key."
I made sure that the service account has the Cloud Trace Agent role.
The first line in my app.js is:
require('@google-cloud/trace-agent').start();
Running locally, I am using a .env file containing
GOOGLE_APPLICATION_CREDENTIALS=<path to credentials.json>
According to https://github.com/googleapis/cloud-trace-nodejs, these values are auto-detected if the application is running on Google Cloud Platform, so I don't include these credentials in the GCP image.
There are two challenges to using this library with Cloud Run:
Despite the note about auto-detection, Cloud Run is an exception: it is not yet auto-detected. This can be addressed for now with some explicit configuration.
Because Cloud Run services only have CPU resources while they are handling a request, queued-up trace data may not be sent before the CPU is withdrawn. This can be addressed for now by configuring the trace agent to flush as soon as possible:
const tracer = require('@google-cloud/trace-agent').start({
  serviceContext: {
    service: process.env.K_SERVICE || "unknown-service",
    version: process.env.K_REVISION || "unknown-revision"
  },
  flushDelaySeconds: 1,
});
On a quick review I couldn't see how to trigger the trace flush, but the shorter timeout should help avoid some delays in seeing the trace data appear in Stackdriver.
EDIT: While nice in theory, in practice there are still significant race conditions with CPU withdrawal. I filed https://github.com/googleapis/cloud-trace-nodejs/issues/1161 to see if we can find a more consistent solution.
I have a problem with my OpenShift 3 setup, based on Node.js + MongoDB (Persistent): https://github.com/openshift/nodejs-ex.git
Latest App Deployment: nodejs-mongo-persistent-7: Failed
--> Scaling nodejs-mongo-persistent-7 to 1
--> Waiting up to 10m0s for pods in rc nodejs-mongo-persistent-7 to become ready
error: update acceptor rejected nodejs-mongo-persistent-7: pods for rc "nodejs-mongo-persistent-7" took longer than 600 seconds to become ready
Latest Build: Complete
Pushing image 172.30.254.23:5000/husk/nodejs-mongo-persistent:latest ...
Pushed 5/6 layers, 84% complete
Pushed 6/6 layers, 100% complete
Push successful
I have no idea how to debug this. Can you help, please?
Check what went wrong in the console: oc get events
Failed to pull the image? Make sure you included a proper secret.
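A sketch of the commands I would start with here (the pod and secret names are placeholders, not taken from your output):
oc get events --sort-by='.lastTimestamp'
oc describe pod <nodejs-mongo-persistent-7-pod>   # the Events section shows pull/readiness failures
oc logs dc/nodejs-mongo-persistent                # application startup errors
# if the events show an image pull failure, link a pull secret to the default service account:
oc secrets link default <your-pull-secret> --for=pull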
When I try to deploy from the gcloud CLI I get the following error.
Copying files to Google Cloud Storage...
Synchronizing files to [gs://staging.logically-abstract-www-site.appspot.com/].
Updating module [default]...\Deleted [https://www.googleapis.com/compute/v1/projects/logically-abstract-www-site/zones/us-central1-f/instances/gae-builder-vm-20151030t150724].
Updating module [default]...failed.
ERROR: (gcloud.preview.app.deploy) Error Response: [4] Timed out creating VMs.
My app.yaml is:
runtime: nodejs
vm: true
api_version: 1
automatic_scaling:
  min_num_instances: 2
  max_num_instances: 20
  cool_down_period_sec: 60
  cpu_utilization:
    target_utilization: 0.5
I am logged in successfully and have the correct project ID. I see the new version created in the Cloud Console for App Engine, but the error seems to happen after that.
In the stdout log I see both instances come up, with the last console.log statement I put in the app after it starts listening on the port, but in shutdown.log I see "app was unhealthy" and in syslog I see "WARNING: never got healthy response from app, but sending /_ah/start query anyway."
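Given the "never got healthy response" warning, I suspect the managed-VM health checker (which polls /_ah/health) is what is failing rather than VM creation itself. As a diagnostic sketch (my own assumption, not something I have confirmed fixes anything), the legacy health check can be disabled in app.yaml to rule it out:
# sketch: temporarily disable the legacy managed-VM health check
health_check:
  enable_health_check: False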
In my experience with Node.js on Google Cloud App Engine, "Timed out creating VMs" is neither a traditional timeout nor necessarily about creating VMs. I found that other errors were reported during the launch of the server, which happens right after the VMs are created. So I recommend checking the console output to see if it tells you anything.
To see the console output:
For a VM instance, go to your VM instances, click the instance you want, then scroll towards the bottom and click "Serial console output".
For stdout console logging, go to your logs under Monitoring, then change the log type dropdown from Request to stdout.
I also found differences in process.env when running locally versus in the cloud. I hope you find your solution too. Good luck!
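The serial console output in particular can also be pulled from the command line; a sketch using the builder VM name and zone from your error output above (these builder VMs are short-lived, so the name changes on every deploy):
gcloud compute instances get-serial-port-output gae-builder-vm-20151030t150724 \
    --zone us-central1-f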