Could not get HttpClient cache - No ThreadContext available for thread id=1 - sap-cloud-sdk

I'm upgrading our service from SAP Cloud SDK 3.57.0 to 3.63.0, and I've noticed the following warning (with stack trace) in the logs that wasn't there on the previous version:
2022-02-18 14:03:41.038 WARN 1088 --- [ main] c.s.c.s.c.c.AbstractHttpClientCache : Could not get HttpClient cache.
com.sap.cloud.sdk.cloudplatform.thread.exception.ThreadContextAccessException: No ThreadContext available for thread id=1.
at com.sap.cloud.sdk.cloudplatform.thread.ThreadLocalThreadContextFacade.lambda$tryGetCurrentContext$0(ThreadLocalThreadContextFacade.java:39) ~[cloudplatform-core-3.63.0.jar:na]
at io.vavr.Value.toTry(Value.java:1414) ~[vavr-0.10.4.jar:na]
at com.sap.cloud.sdk.cloudplatform.thread.ThreadLocalThreadContextFacade.tryGetCurrentContext(ThreadLocalThreadContextFacade.java:37) ~[cloudplatform-core-3.63.0.jar:na]
at io.vavr.control.Try.flatMapTry(Try.java:490) ~[vavr-0.10.4.jar:na]
at io.vavr.control.Try.flatMap(Try.java:472) ~[vavr-0.10.4.jar:na]
at com.sap.cloud.sdk.cloudplatform.thread.ThreadContextAccessor.tryGetCurrentContext(ThreadContextAccessor.java:84) ~[cloudplatform-core-3.63.0.jar:na]
at com.sap.cloud.sdk.cloudplatform.connectivity.RequestScopedHttpClientCache.getCache(RequestScopedHttpClientCache.java:28) ~[cloudplatform-connectivity-3.63.0.jar:na]
at com.sap.cloud.sdk.cloudplatform.connectivity.AbstractHttpClientCache.tryGetOrCreateHttpClient(AbstractHttpClientCache.java:78) ~[cloudplatform-connectivity-3.63.0.jar:na]
at com.sap.cloud.sdk.cloudplatform.connectivity.AbstractHttpClientCache.tryGetHttpClient(AbstractHttpClientCache.java:46) ~[cloudplatform-connectivity-3.63.0.jar:na]
at com.sap.cloud.sdk.cloudplatform.connectivity.HttpClientAccessor.tryGetHttpClient(HttpClientAccessor.java:153) ~[cloudplatform-connectivity-3.63.0.jar:na]
at com.sap.cloud.sdk.cloudplatform.connectivity.HttpClientAccessor.getHttpClient(HttpClientAccessor.java:131) ~[cloudplatform-connectivity-3.63.0.jar:na]
at com.octanner.mca.service.MarketingCloudApiContactService.uploadContacts(MarketingCloudApiContactService.java:138) ~[classes/:na]
...
This happens when the following calls are made...
Using the lower-level API:
HttpClient httpClient = HttpClientAccessor.getHttpClient(destination); // warning happens here
ODataRequestResultMultipartGeneric batchResult = requestBatch.execute(httpClient);
Using the higher-level API:
service
    .getAllContactOriginData()
    .withQueryParameter("$expand", "AdditionalIDs")
    .top(size)
    .filter(filter)
    .executeRequest(destination); // warning happens here
Even though this warning shows up in the logs, the service requests continue to work as expected. It's just a little concerning to see, and I'm wondering if I have something misconfigured. I reviewed all of the Javadocs and the troubleshooting page and didn't see anything out of the ordinary other than how I am fetching my destination, but even using the DestinationAccessor didn't seem to make a difference. Also, I'm not doing any asynchronous or multi-tenant processing.
Any help or guidance you can give on this would be appreciated!
Cheers!

Such an issue is often the result of missing Spring Boot annotations - especially in synchronous executions.
Please refer to our documentation to learn more about the SAP Cloud SDK Spring Boot integration.
Edit Feb. 28th 2022
It is safe to ignore the logged warning if your application does not need any of the SAP Cloud SDK's multitenancy features.
Error Cause
The SAP Cloud SDK for Java recently (in version 3.63.0) introduced a change to the thread propagation behavior of the HttpClientCache.
With that change, we also adapted the logging in case the propagation didn't work as expected - this is often caused by not using the ThreadContextExecutor for wrapping asynchronous operations.
This is the reason for logs like the one described by the issue author.
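For reference, wrapping such an operation with the ThreadContextExecutor typically looks like this (a minimal sketch against the 3.x API; the exact construction may differ between SDK versions):
import com.sap.cloud.sdk.cloudplatform.thread.ThreadContextExecutor;

// Runs the wrapped code with a properly initialized ThreadContext, so that SDK
// accessors (tenant, principal, request-scoped HttpClient cache) can resolve
// their values instead of failing with "No ThreadContext available".
new ThreadContextExecutor().execute(() -> {
    HttpClient httpClient = HttpClientAccessor.getHttpClient(destination);
    // ... execute requests with httpClient ...
});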
Planned Mitigation
In the meantime, we have realized that these WARN logs are causing confusion on the consumer side.
We are working on improving the situation by degrading the log level to INFO for the message and to DEBUG for the exception.

Related

Not able to get all the logs in application insights even after disabling sampling

I am generating logs for my client application, which runs with very limited internet connectivity. I store the logs offline and send them to Application Insights once the user is back online. The problem I am facing is that, of all the logs, only the request logs come through; the rest get discarded. This happens because of sampling, even though I have already disabled sampling in Startup.cs. Here is my code:
var aiOptions = new Microsoft.ApplicationInsights.AspNetCore.Extensions.ApplicationInsightsServiceOptions();
aiOptions.EnableAdaptiveSampling = false;
services.AddApplicationInsightsTelemetry(aiOptions);
Any suggestions on how to completely disable sampling so that I can see all the logs in Application Insights?
Check this document to see the different log levels. If you have the latest version of the SDK, ILogger can capture logs without any further action, down to the configured log level.
Here is the logging level configuration:
.ConfigureLogging(builder =>
{
    builder.AddApplicationInsights("ikey");
    // Captures Information-level traces and above:
    builder.AddFilter<Microsoft.Extensions.Logging.ApplicationInsights.ApplicationInsightsLoggerProvider>("", LogLevel.Information);
});
For complete information check this SO thread.

EventHubConsumerClient Apache Qpid memory leak?

I am reading events from an Azure EventHub cluster synchronously via the receiveFromPartition method on the EventHubConsumerClient class.
I create the client once like so:
EventHubConsumerClient eventHubConsumerClient = new EventHubClientBuilder()
    .connectionString(eventHubConnectionString)
    .consumerGroup(consumerGroup)
    .buildConsumerClient();
I then just use a ScheduledExecutorService to retrieve events every 1.5s via:
IterableStream<PartitionEvent> receivedEvents =
    eventHubConsumerClient.receiveFromPartition(partitionId, 1, eventPosition);
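For context, the full polling setup looks roughly like this (a sketch of the logic described above; process(...) is a placeholder for the actual event handling):
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> {
    // Pull at most one event from the partition, starting at eventPosition.
    IterableStream<PartitionEvent> receivedEvents =
        eventHubConsumerClient.receiveFromPartition(partitionId, 1, eventPosition);
    receivedEvents.forEach(event -> process(event)); // process(...) is hypothetical
}, 0, 1500, TimeUnit.MILLISECONDS);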
The equivalent logic in V3 of the SDK worked fine (using PartitionReceivers), but now I am seeing OOMs in my JVM.
Running a profiler against a local version of the logic, I see the majority of the heap (90%, mainly in the old generation) is taken up by byte[]s referenced by org.apache.qpid.proton.codec.CompositeReadableBuffer. This pattern is not present when I profile the V3 logic.
What could be causing a leak of the AMQP messages here? Do I need to interact with the SDK further, for example close a connection that I'm not aware of after each call?
Any advice would be very appreciated, thanks!
Turns out it was a bug, solved here: https://github.com/Azure/azure-sdk-for-java/issues/13775

Stackdriver-trace on Google Cloud Run failing, while working fine on localhost

I have a Node server running on Google Cloud Run. Now I want to enable Stackdriver tracing. When I run the service locally, I am able to get the traces in GCP. However, when I run the service on Google Cloud Run, I am getting an error:
"@google-cloud/trace-agent ERROR TraceWriter#publish: Received error with status code 403 while publishing traces to cloudtrace.googleapis.com: Error: The request is missing a valid API key."
I made sure that the service account has the tracing agent role.
The first line in my app.js is:
require('@google-cloud/trace-agent').start();
Running locally, I am using a .env file containing:
GOOGLE_APPLICATION_CREDENTIALS=<path to credentials.json>
According to https://github.com/googleapis/cloud-trace-nodejs, these values are auto-detected if the application is running on Google Cloud Platform, so I don't have these credentials in the GCP image.
There are two challenges to using this library with Cloud Run:
Despite the note about auto-detection, Cloud Run is an exception. It is not yet auto-detected. This can be addressed for now with some explicit configuration.
Because Cloud Run services only have CPU resources while they are responding to a request, queued-up trace data may not be sent before CPU resources are withdrawn. This can be addressed for now by configuring the trace agent to flush ASAP:
const tracer = require('@google-cloud/trace-agent').start({
  serviceContext: {
    service: process.env.K_SERVICE || "unknown-service",
    version: process.env.K_REVISION || "unknown-revision"
  },
  flushDelaySeconds: 1,
});
On a quick review I couldn't see how to trigger the trace flush, but the shorter timeout should help avoid some delays in seeing the trace data appear in Stackdriver.
EDIT: While nice in theory, in practice there are still significant race conditions with CPU withdrawal. Filed https://github.com/googleapis/cloud-trace-nodejs/issues/1161 to see if we can find a more consistent solution.

How retry works in Azure Service Bus Java

I'm new to Service Bus and curious about RetryPolicy and how it works. As per the documentation, retries happen automatically for transient exceptions (MessagingExceptions, ServerBusy), and the default retry count is 3, but we can set our own custom retry policy using the RetryExponential class.
I want to see in the logs whether the RetryPolicy actually tried to connect again when an exception occurred.
How can I check this, and how can I replicate MessagingExceptions or ServerBusy exceptions so that I can see the logs? I'm using the Azure Service Bus Java SDK.
Can anyone help me understand this? Thanks in advance.
The Java SDK is open source, and searching for retryPolicy in these files shows how the underlying implementation uses it:
CoreMessageSender
CoreMessageReceiver
For example, here's the flow for CoreMessageSender when an error is thrown
When an error occurs and it is a ServiceBusException, a retry is scheduled - See line
After waiting, it ensures the link is still open and increments the retry count - See line
This continues and on successful completion it resets the count - See line
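For illustration, constructing a custom policy with RetryExponential looks roughly like this (a sketch against the track-1 azure-servicebus SDK; the constructor signature should be verified against your SDK version):
import java.time.Duration;
import com.microsoft.azure.servicebus.primitives.RetryExponential;
import com.microsoft.azure.servicebus.primitives.RetryPolicy;

// Exponential backoff: waits grow from 100 ms up to 30 s, with at most 5 retries.
RetryPolicy retryPolicy = new RetryExponential(
        Duration.ofMillis(100), // minimum backoff
        Duration.ofSeconds(30), // maximum backoff
        5,                      // maximum retry count
        "customRetryPolicy");   // policy name (shows up in the logs)
// The policy is then passed to the client on construction, e.g. via ClientSettings.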
As for logging, the Java SDK uses SLF4J, and you can surface the relevant logs with lines like these in your code (this assumes a log4j backend):
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
Logger.getLogger("com.microsoft.azure.servicebus").setLevel(Level.WARN);

Disable CloudWatch for AWS Kinesis at Spark Streaming

I would like to know if it's possible to disable the CloudWatch reporting for AWS Kinesis in Spark Streaming.
Here is the code (numStreams I get by using the AmazonKinesisClient API):
// Create the Kinesis DStreams
List<JavaDStream<byte[]>> streamsList = new ArrayList<>(numStreams);
for (int i = 0; i < numStreams; i++) {
    streamsList.add(
        KinesisUtils.createStream(jssc, kinesisAppName, streamName, endpointUrl, regionName,
            InitialPositionInStream.TRIM_HORIZON, kinesisCheckpointInterval,
            StorageLevel.MEMORY_AND_DISK_2(), accessesKey, secretKey)
    );
}
I looked through the API and I just couldn't find any reference to disabling CloudWatch in the Spark Streaming Kinesis integration.
Here are the warnings that I am trying to get rid of:
17/01/23 17:46:29 WARN CWPublisherRunnable: Could not publish 16 datums to CloudWatch
com.amazonaws.AmazonServiceException: User: arn:aws:iam:::user/Kinesis_Service is not authorized to perform: cloudwatch:PutMetricData (Service: AmazonCloudWatch; Status Code: 403; Error Code: AccessDenied; Request ID: *****)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1377)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:923)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:701)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:453)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:415)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:364)
at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.doInvoke(AmazonCloudWatchClient.java:984)
at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:954)
at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.putMetricData(AmazonCloudWatchClient.java:853)
at com.amazonaws.services.kinesis.metrics.impl.DefaultCWMetricsPublisher.publishMetrics(DefaultCWMetricsPublisher.java:63)
at com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable.runOnce(CWPublisherRunnable.java:144)
at com.amazonaws.services.kinesis.metrics.impl.CWPublisherRunnable.run(CWPublisherRunnable.java:90)
at java.lang.Thread.run(Unknown Source)
Preface: I know this is kind of an old question, but I just faced this, so I'm posting a solution for anyone who encounters this issue with Spark <= 2.3.3.
It is possible to disable CloudWatch metrics reporting at the KCL (Kinesis Client Library) level with the withMetrics methods when building the client.
Unfortunately, Spark's KinesisInputDStream does not expose a way to change this setting, and to make things worse, the default level is "DETAILED", which sends tens of metrics every 10 seconds.
The way I took in order to disable it is to provide invalid credentials to the cloudWatchCredentials method of KinesisInputDStream, i.e.:
.cloudWatchCredentials(SparkAWSCredentials.builder.basicCredentials("DISABLED", "DISABLED").build())
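Spelled out in Java, the credentials part would look roughly like this (a sketch; the builder comes from spark-streaming-kinesis-asl and its method names should be verified against your Spark version):
import org.apache.spark.streaming.kinesis.SparkAWSCredentials;

// Deliberately invalid static credentials: CloudWatch publishing fails with an
// authorization error instead of sending metrics, effectively disabling it.
SparkAWSCredentials disabledCloudWatchCreds = SparkAWSCredentials.builder()
        .basicCredentials("DISABLED", "DISABLED")
        .build();
// Pass this to the stream builder via .cloudWatchCredentials(disabledCloudWatchCreds).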
Then comes the issue of CloudWatchAsyncClient logging a warning at each tick, which I disabled by setting the following in Spark's log4j.properties config:
# Set Kinesis metrics logging to ERROR - since we intentionally provide
# wrong credentials in order to disable CloudWatch reporting, and the resulting
# bad-credential warnings are logged at WARN level, this keeps only the errors.
log4j.logger.com.amazonaws.services.kinesis.metrics=ERROR
This will suppress the warnings for the metrics package only (such as the one you mentioned) but will not suppress errors, in case those are needed.
This is nowhere close to an ideal solution, but it allowed us to deploy while keeping the existing Spark version.
Next step: open a ticket with Spark so they can hopefully allow us to disable it in future versions.
Edit - created: https://issues.apache.org/jira/browse/SPARK-27420 for tracking
