EF6 garbage collection increases unmanaged memory instead of releasing it

I have a .NET Web API executing tasks in a background queue, deployed on a Kubernetes cluster. The pods keep getting evicted because of the memory threshold.
private async Task BackgroundProcessing(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        var task = await _taskQueue.DequeueAsync(stoppingToken);
        var scope = _services.CreateScope();
        await task(scope.ServiceProvider, stoppingToken);
        scope.Dispose();
    }
}
My application seems to run "normally", but when garbage collection is done, it seems to increase the unmanaged memory instead of releasing it.
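One detail worth checking in the loop above: the scope is only disposed on the success path, so if a dequeued task throws, its scoped services (EF6 contexts included) are never released. A minimal sketch, assuming the asker's _taskQueue and _services fields, that guarantees disposal either way:

```csharp
// Sketch: wrap the scope in a using block so Dispose() runs even when
// the dequeued task throws, instead of only on the success path.
private async Task BackgroundProcessing(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        var task = await _taskQueue.DequeueAsync(stoppingToken);
        using (var scope = _services.CreateScope())
        {
            // scope and everything resolved from it is disposed at the end
            // of each iteration, whether or not the task faults
            await task(scope.ServiceProvider, stoppingToken);
        }
    }
}
```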

Azure CosmosDB: Bulk deletion using SDK

I want to delete 20-30k items in bulk. Currently I am using the method below to delete these items, but it takes 1-2 minutes.
private async Task DeleteAllExistingSubscriptions(string userUUId)
{
    var subscriptions = await _repository
        .GetItemsAsync(x => x.DistributionUserIds.Contains(userUUId), o => o.PayerNumber);
    if (subscriptions.Any())
    {
        List<Task> bulkOperations = new List<Task>();
        foreach (var subscription in subscriptions)
        {
            bulkOperations.Add(_repository
                .DeleteItemAsync(subscription.Id.ToString(), subscription.PayerNumber)
                .CaptureOperationResponse(subscription));
        }
        await Task.WhenAll(bulkOperations);
    }
}
Cosmos client: as shown below, I have already set AllowBulkExecution = true.
private static void RegisterCosmosClient(IServiceCollection serviceCollection, IConfiguration configuration)
{
    string cosmosDbEndpoint = configuration["CosmoDbEndpoint"];
    Ensure.ConditionIsMet(cosmosDbEndpoint.IsNotNullOrEmpty(),
        () => new InvalidOperationException("Unable to locate configured CosmosDB endpoint"));
    var cosmosDbAuthKey = configuration["CosmoDbAuthkey"];
    Ensure.ConditionIsMet(cosmosDbAuthKey.IsNotNullOrEmpty(),
        () => new InvalidOperationException("Unable to locate configured CosmosDB auth key"));
    serviceCollection.AddSingleton(s => new CosmosClient(cosmosDbEndpoint, cosmosDbAuthKey,
        new CosmosClientOptions { AllowBulkExecution = true }));
}
Is there any way to delete these items in a batch with CosmosDB SDK 3.0 in less time?
Check the metrics to see whether the volume of data you are sending is being throttled because your provisioned throughput is not enough.
Bulk just improves the client-side aspect of sending that data by optimizing how it flows from your machine to the account, but if your container is not provisioned to handle that volume of operations, then operations will get throttled and the time it takes to complete will be longer.
As with any data flow scenario, the bottlenecks are:
The source environment cannot process the data as fast as you want, which would show as a bottleneck/spike on the machine's CPU (processing more data would require more CPU).
The network's bandwidth has limitations. In some cases the network limits the amount of data it can transfer, or even the number of connections it can open. If the machine running the code has such limitations (for example, Azure VMs have SNAT, Azure App Service has TCP limits) and you are hitting them, new connections might get delayed and thus increase latency.
The destination has limits in the amount of operations it can process (in the form of provisioned throughput in this case).
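One way to confirm the third bottleneck is to count throttled responses on the client. A minimal sketch, assuming the Microsoft.Azure.Cosmos v3 SDK; BulkDeleteDiagnostics and DeleteAndCountThrottlesAsync are hypothetical names, not part of the SDK:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public class BulkDeleteDiagnostics
{
    private int _throttled;

    // Issues the deletes concurrently (a bulk-enabled client batches them
    // under the hood) and counts 429 "request rate too large" responses.
    public async Task<int> DeleteAndCountThrottlesAsync(
        Container container, IReadOnlyList<(string Id, string PartitionKey)> items)
    {
        _throttled = 0;
        await Task.WhenAll(items.Select(i => DeleteOneAsync(container, i.Id, i.PartitionKey)));
        return _throttled; // non-zero means provisioned RU/s is the limiting factor
    }

    private async Task DeleteOneAsync(Container container, string id, string pk)
    {
        try
        {
            await container.DeleteItemAsync<object>(id, new PartitionKey(pk));
        }
        catch (CosmosException ce) when ((int)ce.StatusCode == 429)
        {
            Interlocked.Increment(ref _throttled); // request was throttled
        }
    }
}
```

If this counter is consistently non-zero, raising the container's RU/s (or spreading deletes over time) will do more for total latency than any client-side change.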

Azure Batch refuses to run docker tasks on docker supporting nodes

I have an Azure Batch pool with nodes supporting Docker.
That is, the OS offer is MicrosoftWindowsServer WindowsServer 2016-Datacenter-with-Containers.
I create a task as recommended:
private static CloudTask CreateTask(string id, string commandLine)
{
    var autoUserSpec = new AutoUserSpecification(elevationLevel: ElevationLevel.Admin);
    var containerSettings = new TaskContainerSettings(_imageName);
    var task = new CloudTask(id, commandLine)
    {
        UserIdentity = new UserIdentity(autoUserSpec),
        ContainerSettings = containerSettings,
    };
    return task;
}
When the task is run, it completes with the error ContainerPoolNotSupported: "The compute node does not support container feature."
This does not make sense. When I connect to the node, I see Docker there; the image is preinstalled, so I can even run the container right away. The task finishes almost immediately, so Azure Batch likely just notices the container settings and throws for some reason.
Are there any workarounds? Google offers 0 references for the error name.
Without the context of how you created the pool, this answer is not definitive. But most likely you did not specify the ContainerConfiguration on VirtualMachineConfiguration of the CloudPool object.
Please see this guide for a tutorial on running container workloads on Azure Batch.
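For illustration, a sketch of what a container-enabled pool looks like with the Microsoft.Azure.Batch SDK; the pool id, VM size, and image name are placeholder values, and the exact ContainerConfiguration constructor varies between SDK versions:

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Batch;

static CloudPool CreateContainerPool(BatchClient batchClient)
{
    var imageReference = new ImageReference(
        publisher: "MicrosoftWindowsServer",
        offer: "WindowsServer",
        sku: "2016-Datacenter-with-Containers");

    var vmConfiguration = new VirtualMachineConfiguration(
        imageReference, nodeAgentSkuId: "batch.node.windows amd64")
    {
        // Without this, tasks that set TaskContainerSettings fail with
        // ContainerPoolNotSupported even though Docker exists on the node.
        ContainerConfiguration = new ContainerConfiguration
        {
            ContainerImageNames = new List<string> { "myregistry/myimage:latest" }
        }
    };

    var pool = batchClient.PoolOperations.CreatePool(
        poolId: "container-pool",
        virtualMachineSize: "STANDARD_D2_V3",
        virtualMachineConfiguration: vmConfiguration,
        targetDedicatedComputeNodes: 1);
    pool.Commit();
    return pool;
}
```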
What helped was to disregard TaskContainerSettings altogether, that is, to mimic an ordinary CloudTask but run docker in the task command line:
private static CloudTask CreateTask(string id, string commandLine)
{
    var autoUserSpec = new AutoUserSpecification(elevationLevel: ElevationLevel.Admin);
    var task = new CloudTask(id, $"docker run {_imageName} {commandLine}")
    {
        UserIdentity = new UserIdentity(autoUserSpec),
    };
    return task;
}
So this is truly a workaround until container support in Azure Batch becomes more stable.

Spark worker shutdown - how to free shared resources

The Spark Streaming programming guide recommends using a shared static resource (e.g. a connection pool) inside the worker code.
Example from the guide:
dstream.foreachRDD { rdd =>
  rdd.foreachPartition { partitionOfRecords =>
    // ConnectionPool is a static, lazily initialized pool of connections
    val connection = ConnectionPool.getConnection()
    partitionOfRecords.foreach(record => connection.send(record))
    ConnectionPool.returnConnection(connection) // return to the pool for future reuse
  }
}
What should I do when the static resource needs to be freed/closed before the executor shuts down? There is no obvious place to call its close() function. I tried a shutdown hook, but it doesn't seem to help.
Right now my worker process becomes a zombie: the shared resource (the HBase async client) creates a pool of non-daemon threads, which means the JVM hangs forever.
I am using the Spark Streaming graceful shutdown called on the driver:
streamingContext.stop(true, true);
EDIT:
It seems an issue already exists in the Spark JIRA dealing with the same problem:
https://issues.apache.org/jira/browse/SPARK-10911

Calling WCF Service Operation in multithreaded Console Application

I have the following application:
It's a Windows console .NET 3.0 application.
I'm creating 20 workloads and assigning them to the thread pool to process.
Each thread in the ThreadPool creates a WCF client and calls the service with a request created from its assigned workload.
Sometimes on production servers [12-core machines], I get the following exception:
"There was an error reflecting type 'xyz'" while invoking an operation using the WCF client. This starts appearing in all threads. After some time it suddenly disappears, then starts appearing again.
Pseudocode:
for (int i = 0; i < 20; i++)
{
    MultiThreadedProcess proc = new MultiThreadedProcess(someData[i]);
    ThreadPool.QueueUserWorkItem(proc.Callback, i);
}
In the MultiThreadedProcess class, I do something like this:
public void Callback(object index)
{
    MyServiceClient client = new MyServiceClient();
    MyServiceResponse response = client.SomeOperation(new MyServiceRequest(someData));
    client.Close();
    // Process response
}
Can anyone suggest some resolutions for this problem?
If you can, turn on diagnostics. This appears to be a serialization issue: there is a chance that certain data members/values cannot be deserialized properly for the operation call.
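Separately, the Callback shown above leaks the channel whenever the call or Close() throws, which can compound intermittent failures under load. A sketch of the standard close-or-abort pattern, reusing the asker's hypothetical MyServiceClient/SomeOperation names:

```csharp
using System;
using System.ServiceModel;

public void Callback(object index)
{
    var client = new MyServiceClient();
    try
    {
        var response = client.SomeOperation(new MyServiceRequest(someData));
        // process response...
        client.Close();
    }
    catch (CommunicationException)
    {
        client.Abort(); // channel is faulted; Close() would throw again
        throw;
    }
    catch (TimeoutException)
    {
        client.Abort();
        throw;
    }
}
```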

.NET 4.5 Increase WCF Client Calls Async?

I have a .NET 4.5 WCF client app that uses the async/await pattern to make large volumes of calls. My development machine is dual-proc with 8 GB RAM (production will be 5 CPUs with 8 GB RAM at Amazon AWS). The remote WCF service called by my code uses out and ref parameters on a web method that I need. My code instantiates a proxy client each time, writes any results to a public ConcurrentDictionary, and then returns null.
I ran Perfmon, watching the thread count on the system, and it goes between 28-30. It takes hours for my client to complete the volumes of calls that are made. Yes, hours. The remote service is backed by a big company, they have many servers to receive my WCF calls, so the more calls I can throw at them, the better.
I think things are actually still happening synchronously, even though the method that makes the WCF call is decorated with "async", because the proxy method cannot be awaited. Is that true?
My code looks like this:
private async void CallMe()
{
    Console.WriteLine(DateTime.Now);
    var workTasks = this.AnotherConcurrentDict
        .Select(oneB => GetData(etcetcetc))
        .Cast<Task>()
        .ToList();
    await Task.WhenAll(workTasks);
}
private async Task<WorkingBits> GetData(etcetcetc)
{
    var commClient = new RemoteClient();
    var cpResponse = new GetPackage();
    var responseInfo = commClient.GetData(name, password, ref cpResponse.aproperty,
        filterid, out cpResponse.Identifiers);
    foreach (var onething in cpResponse.Identifiers)
    {
        // add to the ConcurrentDictionary
    }
    return null; // I already wrote to the ConcurrentDictionary so no need to return anything
}
responseInfo is not awaitable because the WCF call has ref and out parameters.
I was thinking the way to speed this up is not to put async/await in this method, but instead to create a wrapper method where I can make things async/await; however, I am not sure that is the smartest/safest way to do it.
What is a smart way to get more outbound calls to the service (expand IO completion thread pool, trick calls into running in the background so Task.WhenAll can complete quicker)?
Thanks for all ideas/samples/pointers. I am hitting a bottleneck somewhere.
1) Make sure you're really calling it asynchronously, rather than just blocking on the calls. Code samples would help here.
2) You may need to do this:
ServicePointManager.DefaultConnectionLimit = 100;
By default it only allows 2 simultaneous connections to the same server.
3) Make sure you dispose the proxy object after the call is complete so you're not tying up resources.
If you're doing things asynchronously the threadpool size shouldn't be a bottleneck. To get a better idea of what kind of problem you're having, you can use Interlocked.Increment and Interlocked.Decrement to track the number of pending calls and see if it's being limited somewhere.
You could also substitute your real call with a call to a very simple method that you know will not have any bottlenecks, to see if the problem is in the client or server.
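Putting points 1) and 2) together: since the proxy method has ref/out parameters and no awaitable version, the blocking call can be offloaded with Task.Run so Task.WhenAll actually overlaps the calls, and the Interlocked suggestion gives a view of in-flight concurrency. A sketch with a hypothetical CallTracker helper:

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

class CallTracker
{
    private int _inFlight;

    // Wraps a blocking WCF proxy call; returns the in-flight count observed
    // at entry so you can see how much concurrency you actually achieve.
    public async Task<int> RunAsync(Action syncCall)
    {
        int observed = Interlocked.Increment(ref _inFlight);
        try
        {
            await Task.Run(syncCall); // offload the blocking call to the pool
            return observed;
        }
        finally
        {
            Interlocked.Decrement(ref _inFlight);
        }
    }
}

// At startup, before issuing calls:
// ServicePointManager.DefaultConnectionLimit = 100;
```

If the observed count never rises above a small number despite hundreds of pending tasks, the limit is client-side (connection limit or thread pool); if it rises but latency grows, the bottleneck is the service.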
