I have a .NET 4.5 WCF client app that uses the async/await pattern to make volumes of calls. My development machine is dual-proc with 8gb RAM (production will be 5 CPU with 8gb RAM at Amazon AWS) . The remote WCF service called by my code uses out and ref parameters on a web method that I need. My code instances a proxy client each time, writes any results to a public ConcurrentDictionary, and then returns null.
I ran Perfmon, watching the thread count on the system, and it goes between 28-30. It takes hours for my client to complete the volumes of calls that are made. Yes, hours. The remote service is backed by a big company, they have many servers to receive my WCF calls, so the more calls I can throw at them, the better.
I think that things are actually still happening synchronously, even though the method that makes the WCF call is decorated with "async" because the proxy method cannot have "await". Is that true?
My code looks like this:
async private void CallMe()
{
Console.WriteLine( DateTime.Now );
var workTasks = this.AnotherConcurrentDict.Select( oneB => GetData( etcetcetc ).Cast<Task>().ToList();
await Task.WhenAll( workTasks );
}
private async Task<WorkingBits> GetData(etcetcetc)
{
var commClient = new RemoteClient();
var cpResponse = new GetPackage();
var responseInfo = commClient.GetData( name, password , ref (cpResponse.aproperty), filterid , out cpResponse.Identifiers);
foreach (var onething in cpResponse.Identifiers)
{
// add to the ConcurrentDictionary
}
return null; // I already wrote to the ConcurrentDictionary so no need to return anything
responseInfo is not awaitable beacuse the WCF call has ref and out parameters.
I was thinking that way to speed this up is not to put async/await in this method, but instead create a wrapper method where I can make things await/async, but I am not that is the smartest/safest way to work it.
What is a smart way to get more outbound calls to the service (expand IO completion thread pool, trick calls into running in the background so Task.WhenAll can complete quicker)?
Thanks for all ideas/samples/pointers. I am hitting a bottleneck somewhere.
1) Make sure you're really calling it asynchronously, rather than just blocking on the calls. Code samples would help here.
2) You may need to do this:
ServicePointManager.DefaultConnectionLimit = 100;
By default it only allows 2 simultaneous connections to the same server.
3) Make sure you dispose the proxy object after the call is complete so you're not tying up resources.
If you're doing things asynchronously the threadpool size shouldn't be a bottleneck. To get a better idea of what kind of problem you're having, you can use Interlocked.Increment and Interlocked.Decrement to track the number of pending calls and see if it's being limited somewhere.
You could also substitute your real call with a call to a very simple method that you know will not have any bottlenecks, to see if the problem is in the client or server.
Related
I have a Function app in Azure that is triggered when an item is put on a queue. It looks something like this (greatly simplified):
public static async Task Run(string myQueueItem, TraceWriter log)
{
using (var client = new HttpClient())
{
client.BaseAddress = new Uri(Config.APIUri);
client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
StringContent httpContent = new StringContent(myQueueItem, Encoding.UTF8, "application/json");
HttpResponseMessage response = await client.PostAsync("/api/devices/data", httpContent);
response.EnsureSuccessStatusCode();
string json = await response.Content.ReadAsStringAsync();
ApiResponse apiResponse = JsonConvert.DeserializeObject<ApiResponse>(json);
log.Info($"Activity data successfully sent to platform in {apiResponse.elapsed}ms. Tracking number: {apiResponse.tracking}");
}
}
This all works great and runs pretty well. Every time an item is put on the queue, we send the data to some API on our side and log the response. Cool.
The problem happens when there's a big spike in "the thing that generates queue messages" and a lot of items are put on the queue at once. This tends to happen around 1,000 - 1,500 items in a minute. The error log will have something like this:
2017-02-14T01:45:31.692 mscorlib: Exception while executing function:
Functions.SendToLimeade. f-SendToLimeade__-1078179529: An error
occurred while sending the request. System: Unable to connect to the
remote server. System: Only one usage of each socket address
(protocol/network address/port) is normally permitted
123.123.123.123:443.
At first, I thought this was an issue with the Azure Function app running out of local sockets, as illustrated here. However, then I noticed the IP address. The IP address 123.123.123.123 (of course changed for this example) is our IP address, the one that the HttpClient is posting to. So, now I'm wondering if it is our servers running out of sockets to handle these requests.
Either way, we have a scaling issue going on here. I'm trying to figure out the best way to solve it.
Some ideas:
If it's a local socket limitation, the article above has an example of increasing the local port range using Req.ServicePoint.BindIPEndPointDelegate. This seems promising, but what do you do when you truly need to scale? I don't want this problem coming back in 2 years.
If it's a remote limitation, it looks like I can control how many messages the Functions runtime will process at once. There's an interesting article here that says you can set serviceBus.maxConcurrentCalls to 1 and only a single message will be processed at once. Maybe I could set this to a relatively low number. Now, at some point our queue will be filling up faster than we can process them, but at that point the answer is adding more servers on our end.
Multiple Azure Functions apps? What happens if I have more than one Azure Functions app and they all trigger on the same queue? Is Azure smart enough to divvy up the work among the Function apps and I could have an army of machines processing my queue, which could be scaled up or down as needed?
I've also come across keep-alives. It seems to me if I could somehow keep my socket open as queue messages were flooding in, it could perhaps help greatly. Is this possible, and any tips on how I'd go about doing this?
Any insight on a recommended (scalable!) design for this sort of system would be greatly appreciated!
I think the code error is because of: using (var client = new HttpClient())
Quoted from Improper instantiation antipattern:
this technique is not scalable. A new HttpClient object is created for
each user request. Under heavy load, the web server may exhaust the
number of available sockets.
I think I've figured out a solution for this. I've been running these changes for the past 3 hours 6 hours, and I've had zero socket errors. Before I would get these errors in large batches every 30 minutes or so.
First, I added a new class to manage the HttpClient.
public static class Connection
{
public static HttpClient Client { get; private set; }
static Connection()
{
Client = new HttpClient();
Client.BaseAddress = new Uri(Config.APIUri);
Client.DefaultRequestHeaders.Add("Connection", "Keep-Alive");
Client.DefaultRequestHeaders.Add("Keep-Alive", "timeout=600");
Client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}
}
Now, we have a static instance of HttpClient that we use for every call to the function. From my research, keeping HttpClient instances around for as long as possible is highly recommended, everything is thread safe, and HttpClient will queue up requests and optimize requests to the same host. Notice I also set the Keep-Alive headers (I think this is the default, but I figured I'll be implicit).
In my function, I just grab the static HttpClient instance like:
var client = Connection.Client;
StringContent httpContent = new StringContent(myQueueItem, Encoding.UTF8, "application/json");
HttpResponseMessage response = await client.PostAsync("/api/devices/data", httpContent);
response.EnsureSuccessStatusCode();
I haven't really done any in-depth analysis of what's happening at the socket level (I'll have to ask our IT guys if they're able to see this traffic on the load balancer), but I'm hoping it just keeps a single socket open to our server and makes a bunch of HTTP calls as the queue items are processed. Anyway, whatever it's doing seems to be working. Maybe someone has some thoughts on how to improve.
If you use consumption plan instead of Functions on a dedicated web app, #3 more or less occurs out of the box. Functions will detect that you have a large queue of messages and will add instances until queue length stabilizes.
maxConcurrentCalls only applies per instance, allowing you to limit per-instance concurrency. Basically, your processing rate is maxConcurrentCalls * instanceCount.
The only way to control global throughput would be to use Functions on dedicated web apps of the size you choose. Each app will poll the queue and grab work as necessary.
The best scaling solution would improve the load balancing on 123.123.123.123 so that it can handle any number of requests from Functions scaling up/down to meet queue pressure.
Keep alive afaik is useful for persistent connections, but function executions aren't viewed as a persistent connection. In the future we are trying to add 'bring your own binding' to Functions, which would allow you to implement connection pooling if you liked.
I know the question was answered long ago, but in the mean time Microsoft have documented the anti-pattern that you were using.
Improper Instantiation antipattern
We are exposing our current logic/business layer using WebAPI. As per my understanding, if we want to keep our self safe from thread starvation for the requests, we should make Async WebAPI controller, so a large number of concurrent request can make up.
I do understand that as the underlying service/business layer is synchronous, so there will be no performance gain. We are just aiming for a large number of concurrent requests to pass through.
Below is the code that i am using:
public async Task<IHttpActionResult> Get()
{
var result = await Task.Run(() => Service.GetAllCompanies()); //existing business layer
return Ok(result);
}
Wrapping underlying layer in a Task, is this good to proceed with and achieve the goal.
My understanding is that if your await-ed methods' implementations aren't async, then you are not really accomplishing what you think you are. If you have CPU-bound service methods, you're just releasing a thread from the request-handling pool and then spinning up a new one (from the same pool, mind you) when you Task.Run(); So you incur some overhead from that switch, but the ASP.NET thread pool is still doing the work, so you haven't achieved what you want.
If you service methods can be converted to be pure (or mostly pure) async code, then you would stand to benefit from working bottom up.
Reference: https://msdn.microsoft.com/en-us/magazine/dn802603.aspx
I'm using MVC4 ApiController to upload data to Azure Blob. Here is the sample code:
public Task PostAsync(int id)
{
return Task.Factory.StartNew(() =>
{
// CloudBlob.UploadFromStream(stream);
});
}
Does this code even make sense? I think ASP.NET is already processing the request in a worker thread, so running UploadFromStream in another thread doesn't seem to make sense since it now uses two threads to run this method (I assume the original worker thread is waiting for this UploadFromStream to finish?)
So my understanding is that async ApiController only makes sense if we are using some built-in async methods such as HttpClient.GetAsync or SqlCommand.ExecuteReaderAsync. Those methods probably use I/O Completion Ports internally so it can free up the thread while doing the actual work. So I should change the code to this?
public Task PostAsync(int id)
{
// only to show it's using the proper async version of the method.
return TaskFactory.FromAsync(BeginUploadFromStream, EndUploadFromStream...)
}
On the other hand, if all the work in the Post method is CPU/memory intensive, then the async version PostAsync will not help throughput of requests. It might be better to just use the regular "public void Post(int id)" method, right?
I know it's a lot questions. Hopefully it will clarify my understanding of async usage in the ASP.NET MVC. Thanks.
Yes, most of what you say is correct. Even down to the details with completion ports and such.
Here is a tiny error:
I assume the original worker thread is waiting for this UploadFromStream to finish?
Only your task thread is running. You're using the async pipeline after all. It does not wait for the task to finish, it just hooks up a continuation. (Just like with HttpClient.GetAsync).
I am trying solve this problem. I have WCF service. Client can call web method from this service which only "fire" another method (this method only write data to database) in another thread.
Code is here:
//this method will write data to database
public void WriteToDb()
{
}
//this web method will call only mehod WriteToDb() in another thread
public void SomeWebMethod()
{
new Task(WriteToDb).Start();
}
Problem is that in same time can web method call 5 clients. This cause that method WriteToDb is called 5 times in 5 thread.
In all 5 cases method WriteToDb will use same data.
My aim is achieve this behavior. 5 clients called web method SomeWebMethod. Method WriteToDb will run in 5 thread.
But I would like execute first thread, then second thread ....etc and on the end 5th thread.
I don’t want run method WriteToDb in same time in 5 thread.
So maybe I can use lock.
{
private object locker = new object();
//this method will write data to database
public void WriteToDb()
{
lock(locker)
{
//write to DB
}
}
I am not sure because .net assembly is host on app domain a app domain is host on win process. I woud like to avoid deadlocks.
What happens if I have a machine with 6 CPU? Use mutex instead lock ?
Thank you for help...
I'm not particulary sure what you are writing to DB, but your question is loosely coupled with WCF to be frank, try to read CLR via C# on multithreading etc.
Also regarding WCF, you can setup how your service object is created upon requests, ie per call, per session or singleton, and for later use specify if it's methods will stuck in queue or will be called on object concurrently.
So depending on choosing architecture you can either relay on WCF ability to host single object which will have logic you described or you can go the way tried.
Links
http://msdn.microsoft.com/en-us/magazine/cc163590.aspx
http://msdn.microsoft.com/en-us/library/ms731193.aspx
A lock is fine here, but you should make your locker object static so the same object instance is used in the lock every time.
It does not matter how many cores you have - if you hold the lock on an object then any other threads that attempt to acquire the lock will wait until the lock is released.
A deadlock can only occur if you are acquiring multiple locks in different orders in different threads.
I suggest you read Joe Albahari's excellent free ebook
I have a very long running query that takes too long to keep my client connected. I want to make a call into my DomainService, create a new worker thread, then return from the service so that my client can then begin polling to see if the long running query is complete.
The problem I am running into is that since my calling thread is exiting right away, I am getting exceptions thrown when my worker tries to access any entities since the ObjectContext gets disposed when the original thread ends.
Here is how I create the new context and call from my Silverlight client:
MyDomainContext context = new MyDomainContext();
context.SearchAndStore(_myParm, SearchQuery,
p => {
if (p.HasError) { // Do some work and return to start
} // polling the server for completion...
}, null);
The entry method on the server:
[Invoke]
public int SearchAndStore(object parm)
{
Thread t = new Thread(new ParameterizedThreadStart(SearchThread));
t.Start(parms);
return 0;
// Once this method returns, I get ObjectContext already Disposed Exceptions
}
Here is the WorkerProc method that gets called with the new Thread. As soon as I try to iterate through my query1 object, I get the ObjectContext already Disposed exception.
private void WorkerProc(object o)
{
HashSet<long> excludeList = new HashSet<long>();
var query1 = from doc in this.ObjectContext.Documents
join filters in this.ObjectContext.AppliedGlobalFilters
.Where(f => f.FilterId == 1)
on doc.FileExtension equals filters.FilterValue
select doc.FileId;
foreach (long fileId in query1) // Here occurs the exception because the
{ // Object Context is already disposed of.
excludeList.Add(fileId);
}
}
How can I prevent this from happening? Is there a way to create a new context for the new thread? I'm really stuck on this one.
Thanks.
Since you're using WCF RIA. I have to assume that you're implementing two parts:
A WCF Web Service
A Silverlight client which consumes the WCF Service.
So, this means that you have two applications. The service running on IIS, and the Silverlight running on the web browser. These applications have different life cycles.
The silverlight application starts living when it's loaded in the web page, and it dies when the page is closed (or an exception happens). On the other hand (at server side), the WCF Web Service life is quite sort. You application starts living when the service is requested and it dies once the request has finished.
In your case your the server request finishes when the SearchAndStore method finishes. Thus, when this particular method starts ,you create an Thread which starts running on background (in the server), and your method continues the execution, which is more likely to finishes in a couple of lines.
If I'm right, you don't need to do this. You can call your method without using a thread, in theory it does not matter if it takes awhile to respond. this is because the Silvelight application (on the client) won't be waiting. In Silverlight all the operations are asynchronous (this means that they're running in their own thread). Therefore, when you call the service method from the client, you only have to wait until the callback is invoked.
If it's really taking long time, you are more likely to look for a mechanism to keep the connection between your silverlight client and your web server alive for longer. I think by modifying the service configuration.
Here is a sample of what I'm saying:
https://github.com/hmadrigal/CodeSamples/tree/master/wcfria/SampleWebApplication01
In the sample you can see the different times on client and server side. You click the button and have to wait 30 seconds to receive a response from the server.
I hope this helps,
Best regards,
Herber