I've a couple of questions regarding the (excellent) CouchDB .NET client MyCouch:
Is there some built-in retry policy in case of "transient" failure (like the server responding 503)?
Should instances of MyCouchClient or MyCouchStore be cached to be reused? Right now I'm creating one for each incoming request, but I'm wondering if that incurs a performance penalty.
I would like to customize the configuration of Json.NET as used by MyCouch, like adding a new StringEnumConverter { CamelCaseText = true } to the list of Converters. Is there a way to achieve that through the API?
Thanks
1) There's no magic in the MyCouchClient; it's just simple requests and responses. For the MyCouchStore, however, I would gladly accept pull requests adding options for retries or e.g. auto-batching queries.
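In the meantime, a small caller-side retry helper is easy to write. This is only a sketch, not part of MyCouch: the 503 check and the back-off delays are assumptions you would tune, and the commented usage assumes the MyCouch response types expose a StatusCode, so verify the member names against the actual response classes.

using System;
using System.Threading.Tasks;

public static class TransientRetry
{
    // Retries a request a few times when the response looks transient (e.g. HTTP 503).
    public static async Task<TResponse> SendWithRetriesAsync<TResponse>(
        Func<Task<TResponse>> send,
        Func<TResponse, bool> isTransient,
        int maxAttempts = 3)
    {
        TResponse response = default;
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            response = await send().ConfigureAwait(false);
            if (!isTransient(response) || attempt == maxAttempts)
                break;

            // Simple linear back-off; tune to taste.
            await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt)).ConfigureAwait(false);
        }
        return response;
    }
}

// Hypothetical usage against a MyCouchClient (check the MyCouch response members):
// var response = await TransientRetry.SendWithRetriesAsync(
//     () => client.Documents.GetAsync(id),
//     r => (int)r.StatusCode == 503);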
2) Here are some links to information that should help you decide between per-request and per-application instances:
Is async HttpClient from .Net 4.5 a bad choice for intensive load applications?
http://msdn.microsoft.com/en-us/library/system.net.servicepointmanager.defaultconnectionlimit.aspx
So using one instance per application would probably require reconfiguring the connection limit.
I have this centralized in my IoC config, and by default I'm not doing per application. The first "connection" can take a bit longer, but the second one has been measured down to milliseconds against Cloudant by other users, so that should in general not be an issue.
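For completeness, the reconfiguration mentioned above is a one-liner on the classic ServicePoint stack; do it once at startup, before any clients are created (the value 100 is only an illustrative number, not a recommendation):

using System.Net;

// The default is 2 concurrent connections per endpoint for non-ASP.NET clients.
ServicePointManager.DefaultConnectionLimit = 100;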
3) You can configure the serializer by providing a custom MyCouchClientBootstrapper and a custom implementation of: https://github.com/danielwertheim/mycouch/blob/master/source/projects/MyCouch.Net45/MyCouchClientBootstrapper.cs#L170
And you also have to extend this guy: https://github.com/danielwertheim/mycouch/blob/master/source/projects/MyCouch.Net45/Serialization/SerializationConfiguration.cs#L9
Feel free to suggest changes that make this process simpler for you.
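For the StringEnumConverter case from the question, the Json.NET side looks like the sketch below; wiring those settings into MyCouch still goes through the custom bootstrapper and SerializationConfiguration linked above, and the exact extension point to override is not shown here.

using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

// Plain Json.NET: the converter the question asks for.
var settings = new JsonSerializerSettings();
settings.Converters.Add(new StringEnumConverter { CamelCaseText = true });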
Related
My application sends requests to the Azure Machine Learning REST API in order to invoke a batch endpoint and start scoring jobs, as described here. It works well for a small number of requests, but if the app sends many concurrent requests the REST API sometimes responds with status code 429 "TooManyRequests" and the message "Received too many requests in a short amount of time. Retry again after 1 seconds.". For example, it happened after sending 77 requests at once.
The message is pretty clear, and the best solution I can think of is to throttle outgoing requests, that is, making sure the app doesn't exceed the limits when it sends concurrent requests. But the problem is that I don't know what the request limits for the Azure Machine Learning REST API are. Looking through the Microsoft documentation, I could only find this article, which provides limits for Managed online endpoints, whereas I'm looking for Batch endpoints.
I would really appreciate it if someone could help me find the Azure ML REST API request limits or suggest a better solution. Thanks.
UPDATE 20 Jun 2022:
I couldn't find out how many concurrent requests are allowed by Azure Machine Learning batch endpoints. So I ended up with a limit of 10 concurrent outgoing requests, which solved the "TooManyRequests" problem. To throttle the requests I used SemaphoreSlim, as described here.
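For reference, the throttling pattern looks roughly like this (a sketch; SendRequestAsync stands in for the actual call that invokes the batch endpoint, and 10 is simply the limit that worked for me):

using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Allow at most 10 outgoing requests in flight at any time.
var throttler = new SemaphoreSlim(initialCount: 10);

var tasks = requests.Select(async request =>
{
    await throttler.WaitAsync();
    try
    {
        await SendRequestAsync(request); // placeholder for the batch endpoint invocation
    }
    finally
    {
        throttler.Release();
    }
});

await Task.WhenAll(tasks);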
According to the documentation, it is possible to request an increase of the quota, which is the way to solve the request-limit-exceeded issue. Regarding the batch quota limit, here is the document provided by Microsoft.
Change the quota values as described in that document.
Document Credit: prkannap and team
Alternatively, you could reduce the number of requests by storing multiple input files in a folder and invoking the job with the folder path.
If you want further assistance, please file a support ticket and a customer support engineer will assist you.
The example Scott Hanselman gives on his blog for using Parallel.ForEachAsync in .NET 6 specifies the value of MaxDegreeOfParallelism as 3.
However, if unspecified, the default MaxDegreeOfParallelism is ProcessorCount. This makes sense for CPU bound work, but for asynchronous I/O bound work, it seems like a poor choice for a default value.
If I'm doing something like in Scott's example below, but I want to do it as fast as possible, how should I determine the best value to use for MaxDegreeOfParallelism? Is it reasonable to specify this as int.MaxValue and just assume the TaskScheduler will do the most sensible thing when it comes to scheduling the work on the ThreadPool?
ParallelOptions parallelOptions = new()
{
    MaxDegreeOfParallelism = 3
};

await Parallel.ForEachAsync(userHandlers, parallelOptions, async (uri, token) =>
{
    var user = await client.GetFromJsonAsync<GitHubUser>(uri, token);

    Console.WriteLine($"Name: {user.Name}\nBio: {user.Bio}\n");
});
IMHO, the only way to get the number is... testing.
For HTTP work there are two parties involved:
your code
the remote side that does the work for you.
Your "fast" may be too fast for the remote side. This can be because of resources and/or throttling.
Note on the default
The default - which results in ProcessorCount - will depend on the machine the code runs on, and if you run your code in the cloud this number may be different from what's on your beefy laptop.
This can lead to unexpected differences between non-prod and prod environments.
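One way to avoid that surprise for I/O-bound work is to log the machine's count and pin the value explicitly (the 16 below is just a placeholder you would derive from configuration or load testing):

// Make the environment difference visible and pick the degree deliberately.
Console.WriteLine($"ProcessorCount on this machine: {Environment.ProcessorCount}");

var parallelOptions = new ParallelOptions
{
    MaxDegreeOfParallelism = 16 // placeholder: derive from config or load testing
};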
GitHub specific
GitHub.com has a limit of 5,000 requests per hour for non-enterprise users (from here), and there is also this:
In order to provide quality service on GitHub, additional rate limits may apply to some actions when using the API. For example, using the API to rapidly create content, poll aggressively instead of using webhooks, make multiple concurrent requests, or repeatedly request data that is computationally expensive may result in secondary rate limiting.
In Best practices for integrators we can read
Dealing with secondary rate limits
Secondary rate limits are another way we ensure the API's availability. To avoid hitting this limit, you should ensure your application follows the guidelines below.
...
Make requests for a single user or client ID serially. Do not make requests for a single user or client ID concurrently.
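Putting the "testing" advice into practice, one crude approach is to time the same batch of requests at a few different degrees of parallelism and watch both the elapsed time and the failure count; a sketch (uris and the HttpClient named client are assumed to exist, as in the question):

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

foreach (var degree in new[] { 1, 2, 4, 8, 16 })
{
    var stopwatch = Stopwatch.StartNew();
    var failures = 0;

    await Parallel.ForEachAsync(
        uris,
        new ParallelOptions { MaxDegreeOfParallelism = degree },
        async (uri, token) =>
        {
            var response = await client.GetAsync(uri, token);
            if (!response.IsSuccessStatusCode)
                Interlocked.Increment(ref failures);
        });

    Console.WriteLine($"degree={degree}: {stopwatch.Elapsed}, failures={failures}");
}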
How can I control the usage of APIs by consumers during a given period in an Azure Function app HTTP trigger? Simply put, how can I throttle requests once they exceed the request limit? Please suggest a solution without using Azure API Gateway.
The only control you have over host creation in Azure Functions is an obscure application setting: WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT. This implies that you can control the number of hosts that are generated, though Microsoft claims that “it’s not completely foolproof” and “is not fully supported”.
From my own experience, it only throttles host creation effectively if you set the value to something pretty low, i.e. less than 50. At larger values its impact is pretty limited. It has been implied that this feature will be worked on in the future, but the corresponding issue has been open on GitHub with no update since July 2017.
For more details, you could refer to this article.
You can use the initialVisibilityDelay parameter of the CloudQueue.AddMessage method, as outlined in this blog post.
This will throttle the message to prevent the 429 error if implemented correctly using the leaky bucket algorithm or equivalent.
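A minimal sketch with the older WindowsAzure.Storage SDK (queue is assumed to be an already-initialised CloudQueue; the 30-second delay is only an example, and a leaky-bucket approach would grow the delay per enqueued message so they drain at a fixed rate):

using System;
using Microsoft.WindowsAzure.Storage.Queue;

// The message only becomes visible to consumers after the delay, spreading out the load.
var message = new CloudQueueMessage("payload");
queue.AddMessage(message, timeToLive: null, initialVisibilityDelay: TimeSpan.FromSeconds(30));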
I'm building out an API using Hapi.js. Some of my code is pushing small amounts of data to the API. The issue seems to be that the pusher code is swamping the API and I'm getting ECONNRESET errors -- which means messages are getting lost. I'm planning on installing a rate-limiter in the pusher code, probably node-rate-limiter (link).
The question is, what should I set that limit to? I want to max out performance for this app, so I could easily be attempting to send in thousands of messages per hour. The data just gets dumped into redis, so I doubt the code in the API will be an issue but I still need to get an idea of what kind of message rate Hapi is comfortable with. Do I need to just start with something reasonable and see how it goes? Maybe 1 message per 10 milliseconds?
const Hapi = require('hapi');

const server = new Hapi.Server();
server.connection({
    port: config.port,
    routes: {
        cors: {
            origin: ['*']
        }
    }
});

server.route({ method: 'POST', path: '/update/{id}', ... });
There is no generic answer to how many requests per second you can process. It depends upon many things in your configuration and code such as:
Type and performance of server hardware
The amount of CPU time an average request uses
Whether your requests are CPU or disk bound. If disk bound, then it depends a lot on your database and disk performance.
Whether you implement clustering to use multiple cores (if CPU bound)
Whether you're on shared infrastructure or not
The max number of incoming connections your server is configured for
So, there is no absolute answer here that works for everyone. If you don't have some sort of design problem that is artificially limiting your concurrency, then the best way to discover what your server can actually handle is to build a test engine and test it. Find where and how it fails and either fix those issues to extend the scalability further or implement protections to avoid hitting that limit.
Note: When a public API makes rate limiting choices, it is typically done on a per-client basis and the limit is set to a value that seems to be a little above what a reasonable client would be doing. This is more to allow fair use of the server by many clients, so that one single client does not consume too much of the overall resource. If issuing thousands of small requests from a single client is not considered "good practice" in using your API, then you can just pick a number that is much smaller than that for a per-client limit.
Note: You may also want to make it easier for clients by having your API let them upload multiple messages in one API request rather than lots of API requests.
In my Spring Integration application I have several stored-proc-outbound-gateway elements. I would like to log how much time each call is taking; any help would be appreciated.
I would ideally like to be able to enable/disable logging of the parameters used, the time taken, and the total rows retrieved (returning-resultset) for monitoring and performance-tuning purposes.
Thanks
You can add a ChannelInterceptor (subclass of ChannelInterceptorAdapter) to the request channel which will give you raw timing (preSend/postSend), but the time will include any processing downstream of the gateway (on direct channels).
Since you want to examine the results too, you could start a timer (e.g. Spring StopWatch) in the interceptor (preSend) on the request channel and stop the timer in an interceptor on the reply channel. If you use the same interceptor bean, you can store the timer in a ThreadLocal.
You can turn on/off collection using a boolean property on the interceptor.
Alternatively, you can add a custom advice to the gateway.
EDIT
The advice is probably the best approach because with a ThreadLocal you will need to add code to the first interceptor to handle failures and clean up. With an around advice, the timer would just be a local method variable.