I have a question about whether it is possible in Azure to create an alert when an end-to-end transaction duration exceeds a certain value, like 30 seconds. I can clearly see in Application Insights when an end-to-end transaction takes longer than 30 seconds, but I can't figure out how to create an alert that notifies me whenever any transaction exceeds that threshold.
(Screenshot: end-to-end transaction example)
If anyone knows how to do this let me know.
Thanks in advance!
Microsoft states that the SLA for Application Insights is:
We guarantee that the data latency of the Application Insights Service will not exceed two hours 99.9% of the time.
https://azure.microsoft.com/en-us/support/legal/sla/application-insights/v1_0/
For the 0.1% of the time outside the SLA, when TelemetryClient.TrackEvent() executes in my code, is Microsoft guaranteeing that the event will definitely be published at some point (just not within 2 hours)? Or could the event be lost during that 0.1% of the time?
No, just calling TrackEvent doesn't guarantee it is published, for lots of reasons:
Sampling, at any level of the process. See https://learn.microsoft.com/en-us/azure/application-insights/app-insights-sampling?toc=/azure/azure-monitor/toc.json, but in general, if sampling is on, some percentage of your events may be merged together. There are various ways to find those events, but it is possible that if you call trackMessage 1000 times in a tight loop with the same content, an SDK might sample that and send a single event with itemCount set to 1000. (A configuration sketch follows this list.)
The content of the event could be invalid (too large a payload, exceeding thresholds for sizes of fields, too many custom properties, too many custom metrics, etc.).
The time of the event could be invalid: events too far in the past (more than 48 hours old?) or too far into the future (I'm not sure of the exact limit there, but some future time is allowed to account for clock skew/drift).
Caps - you could exceed the amount you're allowed to send per month. See https://learn.microsoft.com/en-us/azure/application-insights/app-insights-pricing, which at the time of this answer states:
The maximum cap is 1,000 GB/day unless you request a higher maximum for a high-traffic application.
Throttling - you could exceed the allowed number of events per second. See https://learn.microsoft.com/en-us/azure/application-insights/app-insights-pricing, which at the time of this answer states:
Throttling limits the data rate to 32,000 events per second, averaged over 1 minute per instrumentation key.
Network issues, etc. Calling track on the various SDKs doesn't guarantee the data is accepted or retried. Some of the SDKs attempt to retry, some do not.
Your application could shut down or crash between the call to track and the time the actual connection to Application Insights is created/completed.
Other random issues - service issues, downtime of other dependent services, etc. - that account for that 0.1% of missing data. I'm not sure there's any APM/telemetry service that guarantees it will accept and process 100% of the events you send.
(100% - 99.9% is not 0.01%, it is 0.1%. There's a 10x difference there.)
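Regarding the sampling bullet above, here's a minimal sketch of enabling fixed-rate sampling with the .NET Application Insights SDK, assuming the ServerTelemetryChannel package is installed; the 10% rate and event name are illustrative:

```csharp
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel;

// Minimal sketch: turn on fixed-rate sampling in the .NET SDK.
// With sampling enabled, most individual items are not sent; retained items
// record the sampling rate so counts can be reconstructed at query time.
var builder = TelemetryConfiguration.Active.TelemetryProcessorChainBuilder;
builder.UseSampling(10.0); // keep roughly 10% of telemetry (illustrative value)
builder.Build();

var client = new TelemetryClient();
for (var i = 0; i < 1000; i++)
{
    client.TrackEvent("SameContentEvent"); // many of these may never be sent
}
client.Flush();
```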
I have escalated this issue to the Application Insights team and will update you if I get any feedback.
As per my understanding, for the other 0.1% of the time outside the SLA, if there is some downtime, the data would be lost. Under any other condition, it would be published, just later than 2 hours.
Hope it helps.
I found this document explaining the resource/rate limits for the DocuSign API: https://developers.docusign.com/esign-rest-api/guides/resource-limits. However, I didn't get any errors related to resource limits during development and testing. Are these limits only active in the production environment? Is there a way to test these limits during development to make sure the application will work correctly in production? Is this document valid/up to date?
Update (I just also want to expand my question here too)
So there is only ONE type of limit - 1,000 calls per hour - and that's it? Or do I also need to wait 15 minutes between requests to the same URL?
If the second type of limit exists (multiple calls to the same URL within an interval of 15 minutes), does it apply only to GET requests? That is, can I still create/update envelopes multiple times within 15 minutes?
Also, if the second type of limit exists, can I test it in the sandbox environment somehow?
The limits are also active in the sandbox system.
Not all API methods are metered (yet).
To test, just make a lot of calls and you'll see that the limits are applied. E.g., make 1,000 status calls in an hour, or create 1,000 envelopes, and you'll be throttled.
Added
Re: only one type of limit?
Correct. Calls per hour is the only hard limit at this time. If 1,000 calls per hour is not enough for your application in general, or not enough for a specific user of your application, then there's a process for increasing the limit.
Re: the 15-minute limit on status calls per envelope.
This is the polling limit. An application is not well behaved if it polls DocuSign more than once every 15 minutes per envelope. In other words, you can poll for status on envelope A once per 15 minutes and also poll once every 15 minutes about envelope B.
The polling limit is monitored during your application's test as part of the Go Live process. It is also soft-monitored once the app is in production. In the future, the monitoring of polling for production apps will become more automated.
If you have a lot of envelopes that you're polling on then you might also run into the 1,000 calls per hour limit.
But there's no need to run into issues with polling: don't poll! Instead, set up a webhook via DocuSign Connect or eventNotification and we'll call you.
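For illustration, here's a rough sketch of attaching an eventNotification when creating an envelope, assuming the DocuSign C# eSign SDK (DocuSign.eSign); the webhook URL is a placeholder, and documents/recipients are omitted:

```csharp
using System.Collections.Generic;
using DocuSign.eSign.Model;

// Sketch: ask DocuSign to call your listener when the envelope changes state,
// instead of polling for status. The URL below is a placeholder for your own
// publicly reachable HTTPS endpoint.
var envelope = new EnvelopeDefinition
{
    EmailSubject = "Please sign this document",
    Status = "sent",
    EventNotification = new EventNotification
    {
        Url = "https://example.com/docusign-webhook",
        RequireAcknowledgment = "true", // DocuSign retries if your endpoint is down
        EnvelopeEvents = new List<EnvelopeEvent>
        {
            new EnvelopeEvent { EnvelopeEventStatusCode = "completed" },
            new EnvelopeEvent { EnvelopeEventStatusCode = "declined" }
        }
    }
    // ... documents and recipients omitted for brevity
};
```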
Re: limits for creating/updating an envelope and other methods
Only the (default) limit of 1,000 calls per hour affects non-polling methods.
E.g., asking for the status of an envelope's recipients, field values, general status, etc., over and over again is polling. Creating/updating envelopes can be done as often as you want (up to the default of 1,000 calls per hour).
If you want to create more than 1,000 envelopes per hour, we'll be happy to accommodate you. (And many of our larger customers do exactly that.)
The main issue that we're concerned with is unnecessary polling.
There can be other unnecessary calls which we'd also prefer to not have. For example, the OAuth:getUser call is only needed once per user login. It shouldn't be repeated more often than that since the information doesn't change.
I'm a bit confused by the documentation: "In order for Stripe to compute the number of units consumed during the billing cycle, you must report the customer's usage by creating usage records"
and then: "The usage reporting endpoint is rate-limited, so you might need to exercise caution and avoid making too many separate usage records."
So what is it saying, exactly? After adding some usage for a few customers, my app will stop working? Then what should I use instead? Line items? Invoice items?
So far I've created a customer and subscribed him to a plan. How do I increment his usage without limits and without risking that my app breaks for no apparent reason?
This just means that, if you're in danger of hitting the rate limit, you should do something to batch up your calls to https://api.stripe.com/v1/subscription_items/{SUBSCRIPTION_ITEM_ID}/usage_records
For example, instead of POSTing there every time your customer's usage increases, just keep track of it on your side and do one POST daily, at the end of the billing cycle, or at some other interval.
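Here's a rough sketch of that batching idea using plain HttpClient against the endpoint above; the secret key, subscription item ID, and flush cadence are placeholders, and a production version would need thread safety and retry handling:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Sketch: accumulate usage locally and report it in one POST per interval,
// instead of one POST per usage event.
class UsageBatcher
{
    static readonly HttpClient Http = new HttpClient();
    long _pendingUnits;

    public void Record(long units) => _pendingUnits += units; // no API call here

    // Call this once per day / billing period / other interval.
    public async Task FlushAsync(string subscriptionItemId, string secretKey)
    {
        if (_pendingUnits == 0) return;

        var request = new HttpRequestMessage(HttpMethod.Post,
            $"https://api.stripe.com/v1/subscription_items/{subscriptionItemId}/usage_records")
        {
            Content = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                ["quantity"]  = _pendingUnits.ToString(),
                ["timestamp"] = DateTimeOffset.UtcNow.ToUnixTimeSeconds().ToString(),
                ["action"]    = "increment" // add to the period's running total
            })
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", secretKey);

        var response = await Http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        _pendingUnits = 0;
    }
}
```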
I have a DocumentDB database on Azure. I have a particularly heavy query that happens when I archive a user record and all of their data.
I was on the S1 plan and would get an exception indicating I was hitting the RU/s limit. The S1 plan has 250 RU/s.
I decided to switch to the Standard plan that lets you set the RU/s and pay for it.
I set it to 500 RU/s.
I did the same query and went back and looked at the monitoring chart.
When I ran this latest query test, the chart showed 226 requests, of which 10 were throttled.
Why is that? I set it to 500 RU/s. The query failed, by the way.
Firstly, Requests != Request Units, so your 226 requests will at some point have caused more than 500 Request Units to be needed within one second.
The DocumentDB API will tell you how many RUs each request costs, so you can examine that client-side to find out which request is causing the problem. From my experience, even a simple by-id request often costs at least a few RUs.
How you see that cost depends on which client-side SDK you use. In my code, I've added something to automatically log all requests that cost more than 10 RUs, just so I know and can take action.
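A rough sketch of that idea with the DocumentDB .NET SDK; the query, the dynamic type, and the 10 RU threshold are illustrative:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Sketch: page through a query and log any page whose request charge is high.
// client, collectionLink, and the SQL below are placeholders for your own.
static async Task RunArchiveQueryAsync(DocumentClient client, string collectionLink)
{
    var query = client.CreateDocumentQuery<dynamic>(
            collectionLink,
            "SELECT * FROM c WHERE c.userId = 'user-to-archive'",
            new FeedOptions { MaxItemCount = 100 })
        .AsDocumentQuery();

    double totalCharge = 0;
    while (query.HasMoreResults)
    {
        var page = await query.ExecuteNextAsync<dynamic>();
        totalCharge += page.RequestCharge;
        if (page.RequestCharge > 10) // flag anything unexpectedly expensive
            Console.WriteLine($"Expensive page: {page.RequestCharge} RUs");
    }
    Console.WriteLine($"Total query cost: {totalCharge} RUs");
}
```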
It's also the case that the monitoring tools in the portal are quite inadequate, and I know the team is working on that; you can only see the total RUs for each five-minute interval, so you may try to use 600 RUs in one second and not really be able to see that in the portal.
In your case, you may have a single big query that simply costs more than 500 RUs - the logging will tell you. In that case, look at the generated SQL to see why; maybe even post it here.
Alternatively, it may be the cumulative effect of lots of small requests being fired off in a small time window. If you are making 226 requests in response to one user action (and I don't know if you are), then you probably want to reconsider your design :)
Finally, you can retry failed requests. I'm not sure about other SDKs, but the .NET SDK retries a request automatically 9 times before giving up (which might be another explanation for the number of requests hitting the server).
If your chosen SDK doesn't retry, you can easily do it yourself; the server will return status code 429 ("Request rate too large") along with an instruction on how long to wait before retrying.
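For illustration, a minimal manual-retry sketch with the DocumentDB .NET SDK; this is only needed if your SDK doesn't retry for you (or its built-in retries are exhausted), and the method shape is illustrative:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Sketch: retry a write when the server answers 429 ("Request rate too large"),
// waiting for the interval the server suggests via RetryAfter.
static async Task<ResourceResponse<Document>> CreateWithRetryAsync(
    DocumentClient client, string collectionLink, object document)
{
    while (true)
    {
        try
        {
            return await client.CreateDocumentAsync(collectionLink, document);
        }
        catch (DocumentClientException e) when ((int?)e.StatusCode == 429)
        {
            await Task.Delay(e.RetryAfter); // server-suggested back-off
        }
    }
}
```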
Please examine the queries and update your question so we can help further.
On this documentation page, the following limitation of Application Insights is documented:
Up to 500 telemetry data points per second per instrumentation key (that is, per application). This includes both the standard telemetry sent by the SDK modules, and custom events, metrics and other telemetry sent by your code.
However, it doesn't explain what the implications of that limit are.
a) Does it buffer and throttle, but still persist all data eventually? Say 1000 data points get pushed within a second - will it persist the first 500, then wait a bit and push the other 500?
or
b) Does it just drop/not log data? Say 1000 data points get pushed within a second - will only the first 500 be persisted and the other 500 dropped (forever)?
It is the latter (b), with the caveat that ALL data will start to be throttled in this case. That is, once a rate above 500 data points per second (100 for free apps; see https://azure.microsoft.com/en-us/documentation/articles/app-insights-data-retention-privacy/ for details) is detected, the data collection endpoint will start rejecting all data from this instrumentation key until the rate is back under 500.
EDIT: Further information from Bret Grinslade:
The current implementation averages over one minute -- so if you send 30K in 1 minute (500*60), it will throttle your application. The HTTP response will tell the SDK to retry later. If the incoming rate never comes down, the response will tell the SDK to drop the data. We are working on other features to improve this experience -- pre-aggregation on the client, improved burst data rates, etc.
Application Insights now has an ingestion throttling limit of 16,000 events per second: https://learn.microsoft.com/en-us/azure/application-insights/app-insights-pricing