Is it the same as "Time Taken (time-taken)" from IIS Logs as described here?
Time-taken is the time from when IIS receives the first byte of the request until it sends out the last byte of the response. In nearly all cases this includes the network time to the client (the exception, I think, is a response small enough, under roughly 2 KB, to be buffered), so a slow network connection will have a longer network time and thus a larger time-taken.
The start time, when the first byte is received by IIS, is the "time" field; the finish time, when IIS sends out the last byte, is the "time" field plus the "time-taken" field.
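As a rough illustration (a sketch only, assuming a W3C log with date, time, and time-taken fields, with time-taken in milliseconds as is usual for IIS; adjust the names and positions to your own #Fields: line), the finish time can be derived like this:

from datetime import datetime, timedelta

# Hypothetical W3C log values; match these to your own #Fields: configuration.
entry = {"date": "2024-05-01", "time": "13:45:10", "time-taken": "2340"}  # time-taken in milliseconds

start = datetime.strptime(entry["date"] + " " + entry["time"], "%Y-%m-%d %H:%M:%S")
finish = start + timedelta(milliseconds=int(entry["time-taken"]))

print(start)   # when IIS received the first byte of the request
print(finish)  # time + time-taken: roughly when the last byte went out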
Average response time is a composite metric composed of 2 metrics:
1) Number of requests
2) Sum of response times for a sample
We publish metrics every 10 seconds, so there are 6 samples in a minute. This means that every 10 seconds we publish the number of requests in that 10-second window and the sum of response times in that window.
It does not have information about the response time of each individual request. The smallest granularity is 1 minute, so it aggregates the 6 samples to compute the sampling types (Average, Sum, Max, Min).
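As a sketch of how the per-minute aggregates can be derived from the six (count, sum) samples (the numbers below are made up purely for illustration):

# Six hypothetical 10-second samples for one minute: (request_count, sum_of_response_times_ms).
samples = [(12, 1800), (9, 1350), (15, 2600), (7, 900), (11, 1700), (10, 1500)]

total_requests = sum(count for count, _ in samples)
total_response_time = sum(rt for _, rt in samples)

minute_average = total_response_time / total_requests   # Average at the 1-minute granularity
minute_sum = total_response_time                         # Sum at the 1-minute granularity
sample_averages = [rt / count for count, rt in samples]  # per-10-second averages
print(minute_average, minute_sum, max(sample_averages), min(sample_averages))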
Using these metrics (shown below), I was able to utilize a workload modeling formula (Little’s Law) to come up with what I believe are the correct settings to sufficiently load test the application in question.
From Google Analytics:
Users: 2,159
Pageviews: 4,856
Avg. Session Duration: 0:02:44
Pages / Session: 2.21
Sessions: 2,199
The formula is N = Throughput * (Response Time + Think Time)
We calculated Throughput as 1.35 (4,856 pageviews / 3,600 seconds in an hour).
We calculated (Response Time + Think Time) as 74.21 (164 seconds avg. session duration / 2.21 pages per session)
Using the formula, we calculate N as 100 (1.35 Throughput * 74.21 (Response Time + Think Time)).
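For reference, the arithmetic above in a few lines (using the Google Analytics figures quoted earlier):

pageviews = 4856
avg_session_seconds = 164            # 0:02:44
pages_per_session = 2.21

throughput = pageviews / 3600                               # ~1.35 pages per second during the peak hour
resp_plus_think = avg_session_seconds / pages_per_session   # ~74.21 seconds per page
n = throughput * resp_plus_think                            # Little's Law: N = X * (R + Z)

print(round(throughput, 2), round(resp_plus_think, 2), round(n))  # 1.35 74.21 100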
Therefore, according to my calculations, we can simulate the load the server experienced on the peak day during the peak hour with 100 users going through the business processes at a pace of 75 seconds between iterations (think time ignored).
So, in order to determine how the system responds under a heavier than normal load, we can double (200 users) or triple (300 users) the value of N and record the average response time for each transaction.
Is this all correct?
When you do a direct observation of the logs for the site, blocked by session duration, what is the maximum number of IP addresses counted in each block?
Little's Law tends to undercount sessions and their overhead in favor of transactional throughput. That's OK if you have instantaneous recovery of your session resources, but most sites hold onto them for a period longer than 110% of the longest inter-request window for a user (the time from one request to the next).
The formula below has always worked well for me if you are looking to calculate pacing:
"Pacing = No. of Users * Duration of Test (in seconds) / Transactions you want to achieve in said Test Duration"
You should be able to get closer to the number of transactions you want to achieve using this formula. If it's an API, it's almost always accurate.
For example, suppose you want to achieve 1,000 transactions using 5 users in one hour of test duration:
Pacing = 5 * 3600/1000
= 18 seconds
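A minimal sketch of that pacing calculation:

def pacing_seconds(users, test_duration_s, target_transactions):
    """Pacing = users * test duration (s) / target transactions."""
    return users * test_duration_s / target_transactions

print(pacing_seconds(5, 3600, 1000))  # 18.0 seconds between iterations for each user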
I am trying to come up with a formula to calculate time required as per following configuration:
It takes 40 seconds to process each request. Server allows 3 requests per IP address per second.
If we use 50 proxies to process 200 requests, how much time will be required to complete all requests? I need the formula where number of requests and number of proxies can be variables.
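My rough attempt so far, in case it helps frame the question (this assumes requests run concurrently once started, each proxy counts as a separate IP, and the only bottleneck is the 3-requests-per-IP-per-second limit):

import math

def total_time_seconds(requests, proxies, rate_per_ip=3, processing_s=40):
    """Seconds to start the last request (limited by the per-IP rate) plus its processing time.
    Assumes requests run concurrently once started and each proxy is a separate IP."""
    launch_window = math.ceil(requests / (proxies * rate_per_ip))
    return launch_window + processing_s

print(total_time_seconds(200, 50))  # 2 + 40 = 42 seconds under these assumptions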
I have created a Logic App that triggers when a tweet is posted with a given hashtag. The trigger is set to check every 10 seconds. In reality, the Logic App does not run even if I wait minutes for it, but if I then run it manually, it executes with the expected input. Any idea what is happening here?
I was having a similar issue, and I believe this is due to the specific limits set for the Twitter connector (the frequency of trigger polls is 1 hour).
https://learn.microsoft.com/en-us/connectors/twitterconnector/
LIMITS
The following are some of the limits and restrictions:
Maximum number of connections per user: 2
API call rate limit for POST operation: 12 per hour
API call rate limit for other operations: 600 per hour
Frequency of trigger polls: 1 hour
Maximum size of image upload: 5 MB
Maximum size of video upload: 15 MB
Maximum number of search results: 100
Maximum number of new tweets tracked within one polling interval: 5
There was probably an error. You can inspect all runs of the trigger on the 'Trigger History' blade. This page gives a good overview of monitoring Logic Apps: https://azure.microsoft.com/en-us/documentation/articles/app-service-logic-monitor-your-logic-apps/
I am sending some messages in a pipeline using Azure IoT Edge. There is a custom endpoint (say, GenericEndpoint) that I have set up, which will send/put the messages to Azure Blob storage. I am using a route to push the device messages to the specific endpoint GenericEndpoint.
The batch frequency of GenericEndpoint is set at 60 seconds. So 1 batch creates 1 single file with some messages, in the container specified.
Let's say there are N messages in a single blob batch file (say, blobX) in the specified container. If I take the average of the difference between the IoTHub.EnqueuedTime(i) of each message i in blobX and the 'Creation Time' of blobX, and call it AVG, I get:

AVG = (1/N) * sum over i of ( CreationTime(blobX) - IoTHub.EnqueuedTime(i) )
I think this essentially gives me the average time those N messages spent in IoT Hub before being written to blob storage. Now, what I observe is that if p and q are respectively the first and last messages written in blobX, then the gap between their enqueued times, IoTHub.EnqueuedTime(q) - IoTHub.EnqueuedTime(p), is roughly one batching interval (about 60 seconds).
But since the batching interval was set to 60 seconds, I would expect this average, AVG, to be approximately 30 seconds: if the batch were written out as soon as its 60-second window closed, a message's wait would range from about 0 to 60 seconds, averaging around 30 seconds per batch file.
But in my case, AVG ≈ 90 seconds, which suggests the messages wait for at least approximately one batching interval (60 seconds in this case) before being considered for a particular batch.
Assumption: when a batch of messages is written to a blob file, they are all written at once.
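For clarity, here is roughly how I compute AVG for one batch file (the values below are placeholders; in practice the enqueued times come from the messages' IoTHub.EnqueuedTime properties and the creation time from the blob's metadata):

from datetime import datetime
from statistics import mean

# Placeholder values for one batch file (blobX).
blob_creation_time = datetime(2024, 5, 1, 12, 2, 0)
enqueued_times = [datetime(2024, 5, 1, 12, 0, s) for s in (5, 20, 35, 50)]

avg_delay = mean((blob_creation_time - t).total_seconds() for t in enqueued_times)
print(avg_delay)  # AVG: average seconds a message waited before landing in blobX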
My question:
Is this delay of one batch interval (60 seconds) intentional? If yes, then I assume it will change if I change the batching interval to, say, 100 seconds.
If not, does it usually take 60 seconds to process a message in IoT Hub and send it through a route to a custom endpoint? Or am I looking at this from a completely wrong angle?
I apologize beforehand if my question seems confusing.
We have a metric that we increment every time a user performs a certain action on our website, but the graphs don't seem to be accurate.
So, going off this hunch, we investigated the updates.log of carbon and discovered that the action had happened over 4 thousand times today (using grep and wc), but according to the integral of the graph it came to only around 220.
What could be the cause of this? Data is being reported to StatsD using the StatsD PHP library, calling statsd::increment('metric'); and, as stated above, the log confirms that 4,000+ updates to this key happened today.
We are using:
graphite 0.9.6 with statsD (etsy)
After some research through the documentation, and some conversations with others, I've found the problem - and the solution.
The way the whisper file format is designed, it expects you (or your application) to publish updates no faster than the minimum interval in your storage-schemas.conf file. This file configures how much data retention you have at different time-interval resolutions.
My storage-schemas.conf file was set with a minimum retention interval of 1 minute. The default StatsD daemon (from Etsy) is designed to publish to carbon (the graphite daemon) every 10 seconds. This is a problem because, over a 60-second period, StatsD reports 6 times, and each write overwrites the previous one (within that 60-second interval, because you're updating faster than once per minute). This produces really weird results on your graph, because the last 10 seconds of a minute could be completely dead and report 0 activity for that period, which wipes out all of the data you had written for that minute.
To fix this, I had to re-configure my storage-schemas.conf file to store data at a maximum resolution of 10 seconds, so every update from StatsD would be saved in the whisper database without being overwritten.
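As a toy illustration of the overwrite behaviour (a simplified model, not whisper's actual code): with a 60-second minimum interval, all six 10-second flushes land in the same slot and each one replaces the previous value.

# Simplified model: StatsD flushes a count every 10 s, but the finest whisper archive is 60 s.
flushes = [(0, 12), (10, 9), (20, 15), (30, 7), (40, 11), (50, 0)]  # (second, count)

archive_60s = {}
for ts, count in flushes:
    slot = ts - (ts % 60)        # all six flushes map to the same 60-second slot
    archive_60s[slot] = count    # each write replaces the previous one

print(archive_60s)                 # {0: 0} -- only the final 10-second flush survives
print(sum(c for _, c in flushes))  # 54   -- the total we actually wanted for that minute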
Etsy published the storage-schemas.conf configuration that they were using for their installation of carbon, which looks like this:
[stats]
priority = 110
pattern = ^stats\..*
retentions = 10:2160,60:10080,600:262974
This has a 10-second minimum retention interval and stores 6 hours' worth of those points (10 s x 2160 = 21,600 s = 6 hours). However, due to my next problem, I extended the retention periods significantly.
As I let this data collect for a few days, I noticed that it still looked off (and was under reporting). This was due to 2 problems.
1) StatsD (older versions) only reported an average number of events per second for each 10-second reporting period. This means that if you incremented a key 100 times in 1 second and 0 times for the next 9 seconds, at the end of the 10th second StatsD would report 10 to graphite instead of 100 (100 / 10 = 10). This failed to report the total number of events for a 10-second period (obviously).

Newer versions of StatsD fix this problem, as they introduced the stats_counts bucket, which logs the total number of events per metric for each 10-second period (so instead of reporting 10 in the previous example, it reports 100).

After I upgraded StatsD, I noticed that the last 6 hours of data looked great, but beyond the last 6 hours things looked weird, and the next reason is why:
2) As graphite stores data, it moves it from high-precision retention to lower-precision retention. This means, using the Etsy storage-schemas.conf example, that after 6 hours of 10-second precision, data is moved to 60-second (1-minute) precision. In order to move 6 data points from 10-second to 60-second precision, graphite averages them: it takes the total value of the oldest 6 data points and divides it by 6. This gives an average number of events per 10 seconds for that 60-second period (and not the total number of events, which is what we care about).

This is just how graphite is designed, and for some cases it might be useful, but in our case it's not what we wanted. To "fix" this problem, I increased our 10-second precision retention time to 60 days. Beyond 60 days, I store the minutely and 10-minutely precisions, but they're essentially there for no reason, as that data isn't as useful to us.
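A small illustration of why this rollup undercounts counters (again a simplified model of the downsampling, assuming the default average aggregation):

# Six 10-second totals (from stats_counts) for one minute in which 100 events happened.
ten_second_counts = [100, 0, 0, 0, 0, 0]

# Downsampling from 10 s to 60 s precision averages the six points by default...
rolled_up_average = sum(ten_second_counts) / len(ten_second_counts)  # 16.67 "events per 10 s"
# ...but for a counter we want the total for the minute.
rolled_up_sum = sum(ten_second_counts)                               # 100 events

print(rolled_up_average, rolled_up_sum)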
I hope this helps someone, I know it annoyed me for a few days - and I know there isn't a huge community of people that are using this stack of software for this purpose, so it took a bit of research to really figure out what was going on and how to get a result that I wanted.
After posting my comment above I found Graphite 0.9.9 has a (new?) configuration file, storage-aggregation.conf, in which one can control the aggregation method per pattern. The available options are average, sum, min, max, and last.
http://readthedocs.org/docs/graphite/en/latest/config-carbon.html#storage-aggregation-conf