95th, 99th percentile on IIS log parser - iis

Is there a way to get the 95th or 99th percentile of response time with Log Parser? I am using Log Parser to parse IIS logs, but unfortunately I can only find ready-made queries for average, max, and min response time.

You cannot get percentiles very easily from LogParser, but you can compute them manually:
SELECT COUNT(*)
FROM $logDir\u_ex190314.log
WHERE [conditions]
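This gives you the total number of requests. Then, for the 95th percentile, you calculate (1 - 0.95) * COUNT(*), round it to a whole number, and use that as the TOP value in another query (123 below is only a placeholder for that number):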
SELECT TOP 123 time-taken
FROM $logDir\u_ex190314.log
WHERE [conditions]
ORDER BY time-taken DESC
Now, the last row in the result (or the minimum value in the set) is the 95th percentile "response time" (from IIS' point of view).
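The two-query approach above can be sketched outside LogParser as well. This is only an illustration in Python, assuming the time-taken values have already been loaded into a list; the function name percentile_top_n and the sample data are hypothetical.

```python
def percentile_top_n(time_taken, p=0.95):
    """Mimic the two queries: count the rows, then read the last row of TOP n ... DESC."""
    if not time_taken:
        raise ValueError("no requests to analyze")
    total = len(time_taken)                        # SELECT COUNT(*)
    n = max(1, round((1 - p) * total))             # value to use after TOP
    top_n = sorted(time_taken, reverse=True)[:n]   # ORDER BY time-taken DESC
    return top_n[-1]                               # last row = minimum of the set


# Hypothetical time-taken values: 1, 2, ..., 100 milliseconds.
samples = list(range(1, 101))
p95 = percentile_top_n(samples, 0.95)
p99 = percentile_top_n(samples, 0.99)
```

The rounding step is an assumption: the answer does not say how to handle a fractional row count, and rounding up or down shifts the result by at most one row.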
Another approach is to analyze the log files with a better tool, for example R, or to export to SQL Server or Excel.

Related

Reading response time percentile in Designing Data-Intensive Applications Book

In the book Designing Data-Intensive Applications, there is this sentence:
For example, if the 95th percentile response time is 1.5 seconds, that means 95 out of 100 requests take less than 1.5 seconds, and 5 out of 100 requests take 1.5 seconds or more.
The confusing part is the statement that 95 of these requests take less than 1.5 seconds. Shouldn't it be that 95 of the requests take 1.5 seconds or less, and the remaining 5 take more than 1.5 seconds? Or does the one percent at the 95th percentile take exactly 1.5 seconds, the 94th percentile and below take less than 1.5, and the 96th percentile and above take more than 1.5? What is the correct reading of these numbers?
I have done some research on this and found several articles. The interesting part is that some say what I say and some don't.
Some of the links that read the percentile as "95 of the requests take 1.5 seconds or less":
average 90th percentile response time and average response time
90% percentile is a statistical measurement, in case of JMeter it means that 90% of the sampler response times were smaller than or equal to this time
https://www.dynatrace.com/news/blog/why-averages-suck-and-percentiles-are-great/
so 90 percent of the requests are processed in 3.0 seconds or less
https://www.adfpm.com/adf-performance-monitor-monitoring-with-percentiles
If the 90th percentile of the same transaction is at 1000ms it means that 90% are as fast or faster and only 10% are slower.
Other links that read the percentile as "95 of the requests take less than 1.5 seconds":
https://www.elastic.co/blog/averages-can-dangerous-use-percentile
In contrast, the 99th percentile says “99% of your values are less than 850ms”, which is a very different picture.
I got the answer from this website and, according to them, both readings are correct. It just depends on how the percentile rank is calculated:
The word “percentile” is used informally in the above definition. In common use, the percentile usually indicates that a certain percentage falls below that percentile. For example, if you score in the 25th percentile, then 25% of test takers are below your score. The “25” is called the percentile rank. In statistics, it can get a little more complicated as there are actually three definitions of “percentile.” Here are the first two (see below for definition 3), based on an arbitrary “25th percentile”:
Definition 1: The nth percentile is the lowest score that is greater than a certain percentage (“n”) of the scores. In this example, our n is 25, so we’re looking for the lowest score that is greater than 25%.
Definition 2: The nth percentile is the smallest score that is greater than or equal to a certain percentage of the scores. To rephrase this, it’s the percentage of data that falls at or below a certain observation. This is the definition used in AP statistics. In this example, the 25th percentile is the score that’s greater or equal to 25% of the scores.
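To see how the two definitions can give different answers on the same data, here is a minimal Python sketch. It reads Definition 1 as "at least n% of the scores are strictly below the result" and Definition 2 as "at least n% of the scores are at or below the result"; that reading, the function names, and the sample scores are assumptions made for illustration.

```python
def percentile_def1(scores, n):
    """Lowest score that is strictly greater than n% of the scores."""
    total = len(scores)
    for s in sorted(scores):
        below = sum(1 for x in scores if x < s)
        if below * 100 >= n * total:      # integer math avoids float rounding
            return s
    return None                           # no score satisfies the definition


def percentile_def2(scores, n):
    """Smallest score that is greater than or equal to n% of the scores."""
    total = len(scores)
    for s in sorted(scores):
        at_or_below = sum(1 for x in scores if x <= s)
        if at_or_below * 100 >= n * total:
            return s
    return None


# Hypothetical sample of eight scores.
scores = [30, 33, 43, 53, 56, 67, 68, 72]
d1 = percentile_def1(scores, 25)   # 43: two of eight scores are below it
d2 = percentile_def2(scores, 25)   # 33: two of eight scores are at or below it
```

Applied to response times, Definition 1 matches the "95% of requests take less than 1.5 seconds" wording and Definition 2 matches "95% of requests take 1.5 seconds or less".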

average 90th percentile response time and average response time

In my test, I observed that the 90th percentile value and the average response time are the same, say 28.
Can someone explain in which cases this might happen?
It might be the case that response time for all samplers is equal or similar.
Average Response Time is basically the arithmetic mean, that is, the sum of response times for all samplers divided by their count.
The 90% percentile is a statistical measurement; in the case of JMeter it means that 90% of the sampler response times were smaller than or equal to this time.
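A small Python sketch makes the difference between the two numbers concrete. It uses a simple nearest-rank percentile, which is an assumption and not necessarily the exact algorithm JMeter uses; the sample response times are hypothetical.

```python
def mean(values):
    """Arithmetic mean: sum of response times divided by their count."""
    return sum(values) / len(values)


def percentile_90(values):
    """Smallest value such that at least 90% of the values are <= it."""
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10   # ceil(0.9 * count) in integer math
    return ordered[rank - 1]


uniform = [28] * 10            # every sampler takes the same time
skewed = [10] * 9 + [190]      # same mean, one slow outlier

same_case = (mean(uniform), percentile_90(uniform))     # both are 28
different_case = (mean(skewed), percentile_90(skewed))  # mean 28, percentile 10
```

When all samplers take the same or nearly the same time, the two values coincide; with an outlier the mean can stay at 28 while the percentile moves away from it.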
More information:
JMeter Glossary
Request Statistics Report
Generating Report Dashboard

Statistical functions on non-numerical value

I am not looking for any code or formula but a rationale/logic.
Background: My data set comes in Date/Time format where a new timestamp is created for each new occurrence of an event.
My goal is to calculate the number of occurrences within each hour for a given day. Unfortunately, the system does not capture the number of occurrences per period as integers. So I have to count the number of times an hour value appears, i.e. the number of times the 4 o'clock hour appears. I am currently using a Pivot Table in Excel to count the number of times each hour appears. The fields in Rows are hour and date, and the field in Values is the count of hour.
Trouble is that I cannot use any summarize functions to get stuff like sum, min, max, percentile, and standard deviation. For example, changing count to sum will only add up all hours. So sum of 4 o'clock hour will return 12 instead of 3. So I am having to use array formulas on pivot table to give me max and min etc.
If I were to use this data in data viz tools like Tableau or Power BI, I wouldn't be able to get very far. I am looking for suggestions or a workaround that lets me reshape my data so it can be used in Pivot Tables in Excel and in data viz tools.
I know my question is not specific to one tool, but I am looking to enhance my understanding of data and data manipulation techniques.
EDIT: Please see attached image
Build a data model using Power Pivot. Join your fact table to a calendar dimension table, then create a row count measure; you can then summarise that measure to suit (sum, average, min, etc.).
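The rationale behind the row count measure is to turn one-row-per-event data into one number per (date, hour), which can then be summarised like any other numeric column. A minimal Python sketch of that idea, using hypothetical timestamps and a hypothetical helper named hourly_counts:

```python
from collections import Counter
from datetime import datetime
import statistics

# Hypothetical event timestamps, one row per occurrence.
timestamps = [
    "2019-03-14 04:05:00", "2019-03-14 04:20:00", "2019-03-14 04:55:00",
    "2019-03-14 05:10:00", "2019-03-14 07:30:00", "2019-03-14 07:45:00",
]


def hourly_counts(stamps):
    """Count events per (date, hour); this plays the role of the row count measure."""
    counts = Counter()
    for s in stamps:
        t = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
        counts[(t.date(), t.hour)] += 1
    return counts


counts = hourly_counts(timestamps)
values = list(counts.values())

# Once the counts are real numbers, the usual summaries apply directly.
summary = {
    "sum": sum(values),
    "min": min(values),
    "max": max(values),
    "mean": statistics.mean(values),
}
```

Hours with no events do not appear in the counts at all in this sketch; joining to a calendar table is what fills in those empty periods with zero.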

Azure Stream Analytics query moving average

While using Azure Stream Analytics, I can create a kind of moving average by using AVG and grouping by a HoppingWindow, as in the code shown below.
However, this creates a moving average of the points in the last 5 seconds. Is there a way to create a moving average of the last n data points? I understand that I can adjust the window size so that n points fall into the window, but is there a way to average exactly the last n points, as in MySQL and PostgreSQL?
SELECT System.TimeStamp AS OutTime, AVG (value)
INTO
[output]
FROM [input]
GROUP BY HoppingWindow(second,5,1)
Today ASA windows are only based on time. However you can use the LAG function to get previous events.
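For your query, if you want the average of the 3 latest events, it will be something like the query below. Note that LAG(value, 1) is the event before the current one, so as written this averages the three events preceding the current event; to include the current event, use value in place of LAG(value, 3):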
SELECT System.TimeStamp AS OutTime,
AvgValue= ( LAG(value,1) OVER (LIMIT DURATION(minute, 5))
+ LAG(value,2) OVER (LIMIT DURATION(minute, 5))
+ LAG(value,3) OVER (LIMIT DURATION(minute, 5))
)/3
FROM input
Sorry for the inconvenience.
Thanks,
JS
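For comparison, a count-based moving average (the last n data points rather than the last n seconds) can be sketched in Python as follows. This is not Stream Analytics code, only an illustration of the behaviour the question asks for; the function name and sample values are hypothetical.

```python
from collections import deque


def moving_average(values, n):
    """Average of the most recent n values, emitted once per incoming value."""
    window = deque(maxlen=n)      # the oldest value drops out automatically
    averages = []
    for v in values:
        window.append(v)
        averages.append(sum(window) / len(window))
    return averages


# Hypothetical stream of five values with a window of three points.
result = moving_average([1, 2, 3, 4, 5], 3)
```

Until n values have arrived, this sketch averages whatever is in the window so far; a stricter version could skip output until the window is full.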

Calculate Average from Groups

I'm trying to take a table of web data (average % of page viewed) and create an average.
This is what my table looks like:
0-25% 954,353
26-50% 58,569
76-100% 73,653
51-75% 31,011
I'm looking to calculate in a cell that the average across all is XX %.
I guess this is what you are looking for:
Since we have no further information, we do not know how the items are actually distributed within the range from 0 to 25%. Hence, I am assuming that they all average out at 12.5% (the midpoint of the range), and likewise for the other ranges. If you continue this line of thought, the overall average is nothing but a weighted average of the midpoints or (looking at the formula) a SumProduct of midpoints and counts divided by the Sum of all items.
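The same calculation, written out as a minimal Python sketch. The counts come from the table in the question; treating each range's midpoint as its average is the assumption stated above, and the function name grouped_average is hypothetical.

```python
# (low %, high %, page views) taken from the table in the question.
buckets = [
    (0, 25, 954_353),
    (26, 50, 58_569),
    (51, 75, 31_011),
    (76, 100, 73_653),
]


def grouped_average(rows):
    """Weighted average of range midpoints: SUMPRODUCT(midpoints, counts) / SUM(counts)."""
    total = sum(count for _, _, count in rows)
    weighted = sum((low + high) / 2 * count for low, high, count in rows)
    return weighted / total


average_pct = grouped_average(buckets)
```

With these numbers the result comes out at roughly 20%, pulled down by the very large 0-25% group.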
