I'm in Excel trying to count how many "peaks" there are in my data. Initially I thought "find the median, set a threshold value, count the number of values above said threshold value." So I ended up using =MEDIAN(), =MAX() and =COUNTIF().
The problem is that for a given peak there may be data points on its "slope" which are also higher than the threshold value, so a single peak ends up being counted more than once.
I'm wondering if there's an easy way in Excel to count said peaks, or if I have to figure out a way to convert the data to a function and use its derivatives to find the local maxima.
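A minimal sketch of one way to count each peak only once, by counting upward threshold crossings rather than individual points (the cell layout is an assumption: data in A2:A100, threshold value in D1):

=SUMPRODUCT(--(A3:A100>$D$1),--(A2:A99<=$D$1))

This counts the rows where the series rises above the threshold after the previous point was at or below it, so each peak is counted once, provided the data dip back below the threshold between peaks.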
This may be more of a statistics question, and I'd like to find a solution with Excel. I'd rather use simple VBA if any coding is necessary.
Is there a way to estimate the percentile of a specific data point in a skewed distribution? I don't need exact percentiles and only need a reasonable estimate. I work on analyses that rely on weighted average benchmarks reported by multiple sources. All of my sources report the 25th, 50th, 75th, and 90th percentiles as well as the mean and standard deviation. We use these benchmarks to set a target range, and our goal is for our results from a specific analysis to land somewhere within the published percentiles. I'm often asked to indicate what percentile our specific result is at, and all I can provide is broad ranges like 25th-50th, etc. So, I'm then asked to use simple extrapolation to determine the specific percentile of the specific result, and I know that using this method is inaccurate.
Mean and median differ in 99% of cases in my data set, but the % difference between mean and median is, on average, only 6%. Only about 10% of cases have a mean and median that differ by more than 10%.
For the 90% of cases with a relatively low % difference between mean and median, can I assume a normal distribution?
For cases with a higher % difference between mean and median, can I make an assumption that will help me estimate more accurately? For these cases I could just use the normal distribution and send my percentile estimate along with a note indicating that the estimate is likely off in one direction or the other, but I'd rather give a better estimate.
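If I do go with the normal assumption, the estimate itself would just be something like the following (cell references are placeholders: the result in B1, the reported mean in B2, the reported standard deviation in B3):

=NORM.DIST(B1,B2,B3,TRUE)

which returns the cumulative probability, i.e. the estimated percentile of the result under that assumption.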
Responding to cybernetic.nomad:
First, thanks for commenting! Second, it doesn't seem to work. I think I don't have enough data. The attached image shows an example. The first 5 rows show one set of my weighted average benchmarks for a single case. Below that, I added two rows: one with my "target" amount. This could be any number but, to test out the formula you suggested, I entered my 50th percentile weighted average. The row below that has the result of the formula =PERCENTRANK.EXC(25th:90th,target). The result should be 0.5 but it's not, so I don't think this works.
I want to generate a single column of 6000 numbers with a normal distribution, with a mean of 30.15, a standard deviation of 49.8, a minimum of -11.5, and a maximum of 133.5.
I am a total newb at this, so I tried to use the following formula in a cell and then just drag it down to row 6000:
=NORMINV(RANDBETWEEN(-11.5,133.5)/100,30.15,49.8)
It returns a value, but sometimes it returns a #NUM! error. Thank you!
Unfortunately NORMINV expects a probability as its first argument, which must be a value in the interval (0, 1); anything outside that range yields #NUM!. Your formula divides the result of RANDBETWEEN(-11.5,133.5) by 100, which frequently produces values outside (0, 1), hence the intermittent errors.
What you're asking cannot be done directly with a normal distribution, since a normal distribution has no minimum or maximum value.
One approach is to use a primary column to generate the normally distributed numbers, then keep only the ones within your range in an adjacent column. But this will cause even the mean (let alone higher moments) to drift quite considerably, because your minimum and maximum values are not equidistant from the mean. You could get around this by recentering the distribution and adjusting afterwards.
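A minimal sketch of that two-column approach (cell placement is just an assumption: raw draws in column A starting at A1, filtered values alongside in column B):

=NORM.INV(RAND(),30.15,49.8)
=IF(AND(A1>=-11.5,A1<=133.5),A1,"")

Fill both columns down well past 6000 rows so that enough draws survive the filter; as noted above, the surviving values will no longer have exactly the requested mean and standard deviation.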
I wrote a macro which, as a small side task, also calculates the average of around 39000 different values. I noticed that using WorksheetFunction.Average and calculating the average "step-wise" yield different results, but only at the 15th digit after the decimal point. By calculating "step-wise" I mean adding each value to a total_sum variable, counting the number of values in another variable, and then dividing the former by the latter.
The 15th digit after the decimal point might be considered negligible, but I find it unsettling nonetheless. Shouldn't those two values be exactly the same? They are when I use fewer values, and since the macro might be applied to far more than 39000 values (100k+), I'm worried the error might grow.
So my questions are: what could cause the difference, and, more importantly, which method is more precise?
What I tried was to declare all variables in the "step-wise" calculation as Variant to avoid using the wrong data type in any of those steps.
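For reference, a minimal sketch of the two calculations I'm comparing (the range is a placeholder, and in my real macro the step-wise variables were declared as Variant rather than Double):

Sub CompareAverages()
    ' Assumed layout: the ~39000 values are in A1:A39000 of the active sheet
    Dim rng As Range
    Set rng = ActiveSheet.Range("A1:A39000")

    ' 1) Built-in average
    Dim builtIn As Double
    builtIn = Application.WorksheetFunction.Average(rng)

    ' 2) "Step-wise" average: running total divided by the count
    Dim total_sum As Double, n As Long, cell As Range
    For Each cell In rng
        total_sum = total_sum + cell.Value
        n = n + 1
    Next cell
    Dim stepWise As Double
    stepWise = total_sum / n

    ' The two results agree only to about the 15th decimal digit
    Debug.Print builtIn, stepWise, builtIn - stepWise
End Sub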
Thank you very much for your help!
I have a list of data points for given depths. Each depth is assigned a character.
My problem is that the data points are at inconsistent intervals: sometimes they are 1m apart, sometimes 100m.
What I'd like to create is a list for every 1m interval, based on the values I already know. Every depth that is not included in the original data set will have the same character assigned to it as the depth above until the next known depth is reached.
Hopefully it's not too difficult, but I'm rather useless at this kind of thing in Excel, so any answers can't expect too much prior knowledge :)
Thanks in advance.
Try this formula, which says: 'if there is an exact match for the depth, use the code for that; otherwise find the largest depth less than the current depth and use the code corresponding to the next depth after it'.
=IFERROR(INDEX(B$3:B$8,MATCH(D3,A$3:A$8,0)),INDEX(B$3:B$8,MATCH(D3,A$3:A$8)+1))
The depth values just increment as follows:
=IF(ROW()=3,A3,D2+1)
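In this sketch the cell layout is an assumption based on the references above: known depths in A3:A8, their codes in B3:B8, the 1m depth list built in column D starting at D3 with the increment formula, and the code lookup placed alongside it in E3; both columns are then filled down as far as the deepest depth requires.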
So I have an Excel sheet where I have different values for a set of objects.
I want to judge these objects by their values. The values should be between 0 and 1 so that in the end I can draw a matrix. So far so good. What you could do is just take the maximum value and divide each object's value by that maximum value. For the final result I just take the average of all 3 values.
Now I have the problem that if one value is too big, it affects the whole result. I know that the median tries to resolve this, but how can I use it to get the percentages/values between 0 and 1? And is there an easy way to do this in Excel?
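For concreteness, the max-based scaling described above would be something like this (the range A2:A10 is just a placeholder for one set of values):

=A2/MAX(A$2:A$10)

filled down, with the final result taken as the average of the scaled values; the issue is that one very large entry inflates MAX and drags every other scaled value towards 0.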
Does Excel not have a median function?
Otherwise, you can sort the values and take the middle value if the number of rows is odd, or, if the number of rows is even, take the average of the two middle values to get the median.
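For reference, Excel does have one; a minimal sketch (the range A2:A10 is just a placeholder):

=MEDIAN(A2:A10)

MEDIAN handles both the odd and even cases itself, averaging the two middle values when the count is even, so the manual sort-and-pick step is only needed if you want to avoid the built-in function.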