Finding stabilizing average of agent-based model runs - excel

So I ran about 200 agent-based model runs and I want to see how the average is changing over time.
For example if we have 10 points
2 4 2 8 6 5 9 8 1 3
I want to calculate the average as the number of points changes
(2+4)/ 2 = 3
now for the next point it will be (3+2)/2 = 2.5
so I can plot each average and see after how many runs the average stabilizes. Something like this image: https://imgur.com/a/VXeeuxy. Can someone provide an equation or method?
Thank you

I think you just want a 'cumulative average' of the first 1, 2, ..., n points. You can do this in a single formula if you don't mind using OFFSET:
In most versions of Excel (F1):
=SUBTOTAL(1,OFFSET(B23,0,0,1,COLUMN(B23:K23)-COLUMN(A23)))
In Excel 365 only (F2):
=SUBTOTAL(1,OFFSET(B23,0,0,1,SEQUENCE(1,COLUMNS(B23:K23))))
Or a more dynamic version that works for a whole row (F3):
=SUBTOTAL(1,OFFSET(B23,0,0,1,COLUMN(A1:INDEX(1:1,COUNT(23:23)))))
and (F4)
=SUBTOTAL(1,OFFSET(B23,0,0,1,SEQUENCE(1,COUNT(23:23))))
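For readers who want to check the spreadsheet result outside Excel, here is a minimal Python sketch of the same cumulative average. The function name `cumulative_average` is only illustrative. Note that the third value is (2+4+2)/3 ≈ 2.67 rather than the (3+2)/2 = 2.5 from the question, because a cumulative average weights every point equally.

```python
from itertools import accumulate


def cumulative_average(values):
    """Return the mean of the first 1, 2, ..., n values."""
    # accumulate() yields the running totals; dividing by the running
    # count gives the average after each additional run.
    return [total / count
            for count, total in enumerate(accumulate(values), start=1)]


runs = [2, 4, 2, 8, 6, 5, 9, 8, 1, 3]  # sample points from the question
averages = cumulative_average(runs)
for n, avg in enumerate(averages, start=1):
    print(n, round(avg, 3))
```

Plotting `averages` against the run count gives the kind of curve the question describes; where it flattens out is where the average has stabilized.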

Related

How to rank multiple columns on excel. I'm using Microsoft Office Professional Plus 2016 [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 months ago.
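I have the following input data:

| Contamination | 10% | 10% | 20% | 20% | 30% | 30% |
|---|---|---|---|---|---|---|
| Estimator | Trial 1 | Trial 2 | Trial 1 | Trial 2 | Trial 1 | Trial 2 |
| OLA | 500 | 75 | 100 | 430 | 460 | 230 |
| PWA | 360 | 457 | 400 | 200 | 200 | 400 |
| CA | 470 | 270 | 450 | 250 | 350 | 150 |
| HA | 215 | 310 | 200 | 400 | 400 | 200 |
| AM | 300 | 500 | 315 | 200 | 500 | 250 |
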
The table has 5 different estimators, each with 2 repeated trials for each of the 3 contamination percentage groups (10%, 20%, and 30%).
For each of the 5 estimators (in my real problem I have more than 5), I want to rank the trials (from lowest to highest value) within each percentage group (in my real problem I have more than 3), all at once.
I am looking for a solution that doesn't require ranking each group manually, since in my real problem I have more percentage groups, more trials per group and more experiments. The number of trials is the same in every group.
I want a formula that can rank all of them at once. Here is the expected output for the input data sample:
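
| Contamination | 10% | 10% | 20% | 20% | 30% | 30% |
|---|---|---|---|---|---|---|
| Estimator | Trial 1 | Trial 2 | Trial 1 | Trial 2 | Trial 1 | Trial 2 |
| OLA | 2 | 1 | 1 | 2 | 2 | 1 |
| PWA | 1 | 2 | 2 | 1 | 1 | 2 |
| CA | 2 | 1 | 2 | 1 | 2 | 1 |
| HA | 1 | 2 | 1 | 2 | 2 | 1 |
| AM | 1 | 2 | 2 | 1 | 2 | 1 |
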
Notes:
I have an older version of Excel (Microsoft Office Professional Plus 2016); please take that into account in your answer.
The contamination percentage is the same within each trial set (Trial 1, Trial 2). The Markdown table feature doesn't allow merged cells, which is why the percentage is repeated.
I assume that when you say in your question
"I do not want to rank them individually as it will take much time since I have more than 50 of them to rank"
you are referring to the percentage groups and/or the number of trials, which can be large and are the hardest part to scale.
In cell L3 (see screenshot below) I put the first formula, which can be filled both down and across. Pay attention to the $ anchors so that it works in both directions.
=RANK(OFFSET($C3,0, L$2*$J$2 + L$1),
OFFSET($C3,0, L$2*$J$2,1,$J$2),1)
Using a named range, for example naming $J$2 as totalTrials, makes the formula easier to read:
=RANK(OFFSET($C3,0, L$2*totalTrials + L$1),
OFFSET($C3,0, L$2*totalTrials,1,totalTrials),1)
Or, better still, the LET function can define all the variables first, which gives the most readable version. Note that LET is only available in Excel 2021 and Microsoft 365, so it will not work in Excel 2016:
=LET(startCell,$C3, totalTrials,$J$2,
groupCnt, L$2*totalTrials, trialCnt,L$1,
RANK(OFFSET(startCell,0, groupCnt + trialCnt),
OFFSET(startCell,0, groupCnt,1,totalTrials),1)
)
and here is the output:
By changing the values of Trial per Group (J2) or # Groups (J4) and filling the formula across and down, you can handle a large number of trials per group, a large number of groups and a large number of experiments.
Explanation
To rank each experiment within a group I use RANK(number, ref, order). To jump from one group to the next and pick a specific trial within the group I use OFFSET(reference, rows, cols, height, width) (see the documentation of the OFFSET function for more information).
For building the recurrence in the formula I use the following two helper rows:
L1:Q1: The trial number within the group, starting from zero, via the following formula: MOD((COLUMNS($A$1:A1)-1),$J$2). It generates the sequence: 0,1,2,...N, 0,1,2,..N, where N is the number of trials minus one (easier to start from zero for the recurrence). In the sample the number of trials is indicated in the cell: J2.
L2:Q2: The group number, starting from zero, that each trial belongs to via the following formula: INT((COLUMNS($A$1:A1)-1)/($J$2)). It generates the sequence: 0,0,0,..0, 1,1,1,..1,...M,M,M...M, where each number is repeated as many times as trials we have where M is the number of groups minus one.
Note: There is a more concise way to build such sequences in modern Excel versions via SEQUENCE, but you indicated that you have an older version.
In order to generate both sequences, I took the idea from here: Excel Magic Trick 692: More About Incrementing Numbers In Formulas (that works for older excel versions)
Combining properly both helper rows in OFFSET, the rank is calculated per trial per experiment.
This explanatory picture from ExcelJet helps to understand each input argument of the OFFSET function:
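To make the helper-row logic easier to follow, here is a small Python sketch of the same idea, not a replacement for the Excel formula. The function name `rank_within_groups` is hypothetical. The `trial` and `group` variables play the role of the MOD and INT helper rows, and the slice plays the role of the second OFFSET call.

```python
def rank_within_groups(row, trials_per_group):
    """Rank each value (ascending) among the trials of its own group."""
    ranks = []
    for col in range(len(row)):
        trial = col % trials_per_group    # helper row 1: MOD(col, trials)
        group = col // trials_per_group   # helper row 2: INT(col / trials)
        start = group * trials_per_group  # first column of this group
        value = row[start + trial]        # first OFFSET: the single cell
        block = row[start:start + trials_per_group]  # second OFFSET: the group
        # RANK(value, block, 1): one plus the count of strictly smaller values
        ranks.append(1 + sum(1 for v in block if v < value))
    return ranks


ola = [500, 75, 100, 430, 460, 230]  # OLA row from the sample data
print(rank_within_groups(ola, 2))
```

Applied to the OLA and AM rows with two trials per group, this reproduces the corresponding rows of the expected output table.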

EXCEL PERCENTILE result is wrong compared to a textbook?

I am helping my son with his math homework, specifically statistics and this is the dataset:
1 2 3 4 5 6 7 8 9 10
I have 10 numbers from 1 to 10.
15th percentile:
In Excel I use the PERCENTILE or PERCENTILE.INC function with .15 and the result is 2.35. Why?
The book way: .15*10 = 1.5th number. There is no 1.5th number, so round up to the 2nd number, which is 2.
20th percentile:
In Excel I get 2.8.
Book version: .2*10 = 2 (exact), so take the average of the 2nd and 3rd values for 2.5.
50th percentile or median:
In Excel I get 5.5.
Book version: .5*10 = 5 (exact), so take the average of the 5th and 6th values for 5.5 (the only match).
75th percentile:
In Excel I get 7.75.
Book version: .75*10 = 7.5, so round up to the 8th value, which is 8.
80th percentile:
In Excel I get 8.2.
Book version: .8*10 = 8 (exact), so take the average of the 8th and 9th values for 8.5.
Obviously Excel is doing more advanced math and additional smoothing. However, I have not been able to find the exact math it uses in order to replicate it, so for my purposes I will call it wrong. Other programs and statistical packages match Excel, so it is correct, but not useful in the form I need.
How can I get Excel to give me the book version of the answers, or at least replicate the Excel answers with paper and a basic calculator?
Most importantly, I need a way to explain to my son that it is OK that the results don't match and that he should do it the book way, at least for now or in school.
EDIT: After posting, SO found a similar question: Different results for percentiles in SAS and Excel. It seems SAS gives the same results as the book version. The answer there is that Excel and most packages use different interpolation methods. However, I need a better explanation for my son, and maybe a way to create a proper percentile function for him, hopefully without VBA.
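Both sets of numbers can be reproduced by hand. The Python sketch below puts the two rules side by side. The function names are illustrative, and the "book" rule is written from the description in the question, so treat it as an assumption about that textbook. PERCENTILE.INC places the percentile at position 1 + p*(n-1) in the sorted data and interpolates linearly between the two neighbouring values.

```python
import math


def percentile_book(values, p):
    """Textbook rule from the question: k = p*n; if k is whole, average
    the k-th and (k+1)-th values, otherwise round k up."""
    data = sorted(values)
    n = len(data)
    k = p * n
    nearest = round(k)
    if abs(k - nearest) < 1e-9:           # k is a whole number
        lo = min(max(nearest, 1), n)      # clamp 1-based positions to 1..n
        hi = min(nearest + 1, n)
        return (data[lo - 1] + data[hi - 1]) / 2
    return data[math.ceil(k) - 1]


def percentile_inc(values, p):
    """Linear interpolation at 0-based position p*(n-1), as PERCENTILE.INC does."""
    data = sorted(values)
    n = len(data)
    pos = p * (n - 1)
    lower = math.floor(pos)
    if lower + 1 >= n:                    # p == 1: the maximum
        return data[-1]
    frac = pos - lower
    return data[lower] + frac * (data[lower + 1] - data[lower])


numbers = list(range(1, 11))              # 1..10 from the question
for p in (0.15, 0.20, 0.50, 0.75, 0.80):
    print(p, percentile_book(numbers, p), round(percentile_inc(numbers, p), 2))
```

For p = 0.15, for example, the interpolated position is 1 + 0.15*9 = 2.35, and because the data are 1 to 10 the value at that position is also 2.35. That can be done with paper and a basic calculator.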

Excel Table of Daily hours summarized by week number for a user

I'm having a hard time getting my head around what I think is a simple enough problem.
I have an Excel table of hours by day for each user i.e.:
Date1, Date1+1, Date1+2, Date1+3,... Date1+n
User1 8 8 4 6 ... 2
User2 5 2 8 3 ... 7
User3 0 7 5 0 ... 8
For forecasting purposes this grid looks several months into the future.
I do my work daily, others want it by week. I'd like to automatically generate the same table of data but rolled up by WeekNum.
I tried putting a year-weeknum label at the top of the daily table and then, on another tab for the weekly data, using SUMIFS to match the user name and week number and sum the daily hours, but I couldn't get it to work:
=SUMIFS('Act - Forecast Hours'!$G$6:$AAL$35,'Act - Forecast Hours'!$A26,$A25,'Act - Forecast Hours'!S$4,O$3)
I think I'm overcomplicating a solution, any help is appreciated.
TIA
Rob
OK, I may have come up with an approach.
On my main hourly sheet the format is fixed: each week is 7 consecutive days.
I set up a second sheet with a vertical and a horizontal offset for each cell and used the following formula:
=SUM(OFFSET('Act - Forecast Hours'!$G$9,$A5,D$2,1,7))
$A5 and D$2 refer to offset counts that increment by 7. As you copy the formula to each cell it increments the Row / Column to point to the right spot. Then for the Height and Width I look at a grid 1 row high and 7 wide to select each day of the week.
It works, I'm happy. I'm certainly interested in a more refined approach if there is one :-)
Thank You to anyone that does read through the question!
Regards
Rob
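The SUM(OFFSET(...)) approach above amounts to summing each user's row in consecutive blocks of 7 columns. A minimal Python sketch of that roll-up follows. The function name and the sample hours are made up for illustration, and, like the OFFSET version, it assumes the first column is the first day of a week.

```python
def weekly_totals(daily_hours, days_per_week=7):
    """Sum each row in consecutive blocks of `days_per_week` columns."""
    totals = []
    for row in daily_hours:
        # Each slice is the 1-row by 7-column window that OFFSET selects;
        # a final partial week is summed as-is.
        totals.append([sum(row[start:start + days_per_week])
                       for start in range(0, len(row), days_per_week)])
    return totals


# Hypothetical data: two users, 14 days each
hours = [
    [8, 8, 4, 6, 2, 0, 0, 8, 8, 8, 8, 8, 0, 0],
    [5, 2, 8, 3, 7, 0, 0, 1, 1, 1, 1, 1, 1, 1],
]
print(weekly_totals(hours))
```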

count how often a piece of information appears and calculate that average

I do not want to know the traditional frequency or the traditional averages; so I'll give an example below:
I have this data:
1
3
5
5
2
3
5
5
1
3
The analysis that I would like to obtain is the following:
For example, number 1 appears once every eight rows, number 3 appears once every four rows, number 5 appears twice every two rows...
I did it by hand, but now I have more than 21000 rows of data and I'm stuck.
I searched but could not find a function that does it, so before developing my own I decided to ask for guidance on how to achieve it.
I believe that I was able to achieve the desired result with the following formula:
=IF(CONCATENATE("1-",MATCH(D1,INDIRECT(ADDRESS(MATCH(D1,A1:A17,0)+1,1,4)&":A17"),0))="1-1",CONCATENATE("2-",MATCH(D1,INDIRECT(ADDRESS(MATCH(D1,A1:A17,0)+2,1,4)&":A17"),0)-1),CONCATENATE("1-",MATCH(D1,INDIRECT(ADDRESS(MATCH(D1,A1:A17,0)+1,1,4)&":A17"),0)))
Note that the IF function solves the duplicates (like the number 5). In case you have triplicates you will have to add another instance of IF and adjust the formula accordingly.
Hope that helps!
Well this doesn't exactly reproduce your results, but you could start by looking at the max and min separation of the numbers:
=IF(COUNTIF(A$1:A$10,C2)<=1,"",MIN(IF((ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))>1)
*(ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))<=COUNTIF(A$1:A$10,C2)),
FREQUENCY(IF(A$1:A$10<>C2,ROW(A$1:A$10)),IF(A$1:A$10=C2,ROW(A$1:A$10)))))+1)
=IF(COUNTIF(A$1:A$10,C2)<=1,"",MAX(IF((ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))>1)
*(ROW(A$1:INDEX(A$1:A$10,COUNTIF(A$1:A$10,C2)+1))<=COUNTIF(A$1:A$10,C2)),
FREQUENCY(IF(A$1:A$10<>C2,ROW(A$1:A$10)),IF(A$1:A$10=C2,ROW(A$1:A$10)))))+1)
These give the minimum and maximum number of rows between consecutive occurrences of a particular number.
Each must be entered as an array formula using Ctrl+Shift+Enter.
You could add other statistics (such as the mean or standard deviation) the same way, although the average can be calculated simply as (lastrow-firstrow)/(count-1); e.g. for 5 it would be (8-3)/(4-1)=5/3.
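The same separation statistics can be sketched in a few lines of Python, which may be easier to check against 21000 rows than an array formula. The function name `gap_stats` is illustrative. It returns the minimum, maximum and mean number of rows between consecutive occurrences, and the mean equals (lastrow-firstrow)/(count-1).

```python
def gap_stats(values, target):
    """Return (min_gap, max_gap, mean_gap) between consecutive occurrences
    of `target`, or None if it appears fewer than twice."""
    # 1-based row numbers, to match the spreadsheet
    rows = [i for i, v in enumerate(values, start=1) if v == target]
    if len(rows) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(rows, rows[1:])]
    return min(gaps), max(gaps), sum(gaps) / len(gaps)


data = [1, 3, 5, 5, 2, 3, 5, 5, 1, 3]  # sample column from the question
for number in (1, 3, 5, 2):
    print(number, gap_stats(data, number))
```

For the sample data this gives a gap of 8 for the number 1 and 4 for the number 3, matching "once every eight rows" and "once every four rows" in the question.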

Using an array to take average of rolling difference in excel?

Hi, I need to calculate the average of a rolling difference in Excel, ideally using one formula.
For example, on the data set below the average of the 3-day difference would be
A
1
2
3
4
5
8
9
1
Average(A1-A4, A2-A5,A3-A6,A4-A7...etc)
Ideally I'd like to build something where I can calculate the average difference over y days.
Is there an easier way without having to build out a difference matrix?
So my formula would look like this
=(SUM(INDEX(A:A,COUNT(A:A)-2):INDEX(A:A,COUNT(A:A)))-SUM(A1:A3))/(COUNT(A:A)-3)
assuming the data starts in A1 with no headers and there are at least 4 numbers.
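The formula works because the sum of lagged differences telescopes: every value in the middle is added once and subtracted once, leaving the last 3 values minus the first 3. The Python sketch below checks this for a general lag (the "y days" in the question). The function names are illustrative. It computes later minus earlier (A4-A1, A5-A2, ...), as the formula above does, which is the negative of the A1-A4 ordering written in the question.

```python
def mean_lagged_difference(values, lag):
    """Average of values[i + lag] - values[i] over all valid i (direct sum)."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    return sum(values[i + lag] - values[i] for i in range(n - lag)) / (n - lag)


def mean_lagged_difference_telescoped(values, lag):
    """Same result via the shortcut: (sum of last `lag` - sum of first `lag`) / (n - lag)."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    return (sum(values[-lag:]) - sum(values[:lag])) / (n - lag)


column_a = [1, 2, 3, 4, 5, 8, 9, 1]  # sample data from the question
print(mean_lagged_difference(column_a, 3))
print(mean_lagged_difference_telescoped(column_a, 3))
```

For the sample data with a lag of 3, both functions give (18 - 6) / 5 = 2.4.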
