How to compute the maximum series of a specific condition returning true - excel

I have a slight issue counting the MAX frequency of rows where the third column is bigger than the second. This is just a statistic with scores.
The issue is that I want to do it in a single formula, without a macro.
B C
------
2 0
1 2
2 1
2 3
0 1
1 2
0 1
3 3
0 2
0 2
I have tried it with:
{=MAX(FREQUENCY(B3:B100;B3:B100>=C3:C100))} to get 1 for B
{=MAX(FREQUENCY(C3:C100;C3:C100>=B3:B100))} to get 7 for C
I expected it to return the longest series where the value in one column was bigger than in the other, but it did not work.

Try this version to get 7
=MAX(FREQUENCY(IF(C3:C100>=B3:B100,IF(B3:B100<>"",ROW(B3:B100))),IF(C3:C100<B3:B100,ROW(B3:B100))))
Confirm it with CTRL+SHIFT+ENTER.
Swap the B and C references to get your other result.
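To sanity-check the expected results outside Excel, here is a minimal Python sketch of the same idea: count the longest run of consecutive rows where a condition holds. The function name `longest_run` is hypothetical, and the data is the sample from the question; it is not a translation of the FREQUENCY formula itself.

```python
def longest_run(flags):
    """Return the length of the longest run of consecutive truthy values."""
    best = current = 0
    for flag in flags:
        # Extend the current run on True, reset it on False.
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


# Sample data from the question (columns B and C).
b = [2, 1, 2, 2, 0, 1, 0, 3, 0, 0]
c = [0, 2, 1, 3, 1, 2, 1, 3, 2, 2]

longest_c = longest_run(y >= x for x, y in zip(b, c))  # C >= B
longest_b = longest_run(x >= y for x, y in zip(b, c))  # B >= C
print(longest_c, longest_b)  # prints: 7 1
```

The comparison uses >= as in the question's own formulas, so a tie such as 3 and 3 counts towards both series.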

Related

Excel: How to get average of the between nonzero values?

I need the average of the 2nd to the 8th nonzero values in each row. The range would shift depending on where the nonzero values begin.
I think what I would need is to determine the following:
Location of the 2nd nonzero value
Location of the 8th nonzero value
Average of numbers between 2nd and 8th nonzero values
Is that possible?
For example
0 | 6 | 10 |5| 9 | 0 | 6 | 0 |3 | 10| 1|9|
Those in bold have to be averaged. The zeroes in between have to be ignored
If you are using Windows Excel 2019, then:
=AVERAGE(FILTERXML("<t><s>" &TEXTJOIN("</s><s>",TRUE,IF(row_ref<>0,row_ref,""))&"</s></t>","//s[position()>1 and position()<10]"))
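TEXTJOIN keeps only the values that are non-zero (and non-blank).
With the appropriate delimiters, the joined text becomes an XML string.
The XPath position() predicate then extracts positions 2 through 9, i.e. eight values starting at the 2nd non-zero one; change position()<10 to position()<9 to stop at the 8th.
AVERAGE then averages the extracted values.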
Assuming your data is laid out as per the example below, place the following formula in column T and copy down as required.
=IFERROR(AVERAGE(INDEX(22:22,AGGREGATE(15,6,COLUMN(B22:S22)/(B22:S22<>0),ROW($2:$9)))),"Less than " & ROWS($2:$9) & " non zero numerical entries")
B22:S22 represents the row of data you are looking at. Feel free to change the column letters to suit your needs; just ensure all the references within the formula match.
$2:$9 represents the entries to include in the average. 2 is the starting entry and corresponds to the 2nd non-zero value; 9 is the last entry to include, making a total of 8 numbers. Adjust these numbers to change which entries are included.
Keep the $ signs to prevent the row numbers from changing as the formula is copied.
Aggregate performs array like operations. As a result it may cause your system to bog down or crash if there is an excessive number of rows you are looking at. Also, full column references should generally be avoided within the aggregate function to avoid excess calculation.
I placed my answer not in A1 to make sure it will work anywhere on your sheet.
Assuming a layout like this (your first row of numbers from D2 to T2):
Paste this into Y2 (in the combined column):
=AVERAGEIF(OFFSET(C2,0,AGGREGATE(15,3,((D2:T2)>0)/((D2:T2)>0)*COLUMN(D2:T2)-COLUMN(C2),2)):OFFSET(C2,0,AGGREGATE(15,3,((D2:T2)>0)/((D2:T2)>0)*COLUMN(D2:T2)-COLUMN(C2),8)),">0")
Then copy down to the other rows.
Data I used:
1 2 0 6 4 9 0 7 3 7 1 1 1 2 6 9 7
0 0 7 0 1 1 4 4 1 5 8 3 5 6 4 6 4
1 7 2 8 0 6 4 9 9 7 8 4 6 9 4 2 9
0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
8 7 8 4 10 0 2 2 10 4 4 8 3 3 0 4 10
In this formula, C2 is the cell (empty or not) before the beginning of the row of numbers and not included in the numbers to be counted.
Assuming you have Excel 365, this can be done easily with just the FILTER, INDEX and AVERAGE functions:
=AVERAGE(INDEX(FILTER(row_ref,row_ref>0),{2,3,4,5,6,7,8,9}))
Example
=AVERAGE(INDEX(FILTER(B5:N5,B5:N5>0),{2,3,4,5,6,7,8,9}))
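As a cross-check on what the FILTER/INDEX formula computes, the following Python sketch filters out the zeros and averages a slice of the remaining values. The function name `average_nonzero_slice` and its `first`/`last` parameters are hypothetical; the defaults 2 and 9 mirror the {2,...,9} array in the formula above, and the row is the example from the question.

```python
def average_nonzero_slice(values, first=2, last=9):
    """Average the first-th through last-th nonzero values (1-based, inclusive)."""
    nonzero = [v for v in values if v != 0]
    if len(nonzero) < last:
        raise ValueError(f"fewer than {last} nonzero entries")
    picked = nonzero[first - 1:last]
    return sum(picked) / len(picked)


# Example row from the question.
row = [0, 6, 10, 5, 9, 0, 6, 0, 3, 10, 1, 9]

average_2_to_9 = average_nonzero_slice(row)          # 2nd..9th nonzero values
average_2_to_8 = average_nonzero_slice(row, last=8)  # 2nd..8th nonzero values
print(average_2_to_9)  # prints: 6.625
```

Passing last=8 gives the literal "2nd to 8th" reading of the question, which averages seven values instead of eight.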

Repeat X in seven rows before incrementing X

Like so:
1
1
1
1
1
1
1
2
2
2
2
2
2
2
...
Something like =IF(A1=4,1,A1+1) would work if the sequence wasn't all the same value.
I believe you are looking for division:
=INT((ROW()-1)/7)
NOTE: You will have to play with the 1 and 7 in the formula above to adapt to your needs. -1 is like an offset, and 7 is the number of times for the repeat. Use -2 if the numbers should start on row 2 for example. Lastly, if you want to start with 1 instead of zero, simply add 1.
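=(the previous row's value) + IF(the previous 7 rows all have the same value, 1, 0)
This will not work for the first 7 rows: enter the starting value in the first row and copy it into the next 6. As one possible concrete form, assuming the sequence sits in column A starting at A1, A8 could hold =A7+IF(COUNTIF(A1:A7,A7)=7,1,0), copied down.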
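The integer-division approach can be checked with a short Python sketch. The function `repeated_value` and its parameters are hypothetical names: `first_row` plays the role of the offset subtracted from ROW(), `repeat` is the 7, and `start` is what you add to begin at 1 instead of 0.

```python
def repeated_value(row, repeat=7, first_row=1, start=0):
    """Mirror =INT((ROW()-first_row)/repeat)+start for a 1-based row number."""
    return (row - first_row) // repeat + start


# Rows 1..14, starting the sequence at 1 rather than 0.
values = [repeated_value(r, start=1) for r in range(1, 15)]
print(values)  # prints: [1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2]
```

With start left at 0, the first seven rows give 0, which matches the bare =INT((ROW()-1)/7) formula.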

Counting digit in column based on subject

I am using only formulas in Excel and was wondering how to count all the 0s until a 1 is reached, and then start the process over again, per subject number. If this is not possible with formulas alone, how could I write VBA code for it?
Right now I am trying to use,
=IF(OR(F4=0,F3=1),"",COUNTIFS($A$2:A2, $A$2,$F$2:F2,0)-SUM($I$2:I2))
which I input in I3 and I change the COUNTIFS($A$#:A#, $A$#...) part for each subject number.
This seems to work with the exception of the last grouping, as it won't output a number before the next subject.
Example Data:
subid yes number_yes(output)
1 0
1 0
1 0 3
1 1
1 0 1
1 1
1 0
2 0
2 0 2
2 1
2 0
2 0
3
etc.
A blank cell is numerically zero and that is one of your accepted conditions. Differentiate between blanks and zero values.
=IF(AND(F4<>"",OR(F4=0,F3=1)),"",COUNTIFS($A$2:A2,$A$2,$F$2:F2,0)-SUM($I$2:I2))
Based on @Jeeped's answer: if you use -SUMIF($A$2:A2,A3,$I$2:I2) instead of -SUM($I$2:I2), you don't need to adjust this part for each subject number. Just use the following formula in I3 and copy it down.
=IF(AND(F4<>"",OR(F4=0,F3=1)),"",COUNTIFS($A$2:A3,A3,$F$2:F3,0)-SUMIF($A$2:A2,A3,$I$2:I2))
Note that I also changed the second parameter in the COUNTIFS to A3.
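The intended output can be sketched in Python to compare against what the formulas produce. This is a sketch under assumptions, not the formula's exact logic: the name `zero_run_counts` is hypothetical, the flags are assumed to be 0 or 1 only, and a run of zeros is also reported when the subject ends (the case the question says was missing).

```python
def zero_run_counts(subids, flags):
    """For each row, return the length of the zero run ending there, else None.

    A run ends when the next row holds a nonzero flag or belongs to another
    subject (or when the data ends).
    """
    out = [None] * len(flags)
    run = 0
    for i, (sid, flag) in enumerate(zip(subids, flags)):
        if flag != 0:
            run = 0
            continue
        run += 1
        last_of_subject = i + 1 == len(flags) or subids[i + 1] != sid
        if last_of_subject or flags[i + 1] != 0:
            out[i] = run
            run = 0
    return out


# Sample data from the question (subjects 1 and 2).
subids = [1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
flags = [0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0]
counts = zero_run_counts(subids, flags)
print(counts)
# prints: [None, None, 3, None, 1, None, 1, None, 2, None, None, 2]
```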

Average ifs with or in excel

I would like to find the average of a column, using OR to check criteria in the adjacent columns. I tried putting OR inside AVERAGEIF, which failed, and I also tried "AVERAGE(IF(OR(", which did not return the correct result either. I thought this would be simple, but I don't know why it doesn't work. My table looks something like this:
ID: Rate Check 1 Check 2 Check 3
1 5 1 1 1
2 3 1 1
3 2 1
4 4
5 5 1 1
6 3
7 4 1
I would like to average the Rate column over the rows that have any value in Check 1, Check 2 or Check 3, so in the case above I would get the average of every row except those with ID 4 and 6. Is this possible without using a helper column?
You can use SUMPRODUCT()
=SUMPRODUCT(((C2:C8<>"")+(D2:D8<>"")+(E2:E8<>"")>0)*(B2:B8<>"")*B2:B8)/SUMPRODUCT(--((C2:C8<>"")+(D2:D8<>"")+(E2:E8<>"")>0)*(B2:B8<>""))
If your first ID starts in A2, use this formula (edited to handle empty values in the "Rate" column):
=AVERAGE(IF(MMULT(LEN(C2:E8)*LEN(B2:B8),ROW(INDIRECT("1:"&COLUMNS($C$1:$E$1)))),B2:B8))
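To verify the result either formula should give, here is a minimal Python sketch of "average the rate where any check column is filled". The name `average_if_any` is hypothetical, and because the original table layout was flattened, the exact columns in which the 1s sit are an assumption; only the number of marks per row is taken from the question.

```python
def average_if_any(rows):
    """Average the rate of rows that have a rate and at least one non-blank check."""
    rates = [
        rate
        for rate, *checks in rows
        if rate is not None and any(c is not None for c in checks)
    ]
    return sum(rates) / len(rates)


# (rate, check1, check2, check3); None stands for a blank cell.
# The positions of the 1s within the check columns are assumed.
table = [
    (5, 1, 1, 1),           # ID 1
    (3, 1, 1, None),        # ID 2
    (2, 1, None, None),     # ID 3
    (4, None, None, None),  # ID 4, excluded
    (5, 1, 1, None),        # ID 5
    (3, None, None, None),  # ID 6, excluded
    (4, 1, None, None),     # ID 7
]

average_rate = average_if_any(table)
print(round(average_rate, 2))  # prints: 3.8
```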

Compare multiple data from rows

I'm looking for a way to compare multiple rows of data with each other to find the best possible match. Each number in a column must approximately match the other numbers in the same column.
Example:
Customer #1: 1 5 10 9 7 7 8 2 3
Customer #2: 10 5 9 3 5 7 4 3 2
Customer #3: 1 4 10 9 8 7 6 2 2
Customer #4: 9 5 6 7 2 1 10 5 6
In this example, customers #1 and #3 are quite similar, and I need a way to highlight or sort the rows so I can easily find the best match.
I've tried using conditional formatting to highlight the numbers that are similar, but that is quite confusing because the amount of data is quite big.
Any ideas of how I could solve this?
Thanks!
The following formula, entered in (say) L1 and filled down, gives the score of the best match for the current row, based on the sum of the absolute differences between corresponding cells:
=MIN(IF(ROW($C$1:$K$4)<>ROW(),(MMULT(ABS($C1:$K1-$C$1:$K$4),TRANSPOSE(COLUMN($C$1:$K$4))^0))))
It is an array formula and must be entered with Ctrl+Shift+Enter.
You can then sort on column L to bring the customers with the lowest scores (the closest matches) to the top, or use conditional formatting to highlight rows with a certain score.
EDIT
If you wanted to penalise large differences in individual columns more heavily than small ones, to avoid pairing customers that are fairly similar overall but very different in a few columns, you could try something like the square of the differences:
=MIN(IF(ROW($C$1:$K$4)<>ROW(),(MMULT(($C1:$K1-$C$1:$K$4)^2,TRANSPOSE(COLUMN($C$1:$K$4))^0))))
then the scores for your test data would come out as 7,127,7,127.
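The scores quoted above can be reproduced with a short Python sketch of the same scoring idea: for each row, take the smallest total difference to any other row. The function name `best_match_scores` and its `squared` flag are hypothetical; the data is the four customers from the question.

```python
def best_match_scores(rows, squared=False):
    """For each row, return the smallest total difference to any other row."""
    def distance(a, b):
        if squared:
            return sum((x - y) ** 2 for x, y in zip(a, b))
        return sum(abs(x - y) for x, y in zip(a, b))

    return [
        min(distance(row, other) for j, other in enumerate(rows) if j != i)
        for i, row in enumerate(rows)
    ]


customers = [
    [1, 5, 10, 9, 7, 7, 8, 2, 3],   # Customer #1
    [10, 5, 9, 3, 5, 7, 4, 3, 2],   # Customer #2
    [1, 4, 10, 9, 8, 7, 6, 2, 2],   # Customer #3
    [9, 5, 6, 7, 2, 1, 10, 5, 6],   # Customer #4
]

absolute_scores = best_match_scores(customers)
squared_scores = best_match_scores(customers, squared=True)
print(absolute_scores)  # prints: [5, 23, 5, 29]
print(squared_scores)   # prints: [7, 127, 7, 127]
```

A lower score means a closer match, so customers #1 and #3 stand out under both measures.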
I'm assuming you want to compare customers 2-4 with customer 1 and that you are comparing only within each column. In this case, you could implement a 'scoring system' using multiple IFs. For example:
     A            B  C  D  E
1    Customer 1   1  1  2
2    Customer 2   1  2  2
3    Customer 3   0  1  0
you could use in E2
=IF(B2=$B$1,1,0)+IF(C2=$C$1,1,0)+IF(D2=$D$1,1,0)
This will return a 'score' of 1 when you have a match and a 'score' of 0 when you don't. It then adds up the scores and your highest value will be your best match. Copying down would then give
     A            B  C  D  E
1    Customer 1   1  1  2
2    Customer 2   1  2  2  2
3    Customer 3   0  1  0  1
so customer 2 is the best match.
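The same match-counting score can be sketched in Python to confirm the numbers in the table. The function name `match_score` is hypothetical; the values are the three example rows above, with Customer 1 as the reference.

```python
def match_score(reference, candidate):
    """Count the positions where the candidate equals the reference."""
    return sum(1 for r, c in zip(reference, candidate) if r == c)


reference = [1, 1, 2]  # Customer 1 (columns B, C, D)
candidates = {
    "Customer 2": [1, 2, 2],
    "Customer 3": [0, 1, 0],
}

scores = {name: match_score(reference, row) for name, row in candidates.items()}
best = max(scores, key=scores.get)
print(scores)  # prints: {'Customer 2': 2, 'Customer 3': 1}
print(best)    # prints: Customer 2
```

Unlike the difference-based scores, this counts only exact matches, so a value that is off by 1 scores the same as one that is off by 9.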

Resources