Excel: Count until, then repeat? - excel

I have a list of numbers which are either 1's or 2's. What I'd like to do is count how many 1's there are before a 2 appears, and then keep repeating this down the list (I'm trying to find the average number of 1's between each 2).
What would be the best way of doing this considering I've got over 10,000 rows? (i.e. too many to do manually)

The average number of 1's between each 2 is the same as the ratio of the count of 1's to the count of 2's.
Example:
1
1
2
1
1
1
1
2
1
1
2
1
1
2
Contains 10 ones and 4 twos.
Or there are four groups of ones, with the following counts: 2, 4, 2, 2
Either way, it will give you an average of 2.5 (10/4 = 2.5)
Note: You have to make a design choice, regarding how to handle beginnings and ends. If you had another one, after the last two, how should it be handled?
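To check that the two ways of computing the average agree, here is a minimal Python sketch of the same reasoning. The function name `runs_of_ones` is made up for illustration, and the handling of trailing 1's (ignored if no 2 follows them) is one possible answer to the design choice mentioned above, not the only one.

```python
def runs_of_ones(values):
    """Return the length of each run of 1's that is closed by a 2."""
    runs = []
    current = 0
    for v in values:
        if v == 1:
            current += 1
        elif v == 2:
            # A 2 closes the current run (two 2's in a row give a run of 0).
            runs.append(current)
            current = 0
    # Assumption: trailing 1's with no closing 2 are ignored.
    return runs


data = [1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2]
runs = runs_of_ones(data)
average = sum(runs) / len(runs)        # mean run length
ratio = data.count(1) / data.count(2)  # count of 1's / count of 2's
print(runs, average, ratio)
```

The two numbers only match when every 1 belongs to a run that ends in a 2. If the list ends in 1's, the simple ratio counts them and this sketch does not.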

You can use the following formulas in helper columns B and C (with the data in column A):
Note that the formula in the first row is different.
B C
=IF(A2=1,B1,B1+1) =COUNTIF(B:B,B2)
=IF(A3=1,B2,B2+1) =IFERROR(IF(A4=2,COUNTIF(B:B,B4),"")-1,"")
Then to get the average use:
=AVERAGEIF(C:C,"<>"&0)
Noceo's solution as a formula:
=COUNTIF(A:A,1)/COUNTIF(A:A,2)

Related

Duplicating Data per number cycle

I'm trying to duplicate data down a set of columns based on a number cycle. Every time a number in the sequence repeats, I'd like the number I'm populating to increase by 1.
For example:
A B
1 1
1 2
1 4
2 1
2 2
2 4
Every time column B repeats its cycle (in this case, when it goes back to 1, 2, etc.), I'd like column A to increase by 1.
Initially I thought something like =IF(A3<>A2,B2+1,B2) would suffice, but that repeats.
Is there a different formula I can use to accomplish this?
Depending on your scenario, I'd do this.
These formulas need to originate in cell A2. If you're not working in the top left hand corner of your sheet, you'll need to adjust the formula accordingly.
Scenario 1 - All numbers in the sequence are unique.
=IF(INDIRECT(SUBSTITUTE(ADDRESS(1,B2),"1","") & "1") = B2, MAX($A$1:A1) + 1, MAX($A$1:A1))
Scenario 2 - All numbers in the sequence are NOT unique.
=IF(SUMIF($A$1:A1, MAX($A$1:A1), $A$1:A1) / 6 = MAX($A$1:A1), MAX($A$1:A1) + 1, MAX($A$1:A1))
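For comparison, the logic of Scenario 1 can be sketched outside the spreadsheet. This is only an illustration under an assumption: a new cycle is taken to start whenever the first value of the sequence shows up again, which holds only when the values within a cycle are unique. The name `cycle_numbers` is hypothetical.

```python
def cycle_numbers(sequence):
    """Number each element by the cycle it belongs to (1-based).

    Assumption: a cycle restarts whenever the first value of the
    sequence appears again, so values must be unique within a cycle.
    """
    if not sequence:
        return []
    first = sequence[0]
    counter = 0
    result = []
    for value in sequence:
        if value == first:
            counter += 1  # the cycle has wrapped back to its start
        result.append(counter)
    return result


column_b = [1, 2, 4, 1, 2, 4]
column_a = cycle_numbers(column_b)
print(column_a)
```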

Dynamically sort list based off associated values with tie-breaker values

I'm trying to sort students based off frequency of participation. I have a table that is automatically generated totaling up how often a student has participated in the last few days.
I want it to do 2 things that I can't figure out.
I want it to ignore students that are at 0 removing them from the resulting rankings.
The first number is the most important, but I want it to fall back to the next value in the event of a tie.
Short example of table:
Andy - 1 1 2 3
Brad - 0 1 2 3
Cade - 1 2 3 4
Dane - 1 1 1 2
Desired result:
Cade - 1
Andy - 1
Dane - 1
The tie-breaker isn't that important and I figure I can have conditional formatting to remove children at 0, but I still can't seem to figure it out.
The closest formulas I have found in my searching are:
=INDEX($A$10:$A$9,MATCH(ROWS($C$1:C1),$C$1:$C$9,0))
This one doesn't work because it returns #N/A for pretty much all students who are tied.
=IFERROR(INDEX($C$1:$C$9,MATCH(SMALL(NOT($C$1:$C$9="")*IF(ISNUMBER($C$1:$C$9),COUNTIF($C$1:$C$9,"<="&$C$1:$C$9),COUNTIF($C$1:$C$9,"<="&$C$1:$C$9)+SUM(--ISNUMBER($C$1:$C$9))),ROWS($C$1:C1)+SUM(--ISBLANK($C$1:$C$9))),NOT($C$1:$C$9="")*IF(ISNUMBER($C$1:$C$9),COUNTIF($C$1:$C$9,"<="&$C$1:$C$9),COUNTIF($C$1:$C$9,"<="&$C$1:$C$9)+SUM(--ISNUMBER($C$1:$C$9))),0)),"")
I had this formula that can handle ties but it needs to be OFFSET but I don't know how since it is an array formula. Also, with both these formulas it reverses the ranks with the lowest values at the top. If anyone could assist me I would greatly appreciate it. I'm doing this so that I can give all students a chance to participate equally.
Use a helper column. In that column put the following formula:
=IF(B1=0,"n/a",SUMPRODUCT(B1:E1/10^(COLUMN(B1:E1)-MIN(COLUMN(B1:E1)))))
This will return a single number based on the rankings.
Then in your output column use:
=IFERROR(INDEX(A:A,MATCH(LARGE(F:F,ROW(1:1)),F:F,0)),"")
Then a simple VLOOKUP to return the first number:
=IF(I1<>"",VLOOKUP(I1,A:B,2,FALSE),"")
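The helper-column idea (fold each row into one number, with each later column weighted ten times less than the one before) can be sketched in Python as follows. The names `students` and `composite_score` are made up for the example, and like the SUMPRODUCT formula it assumes every count is below 10, otherwise a later column could outweigh an earlier one.

```python
# Sample data from the question: name -> participation counts.
students = {
    "Andy": [1, 1, 2, 3],
    "Brad": [0, 1, 2, 3],
    "Cade": [1, 2, 3, 4],
    "Dane": [1, 1, 1, 2],
}


def composite_score(counts):
    """Weight each successive column by a further factor of 1/10.

    Assumption: every count is a single digit (0-9).
    """
    return sum(c / 10 ** i for i, c in enumerate(counts))


# Drop students whose first value is 0, then sort highest score first.
ranked_names = sorted(
    (name for name, counts in students.items() if counts[0] != 0),
    key=lambda name: composite_score(students[name]),
    reverse=True,
)
ranking = [(name, students[name][0]) for name in ranked_names]
print(ranking)
```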

Count occurrences of strings just once per row in Google Sheets

I have strings of spreadsheet data that need counting by 'type' but not instance.
A B C D
1 Lin 1 2 1
2 Tom 1 4 2
3 Sue 3 1 4
The correct sum of students assigned to teacher 1 is 3, not 4. That teacher 1 meets Lin in lessons B and D is irrelevant to the count.
I borrowed a formula which works in Excel but not in Google Sheets where I and others need to keep and manipulate the data.
F5=SUMPRODUCT(SIGN(COUNTIF(OFFSET(B$2:D$2, ROW($2:$4)-1, 0), E5)))
A B C D E
2 Lin 1 2 1
3 Tom 1 4 2
4 Sue 3 1 4
5 1 [exact string being searched for, ie a teacher name]
I don't know what is not being understood by Google Sheets in that formula. Does anyone know the correct expression to use, or a more efficient way to get the accurate count I need, without duplicates within rows inflating the count?
So this is the mmult way, which works by finding the row totals of students assigned to teacher 1 etc., then seeing how many of the totals are greater than 0.
=ArrayFormula(sum(--(mmult(n(B2:D4=E5),transpose(column(B2:D4)))>0)))
or
=ArrayFormula(sum(sign(mmult(n(B2:D4=E5),transpose(column(B2:D4))))))
Also works in Excel if entered as an array formula without the ArrayFormula wrapper.
A Google Sheets-specific formula can be quite short:
=ArrayFormula(COUNTUNIQUE((B2:D4=E5)*row(B2:D4)))-1
counting the unique rows containing a match.
Note - I am subtracting 1 in the last formula above because I am assuming there is at least one zero (non-match) which should be ignored. This would fail in the extreme case where all students in all classes are assigned to the same teacher so you have a matrix (e.g.) of all 1's. This would be more theoretically correct:
=ArrayFormula(COUNTUNIQUE(if(B2:D4=E5,row(B2:D4),"")))
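The idea behind all of these formulas is "count the rows that contain a match at least once". A minimal Python sketch of that idea, using the sample data from the question (the names `rows` and `count_rows_with` are hypothetical):

```python
# Student -> teacher assigned in lessons B, C and D.
rows = {
    "Lin": [1, 2, 1],
    "Tom": [1, 4, 2],
    "Sue": [3, 1, 4],
}


def count_rows_with(target, table):
    """Count rows in which target appears at least once.

    Repeats within a row do not inflate the count.
    """
    return sum(1 for lessons in table.values() if target in lessons)


print(count_rows_with(1, rows))
```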

Compare multiple data from rows

I'm looking for a way to compare multiple rows of data to each other, trying to find the best possible match. Each number in every column must approximately match the other numbers in the same column.
Example:
Customer #1: 1 5 10 9 7 7 8 2 3
Customer #2: 10 5 9 3 5 7 4 3 2
Customer #3: 1 4 10 9 8 7 6 2 2
Customer #4: 9 5 6 7 2 1 10 5 6
In this example customers #1 and #3 are quite similar, and I need to find a way to highlight or sort the rows so I can easily find the best match.
I've tried using conditional formatting to highlight the numbers that are similar, but that is quite confusing, because the amount of data is quite big.
Any ideas of how I could solve this?
Thanks!
The following formula entered in (say) L1 and pulled down gives the best match with the current row based on the sum of the absolute differences between corresponding cells:-
=MIN(IF(ROW($C$1:$K$4)<>ROW(),(MMULT(ABS($C1:$K1-$C$1:$K$4),TRANSPOSE(COLUMN($C$1:$K$4))^0))))
It is an array formula and must be entered with Ctrl+Shift+Enter.
You can then sort on column L to bring the customers with the lowest scores (the closest matches) to the top, or use conditional formatting to highlight rows with a certain similarity value.
EDIT
If you wanted to penalise large differences in individual columns more heavily than small differences to try and avoid pairs of customers which are fairly similar except for having some columns very different, you could try something like the square of the differences:-
=MIN(IF(ROW($C$1:$K$4)<>ROW(),(MMULT(($C1:$K1-$C$1:$K$4)^2,TRANSPOSE(COLUMN($C$1:$K$4))^0))))
then the scores for your test data would come out as 7,127,7,127.
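To make the two scoring rules concrete, here is a small Python sketch that computes, for each customer, the smallest total difference to any other customer, first with absolute differences and then with squared differences. The function name `best_match_scores` is made up for the example.

```python
customers = [
    [1, 5, 10, 9, 7, 7, 8, 2, 3],   # Customer #1
    [10, 5, 9, 3, 5, 7, 4, 3, 2],   # Customer #2
    [1, 4, 10, 9, 8, 7, 6, 2, 2],   # Customer #3
    [9, 5, 6, 7, 2, 1, 10, 5, 6],   # Customer #4
]


def best_match_scores(rows, distance):
    """For each row, return the smallest summed distance to any other row."""
    scores = []
    for i, row in enumerate(rows):
        totals = (
            sum(distance(a, b) for a, b in zip(row, other))
            for j, other in enumerate(rows)
            if j != i  # never compare a row with itself
        )
        scores.append(min(totals))
    return scores


abs_scores = best_match_scores(customers, lambda a, b: abs(a - b))
sq_scores = best_match_scores(customers, lambda a, b: (a - b) ** 2)
print(abs_scores, sq_scores)
```

Lower scores mean closer matches. Customers #1 and #3 share the lowest score under both rules.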
I'm assuming you want to compare customers 2-4 with customer 1 and that you are comparing only within each column. In this case, you could implement a 'scoring system' using multiple IFs. For example:
A B C D E
1 Customer 1 1 1 2
2 Customer 2 1 2 2
3 Customer 3 0 1 0
you could use in E2
=if(B2=$B$1,1,0)+if(C2=$C$1,1,0)+if(D2=$D$1,1,0)
This will return a 'score' of 1 when you have a match and a 'score' of 0 when you don't. It then adds up the scores and your highest value will be your best match. Copying down would then give
A B C D E
1 Customer 1 1 1 2
2 Customer 2 1 2 2 2
3 Customer 3 0 1 0 1
so customer 2 is the best match.
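The same exact-match scoring can be sketched in Python. The names `reference`, `candidates` and `match_score` are hypothetical. Note that this counts only exact matches, so near misses score the same as large differences.

```python
reference = [1, 1, 2]  # Customer 1 (the row everything is compared to)
candidates = {
    "Customer 2": [1, 2, 2],
    "Customer 3": [0, 1, 0],
}


def match_score(ref, row):
    """Score 1 for every column that exactly equals the reference."""
    return sum(1 for a, b in zip(ref, row) if a == b)


scores = {name: match_score(reference, row) for name, row in candidates.items()}
best = max(scores, key=scores.get)  # highest score is the best match
print(scores, best)
```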

IF THEN Statement Multiple Conditions

I'm trying to design an Excel formula for some golf game scores. Golfers get points based on a random number (0-9) and the last digit of their score. So, if the random number is 0 and the golfer's score ends in 0, they get 10 points. Still with a random number of 0, if the golfer's score ends in a 1, they get 9 points. 8 points for a last digit of 2. 7 for 3. 6 for 4. 10 for 5. 9 for 6. 8 for 7. 7 for 8. 6 for 9.
Score ends in: Points:
0 10
1 9
2 8
3 7
4 6
5 10
6 9
7 8
8 7
9 6
As long as I come up with one formula for the random number of 0, I can adjust it for the remaining 9 random numbers.
The way I was hoping for this to work was to just be able to enter the scores into one column and then have the points calculated in a separate column. There will also have to be a cell where I enter the random number.
Any help is appreciated!
You can use something like this to get the score if the random number is 0:
=IF(A1=0,10-MOD(RIGHT(B1),5))
This will give the points you mentioned provided:
A1 is the cell containing the random number
B1 is the cell containing the golfer's score.
The main formula here is:
10-MOD(RIGHT(B1),5)
RIGHT() takes the last digit of the score. MOD(,5) will get the remainder when this digit is divided by 5.
When you have 0, you get no remainder, hence 0.
When you have 1, you get a remainder of 1, hence 1.
When you have 2, you get a remainder of 2, hence 2.
When you have 6, you get a remainder of 1, hence 1 again.
Then 10 minus that remainder gives you the points you're looking for.
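As a quick check of the arithmetic, the same rule can be written in Python and compared against the table in the question. The function name `points_for_zero` is made up, and the sketch assumes whole-number scores.

```python
def points_for_zero(score):
    """Points when the random number is 0: 10 minus (last digit mod 5)."""
    last_digit = abs(score) % 10  # same role as RIGHT(B1) in the formula
    return 10 - last_digit % 5


# Points for last digits 0 through 9.
table = [points_for_zero(d) for d in range(10)]
print(table)
```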
You could certainly use =RAND()*10 to get a random value in Excel. To get a whole number without decimals, use =ROUNDDOWN(RAND()*10;0) (the semicolon is the argument separator in some locales; others use a comma).
Then add something like a VLOOKUP to assign a value to each player's score. RIGHT(A1;1) gives you the last character of a cell.
VLOOKUP requires you to have a table somewhere with the values for each score as you described.
Edit: the MOD solution looks even better. Please note that RAND gets a fresh value every time the sheet recalculates, so you will probably want to generate a value once and enter it into another cell manually.
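The lookup-table approach can be sketched in Python as well. This is only an illustration: `POINTS_FOR_ZERO`, `draw_random_number` and `lookup_points` are hypothetical names, and the table covers only the case where the random number is 0, as in the question.

```python
import random

# Last digit of the score -> points, for a random number of 0.
POINTS_FOR_ZERO = {0: 10, 1: 9, 2: 8, 3: 7, 4: 6, 5: 10, 6: 9, 7: 8, 8: 7, 9: 6}


def draw_random_number():
    """Draw a whole number from 0 to 9, like ROUNDDOWN(RAND()*10, 0)."""
    return random.randrange(10)


def lookup_points(score, table=POINTS_FOR_ZERO):
    """Look up the points for a score by its last digit."""
    return table[abs(score) % 10]


drawn = draw_random_number()  # draw once and keep it, rather than redrawing
print(drawn, lookup_points(72))
```

A table is more verbose than the MOD formula, but it keeps working if the points stop following a regular pattern.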
