Column "A" is a numbering column for each row, and some of the numbers repeat, e.g.:
A1 is 1
A2 is 3
A3 is 1
A4 is 3
I need a formula that will show how many cells with content are in this column without counting duplicates; for the example above the result would be 2. I was thinking of an "If-Then" formula but am unable to get it right. Any help out there? Thank you in advance!
If you're using Excel 2013, I want to say that there's a count distinct function. Nonetheless, you can do it like this:
=SUM(IF(FREQUENCY(A1:A4,A1:A4)>0,1))
EDIT: Adding an explanation. The FREQUENCY function gets the frequency of the unique values within the array A1:A4 (first parameter), binning it using the values within A1:A4 (second parameter). The IF checks to see if that returns anything, i.e. if the frequency is greater than 0, in which case it returns 1 for each unique value it finds. Then the SUM adds the number of 1s returned by the IF statement, in turn giving you the number of unique values within the array A1:A4.
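To see why the formula lands on 2 for the sample column, here is a minimal Python sketch of the same logic. It assumes that FREQUENCY reports a zero for a bin value that has already appeared earlier in the bins array, which is the behavior the formula relies on. The function names are hypothetical, and the sketch expects a non-empty list of numbers.

```python
def frequency(data, bins):
    """Rough model of Excel's FREQUENCY for numeric data (an assumption-based sketch)."""
    ordered = sorted(set(bins))
    # Lower edge of each bin: the next smaller distinct bin value, or -infinity.
    lower = {b: (ordered[i - 1] if i > 0 else float("-inf"))
             for i, b in enumerate(ordered)}
    seen = set()
    counts = []
    for b in bins:
        if b in seen:
            counts.append(0)  # a repeated bin value contributes nothing
        else:
            seen.add(b)
            counts.append(sum(1 for v in data if lower[b] < v <= b))
    # FREQUENCY returns one extra element: values above the largest bin.
    counts.append(sum(1 for v in data if v > ordered[-1]))
    return counts


def count_distinct(values):
    # Mirrors =SUM(IF(FREQUENCY(A1:A4,A1:A4)>0,1))
    return sum(1 for c in frequency(values, values) if c > 0)


column_a = [1, 3, 1, 3]  # A1:A4 from the question
print(frequency(column_a, column_a))
print(count_distinct(column_a))
```

Only the first occurrence of each value gets a non-zero count, so counting the non-zero entries counts the distinct values.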
I have five columns in Excel and I want to return the maximum value's column heading name. However, there are cases where the max values are repeated more than once for the same row. So, I am trying to return both column names.
The green values are the min and the red are the max. In row 4 there is clearly more than one max with the same value, and I would like to return B and E in the stream cell.
I tried this formula in Excel using the index:
=IF(ISNUMBER(A6),INDEX($B$5:$F$5,1,MATCH(L6,B6:F6,0)),"")
MATCH returns the first match, so I think you need something gross like the following (I started with the first column and row, but you can shift the column numbers -- the rows are designed to be draggable)
=IF(ISNUMBER(F2),IF(F2=A2,$A$1,"")&IF(F2=B2,$B$1,"")&IF(F2=C2,$C$1,"")&IF(F2=D2,$D$1,"")&IF(F2=E2,$E$1,""))
You can use FILTER() to return multiple values. In this example, I've concatenated them with TEXTJOIN():
In cell E2 enter the formula =MAX(A2:D2). In cell F2 enter =TEXTJOIN(,,FILTER(A$1:D$1,A2:D2=E2)). Copy down.
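As a rough illustration of what the FILTER/TEXTJOIN pair does, here is a small Python sketch. The function name and the `sep` parameter are hypothetical additions; the formula above joins with an empty delimiter, which is the default here.

```python
def max_headers(headers, row, sep=""):
    """Join the headers of every column that holds the row maximum."""
    top = max(row)  # plays the role of =MAX(A2:D2)
    # FILTER keeps the headers whose value equals the maximum,
    # TEXTJOIN concatenates whatever is left.
    return sep.join(h for h, v in zip(headers, row) if v == top)


headers = ["A", "B", "C", "D"]
print(max_headers(headers, [3, 7, 7, 1]))
print(max_headers(headers, [3, 7, 7, 1], sep=", "))
```

Passing a visible separator makes tied headers easier to read than the bare concatenation.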
Thank you for taking the time to look at this question.
I'm looking for an equation that can easily take the numerical values from Sheet 1 (the first picture) which has 2 blank cells in between values for four values and then has 4 blank cells and then the other four values. I'm not sure if I am making sense but hopefully the picture I have attached helps.
Notice 2 blank rows between first 4 rows with values (Rows 2-11) and same between rows 16 and 25.
Also notice the 4 blank rows between the two sets of values.
For me, this is repeated for 700 values, same set up of 2 blank rows for 4 sets of values and then 4 blank rows and then four sets of values with 2 blank rows. I'm sure there is an easier way to do this.
I'm trying to recreate Sheet 2 from Sheet 1 using an equation. Is this possible?
Apologies in advance, English isn't my first language.
If the numbers are going to start in B2 and the intervals and offset staggers are static then,
=INDEX(B:B, 2+(ROW(1:1)-1)*3+INT((ROW(1:1)-1)/4)*2)
If the first number is in S6 then,
=INDEX(S:S, 6+(ROW(1:1)-1)*3+INT((ROW(1:1)-1)/4)*2)
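The row arithmetic inside these INDEX formulas can be checked with a short Python sketch. The function name is hypothetical, and `n` stands for the counter that ROW(1:1) produces as the formula is copied down.

```python
def sheet1_row(n, first_row=2):
    """Source row for the n-th output value (n starts at 1).

    Each value sits 3 rows below the previous one, and every completed
    group of 4 values adds 2 more rows for the wider gap between groups.
    """
    return first_row + (n - 1) * 3 + ((n - 1) // 4) * 2


print([sheet1_row(n) for n in range(1, 9)])
print([sheet1_row(n, first_row=6) for n in range(1, 6)])
```

The first call reproduces rows 2 to 11 and 16 to 25 described in the question.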
Put this in D2:
=IFERROR(INDEX(Sheet1!B:B,AGGREGATE(15,6,ROW(Sheet1!$B$2:INDEX(Sheet1!B:B,MATCH("ZZZ",Sheet1!A:A)))/(Sheet1!$B$2:INDEX(Sheet1!B:B,MATCH("ZZZ",Sheet1!A:A))<>""),ROW(1:1))),"")
And copy down till you get blanks.
This will return the numbers in order that they appear on sheet 1.
The Sheet1!$B$2:INDEX(Sheet1!B:B,MATCH("ZZZ",Sheet1!A:A)) part sets the bounds of the data set. Because this is an array-type formula, it should reference the smallest possible data set. This part finds the last cell in Column A and uses it as the extent of the data set so that we do not do unnecessary iterations.
The MATCH part will return the last row that has text in it; if Column A has numbers, then we need to change the "ZZZ" to 1E+99 to get the last row in column A with a number.
The AGGREGATE works like SMALL here: it builds an array of row numbers and errors. It yields the row number where (Sheet1!$B$2:INDEX(Sheet1!B:B,MATCH("ZZZ",Sheet1!A:A))<>"") returns TRUE, and an error where it returns FALSE.
The second argument, 6, tells AGGREGATE to ignore the errors, so it only looks at the returned row numbers.
The ROW(1:1) is a counter. As the formula is dragged down it will iterate to 2 then 3 and so on. This tells the Aggregate that you want the 1st then the 2nd then the 3rd and so on.
The chosen row number is then passed to the INDEX and the correct value is returned.
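The steps above can be sketched in Python as follows. This is only a model of the logic, not of Excel itself: the list stands in for the Sheet1 column, list positions stand in for row numbers, and the function name is hypothetical.

```python
def nth_non_blank(cells, n):
    """Return the n-th non-blank value in order of appearance, or "" if none."""
    # ROW(range)/(range<>"") yields a row number for non-blank cells and an
    # error for blank ones; option 6 drops the errors, leaving these positions.
    positions = [i for i, v in enumerate(cells) if v != ""]
    if n > len(positions):
        return ""  # the IFERROR(...,"") fallback
    # AGGREGATE(15, 6, ..., n) picks the n-th smallest position, INDEX reads it.
    return cells[positions[n - 1]]


column_b = [10, "", "", 20, "", "", 30]
print([nth_non_blank(column_b, n) for n in range(1, 5)])
```

Copying the formula down corresponds to calling the function with n = 1, 2, 3 and so on until it starts returning blanks.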
If your numbers are in order (smallest to largest like your example) or you want the output in order(smallest to largest) then you can use this simple equation in D2:
=IFERROR(SMALL(Sheet1!B:B,ROW(1:1)),"")
Then copy down till you get blanks.
Here is another formula you might use.
=INDIRECT(ADDRESS((INT((ROW()-ROW($A$2))/4)*14+ROW(A$2))+(MOD(ROW()-ROW($A$2),4)*3),COLUMN($A$2),1,1,"Sheet1"))
You can paste it to the first cell where you want the result and copy down.
Note that $A$2 is the cell from which all the counting starts. If your data start from A3, you can change the references accordingly. Note further that ROW($A$2) is just a long way of writing 2. I chose this syntax to make the meaning easier to identify.
COLUMN($A$2), on the other hand, just identifies Column A as the source of the data to be lifted. Row 2 in this part of the formula is insignificant; it's the A that counts. COLUMN($A$2) is a long way of writing 1, meaning column No. 1, meaning A. Once you get your bearings in the formula, you can replace COLUMN($A$2) with 1.
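The row number that this formula hands to ADDRESS can be reproduced with a small Python sketch. The function name is hypothetical, and it assumes the formula is placed in the same row as the first source value (row 2 by default), as the ROW()-ROW($A$2) term suggests.

```python
def source_row(out_row, start_row=2):
    """Sheet1 row read by the formula sitting in row `out_row`."""
    offset = out_row - start_row   # ROW()-ROW($A$2)
    block = offset // 4            # INT(offset/4): which group of four
    within = offset % 4            # MOD(offset,4): position inside the group
    # Each group of four values spans 14 rows; values inside it are 3 rows apart.
    return block * 14 + start_row + within * 3


print([source_row(r) for r in range(2, 10)])
```

The 14 comes from the layout in the question: four values spaced three rows apart, followed by four blank rows.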
I'm pushing beyond my Excel knowledge here. I'm trying to do a poll like thing in Excel. My problem lies on showing the selected result. Here's what I have so far:
I need to select the header corresponding to the cell with the highest value in the range B2:G2 (type 1). However, if there's a tie, I need to select the header corresponding to the highest value in the range B3:G3 amongst the cells with highest values in the range B2:G2.
In my sample, column "bb" and "cc" both share highest value on type 1 (5). So, in order to determine the winner, I need to compare the highest value for type 2 between them. Since "bb" is 0 and "cc" is 1, I expect "cc" as final result.
Components for formula are below:
J2: Displays the count of cells on line 2 with the highest value in the range. So, 2. I did that with COUNTIF comparing with MAX.
K2: Displays the first header it finds with the highest value on line 2. I managed with the following formula:
=INDEX($B$1:$G$1;0;MATCH(MAX($B$2:$G$2);$B$2:$G$2;0))
To be honest, I don't fully understand that formula. Did it with help of tutorials from the internet.
I2: Displays "TIE" when there's a tie on range B2:G2. Otherwise display the winning header (K2).
J3: Displays the number of cells with the maximum value on range B3:G3 but only considering winning cells from line 2. I did that with COUNTIFS.
=COUNTIFS(B3:G3;LARGE(B3:G3;1);B2:G2;MAX(B2:G2))
Edit: Just found out by entering number "4" on B3 that this formula above is also not working...
I3: Should follow the same pattern as the cell above. Displays "TIE" when there's still a TIE. Otherwise would display winning header (to be presented on K3).
K3: I don't know what to put here. Probably because I don't quite understand that formula with INDEX, MATCH and so on, I can't figure out a way to check the highest value between the two "winning" columns from the line above and get the header.
Could somebody help me with this?
First, let's establish if there is a tie. As you have discovered, you can do this by counting how many times the highest number appears in the range.
=COUNTIF($B2:$G2;MAX($B2:$G2))
If that count is more than 1, then there is a tie.
=IF(COUNTIF($B2:$G2;MAX($B2:$G2))>1;"TIE";"no tie")
In case of a tie you want to involve the values in row 3 as a tie breaker. We could add them to the values in row 2 using this array formula. You must confirm the array formula with Ctrl+Shift+Enter, not just Enter, otherwise it won't work.
=INDEX($B$1:$G$1;MATCH(MAX(IF(B2:G2=MAX(B2:G2);B2:G2+B3:G3;0));IF(B2:G2=MAX(B2:G2);B2:G2+B3:G3;0);0))
You only want to factor in row 3 if there is a tie, though, so you can re-use the IF statement from above and replace the "TIE" in that formula with the array formula. Remember to press Ctrl+Shift+Enter!
=IF(COUNTIF($B$2:$G$2;MAX($B$2:$G$2))>1;INDEX($B$1:$G$1;MATCH(MAX(IF(B2:G2=MAX(B2:G2);B2:G2+B3:G3;0));IF(B2:G2=MAX(B2:G2);B2:G2+B3:G3;0);0));"no tie")
You already have the formula to look up the value if there is no tie.
My system uses the comma as the list separator. I have manually replaced these with semicolons in the formulas I posted, but please bear with me if I may have missed one.
Now you can copy these formulas down to row 3. If there is a tie in the data in row 3, you will need data in row 4 to break the tie.
To understand the Index/Match combo, start with your first formula and read it from the inside out. The Max() finds the largest number. The Match() returns the position, i.e. column number, of the largest number in the range B2 to G2, i.e. 2 (the second column in the range). Index looks at B1 to G1 and returns the column value from the position that the Match returned, i.e. the 2nd column, which is the text bb.
Using row 3 as the tie breaker, the formula works pretty much the same, only that rows 2 and 3 are added together when the value in row 2 is the Max value and then that number is used to find the Max and the Match.
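The tie-break rule from the question can also be written out as a short Python sketch, which may help when checking a formula against edge cases. The function name is hypothetical, and apart from "bb" and "cc" the sample numbers are made up for illustration.

```python
def poll_winner(headers, type1, type2):
    """Header with the highest type 1 value; type 2 breaks ties among the leaders."""
    top = max(type1)
    tied = [i for i, v in enumerate(type1) if v == top]
    if len(tied) == 1:
        return headers[tied[0]]
    # Only the columns tied on type 1 compete on type 2.
    best = max(type2[i] for i in tied)
    finalists = [headers[i] for i in tied if type2[i] == best]
    return finalists[0] if len(finalists) == 1 else "TIE"


headers = ["aa", "bb", "cc", "dd", "ee", "ff"]
type1 = [2, 5, 5, 1, 0, 3]   # bb and cc share the highest value, 5
type2 = [4, 0, 1, 0, 0, 0]   # aa has the largest type 2 value but is not a leader
print(poll_winner(headers, type1, type2))
```

Note that a large type 2 value in a column that did not lead on type 1 must not influence the result, which is the case the 4 under "aa" exercises.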
Here is an approach with SUMPRODUCT. I don't really understand what results you want in I3, J3, and K3; I will try to work it out.
I2:
=IF(SUMPRODUCT(--(B2:G2=MAX(B2:G2)))>1,"TIE","")
J2:
=SUMPRODUCT(--(B2:G2=MAX(B2:G2)))
K2:
=IF(J2>1,OFFSET(B1,0,SUMPRODUCT(--(B3:G3=MAX(B3:G3))*--(B2:G2=MAX(B2:G2))*{0,1,2,3,4,5})),OFFSET(B1,0,SUMPRODUCT(--(B2:G2=MAX(B2:G2))*{0,1,2,3,4,5})))
The {0,1,2,3,4,5} array holds the column offsets of the headers from B1; if there are more headers, this array needs to be extended.
The following code allows me to determine distinct values in a pivot table in Excel:
=SUMPRODUCT(($A$A:$A2=A2)*($B$2:$B2=B2))
See also: Simple Pivot Table to Count Unique Values
The code runs perfectly fine. However, can somebody help me understand how this code actually works?
You write: the following code allows me to determine distinct values in a pivot table in Excel
No. That formula alone does not do that. Read on for the explanation of what does.
There's a typo in the formula. It should be
=SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))
See the difference?
The formula starts in row 2 and is copied down. In each row, the $A$2 and $B$2 references stay the same because the $ signs make them absolute. The references $A2, $B2, A2 and B2 have no $ in front of the row number, so their row numbers change when the formula is copied down: in row 3, A2 becomes A3 and B2 becomes B3; in the next row they become A4 and B4, and so on.
You may want to create a sample scenario with data similar to that in the thread you link to. Then use the "Evaluate Formula" tool on the Formulas ribbon to see step by step what is calculated. The formula evaluates from the inside out. Let's assume the formula has been copied down to row 5 and we are now looking at
=SUMPRODUCT(($A$2:$A5=A5)*($B$2:$B5=B5))
($A$2:$A5=A5) this bit compares all the cells from A2 to A5 with the value in A5. The result is an array of four values, either true or false. The next bit ($B$2:$B5=B5) also returns an array of true or false values.
These two arrays are multiplied and the result is an array of 1 or 0 values. Each array has the same number of values.
The first value of the first array will be multiplied with the first value of the second array. (see the red arrows)
The second value of the first array will be multiplied with the second value of the second array. (see the blue arrows)
and so on.
True * True will return 1, everything else will return 0. The result of the multiplication is:
The nature of the SumProduct function is to sum the result of the multiplications (the product), so that is what it does.
This function alone does not do anything at all to establish distinct values in Excel. In the thread you link to, the Sumproduct is wrapped in an IF statement and THAT is where the distinct values are identified.
=IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))>1,0,1)
In plain words: If the combination of the value in column A of the current row and column B of the current row has already appeared above, return a zero, otherwise, return a 1.
This marks distinct values of the combined columns A and B.
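For readers who prefer to trace the logic outside Excel, here is a minimal Python sketch of the expanding-range SUMPRODUCT and the IF wrapped around it. The function names and the sample data are hypothetical.

```python
def running_pair_counts(col_a, col_b):
    """For each row, count how often its (A, B) pair occurs from the top down to that row."""
    counts = []
    for i in range(len(col_a)):
        # ($A$2:$A_i = A_i) * ($B$2:$B_i = B_i): True*True is 1, anything else 0.
        counts.append(sum(
            (col_a[j] == col_a[i]) * (col_b[j] == col_b[i])
            for j in range(i + 1)
        ))
    return counts


def distinct_flags(col_a, col_b):
    # =IF(SUMPRODUCT(...)>1,0,1): flag only the first occurrence of each pair.
    return [0 if c > 1 else 1 for c in running_pair_counts(col_a, col_b)]


col_a = ["x", "x", "y", "x"]
col_b = [1, 2, 1, 1]
print(running_pair_counts(col_a, col_b))
print(distinct_flags(col_a, col_b))
print(sum(distinct_flags(col_a, col_b)))
```

Summing the flags, as a pivot table would, gives the number of distinct (A, B) combinations.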
First, I think you made a typo here, as the formula should be:
=SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))
Let's decompose it in 2 parts:
First, we check the cells between A2 and A2, so only one cell, and count how many of them are equal to A2. In this case the output is 1, because you're comparing A2 with itself. However, you're not limited to comparing A2 with A2. If the range contained 2 cells equal to A2, the result would be 2. You can compare as many cells as you want with A2 (change the row number after the $ to adjust the range).
We do the same for the second bracket, except the pivot value is B2.
After that, you need to understand what the SUMPRODUCT function does. It sums the products of the corresponding elements of the arrays. For example, say you have the value 1 in A1, 1 in A2, 2 in B1 and 3 in B2. If you enter SUMPRODUCT((A1:A2)*(B1:B2)), you will obtain (1*2) + (1*3) = 5. So, in the example you gave us, it will give the sum of (A2=A2)*(B2=B2) = 1.
So it outputs the number of pairs (Ax,Bx) that are equal to (A2,B2). With the link, you can see that if you select the first line only, the function will output 1 (and so the IF will output 1), but if you select the first 2 lines, the function will output 2 (and so the IF will output 0).
I hope this makes sense to you, and I hope I didn't make any mistakes in the explanation.
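The arithmetic in the worked example above is easy to double-check with a tiny Python sketch; the helper name is hypothetical.

```python
def sumproduct(xs, ys):
    """Multiply the arrays element by element, then add up the products."""
    return sum(x * y for x, y in zip(xs, ys))


# A1:A2 holds 1 and 1, B1:B2 holds 2 and 3.
print(sumproduct([1, 1], [2, 3]))

# Comparisons behave like 1 and 0, so (A2=A2)*(B2=B2) contributes a single 1.
print(sumproduct([True], [True]))
```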
I need to find the row number of the row where the value in A1 is greater than or equal to the value in column C and less than or equal to the value in column D.
I can use INDEX and MATCH combo but not sure if this is something I should use for multiple criteria matching.
Any help or suggestions are highly appreciated.
I would not use MATCH to get the row number since you have multiple criteria. I would still use INDEX however to get the value of the row in E once the proper row number was discovered.
So instead of MATCH I would use an array formula using an IF statement that contained multiple criteria. Now note that array formulas need to be entered using ctrl + shift + enter. The IF statement would look like this:
=IF((A1>=C:C)*(A1<=D:D),ROW(A:A),"")
Note: I did not use the AND formula here because that cannot take in arrays. But since booleans are just 1's or 0's in Excel, multiplying the criteria works just fine.
This now gives us an array containing only blanks and valid row numbers. Such that if rows 5 and 7 were both valid the array would look like:
{"","","","",5,"",7,"",...}
Now if we encapsulate that IF statement with a SMALL we can get whatever valid row we want. In this case since we just want the first valid row we can use:
=SMALL(IF((A1>=C:C)*(A1<=D:D),ROW(A:A),""),1)
If the first valid row is 5, this returns 5. Incrementing the k argument of SMALL lets you grab the 2nd, 3rd, etc. valid row.
Now of course since we have the row number a simple INDEX will get us the value in column E:
=INDEX(E:E,SMALL(IF((A1>=C:C)*(A1<=D:D),ROW(A:A),""),1))
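A rough Python equivalent of the SMALL(IF(...)) step is sketched below. The function and parameter names are hypothetical, and the sample bounds are invented so that rows 5 and 7 are the valid ones, as in the array shown earlier.

```python
def kth_matching_row(value, lows, highs, k=1, first_row=1):
    """Row number of the k-th row whose [low, high] interval contains `value`."""
    # The IF builds row numbers for rows meeting both criteria and blanks elsewhere.
    rows = [first_row + i
            for i, (lo, hi) in enumerate(zip(lows, highs))
            if lo <= value <= hi]
    # SMALL(..., k) takes the k-th smallest of the surviving row numbers.
    return rows[k - 1] if k <= len(rows) else None


col_c = [0, 10, 20, 30, 40, 50, 40]
col_d = [9, 19, 29, 39, 49, 59, 60]
print(kth_matching_row(45, col_c, col_d))
print(kth_matching_row(45, col_c, col_d, k=2))
```

Returning None when there is no k-th match is a choice made for this sketch; in Excel the SMALL call would produce an error instead.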
If you need to match more than one column value to retrieve a row number, that is, if two or more columns together create a unique ID you can use an array formula with MATCH as below:
=MATCH(1,(A:A=J1)*(B:B=K1)*(C:C=L1),0)
where A, B, C contain the column array to be matched to retrieve the unique row number corresponding to the value in J1, K1, L1 respectively.
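The idea behind multiplying the comparisons can be sketched in Python as follows. The function name and sample data are hypothetical; the returned position is 1-based, like MATCH.

```python
def match_all(columns, keys):
    """1-based position of the first row where every column equals its key."""
    for pos, row in enumerate(zip(*columns), start=1):
        # (A=J1)*(B=K1)*(C=L1) is 1 only when all comparisons are true.
        if all(cell == key for cell, key in zip(row, keys)):
            return pos
    return None  # Excel's MATCH would return #N/A here


col_a = ["x", "x", "y"]
col_b = [1, 2, 2]
col_c = ["p", "q", "q"]
print(match_all([col_a, col_b, col_c], ("x", 2, "q")))
```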
For a step-by-step guide, Christian Pedersen's Explainer