How to count distinct entries in a column in Excel - excel

I know that the formula: {=SUM(1/COUNTIF(A1:A8,A1:A8))}
will work for counting the distinct entries in the column A1 to A8.
But my question is, what exactly is this formula doing? I can't seem to follow the logic of this array formula.

Assume a value x in A1, blanks in A2:A8.
If you use the Evaluate Formula tool, you'll see that the first step of the array formula provides an array of COUNTIFs:
=SUM(1/{1,7,7,7,7,7,7,7})
Note that there is one 1 and seven 7s, because there is one value x and seven blank cells.
Remember that (1/n)*n = 1. So in this example,
(1/7) * 7 = 1
(1/1) * 1 = 1
Sum these results, and it's as easy as 1 + 1 = 2. :-)

Using F9 really comes in handy here. Suppose the below values are in A1:A8:
dog
cat
dog
cat
cat
dog
cat
cat
In the formula bar highlight COUNTIF(A1:A8,A1:A8) and press F9 and you will see:
{3;5;3;5;5;3;5;5}
Because there are 3 dogs and 5 cats in the list, these are the numbers that are returned by the countif formula for each appropriate type.
Now undo with CTRL+Z or press ESC to start over. This time highlight 1/COUNTIF(A1:A8,A1:A8), press F9 again, and you will see:
{0.333333333333333;0.2;0.333333333333333;0.2;0.2;0.333333333333333;0.2;0.2}
Because there are 3 dogs in the list, 1/3 produces 0.333333333333333. That value appears at every position where dog appears in the list, so adding up the three copies gives 1.
The same holds for cats: there are 5 cats, 1/5 produces 0.2, and the five copies add up to 1. SUM therefore returns 1 + 1 = 2, the number of distinct values.
When in doubt about how a formula works, highlight portions of the formula and press F9 to see what each part calculates.
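As a cross-check outside Excel, the same reasoning can be sketched in Python. This is only an illustration of the logic, not of how Excel evaluates the formula internally; the function name count_distinct is made up for this example, and Fraction is used so the 1/3 and 1/5 terms add up exactly.

```python
from collections import Counter
from fractions import Fraction


def count_distinct(values):
    """Mirror SUM(1/COUNTIF(range, range)): every cell contributes 1/n,
    where n is how often that cell's value occurs in the range."""
    counts = Counter(values)  # plays the role of COUNTIF(A1:A8, A1:A8)
    return sum(Fraction(1, counts[v]) for v in values)


animals = ["dog", "cat", "dog", "cat", "cat", "dog", "cat", "cat"]

# Per-cell counts, the equivalent of pressing F9 on COUNTIF(A1:A8, A1:A8)
per_cell_counts = [Counter(animals)[v] for v in animals]

# Three copies of 1/3 plus five copies of 1/5
distinct = count_distinct(animals)
```

Each distinct value contributes exactly 1 to the total no matter how many times it repeats, which is why the sum equals the number of distinct values.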

Related

Excel Autofill nth cell value in left direction

These are the cells I have (a single row):
T | 1 | 2 | 3 | H | 1 | 2 | 3 | A | 1 | 2 | 3 | N | 1 | 2 | 3
I am trying to use autofill to build a second table that lists the letters in reverse (right-to-left) order:
N | A | H | T
When I add K, 1, 2, 3 to the first table, I want to insert a cell at the far left of the second table, drag, and have it fill in K automatically, and so on.
Is there a way to autofill every nth cell's value in right-to-left order in Excel?
My results are here..
=LET(a,B1:Q1,b,TEXTJOIN(,,IF(ISTEXT(a),a,"")),MID(b,SEQUENCE(,LEN(b),LEN(b),-1),1))
=MID(TEXTJOIN(,,IF(ISTEXT(B1:Q1),B1:Q1,"")),SEQUENCE(,SUMPRODUCT(--ISTEXT(B1:Q1)),SUMPRODUCT(--ISTEXT(B1:Q1)),-1),1)
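The idea behind both formulas (keep only the text cells, then read them back to front) can be sketched in Python. This is a rough illustration with a made-up helper name, letters_reversed. Note one difference: the formulas join the text cells into one string and split it one character at a time, so they assume single-character labels, while the sketch below keeps whole cells.

```python
def letters_reversed(cells):
    """Keep only the text cells (the ISTEXT test) and reverse their order."""
    text_cells = [c for c in cells if isinstance(c, str)]
    return text_cells[::-1]


# The sample row from the question
row = ["T", 1, 2, 3, "H", 1, 2, 3, "A", 1, 2, 3, "N", 1, 2, 3]
result = letters_reversed(row)

# Adding K, 1, 2, 3 to the end of the row puts K first in the reversed list
extended = letters_reversed(row + ["K", 1, 2, 3])
```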
Using character codes, you can add or subtract as with ordinary numbers and produce a sequence of letters.
Start from your first cell, for example "A" in A1.
In the cell that holds your second letter, for example "B" in E1, replace "B" with: =CHAR(CODE(A1) + 1)
Then select the four cells B | 1 | 2 | 3 and drag the fill handle. If you don't want the numbers to increment, choose "Copy Cells" in the fill options, and the letter sequence will be generated automatically.
If you need to skip letters, such as A | D | G, use + 3 instead of + 1.
P.S. CHAR is the English function name. If your Excel is in another language the name may differ, for example CAR in the French version.
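The CHAR(CODE(...) + n) arithmetic has a direct counterpart in Python's chr and ord, which may help to see what the fill produces. A small sketch, with next_letter as a hypothetical helper name:

```python
def next_letter(letter, step=1):
    """Equivalent in spirit to =CHAR(CODE(A1) + step) for a single letter."""
    return chr(ord(letter) + step)


# Step of 1: A -> B
after_a = next_letter("A")

# Step of 3, as in the A | D | G example
skip_three = ["A"]
for _ in range(2):
    skip_three.append(next_letter(skip_three[-1], step=3))
```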
Assuming you are looking only for capital letters as per your sample data. You can try the following in cell R1:
=INDEX(A1:P1,SORT(TOROW(XMATCH(CHAR(ROW(65:90)), A1:P1),2),,-1,1))
or using LET to avoid repetition:
=LET(r, A1:P1, INDEX(r,SORT(TOROW(XMATCH(CHAR(ROW(65:90)), r),2),,-1,1)))
UPDATE: Taking the idea of using the ISTEXT function suggested in @Manoj's answer, it can be solved as follows:
=LET(r, A1:P1, f, FILTER(r, ISTEXT(r)), SORTBY(f, SEQUENCE(,COLUMNS(f),COLUMNS(f),-1)))
It is more general, since it distinguishes text from numbers instead of matching only capital letters.
Here is the output:

Sum of non-blank cells with sign inversion

Suppose I have a set of rows with numbers. Some columns in these rows are blank. However, each row contains an even number of non-blank cells, as follows:
row_1: 3 4 # 7 # 3
row_2: 5 # 3 7 # 8
row_3: # # 5 # # 3
...
where # is an empty cell.
I would like to find a formula that will compute the following (using row_1 as an example):
= -3 + 4 + -7 + 3
In other words, the formula is to compute the sum of non-blank cells where the value of every odd non-blank cell has an inverted sign.
Question: Is it possible to do it without VBA just with some excel formula? Any help is appreciated.
With the data in A1:F1, use this array formula**:
=SUM(INDEX(1:1,N(IF(1,MODE.MULT(IF(A1:F1<>{"";""},COLUMN(A1:F1))))))*-1^ROW(INDEX(A:A,1):INDEX(A:A,COUNT(A1:F1))))
Copy down to give similar results for data in A2:F2, A3:F3, etc.
By way of explanation, using the data provided, this part:
N(IF(1,MODE.MULT(IF(A1:F1<>{"";""},COLUMN(A1:F1)))))
produces an array of column indices for which the entry within row 1 of that column is non-blank, i.e.:
{1;2;4;6}
We then pass these to INDEX, such that:
INDEX(1:1,N(IF(1,MODE.MULT(IF(A1:F1<>{"";""},COLUMN(A1:F1))))))
which is:
INDEX(1:1,{1;2;4;6})
gives:
{3;4;7;3}
which is then multiplied by the result of:
-1^ROW(INDEX(A:A,1):INDEX(A:A,COUNT(A1:F1)))
which is:
-1^ROW(A1:A4)
i.e.:
-1^{1;2;3;4}
i.e.:
{-1;1;-1;1}
Regards
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
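The steps above (collect the non-blank values in order, build the alternating signs, multiply and sum) can be sketched in Python as a sanity check. The name alternating_sum is made up for this example, and None stands in for a blank cell. One detail to watch: Excel evaluates -1^2 as (-1)^2 = 1, while Python's -1 ** 2 is -1, so the sketch needs explicit parentheses.

```python
def alternating_sum(row):
    """Sum the non-blank cells, negating the 1st, 3rd, 5th, ... of them."""
    values = [v for v in row if v is not None]  # e.g. [3, 4, 7, 3]
    # Same sign pattern as -1^{1;2;3;4} in Excel: -1, 1, -1, 1
    signs = [(-1) ** k for k in range(1, len(values) + 1)]
    return sum(s * v for s, v in zip(signs, values))


rows = [
    [3, 4, None, 7, None, 3],        # row_1: -3 + 4 - 7 + 3
    [5, None, 3, 7, None, 8],        # row_2: -5 + 3 - 7 + 8
    [None, None, 5, None, None, 3],  # row_3: -5 + 3
]
totals = [alternating_sum(r) for r in rows]
```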
How does this work? You'll need to use helper columns. (There may be a way to skip that and combine the helper formula with Sum(), but I'm not there yet :P )
The formula to put in H1, and drag right/down is:
=IF(A1<>"",IF(MOD(COUNTA($A1:A1),2)=1,-1*A1,A1),"")
Then just sum up those numbers (column O above) to get your answer.
In column G, use the following formula:
=CONCATENATE(A1,B1,C1,D1,E1,F1)
Then in column H use this formula:
=-MID(G1,1,1)+MID(G1,2,1)+IF(LEN(G1)>2,-MID(G1,3,1)+MID(G1,4,1),0)
This assumes you have only six columns, that every value is a single digit (MID extracts one character at a time), that at least two cells are always filled in, and that at least two are blank. If your data falls outside those criteria, you will have to modify the formula further.

Excel - formula to find how many cells to sum of N

I want to know how many cells it takes to reach a sum of N. Please see the following example:
number | cells needed to reach 100
100 | 1
50 | 2
20 | 3
25 | 4
15 | 4
90 | 2
10 | 2
The last column gives the minimum number of cells (the current cell plus the cells before it) needed to reach a sum of at least 100.
Is there a way to do this?
Thanks.
In B2, array formula**:
=IFERROR(1+ROWS(A$2:A2)-MATCH(100,MMULT(TRANSPOSE(A$2:A2),0+(ROW(A$2:A2)>=TRANSPOSE(ROW(A$2:A2)))),-1),"Not Possible")
Copy down as required.
Change the hard-coded threshold value (100 here) as required.
By way of explanation of the part:
MMULT(TRANSPOSE(A$2:A2),0+(ROW(A$2:A2)>=TRANSPOSE(ROW(A$2:A2))))
using the data provided and taking the version of the above from B5, i.e.:
MMULT(TRANSPOSE(A$2:A5),0+(ROW(A$2:A5)>=TRANSPOSE(ROW(A$2:A5))))
the first part of which, i.e.:
TRANSPOSE(A$2:A5)
returns:
{100,50,20,25}
and the second part of which, i.e.:
0+(ROW(A$2:A5)>=TRANSPOSE(ROW(A$2:A5)))
resolves to:
0+({2;3;4;5}>=TRANSPOSE({2;3;4;5}))
i.e.:
0+({2;3;4;5}>={2,3,4,5})
which is:
0+{TRUE,FALSE,FALSE,FALSE;TRUE,TRUE,FALSE,FALSE;TRUE,TRUE,TRUE,FALSE;TRUE,TRUE,TRUE,TRUE}
which is:
{1,0,0,0;1,1,0,0;1,1,1,0;1,1,1,1}
An understanding of matrix multiplication will tell us that:
MMULT(TRANSPOSE(A$2:A5),0+(ROW(A$2:A5)>=TRANSPOSE(ROW(A$2:A5))))
which is here:
MMULT({100,50,20,25},{1,0,0,0;1,1,0,0;1,1,1,0;1,1,1,1})
is:
{195,95,45,25}
i.e. an array whose four elements are equivalent to, respectively:
=SUM(A2:A5)
=SUM(A3:A5)
=SUM(A4:A5)
=SUM(A5:A5)
Regards
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
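To check the expected column against the explanation, here is a small Python sketch of the same idea. It does not use matrix multiplication; it simply accumulates backwards from the current cell until the target is reached, which should give the same answer when the values are non-negative (the condition under which the MMULT results are in descending order, as MATCH with -1 requires). The name cells_to_reach is hypothetical.

```python
def cells_to_reach(values, target):
    """For each position, count how many cells (current one first, then the
    ones before it) are needed for the running sum to reach the target."""
    results = []
    for i in range(len(values)):
        running, needed = 0, None
        for n, v in enumerate(reversed(values[: i + 1]), start=1):
            running += v
            if running >= target:
                needed = n
                break
        results.append(needed if needed is not None else "Not Possible")
    return results


numbers = [100, 50, 20, 25, 15, 90, 10]
counts = cells_to_reach(numbers, 100)

# The array MMULT produces for B5: SUM(A2:A5), SUM(A3:A5), SUM(A4:A5), SUM(A5:A5)
suffix_sums = [sum(numbers[k:4]) for k in range(4)]
```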
I did the first three with an Excel formula.
D3 holds the target (100 here).
C4 is where your numbers start, so C4=100, C5=50, etc.
The formulas go in D4, D5, D6, etc.
On D4:
=IF(C4>=D3;1;"False")
On D5:
=IF(C5>=D3;1;IF(C5+C4>=D3;2;"Error"))
On D6:
=IF(C6>=D3;1;IF(C6+C5>=D3;2;IF(C6+C5+C4>=D3;3;"Error")))
You can keep doing this: just keep replacing "Error" with a longer, updated version of IF(C6+C5+C4>=D3;3.
I don't know if this is the best way, but this will achieve it.
One way to solve this is to create an NxN matrix of equations instead of just a column. An example picture is provided. Columns E through I are hidden. The last column on the right determines the number required
Theoretically, you can also hard-code the equations if the number of rows needed to get to 100 is a known small number. For example, if the number of rows is always four or less, C8 would be =IFS(B8>=100,1,SUM(B7:B8)>=100,2,SUM(B6:B8)>=100,3,SUM(B5:B8)>=100,4). Note that this equation runs into range-boundary problems on the first, second, and third rows, where fewer than four cells are available. Therefore, the first row will need to be =IF(B8>=100,1,""), the second row =IFS(B9>=100,1,SUM(B8:B9)>=100,2,TRUE,""), and so on.

excel formula with "different" columns - not just a range

I tried something like this:
=COUNTIFS($AA:$AA;$AC:$AC;$AE:$AE;$AG:$AG;$AI:$AI;"yes")<1
Which is of course wrong.
What I would like to do is not use a contiguous range (like $AA:$AI) but instead use every second column as the source for the formula.
Is this possible?
Yes, this is possible with the following formula:
{=SUM(IF(AA:AI="yes";1;0)*IF(MOD(COLUMN(AA:AI);2)=0;0;1))<1}
Note, that this is an array formula. So, you need to press Ctrl + Shift + Enter. For more information on array formula read the following post: https://support.office.com/en-us/article/Guidelines-and-examples-of-array-formulas-7d94a64e-3ff3-4686-9372-ecfd5caa57c7
The above formula counts all occurrences of the word "yes" in columns AA through AI. Each occurrence is additionally multiplied by 1 or 0, depending on whether the column number is divisible by 2 without a remainder. Example:
Column AA is column 27. 27 divided by 2 is 13 with a remainder of 1. Since there is a remainder, the second part of the formula (the second IF) returns 1 rather than 0, so any occurrence of "yes" in column AA is counted. Occurrences in column AB (column 28, no remainder) are multiplied by 0 and not counted. Because the divisor is 2, "yes" is counted in every other column.
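The column-parity logic can be sketched in Python to see which cells end up being counted. This is a rough illustration on made-up sample data; count_in_odd_columns is a hypothetical helper, and the comparison is lower-cased because Excel's = comparison on text ignores case.

```python
def count_in_odd_columns(grid, first_column, word="yes"):
    """Count cells equal to `word`, but only in columns whose number is odd
    (AA = 27, AC = 29, ... when first_column is 27)."""
    total = 0
    for row in grid:
        for offset, cell in enumerate(row):
            column_number = first_column + offset
            if column_number % 2 == 1 and str(cell).lower() == word:
                total += 1
    return total


# Two sample rows covering columns AA (27) through AI (35)
grid = [
    ["yes", "yes", "no", "", "yes", "yes", "", "", "no"],
    ["no", "yes", "yes", "", "", "", "yes", "", "yes"],
]
yes_count = count_in_odd_columns(grid, first_column=27)
condition_met = yes_count < 1  # the "<1" test from the original formula
```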
You can try this:
=COUNTIF(AA:AA,"yes") + COUNTIF(AC:AC,"yes") + COUNTIF(AE:AE,"yes") + COUNTIF(AG:AG,"yes") + COUNTIF(AI:AI,"yes")
See image for reference:

Median/average does not return the right values

Image for reference
I'm trying to achieve the following:
if(cell A1 is found in list 1), for each row in which it's found and if(C4:C10 > B4:B10), then median(the subtraction between C and B values, for every row that has text1).
I've tried two different formulas:
1 - {=MEDIAN(IF(AND((C4:C10>B4:B10);(B4:B10=A1));(C4:C10-B4:B10)))}
2 - {=MEDIAN((C4:C10>B4:B10)*(B4:B10=A1)*(C4:C10-B4:B10))}
For median it always returns 0 and for the average really small values that aren't accurate. I'm sure the median and the averages aren't correct.
What would the problem be?
Also, how would I use something like:
{=MEDIAN((C4:C10>B4:B10)*(B4:B10=A1)*(C4:C10-B4:B10))}
If one of the columns had text in some rows? (This isn't the case for the problem above, but it has come up before.)
A1: text1
list 1 | list 2 | list 3
text2 | 1 | 5
text4 | 2 | 4
text1 | 4 | 6
text4 | 1 | 6
text1 | 4 | 5
text4 | 2 | 4
text1 | 3 | 3
You can't use the AND function in this type of formula because AND returns a single result (TRUE or FALSE), not an array as required.
Your second formula is closer, but multiplying all the conditions produces a zero for every row where the conditions are not met, which skews the result. Note also that the text to match is in A4:A10, not B4:B10.
You can use either one of these similar versions:
=MEDIAN(IF((C4:C10>B4:B10)*(A4:A10=A1);C4:C10-B4:B10))
=MEDIAN(IF(C4:C10>B4:B10;IF(A4:A10=A1;C4:C10-B4:B10)))
both need to be confirmed with CTRL+SHIFT+ENTER
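The difference between multiplying the conditions and filtering with IF can be seen by replaying the sample data in Python. This is only a sketch of the logic; the variable names are made up for the illustration.

```python
from statistics import median

# (list 1, list 2, list 3) rows from the question; A1 holds "text1"
rows = [
    ("text2", 1, 5),
    ("text4", 2, 4),
    ("text1", 4, 6),
    ("text4", 1, 6),
    ("text1", 4, 5),
    ("text4", 2, 4),
    ("text1", 3, 3),
]
key = "text1"

# IF-style filtering: rows that fail a condition are dropped entirely
filtered = [c - b for a, b, c in rows if a == key and c > b]

# Multiplying the conditions: rows that fail become 0 and stay in the list
multiplied = [(c > b) * (a == key) * (c - b) for a, b, c in rows]

median_filtered = median(filtered)
median_multiplied = median(multiplied)
```

Only two rows pass both conditions, so the filtered median is taken over two differences, while the multiplied version takes the median of seven numbers, five of which are zeros.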
To handle text in columns B or C (and to make the formula ignore those rows but work otherwise) you can add an extra IF function like this
=MEDIAN(IF(C4:C10>B4:B10;IF(A4:A10=A1;IF(ISNUMBER(C4:C10-B4:B10);C4:C10-B4:B10))))
All formulas will work equally well with AVERAGE function in place of MEDIAN
Another way to get the MEDIAN while ignoring text is to use AGGREGATE function like this:
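=AGGREGATE(17;6;(C4:C10-B4:B10)/(C4:C10>B4:B10)/(A4:A10=A1);2)
That doesn't need "array entry" but will only work in Excel 2010 or later versions. There's no simple equivalent for AVERAGE.
17 denotes the QUARTILE.INC function; the second quartile is equivalent to the median. Option 6 ignores error values, so rows where a condition is FALSE (division by zero) or where a cell holds text drop out.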
See attached screenshot demonstrating the last two formulas with your sample data....and some added text
Assuming you only want the rows where the value in column C (list 3) is bigger than the value in column B (list 2), you can use the following formula:
=MEDIAN(IF((A4:A10=A1)*(C4:C10>B4:B10);C4:C10-B4:B10))
This is an array formula, so press CTRL+SHIFT+ENTER to enter it.
Tell me if it doesn't work.

Resources