Excel: SUMIF depending on number in field - excel

Hi All You Amazing People
Update
You know what, I should let you know that I am actually trying to do this with numbers and not letters. For instance, I have a field with a value like 225566 and I am trying to pick out the fields that have 55 in them. Only now do I realize this might make a huge difference.
ColumnA | ColumnB |
225566 | 2
125589 | 3
95543 | 2
(Below is what I had asked first and later realized I wasn't asking the right question.)
*Let's say I have a table like this:
ColumnA | ColumnB |
AABBC | 2
AADDC | 3
ZZBBC | 2
Now how could I get a SUMIF for those rows where Column A has a field with BB in it? Assume that there are hundreds of rows. I realize that I have to borrow something conceptually from the way text to column is done. But I wonder if anyone would know how I could do this. Thanks a lot.*

Since you're trying to do this on numbers, you'll need to use an array formula.
If your test values are in A3:A5 and your values to sum are in B3:B5, this will work:
=SUM( IF(ISERROR(FIND("55", TEXT(A3:A5,"#"))), 0, 1) * B3:B5 )
When entering an array formula, use Ctrl-Shift-Enter rather than just hitting Enter.
This sums the product of the sum value and a 0 or 1 from the IF() statement, which tests whether or not each test value, after being converted to text, contains a "55".
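For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of the same idea: convert each test value to text, look for the substring, and sum the matching values. The function name and the fourth row are made up for illustration; the first three rows are the ones from the question.

```python
# Rows as (ColumnA, ColumnB). The first three come from the question;
# the last one is a hypothetical non-matching row added for illustration.
rows = [(225566, 2), (125589, 3), (95543, 2), (123456, 10)]


def sum_where_contains(rows, needle):
    """Sum the second field of every row whose first field, as text, contains needle."""
    return sum(b for a, b in rows if needle in str(a))


total = sum_where_contains(rows, "55")
print(total)
```

All three rows from the question contain "55", so they are all included in the sum; the hypothetical fourth row is skipped.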

I think you will need a matrix/array formula to do this:
{=SUM(IF(ISERROR(FIND("55";A2:A4;1));0;1))}
The curly brackets {} indicate that it is an array formula; you get them by pressing SHIFT+CTRL+RETURN instead of RETURN when editing the formula.
This formula cycles through the range A2:A4, checks whether it finds "55" inside each cell, and if so adds 1 to the sum, so it counts the matching rows. (FIND is called FINDEN in a German-language Excel, and the ; argument separator depends on your regional settings.)
Search for array/matrix formulas, as they are not self-explanatory.
Best
Jan

In Excel 2003 and 2007 (and possibly earlier versions, I cannot test), you can use * as a wildcard character in the match. For example, with your sample data set C1 to
=SUMIF(A1:A3,"*BB*",B1:B3)
and you should see the value 4.

Create a third column (ColumnC) and put this formula in it:
=TEXT(A2,0)
Fill the formula down to the end of your data. This converts each value to text. Next, use SUMIF as DocMax explained, except with different columns:
=SUMIF(C1:C3,"*BB*",B1:B3)
You do this because SUMIF needs to read a text value, not a number, when it applies a wildcard comparison such as *BB* (or *55* for the numeric data). Great question.

Related

excel conditional formatting with multiple date ranges

I have looked for a formula that would solve my problem, but I couldn't find anything.
I have a table with multiple date ranges and I want to highlight all dates in my calendar that fall within these ranges. I've tried the following formula with AND:
=AND(F5>=$A$6,F5<=$B$6)
However, the formula highlights only the dates within the first range. I tried passing arrays ($A6:$A$9 and $B6:$B$9), but it doesn't work.
row   | Column A   | Column B
row 6 | 05/01/2018 | 12/01/2018
row 7 | 03/04/2018 | 16/04/2018
row 8 | 06/05/2018 | 17/05/2018
row 9 | 01/11/2018 | 05/11/2018
My calendar starts in cell F5 and ends in AP16.
Regards,
Adrian
You need to wrap your AND's within an OR:
=OR(AND(F5>=$A$6,F5<=$B$6),AND(F5>=$A$7,F5<=$B$7), AND(...))
or, in a more compact but equivalent form:
=SUMPRODUCT((F5>=$A$6:$A$9)*(F5<=$B$6:$B$9))
or
=OR((F5>=$A$6:$A$9)*(F5<=$B$6:$B$9))
Each of the comparison arrays returns an array of TRUE/FALSE values, which the multiplication coerces to 1's and 0's. Multiplying them together is the equivalent of AND and returns a 1 if and only if both values in the same position are TRUE. Adding up the products (the equivalent of OR) then shows whether any result is a 1.
Although Excel 2016 will accept an OR in the conditional format formula, I seem to recall that some earlier versions will not, hence I have also supplied the equivalent SUMPRODUCT formula.
Alternatively, you can use COUNTIFS:
=COUNTIFS($A$6:$A$10,"<="&F5,$B$6:$B$10,">="&F5)
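If it helps to see the AND/OR logic spelled out, here is a rough Python sketch of the same test. The function names are my own; the ranges are the ones from the question, read as day/month/year (the 16/04 and 17/05 entries only make sense that way).

```python
from datetime import date

# Ranges from the question, as inclusive (start, end) pairs.
ranges = [
    (date(2018, 1, 5), date(2018, 1, 12)),
    (date(2018, 4, 3), date(2018, 4, 16)),
    (date(2018, 5, 6), date(2018, 5, 17)),
    (date(2018, 11, 1), date(2018, 11, 5)),
]


def in_any_range(day, ranges):
    """OR over the ranges of (AND of the two comparisons)."""
    return any(start <= day <= end for start, end in ranges)


def count_matching_ranges(day, ranges):
    """SUMPRODUCT / COUNTIFS style: multiply the comparisons, then add them up."""
    return sum((start <= day) * (day <= end) for start, end in ranges)


print(in_any_range(date(2018, 4, 10), ranges))
print(count_matching_ranges(date(2018, 2, 1), ranges))
```

A non-zero count is what makes the conditional format fire, which is why the SUMPRODUCT and COUNTIFS versions work without an explicit OR.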

Excel - formula to find how many cells to sum of N

I want to know how many cells it takes to sum to N. Please see the following example:
number | cells to sum of 100
100 | 1
50 | 2
20 | 3
25 | 4
15 | 4
90 | 2
10 | 2
The last column shows the minimum number of cells (the current cell plus the cells directly above it) whose sum reaches at least 100.
Is there a way to calculate this?
Thanks.
In B2, array formula**:
=IFERROR(1+ROWS(A$2:A2)-MATCH(100,MMULT(TRANSPOSE(A$2:A2),0+(ROW(A$2:A2)>=TRANSPOSE(ROW(A$2:A2)))),-1),"Not Possible")
Copy down as required.
Change the hard-coded threshold value (100 here) as required.
By way of explanation, consider the part:
MMULT(TRANSPOSE(A$2:A2),0+(ROW(A$2:A2)>=TRANSPOSE(ROW(A$2:A2))))
using the data provided and taking the version of the above from B5, i.e.:
MMULT(TRANSPOSE(A$2:A5),0+(ROW(A$2:A5)>=TRANSPOSE(ROW(A$2:A5))))
the first part of which, i.e.:
TRANSPOSE(A$2:A5)
returns:
{100,50,20,25}
and the second part of which, i.e.:
0+(ROW(A$2:A5)>=TRANSPOSE(ROW(A$2:A5)))
resolves to:
0+({2;3;4;5}>=TRANSPOSE({2;3;4;5}))
i.e.:
0+({2;3;4;5}>={2,3,4,5})
which is:
0+{TRUE,FALSE,FALSE,FALSE;TRUE,TRUE,FALSE,FALSE;TRUE,TRUE,TRUE,FALSE;TRUE,TRUE,TRUE,TRUE}
which is:
{1,0,0,0;1,1,0,0;1,1,1,0;1,1,1,1}
An understanding of matrix multiplication will tell us that:
MMULT(TRANSPOSE(A$2:A5),0+(ROW(A$2:A5)>=TRANSPOSE(ROW(A$2:A5))))
which is here:
MMULT({100,50,20,25},{1,0,0,0;1,1,0,0;1,1,1,0;1,1,1,1})
is:
{195,95,45,25}
i.e. an array whose four elements are equivalent to, respectively:
=SUM(A2:A5)
=SUM(A3:A5)
=SUM(A4:A5)
=SUM(A5:A5)
Regards
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
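To sanity-check the walkthrough above, here is a small Python sketch of the same calculation. It is not a translation of the Excel formula, just the same idea under my own function names: build the sums of the current cell plus the cells above it, then find how many cells are needed to reach the threshold.

```python
def suffix_sums(values):
    """Equivalent of the MMULT step: the sum of values[j:] for every j."""
    return [sum(values[j:]) for j in range(len(values))]


def cells_to_reach(values, threshold):
    """For each position, count how many cells (the current one plus the ones
    directly above it) are needed for the total to reach threshold.
    Gives None where even all cells so far are not enough."""
    result = []
    for i in range(len(values)):
        total = 0
        count = None
        # Walk upwards from the current cell, accumulating as we go.
        for k, j in enumerate(range(i, -1, -1), start=1):
            total += values[j]
            if total >= threshold:
                count = k
                break
        result.append(count)
    return result


numbers = [100, 50, 20, 25, 15, 90, 10]
print(suffix_sums(numbers[:4]))
print(cells_to_reach(numbers, 100))
```

One thing the sketch makes visible: MATCH with match type -1 expects the array in descending order, which the suffix sums are as long as the numbers are not negative. The Python loop does not rely on that.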
I did the first three with an Excel formula.
D3 holds the threshold (100).
C4 is where your numbers start, so C4=100, C5=50, etc.
The formulas go in D4, D5, D6, etc.
On D4:
=IF(C4>=D3;1;"False")
On D5:
=IF(C5>=D3;1;IF(C5+C4>=D3;2;"Error"))
On D6:
=IF(C6>=D3;1;IF(C6+C5>=D3;2;IF(C6+C5+C4>=D3;3;"Error")))
You can keep doing this: in each new row, replace "Error" with a longer version of the IF, such as IF(C6+C5+C4>=D3;3;"Error").
I don't know if this is the best way, but it will achieve the result.
One way to solve this is to create an NxN matrix of formulas instead of just a column. An example picture is provided. Columns E through I are hidden. The last column on the right determines the number required.
Theoretically, you can also hard-code the formulas if the number of rows needed to get to 100 is a known small number. For example, if the number of rows is always four or less, C8 would be =IFS(B8>=100,1,SUM(B7:B8)>=100,2,SUM(B6:B8)>=100,3,SUM(B5:B8)>=100,4). Note that you'll run into range boundary problems with this formula on the first, second, and third rows. Therefore, the first row will need to be =IF(B8>=100,1,""), the second row would be =IFS(B9>=100,1,SUM(B8:B9)>=100,2,TRUE,""), and so on.

Return all the cell whose next one contains a value range

I need to return in a cell all the cell values where the cell next to it contains a value range.
For example, if I have a table like this:
|Name |Evaluation
|------|------
| John | 3
| Sue | 4
| Jim | 2
| Andy | 6
| Tim | 1
| Bruce| 4
I'm looking for a formula to have all the names whose evaluation is >= 4, so, if applied to the table it should give as output in a single cell:
Sue
Andy
Bruce
I've already tried VLOOKUP, INDEX, MATCH and FIND functions but they all return a single value (the first cell that match) and not all of them.
If possible, I'm looking for an Excel Formula and not for VBA (this way I can share it easily with my working group that, as myself, is not very proficient in VBA).
Thank you very much!
Enter =IF(B1>=4,A1,"") in cell C1 and fill it down column C to the row of the last name
(assuming the first name is in A1 and its evaluation is in B1).
I've solved the issue by using (partially) the solution posted by dhS and a support table.
I've created a support table of the same height as the original one. This table goes from F1 to F120 (the end of the original table).
In the first cell I've used the formula
=IF(B1>=4;A1;"")
In all the subsequent cells (from the second to the final one)
=IF(B2>=4;IF(F1="";A2;F1&CHAR(10)&A2);F1)
This way, the last cell contains all the names separated by line breaks (CHAR(10)).
To anyone who wants to use this solution: remember to enable the Wrap Text option on the cell, otherwise the line breaks won't be displayed.
Thanks to everybody for the help you gave me.
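For reference, the running-accumulator idea behind the support column can be sketched in Python like this. The function name is made up, and the data is the sample table from the question; each loop step does what one row of the support column does.

```python
# (Name, Evaluation) rows from the question.
rows = [("John", 3), ("Sue", 4), ("Jim", 2), ("Andy", 6), ("Tim", 1), ("Bruce", 4)]


def names_at_least(rows, minimum):
    """Carry the text collected so far from row to row, appending a name
    (after a line break) whenever its evaluation reaches the minimum."""
    collected = ""
    for name, score in rows:
        if score >= minimum:
            collected = name if collected == "" else collected + "\n" + name
        # Otherwise the previous value is carried down unchanged.
    return collected


print(names_at_least(rows, 4))
```

The value after the last row is the same thing the last cell of the support column holds.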

vlookup with multiple columns

I have the following formula in my B:B column
=VLOOKUP(A1;'mySheet'!$A:$B;2;FALSE)
It does output in B:B the values found in the mySheet!B:B where A:A = mySheet!A:A. It works fine. Now, I would like to also get the third column. It works if I add the following formula to the whole C:C column:
=VLOOKUP(A1;'mySheet'!$A:$C;3;FALSE)
However, I'm working with more than 100k lines and about 40 columns. I don't want to do 100k * 40 * VLOOKUP, I would like to only do it 100k and not have to multiply this by all the columns. Is there a way (with array-formulas maybe) to just do the VLOOKUP once per line to get all the columns I need?
data example
ID|Name
-------
1|AB
2|CB
3|DF
4|EF
ID|Column 1|Column 2
--------------------
1|somedata|whatever1
4|somedate|whatever2
3|somedaty|whatever3
I would like to get:
ID|Name|Column 1|Column 2
-------------------------
1|AB |somedata|whatever1
2|CB | |
3|DF |somedaty|whatever3
4|EF |somedate|whatever2
INDEX combined with a single MATCH is faster than repeating VLOOKUP for every column, so I would recommend using that. It'll reduce the strain that many VLOOKUPs would put on your system.
First find the row that contains what you need in a helper column with MATCH:
=MATCH(A1,'mySheet'!$A:$A,0)
Then an INDEX using that number, that you can drag across and populate all your columns with:
=INDEX('mySheet'!B:B,$B1)
Your output would be akin to:
ID|Name|Match |Column 1 |Column 2
-------------------------
1|AB |Match1|IndexCol1|IndexCol2
2|CD |Match2|IndexCol1|IndexCol2
3|EF |Match3|IndexCol1|IndexCol2
Also! I'd recommend setting these ranges to actually cover the data, rather than referencing the whole column, for additional speed gains, e.g.:
=INDEX('mySheet'!B1:B100000,$B1)
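The reason this helps is that the expensive search happens once per row and every column then reuses the row position. A rough Python sketch of that split, using the sample data from the question (the variable and function names are only for illustration):

```python
# Lookup sheet: (ID, Column 1, Column 2), as in the question.
lookup_rows = [
    (1, "somedata", "whatever1"),
    (4, "somedate", "whatever2"),
    (3, "somedaty", "whatever3"),
]
# Main sheet: (ID, Name).
main_rows = [(1, "AB"), (2, "CB"), (3, "DF"), (4, "EF")]

# Build the ID -> row position map once.
position = {row[0]: i for i, row in enumerate(lookup_rows)}


def join_row(row_id, name):
    """One position lookup per row (the MATCH), reused for every column (the INDEX)."""
    i = position.get(row_id)
    if i is None:
        # No match: leave the extra columns empty, as in the desired output.
        return (row_id, name, "", "")
    return (row_id, name) + lookup_rows[i][1:]


joined = [join_row(row_id, name) for row_id, name in main_rows]
for row in joined:
    print(row)
```

In Excel a missing ID gives #N/A from MATCH rather than an empty cell, so the sketch differs there on purpose.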
I was thinking more about your problem, and if you have control over the data you're looking up, I have another suggestion you could try.
In 'mysheet', where the raw data is kept, add a new column that concatenates all the columns into one cell, with some sort of unique divider that does not occur in your data:
=B1&"+"&C1&"+"&D1&"+"&E1 etc...
Then you could do one VLOOKUP or INDEX/MATCH for each row, instead of 40.
Once you have it in your new sheet, you could split the results back out.
Splitting without formulas
Copy/Paste the results of the lookup formulas as Values in the next column.
Select that column, and in the Data tab on your ribbon, select Text to Columns.
Leave it on Delimited, hit Next. Uncheck Tab, check Other, and input your delimiter (+ in my example).
Click Finish.
Splitting with formulas
Use =FIND() to locate each delimiter, and =MID() to pull out the text between each pair of delimiters, using the position of the previous delimiter to work out the Start_num.
Definitely the more complex of the two methods.
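Both splitting methods can be sketched in a few lines of Python, in case the FIND/MID bookkeeping is easier to follow that way. The helper names are hypothetical; the "+" delimiter and the sample values are the ones used above.

```python
DELIM = "+"


def pack(fields):
    """Concatenate the columns into one cell, like =B1&"+"&C1&..."""
    return DELIM.join(fields)


def unpack(packed):
    """Text to Columns: split on the delimiter in one go."""
    return packed.split(DELIM)


def nth_field(packed, n):
    """FIND/MID style: step past n-1 delimiters, then slice up to the next one.
    n is 1-based."""
    start = 0
    for _ in range(n - 1):
        # Position just after the previous delimiter.
        start = packed.index(DELIM, start) + 1
    end = packed.find(DELIM, start)
    return packed[start:] if end == -1 else packed[start:end]


packed = pack(["somedata", "whatever1"])
print(packed)
print(unpack(packed))
print(nth_field(packed, 2))
```

As with the Excel version, this only works if the delimiter never appears inside the data itself.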
If I'm understanding correctly, one thing I would do to start is use =VLOOKUP(A1;'mySheet'!$A:LastColumn;COLUMN(B1);FALSE). This way your column index will move as you drag your VLOOKUP to the right.
No formula, no output, so there can't be a way to apply a formula to one column only and get results in the others.
The other feasible way is to put the formula in one cell, use $ signs intelligently, and drag it across all cells in a jiffy without having to type the VLOOKUP 40 times.
VLOOKUP takes 4 arguments:
1-Lookup value. Use $A1 (put $ on the A and not on the 1).
2-Source data. Put $ signs everywhere.
3-Column index number. Just above your entire data, add an empty first row. Put the value 1 in A1, 2 in B1, 3 in C1 and so on. Now in the formula, instead of manually typing "2" or "3", reference these cells. Put $ on the row number and not on the column (B$1).
4-Type FALSE or 0.
Then drag this across everywhere.
Alternatively, skip the helper row and let COLUMN() supply the column index number (e.g. COLUMN(B1) if the lookup value is in column A and you want the value from column B); the other three arguments stay the same.

Conditional median in MS Excel

I'm trying to calculate the conditional median of a chart that looks like this:
A | B
-------
x | 1
x | 1
x | 3
x |
y | 4
z | 5
I'm using MS Excel 2007. I am aware of the AVERAGEIF() function, but there is no equivalent for median. The main trick is that there are rows with no data, such as the 4th "x" above. In this case, I don't want this row considered at all in the calculations.
Googling has suggested the following, but Excel won't accept the formula format (maybe because it's 2007?)
=MEDIAN(IF((A:A="x")*(A:A<>"")), B:B)
Excel gives an error saying there is something wrong with my formula (something to do with the * in the condition). I had also tried the following, but it counts blank cells as 0's in the calculations:
=MEDIAN(IF(A:A = "x", B:B, ""))
I am aware that those formulas return Excel "arrays", which means one must enter "Ctrl-shift-enter" to get it to work correctly.
How can I do a conditional evaluation and not consider blank cells?
Nested if statements.
=MEDIAN(IF(A:A = "x",IF(B:B<>"",B:B, ""),""))
Not much to explain - it checks if A is x. If it is, it checks if B is non-blank. Anything that matches both conditions gets calculated as part of the median.
Given the following data set:
A | B
------
x |
x |
x | 2
x | 3
x | 4
x | 5
The above formula returns 3.5, which is what I believe you wanted.
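As a cross-check of that result, here is a minimal Python sketch of the same two-condition filter. The function name is mine, and blanks are represented as None, which is an assumption about how you would load the sheet.

```python
from statistics import median

# The data set above; None stands for a blank cell.
rows = [("x", None), ("x", None), ("x", 2), ("x", 3), ("x", 4), ("x", 5)]


def median_if(rows, key):
    """Median of column B over the rows where column A equals key and B is not blank."""
    values = [b for a, b in rows if a == key and b is not None]
    return median(values)


print(median_if(rows, "x"))
```

If no row passes both tests, `median` raises `StatisticsError`, which is roughly the counterpart of Excel returning an error for an empty array.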
Use the Googled formula, but instead of hitting Enter after you type it into the formula bar, hit Ctrl+Shift+Enter. This places curly brackets around the formula and makes Excel treat it as an array formula.
Be warned: if you edit it, you cannot just hit Enter again or the formula will no longer work. After editing, you must confirm it the same way (Ctrl+Shift+Enter).
There is another way that does not involve an array formula requiring the Ctrl+Shift+Enter operation.
It uses the AGGREGATE() function offered in Excel 2010, 2011 and beyond. The method also works for min, max and various percentiles.
AGGREGATE() allows errors to be ignored, so the trick is to make all values that are not required cause errors. The easiest way to do the task set above is:
=AGGREGATE(16,6,(B:B)/((A:A="x")*(B:B<>"")),0.5)
The first and last parameters set the scene to do a 50% percentile, which is a median; the second says ignore all errors (including #DIV/0!); and the third says select the B column data and divide it by a number which is one for all non-empty values that have an x in the A column, and zero otherwise.
The zeros create a divide-by-zero error and those rows are ignored, because a/1=a and a/0=#DIV/0!.
The technique works for quartiles (with an appropriate p value), all other percentiles of course, and for max and min using the LARGE or SMALL function with appropriate arguments.
This is a similar construct to the SUMPRODUCT() tricks that are so popular, but which cannot be used for quantiles or max/min values because they produce zeros, which look like numbers to these functions.
Bob Jordan
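The "turn unwanted rows into errors, then ignore the errors" trick can be imitated in Python, which may make it clearer why the division is there. This is only an illustration with a made-up function name, not how AGGREGATE is implemented; the data is the table from the question with None for the blank cell.

```python
from statistics import median


def median_ignoring_errors(keys, values, wanted):
    """Divide each value by 1 (row qualifies) or 0 (it does not) and
    drop every row where the division fails."""
    kept = []
    for key, value in zip(keys, values):
        # Product of the two tests: 1 only when both are true.
        mask = (key == wanted) * (value is not None)
        try:
            kept.append(value / mask)  # value / 1 keeps the value
        except (ZeroDivisionError, TypeError):
            continue  # failed division: row is ignored
    return median(kept)


keys = ["x", "x", "x", "x", "y", "z"]
values = [1, 1, 3, None, 4, 5]
print(median_ignoring_errors(keys, values, "x"))
```

Multiplying by the mask instead of dividing would leave zeros in the list, and those would pull the median down, which is the problem with the SUMPRODUCT-style approach mentioned above.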
Perhaps to generalize it a little more, instead of this...
{=MEDIAN(IF(A:A="x",IF(B:B<>"",B:B)))}
... you could use the following:
{=QUARTILE.EXC(IF(A:A="x",IF(B:B<>"",B:B)),2)}
Note that the curly brackets refer to an array formula; you should not type the brackets in your formula but press CTRL+SHIFT+ENTER (or CMD+SHIFT+ENTER on macOS) when entering the formula.
Then you could easily get the first and third quartile by altering the last number from 2 to 1 or 3 respectively. QUARTILE.EXC is what most commercial statistical software (e.g. Minitab) uses, by the way. The "regular" function is QUARTILE.INC, or for the older versions of Excel, just QUARTILE.
