I've been struggling between the SUMPRODUCT and COUNTIFS formulas as there are a lot of specific dependencies in my data. Wondering if anyone can shed a bit more light on this issue.
Have tried SUMPRODUCT and COUNTIFS which give me calculations based on 1 set, but I need to include additional if/or statements.
I have the following:
| ID | Size | Dead/Alive | Duration | Days | Pass/Fil | Reason |
|----|---------|------------|-----------|------|----------|----------|
| 1 | Full | Dead | Permanent | 125 | Pass | Comments |
| 2 | Partial | Alive | Permanent | 500 | Pass | |
| 3 | Other | Dead | Temporary | 180 | Fail | Comments |
| 4 | No | Dead | Temporary | 225 | Fail | Comments |
| 5 | Yes | Alive | Permanent | 200 | Pass | |
with the following rules:
Only Count the ID/ROW if:
1) Values in column A = Full, Partial or Other
OR...
2) Values in column A = No AND values in column B = Dead
OR...
3) If values in column C = Permanent AND values in column D = >=100 or <=200
OR
4) If values in column C = Temporary AND values in column E = Pass, Fail AND column F=not blank
By my calculations, the total should be 5, but this is just a small sampling of my total data. Just not sure how to get that in Excel with either Sumproduct, Countifs or even someone suggested a Lookup function, although Ive never used that one.
Given that you have so many different conditions, I have to break it down one by one and create a few helper columns to account for each condition.
In my solution I created 10 helper columns as shown below, and I have added some sample data (ID 6 to 29) to test the solution.
I also named 7 conditions in my solution:
Cond_1 Values in column A = Full, Partial or Other
Cond_2 Values in column A = No AND values in column B = Dead
Cond_3A Values in column C = Permanent
Cond_3B Values in column D >=100
Cond_3C Values in column D <=200
Cond_3A, Cond_3B and Cond_3C must be TRUE at the same time
Cond_4 Values in column C = Temporary AND values in column E = Pass
Cond_5A Values in column C = Temporary AND values in column E = Fail
Cond_5B Column F is not blank (I did not give a name to this condition)
Cond_5A and Cond_5B must be TRUE at the same time
Please note my Cond_4, Cond_5A and Cond_5B are all related to your original condition 4), which reads a bit odd, and I am not 100% sure if my interpretation of the condition is correct. If not please re-state your last condition and I can amend my answer accordingly.
As shown in my screen-shot, the formulas in I2 to Q2 are listed in Column U. I only used MAX, AND, SUM, =, &, and/or <> to interpret each condition. Please note some of the formulas are Array Formula so you need to press Ctrl+Shift+Enter to make it work.
The To Count column is simply asking whether the SUM of the previous 9 columns is greater than 1, which means at least one of the conditions is met. If so returns 1 otherwise 0.
Then you just need to work out the total of To Count column. In my example it is 22. I have highlighted the entries that did not meet any of the given condition.
You can use only one helper column to capture all conditions in one formula, but I would not recommend it as it would be too long to be easily understood and modified in future.
{=--(SUM(MAX(--(A2=Cond_1)),MAX(--(A2&B2=Cond_2)),--(SUM(--(C2=Cond_3A),--(AND(D2>=Cond_3B,D2<=Cond_3C)))=2),MAX(--((C2&E2)=Cond_4)),--(SUM(MAX(--((C2&E2)=Cond_5)),--(F2<>""))=2))>0)}
Ps. I would also wonder if there is a formula-based solution without using any helper column...? :)
Related
I have an If formula in Excel as seen below:
=IF((Q2="O")*AND(R2="O")*AND(S2="O")*AND(T2="O")*AND(U2="O"),"O","X")
Basically, if cells Q2 to U2 is O the cell with formula will have O written in it. Otherwise it will have X. Now I want to change it into a nested If statement due to new conditions.
These conditions, in order are:
If any one of cells Q2 to U2 = X, cell = X
If any one of cells Q2 to U2 = date format, cell = ∆
If Q2 to U2 = O, cell = O
If none of the conditions are met, the cell will have value "FALSE". (Default appears)
Each of the cells have this condition to follow,
If any one of cells Q2 to U2 = -, ignore that cell and count the other cells to get final result.
I tried switching to Or in my original formula to test out conditions 1 and 3.
=IF((Q2="O")*or(R2="O")*or(S2="O")*or(T2="O")*or(U2="O"),"O","X")
But it doesn't work. Plus I'm not sure how to do condition 5 as well. Any help?
Is it possible to do something so complex just by using Excel formula? Or do I need to go into VBA?
Things like this are usually easier if you tackle the problem in order of priority. It looks like 1 has the highest priority:
=IF(COUNTIF(Q2:U2,"X")>0,"X","Does not contain X")
COUNTIF(Q2:U2,"X") returns the number of occurrences of X in the range Q2:U2.
In place of "Does not contain X", we can first check for condition 2:
=IF(SUMPRODUCT(--ISNUMBER(Q2:U2))>0,"∆","Does not contain X or dates")
A date in excel is literally a number with some decorative formatting. I am using ISNUMBER to find the numbers and SUMPRODUCT to count the identified cells in the range. If there are more than 0 (at least 1), then it will become ∆.
In place of "Does not contain X or dates" now, we could check for Os. I would count the Os and add the cells with - (conditions 3 and 5 together) and see if they add up to the total cells in Q2:U2 (which is 5 in this case):
=IF(COUNTIF(Q2:U2,"O")+COUNTIF(Q2:U2,"-")=5,"O","FALSE")
When combined, it would become:
=IF(COUNTIF(Q2:U2,"X")>0,"X",IF(SUMPRODUCT(--ISNUMBER(Q2:U2))>0,"∆",COUNTIF(Q2:U2,"O")+COUNTIF(Q2:U2,"-")=5,"O","FALSE"))
Since all these involve counts, it might be easier setting up helper columns, something like:
| Q | R | S | T | U | V | W | X | Y | Z
1 | | | | | | Count of X | Count of dates | Count of O | Count of - | Final value
2 | | | | | | A | B | C | D | E
A will be:
=COUNTIF(Q2:U2,"X")
B will be:
=SUMPRODUCT(--ISNUMBER(Q2:U2))
C will be:
=COUNTIF(Q2:U2,"O")
D will be:
=COUNTIF(Q2:U2,"-")
E will be:
=IF(V2>0,"X",IF(W2>0,"∆",X2+Y2=5,"O","FALSE"))
I have this formula in a calculated column that is working great:
=IFERROR(INDEX(Allocation_of_Funds[[#Headers],[End.Nursing]:[Unassigned14]],MATCH(TRUE,INDEX(Allocation_of_Funds[#[End.Nursing]:[Unassigned14]]>0,0),0)),"")
But this formula is giving me trouble and represents what I want in the next calculated column, (based on the value in the previous column above) but it returns a #REF! error:
=INDEX(INDIRECT("Allocation_of_Funds[[#Headers]"&"["&[#[SOURCE 1]]&"]"):[#Unassigned14],MATCH(TRUE,INDIRECT("Allocation_of_Funds[[#Headers]"&"["&[#[SOURCE 1]]&"]"):[#Unassigned14]>0,0),0)
The details of the tables setup is as follows, in case it's helpful:
I have a table with a range of columns and each column represents a different type of account. For each row, any combination of these columns could contain values or blanks, so I've got another set of columns that I want to identify the table column headers for the non-blank columns for each record.
SOURCE1 | SOURCE2 | SOURCE3 | ACCT1 | ACCT2 | ACCT3 | ACCT4 | ACCT5
ACCT1 | ACCT2 | ACCT4 | 500 | 300 | | 100 |
ACCT2 | ACCT3 | | | 200 | 100 | |
ACCT3 | | | | | 500 | |
| | | | | | |
ACCT3 | ACCT4 | ACCT5 | | | 200 | 300 | 50
ACCT1 | ACCT3 | ACCT4 | 123 | | 332 | 100 |
So I need the SOURCE2 column to use the value in the SOURCE1 column to identify the start of the range where I am looking for the next cell with a value, whereby the column header above that value will be returned for the SOURCE2 row value. The same formula will apply to the SOURCE3 column, using the value of the SOURCE2 column to identify the start of the next range.
Thank in advance for picking your brain!
-Lindsay
I used the following formula to pull the headers and place them under the source numbers:
=IFERROR(INDEX($D$1:$H$1,AGGREGATE(15,6,COLUMN($D2:$H2)/ISNUMBER($D2:$H2)-COLUMN($D$1)+1,RIGHT(A$1,1)*1)),"")
I assumed your table's top left corner was in A1 with 1 being the header row and A-C being your source columns and D to H being account columns. The above formula can be placed in cell A2 and copied to the right and down as need be.
You seem to have a grasp of the IFERROR and INDEX function so I will explain the AGGREGATE function:
=AGGREGATE(15,6,COLUMN($D2:$H2)/ISNUMBER($D2:$H2)-COLUMN($D$1)+1,RIGHT(A$1,1)*1)
The AGGREGATE function is a mixture of a bunch of different functions rolled into with the ability to ignore some calculations. Another added feature is that some of the built in functions perform array calculations without the need for arrays.
In this particular case I chose aggregate function 15 which is the same as the SMALL function. I have also told aggregate to ignore calculations which generate errors by using the "6". For the array calculation I have asked it to divide the column number it is working with by the True or False result of that column being a number:
COLUMN($D2:$H2)/ISNUMBER($D2:$H2)-COLUMN($D$1)+1
True in excel math is the same as 1 and False is the same as 0. Anytime the cell is not a number it will try to divide by zero, generate an error, and be ignored by Aggregate function. This basically generates a list of column numbers that meet the criteria of having a number in their column. The subtraction of the D1 followed by a +1 is to convert the column number that is determined, to a relative column under your accounts headers.
The next part of the aggregate function is telling the SMALL operation which number in sorted order needs to be returned. I used the last character in your source header to determine which column number to return. For SOURCE1 the last character is 1 so I want the smallest column number returned. For SOURCE2, the second smallest number is returned. The *1 at the end converts the character to a number instead of 1 as text.
RIGHT(A$1,1)*1
Ergo, if you want to use up to 9 sources you can. You can do more sources as well but you would need to revise this formula or come up with a different way of providing which number of the small list you want returned. And you can expand the D2:H2 reference to be all your accounts, and adjust the D1:H1 reference to cover all your account headers.
Proof of Concept
What I'm trying to accomplish doesn't seem too tricky, and should be doable I just can't seem to crack it.
All I need to do is create a calculated column that flags whether or not the row value in one column exists anywhere in another column within the same table.
id | Code 1 | Code 2 | Attempted results
--------------------------------------------------------------------------
1 | 1095829 | 1093895 | Y
2 | 1093895 | 1838949 | N
3 | 1095289 | 1093910 | Y
4 | 1093910 | 1840193 | N
So essentially, is Code 1 found anywhere in the code 2 column
You should be able to do this using an IF/MATCH formula. For example:
=IF(MATCH(B2,C:C,0),"Duplicate", "Unique")
This would check if B2 (1095829) shows up anywhere in the C column. If it does, it returns "Duplicate" to that cell, if it does not, then it returns "Unique".
I'm trying to generate a table that shows a count of how many items are in any given status on any given day. My result table has a set of Dates down column A and column headers are various statuses. A sample of my data table with headers looks like this:
Product | Notice | Assigned | Complete | In Office | In Accounting
1 | 5/5/13 | 5/7/13 | 5/9/13 | 5/10/13 | 5/11/13
2 | 5/5/13 | 5/6/13 | 5/8/13 | 5/9/13 | 5/10/13
3 | 5/6/13 | 5/9/13 | 5/10/13 | 5/10/13 | 5/10/13
4 | 5/4/13 | 5/5/13 | 5/7/13 | 5/8/13 | 5/9/13
5 | 5/7/13 | 5/8/13 | 5/10/13 | 5/11/13 | 5/11/13
If my output table were to contain a set of dates in the first column with the statuses as headers, I need a count of how many rows were at the given status and had not yet transitioned to the next status so that in the Notice column, I'd have a count of rows where the Notice Date was <= X AND where the Assigned, Complete, In Office, In Accounting are all greater than X.
I've used a Sum(if(frequency(if statement to get me REALLY close but I feel like I need to have an AND statement within the second IF like this =SUM(IF(FREQUENCY(IF(AND
Here's what I have that won't work:
=SUM(IF(FREQUENCY(IF(AND(Table1[Assigned]<=A279,Table1[[Complete]:[In Accounting]]<=A279),ROW(Table1[[Complete]:[In Accounting]])),ROW(Table1[[Complete]:[In Accounting]]))>0,1))
If I take the "AND" portion out, this works fine except I need it to ONLY count rows where the given status actually has a date so if an "Assigned" date is empty, I don't want that row to be counted for the Assigned column.
Here's an example of what I'd expect to see in the results. I've listed the count in the each column as well as the corresponding product numbers in parenthesis. The corresponding product numbers are for reference only and won't actually be in the result table.
Date | Notice | Assigned | Complete
5/6 | 2 (1,3) | 2 (2,4) | 0
5/7 | 2 (3,5) | 2 (1,2) | 1 (4)
5/8 | 1 (3) | 2 (1,5) | 1 (2)
OK, assuming you have the original data in A1:F6 then with 2nd table headers in B9:D9 and row labels in A10:A12 then you can use this "array formula" in B10
=SUM((B$2:B$6<=$A10)*(MMULT((C$2:$F$6>$A10)+(C$2:$F$6=""),TRANSPOSE(COLUMN(C$2:$F$6)^0))=COLUMNS(C$2:$F$6)))
confirmed with CTRL+SHIFT+ENTER and copied down and across (see screenshot below)
As you can see the results are as per your requirement. If you replace dates with blanks it will still work
MMULTis a way to get a single value from each row even when you are looking at multiple columns.
I used cell references because I think that's easier, especially when copying the formula across and having a reducing range.......but you can use structured references if you want
Have you tried using COUNTIFS to count based on multiple criteria. It is fairly well documented here: http://office.microsoft.com/en-us/excel-help/countifs-function-HA010047494.aspx (2007+ only)
Basically, you use it like
=COUNTIFS(first_range_to_check, value_you_want_in_first_range, ...)
where the ... represents as many pairs as you want (up to 127 total pairs), note the conditions are AND connection so if you have two pairs, the first pair AND the second pair must return true for that row to count.
I have over 100k rows of data like below:
ALLA,ALLA,"Company1, Inc.","Company1, Inc.",PSA,PSA,1,1,FALSE,FALSE
BCCO,BCCO,"Company2, Inc.","Company2, Inc.",PSB,PSB,1,1,FALSE,FALSE
CTTP,CTTP,"Company3, Inc.","Company3, Inc.",PSC,PSC,1,1,FALSE,FALSE
CMMZ,CMMZ,"Company4, Inc.","Company4, Inc.",PSD,PSD,1,1,FALSE,FALSE
I want to know how to figure if data in column 1 is the same as column 2, column 3 as column 4 and so on. How could I do that in excel?
Following Cory's formula, I found that I can compare whole columns using:
=if(A:A=B:B, "yay", "aww")
Problem is I have a header in the file:
c - symbol, symbol, c - companyname, companyname, c - tradingvenue, tradingvenue, c - tierrank, tierrank, c - iscaveatemptor, iscaveatemptor
Shouldn't this cause A:A=B:B to be false?
Given this:
| A | B |
---+-----+-----+
1 | X | X |
---+-----+-----+
2 | Y | Y |
---+-----+-----+
3 | Z | Z |
The formula =SUMPRODUCT(--(A1:A3=B1:B3)) will tell you how many times the A value matches the B value.
You should get 3 as a result here. If, for example, you change B3 to Q then it will give you 2.
To do this on two columns without specifying the end of the range, try:
=SUMPRODUCT(--(A:A=B:B),--(LEN(A:A)>0))
I've been using Excel since 1991, and unless you want to write a VB macro, I think the best way is to do the simple IF statement suggested in the comments. If you need to test several columns at once, which is what your question suggests, then I'd do
=IF(AND(A1=B1,C1=D1,E1=F1,G1=H1),0,1)
Fill that formula down the column and then you'll be able toinstantly count the number of rows that don't matchwith a data-filter, select all the rows which have a '1', so you'll be able to examine the rows that don't match