SUM columns on the same sheet based on conditions or SUMIFS from another sheet - excel

Here is a small sample table
+--------+-------+--------+
| COL 1 | COL 2 | COL 3 |
+--------+-------+--------+
| abc123 | Total | |
+--------+-------+--------+
| abc123 | cat1 | 100.00 |
+--------+-------+--------+
| abc123 | cat2 | 200.00 |
+--------+-------+--------+
| def123 | Total | |
+--------+-------+--------+
| def123 | cat1 | 100.00 |
+--------+-------+--------+
| def123 | cat2 | 200.00 |
+--------+-------+--------+
In COL 3, if COL 2 is "Total", I need to SUM everything in COL 3 for each row that has the same value in COL 1 (e.g. the COL 3 Total row should be 300.00 for abc123 and 300.00 for def123). Otherwise, if COL 2 is NOT "Total", I need to do SUMIFS('Sheet3'!N:N,'Sheet3'!A:A,Sheet2!A473,'Sheet3'!Q:Q,Sheet2!Q473)*Sheet4!$U$2
How can I accomplish the first part, the SUM?
Edit:
I think my example is too rigid and appears like it is set.
Let me see if I can explain in more fluid terms. I will have to describe this somewhat in database terms. All of the columns are on one sheet for the purposes of the "Total" portion.
COL 1 is my partition. Each of the IDs in COL 1 spans 57 rows. One of those 57 rows has "Total" in another column; in the example that is COL 2.
So I have a large table in which COL 1 holds, say, 5 different IDs with 57 rows each, resulting in 285 rows.
Now, I had a sorting function that would likely make this whole thing easier, but that function is crashing Excel and not performing both required sorts ( https://techcommunity.microsoft.com/t5/excel/sort-function-causes-a-crash-and-does-not-perform-secondary-sort/m-p/1477123#M66205 )
I suppose if I can get the sorting function to stop crashing Excel this becomes slightly easier, as then "Total" is consistently placed in row 2, 58, 116, etc. and I can add up everything below it. Right now, because that sort doesn't work, I have to add up everything from COL 3 that is NOT assigned to "Total" in COL 2 and has the same ID in COL 1.
So in the table above abc123 has 3 rows, and I need to add up the two rows that are not the total for abc123 and have the formula put 300 into COL 3 on the Total row.
Then def123 needs the same treatment.
Here is the tough part: the row order is inconsistent because the data comes from a Redshift query, so the order within each ID is random. The IDs themselves are also in random order. I think I can get the sort on COL 1 to work without crashing Excel, but the secondary sort with the custom order is crashing it.

One way to avoid the Circular Reference error when trying to Total a column is to use two Sums, one above and one below.
So, assuming that your Columns 1, 2 and 3 are A, B and C, and that data starts in Row 2 (Row 1 being a header), you need the Sum of cells above the current row:
SUMIFS(C$1:C1, A$1:A1, A2)
Plus the Sum of the cells below the current row:
SUMIFS(C3:INDEX(C:C, 1+COUNTA(A:A)), A3:INDEX(A:A, 1+COUNTA(A:A)), A2)
(Note that this will actually terminate one row above and below the dataset)
Put this together with an IF statement:
=IF(B2="Total", SUMIFS(C$1:C1, A$1:A1, A2) + SUMIFS(C3:INDEX(C:C, 1+COUNTA(A:A)), A3:INDEX(A:A, 1+COUNTA(A:A)), A2), EXISTING_FORMULA_HERE)
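To make the intended result concrete, here is the same logic sketched outside Excel in Python. It is only an illustration, under the assumption that each row is an (ID, category, value) tuple and that the row order is arbitrary; the name fill_totals is made up for this sketch.

```python
from collections import defaultdict

def fill_totals(rows):
    """rows: list of (id, category, value); value is None on "Total" rows."""
    # First pass: add up every non-Total value per ID, regardless of row order.
    sums = defaultdict(float)
    for row_id, category, value in rows:
        if category != "Total":
            sums[row_id] += value
    # Second pass: write the per-ID sum into the Total rows only.
    return [
        (row_id, category, sums[row_id] if category == "Total" else value)
        for row_id, category, value in rows
    ]

rows = [
    ("abc123", "Total", None),
    ("abc123", "cat1", 100.0),
    ("abc123", "cat2", 200.0),
    ("def123", "cat2", 200.0),
    ("def123", "Total", None),  # Total row is not first within its ID
    ("def123", "cat1", 100.0),
]
filled = fill_totals(rows)
```

Like the two-SUMIFS formula, this does not depend on where the Total row sits inside its group.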
Alternatively, you could try writing an Array Formula to calculate the SUM directly, a bit like when using multiple conditions in a MATCH, something like this: (not enough information in the question to do this exactly)
=SUMPRODUCT('Sheet3'!N:N*(COUNTIFS(A:A,'Sheet3'!$A:A)>0)*(COUNTIFS(B:B,'Sheet3'!$Q:Q)>0))
(Sum of Sheet3!N:N when a row exists in the current sheet that matches columns Sheet3!A:A in Column A and Sheet3!Q:Q in Column B)
Note that working on Entire Columns with Array Formulae is quite slow, so you may want to limit those just to the Used Range

Related

How can I tell if more than one 'IFS' condition is 'TRUE' and not just the first match?

In Excel 365 I'm using an "IFS" statement to scan through a number of columns to find out if a cell's value is in any of the columns. I believe "IFS" will process all your conditions until it reaches the first one that is "TRUE" then output. However, I'd like to be able to find ALL instances where my condition is true and output or evaluate them all somehow. Is there a way to do this with IFS (or some other method)? I think I'd like to output the matching value for each true condition in a separate row, but anything that could help me see how many matched and/or which column each match is in would be helpful.
The code I have is a bit much to share as my columns are in other workbooks, so I'll just share a close example. This formula would be in a cell that outputs the match, column D below.
A | B | C | D | E
------------------------------------
ColA | Col1 | Col2 | Formula | Notes
------------------------------------
1 | 1 | 2 | 1 | Two matches in same column (Col1)
2 | 1 | 2 | 2 | Two matches in same column (Col2)
3 | 3 | 3 | 3 | Two matches in diff column (Col1 & Col2)
=IFS(
NOT(ISERROR(MATCH(INDIRECT("A"&(ROW())),INDIRECT("B:B"),0))),
INDEX(INDIRECT("B:B"),MATCH(INDIRECT("A"&(ROW())),INDIRECT("B:B"),0)),
NOT(ISERROR(MATCH(INDIRECT("A"&(ROW())),INDIRECT("C:C"),0))),
INDEX(INDIRECT("C:C"),MATCH(INDIRECT("A"&(ROW())),INDIRECT("C:C"),0))
)
Of course the expected output is to dump the matching value of the first condition that's true, but I'd like to output all instances the condition is true in separate rows if possible. Maybe something like this...
A | B | C | D | E
------------------------------------
ColA | Col1 | Col2 | Formula | Notes
------------------------------------
1 | 1 | 2 | 1 | Two matches in same column (Col1)
... | ... | ... | 1 | Two matches in same column (Col1)
2 | 1 | 2 | 2 | Two matches in same column (Col2)
... | ... | ... | 2 | Two matches in same column (Col2)
3 | 3 | 3 | 3 | Two matches in diff column (Col1 & Col2)
... | ... | ... | 3 | Two matches in diff column (Col1 & Col2)
In the above and in my actual case the '...' would display what's in the column of that particular row match, which may vary from one row to another row throughout the worksheets. Basically, column D in the example would be on a separate 'results' sheet with the same amount of columns and column value types as all the 'data' sheets being searched. Furthermore, each column of the 'results' sheet would be a formula scanning that one specific column in all sheets, but only outputting the given column value of the matched row. Something like below...
DATA SHEET
A | B | C
----------------------
FName | LName | Amount
----------------------
John | Doe | 10
Jane | Doe | 4
Jack | Black | 10
RESULTS SHEET
(all cells are formulas)
A | B | C
----------------------
FName | LName | Amount
----------------------
John | Doe | 10 < matching value in C
Jack | Black | 10 < but different A & C
I hope that last part answered any "why" questions. ;)
ADDITION (7/25/19):
Below is the complete formula I'm using on sheets like the above, but with more columns. It works well, except for my requirement to know where ALL matches occur and not just the first match in the IFS statement. Column "F" is the column I'm matching; for each match, the formula outputs the corresponding cell from the match's row on the data sheets (5 sheets) to the formula-driven 'results' sheet, as displayed above. The only thing that changes in the formula between cells is the "A:A" to "B:B" etc., including "F:F" (the column with the value to be matched from "SOURCES!$B$2"). I made that the last condition in the formula in case nothing is found in the other data sheets, so the row pastes its own data in lieu of something like 0, N/A, or FALSE.
=IFS(
NOT(ISERROR(MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$3)&"F:F"),0))),
INDEX(INDIRECT((SOURCES!$B$3)&"A:A"),MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$3)&"F:F"),0)),
NOT(ISERROR(MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$4)&"F:F"),0))),
INDEX(INDIRECT((SOURCES!$B$4)&"A:A"),MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$4)&"F:F"),0)),
NOT(ISERROR(MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$12)&"F:F"),0))),
INDEX(INDIRECT((SOURCES!$B$12)&"A:A"),MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$12)&"F:F"),0)),
NOT(ISERROR(MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$13)&"F:F"),0))),
INDEX(INDIRECT((SOURCES!$B$13)&"A:A"),MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$13)&"F:F"),0)),
NOT(ISERROR(MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$14)&"F:F"),0))),
INDEX(INDIRECT((SOURCES!$B$14)&"A:A"),MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$14)&"F:F"),0)),
NOT(ISERROR(MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$2)&"F:F"),0))),
INDEX(INDIRECT((SOURCES!$B$2)&"A:A"),MATCH(INDIRECT((SOURCES!$B$2)&"F"&(ROW())),INDIRECT((SOURCES!$B$2)&"F:F"),0))
)
My formulated "results" workbook also has a worksheet named "SOURCES" that I use to paste file names to connect all the data sheets corresponding columns.
Btw, I'm using this as a tool to 'un-merge' customer data between profiles in our LIVE site/database after obtaining all the tables and columns the customer key has been found (using SQL) to then compare it (using Excel) to our TEST site so I can pull apart the data that doesn't belong on the 'kept' record from the LIVE merge. In this case there were 3 records merged. Two records have a profile in the TEST site, while the kept record from the LIVE site actually does not have a TEST record, giving me 5 sheets of data to examine.
Suppose your data is in the range A2:C2.
I think this formula can help you.
Array formula (enter with Ctrl+Shift+Enter):
=INDEX($A2:$C2,MATCH("OK",IF(ISNUMBER($A2:$C2),"OK",""),0))
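For comparison, the requirement in the question (report every match, not only the first one that IFS would stop at) can be sketched in Python. This is an illustration only: the columns are assumed to be plain lists keyed by column name, and find_all_matches is a hypothetical helper, not an Excel function.

```python
def find_all_matches(value, columns):
    """Return (column_name, row_index) for every cell equal to value."""
    return [
        (name, i)
        for name, cells in columns.items()
        for i, cell in enumerate(cells)
        if cell == value
    ]

# Col1 and Col2 from the small example table in the question.
columns = {"Col1": [1, 1, 3], "Col2": [2, 2, 3]}
matches_for_1 = find_all_matches(1, columns)  # two matches, both in Col1
matches_for_3 = find_all_matches(3, columns)  # one match in each column
```

The length of the returned list gives the number of matches, and the first element of each pair says which column the match is in.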

How to sort a column based on exact matches with another column

I have an inventory table that looks like this (subset):
part number | price | quantity
10115 | 14.95 | 10
1050 | 5.95 | 12
1074 | 7.49 | 8
110-1353 | 13.99 | 22
I also have another table in sheet 2 that looks like this (subset):
part number | quantity
10023 | 1
110-1353 | 3
10115 | 2
20112 | 1
I basically want to subtract the quantities in the second table from the ones in the first table. What is the best way of doing this? I have looked into VLOOKUP and INDEX MATCH but they are not quite right for this. Would this perhaps be better done in, say, an Access DB?
I added another two columns next to the last column of sheet 1. Let us assume that the second table's range is A1:B5.
Formulas:
Column D:
=IFNA(VLOOKUP(A2,Sheet2!$A$2:$B$5,2,FALSE),0)
Column E:
=C2-D2
If you wanted to tackle this using MS Access, the SQL code might look like this:
select
    t1.[part number],
    t1.price,
    t1.quantity - nz(t2.quantity, 0) as qty
from
    inventory t1
    left join table2 t2 on t1.[part number] = t2.[part number]
Here, I assume that you have a table called inventory and a table called table2 (change these to suit your database).
A left join is used to ensure that all records from inventory are returned, regardless of whether a match is found in table2, and the Nz function is used to return 0 for records for which there is no part number match in table2.
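The same "left join, treat a missing match as 0" idea can be sketched in Python. This is an illustration under assumptions: the inventory is a list of (part number, price, quantity) tuples, the second table is a dict from part number to quantity, and subtract_quantities is a made-up name.

```python
def subtract_quantities(inventory, used):
    """inventory: list of (part, price, qty); used: dict of part -> qty to subtract."""
    # used.get(part, 0) plays the role of Nz(...) / IFNA(..., 0):
    # parts with no match in the second table subtract nothing.
    return [(part, price, qty - used.get(part, 0)) for part, price, qty in inventory]

inventory = [
    ("10115", 14.95, 10),
    ("1050", 5.95, 12),
    ("1074", 7.49, 8),
    ("110-1353", 13.99, 22),
]
used = {"10023": 1, "110-1353": 3, "10115": 2, "20112": 1}
remaining = subtract_quantities(inventory, used)
```

Part numbers are kept as strings so that values such as 110-1353 compare exactly. Parts that appear only in the second table (10023, 20112) are ignored, as with the left join.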

Generate new table from a current table in excel

I have sample table like this:
ID | 1 | 2 | 3
-------------
1 | 0 | 1 | 0
--------------
2 | 1 | 1 | 1
Then I want to generate a new table from that table. For each row ID (e.g. 1), it should pair the row with each column (1, 2, 3) and print the corresponding value of the matrix (0, 1, 0). For example:
Row_ID | Column_ID | Value
--------------------------
1 | 1 | 0
--------------------------
1 | 2 | 1
--------------------------
1 | 3 | 0
--------------------------
2 | 1 | 1
--------------------------
2 | 2 | 1
--------------------------
2 | 3 | 1
I'm not sure how or where to start with a formula. Please help. Thanks,
Well. There's no single formula that's going to do the job, obviously, but we have a few options we can use. I'll assume that the new table is going to start in cell A1 of Sheet2. Adjust accordingly.
Start with manually entered headers
Row_ID | Column_ID | Value
In the first column, first row, enter a 1. In rows below, use this formula: =IF(B3<B2,A2+1,A2) This will increment the value in the first column by 1 each time the second column resets its numbering.
In the second column, first row, enter a 1. The formula we'll use for this one will need some tweaking, but the basic version is: =IF(MOD(ROW()+1,3)=0,1,B2+1)
This formula is going to essentially count up to a certain point, then reset its numbering. The point it will count to, and where it will reset, will vary depending on the amount of data you have and which row you're starting from. Replace the 3 with the number of data columns you have. The +1 is needed to increase the ROW() counter to the SAME NUMBER as your number of data columns. So in my example, with 3 data columns and starting on row 2, the ROW() function gives us 2, so we need to add 1 to that to get up to a total of 3. If I had 5 data columns, I would add 3 instead. Hope that makes sense.
These two formulae should give you a set of row and column numbers. Copying the formula down will force the values to increase as needed, thus:
Row_ID | Column_ID | Value
1 | 1 |
1 | 2 |
1 | 3 |
2 | 1 |
...etc.
Finally, to bring in the values, we'll use an OFFSET formula in the Value column: =OFFSET(Sheet1!$A$1,A2,B2) That formula starts from a reference cell - A1, in this case - then moves down x number of rows and across y number of columns to return a value. X and Y are provided by the formulas we already have. Your final structure will be something like this:
Row_ID | Column_ID | Value
1 | 1 |=OFFSET(...
=IF(...|=IF(MOD(...|=OFFSET(...
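As a cross-check of what the three formulas produce together, here is the same reshaping sketched in Python. It is a minimal illustration, assuming the source table is a list of lists whose first row holds the column IDs and whose first column holds the row IDs; unpivot is a name chosen for this sketch.

```python
def unpivot(table):
    """table[0] is the header (corner label, then column IDs);
    each later row is a row ID followed by its values."""
    header, *body = table
    column_ids = header[1:]
    # One output row per (row ID, column ID) pair, in row-major order,
    # which is the order the Row_ID / Column_ID formulas count in.
    return [
        (row[0], col_id, value)
        for row in body
        for col_id, value in zip(column_ids, row[1:])
    ]

table = [
    ["ID", 1, 2, 3],
    [1, 0, 1, 0],
    [2, 1, 1, 1],
]
long_rows = unpivot(table)
# [(1, 1, 0), (1, 2, 1), (1, 3, 0), (2, 1, 1), (2, 2, 1), (2, 3, 1)]
```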
I hope all that made sense. Please let me know if there's anything that doesn't, and I'll try to troubleshoot.
EDITED TO ADD:
If the Row ID is something like a key that needs to be included with each value, we can get that fairly easily. We'll include another column with a slightly modified OFFSET formula: =OFFSET(Sheet1!$A$1,A2,0)
With this version of the formula we're not changing the column as we go down, just the row when it changes. It allows the values in the first row to be repeated in every row of the table. So this is my input:
And this is my output:
Notice that the ID repeats on each line of the output for the same item.

Counting the number of older siblings in an Excel spreadsheet

I have a longitudinal spreadsheet of adolescent growth.
ID | CollectionDate | DOB | MOTHER ID | Sex
1 | 1Aug03 | 3Apr90 | 12 | 1
1 | 4Sept04 | 3Apr90 | 12 | 1
1 | 1Sept05 | 3Apr90 | 12 | 1
2 | 1Aug03 | 21Dec91 | 12 | 0
2 | 4Sept04 | 21Dec91 | 12 | 0
2 | 1Sept05 | 21Dec91 | 12 | 0
3 | 1Aug03 | 30Jan89 | 23 | 0
3 | 4Sept04 | 30Jan89 | 23 | 0
This is a sample of how my data is formatted and some of the variables that I have. As you can see, since it is longitudinal, each individual has multiple measurements. In the actual database there are over 10 measurements per individual and over 250 individuals.
What I am wanting to do is input a value signifying the number of older brothers and older sisters each individual has. That is why I have included the Mother ID (because it represents genetic relatedness) and sex. These new variable columns would just say how many older siblings of each sex each individual has. Is there a formula that I could use to do this quickly?
=COUNTIFS($B:$B,"<>"&$B2,$H:$H,$H2,$AI:$AI,$AI2,$J:$J,"<"&$J2)
Create a column named Distinct with this formula
=1/COUNTIF([ID],[@ID])
Then you can find all the older 0-sexed siblings like this
=SUMPRODUCT(([DOB]<[@DOB])*([MOTHERID]=[@MOTHERID])*([Sex]=0)*([Distinct]))
Note that I made the data a Table and used table notation. If you're not familiar, [COLUMNNAME] refers to the whole column and [@COLUMNNAME] refers to the value in that column on the current row. It's similar to saying $A:$A and A2 if you're dealing with column A.
The first formula gives you a value to count that will always add up to 1 for a particular ID. So ID=1 has three lines and Distinct will be 0.33333 for each line. When you add up the three lines you get 1. This is similar to a SELECT DISTINCT in SQL parlance.
The SUMPRODUCT formula sums [Distinct] for every row where the DOB is earlier than the current DOB (that is, an older sibling), the Mother is the same as the current Mother, and the Sex is zero.
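The counting logic can also be sketched in Python, which makes the "count each person once" step explicit. This is only an illustration: rows are assumed to be (ID, DOB, mother ID, sex) tuples with the collection date left out, "older" is taken to mean an earlier date of birth, and older_siblings is a hypothetical name.

```python
from datetime import date

def older_siblings(rows, sex):
    """rows: (id, dob, mother_id, sex), one per measurement.
    Returns {id: number of older siblings of the given sex}."""
    # Keep one record per ID so repeated measurements are not counted twice.
    people = {row[0]: row for row in rows}
    return {
        pid: sum(
            1
            for _, other_dob, other_mother, other_sex in people.values()
            if other_mother == mother and other_sex == sex and other_dob < dob
        )
        for pid, dob, mother, _ in people.values()
    }

# Sample data from the question, with repeated measurements per ID.
rows = [
    (1, date(1990, 4, 3), 12, 1),
    (1, date(1990, 4, 3), 12, 1),
    (2, date(1991, 12, 21), 12, 0),
    (2, date(1991, 12, 21), 12, 0),
    (3, date(1989, 1, 30), 23, 0),
]
older_sex_1 = older_siblings(rows, 1)  # {1: 0, 2: 1, 3: 0}
older_sex_0 = older_siblings(rows, 0)  # {1: 0, 2: 0, 3: 0}
```

Here ID 2 has one older sibling of sex 1 (ID 1, same mother, born earlier), and nobody has an older sibling of sex 0.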
I have a possible solution. It involves adding two columns -- One for "# older siblings" and one for "unique?". So here are all the headings I have currently:
A -- ID
B -- CollectionDate
C -- DOB
D -- MOTHER ID
E -- Sex
F -- # older siblings
G -- unique?
In G2, I added the following formula:
=IF(A2=A1,0,1)
And dragged down. As long as the data is sorted by ID, this will only display "1" once for each unique person.
In F2, I added the following formula:
=COUNTIFS(G:G,"=1",D:D,"="&D2,C:C,"<"&C2)
And dragged down. It seemed to work correctly for the sample data you provided.
The stipulations are:
You would need the two columns.
The data would need to be sorted by ID
I hope this helps.
You need a formula like this (for example, for row 2):
=COUNTIFS($A:$A,"<>"&$A2,$E:$E,$E2,$D:$D,$D2,$C:$C,"<"&$C2)
Assuming E:E is column for sex, D:D is column for mother ID and C:C is column for DOB.
Write this formula in H2 cell for example and drag it down.

How to automatically insert a blank row after a group of data

I have created a sample table below that is similar-enough to my table in excel that it should serve to illustrate the question. I want to simply add a row after each distinct datum in column1 (simplest way, using excel, thanks).
CURRENT TABLE:
column1 | column2 | column3
----------------------------------
A | small | blue
A | small | orange
A | small | yellow
B | med | yellow
B | med | blue
C | large | green
D | large | green
D | small | pink
DESIRED TABLE
Note: the blank row after each distinct column1
column1 | column2 | column3
----------------------------------
A | small | blue
A | small | orange
A | small | yellow
 | |
B | med | yellow
B | med | blue
 | |
C | large | green
 | |
D | large | green
D | small | pink
This does exactly what you are asking, checks the rows, and inserts a blank empty row at each change in column A:
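Sub AddBlankRows()
    ' Insert an empty row wherever the value in column A changes.
    Dim iRow As Long, iCol As Long   ' Long avoids overflow past row 32,767
    Dim oRng As Range
    Set oRng = Range("A1")           ' first cell of the key column
    iRow = oRng.Row
    iCol = oRng.Column
    Do
        If Cells(iRow + 1, iCol) <> Cells(iRow, iCol) Then
            ' The next row starts a new group: insert a blank row and skip past it.
            Cells(iRow + 1, iCol).EntireRow.Insert Shift:=xlDown
            iRow = iRow + 2
        Else
            iRow = iRow + 1
        End If
    Loop While Not Cells(iRow, iCol).Text = ""   ' stop at the first empty cell
End Sub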
I hope that gets you started, let us know!
Philip
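For readers who want to check the loop's logic away from a worksheet, here is the same "insert a blank row at each change" idea sketched in Python. It is an illustration only: rows are assumed to be tuples keyed on their first element, None stands in for a blank row, and insert_blank_rows is a made-up name.

```python
def insert_blank_rows(rows, key=lambda row: row[0]):
    """Return a new list with None (a blank row) after each group of equal keys."""
    out = []
    for i, row in enumerate(rows):
        out.append(row)
        is_last = i == len(rows) - 1
        # A group ends when the next key differs or the data runs out.
        if is_last or key(rows[i + 1]) != key(row):
            out.append(None)
    return out

rows = [
    ("A", "small", "blue"),
    ("A", "small", "orange"),
    ("A", "small", "yellow"),
    ("B", "med", "yellow"),
    ("B", "med", "blue"),
    ("C", "large", "green"),
    ("D", "large", "green"),
    ("D", "small", "pink"),
]
spaced = insert_blank_rows(rows)
```

As in the macro, a blank row is also added after the final group.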
Select your array, including column labels, DATA > Outline -Subtotal, At each change in: column1, Use function: Count, Add subtotal to: column3, check Replace current subtotals and Summary below data, OK.
Filter and select for Column1, Text Filters, Contains..., Count, OK. Select all visible apart from the labels and delete contents. Remove filter and, if desired, ungroup rows.
This won't work if the data is not sequential (for example 5 7 3 1 5 rather than 1 2 3 4), as in that case you can't sort it.
Here is how I solved that issue:
Column A: initial data that needs to be spaced 5 rows apart -
5
4
6
8
9
Column B -
1
2
3
4
5
(the final number, 5 here, is the spacing in rows that you need between the numbers in column A) Copy-paste 1-5 down column B for as long as you have numbers in column A.
Jump to column D. In D1 type 1. In D2 type this formula - =IF(B2=1,1+D1,D1)
Drag it down to the same length as column B.
Back to column C - in cell C1 type this formula - =IF(B1=1,INDIRECT("a"&(D1)),""). Drag it down and we are done. Now column C has the same sequence of numbers as column A, separated by 4 empty rows.
Figured it out.
Step 1
Put a new column to the left of column1 and copy+paste the following formula
=B2=B3
=B3=B4
=B4=B5
... all the way to the bottom (assume column B here is column1 in the original question).
This formula tests whether the next row has the same value in column1, so each cell shows TRUE or FALSE; FALSE marks the last row of a group. Copy and paste these results as values, then replace "TRUE" with nothing and "FALSE" with 0.5, so that 0.5 appears only on the last row of each group.
Step 2
Then add that column full of only 0.5's to the column1 which will yield the following table:
newcolumn0 | column1 ("B") | column2 | column3
-----------------------------------------------------
| 1 | small | blue
| 1 | small | orange
1.5 | 1 | small | yellow
| 2 | med | yellow
2.5 | 2 | med | blue
3.5 | 3 | large | green
| 4 | large | green
4.5 | 4 | small | pink
Step 3
Lastly, copy and paste the values from newcolumn0 right below the values in column1 and then sort the table by column1 and you should have a blank row in between each distinct whole number in column1, with the table something like this:
newcolumn0 | column1 ("B") | column2 | column3
---------------------------------------------------------------
| 1 | small | blue
| 1 | small | orange
| 1 | small | yellow
1.5 | 1.5 | |
| 2 | med | yellow
| 2 | med | blue
2.5 | 2.5 | |
| 3 | large | green
3.5 | 3.5 | |
| 4 | large | green
| 4 | small | pink
4.5 | 4.5 | |
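The sort-key trick in these steps can be sketched in Python to show why the marker rows land between groups. This is an illustration under the assumption that column1 has already been turned into whole numbers; blank_rows_via_sort is a hypothetical name, and the marker rows are one-element tuples holding only the key.

```python
def blank_rows_via_sort(rows):
    """rows: tuples whose first element is a whole-number group key.
    Returns the rows sorted by key, with a marker row (key + 0.5) after each group."""
    group_keys = sorted({row[0] for row in rows})
    # One marker per distinct key; key + 0.5 sorts after that group
    # and before the next whole number.
    markers = [(key + 0.5,) for key in group_keys]
    return sorted(rows + markers, key=lambda row: row[0])

rows = [
    (1, "small", "blue"),
    (1, "small", "orange"),
    (1, "small", "yellow"),
    (2, "med", "yellow"),
    (2, "med", "blue"),
    (3, "large", "green"),
    (4, "large", "green"),
    (4, "small", "pink"),
]
result = blank_rows_via_sort(rows)
```

Clearing the marker values afterwards leaves the blank rows, which is the last manual step in the worksheet version.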
Alternative Solutions (still no VBA)
Put a value of 1 in column 1, row 2 (assume this is A2)
Put this formula in A3 =IF(B3=B2,A2,A2+1) and copy+paste this formula down the rest of column 1
Then copy all the values from column 1 into a new temporary sheet, remove duplicates, and add 0.5 to every number. Paste these values into the original sheet below the data in column 1, paste the whole column as values, and sort by that column. Finally, delete the temporary sheet.
Just an idea, if you know the categories, as small, medium, and large mentioned above...
At the bottom of the sheet, make 3 rows that only say small, medium, and large, change the font to white, and then sort so that it alphabetizes, placing a blank row between each section.
Insert a column at the left of the table 'Control'
Number the data as 1 to 1000 (assuming there are 1000 rows)
Copy the key field to another sheet and remove duplicates
Copy the unique row items to the main sheet, after 1000th record
In the 'Control' column, add number 1001 to all unique records
Sort the data (including the added records), first on key field and then on 'Control'
A blank line (with data in key field and 'Control') is added
I have a large file in Excel dealing with the purchase and sale of mutual fund units. The number of rows in a worksheet exceeds 4000. I have no experience with VBA and would like to work with basic Excel. Taking the cue from the solutions suggested above, I tried to solve the problem (inserting blank rows automatically) in the following manner:
I sorted my file according to control fields
I added a column to the file
I used the "IF" function to determine when there is a change in the control data.
If there is a change the result will indicate "yes", otherwise "no"
Then I filtered the data to group all "yes" items
I copied mutual fund names, folio number etc (no financial data)
Then I removed the filter and sorted the file again. The result is a row added at the desired place. (It is not entirely a blank row, because if it is fully blank, sorting will not place the row at the desired place.)
After sorting, you can easily delete all values to get a completely blank row.
This method also may be tried by the readers.

Resources