Excel multilevel array formula with partial string matches to sum resultant cells - excel

I've been trying to sort this for over a day now without much luck. I have successfully used SUMIFS, INDEX, MATCH, COUNTIF, "--" etc array functions previously and am not a novice, but also not an expert on these. I can't seem to weave these together correctly, and likely on an altogether incorrect path.
Basically, I am trying to aggregate data from multiple spreadsheets, requiring a mapping of various items (rows) into a canonical form for summing.
The image here shows a representative, but simplified version of my quest. Each "region" on this example spreadsheet (Final..., Mapping, DataSet1, DataSet2) is actually in different spreadsheets, and there are several sheets with 50-150 rows in each xlsx.
Note that the names in Column B are quite arbitrary (meaning not all P1's have an 'x' pattern like the x1, x2, etc. shown here). Do not rely on any pattern in the names, except that the x, y, z codes in the Mapping table are substrings (case-insensitive, trailing match) of the names in Column B of the DataSets.
In the image, the Final Result Table (summed manually) is what I want to compute via an (array) formula. A single formula would be ideal: the monthly data is pulled from many spreadsheets that I can't readily modify, but I can create an interim spreadsheet if required, so I am open to helper columns or helper rows.
Here's the process - For each name (B3-B5) in the Final Result Table, I want to sum the name from its components as follows:
Lookup all the matches in the Mapping Table (so for P1, the formula =IF($C$10:$C$15=$B3, $B$10:$B$15,"") gives {"x1";"";"";"x2";"";"x3"}).
I then want to search for each of x1, x2, and x3 in B19:B26 to get rows 21, 22, 24, 25, 26 in DataSet1, and in B31:B35 to get row 32 in DataSet2, and then add up the Jan totals into C3. (Effectively, C3=C21+C22+C24+C25+C26+C32.) Same for P2 and P3, and through Feb, Mar, ...
I am stuck on how to remove blank or 0 or Div0 or such "error rows" from the interim result in 2, and also need to use 2 arrays of different sizes (3 valid rows in example 2 above, ignoring blanks) to search many rows in DataSets. I tried SEARCH("*"&IF($C$10:$C$15=$B3, $B$10:$B$15,""), $B$19:$B$26) but get unexpected results. I have tried to replace text in the interim result {"x1";"";"";"x2";"";"x3"} with TRUE/FALSE, and 1/0, etc. to help with INDEX or MATCH, but am stymied by errors in downstream ("surrounding") formulas.
Thanks in advance.

Here is a solution without resorting to nasty (imo) CSE formulas.
= SUMPRODUCT($C$19:$F$26*(COUNTIFS($B$10:$B$15, RIGHT($B$19:$B$26,2),$C$10:$C$15,$B3)>0)*($C$18:$F$18=C$2))
+
SUMPRODUCT($C$31:$F$35*(COUNTIFS($B$10:$B$15, RIGHT($B$31:$B$35,2),$C$10:$C$15,$B3)>0)*($C$30:$F$30=C$2))
There is one SUMPRODUCT for each data set. If possible, it would be better to put all your data sets into a single table with a column identifying which data set each row is a part of.
The way it works is that it takes each value in your data set and multiplies it by whether the 2 right-most characters appear in your mapping table for that P code, multiplied by whether the value is in the correct month. So each term is 0 if either of those conditions is false. It then returns the sum.
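To make that logic concrete, here is a minimal Python sketch of what the SUMPRODUCT terms compute. Only the x1/x2/x3 to P1 pairs come from the question; the P2 and P3 pairs, the data-set names and values, and the helper name total_for are all assumptions made up for illustration.

```python
# Mapping table as (code, P code) pairs. Only the P1 pairs come from the
# question; the P2/P3 pairs are assumed for illustration.
MAPPING = {("x1", "P1"), ("y1", "P2"), ("z1", "P3"),
           ("x2", "P1"), ("y2", "P2"), ("x3", "P1")}

# Hypothetical data sets: (name, {month: value}).
DATASET1 = [("AAAAAAAAx1", {"Jan": 1, "Feb": 10}),
            ("BBBBBBBBy1", {"Jan": 2, "Feb": 20}),
            ("CCCCCCCCX2", {"Jan": 4, "Feb": 40})]
DATASET2 = [("DDDDDDDDx3", {"Jan": 8, "Feb": 80})]


def total_for(p_code, month, *datasets):
    """Sum `month` over rows whose last two characters map to `p_code`."""
    total = 0
    for dataset in datasets:
        for name, values in dataset:
            code = name[-2:].lower()           # RIGHT(name, 2), compared case-insensitively
            if (code, p_code) in MAPPING:      # plays the role of COUNTIFS(...) > 0
                total += values.get(month, 0)  # plays the role of the month-header test
    return total


print(total_for("P1", "Jan", DATASET1, DATASET2))
```

Each extra data set is just another argument here, which mirrors adding one more SUMPRODUCT term in the worksheet formula.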
UPDATE IN RESPONSE TO OP COMMENTS
If the X, Y, Z codes are not always 2 digits but the first part is ALWAYS 8 digits, you can easily amend the:
RIGHT($B$19:$B$26,2)
to be:
RIGHT($B$19:$B$26,LEN($B$19:$B$26)-8)
Making the formula for the first data set:
=SUMPRODUCT($C$19:$F$26*(COUNTIFS($B$10:$B$15, RIGHT($B$19:$B$26,LEN($B$19:$B$26)-8),$C$10:$C$15,$B3)>0)*($C$18:$F$18=C$2))
And you can amend for other data sets and simply add them together.

Nice challenge! Are you willing to drop all your tables (DataSet1, DataSet2...) into one spreadsheet, so that we can refer just one single range for each month?
Here's one solution (hopefully a good starting point) - array formula (Ctrl+Shift+Enter):
=SUMPRODUCT(IFERROR(IF(TRANSPOSE(IF($B3=$C$10:$C$15,$B$10:$B$15,""))=RIGHT($B$18:$B$36,2),C$18:C$36,0),0))

Related

Excel - Combine data from multiple tables dynamically

I would like to combine three different tables in Excel. I am struggling with the fact that the tables can vary in length.
For example:
What I would like to achieve is all the tables' data in one table without empty spaces. So first the two entries from the first table then the three entries from the second table and lastly the entry from the third table. But the amount of rows in each table can vary.
How can I do this dynamically so when the amount of entries in the tables change it can handle this? I'm using Mac with Office365. Thanks!
EDIT:
Output with Ron Rosenfeld's solution, the range of the list goes down from cell 5 - cell 103. Could this be reduced to 5 - 15?:
If you have Excel 2019 or Office 365, with the FILTERXML and TEXTJOIN functions, you can use:
=FILTERXML("<t><s>" & TEXTJOIN("</s><s>",TRUE,Table1,Table2, Table3) & "</s></t>","//s[.!=0]")
If those zeros are really blanks, you can omit [.!=0] from the xPath argument, but it won't hurt to leave it there.
Edit:
With MAC versions of Office 365 that do not have the FILTERXML function, I believe the following will work:
=LET(
a,299,
x,IF(SEQUENCE(99,,0)=0,1,SEQUENCE(99,,0)*a),
y,TEXTJOIN(REPT(" ",a),TRUE,Table19,Table20,Table21),
z, TRIM(MID(y,x,a)),FILTER(z,(z<>"0")*(z<>""))
)
Note the a parameter in the above function.
Because of how the splitting algorithm works, the sequence for each cell will not always start at the beginning of a string.
Hence, if there are enough letters in the various strings, the start number may eventually get offset enough to cause a split in the wrong location.
One fix is to use an arbitrarily large number of spaces to insert.
99 is frequently large enough, but not for this data set.
299 seems to be large enough for the data set as shown in your actual data.
I believe the minimum number should be the sum of the lengths of all the strings in the original tables (including the 0's) plus one (1), but I am not sure of this.
You can certainly adjust it as needed.
If the number becomes too large, you could run into the 32,767 character limitation. If that happened, an error would occur.
So, if you wanted to compute a, dynamically, you could try something like:
=LET(
a,SUM(LEN(Table19[Column1]),LEN(Table20[Column1]),LEN(Table21[Column1]))+1,
x,IF(SEQUENCE(99,,0)=0,1,SEQUENCE(99,,0)*a),
y,TEXTJOIN(REPT(" ",a),TRUE,Table19,Table20,Table21),
z, TRIM(MID(y,x,a)),FILTER(z,(z<>"0")*(z<>""))
)
but no guarantees.
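The splitting trick and its sensitivity to a can be sketched in Python. This is only an illustration under stated assumptions: the function name split_by_padding and the sample strings are made up, the items contain no internal spaces (TRIM would also collapse those), and str.strip stands in for TRIM.

```python
def split_by_padding(items, a, n=99):
    """Mimic TEXTJOIN(REPT(" ", a), ...) followed by TRIM(MID(y, x, a))."""
    y = (" " * a).join(items)
    out = []
    for i in range(n):
        start = 1 if i == 0 else i * a             # 1-based start, as in the LET formula
        chunk = y[start - 1:start - 1 + a].strip()  # MID(y, start, a), then TRIM
        if chunk not in ("", "0"):                  # FILTER(z, (z<>"0")*(z<>""))
            out.append(chunk)
    return out


items = ["alpha", "beta", "gamma", "delta"]         # hypothetical sample data
total_length = sum(len(s) for s in items)

print(split_by_padding(items, total_length + 1))    # padding wide enough for this sample
print(split_by_padding(items, 5))                   # padding too narrow: pieces get cut
```

With the narrow padding the windows drift across item boundaries, which is the wrong-location split described above.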
Assuming the data is in A:C, and empty cell is blank (not 0).
In E1 put :
=IF(ROW()>COUNTA(A:C),"",
INDEX(A:C,
IF(ROW()<=COUNTA(A:A),ROW(),IF(ROW()<=COUNTA(A:B),ROW()-COUNTA(A:A),ROW()-COUNTA(A:B))),
IF(ROW()<=COUNTA(A:A),1,IF(ROW()<=COUNTA(A:B),2,3)))
)
Idea: use ROW() to guide the selection in INDEX. COUNTA() is used to convert ROW() into usable row and column index numbers. The output cell is also made blank ("") for ROW() > COUNTA(A:C).
Please share whether it works or not.
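The row-to-cell mapping behind that formula can be sketched in Python. This is a hedged illustration: the function name stacked_cell and the sample columns are hypothetical, and it assumes each column's data starts in row 1 with no gaps, as the COUNTA-based formula does.

```python
def stacked_cell(r, count_a, count_b, count_c):
    """Map 1-based output row r to a (row, column) pair, or None past the end."""
    if r > count_a + count_b + count_c:   # ROW() > COUNTA(A:C) gives ""
        return None
    if r <= count_a:                      # still inside column A
        return (r, 1)
    if r <= count_a + count_b:            # inside column B
        return (r - count_a, 2)
    return (r - count_a - count_b, 3)     # inside column C


# Hypothetical columns A, B and C of different lengths.
columns = {1: ["a1", "a2"], 2: ["b1", "b2", "b3"], 3: ["c1"]}
counts = [len(columns[c]) for c in (1, 2, 3)]

stacked = []
for r in range(1, 10):
    cell = stacked_cell(r, *counts)
    if cell is not None:
        row, col = cell
        stacked.append(columns[col][row - 1])

print(stacked)
```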

Excel IF and IFERROR formula nested 140 times throws an error saying it can only be nested 64 times

I have 140 unique numbers and am trying to find each of them in a list; the result can then be used in VBA.
The formula works fine until 64 IFs are used; beyond that I run into trouble.
=IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IFERROR(IF(FIND("5216",A2,1)>0,"00000A-5216",""),IF(FIND("5140",A2,1)>0,"00000B-5140","")),IF(FIND("5148",A2,1)>0,"00000C-5148","")),IF(FIND("5117",A2,1)>0,"00000D-5117","")),IF(FIND("5204",A2,1)>0,"00000E-5204","")),IF(FIND("5238",A2,1)>0,"00000F-5238","")),IF(FIND("5203",A2,1)>0,"00000G-5203","")),IF(FIND("5237",A2,1)>0,"00000H-5237","")),IF(FIND("5051",A2,1)>0,"5051","")),IF(FIND("0101",A2,1)>0,"0101","")),IF(FIND("0700",A2,1)>0,"0700","")),IF(FIND("3208",A2,1)>0,"3208","")),IF(FIND("3209",A2,1)>0,"3209","")),IF(FIND("3900",A2,1)>0,"3900","")),IF(FIND("3901",A2,1)>0,"3901","")),IF(FIND("5029",A2,1)>0,"5029","")),IF(FIND("5030",A2,1)>0,"5030","")),IF(FIND("5032",A2,1)>0,"5032","")),IF(FIND("5033",A2,1)>0,"5033","")),IF(FIND("5036",A2,1)>0,"5036","")),IF(FIND("5049",A2,1)>0,"5049","")),IF(FIND("5067",A2,1)>0,"5067","")),IF(FIND("5068",A2,1)>0,"5068","")),IF(FIND("5069",A2,1)>0,"5069","")),IF(FIND("5072",A2,1)>0,"5072","")),IF(FIND("5073",A2,1)>0,"5073","")),IF(FIND("5075",A2,1)>0,"5075","")),IF(FIND("5076",A2,1)>0,
"5076","")),IF(FIND("5078",A2,1)>0,"5078","")),
IF(FIND("5079",A2,1)>0,"5079","")),IF(FIND("5080",A2,1)>0,"5080","")),IF(FIND("5081",A2,1)>0,"5081","")),IF(FIND("5082",A2,1)>0,"5082","")),IF(FIND("5083",A2,1)>0,"5083","")),IF(FIND("5090",A2,1)>0,"5090","")),IF(FIND("5094",A2,1)>0,"5094","")),IF(FIND("5095",A2,1)>0,"5095","")),IF(FIND("5100",A2,1)>0,"5100","")),IF(FIND("5106",A2,1)>0,"5106","")),IF(FIND("5124",A2,1)>0,"5124","")),IF(FIND("5125",A2,1)>0,"5125","")),IF(FIND("5126",A2,1)>0,"5126","")),IF(FIND("5147",A2,1)>0,"5147","")),IF(FIND("5150",A2,1)>0,"5150","")),IF(FIND("5151",A2,1)>0,"5151","")),IF(FIND("5155",A2,1)>0,"5155","")),IF(FIND("5156",A2,1)>0,"5156","")),IF(FIND("5157",A2,1)>0,"5157","")),IF(FIND("5158",A2,1)>0,"5158","")),IF(FIND("5159",A2,1)>0,"5159","")),IF(FIND("5194",A2,1)>0,"5194","")),IF(FIND("5195",A2,1)>0,"5195","")),IF(FIND("5196",A2,1)>0,"5196","")),IF(FIND("5205",A2,1)>0,"5205","")),IF(FIND("5227",A2,1)>0,"5227","")),IF(FIND("5228",A2,1)>0,"5228",""))IF(FIND("5229",A2,1)>0,"5229","")),IF(FIND("5234",A2,1)>0,"5234","")),IF(FIND("5241",A2,1)>0,"5241","")),IF(FIND("5242",A2,1)>0,"5242","")),IF(FIND("5243",A2,1)>0,"5243","")),IF(FIND("5244",A2,1)>0,"5244","")),IF(FIND("5254",A2,1)>0,"5254","")),IF(FIND("5255",A2,1)>0,"5255","")),IF(FIND("5267",A2,1)>0,"5267","")),IF(FIND("5269",A2,1)>0,"5269","")),IF(FIND("5271",A2,1)>0,"5271","")),IF(FIND("5278",A2,1)>0,"5278","")),IF(FIND("5280",A2,1)>0,"5280","")),IF(FIND("5286",A2,1)>0,"5286","")),IF(FIND("5297",A2,1)>0,"5297","")),IF(FIND("5305",A2,1)>0,"5305","")),IF(FIND("5306",A2,1)>0,"5306","")),IF(FIND("5310",A2,1)>0,"5310","")),IF(FIND("5315",A2,1)>0,"5315","")),IF(FIND("5316",A2,1)>0,"5316","")),IF(FIND("5318",A2,1)>0,"5318","")),IF(FIND("5321",A2,1)>0,"5321","")),IF(FIND("5322",A2,1)>0,"5322","")),IF(FIND("5324",A2,1)>0,"5324","")),IF(FIND("5325",A2,1)>0,"5325","")),IF(FIND("5326",A2,1)>0,"5326","")),IF(FIND("5327",A2,1)>0,"5327","")),IF(FIND("5328",A2,1)>0,"5328","")),IF(FIND("5336",A2,1)>0,"5336","")),IF(FIND("5337",A2,1)>0,"5337","")),IF(FIN
D("5339",A2,1)>0,"5339","")),IF(FIND("5341",A2,1)>0,"5341","")),IF(FIND("5350",A2,1)>0,"5350",""))IF(FIND("5351",A2,1)>0,"5351","")),IF(FIND("5352",A2,1)>0,"5352","")),IF(FIND("5353",A2,1)>0,"5353","")),IF(FIND("5356",A2,1)>0,"5356","")),IF(FIND("5357",A2,1)>0,"5357","")),IF(FIND("5358",A2,1)>0,"5358","")),IF(FIND("5359",A2,1)>0,"5359","")),IF(FIND("5360",A2,1)>0,"5360","")),IF(FIND("5361",A2,1)>0,"5361","")),IF(FIND("5362",A2,1)>0,"5362","")),IF(FIND("5363",A2,1)>0,"5363","")),IF(FIND("5378",A2,1)>0,"5378","")),IF(FIND("5379",A2,1)>0,"5379","")),IF(FIND("5380",A2,1)>0,"5380","")),IF(FIND("5381",A2,1)>0,"5381","")),IF(FIND("5382",A2,1)>0,"5382","")),IF(FIND("5383",A2,1)>0,"5383","")),IF(FIND("5389",A2,1)>0,"5389",""))IF(FIND("5390",A2,1)>0,"5390","")),IF(FIND("5392",A2,1)>0,"5392","")),IF(FIND("6000",A2,1)>0,"6000","")),IF(FIND("6001",A2,1)>0,"6002","""")),IF(FIND("6003",A2,1)>0,"6003","")),IF(FIND("6004",A2,1)>0,"6004","")),IF(FIND("6005",A2,1)>0,"6005","")),IF(FIND("6006",A2,1)>0,"6006","")),IF(FIND("6653",A2,1)>0,"6653","")),IF(FIND("6654",A2,1)>0,"6654","")),IF(FIND("6655",A2,1)>0,"6655","")),IF(FIND("6656",A2,1)>0,"6656","")),IF(FIND("6657",A2,1)>0,"6657","")),IF(FIND("9202",A2,1)>0,"9202","")),IF(FIND("9401",A2,1)>0,"9401","")),RIGHT(A2,3,4))"
The result should return the number mentioned, and I am planning to sort the orders in ascending order.
The value in A2 looks like PMGAG5216GC, PMG005216GC, PMGVV5140GC, PMG005140GC, PMGVV5148GCW, PMGAG5117GCW, PMG005117GCW, PMGAG5204GCB, PMG005204GCB, PMGAG5238GCB, PMGVV5238GCB, PMG005238GCB, PMGAG5203GCB, etc. These are some sample order numbers that are being updated; 5238, for example, is a number that I have to find within an order so that I can sort the orders in ascending order. In the same way, I have 140 numbers that have to be found to sort the orders accordingly. The 4-digit numbers are fixed within the orders, and each should be one of the 140 numbers in the list I mentioned.
Rule of thumb, if you see yourself nesting anything deeper than 5 or 6 levels, stop and take the time to see if there wouldn't be a more easily maintainable way to do the same thing. Hitting hard limits (e.g. 64 levels of nesting) is rarely a sign that things are done in an optimal fashion.
PMGAG5216GC PMG005216GC PMGVV5140GC PMG005140GC PMGVV5148GCW PMGAG5117GCW PMG005117GCW PMGAG5204GCB PMG005204GCB PMGAG5238GCB PMGVV5238GCB PMG005238GCB PMGAG5203GCB
Assuming the format is consistently the same, you can grab the 4 characters starting at the 6th position, and then verify if these 4 characters exist in a lookup table that contains the 140 values you're interested in. The MID function can be used to do this.
You could leverage the fact that VLOOKUP in the first column of the lookup table would return the lookup value itself, and a lookup failure would be #N/A, so wrapping it with IFERROR to turn that into an empty string would look like this:
=IFERROR(VLOOKUP(MID(A2,6,4),theLookupTable[TheLookupColumn],1,FALSE),"")
Now, it looks like some of the values need a prefix, e.g. "00000A-"; include that prefix (with the dash, so you don't have to conditionally add it in the formula) in the lookup table (say, in some [ThePrefixColumn] column) where it's needed, and just concatenate it in front of the looked-up value.
=IFERROR(VLOOKUP(MID(A2,6,4),theLookupTable[[TheLookupColumn]:[ThePrefixColumn]],2,FALSE) & VLOOKUP(MID(A2,6,4),theLookupTable[TheLookupColumn],1,FALSE),"")
Better if you can turn the MID(A2,6,4) part into a helper cell instead of computing it twice - use that MID function on your source data to populate the lookup table.
The lookup table might look like this:
TheLookupColumn  ThePrefixColumn
5216             00000A-
5140             00000B-
5148             00000C-
...
3901
...
Sort the table by TheLookupColumn, and the lookups should be pretty fast.
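As a rough Python sketch of the same idea (extract the 4 characters at a fixed position, then look them up in a table): the dictionary below holds only the few entries shown above, the order PMG003901GC is a made-up example, and the helper name code_for is hypothetical. The prefix is placed in front so the result matches the "00000A-5216" style from the question.

```python
# Lookup table: 4-character code -> prefix ("" where no prefix is needed).
# Only a handful of the 140 entries are shown.
PREFIXES = {"5216": "00000A-", "5140": "00000B-", "5148": "00000C-", "3901": ""}


def code_for(order):
    """Return the prefixed code found in `order`, or "" if it is not in the table."""
    key = order[5:9]               # MID(order, 6, 4)
    if key not in PREFIXES:        # the #N/A case that IFERROR turns into ""
        return ""
    return PREFIXES[key] + key


for order in ("PMGAG5216GC", "PMG005140GC", "PMG003901GC", "PMGAG9999GC"):
    print(order, "->", repr(code_for(order)))
```

Adding the remaining codes is then a matter of extending the table, not the formula.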
If you just want to show the first number from your lookup list which is contained in any given order number you can do something like this:
It's an array formula so you need to enter it using Ctrl + Shift + Enter
Assumes there can be only one match per order number and that none of the items in your lookup list are substrings of another item (though a workaround for that would be to sort your lookup list in descending order of item length)

How do I calculate a Sum based on multiple If's in Excel?

Background is that I'm making a budget spreadsheet. I have different bills due on different days. (ie. bill due on Monday and bill due on the 10th)
I want a function that will place the appropriate amount of money going in/out in column D and the description of why the money is going in/out in column E.
Currently I have two different formulas that I created (probably incorrectly).
Formula for Column E: (already in the document and seems to work fine, other than the fact that I can't add additional text to the cell)
=IF(DAY(C36)=7," Amy Pay","")&IF(DAY(C36)=22," Amy Pay","")&IF(DAY(C36)=8," Family Bills","")&IF(DAY(C36)=6," Dollar Shave Club","")&IF(DAY(C36)=2," Amy Cap One VISA","")&IF(DAY(C36)=3," Chase VISA","")&IF(DAY(C36)=8," Being Smart","")&IF(DAY(C36)=17," Gym","")&IF(DAY(C36)=11," Netflix","")&IF(DAY(C36)=19," Cap One MC","")&IF(DAY(C36)=29," CenturyLink","")&IF(DAY(C36)=6," Haley Cap One Visa","")&IF(DAY(C36)=10," SRP","")&IF(DAY(C36)=23, "Car Payment","")&IF(DAY(C36)=30, "Rent","")&IF((B36)="Mon"," Monday","")&IF((B36)="Fri"," Friday","")&IF((B36)="Fri"," Haley Pay","")
Formula for Column D: (not in the column yet, as it doesn't work how I want)
=IF(DAY(B40)=7,"1474.22","")&IF(DAY(B40)=22,"1474.22","")&IF(DAY(B40)=8,"-100","")&IF(DAY(B40)=6,"-9","")&IF(DAY(B40)=2,"-100","")&IF(DAY(B40)=3,"-100","")&IF(DAY(B40)=8,"-400","")&IF(DAY(B40)=17,"-20.05","")&IF(DAY(B40)=11,"-8.63","")&IF(DAY(B40)=19,"-450","")&IF(DAY(B40)=29,"-50","")&IF(DAY(B40)=6,"-150","")&IF(DAY(B40)=10,"-200","")&IF(DAY(B40)=23,"-325","")&IF(DAY(B40)=30,"-500","")&IF((A40)="Mon","-125","")&IF((A40)="Fri","-325","")&IF((A40)="Fri","400","")
http://imgur.com/IBINweh
      
The problem is that in column D, rather than providing a sum of the numbers, it lists the numbers in the column.
http://imgur.com/rPDS5h2
      
I had a suggestion to add =SUM( in front of the IF( function, but when I do, #VALUE! is what results in the field. Using this formula: (view image by changing appended text to /CVs0f1v )
=SUM(IF(DAY(B40)=7,"1474.22","")&IF(DAY(B40)=22,"1474.22","")&IF(DAY(B40)=8,"-100","")&IF(DAY(B40)=6,"-9","")&IF(DAY(B40)=2,"-100","")&IF(DAY(B40)=3,"-100","")&IF(DAY(B40)=8,"-400","")&IF(DAY(B40)=17,"-20.05","")&IF(DAY(B40)=11,"-8.63","")&IF(DAY(B40)=19,"-450","")&IF(DAY(B40)=29,"-50","")&IF(DAY(B40)=6,"-150","")&IF(DAY(B40)=10,"-200","")&IF(DAY(B40)=23,"-325","")&IF(DAY(B40)=30,"-500","")&IF((A40)="Mon","-125","")&IF((A40)="Fri","-325","")&IF((A40)="Fri","400",""))
Any ideas on how I can get all of the amounts to populate and sum appropriately?
Forgive my Non Excel Guru knowledge - trying to learn. :D
-Amy
If you take all of the options from your first working formula and change the method retrieving them, you will have a much more versatile worksheet that can easily accept new additions and schedule modifications.
    
In a couple of unused columns to the right, put in the day-of-month and the action that occurs. I'm using columns Y & Z. You have two events occurring on the 6th so I put them together.
In a couple of other unused columns, put the day-of-the-week number and its associated text; I've used columns V & W. By default, WEEKDAY returns 1 for Sunday.
In E36 use this formula: =TRIM(IFERROR(VLOOKUP(DAY(C36),$Y:$Z, 2, FALSE), "")&" "&IFERROR(VLOOKUP(WEEKDAY(C36),$V:$W, 2, FALSE), ""))
Fill down as necessary.
If you want the day-of-the-week in column B, use =C36 and use a custom number format of ddd or dddd.
References:
  VLOOKUP function  WEEKDAY function
You are concatenating text strings that look like numbers. You probably want to be adding real numbers:
=SUM(IF(DAY(B40)=7,1474.22,0) + IF(DAY(B40)=22,1474.22,0) + ...
That said, whenever I see a formula as complex as yours, I would consider looking for a different solution; VLOOKUP comes to mind.
In addition, with a VLOOKUP table, you would have seen that you have some conflicts -- e.g: you list the same condition of B40=8 to return two different values; and the same condition of A40 = Fri, to also return two different values.
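A table-driven version of the budget logic can be sketched in Python. This is a hedged illustration of the lookup-table idea, not a worksheet formula: the schedule below is only a subset of the entries from the question, and the names SCHEDULE, entries_for and summarize are hypothetical. Keeping one row per event lets two events share a day without conflict, since the matching amounts are simply added.

```python
from datetime import date

# (kind, key, amount, description); kind is "day" (day of month) or
# "weekday" (0 = Monday ... 6 = Sunday, as returned by date.weekday()).
SCHEDULE = [
    ("day", 7, 1474.22, "Amy Pay"),
    ("day", 22, 1474.22, "Amy Pay"),
    ("day", 8, -100, "Family Bills"),
    ("day", 8, -400, "Being Smart"),
    ("weekday", 0, -125, "Monday"),
    ("weekday", 4, -325, "Friday"),
    ("weekday", 4, 400, "Haley Pay"),
]


def entries_for(d):
    """Return the (amount, description) pairs that apply on date d."""
    keys = {"day": d.day, "weekday": d.weekday()}
    return [(amount, text) for kind, key, amount, text in SCHEDULE
            if keys[kind] == key]


def summarize(d):
    """Return (net amount, combined description) for date d."""
    rows = entries_for(d)
    return sum(amount for amount, _ in rows), " ".join(text for _, text in rows)


print(summarize(date(2024, 3, 8)))   # a Friday that is also the 8th
```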

SUMIFS with intermediate VLOOKUP in the criteria

I have 3 tables, 1 of which I want to fill in columns with data based on the other 2. Tables are roughly structured as follows:
Table 1 (Semi-Static Data)
SubGroup     Group
-----------  -----------
subgroup(1)  group(a)
subgroup(2)  group(b)
subgroup(3)  group(b)
subgroup(4)  group(c)
etc.
Table 2 (Variable Data)
SubGroup     DataValue
-----------  --------------
subgroup(1)  datavalue(i)
subgroup(2)  datavalue(ii)
subgroup(3)  datavalue(iii)
subgroup(4)  datavalue(iv)
etc.
Table 3 (Results)
Group        TotalValue
-----------  -------------
group(a)     totalvalue(m)
group(b)     totalvalue(n)
group(c)     totalvalue(o)
etc.
Where the TotalValue is the sum of all DataValue's for all subgroups that belong to that particular Group.
e.g. for group(b) ---> totalvalue(n) = datavalue(ii) + datavalue(iii)
I am looking to achieve this calculation without adding any additional columns to the Data tables nor using VBA.
Basically I need to perform a SUMIFS where there is an additional VLOOKUP matching the subgroup criteria range to the group it belongs to, and then sum only the datavalues that match the group being evaluated. I have tried using array formulas but I'm having issues making it work. Any assistance would be very appreciated. Thank you,
EDIT: I wanted to add some details surrounding my question. First, Google searches did not provide a suitable answer. All the links had solutions to a slightly different problem, where the VLOOKUP term does not depend on the SUMIFS criteria but rather on another single static variable. Stack Overflow offered similar solutions. Please let me know if any more details are required to make my post suitable for this forum. Thank you again.
You can use the SUMPRODUCT function to do it all at once. The first reference $B$2:$B$5 is for the Group names, the second reference $E$2:$E$5 is for the datavalues. The G2 reference is for the group names in the third table, you can enter this formula for the first reference and then drag and fill for the rest.
=SUMPRODUCT($E$2:$E$5 * (G2 = $B$2:$B$5))
Some cell references, and sample data, would be helpful but something like this might be what you want:
=SUMIF(C:C,"="&INDEX(A:A,MATCH(E5,B:B,0)),D:D)
WADR & IMHO, this is simply bad worksheet design. For lack of a single cross-reference column in Table2, any solution would have to be a VBA User Defined Function or an overly complicated array formula (the latter of which I am not even sure is possible). The data tables are not normalized database tables you can INNER JOIN or GROUP BY ... HAVING.
The formula you are trying to achieve is akin to,
=SUMPRODUCT(SUMIF(D:D, {"subgroup(2)","subgroup(3)"}, E:E))
That only works with hard-coded values as arrayed constants (e.g. {"subgroup(2)","subgroup(3)"}). I know of no way to spit a dynamic list back into the formula using additional native Excel functions but VBA offers some possibilities.
HOWEVER,
The simple addition of one more column to Table2 with a very basic VLOOKUP reduces all of your problems to a SUMIF.
     
The formula in the new column D, row 2 is,
=VLOOKUP(E2, A:B, 2, FALSE)
The formula in I2 is,
=SUMIF(D:D, H2,F:F )
Fill each down as necessary. Sorry if that is not what you wanted to hear.
Thank you everyone that responded and reviewed this post. I have managed to resolve this using an array formula and some matrix algebra. Please note that I am not using VLOOKUP (this operator cannot be performed on arrays) nor SUMIFS as my title states.
My final formula looks like this:
{=SUM(IF([Table2.xlsx]Sheet1!SubGroup=TRANSPOSE(IF([Table1.xlsx]Sheet1!Group=G2,[Table1.xlsx]Sheet1!SubGroup,"")),[Table2.xlsx]Sheet1!DataValue))}
Very simply, I create an array variable that compares the Group being evaluated (e.g. cell G2) with the Groups column for Table 1 and outputs the corresponding matching SubGroups. This results in an array with as many rows as Table 1 had (N) and 1 column: Nx1. I then transpose that array (1xN) and compare it to the SubGroups column (Mx1, M being the number of rows in Table 2) and output the DataValues column for the rows that have a corresponding SubGroup (MxN). Then I perform a sum of the whole array to return a single value.
Notice that, as I didn't include a value_if_false argument in either IF function, the arrays are simply populated with FALSE where the conditions are not met. This does not matter for the final result, though. In the first IF, FALSE will not match any SubGroup, so it is ignored. In the second, all FALSE values passed to SUM are counted as 0. The more serious concern is that this increases the amount of memory required, since we are not filtering down to just the values we want.
For this application I decided against filtering the subarray, as the trade-off in resource utilization was acceptable. If the data sets were any bigger, though, I would definitely try doing it. Another concern was that I did not fully understand the filtering logic that I was using, based on http://exceltactics.com/make-filtered-list-sub-arrays-excel-using-small/, so I decided to simplify. I will revisit this concept later, as I think it will work. I might have completed that solution, but I was missing the step of transposing the array to compare properly, so I abandoned that route.
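The two-step array logic can be mirrored in plain Python. This is only a sketch under assumptions: the subgroup names, group letters and data values are made-up sample data, and group_total is a hypothetical helper name. Nested lists stand in for the Nx1 and MxN arrays.

```python
# Hypothetical sample data standing in for Table 1 and Table 2.
TABLE1 = [("sub1", "a"), ("sub2", "b"), ("sub3", "b"), ("sub4", "c")]   # (SubGroup, Group)
TABLE2 = [("sub1", 10), ("sub2", 20), ("sub3", 30), ("sub4", 40)]       # (SubGroup, DataValue)


def group_total(group):
    """Sum the DataValues of all subgroups that belong to `group`."""
    # Inner IF: one entry per Table 1 row, the subgroup name or False.
    matching = [sub if g == group else False for sub, g in TABLE1]
    # Outer IF: M x N grid holding the DataValue where subgroups match, else False.
    grid = [[value if sub == m else False for m in matching]
            for sub, value in TABLE2]
    # SUM: False counts as 0, so only matched DataValues contribute.
    return sum(cell for row in grid for cell in row)


print(group_total("b"))
```

The grid has M x N cells regardless of how many actually match, which is the memory cost of not filtering first.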

nested excel functions with conditional logic

Just getting started in Excel and I was working with a database extract where I need to count values only if items in another column are unique.
So, below is my starting point:
=SUMPRODUCT(COUNTIF(C3:C94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"}))
What I'd like to figure out is the syntax to do something like this:
=sumproduct (Countif range1 criteria..., where range2 criteria="is unique value")
Am I getting this right? The syntax is a bit confusing, and I'm not sure I've chosen the right functions for the task.
I just had to solve this same problem a week ago.
This method works even when you can't always sort on the grouping column (J in your case). If you can keep the data sorted, @MikeD's solution will scale better.
Firstly, do you know the FREQUENCY trick for counting unique numbers? FREQUENCY is designed to create histograms. It takes two arrays, 'data' and 'bins'. It sorts 'bins', then creates an output array that's one longer than 'bins'. Then it takes each value in 'data' and determines which bin it belongs in, incrementing the output array accordingly. It returns the array. Here's the important part: If a value appears in 'bins' more than once, any 'data' value meant for that bin goes in the first occurrence. The trick is to use the same array for both 'data' and 'bins'. Think it through, and you'll see that there's one non-zero value in the output for each unique number in the input. Note that it only counts numbers.
In short, I use this:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to count unique numeric values in <array>
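To see why the trick works, here is a small Python model of FREQUENCY as described above. It is a sketch under assumptions, not Excel's actual implementation: it handles all-numeric inputs only, and the function names frequency and count_unique are illustrative.

```python
def frequency(data, bins):
    """Toy model of FREQUENCY for numeric inputs.

    Returns len(bins) + 1 counts aligned with `bins`; the last slot counts
    values greater than every bin. A value goes into the smallest bin that
    is >= the value, and duplicate bins are credited to their first occurrence.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        candidates = [b for b in bins if b >= x]
        if candidates:
            counts[bins.index(min(candidates))] += 1   # first occurrence only
        else:
            counts[-1] += 1
    return counts


def count_unique(values):
    """SUM(SIGN(FREQUENCY(values, values))): one non-zero bin per distinct number."""
    return sum(1 for c in frequency(values, values) if c > 0)


values = [3, 1, 3, 2, 1]
print(frequency(values, values))
print(count_unique(values))
```

Repeated numbers all land in the first bin holding that number, so later duplicates stay at zero and drop out of the SIGN sum.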
From this, we just need to construct arrays containing numbers where appropriate and text elsewhere.
In the example below, I'm counting unique days when the color is red and the fruit is citrus:
This is my conditional array, returning 1 or true for the rows I'm interested in:
($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0))
Note that this requires ctrl-shift-enter to be used as an array formula.
Since the value I'm grouping by for uniqueness is text (as is yours), I need to convert it to numeric. I use:
MATCH($C$2:$C$10,$C$2:$C$10,0)
Note that this also requires ctrl-shift-enter
So, this is the array of numeric values within which I'm looking for uniqueness:
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
Now I plug that into my uniqueness counter:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to get:
=SUM(SIGN(FREQUENCY(
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),""),
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
)))
Again, this must be entered as an array formula using ctrl-shift-enter. Replacing SUM with SUMPRODUCT will not cut it.
In your example, you'd use something like:
=SUM(SIGN(FREQUENCY(
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),""),
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),"")
)))
I'll note, though, that scaling might be a problem on data sets as large as yours. I tested it on larger data sets, and it was fairly fast on the order of 10k rows, but really slow on the order of 100k rows, such as yours. The internal arrays are plenty fast, but the FREQUENCY function slows down. I'm not sure, but I'd guess it's between O(n log n) and O(n^2) depending on how the sort is implemented.
Maybe this doesn't matter - none of this is volatile, so it'll just need to calculate once upon refreshing the data. If the column data is changing, though, this could be painful.
Assuming the source data is sorted by the key value [A], start by determining the occurrence number of each key in column B
B2: =IF(A2=A1;B1+1;1)
Next determine a group sum
C2: =SUMIF($A$2:$A$9;A2;$B$2:$B$9)
A key is unique if its group sum is exactly 1
D2: =(C2=1)
To count records which match a certain criterion AND are unique, include column D in a formula such as =IF(AND(D2;[yourcondition]);1;0) and sum this column
Another option is to assume a key is unique within a sorted list if it is unequal to both its predecessor and successor, so you could find the unique records like
E2: =AND(A2<>A1;A2<>A3)
G2: =IF(AND(E2;F2="this");1;0)
E and G can of course be combined into one single formula (not sure, though, whether that helps ...)
G2(2): =IF(AND(AND(A2<>A1;A2<>A3);F2="this");1;0)
Resolving the unnecessarily nested ANDs:
G2(3): =IF(AND(A2<>A1;A2<>A3;F2="this");1;0)
All formulas in row 2 should be copied down to the end of the list.
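The neighbour-comparison variant can be sketched in Python as follows. It assumes, like the formulas, that the rows are already sorted by key; the sample rows and the name count_unique_matching are made up for illustration, and keys are assumed never to be None.

```python
def count_unique_matching(rows, wanted):
    """Count rows whose key occurs exactly once and whose flag equals `wanted`.

    `rows` is a list of (key, flag) pairs sorted by key, so a key is unique
    exactly when it differs from both neighbours (A2<>A1 and A2<>A3).
    """
    count = 0
    for i, (key, flag) in enumerate(rows):
        prev_key = rows[i - 1][0] if i > 0 else None
        next_key = rows[i + 1][0] if i + 1 < len(rows) else None
        if key != prev_key and key != next_key and flag == wanted:
            count += 1
    return count


# Hypothetical sorted data: column A is the key, column F is the condition.
rows = [(1, "this"), (1, "this"), (2, "this"), (3, "other"),
        (4, "this"), (4, "other"), (5, "this")]
print(count_unique_matching(rows, "this"))
```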

Resources