Using UNIQUE with non-adjacent columns on different sheets - excel

I have two tables on two sheets - let's say tblFruits1 and tblFruits2.
Both have a column "Name".
Apple - for example - exists on both lists.
The lists might have a different number of rows
tblFruits1 on Sheet1
Name        Color
----------- -----------
Apple       red
Peach       yellow
Ananas      yellow
tblFruits2 on Sheet2
Name        Color
----------- -----------
Apple       red
Cherries    red
Banana      yellow
Melone      green
Now I would like to get - on a third sheet - a UNIQUE list of names of both tables.
expected result on Sheet3
Name
Apple
Peach
Ananas
Cherries
Banana
Melone
=UNION((tblFruits1[Name],tblFruits2[Name])) returns an error.
I tried variants with SEQUENCE and INDEX but didn't succeed.
So the question is:
How can I "construct" the matrix-parameter for UNIQUE from two column-ranges on two different sheets?
(What I am looking for is a non-VBA-solution - I know how to handle this in VBA.)

The VSTACK function makes the Union obsolete (only available to insiders at time of writing)
Since finding the Union of several ranges is quite a useful function on its own, I use a LAMBDA to do that. Its output can then be passed to UNIQUE.
The Lambda, which I call, unimaginatively, UNION
=LAMBDA(tabl1, tabl2,
LET(rowindex, SEQUENCE(ROWS(tabl1)+ROWS(tabl2)),
colindex, SEQUENCE(1,COLUMNS(tabl1)),
IF(rowindex<=ROWS(tabl1),
INDEX(tabl1,rowindex,colindex),
INDEX(tabl2,rowindex-ROWS(tabl1),colindex)
)
)
)
Then
=UNIQUE(Union(tblFruits1[Name],tblFruits2[Name]))
gives the result you seek
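For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the LAMBDA does: it walks a 1-based row index over both tables and picks from the first or the second one, then deduplicates. The function names `union` and `unique` are illustrative only. Unlike Excel's UNIQUE, this sketch compares text case-sensitively.

```python
def union(table1, table2):
    """Stack two lists, mirroring the row-index logic of the LAMBDA."""
    total_rows = len(table1) + len(table2)
    stacked = []
    for row_index in range(1, total_rows + 1):  # SEQUENCE is 1-based
        if row_index <= len(table1):
            stacked.append(table1[row_index - 1])
        else:
            stacked.append(table2[row_index - len(table1) - 1])
    return stacked


def unique(values):
    """Drop duplicates while keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Sample data from the question
fruits1 = ["Apple", "Peach", "Ananas"]
fruits2 = ["Apple", "Cherries", "Banana", "Melone"]
print(unique(union(fruits1, fruits2)))
```

With the question's sample data this prints the six names in the order shown in the expected result.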

Try:
=LET(X,CHOOSE({1,2},tblFruits1[Name],tblFruits2[Name]),Y,COUNTA(X),Z,MOD(SEQUENCE(Y)-1,Y/2)+1,A,INDEX(X,Z,CEILING(SEQUENCE(Y)/(Y/2),1)),UNIQUE(FILTER(A,NOT(ISNA(A)))))

This is a solution I created where you can replace a2# and c2# with any two arrays, dynamic arrays, etc. It also deduplicates and sorts the result. It works on Excel for Mac, where FILTERXML is not supported.
=LET(
firstArray, a2#,
secondArray, c2#,
totalCount, COUNTA(firstArray)+COUNTA(secondArray),
firstCount, COUNTA(firstArray),
SORT(UNIQUE(MAKEARRAY(totalCount,1,LAMBDA(r,c,IF(r<=firstCount,INDEX(firstArray,r),INDEX(secondArray,r-firstCount)))))) 
)

Can you try it like this: make your Sheet1 data and Sheet2 data into Tables,
and in your Sheet3 cell A2 paste the formula below:
=UNIQUE(FILTERXML("<a><b>"&TEXTJOIN("</b><b>",1,(IFNA(IF({0,1},Table1[Name],Table2[Name]),"")))&"</b></a>","//b"),FALSE,FALSE)

There is a new function that simplifies this: VSTACK
For a unique (distinct) union (as per the original question), try this:
=UNIQUE(VSTACK(tblFruits1[Name],tblFruits2[Name]))
And to sort them:
=SORT(UNIQUE(VSTACK(tblFruits1[Name],tblFruits2[Name])))

I don't have VSTACK or HSTACK at this point (or LAMBDA either), unless I resort to keeping workbooks on slow-and-clunky Excel Online. But say you have 3 dynamic arrays (named three, four and five), each with 2 columns and a variable number of rows. Then (to keep things readable) use a named formula combo defined as =SEQUENCE(ROWS(three)+ROWS(four)+ROWS(five)). Then this works:
=IFS(
combo<=ROWS(three), three,
combo<=ROWS(three)+ROWS(four), INDEX(four,combo-ROWS(three),{1,2}),
TRUE, INDEX(five,combo-ROWS(three)-ROWS(four),{1,2})
)
You can wrap UNIQUE and/or SORT around this, if you like. If the arrays have 3 columns rather than 2 then use {1,2,3} in the formula.
Obviously you can expand this to more than 3 arrays by building more conditions into the IFS formula.
Perhaps it's worth noting that three is the same thing as INDEX(three,combo,{1,2}). And using COLUMN(three) would make the syntax more generalisable, though it doesn't allow you to re-order the columns if you want something like {2,1,3} in your output. Also, TRUE is effectively how you say "ELSE" in an IFS formula--it's a bit more concise here than combo<=ROWS(three)+ROWS(four)+ROWS(five). Using these longer forms would make the formula more symmetrical but a lot wordier!
(It would be nice if VSTACK ever becomes available on the desktop.)

Related

Match and Conditional Formatting from Matrix Table

I am looking for help with my matrix table: is there a good (or best) approach to matching dependent items in a matrix using drop-downs?
This picture represents my matrix table (Picture 1):
As you can see there are a lot of entries, but the horizontal and vertical axes have the same "headers". The 1s actually mean "not compatible" in my case, but let's simply call it a "match". The matrix is on one sheet, which will be populated with new values from time to time.
Another sheet, which shows the data and their compatibility possibilities, is equipped with drop-downs. There you have "Groups" (Group1, Group2...) as main parts and "dependent groups" (AA1, BB2...) as the small components that make up the main parts. To avoid misunderstanding, here are the definitions; the values in this example are fictional:
Groups aka. Main Parts
Dependent groups aka. components
Below is my fictional table; it follows exactly the same concept as my real case.
I put an explanation in Picture 2 so you can follow along and see exactly where and what I did.
First I used =MATCH functions, one for the vertical position (A3) and one for the horizontal (B4). The boolean row is built with =OR(INDEX), referring to the match positions as you can see. From there I use TRUE/FALSE to color my group boxes where compatibility is possible; that's all there is to it.
So my question is whether there is another approach to this problem. As you can see, I have 3 different rows of functions in one place; with more "groups", that could grow into many more rows and calculations.
Picture 2
EDITED:
This is a screenshot of the original sheet. I just hid some rows that held info, which is why the row numbers are not consecutive. As you can see, it is almost the same as the dummy example I provided above. Underneath every "box" there are three rows of calculations, as I mentioned before. The two 2s that you see here are the positions of a value that I found using the =MATCH function, one for the horizontal and one for the vertical lookup. In this case it is the model type: 070FX is position 2, 100FX is position 3 and 200FX is position 4 in the matrix table, and so on for all the other groups. Those groups (Model, Endpoint, Gas sensor...) are defined separately on another sheet, where I had to make a unique list and a dependent list so I can reference them in my drop-down list.
EDIT Nr 4! So this formula I used for true/false:
=SUMPRODUCT(('0359-matrix'!$A$2:$A$101=F10)*(('0359-matrix'!$B$1:$CW$1=$B$10)+('0359-matrix'!$B$1:$CW$1=$C$10)+('0359-matrix'!$B$1:$CW$1=$D$10)+('0359-matrix'!$B$1:$CW$1=$E$10)+('0359-matrix'!$B$1:$CW$1=$F$10)+('0359-matrix'!$B$1:$CW$1=$G$10)+('0359-matrix'!$B$1:$CW$1=$H$10)+('0359-matrix'!$B$1:$CW$1=$I$10)+('0359-matrix'!$B$1:$CW$1=$J$10)+('0359-matrix'!$B$1:$CW$1=$K$10)+('0359-matrix'!$B$1:$CW$1=$L$10)+('0359-matrix'!$B$1:$CW$1=$M$10)+('0359-matrix'!$B$1:$CW$1=$N$10)+('0359-matrix'!$B$1:$CW$1=$O$10)+('0359-matrix'!$B$1:$CW$1=$P$10)+('0359-matrix'!$B$1:$CW$1=$Q$10)+('0359-matrix'!$B$1:$CW$1=F13)+('0359-matrix'!$B$1:$CW$1=G13)+('0359-matrix'!$B$1:$CW$1=H13)+('0359-matrix'!$B$1:$CW$1=I13)+('0359-matrix'!$B$1:$CW$1=J13))*'0359-matrix'!$B$2:$CW$101)>0
Below I copied only the last part of the formula as it appears from the second row on, because the whole function is too long to write out; it gets cut off automatically.
('0359-matrix'!$B$1:$CW$1=$Q$10)+('0359-matrix'!$B$1:$CW$1=$B$13)+('0359-matrix'!$B$1:$CW$1=$C$13)+('0359-matrix'!$B$1:$CW$1=$D$13)+('0359-matrix'!$B$1:$CW$1=$E$13)+('0359-matrix'!$B$1:$CW$1=$F$13))*'0359-matrix'!$B$2:$CW$101)>0
But in the marked cells I get the same results: B22:F22 shows the same booleans as B21:F21, which shouldn't be the case. Going by the color, green is FALSE. It must have something to do with an array reference.
Check out the following. A1 to E5 is the matrix that shows which pieces are incompatible (=1). The other cells have to be empty or 0.
In cell I8 I used the following formula (and copied it down up to I11):
=SUMPRODUCT(($A$2:$A$5=H8)*(($B$1:$E$1=$H$8)+($B$1:$E$1=$H$9)+($B$1:$E$1=$H$10)+($B$1:$E$1=$H$11))*$B$2:$E$5)
The formula result shows the number of incompatibilities a part has. E.g. AA1 has one incompatibility (with BB2), but BB2 has two (with AA1 and CC3).
To get TRUE/FALSE, use the same formula and append >0, like =SUMPRODUCT(…)>0
For any additional "group" (Model, Endpoint, …) you need to add another +($B$1:$E$1=$H$12), where $B$1:$E$1 points to your matrix data and $H$12 to your selected group value.
Overview of the formula ranges:
Note that this kind of calculation can only tell you the number of incompatibilities a part has, not the names of the parts that are incompatible.
Edited horizontal version
Formula in the selected cell is
=SUMPRODUCT(($A$2:$A$5=G8)*(($B$1:$E$1=$G$8)+($B$1:$E$1=$H$8)+($B$1:$E$1=$I$8)+($B$1:$E$1=$J$8))*$B$2:$E$5)
You can fill it to the right.
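The SUMPRODUCT above can be read as a double loop: pick the matrix row whose label equals the part, then add up the 1s in the columns whose header is among the selected parts. A minimal Python sketch of that reading follows. The 4x4 matrix and the part DD4 are made-up sample data, chosen so that AA1 and BB2 behave as described in the answer.

```python
def count_incompatibilities(part, selected, headers, matrix):
    """Mirror SUMPRODUCT((rows=part) * (number of selections per column) * matrix)."""
    total = 0
    for row_label, row in zip(headers, matrix):
        if row_label != part:  # ($A$2:$A$5=H8) is 0 for every other row
            continue
        for col_label, cell in zip(headers, row):
            # (hdr=$H$8)+(hdr=$H$9)+... counts how often this header is selected
            total += selected.count(col_label) * cell
    return total


# Hypothetical sample matrix: 1 marks an incompatible pair
headers = ["AA1", "BB2", "CC3", "DD4"]
matrix = [
    [0, 1, 0, 0],  # AA1
    [1, 0, 1, 0],  # BB2
    [0, 1, 0, 0],  # CC3
    [0, 0, 0, 0],  # DD4
]
selected = ["AA1", "BB2", "CC3", "DD4"]

for part in selected:
    count = count_incompatibilities(part, selected, headers, matrix)
    print(part, count, count > 0)
```

The `count > 0` column corresponds to the TRUE/FALSE variant of the formula.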

How do I calculate a Sum based on multiple If's in Excel?

Background is that I'm making a budget spreadsheet. I have different bills due on different days (e.g. one bill due on Mondays and another due on the 10th).
I want a function that will place the appropriate amount of money going in/out in column D and the description of why the money is going in/out in column E.
Currently I have two different formulas that I created (probably incorrectly).
Formula for Column E: (Already in the document and seems to work fine, other than the fact that I can't add additional text to the cell)
=IF(DAY(C36)=7," Amy Pay","")&IF(DAY(C36)=22," Amy Pay","")&IF(DAY(C36)=8," Family Bills","")&IF(DAY(C36)=6," Dollar Shave Club","")&IF(DAY(C36)=2," Amy Cap One VISA","")&IF(DAY(C36)=3," Chase VISA","")&IF(DAY(C36)=8," Being Smart","")&IF(DAY(C36)=17," Gym","")&IF(DAY(C36)=11," Netflix","")&IF(DAY(C36)=19," Cap One MC","")&IF(DAY(C36)=29," CenturyLink","")&IF(DAY(C36)=6," Haley Cap One Visa","")&IF(DAY(C36)=10," SRP","")&IF(DAY(C36)=23, "Car Payment","")&IF(DAY(C36)=30, "Rent","")&IF((B36)="Mon"," Monday","")&IF((B36)="Fri"," Friday","")&IF((B36)="Fri"," Haley Pay","")
Formula for Column D: (not in the column yet, as it doesn't work how I want)
=IF(DAY(B40)=7,"1474.22","")&IF(DAY(B40)=22,"1474.22","")&IF(DAY(B40)=8,"-100","")&IF(DAY(B40)=6,"-9","")&IF(DAY(B40)=2,"-100","")&IF(DAY(B40)=3,"-100","")&IF(DAY(B40)=8,"-400","")&IF(DAY(B40)=17,"-20.05","")&IF(DAY(B40)=11,"-8.63","")&IF(DAY(B40)=19,"-450","")&IF(DAY(B40)=29,"-50","")&IF(DAY(B40)=6,"-150","")&IF(DAY(B40)=10,"-200","")&IF(DAY(B40)=23,"-325","")&IF(DAY(B40)=30,"-500","")&IF((A40)="Mon","-125","")&IF((A40)="Fri","-325","")&IF((A40)="Fri","400","")
http://imgur.com/IBINweh
      
The problem is that in column D, rather than providing a sum of the numbers, it lists the numbers in the column.
http://imgur.com/rPDS5h2
      
I had a suggestion to add =SUM( in front of the IF( function, but when I do, #VALUE! is what results in the field. Using this formula: (view image by changing appended text to /CVs0f1v )
=SUM(IF(DAY(B40)=7,"1474.22","")&IF(DAY(B40)=22,"1474.22","")&IF(DAY(B40)=8,"-100","")&IF(DAY(B40)=6,"-9","")&IF(DAY(B40)=2,"-100","")&IF(DAY(B40)=3,"-100","")&IF(DAY(B40)=8,"-400","")&IF(DAY(B40)=17,"-20.05","")&IF(DAY(B40)=11,"-8.63","")&IF(DAY(B40)=19,"-450","")&IF(DAY(B40)=29,"-50","")&IF(DAY(B40)=6,"-150","")&IF(DAY(B40)=10,"-200","")&IF(DAY(B40)=23,"-325","")&IF(DAY(B40)=30,"-500","")&IF((A40)="Mon","-125","")&IF((A40)="Fri","-325","")&IF((A40)="Fri","400",""))
Any ideas on how I can get all the amounts to populate and sum appropriately?
Forgive my Non Excel Guru knowledge - trying to learn. :D
-Amy
If you take all of the options from your first working formula and change the method retrieving them, you will have a much more versatile worksheet that can easily accept new additions and schedule modifications.
    
In a couple of unused columns to the right, put in the day-of-month and the action that occurs. I'm using columns Y & Z. You have two events occurring on the 6th so I put them together.
In a couple of other unused columns, put the day-of-the-week and associated text; I've used columns V & W. The default for Sunday is 1.
In E36 use this formula,      =TRIM(IFERROR(VLOOKUP(DAY(C36),$Y:$Z, 2, FALSE), "")&" "&IFERROR(VLOOKUP(WEEKDAY(C36),$V:$W, 2, FALSE), "")) 
Fill down as necessary.
If you want the day-of-the-week in column B, use =C36 and use a custom number format of ddd or dddd.
References:
  VLOOKUP function  WEEKDAY function
You are concatenating text strings that look like numbers. You probably want to be adding real numbers:
=SUM(IF(DAY(B40)=7,1474.22,0) + IF(DAY(B40)=22,1474.22,0) + ...
although, whenever I see a formula as complex as what you have, I would consider looking for a different solution -- Vlookup comes to mind.
In addition, with a VLOOKUP table, you would have seen that you have some conflicts -- e.g: you list the same condition of B40=8 to return two different values; and the same condition of A40 = Fri, to also return two different values.
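To illustrate the difference between concatenating text and adding numbers, and the lookup-table idea, here is a small Python sketch. It uses only a subset of the amounts from the question's formula; the table layout and the function name `net_amount` are assumptions made for the example. Conditions that appear twice (the 8th, the 6th, Friday) simply contribute two rows, and all matching rows are summed.

```python
from datetime import date

# (day of month, amount): a subset of the values in the question's formula
BY_DAY_OF_MONTH = [
    (7, 1474.22), (22, 1474.22),
    (8, -100), (8, -400),
    (6, -9), (6, -150),
]
# (weekday name, amount), also taken from the question's formula
BY_WEEKDAY = [("Mon", -125), ("Fri", -325), ("Fri", 400)]

WEEKDAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def net_amount(d):
    """Sum every amount whose day-of-month or weekday matches the date."""
    weekday = WEEKDAY_NAMES[d.weekday()]  # date.weekday(): Monday is 0
    total = sum(amount for day, amount in BY_DAY_OF_MONTH if day == d.day)
    total += sum(amount for name, amount in BY_WEEKDAY if name == weekday)
    return total


print(net_amount(date(2024, 1, 8)))  # Monday the 8th: -100 - 400 - 125
print(net_amount(date(2024, 1, 5)))  # a Friday with no day-of-month entry
```

Because the amounts are real numbers rather than strings, several matches on the same date add up instead of being listed side by side.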

Ranking with subsets

I'm trying to rank values and have managed to work out how to sort ties. My data looks at the total number of entries, ranks based on that and, if there is a tie, looks to the next column of values to sort them out. However, I have two classes of data within my dataset (I've called them East and West) and want to rank each separately (but stick to the rules above). So, if I had seven entries, 3 of them West and 4 of them East, I want West to have rankings 1,2,3 based on all the values that lie in that subset, and East would have rankings 1,2,3,4. Can you explain what your formula is doing so I can understand how to apply your answer better in the future?
Effectively I'm asking what formula I need to achieve my result.
Cheers
Paul
There are a few related ways to do this, most involving SUMPRODUCT. If you don't like the solution below and would like to research other ways/explanations, try searching for "rankif".
The function looks at the Class and Value columns and, for every row, returns TRUE (1) if the row's Class matches the current Class AND its Value is larger than the current Value, and FALSE (0) otherwise. The SUM adds up all these 1s, and the 1+ makes the ranking start at 1 instead of 0 (the largest value in a class has no larger values above it). Remember to enter it as an array formula using Ctrl+Shift+Enter before dragging down.
I used the array formula and SUM above to explain, but the following also works and might even be faster since it's not an array formula. It's the same idea, except we hijack SUMPRODUCT's ability to spit out a single value from an array.
=1+SUMPRODUCT(($A$2:$A$8=A2)*($B$2:$B$8>B2))
EDIT
To extend the rank-if, you could add more subsets to rank by multiplying more conditions:
You can also easily add tiebreakers by adding another SUMPRODUCT to treat the ties as an additional subset:
The first SUMPRODUCT is the 'base rank', while the second SUMPRODUCT is tiebreaker #1.
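The base rank plus tiebreaker idea can be sketched in Python as two counts. This illustrates the logic only, not the exact formulas from the answer, and the seven sample rows (class, value, tiebreak value) are made up to match the 3 West / 4 East example in the question.

```python
def rank_in_class(rows, i):
    """Rank row i within its class: larger value wins, larger tiebreak breaks ties."""
    cls, value, tiebreak = rows[i]
    # Base rank: rows of the same class with a strictly larger value
    base = sum(1 for c, v, _ in rows if c == cls and v > value)
    # Tiebreaker: rows of the same class and same value with a larger tiebreak
    ties = sum(1 for c, v, t in rows if c == cls and v == value and t > tiebreak)
    return 1 + base + ties


# Hypothetical sample data: (class, value, tiebreak)
rows = [
    ("West", 10, 3),
    ("East", 12, 1),
    ("West", 10, 5),
    ("East", 9, 2),
    ("West", 7, 1),
    ("East", 12, 4),
    ("East", 5, 0),
]

for i, row in enumerate(rows):
    print(row, rank_in_class(rows, i))
```

West ends up with ranks 1 to 3 and East with ranks 1 to 4, each class ranked on its own.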

SUMIFS with intermediate VLOOKUP in the criteria

I have 3 tables, 1 of which I want to fill in columns with data based on the other 2. Tables are roughly structured as follows:
Table 1 (Semi-Static Data)
SubGroup Group
----------- -----------
subgroup(1) group(a)
subgroup(2) group(b)
subgroup(3) group(b)
subgroup(4) group(c)
etc.
Table 2 (Variable Data)
SubGroup DataValue
----------- -----------
subgroup(1) datavalue(i)
subgroup(2) datavalue(ii)
subgroup(3) datavalue(iii)
subgroup(4) datavalue(iv)
etc.
Table 3 (Results)
Group TotalValue
----------- -----------
group(a) totalvalue(m)
group(b) totalvalue(n)
group(c) totalvalue(o)
etc.
Where the TotalValue is the sum of all DataValue's for all subgroups that belong to that particular Group.
e.g. for group(b) ---> totalvalue(n) = datavalue(ii) + datavalue(iii)
I am looking to achieve this calculation without adding any additional columns to the Data tables nor using VBA.
Basically I need to perform a SUMIFS where there is an additional VLOOKUP matching the subgroup criteria range to the group it belongs to, and then only summing the datavalues that match the group being evaluated. I have tried using array formulas but I'm having issues making it work. Any assistance would be very appreciated. Thank you,
EDIT: Wanted to add some details surrounding my question. First, all Google searches did not provide a suitable answer. All the links had solutions to a slightly different problem where the VLOOKUP term is not dependent on the SUMIFS criteria but rather on another single static variable. Stack Overflow offered similar solutions. Please let me know if any more details are required to make my post suitable for this forum. Thank you again.
You can use the SUMPRODUCT function to do it all at once. The first reference $B$2:$B$5 is for the Group names, the second reference $E$2:$E$5 is for the datavalues. The G2 reference is for the group names in the third table, you can enter this formula for the first reference and then drag and fill for the rest.
=SUMPRODUCT($E$2:$E$5 * (G2 = $B$2:$B$5))
Some cell references, and sample data, would be helpful but something like this might be what you want:
=SUMIF(C:C,"="&INDEX(A:A,MATCH(E5,B:B,0)),D:D)
WADR & IMHO, this is simply bad worksheet design. For lack of a single cross-reference column in Table2, any solution would have to be a VBA User Defined Formula or an overly complicated array formula (the latter of which I am not even sure is possible). The data tables are not normalized database tables you can INNER JOIN or GROUP BY ... HAVING.
The formula you are trying to achieve is akin to,
=SUMPRODUCT(SUMIF(D:D, {"subgroup(2)","subgroup(3)"}, E:E))
That only works with hard-coded values as arrayed constants (e.g. {"subgroup(2)","subgroup(3)"}). I know of no way to spit a dynamic list back into the formula using additional native Excel functions but VBA offers some possibilities.
HOWEVER,
The simple addition of one more column to Table2 with a very basic VLOOKUP reduces all of your problems to a SUMIF.
     
The formula in the new column D, row 2 is,
=VLOOKUP(E2, A:B, 2, FALSE)
The formula in I2 is,
=SUMIF(D:D, H2,F:F )
Fill each down as necessary. Sorry if that is not what you wanted to hear.
Thank you everyone that responded and reviewed this post. I have managed to resolve this using an array formula and some matrix algebra. Please note that I am not using VLOOKUP (this operator cannot be performed on arrays) nor SUMIFS as my title states.
My final formula looks like this:
{=SUM(IF([Table2.xlsx]Sheet1!SubGroup=TRANSPOSE(IF([Table1.xlsx]Sheet1!Group=G2,[Table1.xlsx]Sheet1!SubGroup,"")),[Table2.xlsx]Sheet1!DataValue))}
Very simply, I create an array variable that compares the Group being evaluated (e.g. cell G2) with the Groups column for Table 1 and outputs the corresponding matching SubGroups. This results in an array with as many rows as Table 1 had (N) and 1 column: Nx1. I then transpose that array (1xN) and compare it to the SubGroups column (Mx1, M being the number of rows in Table 2) and output the DataValues column for the rows that have a corresponding SubGroup (MxN). Then I perform a sum of the whole array to return a single value.
Notice that as I didn't include a value_if_false return on either IF, the arrays are simply populated with FALSE where the conditions are not met. This does not matter for the final result, though. In the first IF, FALSE will not match any SubGroup, so it is ignored. In the second, all FALSE values passed to SUM count as 0. The more complicated issue is that this grows the amount of memory required, as we are not filtering down to just the values we want.
For this application I decided against filtering the subarray, as the trade-off in resource utilization was acceptable. If the data sets were any bigger, though, I would definitely try doing it. Another concern was that I did not fully understand the filtering logic I was using, based on http://exceltactics.com/make-filtered-list-sub-arrays-excel-using-small/, so I decided to simplify. I will revisit this concept later as I think it will work. I might have completed this solution, but I was missing the transpose of the array needed to compare properly, so I abandoned this route.
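The two-step logic of the array formula (collect the SubGroups that belong to the Group, then sum the DataValues of matching rows) can be sketched in Python as follows. The numeric data values are made up for the example, and the sketch assumes each subgroup appears only once in Table 1.

```python
def group_total(group, subgroup_to_group, data_rows):
    """Sum the data values of every subgroup that belongs to `group`."""
    # Step 1: subgroups of this group (the inner IF over Table 1)
    members = {sub for sub, grp in subgroup_to_group if grp == group}
    # Step 2: sum data values whose subgroup is a member (the outer IF and SUM)
    return sum(value for sub, value in data_rows if sub in members)


# Table 1 from the question
table1 = [
    ("subgroup(1)", "group(a)"),
    ("subgroup(2)", "group(b)"),
    ("subgroup(3)", "group(b)"),
    ("subgroup(4)", "group(c)"),
]
# Table 2 with hypothetical numbers standing in for datavalue(i)..(iv)
table2 = [
    ("subgroup(1)", 10),
    ("subgroup(2)", 20),
    ("subgroup(3)", 30),
    ("subgroup(4)", 40),
]

for group in ["group(a)", "group(b)", "group(c)"]:
    print(group, group_total(group, table1, table2))
```

For group(b) this adds the values of subgroup(2) and subgroup(3), matching the example in the question.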

Excel Advanced filtering multiple columns with multiple acceptable data combinations

I have a large data set with 4 columns of interest all containing text, namely pokemon moves. The columns "move 1" through to "Move 4" each contain a different move, and each row differs in the combination.
eg.
" A | B | C | D | E".
" 1 Pokemon | Move 1 | Move 2 | Move 3 | Move 4".
" 2 Igglybuff | Tackle | Tailwhip | Sing | Attract".
" 3 Wooper | Growl | Tackle | Rain Dance| Dig".
~ 1000 more
My issue is this:
I wish to filter this data set for rows (pokemon) containing a certain combination of moves from a list.
eg. I want to find which pokemon have both "Growl" and "Tackle". These moves can appear in any of Moves 1 to 4 (aka order of the moves is unimportant)
How would I go about filtering for such a result. I have similar situations in which I would want to search for a combination of 3 or 4 moves, the specific order of which is not important, or also search for specific pokemon possessing a specific combination of moves.
I've attempted to use functions such as COUNTIF to no avail.
Help / Ideas are much appreciated
There are a number of options for advanced filtering in excel that you might consider:
Option 1 - Advanced Filters
Advanced filters give you the power to query over multiple criteria (which is what you need). You can also easily do it as many times as you want to generate the final datasets using each filter. Here is a link to the advanced filter section for Microsoft Excel 2010, which is virtually identical here to 2007. It would be a great place to start if you want to move outside of just using basic formulas.
If you do go down this route, then follow the directions on the site in terms of steps:
Insert the various criteria that you have selected in the top rows of your spreadsheet and specify those rows as the criteria range
Set the list range to the place holding all your data on a single worksheet
Run the filter and look at the resulting data. You can easily do a count on the number of records in that reduced data set.
Option 2 - Pivot Tables
Another option that you might look at here would be to use Pivot tables. Pivot tables and pivot charts are just phenomenal tools that I use in the workplace every day to accomplish exactly what you are looking for.
Option 3 - Using Visual Basic
As a third option, you could try using visual basic code to write a solution. This would give you perfect control as you could specify exactly the ranges to look at for each of the conditions. Unfortunately, you would need to understand VB code in order to use this solution. There are some excellent online resources available that can help with this.
=COUNT(INDEX(MATCH(B2:E2, MoveList, 0), 0)) > 0
will return TRUE if any of the values in the range B2:E2 (Moves 1 through 4) are in the named range MoveList. You want to use a named range so that you can easily copy this formula down for all of your thousand rows.
If you remove the last part that checks whether the COUNT() value is greater than zero, you get:
=COUNT(INDEX(MATCH(B2:E2, MoveList, 0), 0))
which will return the number of moves that the Pokemon has that match a move on the move list.
MATCH() takes three arguments: a lookup value, the lookup range, and the match type. I don't fully understand why, but wrapping that part of the formula in INDEX() seems to let you use an array for the first argument. Maybe someone here can provide a better explanation.
In any case, the formulae above do appear to solve the problem.
Finally, if you're only checking for a few moves, instead of using a confusing formula and a named range as above, you could just make a column for each move that you want to check for, e.g. "Has Growl?" and "Has Tackle?". You would then just use =COUNTIF(B2:E2, "Tackle") and =COUNTIF(B2:E2, "Growl"). You could then make another column that sums these columns and filter out the zero values to display only Pokemon who have Tackle or Growl.
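As a cross-check of the counting approach, here is a minimal Python sketch using the two sample rows from the question. It counts how many of a Pokemon's moves are on the wanted list, and keeps the row only when the count equals the number of wanted moves, which is the "has both" case the question asks about. It assumes a Pokemon never lists the same move twice; the function names are illustrative.

```python
def count_matching_moves(moves, wanted):
    """Number of the row's moves that appear on the wanted list."""
    wanted_set = set(wanted)
    return sum(1 for move in moves if move in wanted_set)


def pokemon_with_all(rows, wanted):
    """Names of Pokemon whose moves include every wanted move, in any order."""
    needed = len(set(wanted))
    return [name for name, moves in rows
            if count_matching_moves(moves, wanted) == needed]


# Sample rows from the question
rows = [
    ("Igglybuff", ["Tackle", "Tailwhip", "Sing", "Attract"]),
    ("Wooper", ["Growl", "Tackle", "Rain Dance", "Dig"]),
]
wanted = ["Growl", "Tackle"]

for name, moves in rows:
    print(name, count_matching_moves(moves, wanted))
print(pokemon_with_all(rows, wanted))
```

In the spreadsheet, the equivalent is to filter the summed helper column for a value equal to the number of moves being checked rather than for any non-zero value.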
I looked at these two pages when researching how to accomplish this:
https://www.excelforum.com/excel-general/786407-find-if-any-value-on-one-list-exists-on-another.html
https://www.deskbright.com/excel/using-index-match/

Resources