Excel Vlookup function to map duplicate and get the max occurence value - excel

Here Table A has values where A,C has duplicate. In Table B , for each of A,B,C which is considered as input to table A , it should fetch 12,33,14. The table b populates the output by checking the occurrence in Table A and getting the most common value . Here in this example A had 11 & 12. But 12 had 3 occurence.so it's value is 12 Table B can have other values like D or can miss EITHER A OR B OR C. is there any Excel function through this i can achive. I know if u put count and the sort based on count it will work. I am looking for 1 line function
something like Vlookup in excel

Assuming no Excel version constraint per tag listed in the question. You can try the following in cell D2:
=LET(A, A2:A8, B, B2:B8, cnts, COUNTIFS(A,A,B,B), ux, UNIQUE(A),
out, MAP(ux, LAMBDA(x, TEXTJOIN(",",,UNIQUE(FILTER(B, (A=x)
* (cnts = MAX(FILTER(cnts, A=x)))))))), HSTACK(ux, out))
The formula considers the scenario where one or more values have the maximum number of occurrences, like in the case of C value, both values 13, and 14 are present in just once. We use TEXTJOIN to collect both values.
Here is the output:
Note: You cannot use inside MAP the MAXIFS function because it is a RACON function and cnts is not a range. We use as a workaround MAX filtering by rows that match A=x. The approach of using: MAX((A=x) * cnts) also works, but under the assumption that all filtered values are not negative.

Related

Create random list of strings in random size in excel

I have the following data in my excel.
A
Text 1
Text 2
Text 3
Text 4
Text 5
Text 6
I want to fill column B with random joined data from column A. It should respect the criteria 1> B > 6. i.e. Column B should have a min 1 value from column A or can have a max of up to 6 unique values joined by ,. I can have column B dragged up to 100 rows. But still, they should respect the criteria.
I'm able to get a random value from column A using the formula INDEX($A$1:$A$6, RANDBETWEEN(1, ROWS($A$1:$A$6)), 1), and to join 2 random texts I'm using the formula
=TEXTJOIN(",",true, INDEX($A$1:$A$6, RANDBETWEEN(1, ROWS($A$1:$A$6)), 1), INDEX($A$1:$A$6, RANDBETWEEN(1, ROWS($A$1:$A$6)), 1))
Currently, I'm able to get 2 fixed strings using this formula. Instead of doing the above 6 times, I want to know If there is a way to get this joined string with a random number of unique strings(of the max size of column A length concatenated with a ,).
I'm able to get only 1 value using the random function. Please let me know how can I do this.
You could try:
Formula in B1:
=TEXTJOIN(",",,TAKE(SORTBY(A1:A6,RANDARRAY(COUNTA(A1:A6))),RANDBETWEEN(1,6)))
Note that TAKE() is a new function which is still in BETA. If you don't have access just yet, then try:
=TEXTJOIN(",",,INDEX(SORTBY(A1:A6,RANDARRAY(COUNTA(A1:A6))),SEQUENCE(RANDBETWEEN(1,6))))
In each option:
SORTBY(A1:A6,RANDARRAY(COUNTA(A1:A6))) - Will create a randomized array of the values in column A;
RANDBETWEEN(1,6) - The part which defines the lower- & upper-limit of strings to concatenate;
TAKE/INDEX - A way to retrieve an X amount of rows from the above randomized array. In your case X itself is randomized (see 2nd bullit);
TEXTJOIN() - Concatenate all selected values into a single string.
This formula exploits the functions RANDARRAY and RANDBETWEEN to get a random number of text items to join.
First, I created a dynamic named range called AllTextItems. This automatically expands to capture any number of rows in your dataset:
Then, use the formula:
=TEXTJOIN(",",TRUE,INDEX(AllTextItems,RANDARRAY(RANDBETWEEN(1,ROWS(AllTextItems)),1,1,ROWS(AllTextItems),TRUE)))
in the cells you'd like your joined list.
=TEXTJOIN(",", TRUE, INDEX(A:A, RANDARRAY(1, RANDBETWEEN(1, COUNTA(A:A)), 1, COUNTA(A:A), TRUE)))
EDIT didn't quite beat JvdV to it, but basically a similar basis to his but not using TAKE and not sorting the output

Excel SUMPRODUCT and dynamic text conditions

I am trying to do a summation of rows with certain dynamic conditions. I have rows like:
A can be only one value, K can have multiple OR-values. In the end M is to be summed.
I have tried to use SUMPRODUCT() which works for column A but not for K. What I am looking for is something like:
=SUMPRODUCT(--(!$A$2:$A$20000="AA")*--(!$K$2:$K$20000="AA" OR "BB")*$M$2:$M$20000)
I know I can do ="AA" and then ="BB" but I need "AA" and "BB" to be dynamic based on other cells. And the number of arguments is different. I tried {"AA";"BB"} but I know this will not work as the match then needs to be in the same row.
Can it at all be achieved?
Thanks a lot!
=SUMPRODUCT(($A$2:$A$20000="AA")*(($K$2:$K$20000="AA")+($K$2:$K$20000="BB"))*$M$2:$M$20000)
Note that:
Since you are multiplying/adding arrays, there's no need to include the double unary's
I don't know why you have a ! in your example formula.
To return an OR array of TRUE;FALSE, we add.
Your comments still do not provide a clear explanation of what you are making dynamic.
But to create a dynamic OR for column K, including testing for column A and summing column M, you can do the following:
For column K, let us assume that your possible OR's are entered separately in the range F2:F10
=SUMPRODUCT(MMULT(--($K$2:$K$20000=TRANSPOSE($F$2:$F$10)),--(ROW($F$2:$F$10)>0))*($A$2:$A$20000="AAA")*$M$2:$M$20000)
The matrix multiplication will produce a single column of 19,999 entries which will be a 1 for matches of any of the OR's and 0 if it does not match.
See How to do a row-wise sum in an array formula in Excel?
for information about the MMULT function in this application.
In the above formula, "blanks" in the OR range (F2:F10) will also match blank entries in column K. So it is conceivable that if there is a blank in K and F and a AAA in col A and a value in column M that a wrong result might be returned.
To avoid that possibility, we can use a dynamic formula to size column F where we are entering our OR values:
=INDEX($F$2:$F$10,1):INDEX($F$2:$F$10,COUNTA($F$2:$F$10))
will return only the values in col F that are not blank (assuming no blanks within the column)
So:
=SUMPRODUCT(MMULT(--($K$2:$K$20000=TRANSPOSE(INDEX($F$2:$F$10,1):INDEX($F$2:$F$10,COUNTA($F$2:$F$10)))),--(ROW(INDEX($F$2:$F$10,1):INDEX($F$2:$F$10,COUNTA($F$2:$F$10)))>0))*($A$2:$A$20000="AAA")*$M$2:$M$20000)
Given this data:
the last formula will return a value of 5 (sum of M2,M3,M7)
Use SUMIFS with SUMPRODUCT wrapper:
=SUMPRODUCT(SUMIFS($M$2:$M$20000,$A$2:$A$20000,"AA",$K$2:$K$20000,{"AA","BB"}))

How to select a column IF [duplicate]

Is there a formula that returns a value from the first line matching two or more criteria? For example, "return column C from the first line where column A = x AND column B = y". I'd like to do it without concatenating column A and column B.
Thanks.
True = 1, False = 0
D1 returns 0 because 0 * 1 * 8 = 0
D2 returns 9 because 1 * 1 * 9= 9
This should let you change the criteria:
I use INDEX/MATCH for this. Ex:
I have a table of data and want to return the value in column C where the value in column A is "c" and the value in column B is "h".
I would use the following array formula:
=INDEX($C$1:$C$5,MATCH(1,(($A$1:$A$5="c")*($B$1:$B$5="h")),0))
Commit the formula by pressing Ctrl+Shift+Enter
After entering the formula, you can use Excel's formula auditing tools to step through the evaluation to see how it calculates.
SUMPRODUCT definitely has value when the sum over multiple criteria matches is needed. But the way I read your question, you want something like VLOOKUP that returns the first match. Try this:
For your convenience the formula in G2 is as follows -- requires array entry (Ctrl+Shift+Enter)
[edit: I updated the formula here but not in the screen shot]
=INDEX($C$1:$C$6,MATCH(E2&"|"&F2,$A$1:$A$6&"|"&$B$1:$B$6,0))
Two things to note:
SUMPRODUCT won't work if the result type is not numeric
SUMPRODUCT will return the SUM of results matching the criteria, not the first match (as VLOOKUP does)
Apparently you can use the SUMPRODUCT function.
Actually, I think what he is asking is typical multiple results display option in excel. It can be done using Small, and row function in arrays.
This display all the results that matches the different criteria
Here is an answer that shows how to do this using SUMPRODUCT and table header lookups. The main advantage to this: it works with any value, numeric or otherwise.
So let's say we have headers H1, H2 and H3 on some table called MyTable. And let's say we are entering this into row 1, possibly on another sheet. And we want to match H1, H2 to x, y on that sheet, respectively, while returning the matching value in H3. Then the formula would be as follows:
=INDEX(MyTable[H3], ROUND(SUMPRODUCT(MATCH(TRUE, (MyTable[H1] & MyTable[H2]) = ($x1 & $y1),0)),0),1)
What does it do? The sum-product ensures everything is treated as arrays. So you can contatenate entire table columns together to make an array of concatenated valued, dynamically calculated. And then you can compare these to the existing values in x and y- somehow magically you can compare the concatenated array from the table to the individual concatenation of x & y. Which gives you an array of true false values. Matching that to true yields the first match of the lookup. And then all we need to do is go back and index that in the original table.
Notes
The rounding is just in there to make sure the Index function gets back an integer. I got #N/A values until I rounded.
It might be more instructive to run this through the evaluator to see what's going on...
This can easily be modified to work with a non table - just replace the table references with raw ranges. The tables are clearer though, so use them if possible. I found the original source for this here: http://dailydoseofexcel.com/archives/2009/04/21/vlookup-on-two-columns/. But there was a bug with rouding values to INTs so I fixed that.

Excel: match two columns and output third ... AND... there are multiple instances in each column

Working off a previous post:
Excel match two columns and output third
I have values in column A that are not unique and values in column B that are not unique, but together column A and B produce unique combinations:
A B C
1 Red Car Result#1
2 Blue Boat Result #2
3 Red Boat Result #3
4 Green Car Result #4
Let's say I want to find a match where Column A = Red and Column B = Boat which should return the corresponding value in Column C which should be Result #3.
Using the previous post's solution:
=IF(MATCH("Red",A1:A4,0)=MATCH("Boat",B1:B4,0),INDEX(C1:C4,MATCH("Boat",B1:B4,0)),0)
This would actually return value the first match for Boat in column B which would be result#2 rather than the intended result#3 where the match was true.
Any ideas on how to modify or write a function that would specify to retrieve information relative to specifically where the match was true (without using VBA)?
I've thought of a possible work around by creating another column that combines Col A and B to make a unique identifier but I was hoping to avoid that.
Thanks! Really appreciate it and sorry about the table formatting. I'm still very new at this.
You can retrieve a two column match using the AGGREGATE function to force anything that does not match into an error and ignore the errors.
      
The formula in E6 is,
=IFERROR(INDEX(C$1:C$99,AGGREGATE(15,6,ROW($1:$99)/((A$1:A$99="red")*(B$1:B$99="boat")), ROW(1:1))), "")
You are actually using the SMALL sub-function of the AGGREGATE function so you can get the second, third, etc. successive matches by increasing the k paramter. I done this above by using ROW(1:1) which equals 1 but will increase to 2, 3, etc as the formula is filled down.
Creating a third column is the most efficient solution. However if you absolutely have to avoid it, you could use a complicated formula like this:
=INDEX($C$1:$C$6,MATCH(A11,$A$1:$A$6,0) + MATCH(B11,OFFSET($A$1,MATCH(A11,$A$1:$A$6,0)-1,1,COUNTA($A$1:$A$6)-MATCH(A11,$A$1:$A$6,0),1),0)-1)
But the condition is that the lookup table is sorted for both column 1 and 2.

VLOOKUP with two criteria?

Is there a formula that returns a value from the first line matching two or more criteria? For example, "return column C from the first line where column A = x AND column B = y". I'd like to do it without concatenating column A and column B.
Thanks.
True = 1, False = 0
D1 returns 0 because 0 * 1 * 8 = 0
D2 returns 9 because 1 * 1 * 9= 9
This should let you change the criteria:
I use INDEX/MATCH for this. Ex:
I have a table of data and want to return the value in column C where the value in column A is "c" and the value in column B is "h".
I would use the following array formula:
=INDEX($C$1:$C$5,MATCH(1,(($A$1:$A$5="c")*($B$1:$B$5="h")),0))
Commit the formula by pressing Ctrl+Shift+Enter
After entering the formula, you can use Excel's formula auditing tools to step through the evaluation to see how it calculates.
SUMPRODUCT definitely has value when the sum over multiple criteria matches is needed. But the way I read your question, you want something like VLOOKUP that returns the first match. Try this:
For your convenience the formula in G2 is as follows -- requires array entry (Ctrl+Shift+Enter)
[edit: I updated the formula here but not in the screen shot]
=INDEX($C$1:$C$6,MATCH(E2&"|"&F2,$A$1:$A$6&"|"&$B$1:$B$6,0))
Two things to note:
SUMPRODUCT won't work if the result type is not numeric
SUMPRODUCT will return the SUM of results matching the criteria, not the first match (as VLOOKUP does)
Apparently you can use the SUMPRODUCT function.
Actually, I think what he is asking is typical multiple results display option in excel. It can be done using Small, and row function in arrays.
This display all the results that matches the different criteria
Here is an answer that shows how to do this using SUMPRODUCT and table header lookups. The main advantage to this: it works with any value, numeric or otherwise.
So let's say we have headers H1, H2 and H3 on some table called MyTable. And let's say we are entering this into row 1, possibly on another sheet. And we want to match H1, H2 to x, y on that sheet, respectively, while returning the matching value in H3. Then the formula would be as follows:
=INDEX(MyTable[H3], ROUND(SUMPRODUCT(MATCH(TRUE, (MyTable[H1] & MyTable[H2]) = ($x1 & $y1),0)),0),1)
What does it do? The sum-product ensures everything is treated as arrays. So you can contatenate entire table columns together to make an array of concatenated valued, dynamically calculated. And then you can compare these to the existing values in x and y- somehow magically you can compare the concatenated array from the table to the individual concatenation of x & y. Which gives you an array of true false values. Matching that to true yields the first match of the lookup. And then all we need to do is go back and index that in the original table.
Notes
The rounding is just in there to make sure the Index function gets back an integer. I got #N/A values until I rounded.
It might be more instructive to run this through the evaluator to see what's going on...
This can easily be modified to work with a non table - just replace the table references with raw ranges. The tables are clearer though, so use them if possible. I found the original source for this here: http://dailydoseofexcel.com/archives/2009/04/21/vlookup-on-two-columns/. But there was a bug with rouding values to INTs so I fixed that.

Resources