Concatenate/Merge values based on multiple columns - excel

How can you concatenate/join or merge values based on multiple columns?
This is the initial table
Name
Height
Weight
Class number
Mark
1m80
80kg
1
Steve
1m60
60kg
2
Bob
1m60
60kg
2
Based on the height, the weight and the class, if they match all then write in a new column the following output:
Names of people that share a lot in common (height,weight and class number)
Steve,Bob

If you are on Microsoft-365 then could try-
=TEXTJOIN(", ",1,FILTER($A$2:$A$4,($B$2:$B$4=B2)*($C$2:$C$4=C2)*($D$2:$D$4=D2)))

Related

Excel - getting a value based on the max value off another row in a Table

I'm looking for a solution for a problem I'm facing in Excel. This is my table simplified:
Every sale has an unique ID, but more people can have contributed to a sale. the column "name" and "share of sales(%)" show how many people have contributed and what their percentage was.
Sale_ID
Name
Share of sales(%)
1
Person A
100
2
Person B
100
3
Person A
30
3
Person C
70
Now I want to add a column to my table that shows the name of the person that has the highest share of sales percentage per Sales_ID. Like this:
Sale_ID
Name
Share of sales(%)
Highest sales
1
Person A
100
Person A
2
Person B
100
Person B
3
Person A
30
Person C
3
Person C
70
Person C
So when multiple people have contributed the new column shows only the one with the highest value.
I hope someone can help me, thanks in advance!
You can try this on cell D2:
=LET(maxSales, MAXIFS(C2:C5,A2:A5,A2:A5),
INDEX(B2:B5, XMATCH(A2:A5&maxSales,A2:A5&C2:C5)))
or just removing the LET since maxSales is used only one time:
=INDEX(B2:B5, XMATCH(A2:A5&MAXIFS(C2:C5,A2:A5,A2:A5),A2:A5&C2:C5))
On cell E2 I provided another solution via MAP/XLOOKUP:
=LET(maxSales, MAXIFS(C2:C5,A2:A5,A2:A5),
MAP(A2:A5, maxSales, LAMBDA(a,b, XLOOKUP(a&b, A2:A5&C2:C5, B2:B5))))
similarly without LET:
=MAP(A2:A5, MAXIFS(C2:C5,A2:A5,A2:A5),
LAMBDA(a,b, XLOOKUP(a&b, A2:A5&C2:C5, B2:B5)))
and here is the output:
Explanation
The trick here is to identify the max share of sales per each group and this can be done via MAXIFS(max_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...). The size and shape of the max_range and criteria_rangeN arguments must be the same.
MAXIFS(C2:C5,A2:A5,A2:A5)
it produces the following output:
maxSales
100
100
70
70
MAXIFS will provide an output of the same size as criteria1, so it returns for each row the corresponding maximum sales for each Sale_ID column value.
It is the array version equivalent to the following formula expanding it down:
MAXIFS($C$2:$C$5,$A$2:$A$5,A2)
INDEX/XMATCH Solution
Having the array with the maximum Shares of sales, we just need to identify the row position via XMATCH to return the corresponding B2:B5 cell via INDEX. We use concatenation (&) to consider more than one criteria to find as part of the XMATCH input arguments.
MAP/XLOOKUP Solution
We use MAP to find for each pair of values (a,b) per row, of the first two MAP input arguments where is the maximum value found for that group and returns the corresponding Name column value. In order to make a lookup based on an additional criteria we use concatenation (&) in XLOOKUP first two input arguments.

Is there a way to distribute data according to a logic in Excel vba?

I have an Excel sheet with the below data.
There are 10,000 Data rows.
9000 are of "USA" & 1000 are of "Other" country.
I want to evenly distribute the data so that when I have 9 "USA" followed by 1 "Other" data distributed throughout.
Name
Country
Alice
USA
Brook
Other
Cathy
USA
David
USA
Esther
Other
Freddy
USA
Galin
USA
Henry
Other
Indigo
USA
Jenny
USA
Kalin
Other
Linda
USA
How do I accomplish this using manual & excel VBA? Appreciate both solutions. Thanks
This can be achieved with a formula if you have the newest version of Excel.
Try something like (adapt ranges and what you are filtering on as necessary):
=LET(x, FILTER($B$1:$C$12, $C$1:$C$12="a"),
y, FILTER($B$1:$C$12, $C$1:$C$12="b"),
z, ROW(D1:D12), myrows, MAX(z),
ratio, MAX((COUNTA(x)/2)/(COUNTA(y)/2), (COUNTA(y)/2)/(COUNTA(x)/2))+1,
IF(MOD(z,ratio)<>0,
INDEX(x, IF(MOD(SEQUENCE(myrows),ratio)=0, 0, SEQUENCE(myrows)-CEILING(ROW(G1:G12)/ratio-1,1)), SEQUENCE(1,2)),
INDEX(y, IF(MOD(SEQUENCE(myrows),ratio)<>0,0,SEQUENCE(myrows)/ratio), SEQUENCE(1,2))))
For example:
The trick is to create the "correct" sequence for each result; for the first array you want to skip every nth row (in your case 10), and having the nth+1 row not default to n+1, but n, while in the second array you want to skip every row that isn't a some multiple of n, and have the nth rows count sequentially.
A caveat-- as is, I don't believe the formula will work with repetition other than 1, i.e. if you want to do something like 8 rows followed by 2 rows, this won't work.
This works even with older Excel versions:
If this is your data:
Add a Sort column with the following formula in C2 and pull it down:
=IF(B2="USA",COUNTIF($B$2:B2,"USA")+INT((COUNTIF($B$2:B2,"USA")-1)/ROUNDUP(COUNTIF(B:B,"USA")/(COUNTA(B:B)-COUNTIF(B:B,"USA")),0)),COUNTIF($B$2:B2,"Other")*(ROUNDUP(COUNTIF(B:B,"USA")/(COUNTA(B:B)-COUNTIF(B:B,"USA")),0)+1))
Then sort by this column C and USA and Other are evenly spread:

Count Unique Dates Associated with Location

I am trying to count the total of Unique Dates based on the location.
Context: I trying to create a formula for counting the number of unique dates based on location. My Spreadsheet looks like this
A B C
1 **Participant Location Date**
2 Participant-A High School X 11/7
3 Participant-B High School X 11/7
4 Participant-C High School X 11/8
5 Participant-E High School Y 11/7
6 Participant-F High School Z 11/7
7 Participant-G High School Z 11/8
So for example: high School X had 2 different dates. What would the formula be to count the unique dates based on the location?
This is also being completed on google sheets.
Thank you!
Another way (with no helper columns) would be to use query() and unique().
=query(unique(B:C), "Select Col1, count(Col2) where Col1 <>'' group by Col1 label count(Col2)'# of unique dates'", 1)
Illustration:
With a simple helper column :
=1/COUNTIFS($A$2:$A$7,A7,$B$2:$B$7,B7)
And to get your results :
=SUMIF($A$2:$A$7,E2,$C$2:$C$7)
This is not one-formula solution but I think it works. First, create a third column concatenating the columns that you want to compare. In this case, at cell D2 write:
=CONCATENATE(B2,C2)
This is for the first row of your example. Then, replicate that to the following rows.
Finally, create a formula that counts unique values:
=SUM(IF(FREQUENCY(IF(LEN(D2:D7)>0,MATCH(D2:D7,D2:D7,0),""), IF(LEN(D2:D7)>0,MATCH(D2:D7,D2:D7,0),""))>0,1))
Assuming your new column of concatenated values is at D2:D7.

Search a Column for Specified Text, Then Return the Sum of the adjacent column which contains values to sum

I am creating a World Cup spreadsheet.
Example:
Team | Pts |
Brazil | XXX |
Switz. | XXX |
C. Rica| XXX |
Serbia | XXX |
I have another table which displays the Country, Score & whether it was a W (win), L (loss) or T (tie). A WIN will add 3 points, a TIE will add 1 point & a LOSS will add 0 points. Where I put the XXX, I want a formula that will search the table column for the Country Name and ADD up the W's, L's & T's and display the sum of every game there team played with the point system I mentioned.
Any help would be greatly appreciated! Thanks.
Try,
=SUM(COUNTIFS(D:D, A2, E:E, {"T","W","W","W"}))
You can use COUNTIFS. I set up a mock table to demonstrate:
The first COUNTIFS will count a given teams wins, and then multiple by a factor of 3.
The second COUNTIFS will count a given teams ties (no need to multiply here).
Since a Loss equates to 0 points, there is no need to sum or count anything.
Adjust ranges to fit your setup and then auto-fill the equation down your table to calculate for each team.
You can use a formula in H2, as below, and drag down as many rows as required
=(3*SUMPRODUCT(--($A$2:$A$13=$G2),--($C$2:$C$13="W"))+SUMPRODUCT(--($A$2:$A$13=$G2),--($C$2:$C$13="T")))
Data:
You only care about Wins or Ties. So count the wins per country and * 3 and add to that the count of ties, which you don't need to multiply as is 1.
Sumproduct is used to handle the arrays nicely.

Find the top n values in a range while keeping the sum of values in another range under x value

I'd like to accomplish the following task. There are three columns of data. Column A represents price, where the sum needs to be kept under $100,000. Column B represents a value. Column C represents a name tied to columns A & B.
Out of >100 rows of data, I need to find the highest 8 values in column B while keeping the sum of the prices in column A under $100,000. And then return the 8 names from column C.
Can this be accomplished?
EDIT:
I attempted the Solver solution w/ no luck. 200 rows looks to be the max w/ Solver, and that is what I'm using now. Here are the steps I've taken:
Create a column called rank RANK(B2,$B$2:$B$200) (used column D -- what is the purpose of this?)
Create a column called flag just put in zeroes (used column E)
Create 3 total cells total_price (=SUM(A2:A200)), total_value (=SUM(B2:B200)) and total_flag (=(E2:E200))
Use solver to minimize total_value (shouldn't this be maximize??)
Add constraints -Total_price<=100000 -Total_flag=8 -Flag cells are binary
Using Simplex LP, it simply changes the flags for the first 8 values. However, the total price for the first 8 values is >$100,000 ($140k). I've tried changing some options in the Solver Parameters as well as using different solving methods to no avail. I'd like to post an image of the parameter settings, but don't have enough "reputation".
EDIT #2:
The first 5 rows looks like this, price goes down to ~$6k at the bottom of the table.
Price Value Name Rank Flag
$22,538 42.81905675 Blow, Joe 1 0
$22,427 37.36240932 Doe, Jane 2 0
$17,158 34.12127693 Hall, Cliff 3 0
$16,625 33.97654031 Povich, John 4 0
$15,631 33.58212402 Cow, Holy 5 0
I'll give you the solver solution as a starting point. It involves the creation of some extra columns and total cells. Note solver is limited in the amount of cells it can handle but will work with 100 anyway.
Create a column called rank RANK(B2,$B$2:$B$100)
Create a column called flag just put in zeroes
Create 3 total cells total_price, total_value and total_flag
Use solver to minimize total_value
Add constraints
-Total_price<=100000
-Total_flag=8
-Flag cells are binary
This will flag the rows you want and you can grab the names however you want.

Resources