How do I create single cell arrays based on unique contiguous groups? - excel

Column A identifies unique families using multiple other columns of data.
Column B is a list of individuals.
I would like Column C to contain cell arrays of these families (Shown Below).
For some reason, the MATCH formula in my attempted solution is returning the last occurrence of the match, so it does not work.
I have tried this formula (the output of this is shown in Column D in the picture):
{=OFFSET(INDEX(A:A, MATCH(A1,A:A)),0,1,COUNTIF(A:A,A1))}
A  B       C                              D
1  Tom 1   {Tom One, Sue One}             Sue 1
1  Sue 1   {Tom One, Sue One}             Sue 1
2  Bob 2   {Bob Two, Joan Two, John Two}  John 2
2  Joan 2  {Bob Two, Joan Two, John Two}  John 2
2  John 2  {Bob Two, Joan Two, John Two}  John 2
3  Tom 3   {Tom Three}                    Tom 3
4  Joe 4   {Joe Four}                     Joe 4

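You can use the following formula, provided the data is sorted by column A. MATCH with match type 0 finds the first row of the family and match type 1 finds the last row, so the two INDEX calls bound the family's range in column B. (Your MATCH omits the match type, which defaults to 1; that is why it returns the last occurrence.)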
="{"&TEXTJOIN(",",TRUE,INDEX(B:B,MATCH(A1,A:A,0)):INDEX(B:B,MATCH(A1,A:A,1)))&"}"
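For readers who want to check the grouping logic outside Excel, here is a minimal Python sketch of the same idea. It assumes, like the formula, that rows are already sorted so each family is contiguous. The function name `family_labels` and the sample list are illustrative only, not part of the original answer.

```python
from itertools import groupby

# (family id, individual) pairs, already sorted by family id like column A
rows = [
    (1, "Tom 1"), (1, "Sue 1"),
    (2, "Bob 2"), (2, "Joan 2"), (2, "John 2"),
    (3, "Tom 3"),
    (4, "Joe 4"),
]

def family_labels(rows):
    """Return one "{a,b,...}" label per input row, grouping contiguous family ids."""
    labels = []
    for _, group in groupby(rows, key=lambda r: r[0]):
        members = [name for _, name in group]
        # Same shape as "{" & TEXTJOIN(",", TRUE, ...) & "}" in the formula
        label = "{" + ",".join(members) + "}"
        # Every row of the family gets the same label, as in column C
        labels.extend([label] * len(members))
    return labels

for (family, name), label in zip(rows, family_labels(rows)):
    print(family, name, label)
```

Like the MATCH-based formula, `groupby` only merges adjacent rows, so unsorted input would split a family into several groups.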

Related

How to update a dataframe column from a second dataframe, matching on two columns whose values can repeat in the first dataframe?

I have two dataframes with different information about a person. In the first dataframe, a person's name may repeat in different rows. I want to add/update the first dataframe with data from the second dataframe where the two columns containing the person's data match in both. Here is an example of what I need to accomplish:
df1:
name surname
0 john doe
1 mary doe
2 peter someone
3 mary doe
4 john another
5 paul another
df2:
name surname account_id
0 peter someone 100
1 john doe 200
2 mary doe 300
3 john another 400
I need to accomplish this:
df1:
name surname account_id
0 john doe 200
1 mary doe 300
2 peter someone 100
3 mary doe 300
4 john another 400
5 paul another <empty>
Thanks!
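One possible approach, sketched here under the assumption that each (name, surname) pair appears at most once in df2, is a left merge on both columns. The frames below are rebuilt from the example above. With a left merge, the unmatched row (paul another) gets NaN rather than an empty string, and the column becomes floating point as a result.

```python
import pandas as pd

df1 = pd.DataFrame({
    "name": ["john", "mary", "peter", "mary", "john", "paul"],
    "surname": ["doe", "doe", "someone", "doe", "another", "another"],
})
df2 = pd.DataFrame({
    "name": ["peter", "john", "mary", "john"],
    "surname": ["someone", "doe", "doe", "another"],
    "account_id": [100, 200, 300, 400],
})

# Left merge keeps every row of df1 in its original order;
# validate="many_to_one" raises if df2 has duplicate (name, surname) keys.
result = df1.merge(df2, on=["name", "surname"], how="left", validate="many_to_one")
print(result)
```

If df2 could contain duplicate keys, the `validate` check would raise instead of silently duplicating rows of df1, which is usually the safer failure mode.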

Returning the last item in a subset with excel formula

In this example I would like to mark any customer whose most recent purchase (the bottom of their list) was a pen. My data is sorted by CustomerID and ServiceDate, with the most recent purchase last. I would like to mark all of a customer's transactions only if their last purchase was a pen (333).
I have been trying formulas with COUNTA, but I am not sure how to do it when relying on a subset of the data.
=INDEX(C:C,COUNTA(C:C))
This will give me the last value in a column.
Customer ID  Customer Name  Item Number  Item Name  Date      Desired Results
1            Bob            222          Paper      1/1/2016  X
1            Bob            111          Tape       1/1/2017  X
1            Bob            333          Pen        1/1/2018  X
4            Greg           333          Pen        1/1/2015
4            Greg           111          Tape       1/1/2016
6            Chris          111          Tape       1/1/2015  X
6            Chris          333          Pen        1/1/2018  X
8            Luke           333          Pen        1/1/2013
8            Luke           333          Pen        1/1/2014
8            Luke           222          Paper      1/1/2015
8            Luke           111          Tape       1/1/2016
8            Luke           111          Tape       1/1/2018
9            Tom            333          Pen        1/1/2013  X
You can do this by creating an additional column. The additional column will find all customers whose last purchase was a pen using this formula: =IF(AND(C2=333,B2<>B3),B2,"").
The next column will give you your desired output: =IF(OR(B2=$F$4,B2=$F$8,B2=$F$14),"X","").
Thanks to joe I was able to figure this one out.
I still had to make another column.
I put this in column F.
=IF(AND(C2=333,B2<>B3),1,"")
Then in column G.
=IF(AND(COUNTIFS(A:A,A2,F:F,1)=1),"Yes","")
This worked great.
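The two-column approach above can be sketched in Python to check the logic: first find each customer's last item, then mark every row of customers whose last item is a pen. This is only an illustration using the question's data. The name `mark_last_pen` is hypothetical, and the rows are assumed to be sorted by customer and date as in the question.

```python
PEN = 333

# (customer id, item number), sorted by customer and then by service date
purchases = [
    (1, 222), (1, 111), (1, 333),
    (4, 333), (4, 111),
    (6, 111), (6, 333),
    (8, 333), (8, 333), (8, 222), (8, 111), (8, 111),
    (9, 333),
]

def mark_last_pen(purchases, pen=PEN):
    """Return "X" for every row of a customer whose last purchase was a pen."""
    last_item = {}
    for customer, item in purchases:
        # Later rows overwrite earlier ones, so this ends up as the last purchase
        last_item[customer] = item
    return ["X" if last_item[customer] == pen else "" for customer, _ in purchases]

for (customer, item), mark in zip(purchases, mark_last_pen(purchases)):
    print(customer, item, mark)
```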

Finding MAX value using VLOOKUP with many duplicate "IDs"

Using an Excel formula, I'm trying to pull the MAX value for a NAME that has a certain LETTER next to it.
Eg: Highest # for a specific % for each unique Name
So Jeff's Q value would be 7.
(I'm trying to over explain because it makes sense in my mind but it might not make sense to others..)
Name % #
Jeff O 4
Jeff D 3
Jeff Q 4
Jeff O 1
Jeff D 9
Jeff Q 7
Tom O 6
Tom D 7
Tom Q 8
Tom O 2
Tom D 8
Tom Q 3
Peter O 3
Peter D 8
Peter Q 7
Peter O 4
Peter D 10
Peter Q 3
Bob O 2
Bob D 6
Bob Q 10
Bob O 6
Bob D 10
Bob Q 9
Mark O 4
Mark D 7
Mark Q 4
Mark O 7
Mark D 8
Mark Q 1
I can't think of a way to run this without having dedicated worksheets for each person and running MAX on the specific column.
I've tried IF, VLOOKUP and MAX in various configurations but I get nothing.
Has anyone got any experience with this and could please point me in the right direction?
The MAXIFS function should be what you want. For example, assuming that you have your data in columns A:C the formula
=MAXIFS(C:C,A:A,"Jeff",B:B,"Q")
will give you the max number in column C where the value in column A is "Jeff" and the value in column B is "Q".
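As a cross-check of what the two criteria select, here is a small Python sketch of the same filter-then-max logic on part of the question's data. The helper `max_ifs` is a hypothetical name for illustration only. It returns None when nothing matches, which is a choice made for this sketch.

```python
# (name, letter, number) rows taken from the question's table
rows = [
    ("Jeff", "O", 4), ("Jeff", "D", 3), ("Jeff", "Q", 4),
    ("Jeff", "O", 1), ("Jeff", "D", 9), ("Jeff", "Q", 7),
    ("Tom", "O", 6), ("Tom", "D", 7), ("Tom", "Q", 8),
    ("Tom", "O", 2), ("Tom", "D", 8), ("Tom", "Q", 3),
]

def max_ifs(rows, name, letter):
    """Largest number among rows matching both the name and the letter."""
    values = [number for who, code, number in rows if who == name and code == letter]
    return max(values) if values else None

print(max_ifs(rows, "Jeff", "Q"))
```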

Get Top Performer by Subgroup Using Index and Match

I am trying to rank names in Column C from largest to smallest score.
Category  Score  Name     Total Rank  Apple Rank  Orange Rank
Apple     10     Joe      Rachel      Rachel      0
Orange    15     Don      Natalie     0           Natalie
Apple     20     James    Tom         Tom         0
Apple     1      Rob      Nothing     Nothing     0
Orange    3      Mary     Gina        0           Gina
Orange    100    Rachel   James       0           James
Orange    99     Natalie  Don         0           Don
Orange    87     Tom      Joe         0           Joe
Apple     27     Gina     Mary        Mary        0
Orange    30     Nothing  Rob         0           Rob
This works in Column E for Apples AND Oranges, with formula in E2 that is
=INDEX($C$2:$C$25,MATCH(1,INDEX(($B$2:$B$25=LARGE($B$2:$B$25,ROWS(E$1:E1)))*(COUNTIF(E$1:E1,$C$2:$C$25)=0),),0))
However, the goal is to compare Apples to Apples and Oranges to Oranges.
Only, the formulas in Columns F and G show "0" values for those rows that aren't in the right Apple/Orange category.
For F2:
=IF($A:$A="Apple",INDEX($C:$C,MATCH(1,INDEX(($B:$B=LARGE($B:$B,ROWS(F$1:F1)))*(COUNTIF(F$1:F1,$C:$C)=0),),0)),0)
For G2:
=IF($A:$A="Orange",INDEX($C:$C,MATCH(1,INDEX(($B:$B=LARGE($B:$B,ROWS(G$1:G1)))*(COUNTIF(G$1:G1,$C:$C)=0),),0)),0)
How do I modify the formulas so that the 0 values won't show up?
Something like this would be great: (screenshot made by just copy pasting values...)
Apple Rank  Orange Rank
Rachel      Natalie
Tom         Gina
Nothing     James
Mary        Don
            Joe
            Rob
Note: unless whole-column ranges are really required, restrict them; with unrestricted ranges the steps below may take an uncomfortably long time.
Assuming your data is in Columns A:G with a corresponding layout, Columns I:J may be achieved quite simply:
1. Copy Columns F:G and use Paste Special..., Values into I1.
2. Select Columns I:J, then HOME > Editing > Find & Select, Replace..., with Find what: 0 and Replace with: left empty, and click Replace All.
3. Find & Select, Go To Special..., select Blanks (only), OK.
4. Right-click one of the selected cells and choose Delete..., Shift cells up, OK.
To remove the 0s from Columns F:G only, replacing the final 0 in each formula with "" is sufficient.
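For comparison, a true within-category ranking (Apples against Apples, Oranges against Oranges) can be sketched in Python as below. This is not what the pasted values in the question show: those come from the overall ranking filtered by row category. The data is taken from the question's first three columns, and `rank_by_category` is a hypothetical helper name.

```python
from collections import defaultdict

# (category, score, name) rows from the question
scores = [
    ("Apple", 10, "Joe"), ("Orange", 15, "Don"), ("Apple", 20, "James"),
    ("Apple", 1, "Rob"), ("Orange", 3, "Mary"), ("Orange", 100, "Rachel"),
    ("Orange", 99, "Natalie"), ("Orange", 87, "Tom"), ("Apple", 27, "Gina"),
    ("Orange", 30, "Nothing"),
]

def rank_by_category(scores):
    """Map each category to its names ordered from largest to smallest score."""
    by_category = defaultdict(list)
    for category, score, name in scores:
        by_category[category].append((score, name))
    return {
        category: [name for _, name in sorted(entries, key=lambda e: e[0], reverse=True)]
        for category, entries in by_category.items()
    }

ranks = rank_by_category(scores)
for category, names in ranks.items():
    print(category, names)
```

Each list contains only names from its own category, so there are no placeholder zeros to remove afterwards.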

count data using two columns as references

Is it possible to count or countif by using a column as the data, a cell for the criteria (or what to match) and range of what to count?
Here is what I am looking at:
A1 B C D E F G H I J K L M N O
2 Running Data Total Count of Tardies (by category)
3 Date Employees Leader Start of Shift Break 1 Lunch Break 2 Employees Start of Shift Break 1 Lunch Break 2 Total
4 1-Jul Abe Sue 15 Abe 0
5 3-Jul Steve Bob 20 Anna 0
6 5-Jul Eve Andy 9 20 Eve 0
7 7-Jul Anna Andy 30 Helen 0
8 15-Jul Abe Sue 15 Mark 0
9 18-Jul Anna Andy 10 Steve 0
10 20-Jul Helen Sue 9 0
11 31-Jul Mark Bob 45 0
I am trying to count the data entered on the left (running data) in each category and having it show based on the Employees on the right (in the orange cells). So Abe should show 1 for Start of Shift, Eve should show 1 for Break 1 and Break 2, and Anna should show 2 for Start of Shift.
I have tried using:
=countif(C:C,$J4,D:D) to get the data from JUST Column D for Start of Shift, but it gives an error saying too many arguments have been entered for the function.
Help...
...and Thanks!
Countif will only look at 1 column to decide what to count.
Countifs will look at multiple columns. Your formula would look something like this:
=COUNTIFS($C:$C,$J4,E:E,">0")
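The two-condition count can be sketched in Python to make the logic explicit: a row counts only if the employee matches and the category cell holds a value greater than 0. The log below is hypothetical. The flattened table does not show which category each number belongs to, so the placement of values is an assumption chosen to agree with the expected counts described in the question.

```python
# Hypothetical running log; blank category cells are simply omitted.
# The category assigned to each number is an assumption for illustration.
log = [
    {"employee": "Abe", "Start of Shift": 15},
    {"employee": "Eve", "Break 1": 9, "Break 2": 20},
    {"employee": "Anna", "Start of Shift": 30},
    {"employee": "Abe", "Lunch": 15},  # assumed column
    {"employee": "Anna", "Start of Shift": 10},
]

def count_tardies(log, employee, category):
    """Count rows where the employee matches and the category value is > 0."""
    return sum(
        1
        for row in log
        if row["employee"] == employee and row.get(category, 0) > 0
    )

print(count_tardies(log, "Anna", "Start of Shift"))
```

Both conditions are tested on the same row, which is what pairing each criteria range with its own criterion does in COUNTIFS.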
