Get all matching entries for a certain value in excel - excel

This is something I need for my computer science research thesis.
I have a big excel data file, with several columns, and two columns of interest are structured like this:
Column A Column B
-------- ---------
PersonType1 GroupType1
PersonType2 GroupType3
PersonType1 GroupType13
PersonType5 GroupType1
PersonType5 GroupType3
What I would like to receive is, for each PersonType, a list of its GroupTypes. For example, I would like a result of: [PersonType1 = {GroupType1, GroupType13}], [PersonType2 = {GroupType3}], [PersonType5 = {GroupType1, GroupType3}]. (It doesn't have to be structured exactly like this; it's just an example.)
Is there a convenient set of actions I can do in excel to almost automate such info derivation?
If I were to do it manually, I would filter for one person type at a time and copy its B column, then filter for the second person type and copy its B column, and so on, but that is too much work.
I must mention that this comes after some filtering on the columns via excel's filter features.

You can do this with an array formula (meaning, enter with CTRL+SHIFT+ENTER).
Assume your data is laid out so that the large table with all the data (orange) sits in A2:B6 and the unique "PersonType#" values (the green row) run across row 2 starting at F2. You can then put a formula below each of those headers to return just the values that match the type.
Then, use
=IFERROR(INDEX($B$2:$B$6,SMALL(IF($A$2:$A$6=F$2,ROW($A$2:$A$6)-ROW($A$2)+1),ROWS($A$2:$A2))),"")
Enter it with CTRL+SHIFT+ENTER, then drag it across (one column per PersonType) and down far enough to cover the largest number of matches. (If your main data has 6 rows, drag the formula down at least 6 rows.) When it doesn't find any more matches, it returns "".
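To see what the array formula computes, here is a minimal Python sketch of the same logic. Python is used only for illustration; the function name kth_match and the variable names are assumptions, and the sample lists are the rows from the question. For each person type it returns the k-th matching group, or an empty string once the matches run out, which mirrors SMALL(IF(...),k) wrapped in IFERROR.

```python
def kth_match(keys, values, wanted, k):
    """Return the k-th (1-based) value whose key equals `wanted`, or "" if there is none."""
    # Positions of the matching rows, like IF($A$2:$A$6=F$2, ROW(...)-ROW($A$2)+1)
    positions = [i for i, key in enumerate(keys) if key == wanted]
    # SMALL(..., k) picks the k-th smallest position; IFERROR turns a miss into ""
    return values[positions[k - 1]] if 1 <= k <= len(positions) else ""


col_a = ["PersonType1", "PersonType2", "PersonType1", "PersonType5", "PersonType5"]
col_b = ["GroupType1", "GroupType3", "GroupType13", "GroupType1", "GroupType3"]

for person in ["PersonType1", "PersonType2", "PersonType5"]:
    # "Dragging down" corresponds to increasing k from 1 to the number of data rows
    groups = [kth_match(col_a, col_b, person, k) for k in range(1, len(col_a) + 1)]
    print(person, [g for g in groups if g])
```

Each column of the dragged formula corresponds to one value of `person`, and each row to one value of `k`.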

Related

Query another table in Excel

I want to pull all the values from a particular column in another table. My goal is to take a handful of different tables and put particular columns from each of them into a single, collated table.
For example, let's say I have tables about different kinds of objects
FRUITS
name flavor
banana savory
orange sweet
peach sweet
PETS
name lifespan
dog long
fish short
cat long
Imagine that I now want to make a third table with the name column from fruits and pets.
COLLATED
name source
banana fruits
orange fruits
peach fruits
dog pets
fish pets
cat pets
I tried to install the powerpivot add-in to do this, but I wasn't sure how to do it with a Mac. I'd prefer to use any "table connection" features that Excel offers in case that is possible.
A combination of ideas from both @Ike's and @JosWoolley's great answers would be this:
=LET(
n,{"Fruits","Pets","Cars"},
w,(tblFruits[Name],tblPets[Name],tblCars[Name]),
y,COUNTA(w),
s,SEQUENCE(AREAS(w)*y,,0),
q,1+QUOTIENT(s,y),
z,CHOOSE({1,2},IFERROR(INDEX(w,1+MOD(s,y),,q),""),INDEX(n,q)),
FILTER(z,INDEX(z,0,1)<>""))
For a new table, the table name would be added to the n variable and the column/range to the w variable without the need to edit the rest of the formula.
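The indexing trick in this formula can be hard to follow, so here is a rough Python sketch of what it does, assuming the name columns contain no blank cells. The function name collate and the sample data (taken from the question) are illustrative only. The formula builds an oversized grid of AREAS(w)*y rows, fills the slots that do not exist with "", and then filters the empty rows out.

```python
def collate(tables):
    """tables: list of (label, names) pairs. Returns a list of (name, label) rows."""
    y = sum(len(names) for _, names in tables)       # COUNTA(w): total number of names
    rows = []
    for s in range(len(tables) * y):                 # SEQUENCE(AREAS(w)*y,,0)
        q, r = divmod(s, y)                          # QUOTIENT(s,y) and MOD(s,y)
        label, names = tables[q]                     # q selects the area (table)
        name = names[r] if r < len(names) else ""    # IFERROR(INDEX(w,1+MOD(s,y),,q),"")
        rows.append((name, label))
    return [row for row in rows if row[0] != ""]     # FILTER(z,INDEX(z,0,1)<>"")


tables = [
    ("fruits", ["banana", "orange", "peach"]),
    ("pets", ["dog", "fish", "cat"]),
]
collated = collate(tables)
for name, source in collated:
    print(name, source)
```

Adding a table means adding one more (label, names) pair, just as the formula only needs new entries in n and w.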
Edit #1
Adding more columns can get tricky using this approach but it can be done. For example having an extra 'Price' column in all tables would require something like this:
=LET(
n,{"Fruits","Pets","Cars"},
w,(tblFruits[Name],tblPets[Name],tblCars[Name]),
p,(tblFruits[Price],tblPets[Price],tblCars[Price]),
y,COUNTA(w),
s,SEQUENCE(AREAS(w)*y,,0),
q,1+QUOTIENT(s,y),
z,CHOOSE({1,2,3},IFERROR(INDEX(w,1+MOD(s,y),,q),""),INDEX(n,q),IFERROR(INDEX(p,1+MOD(s,y),,q),"")),
FILTER(z,INDEX(z,0,1)<>""))
where you have an extra p variable and the CHOOSE is updated to reflect the new values. Of course, you could change the order of the columns in the CHOOSE by either changing the order of the 3 parts or by simply changing the numbers in the {1,2,3} array (e.g. {1,3,2}).
=LET(w,(tblFruits[name],tblPets[name],tblCars[name]),x,AREAS(w),y,COUNTA(w),z,IFERROR(INDEX(w,1+MOD(SEQUENCE(x*y,,0),y),,1+INT(SEQUENCE(x*y,,0)/y)),""),FILTER(z,z<>""))
Amend the table column names as required, adding in as many as required.
This should work for reasonably small ranges, though x*y is a loose upper bound on the number of rows and could certainly be tightened.
Agreed with Ike that a recursive lambda would probably be of help here.
I added two tables to a sheet: tblFruits and tblPets.
Then you can put the following formula in any cell on the same sheet or another sheet.
=LET(
a,CHOOSE({1,2},tblFruits[name],"Fruits"),
b,CHOOSE({1,2},tblPets[name],"Pets"),
rowIndex,SEQUENCE(ROWS(a) + ROWS(b)),
colIndex,SEQUENCE(1,COLUMNS(a)),
IF(rowIndex<=ROWS(a),
INDEX(a,rowIndex,colIndex),
INDEX(b,rowIndex-ROWS(a),colIndex)
)
)
The first four lines of the formula define variables that are then used in the final IF function:
a and b return "virtual" arrays consisting of each name column plus a "new" column giving the type.
rowIndex returns a single array {1,2,...(number of rows of both tables)}
colIndex returns an array built from the number of columns, in this case {1,2} (name and type)
These variables are used in the IF formula:
Think of it as a For i = 1 To UBound(rowIndex) loop.
If the current value from the rowIndex array is less than or equal to the number of rows of tblFruits,
then the INDEX result is based on virtual array a;
if not, the row index for b is calculated and the INDEX result is based on virtual array b.
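The loop analogy can be written out as a short Python sketch. This is only an illustration of the IF/INDEX logic, not something Excel runs; the function name stack and the sample rows are assumptions based on the question's tables.

```python
def stack(a, b):
    """Stack two lists of rows, choosing the source by row index like the IF formula."""
    result = []
    for row_index in range(1, len(a) + len(b) + 1):   # SEQUENCE(ROWS(a)+ROWS(b))
        if row_index <= len(a):                       # IF(rowIndex<=ROWS(a), ...
            result.append(a[row_index - 1])           # INDEX(a,rowIndex,colIndex)
        else:
            result.append(b[row_index - len(a) - 1])  # INDEX(b,rowIndex-ROWS(a),colIndex)
    return result


a = [("banana", "Fruits"), ("orange", "Fruits"), ("peach", "Fruits")]
b = [("dog", "Pets"), ("fish", "Pets"), ("cat", "Pets")]
stacked = stack(a, b)
print(stacked)
```

The "- 1" terms only convert Excel's 1-based INDEX positions to Python's 0-based list positions.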
The result is a spilled array. You can filter it: just add a header row above it and apply a filter.
But you won't be able to turn the spill range into an Excel table.
If you need the combined data as a real table, you will have to use VBA to create it.
This would be the formula with a third table:
=LET(
a,CHOOSE({1,2},tblFruits[Name],"Fruits"),
b,CHOOSE({1,2},tblPets[name],"Pets"),
c,CHOOSE({1,2},tblRooms[name],"Rooms"),
rowIndex,SEQUENCE(ROWS(a)+ROWS(b)+ROWS(c)),
colIndex,SEQUENCE(1,COLUMNS(a)),
IF(rowIndex<=ROWS(a),
INDEX(a,rowIndex,colIndex),
IF(rowIndex<=ROWS(a) + ROWS(b),
INDEX(b,rowIndex-ROWS(a),colIndex),
INDEX(c,rowIndex-(ROWS(a)+ROWS(b)),colIndex))))

Excel check if cell contains text from list and return values from list for every match

I have a list of products with Material no and Text description as visible in Sheet1 image.
Every part of the text has a defined value in another sheet.
How to get result in Sheet1 column C for each product based on its Text description and appropriate values defined in Sheet2?
Let's say I would like to parse the text and for each part of text to determine the value from another sheet and summarize this to one number.
For example:
Material 1 has A B C D in Text, formula should result with 100 => 10+20+30+40 (A=10+B=20+C=30+D=40)
etc...
I know I can use IF to check for each variant then return value with vlookup, but this is something I would like to avoid. Variants will change, their number can be pretty big therefore I would like to avoid changing formulas every time when we change Variants...
Judging by your screenshots, you have a version of Excel prior to Microsoft 365. In that case I think you can use:
Formula in C2:
=SUMPRODUCT(ISNUMBER(FIND(" "&E$2:E$11&" "," "&B2&" "))*F$2:F$11)
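As a hedged illustration of why the spaces are added around both the variant and the text, here is the same idea as a small Python sketch. The function name text_score and the dictionary are assumptions; the values A=10, B=20, C=30, D=40 come from the question's example. Padding with spaces makes the check match whole space-separated tokens only, so "A" does not match inside "AB".

```python
def text_score(text, variants):
    """Sum the values of all variants that appear as whole tokens in `text`."""
    padded = f" {text} "                       # " "&B2&" "
    return sum(
        value
        for token, value in variants.items()
        if f" {token} " in padded              # ISNUMBER(FIND(" "&E2&" ", ...)), case-sensitive
    )


variants = {"A": 10, "B": 20, "C": 30, "D": 40}
print(text_score("A B C D", variants))
print(text_score("AB C", variants))
```

As in the formula, a variant that occurs more than once in the text is still counted only once.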

Excel: Is there a way to match Two Columns while matching a third

Column B & C's values match each other and Column ID & A's values match each other (as in the codes are the same customer). Column A & B contain the same values but in a different order, is it possible to match those values, allowing all the columns to match?
E.g
ID   A      B      C      D
23   AB12   BA13   K00
12   BA13   BC33   K01
45   AC31   AB12   K02
65   BC33   CC31   K03
11   AA22   CB21   K04
02   CB21   AC31   K05
57   CC31   AA22   K06
Ideally the first row should be:
| ID | A | B | C | D |
23 AB12 AB12 K02
Can this be done on a large spreadsheet 10,000+ ?
You don't need to change the order of your columns to do that, just use INDEX(MATCH()) instead of vlookup - I'm not quite sure why vlookup is even a thing to be honest.
But first, if you work on a large dataset and maybe you'll have to revisit things later, let's work smartly and NAME OUR RANGES instead of using dirty $A$5 notations.
SO here's our reduced dataset:
Let's select the whole of the columns containing data, like so, and in the upper left corner we're going to name it "col_B" (or whatever else you fancy):
Do that for all the columns you'll use.
You can now set up a separate section (or another worksheet entirely, since you have lots of data) to re-order your dataset. Like so. We're going to use col_A as our reference, so let's populate it first (just copy-paste, I guess) with your source data:
In our final result, we want column B to be the same as column A. So let's just write that formula in our first cell of column B, and increment that all the way to the end:
You now want col C to be whatever value corresponds to each col B value in your old table. We're going to use INDEX(MATCH()) like so:
In English this means:
INDEX(COL_TO_RETURN, MATCH(VALUE_TO_SEARCH_FOR, COL_TO_SEARCH_IN, 0))
In other words, it will return the corresponding value for col_C (of the source data, so the column you have previously named). It will take the value in D3 (your new sheet's column B) as the thing to search for. It will search for that D3 value inside col_B, the range that you have named so in your source data. "0" just means you want an exact match.
The fact that we named our columns just makes them easier to reference -you could do without but it's a lot cleaner.
Much, much better would be to work with tables & structured references, but that's a bit more of a stretch, so let's stop here. The result is this:
Apply the same logic for your ID column & you're done.
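For readers who find the INDEX(MATCH()) pattern easier to follow as code, here is a minimal Python sketch of the same lookup. The function name index_match is an assumption; the two lists are columns B and C from the question's sample data.

```python
def index_match(return_col, search_col, value):
    """Find `value` in `search_col` and return the entry at the same position in `return_col`."""
    try:
        position = search_col.index(value)   # MATCH(value, search_col, 0): exact match
    except ValueError:
        return None                          # Excel would show #N/A here
    return return_col[position]              # INDEX(return_col, position)


src_b = ["BA13", "BC33", "AB12", "CC31", "CB21", "AC31", "AA22"]
src_c = ["K00", "K01", "K02", "K03", "K04", "K05", "K06"]

# The first re-ordered row has B = AB12, so C should come out as K02
print(index_match(src_c, src_b, "AB12"))
```

Unlike VLOOKUP, the column being searched does not have to sit to the left of the column being returned.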

Check successive excel records and if they are the same, assign them identical ID's

I have a large excel file with the following format:
Contact First Name Contact Last Name Contact ID
Brandi Aasen 1602940
Brandi Aasen 1600622
Brandi Aasen 1600622
Angela Abate 1600846
Angela Abate 1600846
Edahena Lucido 1603494
Guadalupe Delgado 1602523
Guadalupe Delgado 1602087
Tonya Addams 1602339
What I am needing is to adjust it so that if the contact name is the same, the contact ID must be the same as well. As of now, every single ID in the file is different. It doesn't even matter if I use any of the actual ID's listed there in the file. For instance, Brandi Aasen is just fine with the ID "0001", so long as "0001" is the ID set for all three instances of her. The file is sorted by Last Name then first name, so all of the duplicate contacts follow each other one after another.
I'm having a hard time finding an efficient way to do this. Admittedly I don't have much experience with excel. If I try something simple like:
=IF((AND(F2=F3,G2=G3)),(H2),(H3))
I run into trouble immediately, because series continues as I move down the column and the conditional cell numbers get all out of sorts.
What I was thinking is that I might be better off if I combine columns A and B into one. If I have the full name in a single column, is there any way I could implement something like (pseudocode):
For all instances of A2 -> Set the adjacent column cell(B) to an arbitrary value
OR
If A2 = A3 -> B3 = B2
The original simple formula I posted at the beginning would almost work if it could go something like:
=IF((AND(F2=F2,G3=G3)),(H3=H2),(H3))
But excel doesn't seem to allow for me to use the "H3=H2" statement as the "Value if True"
Truly appreciate any help or guidance in the right direction.
I don't know if it is a perfect solution, but I would do something like this:
I consider that First Name is in A column and Last Name is in B column. The ID you want to insert will be in the D column. The header is in 1st row, so Brandi, Aasen is in 2nd row.
In D2 you just type 1, as this is the first index.
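In D3 type =IF(AND(A3=A2;B3=B2);D2;D2+1) and copy the formula down to all other D cells. (If your locale uses a comma as the list separator, write =IF(AND(A3=A2,B3=B2),D2,D2+1) instead.)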
The function checks if the active pair (A3, B3) is the same as previous one (A2, B2). If true, the same number is taken (from D2). If not, the number is taken from above and increased.
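The same running-counter idea, written as a short Python sketch for clarity. The function name assign_ids is an assumption and the names are the sample rows from the question. Like the formula, it relies on the data being sorted so that duplicates are adjacent.

```python
def assign_ids(names):
    """Give consecutive identical (first, last) pairs the same ID, starting at 1."""
    ids = []
    for i, name in enumerate(names):
        if i == 0:
            ids.append(1)                 # D2 is typed in by hand as 1
        elif name == names[i - 1]:
            ids.append(ids[-1])           # same person as the row above: reuse the ID
        else:
            ids.append(ids[-1] + 1)       # new person: previous ID plus one
    return ids


contacts = [
    ("Brandi", "Aasen"), ("Brandi", "Aasen"), ("Brandi", "Aasen"),
    ("Angela", "Abate"), ("Angela", "Abate"),
    ("Edahena", "Lucido"),
    ("Guadalupe", "Delgado"), ("Guadalupe", "Delgado"),
    ("Tonya", "Addams"),
]
print(assign_ids(contacts))
```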

Excel: If Cell in Column = text value of X, then display text (in the same row, but different column) on another sheet

This is a confusing request.
I have an excel tab with a lot of data, for now I'll focus on 3 points of that data.
Team
Quarter
Task Name
In one tab I have a long list of this data displaying all the tasks for all the teams and what Quarter they will be on.
I WANT to load another tab, and take that data (from the original tab) and insert it into a non-list format. So I would have Quarters 1,2,3,4 as columns going across the screen, and Team Groups going down. I want each "task" that is labeled as Q1 to know to list in the Q1 section of that Teams "Block"
So something like this: "If Column A=TeamA,AND Quarter=Q1, then insert Task Name ... here."
Basically, if the formula = true, I want to print a list of those items within that team section of the excel document.
I'd like to be able to add/move things around at the data level, and have things automatically shift in the Display tab. I honestly have no idea where to start.
If there can never be more than one task for a given team and quarter, then you can use a formula solution.
Given a data setup like this (in a sheet named 'Sheet1'):
And expected results like this (in a different sheet):
The formula in cell B2 and copied over and down is:
=IFERROR(INDEX(Sheet1!$C$2:$C$7,MATCH(1,INDEX((Sheet1!$A$2:$A$7=$A2)*(Sheet1!$B$2:$B$7=B$1),),0)),"")
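The formula multiplies the two comparisons so that only a row matching both the team and the quarter yields 1, and MATCH then finds the first such row. A minimal Python sketch of that logic follows; the function name lookup_task and the sample rows are made up for illustration, since the original data is only shown in a screenshot.

```python
def lookup_task(rows, team, quarter):
    """Return the task of the first row matching both team and quarter, or "" if none."""
    for row_team, row_quarter, task in rows:
        # (A=team)*(B=quarter) is 1 only when both conditions hold
        if row_team == team and row_quarter == quarter:
            return task
    return ""                                 # IFERROR(..., "")


# Hypothetical data: (Team, Quarter, Task Name)
rows = [
    ("TeamA", "Q1", "Plan launch"),
    ("TeamA", "Q3", "Review results"),
    ("TeamB", "Q2", "Hire analyst"),
]

for team in ["TeamA", "TeamB"]:
    print(team, [lookup_task(rows, team, q) for q in ["Q1", "Q2", "Q3", "Q4"]])
```

If a team can have several tasks in the same quarter, this approach only ever shows the first one, which is why the answer states that restriction up front.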
I came across the same situation: when I had to insert values from an Excel sheet into a table, I needed all the information for a product in one row and one column instead of spread over multiple rows. In Excel my data looks like:
ProductID----OrderID
9353510---- 1212259
9650934---- 1381676
9572474---- 1381677
9632365---- 1374217
9353182---- 1212260
9353182---- 1219361
9353182---- 1212815
9353513---- 1130308
9353320---- 1130288
9360957---- 1187479
9353077---- 1104558
9353077---- 1130926
9353124---- 1300853
I wanted single row for each product in shape of
(ProductID,'OrdersIDn1,OrderIDn2,.....')
As a quick solution, I added a third column, C, to number each product's orders:
=IF(A2<>A1,1,IF(A2=A1,C1+1,1))
and a fourth column, D, as a placeholder that concatenates the order ID with the previous row's value for the same product:
=IF(A2=A1,D1&","&TEXT(B2,"########"),TEXT(B2,"########"))
Then Column E is the final column I required to hide/blank out duplicate row values and keep only the correct one:
=IF(A2<>A3,"("&A2&",'"&D2&"'),","")
Final Output required is only from Column E
ProductID Order Id Sno PlaceHolder Required Column
9353510 1212259 1 1212259 (9353510,'1212259'),
9650934 1381676 1 1381676 (9650934,'1381676'),
9572474 1381677 1 1381677 (9572474,'1381677'),
9632365 1374217 1 1374217 (9632365,'1374217'),
9353182 1212260 1 1212260
9353182 1219361 2 1212260,1219361
9353182 1212815 3 1212260,1219361,1212815 (9353182,'1212260,1219361,1212815'),
9353513 1130308 1 1130308 (9353513,'1130308'),
9353320 1130288 1 1130288 (9353320,'1130288'),
9360957 1187479 1 1187479 (9360957,'1187479'),
9353077 1104558 1 1104558
9353077 1130926 2 1104558,1130926 (9353077,'1104558,1130926')
Notice that the final value appears only on the row with the highest Sno for each product, which is what avoids duplication.
In your case, Product could be Team, Order could be Quarter, and the output could be
(Team,Q1,Q2,....),
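The helper-column approach can be sketched in Python as follows. This is only an illustration under the same assumption the formulas make, namely that rows for the same product are adjacent; the function name orders_per_product is hypothetical and the rows are a subset of the sample data above.

```python
def orders_per_product(rows):
    """For each row, return the final "(product,'orders')," string or "" (like column E)."""
    output = []
    placeholder = ""
    for i, (product, order) in enumerate(rows):
        same_as_previous = i > 0 and rows[i - 1][0] == product
        # Column D: extend the running list, or start a new one for a new product
        placeholder = f"{placeholder},{order}" if same_as_previous else str(order)
        # Column E: only the last row of each product shows the finished string
        is_last = i == len(rows) - 1 or rows[i + 1][0] != product
        output.append(f"({product},'{placeholder}')," if is_last else "")
    return output


rows = [
    (9353510, 1212259),
    (9353182, 1212260),
    (9353182, 1219361),
    (9353182, 1212815),
]
for line in orders_per_product(rows):
    print(repr(line))
```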
Based on my understanding of your summary above, you want to put non-numerical data into a grid of teams and quarters.
The offset worksheet function will work well for this in conjunction with the match or vlookup functions. I have often done this task by doing the following steps.
In the data table, concatenate the Team and Quarter columns so that you have a unique lookup value in the leftmost column of the table (note: you can hide this column later for ease of reading).
Note: You will want to name the input range for best formula management. Ideally use an Excel Table (2007 or greater) or create a dynamically named range with the offset and CountA functions working together (http://tinyurl.com/yfhfsal)
First, VLOOKUP arguments are VLOOKUP(Lookup_Value,Table_Array,Col_Index_num,[Range Lookup]) See http://tinyurl.com/22t64x7
In the first cell of your output area you would have a VLOOKUP formula that would look like this
=Vlookup(TeamName&Quarter,Input_List,Column#_Where_Tasks_Are,False)
The lookup value should reference the cells where you have the team names listed down the side and the quarter names listed across the top. The input list is the range on the sheet where your data is stored. The third argument is the number of the column in which the tasks are listed in your source data (3 if the tasks are in the third column of Input_List), and False tells the function to accept only an exact match.
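The concatenated-key lookup described above behaves much like a dictionary keyed on team plus quarter. Here is a small Python sketch of that idea; the function name build_index and the sample rows are assumptions made up for illustration.

```python
def build_index(rows):
    """Map the helper key Team&Quarter to the task, keeping the first match like VLOOKUP."""
    index = {}
    for team, quarter, task in rows:
        index.setdefault(team + quarter, task)   # helper column: Team & Quarter
    return index


# Hypothetical data: (Team, Quarter, Task Name)
rows = [
    ("TeamA", "Q1", "Plan launch"),
    ("TeamB", "Q2", "Hire analyst"),
    ("TeamA", "Q1", "Second task, never returned"),
]
index = build_index(rows)

# Equivalent of =VLOOKUP(TeamName&Quarter, Input_List, 3, FALSE) for one output cell
print(index.get("TeamA" + "Q1", "#N/A"))
print(index.get("TeamB" + "Q4", "#N/A"))
```

As with VLOOKUP, only the first task for a given team and quarter is returned, so this layout suits data with at most one task per cell of the grid.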

Resources