So let's say that I have a table like this:
TID Person Type Name
1 Andy F Orange
2 Andy M Beef
3 Andy V Carrot
4 Andy V Spinach
5 Bobby M Ham
6 Bobby F Apple
7 Bobby V Carrot
I want to transpose it so that it is arranged by Type. I want it to look like this:
Person F M V
Andy Orange Beef Carrot
Bobby Apple Ham Carrot
How can I do this? A couple of points, in case you missed them:
The Types are in no particular order. Notice that Andy's order is F M V V, but Bobby's is M F V.
A Type may occur more than once, as with Andy's double V. Even so, only the first V should count. That is why the V in the transposed table is Carrot: Carrot occurred first, so Spinach is ignored.
I don't know if I'm asking too much, but even just the gist of a solution would be very helpful. The main point of my question is how to transpose such unsorted items while respecting the first point. The second point is important too, but I can wait or ask later if you don't feel like answering.
Thanks for reading, and please share your knowledge.
The easiest/quickest way is to create a new column before the TID column containing this formula:
=[Person]&"_"&[Type]
For instance, say your data starts in column B (TID), as in the screenshots. The first formula would then be:
=C2&"_"&D2, which results in Andy_F. Copy this down for all the names you have.
You should have something like this:
NEW TID Person Type Name
Andy_F 1 Andy F Orange
Andy_M 2 Andy M Beef
Andy_V 3 Andy V Carrot
Andy_V 4 Andy V Spinach
Bobby_M 5 Bobby M Ham
Bobby_F 6 Bobby F Apple
Bobby_V 7 Bobby V Carrot
Next, set up a table like this (using copy unique items, if necessary), with unique names on the vertical and Types along the horizontal:
F M V
Andy [form]
Bobby
Where [form] is a VLOOKUP formula, as in the screenshots below:
This results in the correct table once the formula is copied to all cells in the new table:
VLOOKUP grabs the first item that matches its search criteria, so any later matches are ignored.
The formula for Andy F in the table is VLOOKUP($G2&"_"&H$1,$A$2:$E$8,5,0), with the data as in the screen shots.
A better way might be to use VBA, but this should do the trick.
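For readers who want to check the result outside Excel, the same "first match wins" idea can be sketched in plain Python. This is only an illustration, not the answer's method: the function name `pivot_first` and the tuple layout are assumptions made for the example, and the rows are the sample data from the question.

```python
# Sample rows from the question: (TID, Person, Type, Name)
rows = [
    (1, "Andy", "F", "Orange"),
    (2, "Andy", "M", "Beef"),
    (3, "Andy", "V", "Carrot"),
    (4, "Andy", "V", "Spinach"),
    (5, "Bobby", "M", "Ham"),
    (6, "Bobby", "F", "Apple"),
    (7, "Bobby", "V", "Carrot"),
]


def pivot_first(rows):
    """Return {person: {type: name}}, keeping the first name per (person, type)."""
    table = {}
    for _tid, person, type_, name in rows:
        # setdefault stores a value only if the key is new, so a later
        # duplicate (Andy's Spinach) is ignored, like VLOOKUP's first match.
        table.setdefault(person, {}).setdefault(type_, name)
    return table


table = pivot_first(rows)
types = sorted({type_ for _, _, type_, _ in rows})  # column headers

print("Person", *types)
for person, by_type in table.items():
    # Missing types are shown as empty cells.
    print(person, *(by_type.get(t, "") for t in types))
```

The input order of the Types does not matter here, because each value is placed by its key rather than by its position.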
I have Excel data in a table with a single row, and multiple values in two categories, and I want to summarize the two categories.
Input data:
Recipe      Meal                  Ingredients
Plum pie    Coffee     Dessert    Plums  Sugar  Eggs
Plum jam    Breakfast  Coffee     Plums  Sugar
Fried eggs  Breakfast  Lunch      Eggs
Pancakes    Breakfast  Dessert    Eggs   Flour  Milk
Desired output:
Eggs
Flour
Milk
Plums
Sugar
Breakfast
2
1
1
1
1
Coffee
1
2
2
Lunch
1
Dessert
1
1
1
Of course restructuring the input data and summarizing via a Pivot table or Countif is a solution, but not a practical possibility due to the source of the data.
Can anybody help with an intelligent solution (and apologies for the table pictures - can anybody help pasting tables other than as pictures - I solved the tables problem partially via https://tableconvert.com/excel-to-markdown but alas - no colors)
Thanks,
Anders
Quite verbose, with a high percentage of LAMBDA, but dynamic enough that you only need to enter two variables at the start.
Formula in I2:
=LET(meals,B2:C5,ingredients,D2:F5,uq_ing,SORT(UNIQUE(TOROW(ingredients,1),1),,,1),REDUCE(HSTACK("",uq_ing),SORT(UNIQUE(TOCOL(meals,1))),LAMBDA(x,y,VSTACK(x,HSTACK(y,MAP(uq_ing,LAMBDA(z,SUM(BYROW(meals,LAMBDA(v,SUM(N(v=y))))*BYROW(ingredients,LAMBDA(w,SUM(N(w=z))))))))))))
=LET(
    meal, B2:C5,
    ing, D2:F5,
    uMeal, UNIQUE(TOCOL(meal)),
    uIng, UNIQUE(TOCOL(ing, 1)),
    arr, MAKEARRAY(
        ROWS(uMeal),
        ROWS(uIng),
        LAMBDA(row, col,
            SUM(
                BYROW(meal, LAMBDA(r, SUM(--(r = INDEX(uMeal, row))))) *
                BYROW(ing, LAMBDA(r, SUM(--(r = INDEX(uIng, col)))))
            )
        )
    ),
    VSTACK(HSTACK("", TRANSPOSE(uIng)), HSTACK(uMeal, arr))
)
Probably very similar to @JvdV's answer, but at least you can check the results against each other.
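Both formulas count, for each meal and ingredient pair, the recipes that list both. A rough Python cross-check of that counting logic is sketched below. The `recipes` dictionary is an assumption: it is my reading of the question's input listing, with empty cells simply left out, so verify it against your own sheet before relying on the numbers.

```python
from collections import Counter

# recipe -> (meals, ingredients), as read from the question's input data
recipes = {
    "Plum pie": (["Coffee", "Dessert"], ["Plums", "Sugar", "Eggs"]),
    "Plum jam": (["Breakfast", "Coffee"], ["Plums", "Sugar"]),
    "Fried eggs": (["Breakfast", "Lunch"], ["Eggs"]),
    "Pancakes": (["Breakfast", "Dessert"], ["Eggs", "Flour", "Milk"]),
}

# Count every (meal, ingredient) pair that occurs in the same recipe.
counts = Counter()
for meals, ingredients in recipes.values():
    for meal in meals:
        for ingredient in ingredients:
            counts[(meal, ingredient)] += 1

all_meals = sorted({m for meals, _ in recipes.values() for m in meals})
all_ingredients = sorted({i for _, ings in recipes.values() for i in ings})

# Print the cross-table; pairs that never occur are shown as 0.
print("", *all_ingredients, sep="\t")
for meal in all_meals:
    print(meal, *(counts[(meal, i)] for i in all_ingredients), sep="\t")
```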
I'm looking for a formula for the Party column in Table 3 that will produce its values based on the data contained in Table 1 and Table 2.
NumSelect value in Table 3 determines Party value in Table 3.
Where NumSelect has "p", it refers to data in Table 1. If no "p" in NumSelect, then it refers to Table 2.
Number in NumSelect refers to row number.
If the corresponding ShortName has a value, that value should be returned.
If the corresponding ShortName is blank, then the corresponding Name should be returned.
Uppercase "P" and lowercase "p" in the NumSelect should both point to Table 1.
Each table is an Excel Table and its rows may expand or contract.
Certain rows in Table 1 and Table 2 may be empty.
Formula should not be volatile, not require control+shift+enter to enter the formula, and not require VBA.
Thanks!
Sorry for the bad formatting. I had this question formatted perfectly, but Stack Overflow kept preventing me from posting it because it claimed, "Your post appears to contain code that is not properly formatted as code. Please indent all code by 4 spaces using the code toolbar button or the CTRL+K keyboard shortcut. For more editing help, click the [?] toolbar icon."
Table 1
Name            Gender  ShortName  Occupation
Grace Turner    F                  Singer
Cadie Crawford  F       Tiger      Fine Artist
Paige Johnston  F                  Archeologist
Dexter Payne    M       Klondike   Veterinarian
Valeria Barnes  F                  Chef
Florrie Reed    F                  Lawer
Emily Ferguson  F                  Scientist
Sam Hawkins     M       Alpha      Biochemist
Savana Ellis    F                  Cook
Table 2
Name               Gender  ShortName  Occupation
Vanessa Cooper     F                  Producer
Jasmine Morris     F       Beta       Baker
Evelyn Taylor      F                  Economist
Adelaide Roberts   F                  Historian
Blake Cunningham   M       Lion       Chef
Adelaide Harrison  F                  Chemist
Frederick Watson   M                  Journalist
Table 3
NumSelect  Party
p2         Tiger
3          Evelyn Taylor
P8         Alpha
2          Beta
7          Frederick Watson
p7         Emily Ferguson
Long Formula
Your formula has 717 characters, this one has 347.
=IF(ISNUMBER(SEARCH("P",[@NumSelect])),
IF(INDEX(Table1[ShortName],VALUE(RIGHT([@NumSelect],1)))="",
INDEX(Table1[Name],VALUE(RIGHT([@NumSelect],1))),
INDEX(Table1[ShortName],VALUE(RIGHT([@NumSelect],1)))),
IF(INDEX(Table2[ShortName],[@NumSelect])="",
INDEX(Table2[Name],[@NumSelect]),
INDEX(Table2[ShortName],[@NumSelect])))
In pseudo-code, it looks like this:
=IF(ISNUMBER(A),IF(B="",C,B),IF(D="",E,D))
The issue is that B (lines 2 & 4) and D (lines 5 & 7) are repeated expressions.
Hopefully, this will help someone to make a major improvement.
Microsoft 365
Using the LET function, you could use the following:
=LET(iIndex,[@NumSelect],sIndex,VALUE(SUBSTITUTE(LOWER(iIndex),"p","")),
IF(LEN(iIndex)>LEN(sIndex),
LET(nShort,INDEX(Table1[ShortName],sIndex),nLong,INDEX(Table1[Name],sIndex),
IF(nShort="",nLong,nShort)),
LET(nShort,INDEX(Table2[ShortName],sIndex),nLong,INDEX(Table2[Name],sIndex),
IF(nShort="",nLong,nShort))))
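The rule these formulas implement (pick the table from the "p" prefix, index by row number, fall back from ShortName to Name) can be sketched in Python to test the expected Party values. The function name `party` and the (Name, ShortName) tuple layout are assumptions made for this illustration; the rows are the question's sample data.

```python
# (Name, ShortName) per row, in table order; "" means a blank ShortName.
table1 = [
    ("Grace Turner", ""), ("Cadie Crawford", "Tiger"), ("Paige Johnston", ""),
    ("Dexter Payne", "Klondike"), ("Valeria Barnes", ""), ("Florrie Reed", ""),
    ("Emily Ferguson", ""), ("Sam Hawkins", "Alpha"), ("Savana Ellis", ""),
]
table2 = [
    ("Vanessa Cooper", ""), ("Jasmine Morris", "Beta"), ("Evelyn Taylor", ""),
    ("Adelaide Roberts", ""), ("Blake Cunningham", "Lion"),
    ("Adelaide Harrison", ""), ("Frederick Watson", ""),
]


def party(num_select):
    """Resolve a NumSelect value such as 'p2', 'P8' or '3' to a Party name."""
    text = str(num_select).strip()
    if text[:1].lower() == "p":
        # A leading p/P points at Table 1; the rest is the row number.
        table, row = table1, int(text[1:])
    else:
        table, row = table2, int(text)
    name, short_name = table[row - 1]  # row numbers are 1-based
    return short_name or name  # blank ShortName falls back to Name


for value in ["p2", "3", "P8", "2", "7", "p7"]:
    print(value, "->", party(value))
```

Stripping the prefix, as the LET version does with SUBSTITUTE, keeps working for row numbers above 9, whereas RIGHT(...,1) reads only the last digit.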
Welp, I figured out the formula. But it's very inefficient. I'm sure someone here could make it a lot shorter and more efficient.
Here it is:
=IF(
INDEX(FILTER(CHOOSE(IF(LOWER(LEFT([@NumSelect],1))="p",1,2),Table1[[Name]:[ShortName]],Table2[[Name]:[ShortName]]),CHOOSE(IF(LOWER(LEFT([@NumSelect],1))="p",1,2),Table1[Name],Table2[Name])<>""),SUBSTITUTE(LOWER([@NumSelect]),"p",""),3)
=0,
INDEX(FILTER(CHOOSE(IF(LOWER(LEFT([@NumSelect],1))="p",1,2),Table1[[Name]:[ShortName]],Table2[[Name]:[ShortName]]),CHOOSE(IF(LOWER(LEFT([@NumSelect],1))="p",1,2),Table1[Name],Table2[Name])<>""),SUBSTITUTE(LOWER([@NumSelect]),"p",""),1),
INDEX(FILTER(CHOOSE(IF(LOWER(LEFT([@NumSelect],1))="p",1,2),Table1[[Name]:[ShortName]],Table2[[Name]:[ShortName]]),CHOOSE(IF(LOWER(LEFT([@NumSelect],1))="p",1,2),Table1[Name],Table2[Name])<>""),SUBSTITUTE(LOWER([@NumSelect]),"p",""),3)
)
I'm trying to compile a best 5 and worst 5 list. I have two columns: column B with the number score and column C with the name. I only want the list to include the name.
In my previous attempts the formula would get the top/bottom 5 but as soon as a duplicate score appeared the first known name with that value would just repeat.
Here is my data
26 Cal
55 John
55 Mike
100 Steve
26 Thomas
100 Jaden
100 Jack
95 Josh
87 Cole
75 Brett
I've managed to get the bottom 5 list formula correct. This formula works perfectly and includes all names of duplicate scores.
Example of what I get:
Cal
Thomas
John
Mike
Brett
=INDEX($C$56:$E$70,SMALL(IF($B$56:$B$70=SMALL($B$56:$B$70,ROWS(E$2:E2)),ROW($B$56:$B$70)-ROW($B$56)+1),SUM(IF($B$56:$B$70=SMALL($B$56:$B$70,
ROWS(E$2:E2)),1,0))-SUM(IF($B$56:$B$70<=SMALL($B$56:$B$70,ROWS(E$2:E2)),1,0))+ROWS(E$2:E2)))
Here is the formula I've tried to get the top 5 - however I keep getting an error.
=INDEX($C$56:$E$70,LARGE(IF($B$56:$B$70=LARGE($B$56:$B$70,ROWS(E$2:E2)),ROW($B$56:$B$70)-ROW($B$56)+1),SUM(IF($B$56:$B$70=LARGE($B$56:$B$70,
ROWS(E$2:E2)),1,0))-SUM(IF($B$56:$B$70<=LARGE($B$56:$B$70,ROWS(E$2:E2)),1,0))+ROWS(E$2:E2)))
Example of what I'm looking for
Steve
Jaden
Jack
Josh
Cole
You can set two queries like this for both cases:
=QUERY(B56:C70,"Select C order by B desc limit 5")
=QUERY(B56:C70,"Select C order by B limit 5")
Use the SORTN() function like this:
=SORTN(A1:B10,5,,1,1)
To keep only one column, wrap the SORTN() function with INDEX() and specify the column number:
=INDEX(SORTN(A1:B10,5,,1,1),,2)
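To see why sorting the whole list avoids the repeated-name problem with tied scores, here is a small Python sketch using the question's data. It relies on Python's sort being stable, so names with equal scores keep their original order. The variable names are illustrative only.

```python
# (score, name) pairs from the question, in sheet order
scores = [
    (26, "Cal"), (55, "John"), (55, "Mike"), (100, "Steve"), (26, "Thomas"),
    (100, "Jaden"), (100, "Jack"), (95, "Josh"), (87, "Cole"), (75, "Brett"),
]

# Sort on the score only; ties keep their original order (stable sort),
# so every name appears once instead of the first tied name repeating.
bottom5 = [name for _, name in sorted(scores, key=lambda r: r[0])[:5]]
top5 = [name for _, name in sorted(scores, key=lambda r: r[0], reverse=True)[:5]]

print("Bottom 5:", bottom5)
print("Top 5:", top5)
```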
I am trying to find the names that have only one specific column value and nothing else.
I have tried filtering the rows by the column value, but that isn't what I want. I want the names of people who only went to eat pizza.
So my code should return only john and not peter, because john only had pizza.
Your description is not entirely clear. At first it looks like a simple .loc would be enough, but your sample data shows it is not that simple. You need to identify the names that have only one distinct Restaurant value, whether or not the name is duplicated, and select those rows. To do this, use nunique, check it with eq(1), and assign the result to a mask m. Finally, slice with m to get your desired output:
Your sample data:
In [512]: df
Out[512]:
Name Restaurant
0 john pizza
1 peter kfc
2 john pizza
3 peter pizza
4 peter kfc
5 peter pizza
6 john pizza
m = df.groupby('Name').Restaurant.transform('nunique').eq(1)
df[m]
Out[513]:
Name Restaurant
0 john pizza
2 john pizza
6 john pizza
If you want to show only one row, just chain an additional .drop_duplicates():
df[m].drop_duplicates()
Out[515]:
Name Restaurant
0 john pizza
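Note that the mask above keeps any name with exactly one distinct Restaurant, whatever that value is. If the value must specifically be pizza, the logic can be cross-checked without pandas, as in this plain-Python sketch. The `records` list mirrors the sample data frame, and the names used are illustrative assumptions.

```python
from collections import defaultdict

# (Name, Restaurant) rows from the sample data frame
records = [
    ("john", "pizza"), ("peter", "kfc"), ("john", "pizza"), ("peter", "pizza"),
    ("peter", "kfc"), ("peter", "pizza"), ("john", "pizza"),
]

# Collect the set of distinct restaurants per name.
visited = defaultdict(set)
for name, restaurant in records:
    visited[name].add(restaurant)

# Keep the names whose only restaurant is pizza.
only_pizza = sorted(name for name, places in visited.items() if places == {"pizza"})
print(only_pizza)
```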
Let's say I've got two tables with two columns each. In both, the first column holds a name and the second holds a string of characters with a similar pattern. It looks like this:
Table 1
Peter xxxxx01
John xxxxx01
Bill xxxxx01
William xxxxx01
Table 2
Richard xxxxx02
John xxxxx02
Bill xxxxx02
Arthur xxxxx02
Now, I'd like to compare these two tables, find values where the names are duplicated and display data stored in second columns, just like this:
(Peter excluded)
John xxxxx01 xxxxx02
Bill xxxxx01 xxxxx02
(William, Arthur excluded)
I am familiar with pivot tables, however, it won't allow doing this.
I've also tried messing with index match formulas but without much success.
Any advice?
You can use the VLOOKUP function for this.
If your "Table 1" is in B3:C6 and your "Table 2" is in F3:G6, then you can use the following formula in D3:D6 to look up the values in Table 2:
Cell D3: =IFERROR(VLOOKUP(B3,$F$3:$G$6,2,FALSE),"")
This first looks up the name from Table 1 (cell B3) in Table 2 (F3:G6) and returns the second column of Table 2 if it finds the name. If it doesn't find the name, VLOOKUP returns an error, so we wrap it in an IFERROR function and replace any error with an empty string, which looks a bit friendlier. This results in the following table:
     A    B        C         D         E    F        G
1
2         Table 1            Result         Table 2
3         Peter    xxxxxx01                 Richard  xxxxxx02
4         John     xxxxxx01  xxxxxx02       John     xxxxxx02
5         Bill     xxxxxx01  xxxxxx02       Bill     xxxxxx02
6         William  xxxxxx01                 Arthur   xxxxxx02
You can then filter on the (Non-Blanks) in column D to only get the results you're interested in.
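The lookup-then-filter steps amount to keeping only the names present in both tables. As a quick way to verify the expected rows, here is a minimal Python sketch of that matching, using the question's sample data. The dictionary layout is an assumption chosen for the example.

```python
# name -> code for each table, from the question's sample data
table1 = {"Peter": "xxxxx01", "John": "xxxxx01", "Bill": "xxxxx01", "William": "xxxxx01"}
table2 = {"Richard": "xxxxx02", "John": "xxxxx02", "Bill": "xxxxx02", "Arthur": "xxxxx02"}

# Keep only names found in both tables, in Table 1 order, with both codes.
matches = [(name, code, table2[name]) for name, code in table1.items() if name in table2]

for name, code1, code2 in matches:
    print(name, code1, code2)
```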