Excel: take ID from one list to another list - excel

I want to combine two lists with the ID from one of them but haven't found a solution for this. These are my two lists:
List 1:
NAME | CAR
john | audi
karen | opel
List 2:
CAR | ID
audi | 1
opel | 2
what I want is the ID to get from list 2 to list 1, so that my list 1 will look like:
NAME | CAR | ID
john | audi | 1
and so on…
How can I do this in Excel?

Use procv function
Exemple
=VLOOKUP(CAR;LIST2;2;0)
where CAR is cell with name of car and LIST2 is range of cell with list 2.

Related

Pandas, combine unique value from two column into one column while preserving order

I have data in four column as shown below. There are some values which are present in column 1, and some value of column 1 is again duplicated in column 3. I would like to combine column 1 with 3, while removing the duplicates from column 3. I would also like to preserve the order of column. Column 1 is associated with column 2 and column 3 is associated with column 4, so it would be nice if I can move column 1 items with column 2 and column 3 items with column 4 during merge. Any help will be appreciated.
Input table:
Item
Price
Item
Price
Car
105
Truck
54822
Chair
20
Pen
1
Cup
2
Car
105
Glass
1
Output table:
Item
Price
Car
105
Chair
20
Cup
2
Truck
54822
Pen
1
Glass
1
Thank you in advance.
After separating the input table into the left and right part, we can concatenate the left hand items with the unduplicated right hand items quite simply with boolean indexing:
import pandas as pd
# this initial section only recreates your sample input table
from io import StringIO
input = pd.read_table(StringIO("""| Item | Price | Item | Price |
|-------|-------|------|-------|
| Car | 105 | Truck| 54822 |
| Chair | 20 | Pen | 1 |
| Cup | 2 | Car | 105 |
| | | Glass| 1 |
"""), ' *\| *', engine='python', usecols=[1,2,3,4], skiprows=[1], keep_default_na=False)
input.columns = list(input.columns[:2])*2
# now separate the input table into the left and right part
left = input.iloc[:,:2].replace("", pd.NA).dropna().set_index('Item')
right = input.iloc[:,2:] .set_index('Item')
# finally construct the output table by concatenating without duplicates
output = pd.concat([left, right[~right.index.isin(left.index)]])
Price
Item
Car 105
Chair 20
Cup 2
Truck 54822
Pen 1
Glass 1

Spreadsheet - Im trying to find the sum of the top n numbers in two columns

-UPDATED- Answered, thanks for all who helped.
Consider the following Google spreadsheet:
A B C D E
1 John | Bob | Sue | Tony
2 h1 2 | 1 | 3 | 2
3 h2 3 | 3 | 4 | 2
4 h3 1 | 2 | 1 | 3
5 h4 2 | 2 | 3 | 1
6 h5 2 | 1 | 1 | 3
7 h6 1 | 2 | 2 | 1
8 h7 1 | 2 | 1 | 3
Team | Player1 | Player 2 | Score
1 | John | Sue | ?
2 | Bob | Tony | ?
Each team is made of two partners, e.g. John and Sue. Each row contains a match: the team's score is the best of each member's. The team total score of the game is the sum of the match scores.
In the example:
Team 1 : John & Sue. Match scores: (3,4,1,3,2,2,1). Total score = 16.
Team 2 : Bob & Tony. Match scores: (2,3,3,2,3,2,3). Total score = 18.
Another example would be two golfers working as team and the best score between them is counted per hole, the at the end we add those up.
Can this be done using a spreadsheet?
To get the desired result, the formula comes up quite complicated:
=SUMPRODUCT(IF(MMULT((B12=$B$1:$E$1)*$B$2:$E$8,ROW(A1:A4)^0)>MMULT((C12=$B$1:$E$1)*$B$2:$E$8,ROW(A1:A4)^0),(B12=$B$1:$E$1)*$B$2:$E$8,(C12=$B$1:$E$1)*$B$2:$E$8))
but works in both Excel and GS
In Excel you can use the LARGE() function. It is the easiest option but a bit verbose.
If you want to the sum of the top 3 values in a column/row:
= large(A1:A10, 1), large(A1:A10, 2) + large(A1:A10, 3)
In Excel
If one has the new dynamic array formula LET():
=LET(x,INDEX($B$2:$E$8,0,MATCH(I2,$B$1:$E$1,0)),y,INDEX($B$2:$E$8,0,MATCH(J2,$B$1:$E$1,0)),SUMPRODUCT(((x>y)*x)+((y>=x)*(y))))
Else
=SUMPRODUCT(((INDEX($B$2:$E$8,0,MATCH(I2,$B$1:$E$1,0))>INDEX($B$2:$E$8,0,MATCH(J2,$B$1:$E$1,0)))*INDEX($B$2:$E$8,0,MATCH(I2,$B$1:$E$1,0)))+((INDEX($B$2:$E$8,0,MATCH(J2,$B$1:$E$1,0))>=INDEX($B$2:$E$8,0,MATCH(I2,$B$1:$E$1,0)))*(INDEX($B$2:$E$8,0,MATCH(J2,$B$1:$E$1,0)))))
Either of the following formulas will produce the desired sums of 16 and 18 (tested on my machine):
=ArrayFormula(SUM(IF($B$2:$B$8>$D$2:$D$8,$B$2:$B$8,$D$2:$D$8)))
=SUMPRODUCT(IF($B$2:$B$8>$D$2:$D$8,$B$2:$B$8,$D$2:$D$8))
Adjust to B->C and D->E for Bob + Tony. These formulas work by operating on arrays. They evaluate the IF statement once per cell in the B2:B8 range and generate an array of values ({3,4,1,3,2,2,1}). Then SUM or SUMPRODUCT will sum those values. ArrayFormula is necessary to force SUM to deal with the IF as an array.
Further customization can be built from here as desired. Play around with ArrayFormula and SUMPRODUCT as they have much more powerful use cases than this and have parallels in other spreadsheet softwares including Excel.

I have data stored in excel where I need to sort that data

In excel, I have data divided into
Year Code Class Count
2001 RAI01 LNS 9
2001 RAI01 APRP 4
2001 RAI01 3
2002 RAI01 BPR 3
2002 RAI01 BRK 3
2003 RAI01 URE 3
2003 CFCOLLTXFT APRP 2
2003 CFCOLLTXFT BPR 2
2004 CFCOLLTXFT GRL 2
2004 CFCOLLTXFT HDS 2
2005 RAI HDS 2
where I need to find the top 3 products for that particular customer for that particular year.
The real trick here is to rank each row based on a group.
Your rank is determined by your Count column (Column D).
Your group is determined by your Year and Code (I think) columns (Column A and B respectively).
You can use this gnarly sumproduct() formula to get a rank (Starting at 1) based on the Count for each Group.
So to get a ranking for each Year and Code from 1 to whatever, in a new column next to this data:
=SUMPRODUCT(($A$2:$A$50=A2)*(B2=$B$2:$B$50)*(D2<$D$2:$D$50))+1
And copy that down. Now you can AutoFilter on this to show all rows that have a rank less than 4. You can sort this on Customer, then Year and you should have a nice list of top 3 within each year/code.
Explanation of sumproduct.
Sumproduct goes row by row and applies the math that is defined for each row. When it is done it sums the results.
As an example, take the following worksheet:
+---+---+---+
| | A | B |
+---+---+---+
| 1 | 1 | 1 |
| 2 | 1 | 4 |
| 3 | 2 | 2 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
+---+---+---+
`=SUMPRODUCT((A1:A5)*(B1:B5))`
This sumproduct will take A1*B1, A2*B2, A3*B3, A4*B4, A5*B5 and then add those five results up to give you a number. That is 1 + 4 + 4 + 4 + 1 = 15
It will also work on conditional/boolean statements returning, for each row/condition a 1 or a 0 (for True and False, which is a "Boolean" value).
As an example, take the following worksheet that holds the type of publication in a library and a count:
+---+----------+---+
| | A | B |
+---+----------+---+
| 1 | Book | 1 |
| 2 | Magazine | 4 |
| 3 | Book | 2 |
| 4 | Comic | 1 |
| 5 | Pamphlet | 2 |
+---+----------+---+
=SUMPRODUCT((A1:A5="Book")*(B1:B5))
This will test to see if A1 is "Book" and return a 1 or 0 then multiple that result by whatever is B1. Then continue for each row in the range up to row 5. The result will 1+0+2+0+0 = 3. There are 3 books in the library (it's not a very big library).
For this answer's sumproduct:
So ($A$2:$A$50=A2) says to return a 1 if A2=A2 or a 0 if A2<>A2. It does that for A2 through A50 comparing it to A2, returning a 1 or a 0.
(B2=$B$2:$B$50) will test each cell B2 through B50 to see if it is equal to B2 and return a 1 or 0 for each test.
The same is true for (D2<$D$2:$D$50) but it's testing to see if the count is less than the current cells count.
So... essentially this is saying "For all the rows 1 through 50, test to find all the other rows that have the same value in Column A and B AND have a count less than this rows count. Count all of those rows up that meet that criteria, and add 1 to it. This is the rank of this row within its group."
Copying this formula has it redetermine that rank for each row allowing you to rank and filter.

Formula for count of distinct values, multiple conditions, one of which = or <> all repeating values

Excel formula (I know this may work with a pivot table, but wanting a formula) to count distinct values. If this is my table in Excel:
Region | Name | Criteria
------ | ------ | ------
1 | Jill | A
1 | Jill | A
1 | John | B
1 | John | A
2 | Jane | B
2 | Jane | B
2 | Bill | A
2 | Bill | B
3 | Mary | B
3 | Mary | B
3 | Gary | A
3 | Gary | A
In this example, I have the following formual to calculate the distinct values within each region =SUM(--(FREQUENCY(IF((Table1[Region]=A2)*(Table1[Name]<>""),MATCH(Table1[Name],Table1[Name],0)),ROW(Table1[Name])-ROW(Table!B2)+1)>0)) which results in 2 each (Region 1=Jill & John; 2=Jane & Bill, 3=Mary & Gary, each distinct name counted once).
I have an addition formula to calculate how many distinct values with criteria where there is at least 1 "B" for each distinct name within each region, by adding *(Table1[Category]="B") after <>"") ... in this example, it would return Region 1=1, Region 2=2, 3=1, because Jill nor Gary do not have "B" - all others have at least one "B".
Now I'm getting stuck on my last formula, where I want to count how many distinct values within each Region have ALL B's in all their occurrences. The outcome should be Region 1=0 (Jill has no B's and John has a B, but also has an A), Region 2=1 (Jane appears twice, counts as 1 distinct value, and both occurrences are B, Bill has a B in one of his), and 3=1 (Mary has all Bs).
It's too complex for a formula-only task, but feasible.
The following array formula does the job. Although you did not specify it, but I suppose that if "Mary" has an A in another region, this should not cancel her counting in region 3, so long as all records with name "Mary" in region 3 have a "B". In other words, names can repeat in different regions but should not interfere across regions (which made the formula even longer. I added a test case for this, Mary in region 4 with an A did not interfere with Mary in region 3).
=SUM(IF((Table1[Region]=Table1[#Region])*(0=COUNTIFS(Table1[Region],Table1[#Region],
Table1[Name],Table1[Name],Table1[Criteria],"<>B")), 1/COUNTIFS(Table1[Name],Table1[Name],
Table1[Criteria],"B",Table1[Region],Table1[#Region]), 0))
Enter it then press CtrlShiftEnter. then copy/paste down the column.

excel, combine multiple columns into one row

Am new to this so any help would be greatly appreciated.
I have an excel spread sheet with multiple columns, and I want to combine several columns into one column but maintain the same row. i.e.
from:
Name Address Phone1 Phone2 Phone3
Joe box 5 123-456-7890 Null 312-778-2564
Sue 3 w 2nd ST. 345-789-3214 156-879-5461 278-444-5687
Mike box 12 Null 666-879-4518 777-548-9851
To:
name address Phone
Joe box 5 123-456-7890
Null
312-778-2564
Sue 3 w 2nd ST. 345-789-3214
156-879-5461
278-444-5687
Mike box 12 Null
666-879-4518
777-548-9851
Say your information is in columns A and B so:
A | B
1 | z
2 | y
3 | x
To combine A and B in column C you would write the following in cell C1 and copy it down
=A1&B1
The final result would be
A | B | C
1 | z | 1z
2 | y | 2y
3 | x | 3x

Resources