match multiple columns within the same row - excel

Table 1. I have a table that looks like this:
X Y Z
1 a p
2 a p
6 b p
7 c p
9 c p
Table 2. I have a different table that looks like this:
      Col1  Col2  Col3  Col4
Row1        p     p     p
Row2        a     b     c
Row3  1
Row4  2
Row5  3
Row6  4
Row7  5
Row8  6
Row9  7
Row10 8
Row11 9
I want to mark "TRUE" in a cell of Table 2 when some row of Table 1 matches all three of that cell's values: the number in Col1 (X) and the two column headers in Row2 (Y) and Row1 (Z). As a result, for example:
      Col1  Col2  Col3  Col4
Row1        p     p     p
Row2        a     b     c
Row3  1     TRUE
Row4  2     TRUE
Row5  3
Row6  4
Row7  5
Row8  6           TRUE
Row9  7                 TRUE
Row10 8
Row11 9                 TRUE
Here is what I have tried so far. This is the formula for Col2 Row3:
=IFERROR(IF(AND(AND(MATCH(Col1Row3,X:X,0), MATCH(Col2Row1,Z:Z,0)), MATCH(Col2Row2,Y:Y,0)), "TRUE", ""),"")
I think it's not working because the three matches are not constrained to the same row of Table 1. How can I achieve my result?
Also, I do not want to specify a specific row in the formula because I have thousands of rows in Table 1, and Table 2 has to select values among those thousands of rows.

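Use COUNTIFS, which only counts a row of Table 1 when all three criteria hold in that same row. Assuming Table 1 is in columns F:H and Table 2 starts in A1, put this in B3 and fill it across and down:
=IF(COUNTIFS($F:$F,$A3,$G:$G,B$2,$H:$H,B$1),TRUE,"")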
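To see why three separate MATCH calls give false positives while COUNTIFS does not, here is a minimal Python sketch of the two checks. It is only an illustration of the logic, not Excel code; the function names are made up for this example, and the data is copied from Table 1 and the headers of Table 2.

```python
# Rows of Table 1 as (X, Y, Z) tuples, copied from the question.
table1 = [(1, "a", "p"), (2, "a", "p"), (6, "b", "p"), (7, "c", "p"), (9, "c", "p")]
table1_rows = set(table1)

# Headers of Col2..Col4 in Table 2 as (Row2 value, Row1 value) pairs.
headers = [("a", "p"), ("b", "p"), ("c", "p")]


def independent_match(x, y, z):
    """Mimics three separate MATCH calls: each value may come from a different row."""
    return (
        any(row[0] == x for row in table1)
        and any(row[1] == y for row in table1)
        and any(row[2] == z for row in table1)
    )


def same_row_match(x, y, z):
    """Mimics COUNTIFS: all three values must sit in one and the same row."""
    return (x, y, z) in table1_rows


# One list of booleans (Col2, Col3, Col4) per value in Col1 of Table 2.
grid = {x: [same_row_match(x, y, z) for y, z in headers] for x in range(1, 10)}

for x, flags in grid.items():
    print(x, ["TRUE" if flag else "" for flag in flags])
```

The value 6 with headers a and p shows the difference: each value exists somewhere in Table 1, so the independent check passes, but no single row holds all three, so the same-row check fails.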

Related

excel find the count of 2 filtered columns

I am comparing paired columns (col1 and col2, col3 and col4) whose cells are either blank, '0' or '1'. I basically want to know how many ids have a value in each pair.
id col1 col2 col3 col4
id1 0 1
id2 1 1 0
id3 0 1 1
id4
id5 0
For this table I want to count how many ids have a 0 or 1 in a pair (for example, between col1 and col2). If I use COUNTA(B2:C4) I get 4, but I need 3, since only 3 ids are affected for each pair. Is there a formula that would actually give 3 for col1 and col2 and 3 for col3 and col4?
SUMPRODUCT(--(B$2:B$7+C$2:C$7=0))
fails here and provides 3 instead of 5

Grouping corresponding Rows based on One column

I have a DataFrame loaded from an Excel sheet, with no fixed number of rows and columns.
e.g.
Col1 Col2 Col3
A 1 -
A - 2
B 3 -
B - 4
C 5 -
I would like to group the rows that share the same Col1 value, like the following:
Col1 Col2 Col3
A 1 2
B 3 4
C 5 -
I am using pandas GroupBy, but not getting what I wanted.
Try using groupby (pd.np is no longer available in recent pandas versions, so import numpy directly):
import numpy as np
print(df.replace('-', np.nan).groupby('Col1', as_index=False).first().fillna('-'))
Output:
Col1 Col2 Col3
0 A 1 2
1 B 3 4
2 C 5 -
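For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The sample frame is rebuilt from the question, and keeping the values as strings is an assumption on my part so that the numbers and the '-' placeholder share one dtype.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; values are strings (an assumption)
# so that '-' and the numbers live in the same column type.
df = pd.DataFrame({
    "Col1": ["A", "A", "B", "B", "C"],
    "Col2": ["1", "-", "3", "-", "5"],
    "Col3": ["-", "2", "-", "4", "-"],
})

result = (
    df.replace("-", np.nan)            # treat the placeholder as missing
      .groupby("Col1", as_index=False)
      .first()                         # first non-missing value per column in each group
      .fillna("-")                     # put the placeholder back where a group had no value
)
print(result)
```

The key point is that groupby(...).first() skips missing values, which is why the placeholder has to be turned into NaN before grouping.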

Merge 2 Different Data Frames - Python 3.6

I want to merge two tables, and the blanks should be filled by repeating the single row of the first table.
DF1:
Col1 Col2 Col3
A B C
DF2:
Col6 Col8
1 2
3 4
5 6
7 8
9 10
I am expecting result as below:
Col1 Col2 Col3 Col6 Col8
A B C 1 2
A B C 3 4
A B C 5 6
A B C 7 8
A B C 9 10
Use assign, but then it is necessary to change the order of the columns:
df = df2.assign(**df1.iloc[0])[df1.columns.append(df2.columns)]
print (df)
Col1 Col2 Col3 Col6 Col8
0 A B C 1 2
1 A B C 3 4
2 A B C 5 6
3 A B C 7 8
4 A B C 9 10
Or concat and replace NaNs by forward filling with ffill:
df = pd.concat([df1, df2], axis=1).ffill()
print (df)
Col1 Col2 Col3 Col6 Col8
0 A B C 1 2
1 A B C 3 4
2 A B C 5 6
3 A B C 7 8
4 A B C 9 10
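Here is a runnable sketch that builds both frames from the question and checks that the two approaches give the same table. The variable names via_assign and via_concat are just for this example.

```python
import pandas as pd

# Frames rebuilt from the question.
df1 = pd.DataFrame({"Col1": ["A"], "Col2": ["B"], "Col3": ["C"]})
df2 = pd.DataFrame({"Col6": [1, 3, 5, 7, 9], "Col8": [2, 4, 6, 8, 10]})

# Approach 1: broadcast the single row of df1 as constant columns, then reorder.
via_assign = df2.assign(**df1.iloc[0])[df1.columns.append(df2.columns)]

# Approach 2: align on the index, then forward fill the gaps left by df1.
via_concat = pd.concat([df1, df2], axis=1).ffill()

print(via_assign)
print(via_concat)
```

One difference worth keeping in mind: ffill would also fill any genuine NaN values that already exist in df2, while assign leaves df2's own columns untouched.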
You can also merge both DataFrames on the index with an outer join and forward fill the data:
df2.merge(df1, left_index=True, right_index=True, how='outer').ffill()
Out:
Col6 Col8 Col1 Col2 Col3
0 1 2 A B C
1 3 4 A B C
2 5 6 A B C
3 7 8 A B C
4 9 10 A B C
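As an alternative that is not among the answers above, a cross join pairs every row of one frame with every row of the other, so no forward fill is needed. This sketch assumes pandas 1.2 or later, where merge accepts how="cross".

```python
import pandas as pd

# Frames rebuilt from the question.
df1 = pd.DataFrame({"Col1": ["A"], "Col2": ["B"], "Col3": ["C"]})
df2 = pd.DataFrame({"Col6": [1, 3, 5, 7, 9], "Col8": [2, 4, 6, 8, 10]})

# Cartesian product: 1 row x 5 rows = 5 rows, df1's columns first.
crossed = df1.merge(df2, how="cross")
print(crossed)
```

Unlike the forward-fill approaches, this also behaves sensibly if df1 ever has more than one row, although the result then grows to len(df1) * len(df2) rows.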

Sorting pivot table (multi index)

I'm trying to sort a pivot table's values in descending order after putting two "row labels" (Excel term) on the pivot.
sample data:
import numpy as np
import pandas as pd
x = pd.DataFrame({'col1': ['a', 'a', 'b', 'c', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
                  'col2': [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
                  'col3': [1, .67, 0.5, 2, .65, .75, 2.25, 2.5, .5, 2, 2.75]})
print(x)
col1 col2 col3
0 a 1 1.00
1 a 1 0.67
2 b 1 0.50
3 c 1 2.00
4 c 1 0.65
5 a 2 0.75
6 b 2 2.25
7 c 2 2.50
8 a 3 0.50
9 b 3 2.00
10 c 3 2.75
To create the pivot, I'm using the following function:
pt = pd.pivot_table(x, index = ['col1', 'col2'], values = 'col3', aggfunc = np.sum)
print(pt)
col3
col1 col2
a 1 1.67
2 0.75
3 0.50
b 1 0.50
2 2.25
3 2.00
c 1 2.65
2 2.50
3 2.75
In words, this variable pt is first sorted by col1, then by values of col2 within col1 then by col3 within all of those. This is great, but I would like to sort by col3 (the values) while keeping the groups that were broken out in col2 (this column can be any order and shuffled around).
The target output would look something like this (col3 in descending order with any order in col2 with that group of col1):
col3
col1 col2
a 1 1.67
2 0.75
3 0.50
b 2 2.25
3 2.00
1 0.50
c 3 2.75
1 2.65
2 2.50
I have tried the code below, but this just sorts the entire pivot table values and loses the grouping (I'm looking for sorting within the group).
pt.sort_values(by = 'col3', ascending = False)
For guidance, a similar question was asked (and answered) here, but I was unable to get a successful output with the provided answer:
Pandas: Sort pivot table
The error I get from that answer is ValueError: all keys need to be the same shape
You need reset_index to turn the pivot table into a flat DataFrame, then sort_values by col1 and col3, and last set_index to restore the MultiIndex:
df = (pt.reset_index()
        .sort_values(['col1','col3'], ascending=[True, False])
        .set_index(['col1','col2']))
print (df)
col3
col1 col2
a 1 1.67
2 0.75
3 0.50
b 2 2.25
3 2.00
1 0.50
c 3 2.75
1 2.65
2 2.50
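A self-contained sketch of the whole round trip, using the sample data from the question. The only liberty taken is aggfunc="sum" as a string instead of np.sum, which is an assumption to keep newer pandas versions quiet; the result name sorted_pt is made up for this example.

```python
import pandas as pd

# Sample data from the question.
x = pd.DataFrame({
    "col1": ["a", "a", "b", "c", "c", "a", "b", "c", "a", "b", "c"],
    "col2": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "col3": [1, 0.67, 0.5, 2, 0.65, 0.75, 2.25, 2.5, 0.5, 2, 2.75],
})

pt = pd.pivot_table(x, index=["col1", "col2"], values="col3", aggfunc="sum")

sorted_pt = (
    pt.reset_index()                                          # index levels become columns
      .sort_values(["col1", "col3"], ascending=[True, False])  # groups A-Z, values high-to-low
      .set_index(["col1", "col2"])                            # rebuild the MultiIndex
)
print(sorted_pt)
```

Sorting on col1 first is what keeps each group together; col3 only decides the order inside a group.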

Looking for the Max Sum, based on Criteria and Unique Values

Col1 Col2 Col3
a 3 x
b 2 x
c 2 x
a 1 x
b 3 x
c 1 y
a 2 y
b 1 y
c 3 y
Using the table above, can anyone give me a formula to find:
The max sum of Col2 where Col3 = x, summed per unique value in Col1
(The answer should be 5; it would be 4 for Col3 = y)
Create a PivotTable with Col3 as FILTERS (select x), Col1 for ROWS and Sum of Col2 for VALUES. Uncheck Show grand totals for Columns and then for whichever column contains Sum of Col2 take the maximum, say:
=MAX(F:F)
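To sanity-check the expected answers outside Excel, the same filter, group, sum, max sequence can be sketched in pandas. This is only a cross-check under the assumption that the data is exactly the table in the question; max_group_sum is a hypothetical helper name.

```python
import pandas as pd

# Table from the question.
df = pd.DataFrame({
    "Col1": list("abcabcabc"),
    "Col2": [3, 2, 2, 1, 3, 1, 2, 1, 3],
    "Col3": list("xxxxxyyyy"),
})


def max_group_sum(frame, col3_value):
    """Filter on Col3, sum Col2 per Col1 value, and return the largest sum."""
    subset = frame[frame["Col3"] == col3_value]
    return int(subset.groupby("Col1")["Col2"].sum().max())


print(max_group_sum(df, "x"))  # 5
print(max_group_sum(df, "y"))  # 4
```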
Well, it's not ideal, but it works:
In column D, put an array formula that sums Col2 over the rows sharing the current row's Col1 and Col3 values:
in D2: =MAX(IF($C$2:$C$10=C2,SUM(IF($A$2:$A$10=A2,IF($C$2:$C$10=C2,$B$2:$B$10)))))
Adjust the ranges to fit your data.
Then in E2 put this: =MAX(IF($C$2:$C$10=C2,$D$2:$D$10))
These are both array formulas so after inputting them you must press CTRL-SHIFT-ENTER not just enter.
Then drag down.
There may be a way to combine these but my array formula knowledge is limited
Here are the results:
Col1 Col2 Col3 D (sum of Col2 per Col1 and Col3) E (max of D per Col3)
a 3 x 4 5
b 2 x 5 5
c 2 x 2 5
a 1 x 4 5
b 3 x 5 5
c 1 y 4 4
a 2 y 2 4
b 1 y 1 4
c 3 y 4 4
If you don't use CTRL-SHIFT-ENTER you will get 18 and 5 all the way down.
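The two helper columns can be reproduced in pandas with transform, which is a handy way to verify the results table above. This is a sketch assuming the same nine rows as in the question; the column names D and E simply mirror the Excel columns.

```python
import pandas as pd

# Table from the question.
df = pd.DataFrame({
    "Col1": list("abcabcabc"),
    "Col2": [3, 2, 2, 1, 3, 1, 2, 1, 3],
    "Col3": list("xxxxxyyyy"),
})

# D: sum of Col2 over rows with the same Col1 and Col3 (like the formula in D2).
df["D"] = df.groupby(["Col1", "Col3"])["Col2"].transform("sum")

# E: largest D among rows with the same Col3 (like the formula in E2).
df["E"] = df.groupby("Col3")["D"].transform("max")

print(df)
```

transform returns one value per original row, which is why it lines up with formulas that are dragged down a column.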
