In VBA, check matches by grouping variable - excel

I need to produce a new column that checks, per ID, for matches between the m1 and m2 columns and returns a character depending on the matching status. For each ID: if the m1 and m2 values match, return 'a'; if m1 = 'No', return 'b'; and if m2 = 'No', return 'c'.
Example below:
ID m1 m2 new_col
111 1 1 a
111 2 2 a
222 1 1 a
222 No 2 b
222 2 3 a
333 1 No c
333 2 1 a
333 3 2 a
333 4 3 a

You can use a formula to calculate the new column:
Assuming your data starts at A2, here is the formula for E2:
=IF(IF(B2="No","b",IF(C2="No","c",COUNTIFS($A$2:$A$10,A2,$B$2:$B$10,C2)))=1,"a",IF(B2="No","b",IF(C2="No","c",COUNTIFS($A$2:$A$10,A2,$B$2:$B$10,C2))))
Note that the formula only checks the first 9 rows based on your sample data.
Also note that the 5th line (222\2\3) has an unknown result because none of the criteria match.
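The decision logic of the nested formula can also be sketched outside Excel. The following is a minimal Python sketch, not part of the original answer; the function name label_rows and the tuple layout are assumptions made for illustration. Like the formula, it returns the raw count instead of a letter when none of the rules apply.

```python
from collections import defaultdict

def label_rows(rows):
    """rows: list of (id, m1, m2) tuples; returns one label per row."""
    # Count how often each m1 value occurs within each ID,
    # mirroring COUNTIFS over the ID and m1 columns.
    m1_counts = defaultdict(lambda: defaultdict(int))
    for row_id, m1, _ in rows:
        m1_counts[row_id][m1] += 1

    labels = []
    for row_id, m1, m2 in rows:
        if m1 == "No":
            labels.append("b")
        elif m2 == "No":
            labels.append("c")
        elif m1_counts[row_id][m2] == 1:
            labels.append("a")
        else:
            # No rule applies: keep the raw count, as the formula does
            labels.append(m1_counts[row_id][m2])
    return labels

# Sample data from the question
sample = [
    (111, 1, 1), (111, 2, 2),
    (222, 1, 1), (222, "No", 2), (222, 2, 3),
    (333, 1, "No"), (333, 2, 1), (333, 3, 2), (333, 4, 3),
]
labels = label_rows(sample)
print(labels)
```

The fifth row (222, 2, 3) comes back as 0 because no m1 value of 3 exists for ID 222.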


if specific value/string occurs in the entire dataframe I want to sum its index values

I have a dataframe in which I need to find a specific image name and sum the index value of every occurrence. My dataframe looks like this:
c 1 2 3 4
g
0 180731-1-61.jpg 180731-1-61.jpg 180731-1-61.jpg 180731-1-61.jpg
1 1209270004-2.jpg 180609-2-31.jpg 1209270004-2.jpg 1209270004-2.jpg
2 1209270004-1.jpg 180414-2-38.jpg 180707-1-31.jpg 1209050002-1.jpg
3 1708260004-1.jpg 1209270004-2.jpg 180609-2-31.jpg 1209270004-1.jpg
4 1108220001-5.jpg 1209270004-1.jpg 1108220001-5.jpg 1108220001-2.jpg
I need to find 1209270004-2.jpg in the entire dataframe. It occurs at index 1 and index 3, once in each of the four columns, so adding the index value of every occurrence should give 1+3+1+1=6.
I tried this code:
import numpy as np

img_fname = '1209270004-2.jpg'
# keep the rows that contain the file name in any column
df2 = df1[df1.eq(img_fname).any(axis=1)]
total = int(np.sum(df2.index.values))
print(total)
I am getting a sum of 4 (i.e. 1+3=4), but it should be 6.
If the string occurs in only some of the columns, for example 180707-1-31.jpg, which appears only in column 3 (at index 2), then the sum should be 45+45+2+45 = 137. In other words, for every column where the string is not present, use 45 instead of an index value.
You can multiply the boolean mask by the index values and then sum:
img_fname = '1209270004-1.jpg'
s = df1.eq(img_fname).mul(df1.index.to_series(), 0).sum()
print (s)
1 2
2 4
3 0
4 3
dtype: int64
out = np.where(s == 0, 45, s).sum()
print (out)
54
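One edge case with the approach above: a match at index 0 contributes 0 to the column sum, so a check like s == 0 cannot tell it apart from a column with no match. The following is a minimal sketch that decides "missing" from the mask itself. The helper name index_sum_with_default is hypothetical, and the dataframe is rebuilt here from the question's sample data.

```python
import pandas as pd

def index_sum_with_default(df, value, default=45):
    """Sum the index labels where `value` occurs, column by column,
    substituting `default` for columns that never contain it."""
    mask = df.eq(value)
    # Multiply each boolean column by the index labels, then sum per column
    per_column = mask.mul(df.index.to_series(), axis=0).sum()
    # Detect missing columns from the mask, so a hit at index 0 still counts
    found = mask.any(axis=0)
    return int(per_column.where(found, default).sum())

# Sample data from the question
df1 = pd.DataFrame({
    1: ["180731-1-61.jpg", "1209270004-2.jpg", "1209270004-1.jpg",
        "1708260004-1.jpg", "1108220001-5.jpg"],
    2: ["180731-1-61.jpg", "180609-2-31.jpg", "180414-2-38.jpg",
        "1209270004-2.jpg", "1209270004-1.jpg"],
    3: ["180731-1-61.jpg", "1209270004-2.jpg", "180707-1-31.jpg",
        "180609-2-31.jpg", "1108220001-5.jpg"],
    4: ["180731-1-61.jpg", "1209270004-2.jpg", "1209050002-1.jpg",
        "1209270004-1.jpg", "1108220001-2.jpg"],
})

print(index_sum_with_default(df1, "1209270004-1.jpg"))  # 2 + 4 + 45 + 3
print(index_sum_with_default(df1, "180731-1-61.jpg"))   # found at index 0 in every column
```

For 180731-1-61.jpg the sketch returns 0, because every column contains the name at index 0; a zero-based check would have replaced each column with 45.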
If the dataset does not have many columns, a simple loop over the columns also works for your original question:
import pandas as pd

df1 = pd.DataFrame({"A":["aa","ab", "cd", "ab", "aa"], "B":["ab","ab", "ab", "aa", "ab"]})
s = 0
for col in df1.columns:
    # add the index labels of the rows where this column equals "ab"
    s += sum(df1.index[df1[col] == "ab"].tolist())
Input :
A B
0 aa ab
1 ab ab
2 cd ab
3 ab aa
4 aa ab
Output :11
Based on second requirement:

How to find a value in one column in another column and return cells in row where value was found

I want to find a value in column E, then get the values from columns B, C, and D in the same row. So I want to find 1 in column E and return the values from B, C, and D, then find 2 in column E and return the values from B, C, and D, and so on all the way through 18.
I have tried VLOOKUP and INDEX/MATCH.
B C D E
1 4 365 3
2 5 464 2
3 3 151 15
4 4 417 1
5 4 284 7
F G H I
1 4 4 417
2 2 5 464
3 1 4 365
G2: =INDEX($B$2:$E$6,MATCH($F2,$E$2:$E$6,0),1)
H2: =INDEX($B$2:$E$6,MATCH($F2,$E$2:$E$6,0),2)
I2: =INDEX($B$2:$E$6,MATCH($F2,$E$2:$E$6,0),3)
and fill down as far as needed, adjusting the array address as required.
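To make the INDEX/MATCH pairing concrete, here is a minimal Python sketch of the same lookup over the sample data. The function name lookup_by_e and the tuple layout are assumptions for illustration only.

```python
# Source table: one tuple per row, holding columns B, C, D, E
rows = [
    (1, 4, 365, 3),
    (2, 5, 464, 2),
    (3, 3, 151, 15),
    (4, 4, 417, 1),
    (5, 4, 284, 7),
]

def lookup_by_e(rows, key):
    """Return (B, C, D) from the first row whose E equals key, else None."""
    for b, c, d, e in rows:
        if e == key:          # like MATCH(key, E, 0): exact match, first hit wins
            return (b, c, d)  # like INDEX(..., 1), INDEX(..., 2), INDEX(..., 3)
    return None               # Excel would show #N/A here

results = {key: lookup_by_e(rows, key) for key in (1, 2, 3)}
print(results)
```

The three results correspond to the G, H, and I values shown for F = 1, 2, and 3.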

Excel Formula: check for every Col1 unique value, if COL2 has duplicates, then Col3 should not be unique

Consider the following data. I have added help text to each data row to explain the scenarios. Bill is column A, Ref is column B, and so on.
Form is the formula result; Excptns is the result I get with my current formula; Expected output is what I'm looking for.
Bill Ref Ref2 HelpText Form Form Excptns Expected output
123 557 123 Scenario1 1 1 FALSE FALSE
123 589 123-1 Scenario1 1 1 FALSE FALSE
123 591 123-2 Scenario1 2 1 TRUE FALSE
123 591 123-3 Scenario2 2 1 TRUE TRUE
124 432 124 Scenario1 1 1 FALSE FALSE
124 433 124-1 Scenario1 1 1 FALSE FALSE
Scenario1:
Since these rows have the same bill number and a unique Ref number, the Ref2 column has to be unique.
Scenario2:
Since these rows have the same bill number and the same Ref number, Ref2 cannot be unique: it should be 123-2, not 123-3. I need to identify all such cases using an Excel formula.
The duplicate-row scenario will not occur because we are taking a pivot.
My effort:
A B C D E F G
Bill Ref Ref2 HelpText Formula Formula Formula
123 557 123 Scenario1 =countIFS(A:A,A2,B:B,B2)
123 589 123-1 Scenario1
123 591 123-2 Scenario1
123 592 123-2 Scenario2
I used =COUNTIFS(A:A,A2,B:B,B2) in column E
and =COUNTIFS(B:B,B2,C:C,C2) in column F.
The column G formula is =AND(IF(E2>1,"TRUE","FALSE"),IF(F2=1,"TRUE","FALSE"))
When I filter column G by TRUE/FALSE (TRUE means an exception),
I get both instances as output. However, I'm not interested in the first occurrence, because it is valid. I am only interested in the second and any later instances.
Please help me arrive at a formula that ignores the first occurrence. I cannot use VBA, only Excel formulas, and there is no limit on the number of columns.

excel match index if exists

I have 2 excel sheets:
Sheet1: (Value = Prevalue)
Id Preval Value
111 1 1
123 2 2
100 3 3
Sheet2:
Id Num Date
111 5
123 6 1/1/18
100 7
I want to apply the following logic: matching the two sheets by Id, if Date on Sheet2 exists, then Value on Sheet1 = Num on Sheet2; otherwise Value = Preval.
Id Value
111 1 (same)
123 6 (update since date exists)
100 3 (same)
How would this be done using index or vlookup? Many thanks!
Try this formula:
=IF(VLOOKUP(A2,Sheet2!$A$1:$C$4,3,0)="",B2,VLOOKUP(A2,Sheet2!$A$1:$C$4,2,0))
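The formula does two lookups by Id: one to test whether the Date is empty and one to fetch Num. A minimal Python sketch of the same rule over the sample data follows; the names sheet1, sheet2, and updated_value are assumptions for illustration.

```python
# Sheet1 rows: (Id, Preval, Value)
sheet1 = [(111, 1, 1), (123, 2, 2), (100, 3, 3)]

# Sheet2 keyed by Id: Id -> (Num, Date); an empty string stands for a blank cell
sheet2 = {111: (5, ""), 123: (6, "1/1/18"), 100: (7, "")}

def updated_value(row_id, preval, lookup):
    """Return Num when a Date exists for row_id, otherwise keep Preval."""
    num, date = lookup[row_id]  # a missing Id raises KeyError, much like #N/A
    return preval if date == "" else num

updated = {row_id: updated_value(row_id, preval, sheet2)
           for row_id, preval, _ in sheet1}
print(updated)
```

Only Id 123 has a date on Sheet2, so it is the only row whose value changes.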

PowerPivot field of same Row in Calculation

I'm trying to write a formula that, for every row, returns the first matching entry.
An example table would look like this:
Column A Column B Column C Expected Output from Formula
3 99 P 18 P 4
4 88 P 144 P 1
2 77 P 2
2 77 P 2
1 88 P 1 P 1
1 99 P 4 P 4
2 44 P 5
3 22 P 7
1 88 P 99 P 1
Now, column D should always show Column C of the first row where Column A = 1 and Column B has the same value as the current row (99 for the first row, 88 for the second, 77 for the third...).
I tried it with the following formula:
=CALCULATE(
FIRSTNONBLANK('Table'[Column C]; TRUE());
FILTER('Table';'Table'[Column A]=1);
FILTER('Table';'Table'[Column B]='Table'[Column B])
)
This doesn't work. There are no errors, but it ignores the second filter.
If I replace the "='Table'[Column B]" with the number it should use (99, 88, 77...), it shows the correct result. But since that is a static number, it shows the same result in every row instead of recalculating it for each row.
Can someone help?
Try this:
= CALCULATE(FIRSTNONBLANK('Table'[Column C], TRUE()),
FILTER(FILTER('Table','Table'[Column A]=1),'Table'[Column B] = EARLIER('Table'[Column B])))
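The key change is that the inner comparison uses the Column B value of the row being calculated, which EARLIER supplies, instead of comparing the column with itself. The intended per-row result can be sketched in Python as follows. This is an illustration of the expected output, not of how DAX evaluates the expression: it takes "first" to mean first in row order, as the question describes, and the function name first_c_for_same_b is hypothetical.

```python
# Sample rows from the question: (Column A, Column B, Column C)
table = [
    (3, 99, "P 18"), (4, 88, "P 144"), (2, 77, "P 2"),
    (2, 77, "P 2"), (1, 88, "P 1"), (1, 99, "P 4"),
    (2, 44, "P 5"), (3, 22, "P 7"), (1, 88, "P 99"),
]

def first_c_for_same_b(rows):
    """For each row, return Column C of the first row with A == 1 and the same B."""
    first_c = {}
    for a, b, c in rows:
        # Remember only the first qualifying Column C per Column B value
        if a == 1 and b not in first_c:
            first_c[b] = c
    # Look up each row's own Column B; None stands for a blank result
    return [first_c.get(b) for _, b, _ in rows]

expected_column = first_c_for_same_b(table)
print(expected_column)
```

Rows whose Column B value never appears with Column A = 1 (77, 44, and 22 here) stay blank, matching the expected output in the question.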
