I have 2 Excel sheets:
Sheet1 (Value currently equals Preval):
Id Preval Value
111 1 1
123 2 2
100 3 3
Sheet2:
Id Num Date
111 5
123 6 1/1/18
100 7
I want to apply the following logic: match the 2 sheets by Id; if Date on Sheet2 is filled in, then Value on Sheet1 = Num from Sheet2, otherwise Value = Preval. Expected result:
Id Value
111 1 (same)
123 6 (update since date exists)
100 3 (same)
How would this be done using INDEX or VLOOKUP? Many thanks!
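Try this formula in the Value column on Sheet1 (it assumes Id is in column A and Preval in column B, with data starting in row 2), then copy it down: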
=IF(VLOOKUP(A2,Sheet2!$A$1:$C$4,3,0)="",B2,VLOOKUP(A2,Sheet2!$A$1:$C$4,2,0))
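To check the rule outside Excel, here is a minimal Python sketch of the same logic using the sample data from the question. The names sheet1, sheet2 and updated_value are made up for illustration, and None standing in for an empty Date cell is an assumption.

```python
# Sheet1: Id -> Preval (sample data from the question)
sheet1 = {111: 1, 123: 2, 100: 3}
# Sheet2: Id -> (Num, Date); None stands in for an empty Date cell (assumption)
sheet2 = {111: (5, None), 123: (6, "1/1/18"), 100: (7, None)}


def updated_value(id_, preval):
    """Return Num from Sheet2 when a Date exists, otherwise keep Preval."""
    num, date = sheet2[id_]
    return num if date else preval


result = {id_: updated_value(id_, preval) for id_, preval in sheet1.items()}
print(result)  # {111: 1, 123: 6, 100: 3}
```

Where VLOOKUP returns #N/A for an Id that is missing from Sheet2, this sketch raises a KeyError, so both versions expect every Id to be present on both sheets.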
I need to produce a new column that checks for matches between the m1 and m2 columns per ID and returns a character depending on the matching status. So, for each ID: if the m1 and m2 values match, return 'a'; if m1 = 'No', return 'b'; and if m2 = 'No', return 'c'.
Example below
ID m1 m2 new_col
111 1 1 a
111 2 2 a
222 1 1 a
222 No 2 b
222 2 3 a
333 1 No c
333 2 1 a
333 3 2 a
333 4 3 a
You can use a formula to calculate the new column:
Assuming your data starts at A2, here is the formula for E2:
=IF(IF(B2="No","b",IF(C2="No","c",COUNTIFS($A$2:$A$10,A2,$B$2:$B$10,C2)))=1,"a",IF(B2="No","b",IF(C2="No","c",COUNTIFS($A$2:$A$10,A2,$B$2:$B$10,C2))))
Note that the formula only checks the first 9 rows based on your sample data.
Also note that the fifth data row (222, 2, 3) matches none of the criteria, so the formula returns the raw count 0 for it instead of a letter.
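The behaviour of that formula can be traced with a short Python sketch. It is an illustration only: the cell values are stored as strings here (an assumption; in Excel the numbers would normally be numeric cells), and the names rows and new_col are hypothetical.

```python
# (ID, m1, m2) rows from the example; values kept as strings for simplicity
rows = [
    (111, "1", "1"), (111, "2", "2"),
    (222, "1", "1"), (222, "No", "2"), (222, "2", "3"),
    (333, "1", "No"), (333, "2", "1"), (333, "3", "2"), (333, "4", "3"),
]


def new_col(row_id, m1, m2):
    if m1 == "No":
        return "b"
    if m2 == "No":
        return "c"
    # COUNTIFS equivalent: rows with the same ID whose m1 equals this row's m2
    count = sum(1 for other_id, other_m1, _ in rows
                if other_id == row_id and other_m1 == m2)
    # Exactly one match gives "a"; any other count is returned unchanged
    return "a" if count == 1 else count


labels = [new_col(*row) for row in rows]
print(labels)  # ['a', 'a', 'a', 'b', 0, 'c', 'a', 'a', 'a']
```

The 0 in the fifth position is the (222, 2, 3) row: no row with ID 222 has m1 equal to 3, so the count falls through unchanged.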
How can I drop a whole group (grouped by city and district) if the date value 2018/11/1 does not exist in it, given the following dataframe:
city district date value
0 a c 2018/9/1 12
1 a c 2018/10/1 4
2 a c 2018/11/1 5
3 b d 2018/9/1 3
4 b d 2018/10/1 7
The expected result looks like this:
city district date value
0 a c 2018/9/1 12
1 a c 2018/10/1 4
2 a c 2018/11/1 5
Thank you!
Create a helper column with DataFrame.assign that flags rows whose date equals the target value, then use GroupBy.transform with 'any' to test whether each group has at least one True, and filter with the resulting mask by boolean indexing:
mask = (df.assign(new=df['date'].eq('2018/11/1'))
          .groupby(['city','district'])['new'].transform('any'))
df = df[mask]
print(df)
city district date value
0 a c 2018/9/1 12
1 a c 2018/10/1 4
2 a c 2018/11/1 5
If you get an error because of missing values in the mask, one possible fix is to replace the missing values in the columns used for grouping:
mask = (df.assign(new=df['date'].eq('2018/11/1'),
                  city=df['city'].fillna(-1),
                  district=df['district'].fillna(-1))
          .groupby(['city','district'])['new'].transform('any'))
df = df[mask]
print(df)
city district date value
0 a c 2018/9/1 12
1 a c 2018/10/1 4
2 a c 2018/11/1 5
Another idea is to add any missing index values with reindex and replace missing values in the mask with False:
mask = (df.assign(new=df['date'].eq('2018/11/1'))
          .groupby(['city','district'])['new'].transform('any'))
df = df[mask.reindex(df.index, fill_value=False).fillna(False)]
print(df)
city district date value
0 a c 2018/9/1 12
1 a c 2018/10/1 4
2 a c 2018/11/1 5
There's a special GroupBy.filter() method for this. Assuming date is already datetime:
filter_date = pd.Timestamp('2018-11-01').date()
df = df.groupby(['city', 'district']).filter(lambda x: (x['date'].dt.date == filter_date).any())
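Since the sample data stores dates as strings such as 2018/9/1, the filter approach needs a conversion step first. The following is a minimal sketch under that assumption; the '%Y/%m/%d' format string and the variable names target, by_transform and by_filter are mine, not from the answers above. It also checks that the transform-based mask and GroupBy.filter keep the same rows.

```python
import pandas as pd

# Sample data from the question, with dates still stored as strings
df = pd.DataFrame({
    'city': ['a', 'a', 'a', 'b', 'b'],
    'district': ['c', 'c', 'c', 'd', 'd'],
    'date': ['2018/9/1', '2018/10/1', '2018/11/1', '2018/9/1', '2018/10/1'],
    'value': [12, 4, 5, 3, 7],
})

# Convert to datetime so comparisons no longer depend on string formatting
df['date'] = pd.to_datetime(df['date'], format='%Y/%m/%d')
target = pd.Timestamp('2018-11-01')

# Variant 1: boolean mask broadcast to every row of each group
mask = (df['date'].eq(target)
          .groupby([df['city'], df['district']])
          .transform('any'))
by_transform = df[mask]

# Variant 2: keep only groups in which the target date occurs
by_filter = (df.groupby(['city', 'district'])
               .filter(lambda g: g['date'].eq(target).any()))

print(by_filter)
```

Comparing against a Timestamp avoids the .dt.date round trip, but it only matches rows whose time component is midnight.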
I need some help with an Excel formula or format that can help me with the following:
TABLE 1
(Row) |(a) ID | (b)FROM | (c) TO | (d) VALUE
(1) 123 0 1 50
(2) 123 1 2 50
(3) 123 2 3 50
(4) 123 3 4 50
(5) 123 4 5 60
(6) 123 5 6 60
TABLE 2
(Row) |(a) ID | (b)FROM | (c) TO | (d) VALUE
(1) 123 0 4 50
(2) 123 4 6 60
So Table 1 is split into increments of 1 (FROM and TO), whereas Table 2 contains ranges with their values. Each incremented value in Table 1 should equal the value in Table 2 whenever the Table 1 range falls within the Table 2 range.
OUTPUT
(Row) |(a) ID | (b)FROM | (c) TO | (d) VALUE
(1) 123 0 1 50 TRUE
(2) 123 1 2 50 TRUE
(3) 123 2 3 50 TRUE
(4) 123 3 4 50 TRUE
(5) 123 4 5 60 TRUE
(6) 123 5 6 60 TRUE
Basically ID'123' has a value of 50 for the increments from 0 to 4 and value of 60 for the increments from 4 – 6 and as per TABLE 2 0-4 = 50 and 4-6 = 60 therefore the statement should be TRUE.
=IF(AND(A1=table2!a:a, table1!B1>=table2!b:b,table1!a1<table!B:B),IF(table1!d2=table2!d:d, TRUE,FALSE))
Might the problem be with the way Excel deals with ranges?
I like the answer from @tigeravatar (credit to them), but if you haven't got an ordered list, I have created something below that doesn't need one. If table 2 is not ordered by ID number, you could use this formula in cell E2 and copy it down:
=IFERROR(IF(B2>=INDEX(table2!$A:$D,MATCH(D2,table2!$D:$D,0),2),IF(C2<=INDEX(table2!$A:$D,MATCH(D2,table2!$D:$D,0),3),IF(A2=INDEX(table2!$A:$D,MATCH(D2,table2!$D:$D,0),1),TRUE,FALSE))),FALSE)
This does essentially the same thing, but with a single AND instead of nested IFs:
=IFERROR(IF(AND(A2=INDEX(table2!$A:$D,MATCH(D2,table2!$D:$D,0),1),B2>=INDEX(table2!$A:$D,MATCH(D2,table2!$D:$D,0),2),C2<=INDEX(table2!$A:$D,MATCH(D2,table2!$D:$D,0),3)),TRUE,FALSE),FALSE)
On table1, cell E2, use this formula and copy down:
=D2=VLOOKUP(B2,INDEX(table2!B:B,MATCH(A2,table2!A:A,0)):INDEX(table2!D:D,MATCH(A2,table2!A:A,0)+COUNTIF(table2!A:A,A2)-1),3)
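The check the question asks for can also be spelled out in Python, which may help when testing the formulas against known data. This is a sketch of the intended rule (same ID, Table 1 range inside a Table 2 range, equal values), not a translation of any one formula above; the names table1, table2 and value_matches are hypothetical.

```python
# (ID, FROM, TO, VALUE) rows from the question
table1 = [
    (123, 0, 1, 50), (123, 1, 2, 50), (123, 2, 3, 50),
    (123, 3, 4, 50), (123, 4, 5, 60), (123, 5, 6, 60),
]
table2 = [(123, 0, 4, 50), (123, 4, 6, 60)]


def value_matches(row):
    """True if the row's VALUE equals that of the Table 2 range containing it."""
    row_id, start, end, value = row
    for ref_id, ref_start, ref_end, ref_value in table2:
        # The Table 1 interval must lie entirely inside the Table 2 interval
        if ref_id == row_id and ref_start <= start and end <= ref_end:
            return value == ref_value
    return False  # no containing range found for this ID


checks = [value_matches(row) for row in table1]
print(checks)  # [True, True, True, True, True, True]
```

A row such as (123, 3, 4, 60) would come back False, because the 0 to 4 range in Table 2 carries the value 50.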
I am trying to return unique values from a third column if the values in the first and second columns match. I have been able to get the array formula below to work:
{=IFERROR(INDEX(Funding!$A$3:$A$3384,SMALL(IF(C$3:W$3=Funding!$J4:$J3385, ROW(Funding!$A$3:$A$3384)),ROW(1:1))-1,1),"")}
However when I try to insert the FREQUENCY function I get an error:
{=IFERROR(INDEX(Funding!$A$3:$A$3384,SMALL(IF(C$3:W$3=Funding!$J3:$J3384, ROW(FREQUENCY(Funding!$A$3:$A$3384,Funding!$A$3:$A$3384)),ROW(1:1)-1),1)),"")}
Column A  Column B  Column C  Column D
11        1         1         11
22        1         1         22
33        3         3         33
44        7         0
55        5         0
66        5         0
Columns B and C have 1,1 and 3 in common therefore the indexed values in Column A (11,22 and 33) are returned in Column D.
I am looking for help to solve this excel problem.
Essentially, I want a formula for the cells in column F that searches on 3 criteria (on the cells in columns A, B and C), picks up the column D values from every row where all of the criteria match, and sums them in column F. I'd also like a count of the matches that went into the column F value, placed alongside in column G.
e.g.
IF col_A_value (anywhere in whole A column) = current_col_A_value +/- 1
AND col_B_value (anywhere in whole B column) = current_col_B_value +/- 1
AND col_C_value (anywhere in whole C column) = current_col_C_value - 1
THEN (output in column F) the sum of all values from column D where these criteria are met
(also, as a separate but related cell formula, output in column G) the total count of times this occurs.
Note: the values in columns A, B and C are all integers, and the +/- above means to search for any values that differ by +1, 0, or -1 (i.e. this includes the value itself).
e.g. If the value in cell A1 = 10, B1 = 45, C1 = 881, then the first search criteria would look for all other rows with values of 9, 10 or 11 in column A. Then based on these rows, the second search criteria would refine the search to only those rows which also include either a 44, 45 or 46 in column B, and the third search criteria would refine the search again to only include those rows where the column C value is 880.
Next, the values in the column D cells from all of these 'filtered' rows would be summed and the result placed in the column F cell. (The count of these result rows would be put in column G; a separate formula is required.)
Since these are all unique entries (think of columns A,B,C creating unique vector coordinates in space), there should be a maximum of 9 entries found and summed. A +/-1: 3 variations, B +/-1: 3 variations and C -1 only: 1 variation. In total: 3x3x1 = 9 unique rows maximum (and potentially none as a minimum, as in the below example).
(If no match is found a value of 0 is good.)
Example with A,B,C,D and E as given values, and column F values calculated (together with the count shown in col G):
A B C D E F G
1 1 1 90 8 0 0
1 2 1 80 6 0 0
1 3 1 70 1 0 0
1 4 1 60 6 0 0
2 1 1 50 1 0 0
2 2 1 40 8 0 0
2 3 1 30 6 0 0
2 4 1 20 8 0 0
3 1 1 10 8 0 0
3 2 1 11 6 0 0
3 3 1 12 1 0 0
3 4 1 13 1 0 0
1 1 2 99 8 260 4
1 2 2 89 6 360 6
1 3 2 79 1 300 6
1 4 2 69 6 180 4
2 1 2 59 1 281 6
2 2 2 49 8 393 9
etc
To illustrate how column F values are calculated here is the working:
260 = 90+80+50+40
360 = 90+80+70+50+40+30
300 = 80+70+60+40+30+20
180 = 70+60+30+20
281 = 90+80+50+40+10+11
393 = 90+80+70+50+40+30+10+11+12
Thanks a lot for any help with this!
These formulas should do what you desire:
F1: =SUMIFS(D:D,A:A,"<="&A1+1,A:A,">="&A1-1,B:B,"<="&B1+1,B:B,">="&B1-1,C:C,C1-1)
G1: =COUNTIFS(A:A,"<="&A1+1,A:A,">="&A1-1,B:B,"<="&B1+1,B:B,">="&B1-1,C:C,C1-1)
The formulas can simply be copied down as you need them...
(Still I don't know what col E is for)
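As a cross-check of the two formulas, here is a minimal Python sketch that applies the same criteria to the sample rows from the question and reproduces the F and G columns. The names rows and neighbour_sum_and_count are hypothetical; column E is left out because nothing in the criteria uses it.

```python
# (A, B, C, D) values from the example table
rows = [
    (1, 1, 1, 90), (1, 2, 1, 80), (1, 3, 1, 70), (1, 4, 1, 60),
    (2, 1, 1, 50), (2, 2, 1, 40), (2, 3, 1, 30), (2, 4, 1, 20),
    (3, 1, 1, 10), (3, 2, 1, 11), (3, 3, 1, 12), (3, 4, 1, 13),
    (1, 1, 2, 99), (1, 2, 2, 89), (1, 3, 2, 79), (1, 4, 2, 69),
    (2, 1, 2, 59), (2, 2, 2, 49),
]


def neighbour_sum_and_count(a, b, c):
    """Sum and count D over rows with A and B within 1 and C exactly c - 1."""
    matches = [d for ra, rb, rc, d in rows
               if abs(ra - a) <= 1 and abs(rb - b) <= 1 and rc == c - 1]
    return sum(matches), len(matches)


results = [neighbour_sum_and_count(a, b, c) for a, b, c, _ in rows]
for (a, b, c, _), (total, count) in zip(rows, results):
    print(a, b, c, total, count)
```

For integer data, the paired "<=" and ">=" conditions in SUMIFS and COUNTIFS are equivalent to the abs(...) <= 1 test used here.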