Excel Formula based on previous rows - excel

There are 3 columns:
Date, Name, Bonus_Point?
If a player scores 4 or lower in the Name column on three consecutive dates, Bonus_Point? should return 'Yes'; otherwise it should return 'No'.
For example, for 1/30/22 there would be a 'Yes' because the three most recent scores (including 1/30/22) are all less than or equal to 4.
But for 2/2/22, Bonus_Point? would be 'No' because on the third day, Name scored a 5.

Assuming your columns are A through C, row 1 is the header row, and your data starts in row 2, enter this formula in C4:
=AND(B2<=4,B3<=4,B4<=4)
Then fill down. (See further down for "yes" and "no")
Date Name Bonus_Point?
1/28/22 3
1/29/22 3
1/30/22 3 TRUE
1/31/22 3 TRUE
2/1/22 4 TRUE
2/2/22 5 FALSE
2/3/22 2 FALSE
2/4/22 5 FALSE
2/5/22 4 FALSE
2/6/22 3 FALSE
2/7/22 2 TRUE
2/8/22 3 TRUE
2/9/22 4 TRUE
2/10/22 3 TRUE
2/11/22 2 TRUE
2/12/22 2 TRUE
3/13/22 3 TRUE
If you want "Yes" and "No", you can do that through formatting or add it to the formula:
=IF(AND(B2<=4,B3<=4,B4<=4),"Yes","No")
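To cross-check the formula outside Excel, here is a minimal pandas sketch of the same three-row window. It assumes the scores from the table above have been loaded into a Series; the names scores, ok, bonus and labels are illustrative only. Unlike the sheet, where C2 and C3 are left blank, the first two rows come out as "No" here because the window is incomplete.

```python
import pandas as pd

# Scores copied from the Name column of the table above
scores = pd.Series([3, 3, 3, 3, 4, 5, 2, 5, 4, 3, 2, 3, 4, 3, 2, 2, 3])

# 1 where the score is 4 or lower, 0 otherwise
ok = scores.le(4).astype(int)

# A row qualifies when it and the two rows before it are all 4 or lower,
# which mirrors AND(B2<=4,B3<=4,B4<=4) filled down from C4.
# Incomplete windows produce NaN, and NaN == 3 is False.
bonus = ok.rolling(3).sum().eq(3)

# Same idea as wrapping the AND in IF(..., "Yes", "No")
labels = bonus.map({True: "Yes", False: "No"})
print(labels.tolist())
```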

Related

Python: How to set n previous rows as True if row x is True in a DataFrame

My df (using pandas):
Value Class
1 False
5 False
7 False
2 False
4 False
3 True
2 False
If a row has Class as True, I want to set all n previous rows as true as well. Let's say n = 3, then the desired output is:
Value Class
1 False
5 False
7 True
2 True
4 True
3 True
2 False
I've looked up similar questions but they seem to focus on adding new columns. I would like to avoid that and just change the values of the existing one. My knowledge is pretty limited so I don't know how to tackle this.
The idea is to replace False with missing values using Series.where, then back fill with Series.bfill and its limit parameter, and finally replace the remaining missing values with False and convert the column to boolean:
n = 3
df['Class'] = df['Class'].where(df['Class']).bfill(limit=n).fillna(0).astype(bool)
print (df)
Value Class
0 1 False
1 5 False
2 7 True
3 2 True
4 4 True
5 3 True
6 2 False
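As a cross-check, the same output can be reached with a different technique: a rolling maximum over the reversed column. This is only a sketch and not the method used in the answer above. It assumes the sample frame from the question, and the variable name flags is made up for illustration.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "Value": [1, 5, 7, 2, 4, 3, 2],
    "Class": [False, False, False, False, False, True, False],
})
n = 3

# Work on 0/1 integers so that rolling().max() is well defined
flags = df["Class"].astype(int)

# Reverse the column, take the max over each row plus the n rows that
# follow it in the original order, then reverse back.
# A row becomes True if it or any of the next n rows was True.
df["Class"] = (
    flags[::-1]
    .rolling(n + 1, min_periods=1)
    .max()[::-1]
    .astype(bool)
)
print(df)
```

The existing column is overwritten in place, so no extra column is left behind.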

Identify and count alternating parts of a column in a (timeseries) dataframe

I am analyzing trades done in a futures contract, based on a csv file with a list of trades (columns are Side, Qty, Price, Date).
I have imported the file and sorted the trades chronologically by time. The column "Side" (BUY/SELL) is now:
B
S
S
B
B
S
S
B
B
B
B
I want to give each run (consecutive block) of B's and each run of S's a unique number, so that I can group the individual runs for further analysis. For example, I want to find the average price of each run of B's and each run of S's.
In the example above there are 5 runs in total: 3 of B's and 2 of S's. The first run of B's should be 1, the second run of B's should be 3, and the last run of B's should be 5. Basically I want to add a column with this output:
1
2
2
3
3
4
4
5
5
5
5
Now I should be able to find the average price of the four B's in group number 5 using groupby with the new column as argument and mean().
But how can I make the counter needed for this new column? I am able to identify each change using something like np.where(), diff(), abs() + cumsum() and 1 and -1, but I don't see how I can add +1 at each alternation.
Compare each value with the previous one using Series.shift and Series.ne (not equal), then take the cumulative sum of the result with Series.cumsum:
df['new'] = df['Side'].ne(df['Side'].shift()).cumsum()
How it works:
df = df.assign(shifted=df['Side'].shift(),
               mask=df['Side'].ne(df['Side'].shift()),
               new=df['Side'].ne(df['Side'].shift()).cumsum())
print(df)
Side shifted mask new
0 B NaN True 1
1 S B True 2
2 S S False 2
3 B S True 3
4 B B False 3
5 S B True 4
6 S S False 4
7 B S True 5
8 B B False 5
9 B B False 5
10 B B False 5
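Once the counter exists, the per-group average the question asks about is a plain groupby. The sketch below is self-contained; the Price values are made up purely for illustration, and the column name run and the result name summary are assumptions, not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "Side": list("BSSBBSSBBBB"),
    # Hypothetical prices, invented for this example only
    "Price": [10.0, 11.0, 13.0, 12.0, 14.0, 15.0, 17.0, 16.0, 18.0, 20.0, 22.0],
})

# A new group starts whenever Side differs from the previous row
df["run"] = df["Side"].ne(df["Side"].shift()).cumsum()

# One line per group: which side it was, its mean price, and its size
summary = df.groupby("run").agg(
    side=("Side", "first"),
    avg_price=("Price", "mean"),
    trades=("Price", "count"),
)
print(summary)
```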

How to find the number of rows that has been updated in pandas

How can we find the number of rows that got updated in pandas?
New['Updated']= np.where((New.Class=='B')&(New.Flag=='Y'),'N',np.where((New.Class=='R')&(New.Flag=='N'),'Y',New.Flag))
data.Flag=data['Tracking_Nbr'].map(New.set_index('Tracking_Nbr').Updated)
You need to store Flag before the change; here I use a new column, Flag1:
df2['Updated']=np.where((df2.Class=='B')&(df2.Flag=='Y'),'N',np.where((df2.Class=='R')&(df2.Flag=='N'),'Y',df2.Flag))
df1['Flag1']=df1['Flag']
df1.Flag=df1['Tracking_Nbr'].map(df2.set_index('Tracking_Nbr').Updated)
df1[df1['Flag1']!=df1['Flag']]
More information
df1['Flag1']!=df1['Flag']
Out[716]:
0 True
1 True
2 True
3 True
4 True
5 True
6 True
dtype: bool
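To turn that boolean mask into the count the question asks for, sum it. The sketch below is self-contained and uses small made-up frames, since the original data is not shown; df1, df2 and their column names follow the answer above, while changed and n_updated are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this example only
df1 = pd.DataFrame({"Tracking_Nbr": [1, 2, 3, 4],
                    "Flag": ["Y", "N", "Y", "N"]})
df2 = pd.DataFrame({"Tracking_Nbr": [1, 2, 3, 4],
                    "Class": ["B", "R", "R", "B"],
                    "Flag": ["Y", "N", "Y", "N"]})

# Same update rule as in the answer above
df2["Updated"] = np.where((df2.Class == "B") & (df2.Flag == "Y"), "N",
                 np.where((df2.Class == "R") & (df2.Flag == "N"), "Y", df2.Flag))

# Keep the old values, then apply the update
df1["Flag1"] = df1["Flag"]
df1["Flag"] = df1["Tracking_Nbr"].map(df2.set_index("Tracking_Nbr")["Updated"])

# True counts as 1, so the sum of the mask is the number of changed rows
changed = df1["Flag1"] != df1["Flag"]
n_updated = int(changed.sum())
print(n_updated)
```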

Excel: Consider only cells with given value - Recursive formula

I'm trying to make a formula that lets me easily extrapolate a quality within a subset.
Let's say I have the following set of data:
Week Name Accepted? Accept Week?
1 a TRUE
1 b TRUE
1 c TRUE
2 d FALSE
2 e TRUE
2 f TRUE
3 g FALSE
3 h FALSE
3 i FALSE
Three weeks, three entries each
I'm trying to make a formula that fills Column 4:
Week 1 would be TRUE because all three of its entries (C2:C4) are accepted
Week 2 has a non-accepted entry, therefore Accept Week is FALSE for all three entries (C5:C7)
Week 3 is FALSE as well in Accept Week (C8:C10)
I would appreciate any tip you can give to me.
Use this formula:
=COUNTIFS(A:A,A2,C:C,TRUE) = COUNTIF(A:A,A2)
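The formula compares, for each row's week, the number of accepted entries with the total number of entries. For comparison, the same "all entries in the group are TRUE" test can be sketched in pandas. This assumes the table above is loaded into a DataFrame; the column names Accepted and AcceptWeek are stand-ins for the sheet headers.

```python
import pandas as pd

# Table from the question
df = pd.DataFrame({
    "Week": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Name": list("abcdefghi"),
    "Accepted": [True, True, True, False, True, True, False, False, False],
})

# For every row, check whether all entries of its week are accepted.
# transform broadcasts the per-week result back to each row.
df["AcceptWeek"] = df.groupby("Week")["Accepted"].transform("all")
print(df)
```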

Excel 2013 complex countif formula

I have a source sheet set up like this:
Days Open Month
10 1
4 1
6 1
2 1
4 2
2 2
-1 2
4 3
6 3
7 4
3 4
etc
I'm trying to set up a formula to count rows based on the following criteria:
cells in Days Open column <=5 and <>-1 where the month is either 2, 3, or 4 (the worksheet will eventually have month numbers up to 12, and I need to group results quarterly). The total must then be divided by the total of ALL rows in which 2, 3, or 4 appears in the Month column.
I can't seem to get the first part of the COUNTIFS to work with both criteria... this is what I have so far that I'm trying to make work:
=COUNTIFS('Cumulative Complaints'!K:K,"<=5",'Cumulative Complaints'!K:K,"<>-1")/(COUNTIF('Cumulative Complaints'!L:L,"2")+COUNTIF('Cumulative Complaints'!L:L,"3")+COUNTIF('Cumulative Complaints'!L:L,"4"))
I've been looking around here and other excel forums and think maybe SUMPRODUCT is the way to go? I haven't been able to get that to work though, given the criteria needed on the Days Open column (<=5 and <>-1).
Try this SUMPRODUCT() formula:
=SUMPRODUCT(('Cumulative Complaints'!K:K<=5)*('Cumulative Complaints'!K:K<>-1)*('Cumulative Complaints'!L:L>=2)*('Cumulative Complaints'!L:L<=4))/SUMPRODUCT(('Cumulative Complaints'!L:L>=2)*('Cumulative Complaints'!L:L<=4))
In SUMPRODUCT, the AND condition is expressed with *. All four conditions must be TRUE for a row to contribute 1: 1*1*1*1 = 1, whereas if any condition is FALSE the product is 0, for example 1*1*0*1 = 0. As the formula iterates through the rows, each row contributes a 1 or a 0 to the sum.
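The multiply-then-sum idea can be illustrated outside Excel with NumPy. This is a sketch of the arithmetic only, using the sample rows from the question; the array and variable names are made up for the example.

```python
import numpy as np

# Sample rows from the question
days_open = np.array([10, 4, 6, 2, 4, 2, -1, 4, 6, 7, 3])
month = np.array([1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4])

# Each condition becomes an array of 1s and 0s
c1 = (days_open <= 5).astype(int)
c2 = (days_open != -1).astype(int)
c3 = (month >= 2).astype(int)
c4 = (month <= 4).astype(int)

# Multiplying acts as AND: a row yields 1 only if all four are 1
numerator = int((c1 * c2 * c3 * c4).sum())

# Denominator: every row whose month is 2, 3 or 4
denominator = int((c3 * c4).sum())

ratio = numerator / denominator
print(numerator, denominator, ratio)
```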
Wrapping a COUNTIF or COUNTIFS function in a SUM function allows you to use an array of constants as OR criteria.
=SUM(COUNTIFS('Cumulative Complaints'!K:K, "<>"&-1,'Cumulative Complaints'!K:K, "<="&5,'Cumulative Complaints'!L:L, {2,3,4}))/SUM(COUNTIF('Cumulative Complaints'!L:L, {2,3,4}))
This is not an array formula and does not require CSE.
My answer would be to take a different approach.
Excel has a very powerful feature called Pivot Tables, and I think it might be a good fit for your problem and other similar problems you may face.
First, I would add a couple columns to your table, like so:
Days Open Month Quarter RecentlyOpened
10 1 1 FALSE
4 1 1 TRUE
6 1 1 FALSE
2 1 1 TRUE
4 2 1 TRUE
2 2 1 TRUE
-1 2 1 FALSE
4 3 1 TRUE
6 3 1 FALSE
7 4 2 FALSE
3 4 2 TRUE
The formula for Quarter is: =CEILING(B2/3,1)
The formula for RecentlyOpened is: =AND(A2<>-1,A2<=5)
Second, select the table, and do Insert > Pivot Table.
Third, drag from the fields to the boxes, like so:
Drag Quarter to the ROWS box
Drag RecentlyOpened to the FILTERS box
Drag Month to the VALUES box
Fourth, click Sum of Month, and select Value Field Settings to change Sum to Count.
Fifth, set the RecentlyOpened filter to TRUE.
The result is a pivot table that shows, for each quarter, the count of rows where RecentlyOpened is TRUE.
Pivot Tables often provide a solution that is more flexible and easier to read and understand versus complex formulas.
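For readers working in pandas rather than Excel, the helper columns and the filtered count can be sketched as follows. This assumes the sample rows above; the column name DaysOpen stands in for "Days Open", and the integer expression for Quarter is an equivalent of CEILING(B2/3,1) for months 1 to 12.

```python
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    "DaysOpen": [10, 4, 6, 2, 4, 2, -1, 4, 6, 7, 3],
    "Month": [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4],
})

# Months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
df["Quarter"] = (df["Month"] + 2) // 3

# Same test as =AND(A2<>-1,A2<=5)
df["RecentlyOpened"] = (df["DaysOpen"] != -1) & (df["DaysOpen"] <= 5)

# Filter to RecentlyOpened == TRUE, then count rows per quarter
counts = df[df["RecentlyOpened"]].groupby("Quarter")["Month"].count()
print(counts)
```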
