How to exclude null values when applying the Over function in Spotfire

This is the data I have now:
Student  Original End Date  Start Date
A        3/22/2018          3/23/2018
A        3/22/2018          3/23/2018
A        (null)             3/23/2018
A        (null)             3/23/2018
A        5/20/2018          5/21/2018
A        5/20/2018          5/21/2018
B        2/1/2018           3/1/2018
B        (null)             3/1/2018
B        2/1/2018           2/2/2018
C        3/1/2018           3/2/2018
C        3/1/2018           3/2/2018
And I would like to get a result like this:
Student  Original End Date  Start Date  Result
A        3/22/2018          3/23/2018   TRUE
A        3/22/2018          3/23/2018   TRUE
A        (null)             3/23/2018   TRUE
A        (null)             3/23/2018   TRUE
A        5/20/2018          5/21/2018   TRUE
A        5/20/2018          5/21/2018   TRUE
B        2/1/2018           3/1/2018    FALSE
B        (null)             3/1/2018    FALSE
B        2/1/2018           2/2/2018    FALSE
C        3/1/2018           3/2/2018    TRUE
C        3/1/2018           3/2/2018    TRUE
I would like the result to be True for every row of a student if Original End Date + 1 = Start Date holds for all of that student's rows with a non-null Original End Date. For example, A has nulls in Original End Date, but all of its other rows match the logic, so all of A's results are True. B also has a null Original End Date, but only one of its rows satisfies Original End Date + 1 = Start Date, so all of B's results should be False.
Here is the Spotfire calculated fields' code I have now, but it does not return the results for the null value, and for B's result, it has both True and False.
DateAdd('dd',1,Min([Original End Date]) over [Student]) = [Start Date]
I am wondering if I should add case when or if here? If so, how to do it?
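The intended per-student rule can be sketched outside Spotfire to check it against the expected table. This is a plain Python illustration, not a Spotfire expression; the function name student_flags and the tuple layout are assumptions made for the example. Rows with a null Original End Date are skipped, and one mismatch marks every row of that student False.

```python
from datetime import date, timedelta

# (student, original_end_date or None, start_date), copied from the question
rows = [
    ("A", date(2018, 3, 22), date(2018, 3, 23)),
    ("A", date(2018, 3, 22), date(2018, 3, 23)),
    ("A", None, date(2018, 3, 23)),
    ("A", None, date(2018, 3, 23)),
    ("A", date(2018, 5, 20), date(2018, 5, 21)),
    ("A", date(2018, 5, 20), date(2018, 5, 21)),
    ("B", date(2018, 2, 1), date(2018, 3, 1)),
    ("B", None, date(2018, 3, 1)),
    ("B", date(2018, 2, 1), date(2018, 2, 2)),
    ("C", date(2018, 3, 1), date(2018, 3, 2)),
    ("C", date(2018, 3, 1), date(2018, 3, 2)),
]


def student_flags(rows):
    """Return one boolean per row: True if every non-null row of that
    student satisfies Original End Date + 1 day == Start Date."""
    ok = {}
    for student, end, start in rows:
        ok.setdefault(student, True)
        if end is None:
            continue  # null end dates are ignored, as the question asks
        if end + timedelta(days=1) != start:
            ok[student] = False
    return [ok[student] for student, _, _ in rows]


result = student_flags(rows)
for (student, _, _), flag in zip(rows, result):
    print(student, flag)
```

A student whose rows all have a null Original End Date comes out True under this sketch; whether that is the desired outcome is not stated in the question.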

Related

Excel Formula based on previous rows

There are 3 columns:
Date, Name, Bonus_Point?
If a player scores a 4 or lower in the Name Column for three consecutive Dates, then Bonus_Point will return a 'Yes' or 'No'
For example, for 1/30/22, there would be a 'Yes' because there were 3 previous instances (including 1/30/22) where the score is less than or equal to 4.
But for 2/2/22, Bonus_Point? would be 'No' because on the third day, Name scored a 5.
Assuming your columns are A through C, and the row 1 is the header row and your data is in rows 2 and down, enter this formula in C4:
=AND(B2<=4,B3<=4,B4<=4)
Then fill down. (See further down for "yes" and "no")
Date     Name  Bonus_Point?
1/28/22  3
1/29/22  3
1/30/22  3     TRUE
1/31/22  3     TRUE
2/1/22   4     TRUE
2/2/22   5     FALSE
2/3/22   2     FALSE
2/4/22   5     FALSE
2/5/22   4     FALSE
2/6/22   3     FALSE
2/7/22   2     TRUE
2/8/22   3     TRUE
2/9/22   4     TRUE
2/10/22  3     TRUE
2/11/22  2     TRUE
2/12/22  2     TRUE
3/13/22  3     TRUE
If you want "Yes" and "No", you can do that through formatting or add it to the formula:
=IF(AND(B2<=4,B3<=4,B4<=4),"Yes","No")
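The same three-row window check can be sketched in Python to verify the table above. This is only an illustration of the logic behind the AND formula; the function name bonus_flags and its parameters are assumptions made for the example, and the scores are copied from the Name column.

```python
# Scores from the Name column, in date order
scores = [3, 3, 3, 3, 4, 5, 2, 5, 4, 3, 2, 3, 4, 3, 2, 2, 3]


def bonus_flags(scores, window=3, limit=4):
    """For each row, check whether it and the previous window-1 rows
    are all <= limit. Rows without enough history get None, like the
    blank cells in the first two data rows."""
    flags = []
    for i in range(len(scores)):
        if i < window - 1:
            flags.append(None)
        else:
            recent = scores[i - window + 1 : i + 1]
            flags.append(all(s <= limit for s in recent))
    return flags


flags = bonus_flags(scores)
# Same mapping as the IF(..., "Yes", "No") variant of the formula
labels = [None if f is None else ("Yes" if f else "No") for f in flags]
print(labels)
```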

Excel Filter with formula

I have the following data:
Date Day Ranch
25/05/2018 Friday FALSE
26/05/2018 Saturday TRUE
27/05/2018 Sunday FALSE
28/05/2018 Monday FALSE
29/05/2018 Tuesday TRUE
30/05/2018 Wednesday FALSE
I would like to have a formula which scans the Ranch column for the lowermost TRUE value and remembers its corresponding date, then scans the Ranch column for the second-lowest TRUE value and remembers its corresponding date, and then subtracts the first date from the second date.
To put it a bit more simply, I want to add a column to this table which tells me the days since the last TRUE value occurred, so the resulting table should look something like this:
Date Day Ranch Days since last Ranch
25/05/2018 Friday FALSE 0 (Hardcoded)
26/05/2018 Saturday TRUE 0
27/05/2018 Sunday FALSE 1
28/05/2018 Monday FALSE 2
29/05/2018 Tuesday TRUE 0
30/05/2018 Wednesday FALSE 1
How could this be done?
Assuming the data above is in A2:C7, you can try the formula below in D2 and fill it down:
=IF(C2,0,IFERROR(A2-LOOKUP(2,1/$C$1:C1,$A$1:A1),"0/Unknown"))
I have assumed that column C values are Boolean values.
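For comparison, the "days since the last TRUE" logic can be sketched in Python. This is an illustration of the idea rather than a translation of the LOOKUP formula; the function name days_since_last_true is an assumption, and None stands in for the "0/Unknown" case where no earlier TRUE exists.

```python
from datetime import date

# (date, ranch) pairs from the question, dates written as day/month/year there
rows = [
    (date(2018, 5, 25), False),
    (date(2018, 5, 26), True),
    (date(2018, 5, 27), False),
    (date(2018, 5, 28), False),
    (date(2018, 5, 29), True),
    (date(2018, 5, 30), False),
]


def days_since_last_true(rows):
    """Return, per row, the number of days since the most recent TRUE
    (0 on a TRUE row, None if no TRUE has occurred yet)."""
    last_true = None
    out = []
    for day, flag in rows:
        if flag:
            last_true = day
            out.append(0)
        elif last_true is None:
            out.append(None)  # nothing to look back to
        else:
            out.append((day - last_true).days)
    return out


result = days_since_last_true(rows)
print(result)
```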

pandas create a Boolean column for a df based on one condition on a column of another df

I have two dfs, A and B. A is like,
date id
2017-10-31 1
2017-11-01 2
2017-08-01 3
B is like,
type id
1 1
2 2
3 3
I would like to create a new boolean column has_b for A. Its value should be True if the corresponding row in B (A joined to B on id) does not have type == 1 and the row's date is more than 90 days before the current date (datetime.utcnow()), and False otherwise. Here is my solution:
B = B[B['type'] != 1]
A['has_b'] = A.merge(B[['id', 'type']], how='left', on='id')['date'].apply(lambda x: datetime.utcnow().day - x.day > 90)
A['has_b'].fillna(value=False, inplace=True)
expect to see A result in,
date id has_b
2017-10-31 1 False
2017-11-01 2 False
2017-08-01 3 True
I am wondering if there is a better way to do this, in terms of more concise and efficient code.
First merge A and B on id -
i = A.merge(B, on='id')
Now, compute has_b -
x = i.type.ne(1)
y = (pd.to_datetime('today') - i.date).dt.days.gt(90)
i['has_b'] = (x & y)
Merge back i and A -
C = A.merge(i[['id', 'has_b']], on='id')
C
date id has_b
0 2017-10-31 1 False
1 2017-11-01 2 False
2 2017-08-01 3 True
Details
x will return a boolean mask for the first condition.
i.type.ne(1)
0 False
1 True
2 True
Name: type, dtype: bool
y will return a boolean mask for the second condition. Use to_datetime('today') to get the current date, subtract this from the date column, and access the days component with dt.days.
(pd.to_datetime('today') - i.date).dt.days.gt(90)
0 False
1 False
2 True
Name: date, dtype: bool
In case A and B's IDs do not align, you may need a left merge instead of an inner merge for the last step -
C = A.merge(i[['id', 'has_b']], on='id', how='left')
C's has_b column will contain NaNs in this case.
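The steps above can be put together into one self-contained sketch. It assumes a fixed reference date (2017-11-15, chosen only so that the output is reproducible and matches the expected table) in place of pd.to_datetime('today'), and it uses .eq(True) to turn any NaN left by the left merge into False; both choices are assumptions of this example, not part of the original answer.

```python
import pandas as pd

A = pd.DataFrame(
    {
        "date": pd.to_datetime(["2017-10-31", "2017-11-01", "2017-08-01"]),
        "id": [1, 2, 3],
    }
)
B = pd.DataFrame({"type": [1, 2, 3], "id": [1, 2, 3]})

# Assumption: a fixed "today" so the result does not depend on when this runs
today = pd.Timestamp("2017-11-15")

# Inner merge, then combine the two boolean masks
i = A.merge(B, on="id")
not_type_1 = i["type"].ne(1)
older_than_90 = (today - i["date"]).dt.days.gt(90)
i["has_b"] = not_type_1 & older_than_90

# Left merge back onto A; ids missing from B would yield NaN,
# which .eq(True) converts to False
C = A.merge(i[["id", "has_b"]], on="id", how="left")
C["has_b"] = C["has_b"].eq(True)
print(C)
```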

Excel: Consider only cells with given value - Recursive formula

I'm trying to make a formula that lets me easily extrapolate a quality within a subset.
Let's say I have the following set of data:
Week Name Accepted? Accept Week?
1 a TRUE
1 b TRUE
1 c TRUE
2 d FALSE
2 e TRUE
2 f TRUE
3 g FALSE
3 h FALSE
3 i FALSE
Three weeks, three entries each
I'm trying to make a formula that fills Column 4:
Week 1 would be TRUE because all three entries (B2:B4) are accepted
Week 2 has a non accepted entry, therefore all three entries (B5:B7) are FALSE
Week 3 is false as well in Accept Week (B8:B10)
I would appreciate any tip you can give to me.
Use this formula:
=COUNTIFS(A:A,A2,C:C,TRUE) = COUNTIF(A:A,A2)
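The formula compares two counts per week: the rows that are accepted and all rows. A rough Python equivalent of that comparison, with the function name accept_week assumed for the example, looks like this:

```python
from collections import defaultdict

# (week, name, accepted) rows from the question
rows = [
    (1, "a", True), (1, "b", True), (1, "c", True),
    (2, "d", False), (2, "e", True), (2, "f", True),
    (3, "g", False), (3, "h", False), (3, "i", False),
]


def accept_week(rows):
    """Per row: True only if every entry in that row's week is accepted,
    i.e. accepted count == total count, as in COUNTIFS(...) = COUNTIF(...)."""
    total = defaultdict(int)
    accepted = defaultdict(int)
    for week, _, ok in rows:
        total[week] += 1
        if ok:
            accepted[week] += 1
    return [accepted[week] == total[week] for week, _, _ in rows]


result = accept_week(rows)
print(result)
```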

Excel array query

I'm struggling to understand the mechanics of a particular array formula. I have a row of data ranging from January 2015 to December 2016. Let's assume the data is populated up to October 2016 and the sum in October is £1,000. When data is entered into November 2016, say £1,250, the formula below automatically calculates the delta between the two months. How does the formula do that? Could someone provide a simple explanation of the formula below, in particular how it knows to deduct the prior month from the latest month?
=(INDEX(60:60,MAX(IF(M60:AV60<>"",COLUMN(M60:AV60)))))-(INDEX(60:60,MAX(IF(M60:AV60<>"",COLUMN(M60:AV60)-1))))
Thanks for your help,
Miles
It's a little complex, but let's break it down a piece at a time.
This looks to be an array formula, which means that rather than dealing with a single cell, it can deal with a whole set of cells at once.
M60:AV60<>"" This segment produces an array (list) of TRUE and FALSE values, looking at each cell between M60 and AV60. Wherever the cell contains a value - i.e. is not blank - it returns TRUE. Wherever the cell does not contain a value, it returns FALSE. This list exists only in the program's working memory, and it isn't recorded anywhere in the sheet. So we have something like this:
TRUE
TRUE
TRUE
TRUE
TRUE
TRUE
FALSE
FALSE
FALSE
FALSE
FALSE
COLUMN(M60:AV60) This segment produces another array, the same size as the TRUE/FALSE array above, that simply contains the column numbers of every cell from M60 to AV60. We now have two lists - one containing TRUE/FALSE, and one containing numbers, both the same length. (The numbers below are simplified to start at 1; on the sheet, column M is column 13, so the real list runs from 13 to 48.)
TRUE | 1
TRUE | 2
TRUE | 3
TRUE | 4
TRUE | 5
TRUE | 6
FALSE | 7
FALSE | 8
FALSE | 9
FALSE | 10
FALSE | 11
IF(M60:AV60<>"",COLUMN(M60:AV60)) This IF statement combines the TRUE/FALSE array with the column numbers array to get something more useful. Wherever there is a TRUE in the first array, it is replaced with the corresponding number from the second array; wherever there is a FALSE in the first array, nothing is changed, and the value stays at FALSE. This way, we end up with a list of numbers, representing the columns of each non-blank cell. It's the equivalent of running the IF formula on all the members of the array.
IF | TRUE |THEN| 1 = 1
IF | TRUE |THEN| 2 = 2
IF | TRUE |THEN| 3 = 3
IF | TRUE |THEN| 4 = 4
IF | TRUE |THEN| 5 = 5
IF | TRUE |THEN| 6 = 6
IF | FALSE |THEN| 7 = 0
IF | FALSE |THEN| 8 = 0
IF | FALSE |THEN| 9 = 0
IF | FALSE |THEN| 10 = 0
IF | FALSE |THEN| 11 = 0
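The last column, after the =, is what is passed to the MAX function. (The FALSE rows are shown as 0 for simplicity; the IF actually returns FALSE there, and MAX ignores logical values inside an array, so they never affect the result.)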
MAX(IF(M60:AV60<>"",COLUMN(M60:AV60))) This segment cuts down the list of numbers to just one number, the Max or highest number in the list. Thus we end up with a single result, which represents the last column that contains a value.
INDEX(60:60,MAX(IF(M60:AV60<>"",COLUMN(M60:AV60)))) The INDEX function looks at all of row 60 and returns the value from a specified column in that row. Here, that column is the one returned by the previous segments - the last column that contains a value.
The second half of the formula with the second INDEX function does exactly the same thing, but it subtracts 1 from the column number returned - that is, it gets the second-to-last column that has a value.
The end result is subtracting the second-to-last value from the last value, to get the difference between them.
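The walk-through above can be mirrored in a short Python sketch. The list stands in for cells M60:AV60; only the 1000 and 1250 values come from the question, the earlier values are made up for illustration, and the function name latest_delta is an assumption. Like the formula, it takes the cell immediately before the last filled one, which is the previous month only if there are no gaps.

```python
# Stand-in for row 60: filled months followed by blanks (None).
# 1000 and 1250 are from the question; 800 and 900 are hypothetical.
row = [800.0, 900.0, 1000.0, 1250.0, None, None]


def latest_delta(cells):
    """Find the last non-blank position (the MAX(IF(...)) step) and
    subtract the value one position earlier (the COLUMN(...)-1 step)."""
    filled = [idx for idx, value in enumerate(cells)
              if value is not None and value != ""]
    if not filled or max(filled) == 0:
        raise ValueError("need at least two cells up to the last filled one")
    last = max(filled)
    return cells[last] - cells[last - 1]


delta = latest_delta(row)
print(delta)
```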
