Difference between last and first in Spotfire - spotfire

I have the following data set with 2 columns - Period, Score
Period Score
3/1/2016 2
3/1/2017 3
12/1/2018 3
3/1/2016 3
3/1/2017 3
12/1/2018 3
3/1/2016 2
3/1/2017 3
12/1/2018 4
3/1/2016 2
3/1/2017 3
12/1/2018 4
3/1/2016 2
3/1/2017 2
I am looking for an expression that computes the difference in average score between the first and last period. In the above example,
Average Score in first period = Avg(score) in 3/1/2016 = (2+3+2+2)/4 = 2.25
Average Score in last period = Avg(score) in 12/1/2018 = (3+3+4+4)/4 = 3.5
Difference in average score between first and last period = 3.5 - 2.25 = 1.25

Calculated column 1: Average Score Over Period
Avg([Score]) OVER ([Period])
Calculated column 2: Difference
Max([Average Score Over Period]) - Min([Average Score Over Period])
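For comparison, here is a minimal pandas sketch of the same calculation, not a Spotfire expression. It assumes the rows exactly as listed in the question; that listing contains five 3/1/2016 rows, so the first-period average comes out as 2.2 rather than the 2.25 in the worked example. The names avg_by_period, first_last_diff and max_min_diff are illustrative only. Note that Max minus Min of the per-period averages equals last minus first only when the averages rise steadily from the first period to the last, as they do here.

```python
import pandas as pd

# Rows as listed in the question (Period, Score).
rows = [
    ("3/1/2016", 2), ("3/1/2017", 3), ("12/1/2018", 3),
    ("3/1/2016", 3), ("3/1/2017", 3), ("12/1/2018", 3),
    ("3/1/2016", 2), ("3/1/2017", 3), ("12/1/2018", 4),
    ("3/1/2016", 2), ("3/1/2017", 3), ("12/1/2018", 4),
    ("3/1/2016", 2), ("3/1/2017", 2),
]
df = pd.DataFrame(rows, columns=["Period", "Score"])

# Parse the dates so that periods sort chronologically, not alphabetically.
df["Period"] = pd.to_datetime(df["Period"], format="%m/%d/%Y")

# Average score per period, ordered from first to last period.
avg_by_period = df.groupby("Period")["Score"].mean().sort_index()

# Last-period average minus first-period average.
first_last_diff = avg_by_period.iloc[-1] - avg_by_period.iloc[0]

# Max minus min of the per-period averages (what the two calculated columns compute).
max_min_diff = avg_by_period.max() - avg_by_period.min()

print(avg_by_period)
print(first_last_diff, max_min_diff)
```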

Related

Spotfire calculate difference with respect to previous row value

I have data as below. I created the column "Difference in values" manually: the calculation is the value at 8:15 AM minus the value at 8:00 AM, which is 2 in the second row, and so on for all values of the names Tushar and Lohit respectively. How can I do this calculation in Spotfire? I believe the OVER and Previous functions can help, but I am unable to find anything on this. Please help.
Name Time Values Difference in values
Tushar 08:00 AM 2 0
Tushar 08:15 AM 4 2
Tushar 08:30 AM 5 1
Tushar 08:45 AM 6 1
Tushar 09:00 AM 7 1
Lohit 08:00 AM 2 0
Lohit 08:15 AM 4 2
Lohit 08:30 AM 5 1
Lohit 08:45 AM 6 1
This should work
SN([Values] - Max([Values]) over (Intersect(Previous([Time]),[Name])),0)
where Max(..) is just to have an aggregation, since it is only looking at the previous Time row for each value of Name. [so Min would work just as well].
SN(...) is there to set the result to 0 when it is empty (as in the first row of each Name).
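To check the expected numbers outside Spotfire, the same logic can be sketched in pandas. This is only an illustration of the idea (previous row within each Name, empty result replaced by 0), assuming the sample rows from the question; it is not a Spotfire feature.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tushar"] * 5 + ["Lohit"] * 4,
    "Time": ["08:00 AM", "08:15 AM", "08:30 AM", "08:45 AM", "09:00 AM",
             "08:00 AM", "08:15 AM", "08:30 AM", "08:45 AM"],
    "Values": [2, 4, 5, 6, 7, 2, 4, 5, 6],
})

# Parse the times so that "previous" means the previous point in time.
df["Time"] = pd.to_datetime(df["Time"], format="%I:%M %p")

# Difference to the previous row within each Name (like Intersect(Previous([Time]), [Name])).
diff = df.sort_values("Time").groupby("Name")["Values"].diff()

# The first row of each Name has no previous row; replace the empty result with 0 (like SN(..., 0)).
df["Difference in values"] = diff.fillna(0).astype(int)

print(df)
```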

Time manipulations

Hello, I have to count how many people were scheduled during each hour in Excel. I transformed the starting and ending date/time values so that they contain only the time, and I tried to subtract one from the other, but that only gives me the number of hours. What I need is a list of the individual hours. Instead of this:
starting on 9:00
ending on 17:00
I need this:
9:00
10:00
11:00
12:00
13:00
14:00
15:00
16:00
17:00
to count every hour that employee was at work. But I don't know how :(
Or is there a better way of doing that?
Assuming your table looks something like this:
Person         Start  End    09:00  10:00  11:00  12:00  13:00  14:00  15:00
Alice          08:35  16:35  1      1      1      1      1      1      1
Bob            09:35  17:35  0      1      1      1      1      1      1
Carl           10:35  18:35  0      0      1      1      1      1      1
Dan            11:35  19:35  0      0      0      1      1      1      1
Ed             12:35  20:35  0      0      0      0      1      1      1
Total present                1      2      3      4      5      5      5
You can compute the entries 0 or 1 in each cell under the times using the formula
=IF(AND((E$4>$C6);(E$4<=$D6));1;0)
In the formula, E$4 is a reference to the column header, e.g. "9:00", and $C6 and $D6 are references to the start and end times of the person. They are written as partially absolute (mixed) references, using $, so the same formula can be copied and pasted into all the cells.
The result will be 1 if the person was present at that time and 0 if not.
The "Total present" formulas just sum up the 1's and 0's in the column.
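The same test (header time later than the start and no later than the end) can be sketched in Python to check the table by hand. This is an illustration only, assuming the start and end times shown above; the names shifts, present and total_present are made up for the example.

```python
from datetime import time

# Start and end time per person, as in the table above.
shifts = {
    "Alice": (time(8, 35), time(16, 35)),
    "Bob": (time(9, 35), time(17, 35)),
    "Carl": (time(10, 35), time(18, 35)),
    "Dan": (time(11, 35), time(19, 35)),
    "Ed": (time(12, 35), time(20, 35)),
}

# Column headers 09:00 through 15:00.
hours = [time(h) for h in range(9, 16)]

def present(start, end, slot):
    """Mirror of IF(AND(slot > start; slot <= end); 1; 0)."""
    return 1 if start < slot <= end else 0

# One row of 0/1 flags per person.
presence = {
    person: [present(start, end, slot) for slot in hours]
    for person, (start, end) in shifts.items()
}

# "Total present": sum of each column.
total_present = [sum(column) for column in zip(*presence.values())]

print(total_present)
```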

Pandas : Finding correct time window

I have a pandas dataframe which gets updated every hour with latest hourly data. I have to filter out IDs based upon a threshold, i.e. PR_Rate > 50 and CNT_12571 < 30 for 3 consecutive hours from a lookback period of 5 hours. I was using the below statements to accomplish this:
df_thld=df[(df['Date'] > df['Date'].max() - pd.Timedelta(hours=5))& (df.PR_Rate>50) & (df.CNT_12571 < 30)]
df_thld.loc[:,'HR_CNT'] = df_thld.groupby('ID')['Date'].nunique().to_frame('HR_CNT').reset_index()
df_thld[df_thld['HR_CNT'] > 3]
The problem with this approach is that, since the lookback period is 5 hours, HR_CNT can also count non-consecutive hours that breach this criterion.
My dataset is as below:
DataFrame
Date IDs CT_12571 PR_Rate
16/06/2021 10:00 A1 15 50.487
16/06/2021 11:00 A1 31 40.806
16/06/2021 12:00 A1 25 52.302
16/06/2021 13:00 A1 13 61.45
16/06/2021 14:00 A1 7 73.805
In the above DataFrame, the threshold was not breached at 11:00, but the count treats 10:00, 12:00 and 13:00 as the hours that breached the threshold instead of 12:00, 13:00 and 14:00 as required. Each ID may or may not breach this criterion in a single day. Any idea how I can fix this issue?
Please excuse me if I have misinterpreted your problem. As I understand the issue, you have a dataframe which is updated hourly. An example of this dataframe is shown below as df. From this dataframe, you want to keep only those rows which satisfy the following two conditions:
PR_Rate > 50 and CNT_12571 < 30
If and only if the threshold is surpassed for three consecutive hours
Given these assumptions, I would proceed as follows:
df:
Date IDs CT_1257 PR_Rate
0 2021-06-16 10:00:00 A1 15 50.487
1 2021-06-16 12:00:00 A1 31 40.806
2 2021-06-16 14:00:00 A1 25 52.302
3 2021-06-16 15:00:00 A1 13 61.450
4 2021-06-16 16:00:00 A1 7 73.805
Note that in this dataframe, the only time frame which satisfies the above conditions consists of the entries for 14:00, 15:00 and 16:00.
import pandas as pd

def filterFrame(df, dur, pr_threshold, ct_threshold):
    # Keep only the rows that breach both thresholds.
    ff = df[(df['CT_1257'] < ct_threshold) & (df['PR_Rate'] > pr_threshold)].reset_index()
    # For each remaining row, count the rows within the trailing dur-hour window.
    ml = list(ff.rolling(f'{dur}h', on='Date').count()['IDs'])
    r = len(ml) - 1
    rows = []
    # Walk backwards and collect every full window of dur rows.
    while r >= 0:
        if int(ml[r]) < dur:
            r -= 1
        else:
            k = int(ml[r])
            for i in range(k):
                rows.append(r - i)
            r -= k
    rows = rows[::-1]
    return ff.filter(items=rows, axis=0).reset_index()
Running filterFrame(df, 3, 50, 30) yields:
level_0 index Date IDs CT_1257 PR_Rate
0 1 2 2021-06-16 14:00:00 A1 25 52.302
1 2 3 2021-06-16 15:00:00 A1 13 61.450
2 3 4 2021-06-16 16:00:00 A1 7 73.805
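As an alternative that also covers several IDs at once, here is a hedged sketch that labels runs of consecutive breaching hours with groupby and cumsum. It is a different technique from the function above, not a rewrite of it. It assumes the column names from the question's data listing (Date, IDs, CT_12571, PR_Rate) and exactly one row per ID per hour; the function name consecutive_breaches and its parameters are invented for the example.

```python
import pandas as pd

def consecutive_breaches(df, hours=3, lookback=5, pr_threshold=50, ct_threshold=30):
    # Keep only the lookback window, ordered by ID and time.
    recent = df[df["Date"] > df["Date"].max() - pd.Timedelta(hours=lookback)]
    recent = recent.sort_values(["IDs", "Date"])

    # True where both thresholds are breached.
    breach = (recent["PR_Rate"] > pr_threshold) & (recent["CT_12571"] < ct_threshold)

    # A run is broken by a new ID or by a gap other than exactly one hour.
    broken = recent.groupby("IDs")["Date"].diff() != pd.Timedelta(hours=1)

    # Start a new run label at every break or non-breaching row.
    run_id = (broken | ~breach).cumsum()

    # Number of breaching rows in each run.
    run_len = breach.groupby(run_id).transform("sum")

    # Keep breaching rows that belong to a long enough run.
    return recent[breach & (run_len >= hours)]

# Sample data from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["16/06/2021 10:00", "16/06/2021 11:00", "16/06/2021 12:00",
         "16/06/2021 13:00", "16/06/2021 14:00"],
        format="%d/%m/%Y %H:%M",
    ),
    "IDs": ["A1"] * 5,
    "CT_12571": [15, 31, 25, 13, 7],
    "PR_Rate": [50.487, 40.806, 52.302, 61.45, 73.805],
})

result = consecutive_breaches(df)
print(result)
```

With the question's data this keeps the 12:00, 13:00 and 14:00 rows and drops the isolated breach at 10:00.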

Calculating Cumulative Average every x successive rows in Excel( not to be confused with Average every x rows gap interval)

I want to calculate a cumulative average over every 3 rows of the Value field. The figure above shows the Cumulative average column, which is the expected output. I tried the OFFSET method, but it gives the average at every 3-row gap interval and not the cumulative average of every 3 consecutive rows.
Use Series.rolling with mean and then Series.shift:
import pandas as pd
N = 3
df = pd.DataFrame({'Value': [6,9,15,3,27,33]})
df['Cum_sum'] = df['Value'].rolling(N).mean().shift(-N+1)
print (df)
Value Cum_sum
0 6 10.0
1 9 9.0
2 15 15.0
3 3 21.0
4 27 NaN
5 33 NaN

Power BI - DAX equivalent of averageif

How can I get a DAX version of AVERAGEIF for the following data table?
Week NScheduled Ave per week
1 1 1
1 1 1
1 1 1
1 1 1
2 6 3.5
2 1 3.5
3 4 2.666666667
3 3 2.666666667
3 1 2.666666667
Is it a simple average for each week?
You can use this formula for a calculated column:
Ave per week = CALCULATE(
    AVERAGE(Table1[NScheduled]);
    FILTER(Table1; Table1[Week] = EARLIER(Table1[Week]))
)
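To verify the expected column outside Power BI, the same per-week average can be sketched in pandas. This is only a cross-check under the assumption that the table matches the question's sample; table1 is an illustrative name, not part of the DAX model.

```python
import pandas as pd

table1 = pd.DataFrame({
    "Week": [1, 1, 1, 1, 2, 2, 3, 3, 3],
    "NScheduled": [1, 1, 1, 1, 6, 1, 4, 3, 1],
})

# Average of NScheduled over all rows sharing the current row's Week.
table1["Ave per week"] = table1.groupby("Week")["NScheduled"].transform("mean")

print(table1)
```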

Resources