Difference between the last and the first item for a criterion in a list - excel

I have the following Excel spreadsheet
     A            B                     C       D
1    Product ID   Time of Event
2    27152        01.04.2017 08:45:00   27152   70 Min.
3    27152        01.04.2017 09:00:00   29297   108 Min.
4    27152        01.04.2017 09:55:00   28802   28 Min.
5    29297        02.04.2017 11:02:00
6    29297        02.04.2017 12:50:00
7    28802        18.04.2017 11:48:00
8    28802        18.04.2017 12:00:00
9    28802        18.04.2017 12:13:00
10   28802        18.04.2017 12:16:00
Column A contains the Product IDs.
Column B contains the time at which an event occurs for that Product ID.
Each event is listed in the table; therefore, a Product ID can appear
several times in Column A.
In Column D I want to show the difference in minutes between
the first and the last event for each Product ID (listed in Column C).
D2 = 9:55:00 - 8:45:00 = 70 Min.
D3 = 12:50:00 - 11:02:00 = 108 Min.
D4 = 12:16:00 - 11:48:00 = 28 Min.
Therefore, I would need something like a DIFFERENCE-IF formula.
One of my ideas so far was to use the LARGE and SMALL functions:
=LARGE(B2:B4;1)-SMALL(B2:B4;1)
However, this way I would have to find each range (B2:B4, B5:B6, B7:B10) separately; therefore, I would prefer to use the Product ID as a criterion in the formula.
In summary:
Do you have any idea how I could calculate the difference in minutes between the last and the first event of a certain Product ID in the list?
I would prefer to avoid any kind of array formula.

=ROUND(MMULT(AGGREGATE({14,15},6,B$2:B$10/(A$2:A$10=C2),1),{1;-1})*1440,1)&" Min"
and copied down.
I've a feeling the separators for horizontal and vertical arrays in German versions of Excel are the period (.) and semicolon (;) respectively, so I believe you'll need:
=RUNDEN(MMULT(AGGREGAT({14.15};6;B$2:B$10/(A$2:A$10=C2);1);{1;-1})*1440;1)&" Min"
though please let me know if that doesn't give the required results.
Regards

This works under two conditions:
1. the date and the time in column B are split into two columns (B and C)
2. the times are in ascending order
A            B               C
Product ID   Time of Event   TIMES
27152        01.04.2017      8:45:00
27152        01.04.2017      9:00:00
27152        01.04.2017      9:55:00
29297        02.04.2017      11:02:00
29297        02.04.2017      12:50:00
28802        18.04.2017      11:48:00
28802        18.04.2017      12:00:00
28802        18.04.2017      12:13:00
28802        18.04.2017      12:16:00
The following works without an array formula (D2 holds the Product ID to look up):
=(INDEX($C$2:$C$10,SUMPRODUCT(MAX(ROW($A$2:$A$10)*(D2=$A$2:$A$10))-1))-INDEX($C$2:$C$10,MATCH(D2,$A$2:$A$10,0)))*1440
Convert the time into minutes:
=(time*1440)
Look up the first matching value:
=INDEX($C$2:$C$10,MATCH(D2,$A$2:$A$10,0))
Look up the last matching value:
=INDEX($C$2:$C$10,SUMPRODUCT(MAX(ROW($A$2:$A$10)*(D2=$A$2:$A$10))-1))
NOTE: If the last value is SMALLER than the first value, you will receive an error.
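As a cross-check outside Excel, the "latest minus earliest event per Product ID" logic can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are copied from the question's sheet, and the names `events` and `minutes_between_first_and_last` are made up for this example.

```python
from datetime import datetime

# Rows copied from the question's sheet: (Product ID, Time of Event).
events = [
    ("27152", "01.04.2017 08:45:00"),
    ("27152", "01.04.2017 09:00:00"),
    ("27152", "01.04.2017 09:55:00"),
    ("29297", "02.04.2017 11:02:00"),
    ("29297", "02.04.2017 12:50:00"),
    ("28802", "18.04.2017 11:48:00"),
    ("28802", "18.04.2017 12:00:00"),
    ("28802", "18.04.2017 12:13:00"),
    ("28802", "18.04.2017 12:16:00"),
]


def minutes_between_first_and_last(rows):
    """Return {product_id: minutes between the earliest and latest event}."""
    first, last = {}, {}
    for product_id, stamp in rows:
        when = datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S")
        # Track the earliest and latest timestamp seen for each ID.
        if product_id not in first or when < first[product_id]:
            first[product_id] = when
        if product_id not in last or when > last[product_id]:
            last[product_id] = when
    return {pid: (last[pid] - first[pid]).total_seconds() / 60 for pid in first}


result = minutes_between_first_and_last(events)
print(result)  # {'27152': 70.0, '29297': 108.0, '28802': 28.0}
```

Because it compares timestamps rather than row positions, this sketch does not depend on the rows being sorted.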

Related

Spotfire calculate difference with respect to previous row value

I have data as shown below. I created the column "Difference in values" manually: each entry is the value at one time minus the value at the previous time (for example, the value at 8:15 AM minus the value at 8:00 AM, which gives 2 in the second row), computed separately for Tushar and Lohit. How can I do this calculation in Spotfire? I believe the OVER and Previous functions can help, but I am unable to find anything on this. Please help.
Name Time Values Difference in values
Tushar 08:00 AM 2 0
Tushar 08:15 AM 4 2
Tushar 08:30 AM 5 1
Tushar 08:45 AM 6 1
Tushar 09:00 AM 7 1
Lohit 08:00 AM 2 0
Lohit 08:15 AM 4 2
Lohit 08:30 AM 5 1
Lohit 08:45 AM 6 1
This should work:
SN([Values] - Max([Values]) over (Intersect(Previous([Time]),[Name])),0)
Max(..) is there only because an aggregation is required; it looks at just the previous Time row for each value of Name, so Min would work just as well.
SN(...) sets the result to 0 when it is empty (as in the first row of each Name).
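To make the intended result concrete, here is a small Python sketch of the same "value minus previous value within each Name, 0 for the first row" logic. It is not Spotfire code; it assumes the rows are already ordered by time within each name, and the function name is hypothetical.

```python
# Rows copied from the question: (Name, Time, Values).
rows = [
    ("Tushar", "08:00 AM", 2),
    ("Tushar", "08:15 AM", 4),
    ("Tushar", "08:30 AM", 5),
    ("Tushar", "08:45 AM", 6),
    ("Tushar", "09:00 AM", 7),
    ("Lohit", "08:00 AM", 2),
    ("Lohit", "08:15 AM", 4),
    ("Lohit", "08:30 AM", 5),
    ("Lohit", "08:45 AM", 6),
]


def difference_from_previous(records):
    """Return value minus the previous value for the same name (0 if none)."""
    previous = {}  # last value seen per name
    differences = []
    for name, _time, value in records:
        differences.append(value - previous[name] if name in previous else 0)
        previous[name] = value
    return differences


differences = difference_from_previous(rows)
print(differences)  # [0, 2, 1, 1, 1, 0, 2, 1, 1]
```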

How to group by an Attribute and calculate time between consecutive tickets for that Attribute

So, I am working with a Dataframe where there are around 20 columns, but only two columns are really of importance.
Index  ID        Date
1      01-40-50  2021-12-01 16:54:00
2      01-10     2021-10-11 13:28:00
3      03-48-58  2021-11-05 16:54:00
4      01-40-50  2021-12-06 19:34:00
5      03-48-58  2021-12-09 12:14:00
6      01-10     2021-08-06 19:34:00
7      03-48-58  2021-10-01 11:44:00
There are 90 different IDs and a few thousand rows in total. What I want to do is:
Group the entries by ID
Order the rows of each ID by Date
Then calculate the difference between each timestamp and the previous one
And create a column that holds those differences (to then visualize it for the 90 different IDs)
While I thought it would be easy to do this with the groupby function, I am having quite a bit of trouble. I would appreciate any input on how to start. Thank you!
You can do it this way:
>>> df.groupby("ID")["Date"].apply(lambda x: x.sort_values().diff())
ID        Index
01-10     6                   NaT
          2      65 days 17:54:00
01-40-50  1                   NaT
          4       5 days 02:40:00
03-48-58  7                   NaT
          3      35 days 05:10:00
          5      33 days 19:20:00
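Since the question also asks for the differences as a column, one possible variation is to sort first and assign the result of `groupby(...).diff()` back to the frame. This is a sketch under assumptions: the frame is rebuilt from the sample rows, `Date` is already a datetime column, and the column name `TimeSincePrevious` is made up for illustration.

```python
import pandas as pd

# Sample rows from the question; Date is parsed to datetime up front.
df = pd.DataFrame(
    {
        "ID": ["01-40-50", "01-10", "03-48-58", "01-40-50",
               "03-48-58", "01-10", "03-48-58"],
        "Date": pd.to_datetime([
            "2021-12-01 16:54:00", "2021-10-11 13:28:00",
            "2021-11-05 16:54:00", "2021-12-06 19:34:00",
            "2021-12-09 12:14:00", "2021-08-06 19:34:00",
            "2021-10-01 11:44:00",
        ]),
    },
    index=pd.Index(range(1, 8), name="Index"),
)

# Order rows by ID and Date, then take the row-to-row difference within each ID.
df = df.sort_values(["ID", "Date"])
df["TimeSincePrevious"] = df.groupby("ID")["Date"].diff()  # hypothetical column name

print(df)
```

The first row of each ID has no predecessor, so its difference is `NaT`.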

How to select a set of values in pandas data frame (multiple colums with multiple row conditions)

I have a huge CSV file like the one below, which I opened as a dataframe using pandas. I want to extract data from multiple columns for different date ranges.
I want to select the rows from one particular date and hour to another, for the last 3 column values. The slicing options I tried and googled were for a single column.
date heure PM10 NO2 O3
0 01/01/2016 1 27 22 36
1 01/01/2016 2 25 29 27
2 01/01/2016 3 26 47 10
3 01/01/2016 4 16 40 13
4 01/01/2016 5 15 34 13
5 02/01/2016 1 15 34 13
6 02/01/2016 2 15 34 13
Target output - taking data from one particular date and hour to another:
3 01/01/2016 4 16
4 01/01/2016 5 15
Thank you. The real data set is obviously much bigger than this sample.
You can do this:
df_selected = df[(df['date'] >= "01/01/2016") &
                 (df['heure'] >= 4) &
                 (df['date'] < "02/01/2016") &
                 (df['heure'] < 6)
                ].iloc[:, :3]  # first three columns: date, heure, PM10
Alternatively, for the column selection you can use .loc[:,['name', 'of', 'columns']], or for the last n columns .iloc[:,-n:].
Be careful with the date: the comparison above is done on strings, and I'm not sure what happens with an "English" date. You may have to convert the column first using df['date'] = pd.to_datetime(df.date).
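The caution about dates can be illustrated with the standard library alone. The sketch below assumes the dates are day-first (dd/mm/yyyy), which the question does not state; the helper name `parse_day_first` is hypothetical.

```python
from datetime import datetime


def parse_day_first(text):
    """Parse a dd/mm/yyyy string (the day-first layout is an assumption)."""
    return datetime.strptime(text, "%d/%m/%Y")


a, b = "02/01/2016", "15/12/2015"

# As plain strings, "02/..." sorts before "15/...", regardless of month or year.
string_says_a_is_earlier = a < b

# As real dates, 2 January 2016 comes after 15 December 2015.
dates_say_a_is_earlier = parse_day_first(a) < parse_day_first(b)

print(string_says_a_is_earlier, dates_say_a_is_earlier)  # True False
```

The two comparisons disagree, which is why converting the column to a real datetime type before filtering is the safer route.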

Computing most recent smaller value

I have an excel sheet with dates (sorted) in one column and values in another. Ex:
1/1/2019 10
1/2/2019 12
1/3/2019 8
1/4/2019 20
1/10/2019 8
1/12/2019 22
I want to compute in a third column, the most recent date such that value was less than or equal to the current value (if the current is the lowest, then use the current date). So, for the sample data above,
1/1/2019 10 1/1/2019
1/2/2019 12 1/1/2019
1/3/2019 8 1/3/2019
1/4/2019 20 1/3/2019
1/10/2019 8 1/3/2019
1/12/2019 22 1/10/2019
Is there a way of accomplishing this without VBA macros?
Here's a way. Paste these in and copy down the column.
Column C: =IF(COUNTIF(B2:B6,D1)=0,A1,MINIFS(A2:A6,B2:B6,D1))
Column D: =CONCATENATE("<",TEXT(VALUE(B1),"#"))
You can hide column D to make it prettier. It's the criteria being used by the COUNTIF and MINIFS. Column C is the output.
1/1/2019 10 1/3/2019 <10
1/2/2019 12 1/3/2019 <12
1/3/2019 8 1/3/2019 <8
1/4/2019 20 1/10/2019 <20
1/10/2019 8 1/10/2019 <8
1/12/2019 22 1/12/2019 <22
Formula view:
43466 10 =IF(COUNTIF(B2:B6,D1)=0,A1,MINIFS(A2:A6,B2:B6,D1)) =CONCATENATE("<",TEXT(VALUE(B1),"#"))
43467 12 =IF(COUNTIF(B3:B7,D2)=0,A2,MINIFS(A3:A7,B3:B7,D2)) =CONCATENATE("<",TEXT(VALUE(B2),"#"))
43468 8 =IF(COUNTIF(B4:B8,D3)=0,A3,MINIFS(A4:A8,B4:B8,D3)) =CONCATENATE("<",TEXT(VALUE(B3),"#"))
43469 20 =IF(COUNTIF(B5:B9,D4)=0,A4,MINIFS(A5:A9,B5:B9,D4)) =CONCATENATE("<",TEXT(VALUE(B4),"#"))
43475 8 =IF(COUNTIF(B6:B10,D5)=0,A5,MINIFS(A6:A10,B6:B10,D5)) =CONCATENATE("<",TEXT(VALUE(B5),"#"))
43477 22 =IF(COUNTIF(B7:B11,D6)=0,A6,MINIFS(A7:A11,B7:B11,D6)) =CONCATENATE("<",TEXT(VALUE(B6),"#"))
This is a little sloppy: as written, the ranges run past the end of the table, which is fine as long as the cells below are empty. You could instead use a named range or an absolute reference for the end of the range, e.g. B$6:
Column C: =IF(COUNTIF(B2:B$6,D1)=0,A1,MINIFS(A2:A$6,B2:B$6,D1))
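For comparison, the rule as worded in the question looks backward: for each row, take the most recent earlier date whose value is less than or equal to the current value, and fall back to the current date if there is none. The formulas above scan the rows below the current one instead. The following Python sketch implements the backward-looking reading; it is an interpretation of the question rather than a translation of the formulas, and the names are made up for illustration.

```python
from datetime import date

# Sample rows from the question: (date, value), already sorted by date.
rows = [
    (date(2019, 1, 1), 10),
    (date(2019, 1, 2), 12),
    (date(2019, 1, 3), 8),
    (date(2019, 1, 4), 20),
    (date(2019, 1, 10), 8),
    (date(2019, 1, 12), 22),
]


def most_recent_not_larger(records):
    """For each row, return the latest earlier date with value <= current value."""
    results = []
    for i, (day, value) in enumerate(records):
        answer = day  # fall back to the current date
        # Walk backward through the earlier rows and stop at the first match.
        for prev_day, prev_value in reversed(records[:i]):
            if prev_value <= value:
                answer = prev_day
                break
        results.append(answer)
    return results


lookback = most_recent_not_larger(rows)
for (day, value), found in zip(rows, lookback):
    print(day, value, found)
```

On the sample data this yields 1/1, 1/1, 1/3, 1/3, 1/3 and 1/10, matching the third column requested in the question.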

Allocate class based on school ranking

In Excel I am trying to allocate classes to pupils based on their ranking in school. The set of data I have looks like this:
S/N Name LevelPosition
1 Andrea 10
2 Bryan 25
3 Catty 5
4 Debbie 26
5 Ellie 30
6 Freddie 28
I would like to have a formula that could sort the pupils based on the LevelPosition and allocate the class in order of this sequence - A,B,C,C,B,A. Hence the result would be:
S/N Name LevelPosition AllocatedClass
3 Catty 5 A
1 Andrea 10 B
2 Bryan 25 C
4 Debbie 26 C
6 Freddie 28 B
5 Ellie 30 A
This was the sort of thing I had in mind.
Column D is just a ranking from bottom to top:
=RANK(C2,C$2:C$7,1)
Column E adjusts the rank for any ties:
=D2+COUNTIF(D$1:D1,D2)
Column F is based on @pnuts's formula:
=CHOOSE(MOD(E2-1,6)+1,"A","B","C","C","B","A")
I've put some ties in to show what would happen. The last two students' allocations are reversed because the second to last has the higher mark.
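The same rank-then-cycle idea can be sketched in Python using the question's sample data. This is only an illustration: the names are hypothetical, and ties are broken by input order, which is an assumption mirroring the COUNTIF adjustment above.

```python
# Sample data from the question: (S/N, Name, LevelPosition).
pupils = [
    (1, "Andrea", 10),
    (2, "Bryan", 25),
    (3, "Catty", 5),
    (4, "Debbie", 26),
    (5, "Ellie", 30),
    (6, "Freddie", 28),
]

PATTERN = "ABCCBA"  # allocation sequence, repeated every six pupils


def allocate(records):
    """Sort by LevelPosition and assign classes by cycling through PATTERN."""
    # sorted() is stable, so pupils with equal positions keep their input order.
    ranked = sorted(records, key=lambda record: record[2])
    return [
        (sn, name, position, PATTERN[i % len(PATTERN)])
        for i, (sn, name, position) in enumerate(ranked)
    ]


allocation = allocate(pupils)
for row in allocation:
    print(*row)
```

Indexing with `i % len(PATTERN)` plays the role of `MOD(rank-1,6)+1` in the CHOOSE formula.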
