How can I calculate the difference of two cells based on the values of a check-column? - excel

I am struggling to find a way to calculate the difference between two cells when a criterion is matched in a check-column that I have. To explain further, I have the dataset below:
Example of the table
In this table, column A holds the times at which a bus stopped at each bus stop while performing a cycle. In column B I have distinguished between the start of the cycle, the halfway point, and the end of the cycle. In column C I want to calculate the total time each cycle takes. In other words, I want the difference between the time (in column A) at the end of the cycle (where column B is "End") and the time at the start of the cycle (where column B is "Start").
Please note that the number of stops differs from cycle to cycle, as the bus does not always make the same stops. Therefore this task becomes a bit complicated for me. The number of cycles is also very large, which is why I am trying to find an automated way to calculate the difference described above: the first term of the difference should be taken where "End" is met, and the second term where "Start" is met.
I have no clue where to start with this, so until now I have gotten nowhere in trying to find a solution myself.
Thank you for your time

Well, maybe an INDEX and MATCH calculation.
=(INDEX(Table1[Column1],MATCH(E10,Table1[Column2],0),1))-(INDEX(Table1[Column1],MATCH(E9,Table1[Column2],0),1))
Maybe SUMIF?
=SUMIF(Table1[Column2],E10,Table1[Column1])-SUMIF(Table1[Column2],E9,Table1[Column1])
Regardless, the result looks like this:

If the time is sequential, I like FILTER.
Something like: =IF(B2="End", A2-MAX(FILTER(A$2:A2, B$2:B2="Start")), "")
Here you filter the rows up to and including the current one (make sure you anchor the right cells with $!) to keep only those that have "Start" in column B, then take the largest time (the MAX function) from the filtered result and subtract it from the current time. Because the times are sequential, that largest time is the most recent "Start". If the current row is not an "End", we leave the cell blank.
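To make the "End minus most recent Start" idea concrete outside of Excel, here is a minimal Python sketch. It assumes the rows are already in chronological order; the function name, the `at` helper, and the sample times are all made up for illustration and are not from the question.

```python
from datetime import datetime

def cycle_durations(rows):
    """Return one entry per row: End time minus the latest Start, else None.

    rows is a chronologically ordered list of (time, label) pairs.
    """
    last_start = None
    durations = []
    for moment, label in rows:
        if label == "Start":
            last_start = moment              # remember the most recent Start
            durations.append(None)
        elif label == "End" and last_start is not None:
            durations.append(moment - last_start)
        else:
            durations.append(None)           # intermediate stop: leave blank
    return durations

def at(hhmm):
    # Hypothetical helper: parse "HH:MM" onto an arbitrary fixed day.
    return datetime.strptime(hhmm, "%H:%M")

# Hypothetical sample data: two cycles with a different number of stops.
rows = [
    (at("08:00"), "Start"), (at("08:10"), ""), (at("08:25"), "Half"),
    (at("08:50"), "End"),
    (at("09:00"), "Start"), (at("09:20"), "Half"), (at("09:45"), "End"),
]
durations = cycle_durations(rows)
print(durations)
```

Only the "End" rows get a value, which mirrors leaving column C blank everywhere else.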

Related

I need to average cells in a column based on the time of day in another column in excel, i.e. morning, afternoon and night

I am trying to get the averages of a column that meet certain criteria into a single cell.
The following formula works: =AVERAGEIFS(tblData[Sys],tblData[Time],">=12:00 PM",tblData[Time],"<6:00 PM") but when I adjust the time values to: =AVERAGEIFS(tblData[Sys],tblData[Time],">=6:00 PM",tblData[Time],"<4:00 AM") I get an error. I'm guessing it's because the time range goes into the next day.
Is there a better function to use or a workaround?
This formula seems to work for your scenario. It checks for the possibility that the end time is before the start time (but on the next day) and changes the logic accordingly:
=AVERAGE(
IF(
IF(tmStart>=tmEnd,
(tblData[Time]>=tmStart)+(tblData[Time]<=tmEnd),
(tblData[Time]>=tmStart)*(tblData[Time]<=tmEnd))
=1,tblData[Sys],""))
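The same branching can be sketched in Python, in case it helps to see the logic spelled out. This is only an illustration under assumptions: the function names and sample readings are invented, and both endpoints are treated as inclusive like the formula above.

```python
from datetime import time

def in_window(t, start, end):
    """True if time-of-day t falls in the window, which may wrap past midnight."""
    if start <= end:
        return start <= t <= end      # same-day window: both conditions must hold
    return t >= start or t <= end     # overnight window: either condition holds

def window_average(rows, start, end):
    """Average the values whose time-of-day lies inside the window."""
    values = [value for t, value in rows if in_window(t, start, end)]
    return sum(values) / len(values) if values else None

# Hypothetical (time, Sys) readings.
rows = [
    (time(19, 0), 120),
    (time(2, 0), 110),
    (time(12, 0), 130),
    (time(23, 30), 100),
]
night_avg = window_average(rows, time(18, 0), time(4, 0))
afternoon_avg = window_average(rows, time(12, 0), time(18, 0))
print(night_avg, afternoon_avg)
```

An empty window returns None rather than raising, which plays the role of the #DIV/0! case.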
It's important to understand what this function does and what the underlying intentions are. In what follows I elaborate, offer a workable solution, and provide a reconciliation, with a link to the workbook and a screenshot below.
You will need to specify the number of days spanned. Even the scenario that 'worked' for you (i.e. =AVERAGEIFS(tblData[Sys],tblData[Time],">=12:00 PM",tblData[Time],"<6:00 PM")) could, in theory, span 2 (or more) days. It 'works' (doesn't return #DIV/0!) because the 'intersection' of the conditions is a non-empty set (i.e. {12pm-5pm}).
As I say, if this was intended to run from ">=12:00" on day 1 through to "<6pm" on the following day, there is no way of determining whether this is indeed the case by simply 'comparing the times' (e.g. 12pm vs 6pm).
Refer to the screenshot:
=IF($I$5="Y",(SUM(1*($C$5:$C$28*(($D$5:$D$28>=$J$3)+1*($D$5:$D$28<$K$3)))))/SUM((($D$5:$D$28>=$J$3)+1*($D$5:$D$28<$K$3))),IFERROR(AVERAGEIFS(C5:C28,D5:D28,">="&J3,D5:D28,"<"&K3),"times don't intersect! "))
where I5 = 'Y' or 'N' (i.e. 'multiple days'). When scenario B is selected, with multiple days = 'Y', the outcome is 12.9, which reconciles to a manual calculation.

Excel Group rows based on time interval

I have the following excel result:
I want to group the above result in groups based on sessions i.e. if the time gap between two successive timestamps is greater than 5 minutes, it must be a new row.
For example :
I need some formula to achieve this. As I'm fairly new to Excel, this is causing me a major headache. Please help me if anyone knows how to do it, or at least point me in a direction.
Thanks a ton !!!
Judging by your screenshot, it appears your timestamps are actually text values. Text is usually left-aligned by default, whereas numbers are right-aligned. You seem to have a space at the end of your timestamp, suggesting that it is probably left-aligned and therefore text. You can test it with the following formula, which will return TRUE if it's text.
=ISTEXT(P2)
where P2 is one of your time stamps.
CONVERT TIMESTAMPS TO TIME
There are a variety of ways to do this. Some will depend on system settings. Take a look at the following functions as each might be useable depending on your system. The first two are a guarantee, the last two are more dependent on system settings.
DATE
TIME
DATEVALUE
TIMEVALUE
Something important to remember here is that in Excel, dates are integers counting the days since 1900/01/01, with that date being 1. Time is stored as a decimal and represents a fraction of a day. 24:00:00 is not a valid time in Excel, though some functions may work with it.
So in order to convert your time stamp in P2 I used the following formula to pull out the date:
=DATE(LEFT(P2,4),MID(P2,FIND("-",P2)+1,2),MID(P2,FIND(" ",P2)-2,2))
Basically it goes into the text and strips out the individual numbers for Year, Month and Day.
To pull out the time, I could have followed the same procedure but elected to demonstrate the TIMEVALUE method, which is a little more robust than DATEVALUE and not as subject to system settings. With the following formula I stripped out the whole time code (minus "UTC"):
=TIMEVALUE(TRIM(MID(P2,FIND(" ",P2)+1,FIND("UTC",P2)-FIND(" ",P2)-1)))
I also made an assumption that you are not mixing and matching UTC with other time zones which means it can be ignored. Now to get DATE and TIME all in one cell, you just need to add the two formulas together to get:
=DATE(LEFT(P2,4),MID(P2,FIND("-",P2)+1,2),MID(P2,FIND(" ",P2)-2,2))+TIMEVALUE(TRIM(MID(P2,FIND(" ",P2)+1,FIND("UTC",P2)-FIND(" ",P2)-1)))
In the example at the end, I placed that formula in Q2 and copied it down.
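For comparison, the same text-to-datetime conversion can be sketched in Python. The timestamp layout "YYYY-MM-DD HH:MM:SS UTC" with a trailing space is an assumption inferred from the formulas above, since the screenshot is not available. The function names are hypothetical, and the 1899-12-30 epoch only reproduces Excel serials for dates on or after 1900-03-01.

```python
from datetime import datetime

# Assumed epoch: gives Excel-style serials for dates on or after 1900-03-01.
EXCEL_EPOCH = datetime(1899, 12, 30)

def parse_utc_stamp(text):
    """Parse text such as '2000-01-01 12:00:00 UTC ' (assumed layout)."""
    cleaned = text.replace("UTC", "").strip()   # drop the zone label and spaces
    return datetime.strptime(cleaned, "%Y-%m-%d %H:%M:%S")

def to_excel_serial(moment):
    """Whole days since the epoch plus the time as a fraction of a day."""
    return (moment - EXCEL_EPOCH).total_seconds() / 86400

stamp = parse_utc_stamp("2000-01-01 12:00:00 UTC ")
serial = to_excel_serial(stamp)
print(stamp, serial)
```

The integer part of the serial is the date and the fractional part is the time, which is why adding the DATE and TIMEVALUE results works in a single cell.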
DELTA TIME
Since you want to break your groups out based on a time difference between individual entries, I used a helper column to store the time difference. In my example at the end I stored this difference in Column S. The first entry is blank as there is no time before it. I used the following formula in S3 and copied it downward.
=Q3-Q2
I applied the custom formatting of [h]:mm:ss to the cell to get it to display as shown.
FIND GROUP BREAK POINTS
In my example I am using helper column T to hold breakpoint flags. At a minimum, you will have two breakpoints: your first time entry and your last time entry. To make life simple, I hard-coded my first breakpoint flag in T2 as 1. Starting in T3, three checks need to be made. If any of them is TRUE, then the next flag needs to be added with its value increased by one. The three checks are:
Is this the last entry?
Is the next time delta greater than 5 minutes (means end of a group)?
Is this time delta greater than 5 minutes (means start of a group)?
Based on those three checks I placed the following formula in T3 and copied down:
=IF(OR(S4="",S4>TIME(0,5,0),S3>TIME(0,5,0)),MAX($T$2:T2)+1,"")
Note the $ on the first part of the range for the MAX function. This will lock the start of the range while the formula gets copied down while the end of the range increases accordingly.
Also, the row after the last time entry must be blank. If it is not blank and has a set value in it, change the S4="" to S4="set value".
GENERATE TABLE
There are multiple ways to reference the flags and pull the corresponding times. A couple of formulas you can look into are:
INDEX / MATCH
LOOKUP
In this example I elected to use LOOKUP, though I believe INDEX and MATCH are more appropriate and robust. For starters, we want to generate a list of ODD numbers and a list of EVEN numbers. These represent the start and end of the groups and correspond to the flags set in column T. One way to generate ODD and EVEN numbers as you copy down is:
=ROW(A1)*2-1 (ODD)
=ROW(A1)*2 (EVEN)
The next step is to find the generated number in Column T and then pull its corresponding timestamp in Column Q. I did this with the following formula in V2 and copied down.
=LOOKUP(ROW(A1)*2-1,T:T,Q:Q)
And in W2
=LOOKUP(ROW(A1)*2,T:T,Q:Q)
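If it helps to check the expected start and end of each group independently of the helper columns, the grouping rule (a gap of more than 5 minutes starts a new group) can be sketched in Python. This is a different technique from the flag/LOOKUP approach above, not a translation of it; the function name and the sample timestamps are invented for illustration.

```python
from datetime import datetime, timedelta

def group_sessions(timestamps, max_gap=timedelta(minutes=5)):
    """Return (start, end) pairs; a gap larger than max_gap opens a new group."""
    sessions = []
    for stamp in sorted(timestamps):
        if sessions and stamp - sessions[-1][1] <= max_gap:
            sessions[-1][1] = stamp             # extend the current group
        else:
            sessions.append([stamp, stamp])     # start a new group
    return [tuple(pair) for pair in sessions]

# Hypothetical timestamps on one day.
day = datetime(2015, 11, 17)
stamps = [
    day.replace(hour=10, minute=0),
    day.replace(hour=10, minute=3),
    day.replace(hour=10, minute=4),
    day.replace(hour=10, minute=20),
    day.replace(hour=10, minute=22),
    day.replace(hour=11, minute=0),
]
sessions = group_sessions(stamps)
for start, end in sessions:
    print(start, end)
```

A timestamp with no neighbours within 5 minutes comes out as a group whose start and end are the same.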

Spotfire: calculating difference between two rows in same column based on attributes in different columns

I am new to Spotfire and need help in getting the right expression for a calculated column.
My Data contains different subjects grouped in column ID. For every ID, Bodyweight was measured on different days. Days are given in column Day and stated as 1,2,3...
The last day is denoted by Last and Bodyweight measurements given in another column. Another column is present which is called Baseline. The Body weight measured is considered as baseline if the column contains a Y for that row.
I need to insert a calculated column, which will contain the difference between Body measurement measured on Day denoted Last and Body measurement marked by Y in column Baseline.
This should be done for every new ID. I am not able to figure this out. Could someone advise me on how to go about it?
Here is an example attached
So, the calculated column for Rita will give -4 (body weight at Last = 52 and body weight at Baseline = 56, so 52 - 56 = -4).
the sample data you provided is a little weird, particularly the [Day] column. if it's within your control, I suggest to use actual dates rather than a number/string here.
barring that, I was able to get your desired results, but it required two calculated columns: the first one will consolidate the [Day] and [Baseline] columns into a single column, and the second one contains your desired info.
column 1, which I called Day (int):
CASE
WHEN [Day]="Last" THEN 1000000
WHEN [Baseline]="Y" THEN -1000000
WHEN [Day]!="Last" THEN Integer([Day])
END
I picked a random high and low max to establish a chronological order. this will put 1000000 in place of "Last" (if you have any programs that are longer than one million days, you'll need to increase this number). the same for the [Baseline] column, but that value will be -1000000, which is presumably the lowest value you will ever see in this column. both of these are assumptions and may not work for your implementation. finally, in all other cases, the day number will be used.
column 2, which I called Diff:
Last([Weight]) OVER (Intersect([Name],LastNode([Day (int)]))) -
First([Weight]) OVER (Intersect([Name],FirstNode([Day (int)])))
the first line uses what's called an OVER expression to retrieve the last value for [Weight], ordered by [Day (int)], per [Name]. the second line does the reverse and retrieves the first value, so the difference is calculated as -4 (or whatever is appropriate).
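as a sanity check on the expected numbers, here is a small Python sketch of the same "Last minus Baseline per subject" calculation. it is not Spotfire code. the dictionary keys and the rows other than Rita's 56 and 52 are made up for illustration.

```python
def weight_change(rows):
    """Per name: weight on the 'Last' day minus the weight flagged as baseline."""
    baseline = {}
    last = {}
    for row in rows:
        if row["baseline"] == "Y":
            baseline[row["name"]] = row["weight"]
        if row["day"] == "Last":
            last[row["name"]] = row["weight"]
    # Only report subjects that have both a baseline and a Last measurement.
    return {name: last[name] - baseline[name] for name in last if name in baseline}

# Rita's 56 and 52 come from the question; every other row is hypothetical.
rows = [
    {"name": "Rita", "day": "1", "baseline": "Y", "weight": 56},
    {"name": "Rita", "day": "2", "baseline": "", "weight": 54},
    {"name": "Rita", "day": "Last", "baseline": "", "weight": 52},
    {"name": "Tom", "day": "1", "baseline": "Y", "weight": 70},
    {"name": "Tom", "day": "Last", "baseline": "", "weight": 73},
]
changes = weight_change(rows)
print(changes)
```

unlike the two-column Spotfire approach, this does not rely on ordering the days at all, since the Baseline and Last rows are picked out directly.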

Spotfire- calculated column with row ratios based on condition

I’m having trouble understanding if Spotfire allows for conditional computations between arbitrary rows containing numerical data repeated over data groups. I could not find anything to cue me onto a right solution.
Context (simplified): I have data from a sensor reporting state of a process and this data is grouped into bursts/groups representing a measurement taking several minutes each.
Within each burst the sensor is measuring a signal and if a predefined feature (signal shape) was detected the sensor outputs some calculated value, V quantifying this feature and also reports a RunTime at which this happened.
So in essence I have three columns: Burst number, a set of RTs within this burst and Values associated with these RTs.
I need to add a calculated column to do a ratio of Values for rows where RT is equal to a specific number, let’s say 1.89 and 2.76.
The high level logic would be:
If a Value exists at 1.89 Run Time and a Value exists at 2.76 Run Time then compute the ratio of these values. Repeat for every Burst.
I understand I can repeat the computation over groups using OVER operator but I’m struggling with logic within each group...
Any tips would be appreciated.
Many thanks!
The first thing you need to do here is apply an order to your dataset. I assume the sample data is complete and encompasses the cases in your real data, thus, we create a calculated column:
RowID() as [ROWID]
Once this is done, we can create a calculated column which will compute your ratio over its respective groups. Just a note: your B4 example is incorrect compared to the other groups. That is, you have your numerator and denominator reversed.
If(([RT]=1.89) or ([RT]=2.76),[Value] / Max([Value]) OVER (Intersect([Burst],Previous([ROWID]))))
Breaking this down...
If(([RT]=1.89) or ([RT]=2.76), limits the rows to those where the RT = 1.89 or 2.76.
Next comes the evaluation if the above condition is TRUE
[Value] / Max([Value]) OVER (Intersect([Burst],Previous([ROWID])))) This takes the value for the row and divides it by the Max([Value]) over the intersection of the [Burst] grouping and Previous([ROWID]), as expressed by the Intersect() function. So, the denominator will always be the previous value within the grouping. Note that Max() was simply the aggregate chosen; any aggregate should do here since we are only expecting a single value. All OVER expressions require an aggregate to limit the result set to a single row.
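To see the per-burst logic on its own, here is a rough Python sketch (not Spotfire syntax). It assumes the ratio wanted is the Value at RT 2.76 divided by the Value at RT 1.89, which matches dividing by the previous row only when those two rows are adjacent. The function name, parameter names, and sample numbers are hypothetical, and RT is matched by exact equality just as in the expression above.

```python
def burst_ratios(rows, rt_numerator=2.76, rt_denominator=1.89):
    """Per burst: Value at rt_numerator divided by Value at rt_denominator."""
    values = {}
    for burst, rt, value in rows:
        values.setdefault(burst, {})[rt] = value
    ratios = {}
    for burst, by_rt in values.items():
        numerator = by_rt.get(rt_numerator)
        denominator = by_rt.get(rt_denominator)
        # Skip bursts where either RT is missing or the denominator is zero.
        if numerator is not None and denominator:
            ratios[burst] = numerator / denominator
    return ratios

# Hypothetical (Burst, RT, Value) rows.
rows = [
    ("B1", 1.89, 2.0), ("B1", 2.76, 5.0), ("B1", 3.10, 9.0),
    ("B2", 1.89, 4.0),
    ("B3", 1.89, 4.0), ("B3", 2.76, 1.0),
]
ratios = burst_ratios(rows)
print(ratios)
```

Bursts that lack one of the two run times simply do not appear in the result, which corresponds to an empty cell in the calculated column.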
RESULTS

How do I sum time differences between "SET" and "RESET" events?

I want to go through my logs and find out how long each output has been on during a specific time period. In pseudocode: I would find the oldest SET, look for the next RESET entry, calculate the duration between the two timestamps, and add that duration (in minutes or decimal hours) to a sum; then I would find the next SET and RESET and add that to the sum as well. There will likely be times when there is a RESET event but the SET event is outside my search window, and I can ignore those. There will be many different outputs, and I want the sum for each distinct "system" and "code" in the Code column, as the code contains the unit information and the other columns are ancillary.
Example
Time window from 11/15/2015 03:00 to 11/18/2015 03:00
Spreadsheet looks like this:
System,Time Stamp,Code,Unit,Event Text,Set/Reset,
1,11/17/2015 21:41,ABCD,A,Temp is too high,RESET,
1,11/17/2015 21:39,ABCF,B,Movement is too slow,SET,
1,11/17/2015 21:41,DCTY,A,Air flow rate is unstable,SET,
1,11/17/2015 21:44,DCTY,A,Air flow rate is unstable,RESET,
1,11/17/2015 21:43,ABCF,B,Movement is too slow,RESET,
1,11/17/2015 21:43,CATG,C,Door ajar,SET,
When manually crunching the numbers, I know unit B had 4 minutes of code ABCF and unit A had 3 minutes of DCTY. Unit C's CATG has 1 day 13 hours and 45 minutes of set time, since it has not yet reset before the end of the window. Also, most of the time there will be gaps between a reset and the next set event, so there are two modes: time from set to next reset, and time from reset to next set. I only care about the set-to-reset duration sums, as each set-to-reset sequence may repeat multiple times. My purpose in seeking these durations is that right now I have only been using set event frequency to track issues, but an event that is not cleared is not highlighted that way.
Bonus: Can this be done without VBA scripting?
This is a solution that addresses the majority of your requirements. In column G starting in G2 you would place the following Array Formula:
=IF(F2="SET",SUM(IF($C:$C=C2,1,0)*IF($F:$F="RESET",1,0)*IF(ISNUMBER($B:$B),$B:$B,0))-B2,"")
This is inputted using CTRL + SHIFT + ENTER. Then copied down to all your rows.
Basically, for each row that is "SET", it builds arrays holding 1 for rows that have the same Code and 1 for rows that are "RESET". It then multiplies these two arrays element-wise (not matrix multiplication, FYI) by the column of Time Stamps, effectively extracting only the Time Stamps that conform to both conditions. Note that because the formula then sums the array, this will only be effective if there is only one SET/RESET pair for that code. You can add another condition based on the Unit in much the same way.
Not a full solution but I hope a starting point.
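For reference, the pairing logic described in the question (each SET matched to the next RESET for the same system and code, repeated pairs summed, unmatched RESETs ignored, still-open SETs counted up to the end of the window) can be sketched in Python as below. This is not a non-VBA Excel answer, just an illustration of the algorithm. The function and helper names are hypothetical, and the rows are the sample rows from the question reduced to four fields.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def set_durations(events, window_start, window_end):
    """Sum SET-to-RESET time per (system, code) inside the window."""
    open_since = {}                       # (system, code) -> time of the open SET
    totals = defaultdict(timedelta)
    for system, stamp, code, state in sorted(events, key=lambda e: e[1]):
        if not (window_start <= stamp <= window_end):
            continue                      # event outside the search window
        key = (system, code)
        if state == "SET":
            open_since.setdefault(key, stamp)   # keep the earliest open SET
        elif state == "RESET" and key in open_since:
            totals[key] += stamp - open_since.pop(key)
        # A RESET with no SET inside the window is ignored.
    for key, started in open_since.items():
        totals[key] += window_end - started     # still set at the window end
    return dict(totals)

def ts(text):
    # Hypothetical helper for the timestamp layout used in the question.
    return datetime.strptime(text, "%m/%d/%Y %H:%M")

events = [
    (1, ts("11/17/2015 21:41"), "ABCD", "RESET"),
    (1, ts("11/17/2015 21:39"), "ABCF", "SET"),
    (1, ts("11/17/2015 21:41"), "DCTY", "SET"),
    (1, ts("11/17/2015 21:44"), "DCTY", "RESET"),
    (1, ts("11/17/2015 21:43"), "ABCF", "RESET"),
    (1, ts("11/17/2015 21:43"), "CATG", "SET"),
]
totals = set_durations(events, ts("11/15/2015 03:00"), ts("11/18/2015 03:00"))
for key, duration in totals.items():
    print(key, duration)
```

With the six sample rows this gives 4 minutes for ABCF and 3 minutes for DCTY, skips the lone ABCD RESET, and counts the open CATG event up to the end of the window.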
