I am trying to insert a calculated column such that, when T1 = CMP 1 Stops, it copies the timestamp from the row where T1 = CMP 1 Starts.
timestamp T1 Calculated Expected
5/1/2017 14:00
5/1/2017 14:15
5/1/2017 14:30 CMP 1 Starts
5/1/2017 14:45 CMP 1 Stops 5/1/2017 14:30 5/1/2017 14:30
5/1/2017 15:00
5/1/2017 15:15
5/1/2017 15:30
5/1/2017 15:45
5/1/2017 16:00
5/1/2017 16:15
5/1/2017 16:30 CMP 1 Starts
5/1/2017 16:45 CMP 1 ON
5/1/2017 17:00 CMP 1 Stops 5/1/2017 16:45 5/1/2017 16:30
5/1/2017 17:15
5/1/2017 17:30
5/1/2017 17:45
5/1/2017 18:00
5/1/2017 18:15
5/1/2017 18:30
5/1/2017 18:45 CMP 1 Starts
5/1/2017 19:00 CMP 1 ON
5/1/2017 19:15 CMP 1 Stops 5/1/2017 19:00 5/1/2017 18:45
5/1/2017 19:30
5/1/2017 19:45
Example: Expected column
Note: The value does not have to land on the same row as T1 = CMP 1 Stops. Filling all the null rows with the timestamp from T1 = CMP 1 Starts would also work for me.
The first expression you will need is:
If((Trim([T1])="CMP 1 Stops") or (Trim([T1])="CMP 1 Starts"),Max([timestamp]) over (PreviousPeriod([timestamp]))) as [YourNewColumn]
Then, if you want to limit it to the rows where [T1] = "CMP 1 Stops" just add another calculated column:
case when [T1] = "CMP 1 Stops" then [YourNewColumn] end as [YourFinalColumn]
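To check the Expected values outside Spotfire, the same idea can be sketched in pandas. This is only an illustration under assumptions: the column names mirror the question, the frame holds a few of the sample rows, and 5/1/2017 is read as 1 May 2017.

```python
import pandas as pd

# A few rows from the sample data; None marks an empty T1 cell.
# Assumption: 5/1/2017 means 2017-05-01.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-05-01 16:15", "2017-05-01 16:30", "2017-05-01 16:45",
        "2017-05-01 17:00", "2017-05-01 17:15",
    ]),
    "T1": [None, "CMP 1 Starts", "CMP 1 ON", "CMP 1 Stops", None],
})

# Treat empty cells as empty strings and ignore stray whitespace.
state = df["T1"].fillna("").str.strip()

# Keep the timestamp only on "Starts" rows, then carry it forward.
last_start = df["timestamp"].where(state == "CMP 1 Starts").ffill()

# Show the carried-forward start time only on "Stops" rows.
df["Expected"] = last_start.where(state == "CMP 1 Stops")
print(df)
```

Dropping the last `where` would leave the start time on every following row, which matches the alternative mentioned in the note.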
I would like to fetch the start datetime and end datetime of each period during which the value is zero.
The data is stored in PostgreSQL.
A PostgreSQL solution would be most helpful, but a Python solution using numpy or pandas is also fine.
For example, column 1 contains the datetime and column 2 contains the values.
DateTime Value
06-07-2021 12:00 -521362.8779
06-07-2021 12:15 -57275.52732
06-07-2021 12:30 0
06-07-2021 12:45 0
06-07-2021 13:00 0
06-07-2021 13:15 0
06-07-2021 13:30 0
06-07-2021 13:45 0
06-07-2021 14:00 -57275.52732
06-07-2021 14:15 -377411.4886
06-07-2021 14:30 -377411.4886
06-07-2021 14:45 0
06-07-2021 15:00 0
06-07-2021 15:15 0
06-07-2021 15:30 -889863.5254
06-07-2021 15:45 -1194683.49
06-07-2021 16:00 0
06-07-2021 16:15 0
06-07-2021 16:30 0
06-07-2021 16:45 0
06-07-2021 17:00 -89539.05766
06-07-2021 17:15 -1117269.624
06-07-2021 17:30 -857357.2725
The required output shall be
Column 1 serial no,
Column 2 Start DateTime,
Column 3 End DateTime
Serial No Start DateTime End DateTime
1 06-07-2021 12:30 06-07-2021 13:45
2 06-07-2021 14:45 06-07-2021 15:15
3 06-07-2021 16:00 06-07-2021 16:45
Assuming the type of your DateTime column is already datetime or you transform your above string into a dataframe using
import io
import numpy as np
import pandas as pd
df = pd.read_csv(io.StringIO(df_string), sep=r'\s{2,}', engine='python', parse_dates=['DateTime'])
then you do
x = df['Value'].to_numpy()
# Start: a zero whose previous value is non-zero (or the very first row)
mask = np.empty(x.shape[0], 'bool')
mask[0] = x[0] == 0
mask[1:] = (x[1:] == 0) & (x[:-1] != 0)
# End: a zero whose next value is non-zero (or the very last row)
mask2 = np.empty(x.shape[0], 'bool')
mask2[-1] = x[-1] == 0
mask2[:-1] = (x[1:] != 0) & (x[:-1] == 0)
df2 = pd.DataFrame({'Start': df['DateTime'][mask].reset_index(drop=True),
                    'End': df['DateTime'][mask2].reset_index(drop=True)})
and you get
Start End
0 2021-06-07 12:30:00 2021-06-07 13:45:00
1 2021-06-07 14:45:00 2021-06-07 15:15:00
2 2021-06-07 16:00:00 2021-06-07 16:45:00
I just compare the current row with next/previous row values. If one is zero and the other is not, then it's a Start or End.
You can use the shift method to shift the rows.
df1 = pd.DataFrame()
# A zero row whose previous row is non-zero starts a run
df1['Start DateTime'] = (
    df[(df['Value'] == 0) & (df['Value'].shift() != 0)]
    ['DateTime'].reset_index(drop=True))
# A zero row whose next row is non-zero ends a run
df1['End DateTime'] = (
    df[(df['Value'] == 0) & (df['Value'].shift(-1) != 0)]
    ['DateTime'].reset_index(drop=True))
   Start DateTime    End DateTime
0  06-07-2021 12:30  06-07-2021 13:45
1  06-07-2021 14:45  06-07-2021 15:15
2  06-07-2021 16:00  06-07-2021 16:45
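A third way to get the same table, including the serial number from the question, is to label each run of consecutive zeros and take the first and last timestamp per run. The sketch below is only an illustration under assumptions: the sample is rebuilt with 15-minute steps on 2021-06-07, and the non-zero readings are replaced by -1.0 because only zero versus non-zero matters here.

```python
import pandas as pd

# Rebuild the sample: 23 readings, zeros in the same three runs as the question.
values = ([-1.0] * 2 + [0.0] * 6 + [-1.0] * 3 + [0.0] * 3
          + [-1.0] * 2 + [0.0] * 4 + [-1.0] * 3)
df = pd.DataFrame({
    "DateTime": pd.date_range("2021-06-07 12:00", periods=len(values), freq="15min"),
    "Value": values,
})

is_zero = df["Value"].eq(0)
# A new run starts wherever the zero/non-zero state changes.
run_id = is_zero.ne(is_zero.shift()).cumsum().rename("run_id")

# Keep only the zero rows and take the first and last timestamp of each run.
runs = (
    df[is_zero]
    .groupby(run_id[is_zero])["DateTime"]
    .agg(["min", "max"])
    .rename(columns={"min": "Start DateTime", "max": "End DateTime"})
    .reset_index(drop=True)
)
runs.insert(0, "Serial No", range(1, len(runs) + 1))
print(runs)
```

The same run-labelling idea carries over to SQL window functions, although no PostgreSQL query is shown here.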
I would like to sum the sales depending on the time range of the day or night in which they occur. For example, I would like to sum all sales that happened between 22:00 and 02:00.
Hour Sales
18:58 49
18:00 49.5
03:01 31
20:00 139
09:15 61.5
11:36 5
08:00 24
16:32 25
12:30 96.5
17:30 75.5
09:00 80
00:10 24
15:00 24
18:00 216
09:30 24
06:30 47.5
If I use SUMIFS with the hour >=22:00 and <23:00, the formula works. However, if I try to sum the values between 22:00 and 2:00, in other words with ">=22:00" as the first criterion and "<2:00" as the second, SUMIFS does not work because no single time satisfies both criteria. I understand why, but I'm struggling to find an alternative way to solve this task.
We need to add 1 (a full day) when the range rolls over to the next day, which means SUMPRODUCT. Here D2 holds the start time and E2 the end time:
=SUMPRODUCT($B$2:$B$17,((E2<D2)+$A$2:$A$17>=D2)*(($A$2:$A$17<D2)+$A$2:$A$17<(E2<D2)+E2))
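The wrap-around test behind the formula can be sketched in Python to make the logic easier to follow. This is an illustration only: the function names are made up for the example, and the window is treated as start-inclusive and end-exclusive, as in the question's criteria.

```python
from datetime import time

def in_window(t, start, end):
    """True if t lies in [start, end), allowing the window to wrap past midnight."""
    if start <= end:
        return start <= t < end
    # Wrapped window such as 22:00 to 02:00: late evening OR early morning.
    return t >= start or t < end

def sum_sales(rows, start, end):
    """Sum the sales whose hour falls inside the window."""
    return sum(sales for hour, sales in rows if in_window(hour, start, end))

# Sample data from the question as (hour, sales) pairs.
rows = [
    (time(18, 58), 49), (time(18, 0), 49.5), (time(3, 1), 31), (time(20, 0), 139),
    (time(9, 15), 61.5), (time(11, 36), 5), (time(8, 0), 24), (time(16, 32), 25),
    (time(12, 30), 96.5), (time(17, 30), 75.5), (time(9, 0), 80), (time(0, 10), 24),
    (time(15, 0), 24), (time(18, 0), 216), (time(9, 30), 24), (time(6, 30), 47.5),
]

night_total = sum_sales(rows, time(22, 0), time(2, 0))     # wraps past midnight
evening_total = sum_sales(rows, time(18, 0), time(19, 0))  # ordinary window
print(night_total, evening_total)
```

With the sample data, only the 00:10 sale falls in the 22:00 to 02:00 window.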
I'm trying to get the name of the column that the ffill value came from.
I've searched Google and Stack Overflow and haven't found a way to accomplish this.
This is the ffill code:
df["LAST_PUNCH"] = df.ffill(axis=1).iloc[:, -1]
This is my dataframe:
SHIFT IN OUT IN_1
DA6-0730 07:30 12:35 13:05
DB0-ACOM 08:18 12:30
DC4-0730 07:30 12:39 13:09
DC4-0730 07:30 12:34 13:04
This is my dataframe after using ffill:
SHIFT IN OUT IN_1 LAST_PUNCH
DA6-0730 07:30 12:35 13:05 13:05
DB0-ACOM 08:18 12:30 12:30
DC4-0730 07:30 12:39 13:09 13:09
DC4-0730 07:30 12:34 13:04 13:04
I would like to get the column name where the ffill value came from and
append to the end of the ffill value:
SHIFT IN OUT IN_1 LAST_PUNCH
DA6-0730 07:30 12:35 13:05 13:05_IN_1
DB0-ACOM 08:18 12:30 12:30_OUT
DC4-0730 07:30 12:39 13:09 13:09_IN_1
DC4-0730 07:30 12:34 13:04 13:04_IN_1
Ummm this is a little bit tricky
(df+'_'+pd.DataFrame(dict(zip(df.columns.values,df.columns.values)),index=df.index)).\
reindex(columns=df.columns).ffill(axis=1).iloc[:,-1]
Out[360]:
0 13:05_IN_1
1 12:30_OUT
2 13:09_IN_1
3 13:04_IN_1
Name: IN_1, dtype: object
Or using idxmax with reversed order of columns
df.ffill(axis=1).iloc[:, -1]+'_'+df[df.columns[::-1]].notnull().idxmax(1)
Out[375]:
0 13:05_IN_1
1 12:30_OUT
2 13:09_IN_1
3 13:04_IN_1
dtype: object
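A more explicit, row-by-row variant is to ask each row for its last non-null column with last_valid_index and build the string from that. The sketch below is only an illustration: the helper name last_punch and the punch_cols list are made up for the example, and the frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "SHIFT": ["DA6-0730", "DB0-ACOM", "DC4-0730", "DC4-0730"],
    "IN": ["07:30", "08:18", "07:30", "07:30"],
    "OUT": ["12:35", "12:30", "12:39", "12:34"],
    "IN_1": ["13:05", np.nan, "13:09", "13:04"],
})

# Only the punch columns take part in the lookup.
punch_cols = ["IN", "OUT", "IN_1"]

def last_punch(row):
    """Return 'value_column' for the last non-null punch in the row."""
    col = row.last_valid_index()
    return None if col is None else f"{row[col]}_{col}"

df["LAST_PUNCH"] = df[punch_cols].apply(last_punch, axis=1)
print(df)
```

This is slower than the vectorised one-liners on large frames, but it is easier to read and to extend.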
Hi guys, I know I should know this but I am having a brain freeze!
In sheet 1 I have a list of people, and each column is a date.
Emp No. Rota 01/04/2018 02/04/2018 03/04/2018 04/04/2018 05/04/2018 06/04/2018 07/04/2018 08/04/2018 09/04/2018 10/04/2018 11/04/2018 12/04/2018 13/04/2018 14/04/2018 15/04/2018 16/04/2018 17/04/2018 18/04/2018 19/04/2018 20/04/2018 21/04/2018 22/04/2018
10087248 1
10111378 1
10104720 1
10103818 1
10128761 1
10109686 1
10110853 1
10123778 1
10105003 1
10115410 1
10109674 1
10117543 1
10114185 1
10105990 1
10114457 1
10087185 1
10121055 1
In sheet 2 I have a list of dates, and each column is a team.
Date 1 2 3 4 5 7A 7B R1 E1
Mon 01/01/2018 06:00 14:00 14:00 06:00 14:00 18:00 18:00 08:00 14:00
Tue 02/01/2018 06:00 14:00 14:00 06:00 14:00 18:00 18:00 08:00 14:00
Wed 03/01/2018 14:00 14:00 06:00 14:00 18:00 18:00 08:00 14:00
Thu 04/01/2018 06:00 14:00 14:00 06:00 14:00 18:00 18:00 08:00 14:00
Fri 05/01/2018 06:00 14:00 06:00 14:00 14:00 08:00 14:00
Sat 06/01/2018
Sun 07/01/2018 14:00 06:00 18:00
Mon 08/01/2018 14:00 06:00 06:00 14:00 14:00 18:00 18:00 08:00 08:00
Tue 09/01/2018 14:00 06:00 06:00 14:00 14:00 18:00 18:00 08:00 08:00
Wed 10/01/2018 14:00 06:00 14:00 14:00 18:00 18:00 08:00 08:00
Thu 11/01/2018 14:00 06:00 06:00 14:00 14:00 18:00 18:00 08:00 08:00
Fri 12/01/2018 06:00 06:00 14:00 14:00 14:00 08:00 08:00
Sat 13/01/2018
Sun 14/01/2018 14:00 06:00 18:00
Mon 15/01/2018 06:00 14:00 14:00 06:00 14:00 18:00 18:00 08:00 14:00
Tue 16/01/2018 06:00 14:00 14:00 06:00 14:00 18:00 18:00 08:00 14:00
I want to be able to see in sheet 1 when each person is due to start.
I am trying to use INDEX/MATCH but can't get it to work.
You need some common denominator in your data sheets.
That means both Table1 and Table2 need a column containing the same data, which will act as your criteria range.
Then it appears you need to use =VLOOKUP()
I can't really help you further unless you edit either your Sheet1 or Sheet2 so they have at least one column in common.
As of now, you have no way of matching something to a specific number, given there are no possible matches (unless I'm misunderstanding what you're trying to achieve).
OK, so I cheated. I made the dates match so both start from 01/03/2018, then I flipped the Rota so both tables had the dates across the top, and then just used =VLOOKUP($B2,Sheet3!$A:$QN,COUNT($C$1:C$1,FALSE))
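The "flip the rota, then look up by team" idea can also be sketched in pandas, for anyone who wants to check the result outside Excel. This is an illustration under assumptions: only a few dates and two teams from sheet 2 are included, the dates are read as dd/mm/yyyy, and the second employee and its rota are hypothetical.

```python
import pandas as pd

# Sheet 2 (excerpt): one row per date, one column per team, cells hold start times.
shifts = pd.DataFrame(
    {"1": ["06:00", "06:00", "14:00"], "2": ["14:00", "14:00", "06:00"]},
    index=pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-08"]),
)

# Sheet 1 (excerpt): employee number and rota (team).
# The second employee number and its rota are hypothetical.
employees = pd.DataFrame({"Emp No.": [10087248, 99999999], "Rota": ["1", "2"]})

# Flip the rota so teams are rows and dates are columns,
# then pick the team row that matches each employee's rota.
start_times = shifts.T.loc[employees["Rota"].tolist()]
start_times.index = employees["Emp No."].to_numpy()
print(start_times)
```

The rota value plays the role of the common key between the two tables.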
I have two data sets:
Data set 1: This has a 10-minute resolution and holds a binary flag indicating "System OK" or "System not OK". For example:
01/01/2018 12:10-12:20 System not OK
Data set 2: This is the fault log with hh:mm:ss timestamps indicating the start-end times of a fault. For example:
Active Fault code X: 01/01/2018 12:08:23-12:19:14
Ideally, every time span flagged as "System not OK" should have a fault log entry covering at least part of that 10-minute period. However, there are inconsistencies in both directions: either I see "System not OK" with no fault covering that 10-minute period, or there is a fault while the flag says "System OK".
What I would like to achieve is to filter the 10-minute time spans where there is an inconsistency of either kind (and ideally flag each one as "System OK but active fault" or "System not OK but no active fault").
Do you think that would be possible in Excel or VBA?
Thanks in advance for your help!
Fault Log
1 Dec 13:47 - 1 Dec 13:48
1 Dec 16:44 - 1 Dec 16:45
1 Dec 19:47 - 1 Dec 19:47
1 Dec 20:23 - 1 Dec 21:08
1 Dec 21:08 - 1 Dec 21:08
1 Dec 21:43 - 2 Dec 01:44
2 Dec 01:44 - 2 Dec 01:45
3 Dec 14:52 - 3 Dec 16:28
3 Dec 16:52 - 3 Dec 17:10
3 Dec 17:34 - 3 Dec 17:36
4 Dec 00:48 - 4 Dec 00:49
4 Dec 02:06 - 4 Dec 02:07
4 Dec 04:59 - 4 Dec 04:59
4 Dec 06:47 - 4 Dec 06:48
6 Dec 09:34 - 6 Dec 09:35
6 Dec 09:39 - 6 Dec 14:16
6 Dec 14:19 - 6 Dec 14:19
6 Dec 14:19 - 6 Dec 14:20
System Ok log
12/1/2018 12:00 OK
12/1/2018 12:10 NOK
12/1/2018 12:20 OK
12/1/2018 12:30 OK
12/1/2018 12:40 OK
12/1/2018 12:50 OK
12/1/2018 13:00 OK
12/1/2018 13:10 OK
12/1/2018 13:20 OK
12/1/2018 13:30 OK
12/1/2018 13:40 NOK
12/1/2018 13:50 OK
12/1/2018 14:00 OK
12/1/2018 14:10 OK
12/1/2018 14:20 OK
12/1/2018 14:30 OK
12/1/2018 14:40 OK
12/1/2018 14:50 OK
12/1/2018 15:00 OK
12/1/2018 15:10 OK
12/1/2018 15:20 OK
12/1/2018 15:30 OK
12/1/2018 15:40 OK
12/1/2018 15:50 OK
12/1/2018 16:00 OK
12/1/2018 16:10 OK
12/1/2018 16:20 OK
12/1/2018 16:30 OK
12/1/2018 16:40 OK
12/1/2018 16:50 OK
12/1/2018 17:00 OK
12/1/2018 17:10 OK
Desired outcome:
Evaluation
12/1/2018 12:00
12/1/2018 12:10 system NOK but no fault
12/1/2018 12:20
12/1/2018 12:30
12/1/2018 12:40
12/1/2018 12:50
12/1/2018 13:00
12/1/2018 13:10
12/1/2018 13:20
12/1/2018 13:30
12/1/2018 13:40
12/1/2018 13:50
12/1/2018 14:00
12/1/2018 14:10
12/1/2018 14:20
12/1/2018 14:30
12/1/2018 14:40
12/1/2018 14:50
12/1/2018 15:00
12/1/2018 15:10
12/1/2018 15:20
12/1/2018 15:30
12/1/2018 15:40
12/1/2018 15:50
12/1/2018 16:00
12/1/2018 16:10
12/1/2018 16:20
12/1/2018 16:30
12/1/2018 16:40 system OK but fault
12/1/2018 16:50
12/1/2018 17:00
12/1/2018 17:10
Here is an idea that may help you get started. Sorry, I have not tested it; this is only a starting point.
First add a reference to "Microsoft ActiveX Data Objects 6.1", then try the macro.
Basically you can filter out those records from TableA (that is on Sheet1) whose datetime values do not fall into any range in TableB (that is on Sheet2)
TableA is Sheet1 that has in first row (the header) TimeOk. Next rows have the data.
TimeOK
12/1/2018 12:00
TableB is Sheet2 that has in first row (the headers) FromTime and ToTime. Next rows have the data
FromTime ToTime
12/1/2018 13:47 12/1/2018 13:48
Set the cell type to date and format the values as datetime consistently on both sheets. Do not mix date formats as in the data you posted.
The macro should write the result to Sheet3 (make sure you add it).
Sub GetSpecialRecords()
    Dim sQuery As String, sFileName As String, sConnection As String
    Dim conn As New ADODB.Connection
    Dim rs As New ADODB.Recordset
    ' Open this workbook itself as an ADO data source
    sFileName = ThisWorkbook.FullName
    sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=""" & _
        sFileName & """;Extended Properties=""Excel 12.0 Xml;HDR=YES;IMEX=1"""
    conn.Open sConnection
    ' Select the TimeOk values that fall inside any FromTime-ToTime range
    sQuery = "SELECT t1.[TimeOk] FROM [Sheet1$] t1 INNER JOIN [Sheet2$] t2 " & _
        "ON t1.[TimeOk] >= t2.[FromTime] AND t1.[TimeOk] <= t2.[ToTime]"
    rs.Open sQuery, conn
    ' Write the matching rows to Sheet3, starting at A2
    ThisWorkbook.Sheets("Sheet3").Range("A2").CopyFromRecordset rs
    rs.Close
    conn.Close
End Sub
As I said, start with this; you will need a little SQL knowledge to adapt it and solve the problem.
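The evaluation described in the question can also be sketched as an interval-overlap check, shown here in Python to keep the logic visible. This is only an illustration under assumptions: each status row is taken to cover the 10 minutes starting at its timestamp, the fault log entries (which carry no year) are assumed to be from 2018, and the function name evaluate is made up for the example.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # assumption: each status row covers [t, t + 10 min)

def evaluate(status_log, faults):
    """Label each 10-minute window whose status disagrees with the fault log."""
    results = []
    for start, status in status_log:
        end = start + WINDOW
        # A fault overlaps the window if it starts before the window ends
        # and ends at or after the window starts.
        has_fault = any(f_start < end and f_end >= start for f_start, f_end in faults)
        if status == "NOK" and not has_fault:
            label = "system NOK but no fault"
        elif status == "OK" and has_fault:
            label = "system OK but fault"
        else:
            label = ""
        results.append((start, label))
    return results

# Excerpt of the sample data (year 2018 assumed for the fault log).
faults = [
    (datetime(2018, 12, 1, 13, 47), datetime(2018, 12, 1, 13, 48)),
    (datetime(2018, 12, 1, 16, 44), datetime(2018, 12, 1, 16, 45)),
]
status_log = [
    (datetime(2018, 12, 1, 12, 0), "OK"),
    (datetime(2018, 12, 1, 12, 10), "NOK"),
    (datetime(2018, 12, 1, 13, 40), "NOK"),
    (datetime(2018, 12, 1, 16, 40), "OK"),
    (datetime(2018, 12, 1, 16, 50), "OK"),
]

for start, label in evaluate(status_log, faults):
    print(start, label)
```

The key difference from a point-in-range join is that the whole 10-minute window is compared against each fault, so a one-minute fault inside the window is still detected.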