Stata/Excel time formatting issue - excel

I have a variable minutes that I am having trouble cleaning/standardizing.
It is imported from Excel in a Date Time format, but I just want the minutes. For example, if a player plays 30 minutes and 34 seconds, it's displayed as 30:34 or 30:34:00. However, it's stored in Excel as 6:34 AM (30:34 is interpreted as military time), or 12:34 AM, depending on whether it is 30:34 or 30:34:00. Thus it ends up getting imported into Stata as 6:34 or 12:34, when the value I want is what's displayed (30:34 or 30:34:00). Is there a way to format a number in Excel so that it is just the value that is displayed?
Once it's imported into Stata it's impossible to standardize, because you cannot differentiate a player that plays 30:34 (when it displays 30:34:00) from a player that plays 6:34 (they will both show 6:34).

Please make a greater effort when posting questions. People in a position to help might ignore the question because it is difficult to understand, because you provide no code (and thus show no effort), because the problem is not reproducible, and more.
Suppose an MS Excel sheet like the following
Then the following should get you started:
clear
set more off
import excel timetest.xls, cellrange(C2:C4) firstrow
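* Excel read mm:ss as hh:mm, so the clock hour holds the minutes played
gen hour = hh(time)
* ...and the clock minute holds the seconds played
gen sec = mm(time)
* 30:34 was stored as 06:34 after one 24-hour rollover, so add 24 back
gen realmin = hour + 24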
order realmin, before(sec)
list
resulting in
. list
     +-------------------------------------------+
     |               time   hour   realmin   sec |
     |-------------------------------------------|
  1. | 01jan1900 06:34:00      6        30    34 |
  2. | 01jan1900 00:00:00      0        24     0 |
     +-------------------------------------------+
See help datetime. This is a must if working with dates and/or time.
Note that simply adding 24 won't work for every data set.
A general solution takes the form of
clear
set more off
import excel timetest.xls, cellrange(C2:C5) firstrow
gen hour = hh(time)
gen sec = mm(time)
* date part of the imported datetime
gen t = dofc(time)
format t %td
* number of 24-hour rollovers (01jan1900 counts as one)
gen dayselap = t - td(01jan1900) + 1
gen realmin = hour + (24 * dayselap)
drop hour t dayselap
order time realmin
list
For example, this data in MS Excel:
time
30:34:00
24:00:00
58:04:00
65:00:00
will produce
. list
     +------------------------------------+
     |               time   realmin   sec |
     |------------------------------------|
  1. | 01jan1900 06:34:00        30    34 |
  2. | 01jan1900 00:00:00        24     0 |
  3. | 02jan1900 10:04:00        58     4 |
  4. | 02jan1900 17:00:00        65     0 |
     +------------------------------------+
(There might be an issue with leap years that you are encouraged to research yourself.)
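For readers who do not use Stata, the general solution above can be sketched in Python. This is only an illustration of the same arithmetic; the function name and the assumption that Excel's day 1 arrives as 1 January 1900 (as in the listing above) are mine.

```python
from datetime import date, datetime

# Assumption: Excel day 1 is imported as 01jan1900, as in the Stata listing.
EXCEL_DAY_ONE = date(1900, 1, 1)


def real_minutes_seconds(imported: datetime) -> tuple[int, int]:
    """Return (minutes, seconds) for a value Excel parsed as hh:mm instead of mm:ss."""
    # Number of 24-hour rollovers; 01jan1900 itself counts as one.
    days_elapsed = (imported.date() - EXCEL_DAY_ONE).days + 1
    minutes = imported.hour + 24 * days_elapsed  # clock hour is really minutes
    seconds = imported.minute                    # clock minute is really seconds
    return minutes, seconds


rows = [
    datetime(1900, 1, 1, 6, 34),   # displayed in Excel as 30:34:00
    datetime(1900, 1, 1, 0, 0),    # 24:00:00
    datetime(1900, 1, 2, 10, 4),   # 58:04:00
    datetime(1900, 1, 2, 17, 0),   # 65:00:00
]
for row in rows:
    print(row, real_minutes_seconds(row))
```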
An alternative solution may involve converting the values to text (within MS Excel) and then managing the text within Stata.

I am not at all familiar with Stata.
The problem is that the actual data in Excel is being entered incorrectly. Excel stores dates and times as days + fractions of a day, and then displays it according to the formatting.
So:
30:34 is really 0 hours, 30 minutes, 34 seconds and is stored as 2.1226851851851854E-2, which is the computation 30/(24*60)+34/(24*60*60).
However, 30:34:00 is really 30 hours, 34 minutes, 0 seconds and is stored as 1.273611111111111, which is the computation 30/24+34/(24*60).
Whether it is entered properly as 0:30:34 or improperly as 30:34 determines the value that is stored.
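The two stored values can be checked with a few lines of Python. This is just the day-fraction arithmetic described above; `excel_serial` is a name made up for the illustration, not an Excel or library function.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def excel_serial(hours: int, minutes: int, seconds: int) -> float:
    """Fraction of a day (possibly > 1) for a duration, the way Excel stores times."""
    return (hours * 3600 + minutes * 60 + seconds) / SECONDS_PER_DAY


as_min_sec = excel_serial(0, 30, 34)   # intended: 30 minutes 34 seconds
as_hr_min = excel_serial(30, 34, 0)    # stored instead: 30 hours 34 minutes

print(as_min_sec)
print(as_hr_min)
```

The second value is 60 times the first, which is why a mis-entered cell can be repaired by dividing by 60.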
In Excel, you could possibly pre-process the values if you have some test to tell if the data was entered correctly or not.
For example:
=IF(time_unreasonable,A1/60,A1)
would convert the values if they had been incorrectly entered as hours:minutes rather than minutes:seconds.
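Here is a rough Python sketch of the same pre-processing idea. The "unreasonable" test is purely an assumption for illustration (anything of a full day or more is treated as mis-entered); you would need to pick a rule that fits your own data.

```python
def corrected_serial(serial: float, max_reasonable_days: float = 1.0) -> float:
    """Divide by 60 when the stored value looks like hours:minutes instead of minutes:seconds."""
    # Assumption: a duration of a day or more can only come from a mis-entered cell.
    if serial >= max_reasonable_days:
        return serial / 60
    return serial


def to_minutes_seconds(serial: float) -> tuple[int, int]:
    """Convert a day fraction to whole (minutes, seconds)."""
    total_seconds = round(serial * 24 * 60 * 60)
    return divmod(total_seconds, 60)


wrong = 30 / 24 + 34 / (24 * 60)   # 30:34:00 stored as 30 hours 34 minutes
print(to_minutes_seconds(corrected_serial(wrong)))
```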
Other solutions may also be feasible, perhaps even in Stata.

Related

Using MOD with Time values to determine time differences that transition midnight

I created an Excel Spreadsheet with a Total Time for the Duration of the Shift ie 8:30-17:30.
Then I created this formula
=(MOD([@[Supposed Shift End]]-[@[Supposed Shift Start]],1))*24
with Format cells -> Number 2 decimal places. This gives me the [Total Supposed Shift Hours]
Giving me Duration of the Shift that needs to be covered.
Now I also created another column for the ACTUAL time the shift covered
=(MOD([@[Actual Time End]]-[@[Actual Time Start]],1))*24
This gives me the [Total Actual Time Hours]
For Actual Time End and Actual Time Start, when the employee DIDN'T show up we entered 0 in both cells, with the same cell format (Number -> 2 decimal places).
The Problem:
This is the formula I wrote to subtract these 2 Columns
=(MOD([@[Total Supposed Shift Hours]]-[@[Total Actual Time Hours]],1))*24
Why when I subtract say the [Total Supposed Shift Hours] 9.00 - the [Total Actual Time Hours] 0.00 = 24.00 ???
9 - 0 = 9 not 24.... sigh
Will the formula be affected if the time goes from previous night 21:00 to 8:00 with the MOD formula?
Sample data (as requested)
Note: Nomenclature differs from description above: Open = Supposed and Covered = Actual
Service Date | Open Post Start | Open Post End | Total Hrs Missing | Covered Post Start | Covered Post End | Total Hrs Covered | Category | Hours Not Covered
02/06/2021 | 16:00 | 00:00 | 8 | 16:00 | 03:00 | 11 | A | 0
04/06/2021 | 16:00 | 00:00 | 8 | 00:00 | 00:00 | 0 | A | 0
10/21/2021 | 10:30 | 00:00 | 13.5 | 18:00 | 19:30 | 1.5 | B | 0
Initial Answer
A minor point first: you don't need to wrap the MOD function in parentheses, as the function already returns a value that can be multiplied by 24. Thus the following works just fine:
=MOD([@[Supposed Shift End]]-[@[Supposed Shift Start]],1)*24
To your question: Your non-working formula reads as if it simply wants the difference (in hours) between Supposed and Actual. If that's so, simply do this:
=[@[Total Supposed Shift Hours]]-[@[Total Actual Time Hours]]
EDIT: Using the (now posted) table, I've constructed what I think you're trying to do. Refer to the sample results image below.
Formula in Column E: =MOD([@[Open Post End]]-[@[Open Post Start]],1)*24
Formula in Column H: =MOD([@[Covered Post End]]-[@[Covered Post Start]],1)*24
Formula in Column J: =[@[Total Hours Missing]]-[@[Total Hours Covered]]
Sample results:
Now: If Column J (i.e. the response to your core question) isn't the result you're after, can you tell me what it is you would expect there (using actual expected values for each row).
Notes:
Your table column header Total Hours Missing is somewhat confusing.
But I'm reading that as a Post Duration (Duration of the Shift in your original parlance).
If I understand what transpired correctly, the "spanner in the works" was @P.b's suggestion to remove MOD. Revert your formulas in columns E and H as shown above.
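To make the logic of the three formulas concrete, here is a small Python sketch run against the sample data from the question. The helper names are mine; the modulo wraps shifts that cross midnight (the job MOD(...,1) does in Excel), while the final difference is a plain subtraction of two hour counts and needs no modulo.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_of(hhmm: str) -> int:
    """Minutes since midnight for an 'HH:MM' string."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def duration_hours(start: str, end: str) -> float:
    """Shift length in hours; the modulo wraps shifts that end after midnight."""
    return ((minutes_of(end) - minutes_of(start)) % MINUTES_PER_DAY) / 60


# (open start, open end, covered start, covered end) from the sample table
rows = [
    ("16:00", "00:00", "16:00", "03:00"),
    ("16:00", "00:00", "00:00", "00:00"),
    ("10:30", "00:00", "18:00", "19:30"),
]
for open_start, open_end, cov_start, cov_end in rows:
    missing = duration_hours(open_start, open_end)
    covered = duration_hours(cov_start, cov_end)
    # Both values are already hours, so no modulo on the difference.
    print(missing, covered, missing - covered)
```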

aggregate hours minutes seconds text columns to time - Excel

I have 3 columns formatted as text: hours, minutes, seconds
I need to consolidate them in a time format
e.g.
1 34 56 -> I need a cell with 1:34:56
23 02 11 -> 23:02:11
Is this possible without macro/code?
I had issues with escaping the colon
Did you look up the "TIME" built in function? See the screen shot ...
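As a rough illustration of what that conversion does, here is the same idea in Python: the three text values are turned into numbers and combined into one time value. In Excel the equivalent would presumably be something like =TIME(A1,B1,C1), assuming the three text cells sit in A1:C1; the Python helper names below are made up.

```python
from datetime import time


def combine(hours: str, minutes: str, seconds: str) -> time:
    """Build a time value from three text fields such as '23', '02', '11'."""
    return time(int(hours), int(minutes), int(seconds))


def as_text(value: time) -> str:
    """Format as h:mm:ss without a leading zero on the hour."""
    return f"{value.hour}:{value.minute:02d}:{value.second:02d}"


print(as_text(combine("1", "34", "56")))
print(as_text(combine("23", "02", "11")))
```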

Adding/Subtracting Whole numbers from Time

I have tried every which way to format cells to subtract a whole number from a time. For instance, the formula in a cell returns 11 (this is 11 minutes), and I want to subtract that result from 8:00:00 to get 7:49:00. It doesn't work: the result is ####### no matter how wide I make the cell. And if I format the cells with the formula as custom [m]:ss, the value changes.
Sample of the Worksheet:
I want Y2 = X3-W3 in a time format.
So, if A1=11
Then in some other cell, (B1 in this example): =TIME(,A1,)
Then subtract from the cell with 8:00:00. (If it's C1...:)
=C1-B1
That will give you the time you want.
Info: The main thing is that you have to tell Excel that your cell with the "11" in it, is minutes. By using the =TIME(,A1,) you will get the value of: 12:11 am. (If you keep it in Date format.) 12:11 am could also be viewed as: 0 Hours, 11 minutes, 0 seconds. And now that it knows, you should be able to subtract.
Try this:
=TIME(HOUR(X3),MINUTE(X3)-W3,SECOND(X3))
The ######### appears because you have a negative time. Because Excel stores time as a fraction of a 24-hour day, 8:00:00 is .3333, and if you subtract 11 from that you get -10.6667; a date/time cannot be negative.
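The same point can be seen outside Excel. In this small Python sketch, the bare number 11 has to be declared as minutes before the subtraction makes sense; subtracting the raw number from the day fraction gives the negative value mentioned above.

```python
from datetime import timedelta

start = timedelta(hours=8)           # 8:00:00
worked = timedelta(minutes=11)       # tell Python that 11 means minutes
print(start - worked)                # 7:49:00

# What happens when 11 is treated as 11 whole days instead of 11 minutes:
day_fraction = 8 / 24                # 8:00:00 as a fraction of a day
print(round(day_fraction - 11, 4))   # -10.6667
```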

How to count hours in excel

I have xls file in following format
Name 1 2 3 4
John 09:00-21:00 09:00-21:00
Amy 21:00-09:00 09:00-21:00
Where 1,2,3,4 and so on represent days of current month,
09:00-21:00 - working hours.
I want to calculate salary based on the following conditions:
09:00-21:00 - 10$/hour
21:00-00:00 - 15$/hour
00:00-03:00 - 20$/hour
etc.
and so on (every hour can have its own cost, for example 03:00-04:00 - 20$/hour, 04:00-05:00 - 19$/hour, etc.)
How can I accomplish this using only Excel (functions or VBA)?
P.S. Easy way: export to csv and process in python/php/etc.
Here is a non-VBA solution. It's a pretty nasty formula, but it works. I am sure it could be made even easier to use and understand with some more ingenuity:
Assuming the spreadsheet is set up like this:
Enter this formula in cell G1 and drag down for your data set:
=IF(ISBLANK(B2),"",IF(LEFT(B2,2)<MID(B2,FIND("-",B2)+1,2),SUMIFS($P$2:$P$24,$Q$2:$Q$24,">="&LEFT(B2,2),$Q$2:$Q$24,"<="&MID(B2,FIND("-",B2)+1,2)),SUMIF($Q$2:$Q$24,"<="&MID(B2,FIND("-",B2)+1,2),$P$2:$P$24)+SUMIF($Q$2:$Q$24,">="&LEFT(B2,2),$P$2:$P$24)))
To explain the formula in detail:
IF(ISBLANK(B2),"" returns an empty string if there is no time for a given person/day combination.
LEFT(B2,2) extracts the start hour.
MID(B2,FIND("-",B2)+1,2) extracts the end hour.
IF(LEFT(B2,2)<MID(B2,FIND("-",B2)+1,2) checks whether the start hour is less than the end hour (meaning no overnight work). If it is, this part of the formula calculates the total cost: SUMIFS($P$2:$P$24,$Q$2:$Q$24,">="&LEFT(B2,2),$Q$2:$Q$24,"<="&MID(B2,FIND("-",B2)+1,2))
If the start hour is higher than the end hour (meaning overnight work), this part is used instead: SUMIF($Q$2:$Q$24,"<="&MID(B2,FIND("-",B2)+1,2),$P$2:$P$24)+SUMIF($Q$2:$Q$24,">="&LEFT(B2,2),$P$2:$P$24)
The use of FIND("-",[cell]) splits the start and end times into values Excel can use to do math against the Time / Cost table.
The formula in column Q of the Time / Cost table is =VALUE(MID(O2,FIND("-",O2)+1,2)); it turns the ending hour of each cost band into a number Excel can add, instead of the text from your original source format.
Do this in VBA! It is native to excel and is easy to learn. Functionally, I would loop through the table, write a function to calculate the dollars earned based on the info given. If you want your results to be live updating (like a formula in excel) you can write a user defined function. A helpful function might be an HoursIntersect function, as below:
Public Function HoursIntersect(ByVal Period1Start As Date, ByVal Period1End As Date, _
                               ByVal Period2Start As Date, ByVal Period2End As Date) _
                               As Double
    Dim result As Double
    ' If an end is earlier than its start, assume the period rolls over to
    ' a new day. ByVal keeps these adjustments out of the caller's variables.
    If Period1End < Period1Start Then Period1End = Period1End + 1
    If Period2End < Period2Start Then Period2End = Period2End + 1
    With WorksheetFunction
        ' Overlap in days, clamped at zero, then converted to hours
        result = .Min(Period1End, Period2End) - .Max(Period1Start, Period2Start)
        HoursIntersect = .Max(result, 0) * 24
    End With
End Function
Then you can determine the start and end time by splitting the value on the "-" character. Then multiply each payment schedule by the hours worked within that time:
DollarsEarned = DollarsEarned + 20 * HoursIntersect(StartTime, EndTime, #00:00:00#, #03:00:00#)
DollarsEarned = DollarsEarned + 10 * HoursIntersect(StartTime, EndTime, #09:00:00#, #21:00:00#)
DollarsEarned = DollarsEarned + 15 * HoursIntersect(StartTime, EndTime, #21:00:00#, #00:00:00#)
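One case worth checking with this approach: the function compares both periods on the same day, so a 21:00-09:00 shift, which runs into the next day, gets no overlap with the 00:00-03:00 band. Below is a Python sketch of the same intersect idea that also tries each band shifted by a day. Everything here is illustrative: the names are made up, the rate table only contains the three bands listed in the question (so 03:00-09:00 earns nothing), and times are plain hours as floats.

```python
def overlap_hours(start: float, end: float, band_start: float, band_end: float) -> float:
    """Hours shared by a shift and a pay band, both given as clock hours (0-24)."""
    # An end at or before its start is assumed to roll over to the next day.
    if end <= start:
        end += 24
    if band_end <= band_start:
        band_end += 24
    total = 0.0
    # Try the band on the previous, same and next day to catch overnight shifts.
    for shift in (-24, 0, 24):
        low = max(start, band_start + shift)
        high = min(end, band_end + shift)
        total += max(high - low, 0)
    return total


# (band start, band end, dollars per hour); only the bands from the question
RATES = [(9, 21, 10), (21, 0, 15), (0, 3, 20)]


def to_hours(hhmm: str) -> float:
    hours, minutes = hhmm.split(":")
    return int(hours) + int(minutes) / 60


def dollars_earned(cell: str) -> float:
    """Pay for one cell such as '21:00-09:00'."""
    start_text, end_text = cell.split("-")
    start, end = to_hours(start_text), to_hours(end_text)
    return sum(rate * overlap_hours(start, end, b0, b1) for b0, b1, rate in RATES)


print(dollars_earned("09:00-21:00"))
print(dollars_earned("21:00-09:00"))
```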
I have a method that uses nothing but formulas. First create a lookup table which contains every hour and rate in say columns K & L, something like this:
K L
08:00 15
09:00 10
10:00 10
11:00 10
12:00 10
13:00 10
14:00 10
15:00 10
16:00 10
17:00 10
18:00 10
19:00 10
20:00 10
21:00 15
22:00 15
23:00 15
Make sure you enter the hours as text by entering a single quote before the digits.
Then if your hours were in cell B2 you could then use this formula to calculate the total:
=SUM(INDIRECT("L"&MATCH(LEFT(B2,5),K2:K40,0)&":L"&MATCH(RIGHT(B2,5),K2:K40,0)))
All the formula is doing is getting the left and right text of your work time, using MATCH to find their positions in the lookup table which is used to create a range address which is then passed to SUM via the INDIRECT function.
If you need to worry about minutes all you need to do is create a bigger lookup table which holds every minute of the day. You may need to add some extra logic if your work days span midnight.

Convert 'x hrs y min z sec' to seconds

a) So I have a huge folder of .csv data with a column about time duration where the cells are 'x min y sec' (e.g. 15 min 29 sec) or 'x hrs y min z sec' (e.g. 1 hrs 48 min 28 sec). The cells are formatted as text.
I want to batch change them to the number of seconds, but I have no idea where to start. I can't get the data in another format.
I thought about somehow using 'hrs', 'min' or 'sec' as delimiters, but I don't know how to move from there. I also thought about using ' ' as delimiters, but then the first column is filled with either hours or minutes depending on the time duration.
I also thought about using PostgreSQL's SELECT EXTRACT(EPOCH FROM INTERVAL '5 days 3 hours'), but I haven't been able to work out how to use this on a column from a table.
b) Is there a better way to change this time format 'Fri Mar 14 11:29:27 EST 2014' to epoch time? Right now I'm thinking of using macros in Excel to get rid of 'Fri' and 'EST', then put the columns back together, then use the to_timestamp function in PostgreSQL.
In Excel if you have data in only those 2 formats and starting from A2 you can use this formula in B2 copied down to get the number of seconds:
=IFERROR(LEFT(A2,FIND("hrs",A2)-1)*3600,0)+SUM(MID(0&A2,FIND({"min","sec"},0&A2)-3,2)*{60,1})
It finds the relevant text then gets the number in front for each and multiplies by the relevant number to get seconds
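If you would rather do this outside Excel, the same "find the unit, take the number in front of it, multiply" idea can be sketched in Python with a regular expression. The function name is made up, and it assumes the only units are the 'hrs', 'min' and 'sec' shown in the question.

```python
import re

# Seconds per unit, for the three units that appear in the question
UNIT_SECONDS = {"hrs": 3600, "min": 60, "sec": 1}


def to_seconds(text: str) -> int:
    """Convert 'x hrs y min z sec' or 'x min y sec' to a number of seconds."""
    pairs = re.findall(r"(\d+)\s*(hrs|min|sec)", text)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in pairs)


print(to_seconds("15 min 29 sec"))
print(to_seconds("1 hrs 48 min 28 sec"))
```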
You can do:
SELECT EXTRACT(EPOCH FROM column_name::interval)
FROM my_table;
The interval can use the regular time units (like hour), abbreviations thereof (hr) and plurals (hours). I am not sure about a combination of plural and abbreviation (hrs) though. If that does not work, UPDATE the column and replace() the sub-string "hrs" to "hours".
If you want to save the number of seconds in your table, then you convert the above statement into an UPDATE statement:
UPDATE my_table SET seconds_column = extract(epoch FROM column_name::interval);
I would split with space as the delimiter, then examine the second column. If it contains the string "hrs", then your seconds answer is:
3600 * column 1 + 60 * column 3 + column 5
Otherwise it is:
60 * column 1 + column 3
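A minimal Python sketch of that split-on-space rule, assuming every cell has exactly one of the two layouts from the question (the function name is mine):

```python
def to_seconds_split(text: str) -> int:
    """Apply the column rule above; 'column 1' is parts[0], 'column 3' is parts[2], etc."""
    parts = text.split(" ")
    if parts[1] == "hrs":
        # x hrs y min z sec
        return 3600 * int(parts[0]) + 60 * int(parts[2]) + int(parts[4])
    # x min y sec
    return 60 * int(parts[0]) + int(parts[2])


print(to_seconds_split("1 hrs 48 min 28 sec"))
print(to_seconds_split("15 min 29 sec"))
```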

Resources