Non-standard Julian day time stamp - python-3.x

I have a timestamp in a non-standard format; it's a concatenation of several elements. I'd like to convert at least the last part of the string into hours/minutes/seconds/decimal seconds so I can calculate the time gap between two of them (typically of the order of 2-5 seconds).
I have looked at the question "How to convert Julian date to standard date?", but it assumes a 'proper' Julian time.
My time stamp looks like this
1380643373
It is set up as ddd hh mm ss.s
This timestamp represents the 138th day, 06:43:37.3
Is there a datetime method of working with this, or do I need to strip out the various parts (hh, mm, ss.s) and concatenate them in some way? As I am only interested in the seconds, if I can just extract them I could deal with that by adding 60 if the second timestamp is smaller than the first, i.e. when the event crosses a minute boundary.

If you're only interested in seconds, you can do:
timestamp = 1380643373
seconds = (timestamp % 1000) / 10 # Gives 37.3
timestamp % 1000 gives you the last three digits of timestamp. Then you divide that by 10 to get seconds.
If it's a string, you can take the last three characters by slicing it.
timestamp = "1380643373"
seconds = int(timestamp[-3:]) / 10 # Gives 37.3
It's pretty easy to convert the timestamp to a datetime using the divmod() function repeatedly:
import datetime
base_date = datetime.datetime(2000, 1, 1, 0, 0, 0) # Midnight on Jan 1 2000
timestamp = 1380643373
timestamp, seconds = divmod(timestamp, 1000) # Gives 1380643, 373
seconds = seconds / 10 # Gives 37.3
timestamp, minutes = divmod(timestamp, 100) # Gives 13806, 43
days, hours = divmod(timestamp, 100) # Gives 138, 6
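tdelta = datetime.timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds) # Gives datetime.timedelta(days=138, seconds=24217, microseconds=300000)
new_date = base_date + tdelta
# Note: if day 1 of the stamp means Jan 1 itself, use days=days - 1 so that day 138 lands on the 138th day of the year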
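Since the goal in the question is the gap between two stamps, here is a minimal sketch of that last step. The helper names parse_stamp and gap_seconds are made up for illustration, and the sketch assumes every stamp is laid out as ddd hh mm ss.s with no separators. It keeps the tenths as an integer until the end and works on timedeltas, so crossing a minute (or hour, or day) boundary needs no special handling.

```python
import datetime

def parse_stamp(stamp):
    """Convert a 'dddhhmmsss' stamp (hypothetical helper) into a timedelta."""
    value = int(stamp)
    value, tenths = divmod(value, 1000)   # last three digits: seconds in tenths
    value, minutes = divmod(value, 100)   # next two digits: minutes
    days, hours = divmod(value, 100)      # remaining digits: day number and hours
    return datetime.timedelta(days=days, hours=hours, minutes=minutes,
                              milliseconds=tenths * 100)

def gap_seconds(first, second):
    """Seconds elapsed from the first stamp to the second."""
    return (parse_stamp(second) - parse_stamp(first)).total_seconds()

# 138th day 06:43:37.3 -> 138th day 06:44:02.1 (crosses a minute boundary)
print(gap_seconds("1380643373", "1380644021"))
```

Because only the difference between two stamps is used, it does not matter here whether day 1 is treated as an offset of 0 or 1.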

Related

how can I use the epoch

I need to print out “Your birthday is 31 March 2001 (a years, b days, c hours, d minutes and e seconds ago).”
I create input
birth_day = int(input("your birth day?"))
birth_month = int(input("your birth month?"))
birth_year = int(input("your birth year?"))
and I understand that
print("your birthday is"+(birth_day)+(birth_month)+(birth_year))
should print the first sentence, but I am stuck on the second part: (a years, b days, c hours, d minutes and e seconds ago).
I guess I have to use “the epoch”
and some variables like the ones below:
year_sec=365*60*60*24
day_sec=60*60*24
hour_sec=60*60
min_sec=60
and then calculate how many seconds lie between 1 January 1970 00:00:00 UTC and the date:
import datetime, time
t = datetime.datetime(2001, 3, 31, 0, 0)
time.mktime(t.timetuple())
985960800.0
Could anyone help me solve this problem, please?
Thanks a lot
EDIT: See this answer in the thread kaya3 mentioned above for a more consistently reliable way of doing the same thing. I'm leaving my original answer below since it's useful to understand how to think about the problem, but just be aware that my answer below might mess up in tricky situations due to the quirks of the Gregorian calendar, in particular:
Every year that is exactly divisible by four is a leap year, except for years that are exactly divisible by 100, but these centurial years are leap years if they are exactly divisible by 400. For example, the years 1700, 1800, and 1900 are not leap years, but the years 1600 and 2000 are.
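The leap-year rule quoted above is exactly what a fixed seconds-per-year constant cannot capture. As a rough sketch of a calendar-aware alternative (not necessarily what the linked answer does), you can count whole calendar years with datetime.replace() and let timedelta handle the remainder. The names shift_years and elapsed_since are hypothetical, and treating a 29 February birthday as 1 March in non-leap years is an assumption of this sketch.

```python
import datetime

def shift_years(moment, years):
    """Return moment moved forward by a number of calendar years."""
    try:
        return moment.replace(year=moment.year + years)
    except ValueError:
        # 29 February in a non-leap target year: assume 1 March instead
        return moment.replace(year=moment.year + years, month=3, day=1)

def elapsed_since(birth, now):
    """Split now - birth into (years, days, hours, minutes, seconds)."""
    years = now.year - birth.year
    if shift_years(birth, years) > now:
        years -= 1                        # this year's anniversary is still ahead
    rest = now - shift_years(birth, years)
    hours, remainder = divmod(rest.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return years, rest.days, hours, minutes, seconds

birth = datetime.datetime(2001, 3, 31)
print(elapsed_since(birth, datetime.datetime(2020, 4, 1, 12, 30, 15)))
```

In real use you would pass datetime.datetime.now() as the second argument; a fixed value is used here so the result is reproducible.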
ORIGINAL ANSWER:
You can try using the time module:
import time
import datetime
def main(ask_for_hour_and_minute, convert_to_integers):
    year, month, day, hour, minute = ask_for_birthday_info(ask_for_hour_and_minute)
    calculate_time_since_birth(year, month, day, hour, minute, convert_to_integers)
def ask_for_birthday_info(ask_for_hour_and_minute):
    birthday_year = int(input('What year were you born in?\n'))
    birthday_month = int(input('What month were you born in?\n'))
    birthday_day = int(input('What day were you born on?\n'))
    if ask_for_hour_and_minute is True:
        birthday_hour = int(input('What hour were you born?\n'))
        birthday_minute = int(input('What minute were you born?\n'))
    else:
        birthday_hour = 0 # set to 0 as default
        birthday_minute = 0 # set to 0 as default
    return (birthday_year, birthday_month, birthday_day, birthday_hour, birthday_minute)
def calculate_time_since_birth(birthday_year, birthday_month, birthday_day, birthday_hour, birthday_minute, convert_to_integers):
    year = 31557600 # seconds in a year (365.25 days)
    day = 86400 # seconds in a day
    hour = 3600 # seconds in an hour
    minute = 60 # seconds in a minute
    # provide user info to datetime.datetime()
    birthdate = datetime.datetime(birthday_year, birthday_month, birthday_day, birthday_hour, birthday_minute)
    # time.mktime() returns the birthdate as seconds since the epoch (local time)
    birthdate_tuple = time.mktime(birthdate.timetuple())
    # figure out how many seconds ago birth was
    seconds_since_birthday = time.time() - birthdate_tuple
    # start calculations
    years_ago = seconds_since_birthday // year
    days_ago = seconds_since_birthday // day % 365
    hours_ago = seconds_since_birthday // hour % 24
    minutes_ago = seconds_since_birthday // minute % 60
    seconds_ago = seconds_since_birthday % minute
    # convert calculated values to integers if convert_to_integers is True
    if convert_to_integers is True:
        years_ago = int(years_ago)
        days_ago = int(days_ago)
        hours_ago = int(hours_ago)
        minutes_ago = int(minutes_ago)
        seconds_ago = int(seconds_ago)
    # print calculations
    print(f'Your birthday was {years_ago} years, {days_ago} days, {hours_ago} hours, {minutes_ago} minutes, {seconds_ago} seconds ago.')
# call main() in ONE of the following ways:
# to ask for just the year, month, and day
main(False, False)
# to ask for just the year, month, and day AND convert the answer to integer values
main(False, True)
# to ask for the year, month, day, hour, and minute
main(True, False)
# to ask for the year, month, day, hour, and minute AND convert the answer to integer values
main(True, True)
Tried to use descriptive variable names so the variables should make sense, but the operators might need some explaining:
10 // 3 # the // operator divides the numerator by the denominator and REMOVES the remainder, so answer is 3
10 % 3 # the % operator divides the numerator by the denominator and RETURNS the remainder, so the answer is 1
After understanding the operators, the rest of the code should make sense. For clarity, let's walk through it:
Create birthdate by asking the user for their information in the ask_for_birthday_info() function
Pass the information the user provided to the calculate_time_since_birth() function
Convert birthdate to a time tuple, pass it to time.mktime() to get seconds since the epoch, and store the result in birthdate_tuple
Figure out how many seconds have passed since the birthday and store it in seconds_since_birthday
Figure out how many years have passed since the birthday by dividing seconds_since_birthday by the number of seconds in a year
Figure out how many days have passed since the birthday by dividing seconds_since_birthday by the number of seconds in a day and keeping only the most recent 365 days (that's the % 365 in days_ago)
Figure out how many hours have passed since the birthday by dividing seconds_since_birthday by the number of seconds in an hour and keeping only the most recent 24 hours (that's the % 24 in hours_ago)
Figure out how many minutes have passed since the birthday by dividing seconds_since_birthday by the number of seconds in a minute and keeping only the most recent 60 minutes (that's the % 60 in minutes_ago)
Figure out how many seconds have passed since the birthday by taking the remainder of seconds_since_birthday divided by 60, which keeps only the most recent 60 seconds (that's the % minute in seconds_ago)
Then, we just need to print the results:
print(f'Your birthday was {years_ago} years, {days_ago} days, {hours_ago} hours, {minutes_ago} minutes, {seconds_ago} seconds ago.')
# if you're using a version of python before 3.6, use something like
print('Your birthday was ' + str(years_ago) + ' years, ' + str(days_ago) + ' days, ' + str(hours_ago) + ' hours, ' + str(minutes_ago) + ' minutes, ' + str(seconds_ago) + ' seconds ago.')
Finally, you can add some error checking to make sure that the user enters valid information, so that if they say they were born in month 15 or month -2, your program would tell the user they provided an invalid answer. For example, you could do something like this AFTER getting the birthday information from the user, but BEFORE calling the calculate_time_since_birth() function:
if not (1 <= month <= 12):
    print('ERROR! You provided an invalid month!')
    return
if not (1 <= day <= 31):
    # note this isn't a robust check, if user provides February 30 or April 31, that should be an error - but this won't catch that
    # you'll need to make it more robust to catch those errors
    print('ERROR! You provided an invalid day!')
    return
if not (0 <= hour <= 23):
    print('ERROR! You provided an invalid hour!')
    return
if not (0 <= minute <= 59):
    print('ERROR! You provided an invalid minute!')
    return
if not (0 <= second <= 59):
    print('ERROR! You provided an invalid second!')
    return
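One way to get the more robust check mentioned in the comments above is to let datetime.datetime() do the validation, since it raises ValueError for impossible dates such as February 30. This is only a sketch; parse_birthday is a hypothetical helper name, not part of the code above.

```python
import datetime

def parse_birthday(year, month, day, hour=0, minute=0):
    """Return a datetime, or None if the values do not form a real date."""
    try:
        return datetime.datetime(year, month, day, hour, minute)
    except ValueError:
        # covers month 15, February 30, April 31, hour 24, and so on
        return None

print(parse_birthday(2001, 3, 31))   # a valid date
print(parse_birthday(2001, 2, 30))   # None: February has no 30th day
```

The caller can then print a single error message whenever None comes back, instead of checking each field by hand.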

convert the h:m:s in minutes format

I have the following data. The idea is to multiply all the data.
However, the minute column is in h:m:s format, so whenever I try to multiply I get an error.
Moreover, I need to convert the h:m:s values to minutes before I multiply.
I tried the following to convert them to minutes:
time1 = df['time']
time2 = time1.hour * 60 + time1.minute + time1.second
Create timedeltas by to_timedelta, convert to seconds by Series.dt.total_seconds and divide by 60:
df['Minutes'] = pd.to_timedelta(df['(MIN)']).dt.total_seconds().div(60)
If the input values are Python time objects, convert them to strings first:
df['Minutes'] = pd.to_timedelta(df['(MIN)'].astype(str)).dt.total_seconds().div(60)
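To see what that conversion does for a single value, here is a small standard-library sketch of the same idea (build a timedelta, take its total seconds, divide by 60). The function name hms_to_minutes is made up for illustration, and the input is assumed to be an "h:m:s" string.

```python
import datetime

def hms_to_minutes(text):
    """Convert an 'h:m:s' string to a number of minutes (hypothetical helper)."""
    hours, minutes, seconds = (float(part) for part in text.split(":"))
    delta = datetime.timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return delta.total_seconds() / 60

print(hms_to_minutes("1:30:30"))   # 90.5
print(hms_to_minutes("0:45:00"))   # 45.0
```

Note that the seconds contribute a fraction of a minute, which the attempt in the question (adding time1.second directly) would not give.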

Comparison between dates starts with -1

I have the following code:
import pandas as pd
from datetime import datetime, timedelta
df = pd.DataFrame ({
'Date':['4/22/2020 14:32:10','4/21/2020 4:32:10','4/20/2020 1:32:10']
})
date ='04/22/2020'
datetime_object = datetime.strptime(date, '%m/%d/%Y')
df['Date'] = pd.to_datetime(df['Date'],format='%m/%d/%Y %H:%M:%S')
days_diff = (datetime_object - df['Date']).dt.days
print(days_diff)
0 -1
1 0
2 1
Why doesn't the result look like the one below? Why does the number of days start at -1 and not at 0?
0 0
1 1
2 2
This is because the day count is floored.
For the first case, '4/22/2020 14:32:10', the difference is about -14.5/24 = ~ -0.6 days, so the output is -1.
For the second case, '4/21/2020 4:32:10', the difference is about 19.5/24 = ~ 0.8 days, so the output is 0.
For the third case, '4/20/2020 1:32:10', the difference is about 46.5/24 = ~ 1.9 days, so the output is 1.
I hope it helps.
The solution is to convert all the datetimes to dates, as I have done with the 'Date' column in the following line:
days_diff = (datetime_object.date() - df['Date'].dt.date ).dt.days
In [32]: days_diff
Out[32]:
0 0
1 1
2 2
Name: Date, dtype: int64
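The same effect can be reproduced without pandas. The sketch below uses only the standard library and the three values from the question, and compares the day counts with and without the time-of-day part.

```python
from datetime import datetime

reference = datetime.strptime("04/22/2020", "%m/%d/%Y")
stamps = ["4/22/2020 14:32:10", "4/21/2020 4:32:10", "4/20/2020 1:32:10"]
parsed = [datetime.strptime(s, "%m/%d/%Y %H:%M:%S") for s in stamps]

# Full datetimes: the time of day pulls each floored day count down by one
with_time = [(reference - p).days for p in parsed]
# Dates only: whole calendar days between the two dates
dates_only = [(reference.date() - p.date()).days for p in parsed]

print(with_time)    # [-1, 0, 1]
print(dates_only)   # [0, 1, 2]
```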
The issue comes from subtracting a later datetime from an earlier one, which leaves you with a negative result. In the datetime module, subtracting one datetime object from another creates a timedelta object like so:
days1 = self.toordinal()
days2 = other.toordinal()
secs1 = self._second + self._minute * 60 + self._hour * 3600
secs2 = other._second + other._minute * 60 + other._hour * 3600
base = timedelta(days1 - days2,
secs1 - secs2,
self._microsecond - other._microsecond)
If we mimic that with your dates, we see the following days and secs created for each datetime object:
737537 0
737537 52330
Subtracting days2 from days1 and secs2 from secs1 means we pass the following to the timedelta object:
0 -52330
So we are asking for a timedelta object where the difference is 0 days and negative 52,330 seconds, which is quite correct. However, timedelta is a flexible object: it accepts fractional values and other units such as weeks or minutes, and it does not apply any limits to the values. So in the seconds part you can pass 10 seconds or 100,000 seconds. Now, 100,000 seconds is more seconds than there are in a day, so the code takes this into account and uses divmod on the seconds to work out whether they contain any extra days:
days, seconds = divmod(seconds, 24*3600)
d += days
s += int(seconds) # can't overflow
The issue lies in understanding what divmod does: it returns the floor division and the remainder of the calculation. In the positive case that's fine:
print(divmod(52330, 24*3600))
print(divmod(-52330, 24*3600))
(0, 52330)
(-1, 34070)
In the positive case, the floor division rounds down to 0 days and returns the remaining seconds. In the negative case, however, -52330 / 86400 is -0.6056..., so the floor division rounds this down to -1 and the remainder is the difference between 86400 and 52330, which leaves 34070 seconds.
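You can see this normalization directly on a timedelta built from the same numbers. This short sketch also shows two ways to get at the "real" size of a negative difference: total_seconds() keeps the sign, and abs() flips the timedelta to its positive form.

```python
from datetime import timedelta

delta = timedelta(days=0, seconds=-52330)

# Stored form after normalization: -1 day plus 34070 seconds
print(delta.days, delta.seconds)   # -1 34070
# The signed total is unchanged
print(delta.total_seconds())       # -52330.0
# The magnitude of the gap
print(abs(delta))                  # 14:32:10
```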
So you wouldn't face this issue if you always subtract the older date from the newer date, so that you never end up with a negative difference. In fact, it doesn't make sense to subtract a newer date from an older date.
For the other cases you listed, the difference between 4/21/2020 4:32:10 and 4/22/2020 00:00:00 is indeed 0 days, since the difference is actually only about 20 hours. This behavior is correct: the difference is not 1 day, it's about 20 hours.

How to convert timestamp into milliseconds in python

I am trying to write a basic script that can read in a timestamp as a string and convert it into milliseconds. The timestamps I am working with are in minute:second.millisecond format.
from datetime import datetime
timestamp_start = '54:12.123'
MSM = '%M:%S.%f'
zero = '00:00.000'
start_sec = (datetime.strptime(timestamp_start, MSM) - datetime.strptime(zero, MSM)).total_seconds()
start_ms = start_sec * 1000
print(start_ms)
This may be a roundabout approach, but I am first using datetime.strptime to get a datetime object, then subtracting the zero timestamp in order to get a timedelta object, getting the total seconds of the timedelta object, and finally multiplying by 1000 to convert to milliseconds.
The above code works fine, except for any timestamps over an hour.
The issue that I am running into is that the timestamps do not have an hour counter. For example, 1 hour, 5 minutes, and 30 seconds comes in as 65:30.000. datetime.strptime cannot recognize this format, as it only allows the minutes to be between 0 and 59.
How can I convert these timestamps into a format recognizable by datetime? Should I first get the timestamp into hour:minute:second:millisecond format? Keep in mind the end goal is to convert these timestamps into milliseconds. If there is a better approach any suggestions are more than welcomed!
'54:12.123' isn't really a timestamp, but elapsed time, and there's no built-in method in Python that can deal with elapsed time with a format string like a timestamp format.
Since the format string in question is simply minutes and seconds separated by a colon, and seconds and milliseconds separated by a period, you can easily parse it with the str.split method:
def convert(msf):
    minutes, seconds = msf.split(':')
    seconds, milliseconds = seconds.split('.')
    minutes, seconds, milliseconds = map(int, (minutes, seconds, milliseconds))
    return (minutes * 60 + seconds) * 1000 + milliseconds
so that convert('54:12.123') returns:
3252123
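The function above relies on the fractional part always having exactly three digits (for instance, '0:01.5' would be read as 5 ms rather than 500 ms). If that is not guaranteed in your data, a variant along these lines might be safer. It is only a sketch; to_milliseconds is a hypothetical name, and Decimal is used to avoid float rounding.

```python
from decimal import Decimal

def to_milliseconds(stamp):
    """Convert 'minutes:seconds.fraction' elapsed time to whole milliseconds."""
    minutes, seconds = stamp.split(":")
    total_seconds = int(minutes) * 60 + Decimal(seconds)
    return int(total_seconds * 1000)

print(to_milliseconds("54:12.123"))   # 3252123
print(to_milliseconds("65:30.000"))   # 3930000 (minutes above 59 are fine)
print(to_milliseconds("0:01.5"))      # 1500
```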

Summing time fields over 24 hours in Power Query

I have a Power Query in Excel linked to another file. This file has a time column. I understand that the M language will not sum above 24 hours automatically without some work, as it uses a datetime reference; hence if I import a time of 25 hours it wraps around to 1 hour.
In the 3rd column along in my image below, using the second row as a reference, this is actually supposed to read 47:47:38. How can I get the instances where the value is above 24 hours to show the true hours?
I have tried using duration.hours(#hours()), but this also does not work for some reason.
The same data from the source excel file is below also
Power Query doesn't have custom formats for how it displays data. If you have it read your data as a Duration instead of a DateTime it will display as [d].hh.mm.ss format, but still not with the total hours. Ultimately though this doesn't really matter because even when your data is formatted to display total hours in Excel, it's really being stored internally as days+hours+minutes+seconds. So how it displays in Power Query doesn't matter, as you can just use the hour formatting wherever you output the data to.
Now if you need to use the hours in a calculation with something that isn't another Duration, you can extract the hours by doing
Duration.Days([Your Hours]) * 24 + Duration.Hours([Your Hours])
Or now that I look at it, there is also a TotalHours function that gives you the hours plus mm:ss as a fractional amount of that
Duration.TotalHours([Your Hours])
Power BI doesn't handle this case very gracefully. A solution could be to convert the duration to a number to make it additive (so you can perform calculations and aggregations) and when you need to visualize it, to convert it to the desired format (HH:MM:SS).
Duration and Time are often confused. When such Excel files are read, the type of the column usually is DateTime, and date 1899-12-31 is added to the "time" part. You can change the data type of the column to be Decimal Number, but the "zero point" in Excel unfortunately is one day off (1899-12-30), so you need to subtract 1 from the result to get the actual "number of days" of the duration (i.e. 0.25 means 06:00:00).
So you must perform some conversion of the data. I would make a new column in the model to get the duration in the lowest granularity that I need (seconds in your example). In Power Query Editor add a custom column to calculate the duration in seconds (where Column1 is the name of the original duration column):
Duration in seconds = Duration.TotalSeconds([Column1] - #datetime(1899, 12, 31, 0, 0, 0))
Make sure the data type of this column is Whole Number (change it if necessary). Here 9144 seconds are calculated as 2 * 3600 + 32 * 60 + 24, or 02:32:24. Now you can calculate a sum on this column to get the total duration in seconds, for example. But when you visualize this column, don't do it directly; instead, make a measure to convert the data to the desired format. It could be made like this:
Measure Duration =
VAR duration_in_seconds = SUM(Sheet1[Duration in seconds])
VAR hours = ROUNDDOWN ( duration_in_seconds / 3600; 0 )
VAR minutes = ROUNDDOWN ( MOD ( duration_in_seconds; 3600 ) / 60; 0 )
VAR seconds = INT ( MOD ( duration_in_seconds; 60 ) )
RETURN hours & ":" & FORMAT(minutes; "00") & ":" & FORMAT(seconds; "00")
The duration_in_seconds variable holds the total duration in seconds of the data in the context. From it we calculate hours, minutes and seconds and construct a string to represent the duration in the desired format. FORMAT is used to make sure there is a leading zero in case minutes or seconds are less than 10.
Here is how all three columns look when visualized:
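For readers who want to check the arithmetic outside Power BI, the same breakdown can be sketched in Python. This is not DAX, just an illustration of the calculation; format_duration is a made-up name.

```python
def format_duration(total_seconds):
    """Format a number of seconds as H:MM:SS, with hours allowed to exceed 24."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

print(format_duration(9144))     # 2:32:24
print(format_duration(172058))   # 47:47:38
```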
Hope this helps!

Resources