I am reading in timestamps as strings from a service that reports UNIX time in nanoseconds. This creates an obvious problem: the values are so large that I can't apply my usual operations to normalize them to seconds. An example of one of these strings is '1589212802642680000', or about 1.58921E+18 in scientific notation.
I was trying something like this: convert_fills_df['timeStamp'] = convert_fills_df.timeStamp.apply(lambda x: UNIX_EPOCH + (float(x)/1000000000)). But I overflow the float object when I try this. Is there a string operation I can do without losing precision down to the second? Nanoseconds are not necessary for my purposes (though I appreciate their thoroughness). If I could keep the nanoseconds, that's great too, but it is not a necessity.
I would like to just convert the time to a human readable format in 24 hour clock format.
The first 10 digits represent the seconds; the subsequent digits carry the milli-, micro-, and nanosecond precision.
To keep all the information, you can insert a '.' at the right position and pass the string to pd.to_datetime:
df = pd.DataFrame({'ns': ['1589212802642680000']})
pd.to_datetime(df.ns.str[:10] + '.' + df.ns.str[10:], unit='s')
# outputs
0 2020-05-11 16:00:02.642679930
Name: ns, dtype: datetime64[ns]
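Note that unit='s' routes the value through a float, which is why the nanosecond digits drift to ...679930 above. If you can rely on the strings always being plain integer nanosecond counts (an assumption on my part), one sketch that keeps full precision is to cast to int64 and let pd.to_datetime use its default nanosecond unit:
pd.to_datetime(df.ns.astype('int64'))
# outputs
0   2020-05-11 16:00:02.642680
Name: ns, dtype: datetime64[ns]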
Related
I am gathering data on a device, and after every second, I update a count and log it. I am now processing it, and am new to python, so I had a question as to whether it was possible to convert a numbered array [0,1,2,3,4,...1091,1092,1093,...] into a timestamp [00:00:01, 00:00:02, 00:00:03, 00:00:04, ... 00:18:11, 00:18:12, 00:18:13,...] for example.
If you could please lead me in the right direction, that would be very much appreciated!
p.s. In the future, I will be logging the data as a timestamp, but for now, I have 5 hours' worth of data that needs to be processed!
import datetime as dt
timestamp=[0,1,2,3,4,5,1092,1093]
print([dt.timedelta(seconds=ts) for ts in timestamp])
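Note that str() on a timedelta prints '0:18:12' rather than the zero-padded '00:18:12' from your example. If the leading zero matters, one rough sketch (fine for durations under 10 hours, such as your 5 hours of data) is to pad the string:
print([str(dt.timedelta(seconds=ts)).zfill(8) for ts in timestamp])
# ['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04', '00:00:05', '00:18:12', '00:18:13']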
Happy Coding
If all you have is seconds, then you can just do simple arithmetic to convert them to minutes and hours:
inp = [0, 1, 2, 3, 4, 1091, 1092, 1093]
outp = [f'{secs // 3600:02}:{(secs // 60) % 60:02}:{secs % 60:02}' for secs in inp]
print(outp)
# ['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04', '00:18:11', '00:18:12', '00:18:13']
Here, I use a list comprehension and, for each secs in the input, create a format string:
hours is secs // 3600 (that's integer floor division), because one hour is 3600 seconds
Minutes is (secs // 60) % 60 (this uses the modulo operator, which gives the remainder of secs // 60 after dividing it by 60 again). One minute is 60 seconds, but 60 or more minutes would be an hour, so we need to make sure to 'roll over' the counter every 60 minutes (which is what the mod is for).
Seconds is, of course, secs % 60, because a minute has 60 seconds and we want the counter to roll over.
The format string starts with f', and anything inside {} is an instruction to evaluate whatever's inside it and insert that into the string. The syntax is {expression:format}, where format is an optional instruction for how to present the data (i.e. not just printing it out). The format spec can get complicated (look up a Python f-string tutorial if you're curious about the specifics), but suffice it to say that in this case we use 02, which means we want the output to be two characters long, padded with zeroes if it's shorter than that.
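If you'd rather not repeat the divisions, a divmod-based version of the same arithmetic (just a sketch, equivalent to the comprehension above) looks like this:
def to_hms(secs):
    minutes, seconds = divmod(secs, 60)   # split off the seconds
    hours, minutes = divmod(minutes, 60)  # split the remaining minutes into hours
    return f'{hours:02}:{minutes:02}:{seconds:02}'

print([to_hms(s) for s in inp])
# ['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04', '00:18:11', '00:18:12', '00:18:13']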
I'm trying to convert a string containing a time ("%H:%M:%S.%f") to an int of the equivalent milliseconds. The complication is that the time is the output from FFmpeg; it's a point in the audio file, and I need the number of milliseconds that the time in the string represents. The timestamp method on datetime measures from the epoch, so without another timestamp from when I began, it is no good to me.
For example:
t = "00:05:52.654321"
should be converted to:
i = 352654321
What is the best way to accomplish this?
This is how I figured out how to do it.
def _convert_string_to_int(self, s) -> int:
    # strptime fills in a default date of 1900-01-01, so subtracting that date
    # leaves just the time-of-day part as a timedelta
    begin = datetime.datetime(1900, 1, 1)
    end = datetime.datetime.strptime(s, self._ffmpeg_format_string)
    return int((end - begin).total_seconds() * 1000000)
It just feels really unnecessary to use timedelta like that.
Since timestamps are relative to the Unix epoch (1970-01-01), you can make a datetime object from your time by prepending that date to it; getting the timestamp of the resulting object then converts the time string to seconds. Since Python timestamps are floating-point seconds since the epoch, you need to multiply by 1000 and convert to an integer to get the number of milliseconds:
from datetime import datetime, timezone

t = "00:05:52.654321"
# mark the datetime as UTC so timestamp() isn't shifted by the local UTC offset
d = datetime.strptime('1970-01-01 ' + t, '%Y-%m-%d %H:%M:%S.%f').replace(tzinfo=timezone.utc)
print(int(d.timestamp()*1000))
Output:
352654
If you actually want microseconds, multiply by 1000000 instead.
As an alternative, you can split the time string on : and sum the parts, multiplying by 60 or 3600 to convert the hour and minute parts to seconds:
t = "00:05:52.654321"
millisecs = int(sum([float(v) * 1000 * 60 ** (2 - i) for i, v in enumerate(t.split(':'))]))
print(millisecs)
Output:
352654
Again, if you want microseconds, just multiply by 1000000 instead of 1000.
A number of milliseconds is inherently a time interval, so there is good reason why datetime.timedelta instances have a total_seconds method while datetime.datetime, datetime.date and datetime.time do not have one.
In principle you could use datetime.datetime.time(end) to get an object with properties including hour, minute, second and microsecond, and then use these to construct an arithmetic expression for the elapsed time since midnight on the same day. However, the supported way to handle time intervals like this is precisely the timedelta approach that you are already using.
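For completeness, a rough sketch of that component-wise arithmetic (reusing the end variable from the earlier self-answer); it gives the same microsecond count, just with the conversion factors written out by hand:
tm = end.time()  # hour, minute, second, microsecond since midnight
micros = (tm.hour * 3600 + tm.minute * 60 + tm.second) * 1000000 + tm.microsecond
print(micros)
# 352654321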
I have two datetime.time objects and I want to calculate the difference in hours between them. For example
a = datetime.time(22,00,00)
b = datetime.time(18,00,00)
I would like to be able to subtract these so that it gives me the value 4.
To calculate the difference, you have to convert the datetime.time object to a datetime.datetime object. Then when you subtract, you get a timedelta object. In order to find out how many hours the timedelta object is, you have to find the total seconds and divide it by 3600.
# Create datetime objects for each time (a and b)
dateTimeA = datetime.datetime.combine(datetime.date.today(), a)
dateTimeB = datetime.datetime.combine(datetime.date.today(), b)
# Get the difference between datetimes (as timedelta)
dateTimeDifference = dateTimeA - dateTimeB
# Divide difference in seconds by number of seconds in hour (3600)
dateTimeDifferenceInHours = dateTimeDifference.total_seconds() / 3600
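Putting it together with the a and b from the question, a minimal self-contained run looks like this:
import datetime

a = datetime.time(22, 0, 0)
b = datetime.time(18, 0, 0)

dateTimeA = datetime.datetime.combine(datetime.date.today(), a)
dateTimeB = datetime.datetime.combine(datetime.date.today(), b)

print((dateTimeA - dateTimeB).total_seconds() / 3600)
# 4.0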
This is how I did it:
from datetime import datetime

a = '2200'
b = '1800'
time1 = datetime.strptime(a, "%H%M")  # convert string to datetime
time2 = datetime.strptime(b, "%H%M")
diff = time1 - time2
diff.total_seconds()/3600  # seconds to hours
output: 4.0
I got my result for this problem like so:
import pandas as pd

a = '2017-10-10 21:25:13'
b = '2017-10-02 10:56:33'
a = pd.to_datetime(a)
b = pd.to_datetime(b)
c = a - b  # Timedelta
c.total_seconds()/3600
but on a Series that won't work:
table1['new2']=table1['new'].total_seconds()/3600
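For a whole Series of timedeltas you have to go through the .dt accessor instead; a sketch, assuming table1['new'] holds timedelta64 values:
table1['new2'] = table1['new'].dt.total_seconds()/3600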
Aside, but this might bother more users finding this question...
To calculate the difference between pandas columns, it is better not to have the time stored as datetime.time in the first place, but as numpy.timedelta64 (a duration since midnight) instead. One way to fix this:
from datetime import datetime, date, time

for c in df.select_dtypes('object'):
    if isinstance(df[c][0], time):
        df[c] = df[c].apply(lambda t: datetime.combine(date.min, t) - datetime.min)
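After that conversion, column arithmetic works directly; for example, with hypothetical 'start' and 'end' columns of such durations:
hours = (df['end'] - df['start']).dt.total_seconds() / 3600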
I have a situation that: I have one vector A, say 10000x1, and another vector B 10000x1, both are numerical arrays with floating point numbers in it. Now I want to write the data into one line of string as below:
A(1):B(1) A(2):B(2) ....A(10000):B(10000)
Is there an efficient way to do this? Right now, I am just using a for loop: convert each floating-point number to a string first, then add the ':', and then concatenate them together. This is very slow. Could anybody help? Thanks a lot.
For dimension nx1 (Column Matrix)
tic
A=rand(10000,1);
B=rand(10000,1);
finalString=sprintf(' %f:%f',[A.'; B.']);
finalString(1)=[];
toc
Elapsed time is 0.036697 seconds.
For dimension 1xn (Row Matrix)
tic
A=rand(1,10000);
B=rand(1,10000);
finalString=sprintf(' %f:%f',[A; B]);
finalString(1)=[];
toc
Elapsed time is 0.036879 seconds.
Value types:
%f --> Floating-point number (fixed-point notation)
%d --> Integer, signed (base 10)
For more value types, http://in.mathworks.com/help/matlab/ref/sprintf.html has a table of the conversion characters for formatting numeric and character data as text, or you can search for sprintf in the MATLAB help.
This should do it relatively quickly. I included a tic-toc to provide a reference execution time if someone provides an alternative implementation.
tic
a=rand(10000,1);
b=rand(10000,1);
c=zeros(20000,1);
c(1:2:end)=a;
c(2:2:end)=b;
c_string=mat2str(c);
c_string([1 end])=[];           % drop the surrounding [ ] that mat2str adds
idx=find(c_string==';');
c_string(idx(1:2:end))=':';
c_string(idx(2:2:end))=' ';
toc
%Elapsed time is 0.365694 seconds.
So, I've learned quite a few ways to control the precision when I'm dealing with floats.
Here is an example of 3 different techniques:
from decimal import Decimal

somefloat=0.0123456789
print("{0:.10f}".format(somefloat))
print("%.5f" % somefloat)
print(Decimal(somefloat).quantize(Decimal(".01")))
This will print:
0.0123456789
0.01235
0.01
In all of the above examples, the precision itself is a fixed value, but how could I turn the precision into a variable that can be entered by the end user?
I mean, the fixed precision values are inside quotation marks, and I can't seem to find a way to put a variable there. Is there a way?
I'm on Python 3.
Using format:
somefloat=0.0123456789
precision = 5
print("{0:.{1}f}".format(somefloat, precision))
# 0.01235
Using old-style string interpolation:
print("%.*f" % (precision, somefloat))
# 0.01235
Using decimal:
import decimal
D = decimal.Decimal
q = D(10) ** -precision
print(D(somefloat).quantize(q))
# 0.01235
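Since you're on Python 3, the same nested-braces trick also works directly in an f-string (equivalent to the format() version above):
print(f"{somefloat:.{precision}f}")
# 0.01235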