I have the following dataframe:
The column Time is a string and I want to convert it to either timestamp or datetime format. However, when I run df['Time'] = pd.to_datetime(df['Time']), I always get this error:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 08:53:30
Are you sure you are getting the right column and values? Running
time = pd.to_datetime("13:30:35.805")
Gives
Timestamp('2020-04-20 13:30:35.805000')
as output, as expected.
If you can't solve the problem with pandas directly, you can always manually split the string into hours, minutes and seconds with
h, m, s = map(float, x.split(':'))
and use those values to create a timestamp.
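As a rough sketch of that manual route: the helper name parse_clock below is made up for illustration, and it builds a pd.Timedelta rather than a full timestamp, on the assumption that a bare time of day has no date to attach to.

```python
import pandas as pd

def parse_clock(text):
    # Hypothetical helper: split "HH:MM:SS.fff" into its three parts.
    h, m, s = map(float, text.split(":"))
    # A Timedelta holds a duration only, so no date has to be invented.
    return pd.Timedelta(hours=h, minutes=m, seconds=s)

delta = parse_clock("13:30:35.805")
print(delta)
```

For a whole column, something like df['Time'].apply(parse_clock) should work the same way, one value at a time.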
I have the date below expressed as yearmon: '202112'
I want to convert this to yearqtr and report the next quarter. So from the string above I want to get 2022Q1.
I tried the code below without success:
import pandas as pd
pd.PeriodIndex(pd.to_datetime('202112') ,freq='Q')
Could you please help me obtain the expected quarter? Any pointer will be very helpful.
import pandas as pd
df = pd.DataFrame({"Date": ['202112']}) # dummy data
df['next_quarter'] = pd.PeriodIndex(pd.to_datetime(df['Date'], format='%Y%m'), freq='Q') + 1
print(df)
Output:
Date next_quarter
0 202112 2022Q1
Note that the column Date may be a string type, but next_quarter will be of period type. You can convert it to a string if that's what you want.
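If a plain string column is wanted, a minimal sketch could look like the following. It uses .dt.to_period as an alternative spelling of the PeriodIndex construction above, which should give the same periods, and then casts with astype(str).

```python
import pandas as pd

df = pd.DataFrame({"Date": ["202112"]})  # dummy data, as above

# Parse year-month, convert to quarterly periods, then step one quarter ahead.
quarters = pd.to_datetime(df["Date"], format="%Y%m").dt.to_period("Q") + 1

# Cast the period values to ordinary strings such as "2022Q1".
df["next_quarter"] = quarters.astype(str)
print(df)
```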
I think one issue you're running into is that pandas does not recognize '202112' as a year-month string on its own. You'll want to use '2021-12' instead. Then you can do something like this:
pd.to_datetime('2021-12').to_period('Q') + 1
You can convert your date to this new format by simply inserting a - at index 4 of your string like so: date[:4] + '-' + date[4:]
This will take your date, convert it to quarters, and add 1 quarter.
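Putting the two steps together, a small sketch (the variable names date, iso and next_q are only illustrative):

```python
import pandas as pd

date = "202112"

# Insert a dash after the four-digit year: "202112" -> "2021-12".
iso = date[:4] + "-" + date[4:]

# Convert to a quarterly period and move one quarter forward.
next_q = pd.to_datetime(iso).to_period("Q") + 1
print(next_q)  # 2022Q1
```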
I have a pandas column with timestamp strings in the format '00:00:00.000' (hours, minutes, seconds, milliseconds). I would like to convert them to datetime objects to work on the seconds.
I have seen many similar questions, here for example. I am guessing that I should use strptime, but I couldn't figure out how.
If you convert the values to datetimes in pandas, to_datetime also adds a default date (1900-01-01):
df['col'] = pd.to_datetime(df['col'], format='%H:%M:%S.%f')
If you need to avoid that, convert the values to timedeltas with to_timedelta:
df['col'] = pd.to_timedelta(df['col'])
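To then work on the seconds, one option is the .dt.total_seconds() accessor. This is only a sketch: the sample values and the new column name seconds are invented for illustration.

```python
import pandas as pd

# Made-up sample data in the 'HH:MM:SS.fff' format from the question.
df = pd.DataFrame({"col": ["00:00:01.500", "00:01:00.250", "01:00:00.000"]})

# Timedeltas keep only the duration, so no default date is attached.
df["col"] = pd.to_timedelta(df["col"])

# Total duration of each row expressed in seconds as a float.
df["seconds"] = df["col"].dt.total_seconds()
print(df)
```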
So, I have an issue. Pandas keeps telling me that
'datetime.date' is coerced to a datetime.
In the future pandas will not coerce, and a TypeError will be raised. To retain the current behavior, convert the 'datetime.date' to a datetime with 'pd.Timestamp'.
I'd like to get rid of this warning
So far I have a dataframe with some data, on which I do some filtering and manipulation. At some point I have a column with dates in string format. I don't care about timezones etc.; day accuracy is all I need. I get the warning mentioned above when I convert the strings to datetime, like below:
df['Some_date'] = pd.to_datetime(df['Some_date'], format='%m/%d/%Y')
So I tried to do something like that:
df['Some_date'] = pd.Timestamp(df['Some_date'])
But it fails as pd.Timestamp doesn't accept Series as an argument.
I'm looking for the quickest way to convert those strings to Timestamp.
=====================================
EDIT
I'm so sorry for the confusion. I'm getting the warning at another place. It happens when I try to filter my data like this:
df = df[(df['Some_date'] > firstday)]
Where firstday is calculated with datetime, like here:
import datetime
def get_dates_filter():
    lastday = datetime.date.today().replace(day=1) - datetime.timedelta(days=1)
    firstday = lastday.replace(day=1)
    return firstday, lastday
So the issue is probably that I'm comparing two different types of date representation.
Python date objects are still poorly supported in pandas; it is best to work with datetimes that have no time component.
If the column holds Python dates, you can convert them to strings before calling to_datetime:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
If you need to remove the times from the datetimes in the column, use:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str)).dt.floor('d')
Test:
rng = pd.date_range('2017-04-03', periods=3).date
df = pd.DataFrame({'Some_date': rng})
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'datetime.date'>
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
print (df['Some_date'].dtype)
datetime64[ns]
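For the comparison in the question's edit, the other side can be converted too, as the warning text itself suggests. The sketch below is an adaptation, not the asker's exact code: get_dates_filter takes a today argument (an assumption made here so the result is reproducible), the sample dates are invented, and both return values are wrapped in pd.Timestamp.

```python
import datetime
import pandas as pd

def get_dates_filter(today):
    # Last day of the previous month, then the first day of that month.
    lastday = today.replace(day=1) - datetime.timedelta(days=1)
    firstday = lastday.replace(day=1)
    # Wrap both in pd.Timestamp so they match the datetime64 column.
    return pd.Timestamp(firstday), pd.Timestamp(lastday)

# Made-up sample data in the question's '%m/%d/%Y' format.
df = pd.DataFrame({"Some_date": ["03/15/2017", "04/03/2017", "04/20/2017"]})
df["Some_date"] = pd.to_datetime(df["Some_date"], format="%m/%d/%Y")

firstday, lastday = get_dates_filter(datetime.date(2017, 5, 10))
filtered = df[df["Some_date"] > firstday]
print(filtered)
```

In real use, datetime.date.today() would be passed in place of the fixed date.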
I have a dataset. The values in the first column are integers of the form YYYYMMDD. I want to convert them to dates without a clock time (hours, seconds) and replace the column's values with the result.
How can I convert an integer to date in Python? I use JupyterLab.
Since you said you have a dataset, call it df. Then, using pandas, you could do:
df['Dates'] = pd.to_datetime(df['Dates'].astype(str), format="%Y%m%d")
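A small end-to-end sketch with invented sample values; the column name Dates is the same placeholder used in the answer above, not something taken from the asker's data.

```python
import pandas as pd

# Made-up integers in YYYYMMDD form.
df = pd.DataFrame({"Dates": [20210105, 20211231]})

# Cast to string first so the format string can be matched digit by digit.
df["Dates"] = pd.to_datetime(df["Dates"].astype(str), format="%Y%m%d")
print(df)
```

The resulting column has a datetime dtype with every time set to midnight, so it displays as dates only.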
I have a column of times expressed as seconds since Jan 1, 1990, that I need to convert to a DateTime. I can figure out how to do this for a constant (e.g. add 10 seconds), but not a series or column.
I eventually tried writing a loop to do this one row at a time. (Probably not the right way, and I'm new to python).
This code works for a single row:
from datetime import datetime, timedelta
def addSecs(secs):
    fulldate = datetime(1990, 1, 1)
    fulldate = fulldate + timedelta(seconds=secs)
    return fulldate
b = addSecs(intag112['outTags_1_2'].iloc[1])
print(b)
2018-06-20 01:05:13
Does anyone know an easy way to do this for a whole column in a dataframe?
I tried this:
for i in range(len(intag112)):
    intag112['TransactionTime'].iloc[i]=addSecs(intag112['outTags_1_2'].iloc[i])
but it errored out.
If you want to do something with a column (Series) in a DataFrame, you can use the apply method, for example:
import datetime
# New column 'datetime' is created from old 'seconds' (seconds since 1990-01-01).
# Note: datetime.datetime.fromtimestamp would count from 1970 in local time instead.
epoch = datetime.datetime(1990, 1, 1)
df['datetime'] = df['seconds'].apply(lambda x: epoch + datetime.timedelta(seconds=float(x)))
Check the documentation for more examples. Overall advice: try to think in terms of vectors (or Series) of values. Most operations in pandas can be done on an entire Series or even a DataFrame.
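In that vectorized spirit, the whole column can likely be converted without apply by adding a timedelta Series to the 1990 epoch. The column names seconds and datetime and the sample values are placeholders.

```python
import pandas as pd

# Made-up sample data: seconds elapsed since 1990-01-01.
df = pd.DataFrame({"seconds": [0, 10, 86400]})

# Turn the whole column into timedeltas at once, then shift by the epoch.
epoch = pd.Timestamp("1990-01-01")
df["datetime"] = epoch + pd.to_timedelta(df["seconds"], unit="s")
print(df)
```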