pandas to_datetime formatting - python-3.x

I am trying to compare a pandas to_datetime object to another to_datetime object. In both locations, I am entering the date as date = pd.to_datetime('2017-01-03'), but when I run a print statement on each, in one case I get 2017-01-03, but in another I get 2017-01-03 00:00:00. This causes a problem because if I use an if statement comparing them such as if date1 == date2: they will not compare as equal, when in reality they are. Is there a format statement that I can use to force the to_datetime() command to yield the 2017-01-03 format?

You can use the date() method to extract just the date from a pandas Timestamp, or the strftime(format) method to convert it to a string in whatever format you need.
date = pd.to_datetime('2017-01-03').date()
print(repr(date))
>datetime.date(2017, 1, 3)
or
date = pd.to_datetime('2017-01-03').strftime("%Y-%m-%d")
print(date)
>2017-01-03

Try .date():
pd.to_datetime('2017-01-03').date()

You can call .date() on the result of pd.to_datetime():
pd.to_datetime(value).date()
For example:
a = '2017-12-24 22:44:09'
b = '2017-12-24'
if pd.to_datetime(a).date() == pd.to_datetime(b).date():
    print('perfect')
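As a minimal sketch of the same comparison (the variable names are illustrative, not from the question): two Timestamps compare on the full instant, so a value with a time of day differs from the same day at midnight. Comparing the .date() values, or Timestamps passed through normalize(), compares the day only.

```python
import pandas as pd

ts_a = pd.to_datetime("2017-12-24 22:44:09")
ts_b = pd.to_datetime("2017-12-24")

# Full-instant comparison: the time of day differs, so these are not equal.
same_instant = ts_a == ts_b

# Day-only comparison via datetime.date objects.
same_day_via_date = ts_a.date() == ts_b.date()

# Day-only comparison that stays in pandas: normalize() resets the time to midnight.
same_day_via_normalize = ts_a.normalize() == ts_b.normalize()

print(same_instant, same_day_via_date, same_day_via_normalize)
```

If both values are built from the same date-only string, the plain == comparison should already hold. A mismatch usually means one side carries a time of day or is a different type.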

Related

Converting year-month to next year-quarter

I have the date below expressed as a year-month string: '202112'
I want to convert this to a year-quarter and report the next quarter. So from the string above I want to get 2022Q1.
I tried the following without success:
import pandas as pd
pd.PeriodIndex(pd.to_datetime('202112') ,freq='Q')
Could you please help me obtain the expected quarter? Any pointer will be very helpful.
import pandas as pd
df = pd.DataFrame({"Date": ['202112']}) # dummy data
df['next_quarter'] = pd.PeriodIndex(pd.to_datetime(df['Date'], format='%Y%m'), freq='Q') + 1
print(df)
Output:
     Date next_quarter
0  202112       2022Q1
Note that the Date column may be a string type, but next_quarter will be a period type. You can convert it to a string (for example with .astype(str)) if that's what you want.
I think one issue you're running into is that '202112' is not a date format pandas can parse unambiguously on its own. You'll want to use '2021-12'. Then you can do something like this:
pd.to_datetime('2021-12').to_period('Q') + 1
This takes your date, converts it to a quarter, and adds 1 quarter.
You can convert your date to this new format by inserting a - at index 4 of your string, like so: date[:4] + '-' + date[4:]
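For a single string rather than a column, the two answers above can be combined into a small helper. This is only a sketch: the function name next_quarter is hypothetical, and it assumes the input is always a six-digit YYYYMM string.

```python
import pandas as pd

def next_quarter(yearmon: str) -> str:
    """Return the quarter after the one containing `yearmon` (a 'YYYYMM' string)."""
    # An explicit format avoids relying on pandas to guess what '202112' means.
    ts = pd.to_datetime(yearmon, format="%Y%m")
    # Convert to a quarterly Period, step forward one quarter, render as text.
    return str(ts.to_period("Q") + 1)

result = next_quarter("202112")
print(result)  # 2022Q1
```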

How to convert string timestamp to datetime object in Python

I have a pandas column with timestamp strings in the format '00:00:00.000' (hours, minutes, seconds, milliseconds). I would like to convert them to datetime objects to work on the seconds.
I have seen many similar questions here for example. I am guessing that I should use strptime but I couldn't figure out how.
If you convert the values to datetimes in pandas, to_datetime also adds a default date (1900-01-01):
df['col'] = pd.to_datetime(df['col'], format='%H:%M:%S.%f')
If you need to avoid that, convert the values to timedeltas with to_timedelta:
df['col'] = pd.to_timedelta(df['col'])
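A short sketch of both conversions side by side, using made-up sample values in the question's format. The column name col comes from the answer above; the other names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col": ["00:00:01.500", "00:01:00.000", "01:00:00.250"]})

# Parsing as datetimes attaches the placeholder date 1900-01-01 to every value.
as_datetime = pd.to_datetime(df["col"], format="%H:%M:%S.%f")

# Parsing as timedeltas keeps only the duration, which suits elapsed-time data.
as_timedelta = pd.to_timedelta(df["col"])

# Total duration of each row in seconds, as floats.
seconds = as_timedelta.dt.total_seconds()

print(as_datetime)
print(seconds)
```

Working on the seconds is then plain arithmetic on a float column.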

Pandas and datetime coercion. Can't convert whole column to Timestamp

So, I have an issue. Pandas keeps telling me that
'datetime.date' is coerced to a datetime.
In the future pandas will not coerce, and a TypeError will be raised. To retain the current behavior, convert the 'datetime.date' to a datetime with 'pd.Timestamp'.
I'd like to get rid of this warning.
Until now I had a dataframe with some data, and I was doing some filtering and manipulation. At some point I have a column with dates in string format. I don't care about time zones etc.; day accuracy is all I need. I get the warning mentioned above when I convert the strings to datetime, like below:
df['Some_date'] = pd.to_datetime(df['Some_date'], format='%m/%d/%Y')
So I tried to do something like that:
df['Some_date'] = pd.Timestamp(df['Some_date'])
But it fails as pd.Timestamp doesn't accept Series as an argument.
I'm looking for the quickest way to convert those strings to Timestamp.
=====================================
EDIT
I'm so sorry for the confusion. I'm getting my error at another place. It happens when I try to filter my data like this:
df = df[(df['Some_date'] > firstday)]
Where firstday is calculated based on datetime, like here:
import datetime
def get_dates_filter():
    lastday = datetime.date.today().replace(day=1) - datetime.timedelta(days=1)
    firstday = lastday.replace(day=1)
    return firstday, lastday
So the issue is probably that I am comparing two different types of date representation.
Python date objects are still poorly supported in pandas; it is best to work with datetimes that have no time component.
If the column holds Python dates, you can convert them to strings before calling to_datetime:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
If you need to remove the times from the datetimes in the column, use:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str)).dt.floor('d')
Test:
rng = pd.date_range('2017-04-03', periods=3).date
df = pd.DataFrame({'Some_date': rng})
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'datetime.date'>
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
print (df['Some_date'].dtype)
datetime64[ns]
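The warning text itself suggests the other half of the fix: make the value being compared a pd.Timestamp as well. Below is a minimal sketch of the question's helper adapted that way. The today parameter is an assumption added here so the example is reproducible (the original calls datetime.date.today() directly), and the sample dates are made up.

```python
import datetime
import pandas as pd

def get_dates_filter(today: datetime.date):
    """Return the first and last day of the previous month as Timestamps."""
    lastday = today.replace(day=1) - datetime.timedelta(days=1)
    firstday = lastday.replace(day=1)
    # Wrap the plain dates so they match the datetime64 column they are compared to.
    return pd.Timestamp(firstday), pd.Timestamp(lastday)

df = pd.DataFrame({"Some_date": ["03/15/2019", "04/02/2019", "04/20/2019"]})
df["Some_date"] = pd.to_datetime(df["Some_date"], format="%m/%d/%Y")

firstday, lastday = get_dates_filter(datetime.date(2019, 5, 10))
filtered = df[df["Some_date"] > firstday]
print(filtered)
```

With both sides as Timestamps, the filter compares like with like and no coercion is involved.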

Date Manipulation and Comparisons Python,Pandas and Excel

I have a datetime column [TRANSFER_DATE] in an Excel sheet that shows dates formatted as 1/4/2019 0:45. When this date is selected, it appears as 01/04/2019 00:45:08.
I am using a Python script to read this column [TRANSFER_DATE], which shows the datetime as 01/04/2019 00:45:08.
However, when I try to compare the column [TRANSFER_DATE] with another date, I get this error:
ValueError: : "Can only use .dt accessor with datetimelike values" while evaluating
implying those values are not actually recognized as datetime values:
mask_part_date = data.loc[data['TRANSFER_DATE'].dt.date.astype(str) == '2019-04-12']
As seen in this question, the Excel import might have silently failed for some of the values in the column. If you check the column type with:
data.dtypes
it might show as object instead of datetime64.
If you force your column to have datetime values, that might solve your issue:
data['TRANSFER_DATE'] = pd.to_datetime(data['TRANSFER_DATE'], errors='coerce')
You will spot the non-converted values as NaT and you can debug those manually.
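Regarding your comparison, once the column holds datetime values, comparing against a Timestamp might be more efficient than converting to strings. Because the values carry a time of day, normalize them to midnight first so that the whole day matches:
mask_part_date = data.loc[data['TRANSFER_DATE'].dt.normalize() == pd.Timestamp('2019-04-12')]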
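The whole sequence can be sketched on a small made-up frame. The sample values below are assumptions, written in ISO format to sidestep the day-first ambiguity of strings like 01/04/2019. Only the column name TRANSFER_DATE comes from the question.

```python
import pandas as pd

data = pd.DataFrame(
    {"TRANSFER_DATE": ["2019-04-12 00:45:08", "not a date", "2019-04-13 10:00:00"]}
)

# Values that cannot be parsed become NaT instead of raising an error.
data["TRANSFER_DATE"] = pd.to_datetime(data["TRANSFER_DATE"], errors="coerce")

# Count the rows that failed to convert so they can be inspected manually.
unparsed = int(data["TRANSFER_DATE"].isna().sum())

# Drop the time of day before comparing, so any time on 2019-04-12 matches.
mask_part_date = data.loc[
    data["TRANSFER_DATE"].dt.normalize() == pd.Timestamp("2019-04-12")
]

print(unparsed)
print(mask_part_date)
```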

Pandas: convert series of time YYYY-MM-DD hh:mm:ss.0 keeping the YYYY-MM-DD format only

python 3.6, pandas 0.19.0
timestamp
0 2013-01-14 21:19:42.0
1 2013-01-16 09:04:37.0
2 2013-03-20 12:50:49.0
3 2013-01-03 17:02:53.0
4 2013-04-13 16:44:20.0
I tried:
df['timestamp'] = df['timestamp'].dt.strftime('%Y-%m-%d')
`AttributeError: Can only use .dt accessor with datetimelike values.`
Any thoughts? Thank you!
Convert the series to a datetime dtype first, then try:
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.strftime('%Y-%m-%d')
This may do what you need:
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.strftime('%Y-%m-%d')
The method shown below achieves the same result, but yields datetime.date objects rather than strings:
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.date
You can refer to the documentation at the link below as a handy guide to datetime handling:
https://pandas.pydata.org/pandas-docs/stable/api.html#datetimelike-properties
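A brief sketch comparing the options above on two of the question's sample values. The variable names are illustrative. The third variant, dt.normalize(), is not in the answers above; it is an alternative worth considering if the column should remain a datetime dtype.

```python
import pandas as pd

df = pd.DataFrame({"timestamp": ["2013-01-14 21:19:42.0", "2013-01-16 09:04:37.0"]})

# The .dt accessor only works once the strings have been parsed.
parsed = pd.to_datetime(df["timestamp"])

as_text = parsed.dt.strftime("%Y-%m-%d")   # strings such as '2013-01-14'
as_dates = parsed.dt.date                  # Python datetime.date objects
as_midnight = parsed.dt.normalize()        # still datetime64, time set to 00:00:00

print(as_text.tolist())
print(as_midnight.dtype)
```

Strings are convenient for display, while the normalized datetime column keeps date arithmetic and comparisons available.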
