Pandas set date as day(int)-month(str)-year(int) - python-3.x

I am trying to change the formatting of a date column
original: 2020/05/22
Desired outcome: 22/may/2020
so far I've done:
.to_datetime
dt.strftime('%d-%m-%Y')
converting into: 22/05/2020
how can I get the middle part to convert into alphabetical?

Try this, all the format codes are given here date formats:
df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%d/%b/%Y')
print(df)
Date
0 22/May/2020

Related

Converting year-month to next year-quarter

I have below date expressed as yearmon '202112'
I want to convert this to yearqtr and report the next quarter. Therefore from above string I want to get 2022Q1
I unsuccessfully tried below
import pandas as pd
pd.PeriodIndex(pd.to_datetime('202112') ,freq='Q')
Could you please help how to obtain the expected quarter. Any pointer will be veru helpful
import pandas as pd
df = pd.DataFrame({"Date": ['202112']}) # dummy data
df['next_quarter'] = pd.PeriodIndex(pd.to_datetime(df['Date'], format='%Y%m'), freq='Q') + 1
print(df)
Output:
Date next_quarter
0 202112 2022Q1
Note that column Date may be a string type but Quarter will be type period. You can convert it to a string if that's what you want.
I think one issue you're running into is that '202112' is not a valid date format. You'll want to use '2021-12'. Then you can do something like this:
pd.to_datetime('2021-12').to_period('Q') + 1
You can convert your date to this new format by simply inserting a - at index 4 of your string like so: date[:4] + '-' + date[4:]
This will take your date, convert it to quarters, and add 1 quarter.

separating date and time from datetime column in excel

how to separate date and time from datetime column if you have the format as below :
click here to view image
I am trying int(datetime column) for fetching date ; Datetime column - int(datetime column) for fetching time column
Your formula cannot work because your data is a text string (note that it has a letter included) and not a number.
So first convert the string into a "real" time with:
=substitute(a2,"T"," ")
You can then use:
Date: =INT(SUBSTITUTE(A2,"T"," "))
Time: =MOD(SUBSTITUTE(A2,"T"," "),1)
and be sure to format the results as desired:
If your column is formatted true date then use to separate date
=TEXT(A1,"yyyy-mm-dd")
For time
=TEXT(A1,"hh:mm:ss")
If data is in text string or output by TEXT() function then try below functions.
for date =TEXT(FILTERXML("<t><s>"&SUBSTITUTE(A1,"T","</s><s>")&"</s></t>","//s[1]"),"yyyy-mm-dd")
for time =TEXT(FILTERXML("<t><s>"&SUBSTITUTE(A1,"T","</s><s>")&"</s></t>","//s[last()]"),"hh:mm:ss")
For date
=LEFT(A2,FIND("T",A2)-1)
For time
=RIGHT(A2,LEN(A2)-FIND("T",A2))

Pandas and datetime coercion. Can't convert whole column to Timestamp

So, I have an issue. Pandas keeps telling me that
'datetime.date' is coerced to a datetime.
In the future pandas will not coerce, and a TypeError will be raised. To >retain the current behavior, convert the 'datetime.date' to a datetime with >'pd.Timestamp'.
I'd like to get rid of this warning
So until now I had a dataframe with some data, I was doing some filtration and manipulation. At some point I have a column with dates in string format. I don't care about timzeones etc. It's all about day accuracy. I'm getting a warning mentioned above, when I convert the strings to datetime, like below:
df['Some_date'] = pd.to_datetime(df['Some_date'], format='%m/%d/%Y')
So I tried to do something like that:
df['Some_date'] = pd.Timestamp(df['Some_date'])
But it fails as pd.Timestamp doesn't accept Series as an argument.
I'm looking for a quickest way to convert those strings to Timestamp.
=====================================
EDIT
I'm so sorry, for confusion. I'm getting my error at another place. It happens when I try to filtrate my data like this:
df = df[(df['Some_date'] > firstday)]
Where firstday is being calculated basing on datetime. Like here:
import datetime
def get_dates_filter():
lastday = datetime.date.today().replace(day=1) - datetime.timedelta(days=1)
firstday = lastday.replace(day=1)
return firstday, lastday
So probably the issue is comparing two different types of date representation
In pandas python dates are still poor supported, the best is working with datetimes with no times.
If there are python dates you can convert to strings before to_datetime:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
If need remove times from datetimes in column use:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str)).dt.floor('d')
Test:
rng = pd.date_range('2017-04-03', periods=3).date
df = pd.DataFrame({'Some_date': rng})
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'datetime.date'>
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
print (df['Some_date'].dtype)
datetime64[ns]

Send an email based on the date in a CSV column

I am looking to read data from a column in my CSV file.
All of the data in this column are dates. (DD/MM/YYYY).
I want my program to read the Dates column, and if the date is within 3 days of the current date, I want to add variables to all of the values in that row.
Ex.
Date,Name,LaterDate
1/1/19,John Smith, 2/21/19
If I run my program on 2/19/2019, I want an email sent that says "John Smith's case is closing on "2/21/2019".
I understand how to send an email. The part that I get stuck on is:
Reading the CSV column specifically.
If the date is within 3 days,
Assign variables to the values in the ROW,
Use those variables to send a custom email.
I see a lot of "Use Pandas" but I might need the individual steps broken down.
Thank you.
First things first, you need to read all the values of the csv file and store it in a variable (old_df). Then you need to save all the dates in the Series (dates). Next we create an empty DataFrame with the same columns. From here we create a simple for loop for each date in dates and it's index i. Turn date into a datetime object from the datetime library. Then we subtract amount of days between the current date and date. Take the absolute value of days so we always get a positive amount of days. Then add the index of that particular date in old_df to new_df.
import pandas as pd
from datetime import datetime
old_df = pd.read_csv('example.csv')
dates = old_df['LaterDate']
new_df = pd.DataFrame(columns=['Date', 'Name', 'LaterDate'])
for i, date in enumerate(dates):
date = datetime.strptime(date, '%m/%d/%y')
days = (datetime.now() - date).days
if abs(days) <= 3:
new_df = new_df.append(old_df.loc[i, :])
print(new_df)

pandas to_datetime formatting

I am trying to compare a pandas to_datetime object to another to_datetime object. In both locations, I am entering the date as date = pd.to_datetime('2017-01-03'), but when I run a print statement on each, in one case I get 2017-01-03, but in another I get 2017-01-03 00:00:00. This causes a problem because if I use an if statement comparing them such as if date1 == date2: they will not compare as equal, when in reality they are. Is there a format statement that I can use to force the to_datetime() command to yield the 2017-01-03 format?
You can use date() method to just select date from pandas timestamp and also use strftimeme(format) method to convert it into string with different formats.
date = pd.to_datetime('2017-01-03').date()
print(date)
>datetime.date(2017, 1, 3)
or
date = pd.to_datetime('2017-01-03').strftime("%Y-%m-%d")
print(date)
>'2017-01-03'
try .date()
pd.to_datetime('2017-01-03').date()
You can use
pd.to_datetime().date()
For example:
a='2017-12-24 22:44:09'
b='2017-12-24'
if pd.to_datetime(a).date() == pd.to_datetime(b).date():
print('perfect')

Resources