Converting year-month to next year-quarter - python-3.x

I have the date below, expressed as a year-month string: '202112'
I want to convert this to a year-quarter and report the next quarter, so from the string above I want to get 2022Q1.
I tried the following without success:
import pandas as pd
pd.PeriodIndex(pd.to_datetime('202112'), freq='Q')
How can I obtain the expected quarter? Any pointer would be very helpful.

import pandas as pd
df = pd.DataFrame({"Date": ['202112']}) # dummy data
df['next_quarter'] = pd.PeriodIndex(pd.to_datetime(df['Date'], format='%Y%m'), freq='Q') + 1
print(df)
Output:
Date next_quarter
0 202112 2022Q1
Note that the Date column is a string type, while next_quarter has a period dtype. You can convert it to a string if that's what you want.
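A minimal sketch of that string conversion, assuming the same dummy Date column as above (the second row is an extra illustrative value, not from the question):

```python
import pandas as pd

# Dummy data; "202203" is an extra illustrative row.
df = pd.DataFrame({"Date": ["202112", "202203"]})

# Parse year-month strings, convert to quarterly periods, then step one quarter ahead.
quarters = pd.PeriodIndex(pd.to_datetime(df["Date"], format="%Y%m"), freq="Q") + 1

# astype(str) turns the periods into plain strings such as "2022Q1".
df["next_quarter"] = quarters.astype(str)
print(df)
```

After this, next_quarter holds ordinary Python strings rather than Period values, so period arithmetic is no longer available on that column.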

I think one issue you're running into is that pandas does not recognise '202112' as a year-month on its own. One option is to use '2021-12' instead. Then you can do something like this:
pd.to_datetime('2021-12').to_period('Q') + 1
This parses the date, converts it to a quarterly period, and adds one quarter.
You can convert your date to this new format by inserting a - at index 4 of your string, like so: date[:4] + '-' + date[4:]
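Put together, the approach in this answer could look like the sketch below. The helper name next_quarter is hypothetical and only used for illustration.

```python
import pandas as pd

def next_quarter(yearmon: str) -> str:
    """Return the quarter after the one containing a 'YYYYMM' string."""
    # Insert a dash at index 4 so the string reads as 'YYYY-MM'.
    dashed = yearmon[:4] + "-" + yearmon[4:]
    # Convert to a quarterly period and move one quarter forward.
    return str(pd.to_datetime(dashed).to_period("Q") + 1)

print(next_quarter("202112"))
```

Because the arithmetic is done on a Period, the year rolls over automatically when the input falls in the fourth quarter.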

Related

Pandas set date as day(int)-month(str)-year(int)

I am trying to change the formatting of a date column
original: 2020/05/22
Desired outcome: 22/may/2020
so far I've done:
.to_datetime
dt.strftime('%d/%m/%Y')
converting into: 22/05/2020
How can I get the middle part to convert into alphabetical?
Try this (%b is the abbreviated month name; the full list of format codes is in the strftime documentation):
df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%d/%b/%Y')
print(df)
Date
0 22/May/2020
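The question asks for a lowercase month (22/may/2020), while %b produces May. A small sketch of one way to get there, assuming an English locale (the month name from %b depends on the locale) and a single-row dummy frame:

```python
import pandas as pd

# Dummy data in the original yyyy/mm/dd layout.
df = pd.DataFrame({"Date": ["2020/05/22"]})

# Format with the abbreviated month name, then lowercase the whole string.
parsed = pd.to_datetime(df["Date"], format="%Y/%m/%d")
df["Date"] = parsed.dt.strftime("%d/%b/%Y").str.lower()
print(df)
```

Lowercasing the full string is safe here because the day and year parts are digits.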

Convert float time to datetime or timestamp in Python

I have the following dataframe:
The column Time is a string and I want to convert it either to timestamp or datetime formats. However, when I run df['Time'] = pd.to_datetime(df['Time']), I always get an error
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 08:53:30
Are you sure you are getting the right column and values? Running
time = pd.to_datetime("13:30:35.805")
gives
Timestamp('2020-04-20 13:30:35.805000')
as expected (the missing date part defaults to the current date).
If you can't solve the problem with pandas directly, you can always manually split the string into hours, minutes and seconds with
h, m, s = map(float, x.split(':'))
and use those values to create a timestamp.
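One way the manual split could be turned into a timestamp is sketched below. The helper name and the base date are assumptions for illustration; the question does not say which date the times belong to.

```python
import pandas as pd

def time_string_to_timedelta(x: str) -> pd.Timedelta:
    """Split 'HH:MM:SS.fff' by hand and build a Timedelta from the parts."""
    h, m, s = map(float, x.split(":"))
    return pd.Timedelta(hours=h, minutes=m, seconds=s)

# Assumed base date; replace with whatever date the times refer to.
base_date = pd.Timestamp("2020-04-20")
stamp = base_date + time_string_to_timedelta("13:30:35.805")
print(stamp)
```

Keeping the time as a Timedelta and adding it to an explicit date avoids relying on a default date being filled in.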

Pandas and datetime coercion. Can't convert whole column to Timestamp

So, I have an issue. Pandas keeps telling me that
'datetime.date' is coerced to a datetime. In the future pandas will not coerce, and a TypeError will be raised. To retain the current behavior, convert the 'datetime.date' to a datetime with 'pd.Timestamp'.
I'd like to get rid of this warning.
Until now I had a dataframe with some data, on which I was doing some filtering and manipulation. At some point I have a column with dates in string format. I don't care about timezones etc.; only day accuracy matters. I get the warning mentioned above when I convert the strings to datetime, like below:
df['Some_date'] = pd.to_datetime(df['Some_date'], format='%m/%d/%Y')
So I tried to do something like that:
df['Some_date'] = pd.Timestamp(df['Some_date'])
But it fails, as pd.Timestamp doesn't accept a Series as an argument.
I'm looking for the quickest way to convert those strings to Timestamp.
=====================================
EDIT
I'm sorry for the confusion. The warning is actually raised somewhere else. It happens when I try to filter my data like this:
df = df[(df['Some_date'] > firstday)]
where firstday is calculated with datetime, like here:
import datetime
def get_dates_filter():
    lastday = datetime.date.today().replace(day=1) - datetime.timedelta(days=1)
    firstday = lastday.replace(day=1)
    return firstday, lastday
So the issue is probably the comparison of two different date representations.
Python date objects are still poorly supported in pandas; it is best to work with datetimes that have no time component.
If the column contains Python dates, you can convert them to strings before calling to_datetime:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
If you need to remove the times from the datetimes in a column, use:
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str)).dt.floor('d')
Test:
rng = pd.date_range('2017-04-03', periods=3).date
df = pd.DataFrame({'Some_date': rng})
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'datetime.date'>
df['Some_date'] = pd.to_datetime(df['Some_date'].astype(str))
print (df)
Some_date
0 2017-04-03
1 2017-04-04
2 2017-04-05
print (type(df.loc[0, 'Some_date']))
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
print (df['Some_date'].dtype)
datetime64[ns]
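For the filtering step from the question, the warning text itself suggests wrapping the datetime.date in pd.Timestamp before comparing. A sketch under that assumption, with two made-up rows standing in for the real data:

```python
import datetime
import pandas as pd

def get_dates_filter():
    # First and last day of the previous month, as datetime.date objects.
    lastday = datetime.date.today().replace(day=1) - datetime.timedelta(days=1)
    firstday = lastday.replace(day=1)
    return firstday, lastday

# Made-up rows: one far in the past, one far in the future.
df = pd.DataFrame({"Some_date": ["01/15/2000", "12/31/2200"]})
df["Some_date"] = pd.to_datetime(df["Some_date"], format="%m/%d/%Y")

firstday, lastday = get_dates_filter()
# Compare datetime64 values with a Timestamp instead of a datetime.date.
filtered = df[df["Some_date"] > pd.Timestamp(firstday)]
print(filtered)
```

Both sides of the comparison are then pandas datetime types, so no coercion from datetime.date is needed.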

Send an email based on the date in a CSV column

I am looking to read data from a column in my CSV file.
All of the data in this column are dates. (DD/MM/YYYY).
I want my program to read the Dates column, and if the date is within 3 days of the current date, I want to add variables to all of the values in that row.
Ex.
Date,Name,LaterDate
1/1/19,John Smith, 2/21/19
If I run my program on 2/19/2019, I want an email sent that says "John Smith's case is closing on 2/21/2019".
I understand how to send an email. The part that I get stuck on is:
Reading the CSV column specifically.
If the date is within 3 days,
Assign variables to the values in the ROW,
Use those variables to send a custom email.
I see a lot of "Use Pandas" but I might need the individual steps broken down.
Thank you.
First, read all the values of the CSV file into a variable (old_df) and take the LaterDate column as a Series (dates). Then loop over each date in dates together with its index i. Turn date into a datetime object with the datetime library, and compute the number of days between the current date and date. Take the absolute value of days so the difference is always positive. If it is at most 3, record the index i; after the loop, select those rows of old_df into new_df.
import pandas as pd
from datetime import datetime

# skipinitialspace drops the blank after each comma (e.g. " 2/21/19")
old_df = pd.read_csv('example.csv', skipinitialspace=True)
dates = old_df['LaterDate']
rows = []  # positions of the rows whose date is within 3 days
for i, date in enumerate(dates):
    date = datetime.strptime(date, '%m/%d/%y')
    days = (datetime.now() - date).days
    if abs(days) <= 3:
        rows.append(i)
# DataFrame.append was removed in pandas 2.0, so select the rows in one step
new_df = old_df.iloc[rows]
print(new_df)
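The remaining steps from the question (assigning the row values to variables and building a custom message) could be sketched as follows. The CSV text is inlined, the second row (Jane Doe) is made up, and "today" is fixed to the date used in the question so the result is reproducible; the actual sending of the email is left out.

```python
import io
import pandas as pd

# Inlined sample data; the Jane Doe row is hypothetical.
csv_text = (
    "Date,Name,LaterDate\n"
    "1/1/19,John Smith, 2/21/19\n"
    "3/1/19,Jane Doe, 6/30/19\n"
)
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
df["LaterDate"] = pd.to_datetime(df["LaterDate"], format="%m/%d/%y")

# Fixed for the example; pd.Timestamp.today().normalize() would be the live value.
today = pd.Timestamp("2019-02-19")
days_left = (df["LaterDate"] - today).dt.days
due = df[days_left.abs() <= 3]

# Each row becomes a namedtuple, so columns are available as attributes.
messages = [
    f"{row.Name}'s case is closing on {row.LaterDate:%m/%d/%Y}."
    for row in due.itertuples()
]
print(messages)
```

Each entry in messages can then be passed as the body to whatever email-sending code is already in place.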

How to convert a numeric year into day-month-year format in python?

I have a column called construction_year holding the year as an integer. I want to convert it to dd-mm-yyyy format in Python. I have tried datetime and pandas to_datetime, converting to a timestamp and extracting the format, but in vain.
Ex: I have a year like 2013 (int) and I would like to convert it to 01-01-2013 in Python 3.x.
Into a string
If you want to convert it into a string, you can simply use:
convert_string = '01-01-{}'.format
and then use it like:
>>> convert_string(2013)
'01-01-2013'
Into a datetime
If you want to convert it to a date object, you can simply use:
from datetime import date
from functools import partial
convert_to_date = partial(date,month=1,day=1)
Now convert_to_date is a function that converts a numerical year into a date object:
>>> convert_to_date(2013)
datetime.date(2013, 1, 1)
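Since the question mentions a column, a pandas version may also be useful. This is a sketch with made-up years; the construction_date column name is an assumption.

```python
import pandas as pd

# Made-up years in the column named in the question.
df = pd.DataFrame({"construction_year": [2013, 1998]})

# Parse the year as text; month and day default to January 1st.
as_dates = pd.to_datetime(df["construction_year"].astype(str), format="%Y")

# Render as dd-mm-yyyy strings.
df["construction_date"] = as_dates.dt.strftime("%d-%m-%Y")
print(df)
```

Keep as_dates instead of the formatted strings if later steps need real datetime values.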
