I want to subtract 5 days from the current date.
Code:
import datetime
start_date = datetime.datetime.now().date()
end_date = datetime.datetime.now().date() - datetime.timedelta(days=5)
When I print end_date, I get this error:
an integer is required (got type datetime.date)
from datetime import date, timedelta
date.today() - timedelta(5)
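As a minimal sketch of the same subtraction, the helper below wraps it in a function so it can be checked against a fixed date. The name `days_ago` and the reference date are hypothetical, chosen only for illustration. The code in the question appears to run as posted, so the error may come from a different line, for example one that passes a date object where an integer is expected.

```python
from datetime import date, timedelta

def days_ago(n, today=None):
    """Return the date n days before `today` (defaults to the current date)."""
    if today is None:
        today = date.today()
    return today - timedelta(days=n)

print(days_ago(5))                    # five days before the current date
print(days_ago(5, date(2021, 3, 3)))  # 2021-02-26
```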
I am trying to convert a string of the form hh:mm:ss to a timestamp type without the year, month, and day information. Below is my code; however, the output still includes 1970-01-01.
import pyspark
from pyspark.sql.functions import *
df1 = spark.createDataFrame([('10:30:00',)], ['date'])
df2 = (df1
.withColumn("new_date", to_timestamp("date", 'HH:mm:ss')))
df2.show(2)
Sample output: 1970-01-01 10:30:00
How can I drop the year-month-day information in this case? Can someone please help?
Thanks a lot
How do I convert Excel date format to number in Python? I'm importing a number of Excel files into Pandas dataframe in a loop and some values are formatted incorrectly in Excel. For example, the number column is imported as date and I'm trying to convert this date value into numeric.
Original              New
1912-04-26 00:00:00   4500
How do I convert the date value in original to the numeric value in new? I know this code can convert numeric to date, but is there any similar function that does the opposite?
df.loc[0]['Date']= xlrd.xldate_as_datetime(df.loc[0]['Date'], 0)
I tried specifying the data type when reading in the files, and I also tried simply changing the data type of the column to 'float', but neither worked.
Thank you.
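I found that the number is the count of days since 1900-01-00, which Excel treats as day zero.
The following code calculates how many days have passed from that base until the given date. It uses 1899-12-30 as the base date because Excel counts a nonexistent 1900-02-29, which shifts every later serial number by one.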
import pandas as pd
from datetime import datetime, timedelta
df = pd.DataFrame(
    {
        'date': ['1912-04-26 00:00:00'],
    }
)
print(df)
# date
#0 1912-04-26 00:00:00
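def date_to_int(given_date):
    # Parse the string into a datetime
    given_date = datetime.strptime(given_date, '%Y-%m-%d %H:%M:%S')
    # 1899-12-30: the effective Excel epoch for dates after 1900-02-28
    base_date = datetime(1900, 1, 1) - timedelta(days=2)
    delta = given_date - base_date
    return delta.days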
df['date'] = df['date'].apply(date_to_int)
print(df)
# date
#0 4500
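The same calculation can also be done on the whole column at once, without `apply`. This is a sketch under the assumption that every date falls after 1900-02-28 and that the files use Excel's 1900 date system; the constant name `EXCEL_EPOCH` and the `serial` column are hypothetical.

```python
import pandas as pd

# Assumed epoch for Excel's 1900 date system (valid for dates after 1900-02-28,
# because Excel counts a nonexistent 1900-02-29).
EXCEL_EPOCH = pd.Timestamp("1899-12-30")

df = pd.DataFrame({"date": ["1912-04-26 00:00:00", "1900-03-01 00:00:00"]})

# Parse the strings, subtract the epoch, and keep the whole number of days.
df["serial"] = (pd.to_datetime(df["date"]) - EXCEL_EPOCH).dt.days
print(df)
```

The second row is included only as a sanity check: Excel shows 1900-03-01 as serial number 61.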
I have a PySpark dataframe with a column that holds datetime values in the format '09/19/2020 09:27:18 AM'.
I want to convert each value to the first day of its month, in a format like 01-Nov-2020.
I have tried F.trunc("date_col", "month"), which results in a null date,
and
df_result = df_result.withColumn('gl_date', F.udf(lambda d: datetime.datetime.strptime(d, '%MM/%dd/%yyyy %HH:%mm:%S a').strftime('%Y/%m/1'), t.StringType())(F.col('date_col')))
The second method fails with an error saying that the date format '%MM/%dd/%yyyy %HH:%mm:%S a' does not match '09/19/2020 09:27:18 AM'.
You can convert the column to timestamp type before calling trunc:
import pyspark.sql.functions as F
df_result2 = df_result.withColumn(
    'gl_date',
    F.date_format(
        F.trunc(
            F.to_timestamp("date_col", "MM/dd/yyyy hh:mm:ss a"),
            "month"
        ),
        "dd-MMM-yyyy"
    )
)
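The UDF attempt in the question most likely failed because Python's `strptime` expects C-style directives such as `%m` and `%I`, not Java-style patterns such as `MM` or `hh`. As a rough plain-Python check of the same logic (not a replacement for the Spark expression above; `first_of_month` is a hypothetical helper name):

```python
from datetime import datetime

def first_of_month(text):
    """Parse 'MM/dd/yyyy hh:mm:ss AM/PM' text and format the first day of that month."""
    # %I is the 12-hour clock hour and %p is the AM/PM marker.
    parsed = datetime.strptime(text, "%m/%d/%Y %I:%M:%S %p")
    return parsed.replace(day=1).strftime("%d-%b-%Y")

print(first_of_month("09/19/2020 09:27:18 AM"))  # 01-Sep-2020
```

The month abbreviation produced by `%b` depends on the active locale; the output shown assumes the default C/English locale.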
When reading a CSV file, the date column holds month names (Jul-20 for July 2020), and with parse_dates=True pandas converts it to 01-07-2020. How can I force pandas to convert it to the end of the month (i.e., 31-07-2020)?
Thanks
Try using MonthEnd from pandas.tseries.offsets:
from pandas.tseries.offsets import MonthEnd
import pandas as pd
print(df)
#        month
# 0 2020-07-01
# 1 2020-08-02
df['month_end'] = df['month'] + MonthEnd(1)
print(df)
#        month  month_end
# 0 2020-07-01 2020-07-31
# 1 2020-08-02 2020-08-31
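One caveat: `MonthEnd(1)` moves a date that is already a month end to the end of the following month, whereas `MonthEnd(0)` leaves it in place. A small sketch starting from strings like the ones in the question, assuming they follow the `Jul-20` pattern (the sample values and column names are made up for illustration):

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

df = pd.DataFrame({"month": ["Jul-20", "Aug-20"]})

# '%b-%y' matches an abbreviated month name and a two-digit year.
parsed = pd.to_datetime(df["month"], format="%b-%y")

# MonthEnd(0) rolls forward to the month end, and keeps dates already on it.
df["month_end"] = parsed + MonthEnd(0)
print(df)
```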
You can use the built-in calendar and datetime modules and write your own function to apply to the column.
import calendar
import datetime
import pandas as pd
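def parse_my_date(date):
    # '%b-%y' matches values like 'Jul-20' (abbreviated month, two-digit year)
    date = datetime.datetime.strptime(date, '%b-%y')
    # monthrange returns (weekday of first day, number of days in month)
    last_day = calendar.monthrange(date.year, date.month)[1]
    date += datetime.timedelta(days=last_day - 1)
    return date
df['date'] = df['date'].apply(parse_my_date)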
I'm working on a dataset with a "Date" column of dtype int64, in YYYYMMDD format. I want it in YYYY/MM/DD format. Please help me convert it to a date format.
Try this:
import pandas as pd
from datetime import datetime
dates = pd.DataFrame({"dates_int": [20190103, 20190206, 20190502]})
dates['dates_dt'] = dates['dates_int'].apply(lambda x: datetime.strptime(str(x), '%Y%m%d').strftime('%Y/%m/%d'))
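A vectorized variant is also possible with `pd.to_datetime`, which avoids a Python-level call per row. This is a sketch on the same sample values; the `dates_str` column name is hypothetical. It keeps one real datetime column for calculations and one string column in the requested YYYY/MM/DD layout.

```python
import pandas as pd

dates = pd.DataFrame({"dates_int": [20190103, 20190206, 20190502]})

# Convert the integers to strings, then parse them with an explicit format.
parsed = pd.to_datetime(dates["dates_int"].astype(str), format="%Y%m%d")

dates["dates_dt"] = parsed                            # datetime64 column
dates["dates_str"] = parsed.dt.strftime("%Y/%m/%d")   # formatted strings
print(dates)
```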