Pandas to_json date format is changing - python-3.x

I have a dataframe with start_date and end_date columns.
When I convert it to JSON with this line:
json_data = df.to_json(orient='records')
and then print json_data, start_date is converted from yyyy-mm-dd to an integer.
Please suggest a way to keep the dates in yyyy-mm-dd format.

Use DataFrame.select_dtypes to pick the datetime columns, convert them to YYYY-MM-DD strings, and finally overwrite the original data with DataFrame.update:
df.update(df.select_dtypes('datetime').apply(lambda x: x.dt.strftime('%Y-%m-%d')))
Then your original call works as expected:
json_data = df.to_json(orient='records')

First format the date columns as strings. Passing date_format='iso' is then optional, since it only affects columns that are still datetimes:
df['start_date'] = pd.to_datetime(df['start_date']).dt.strftime('%Y-%m-%d')
df['end_date'] = pd.to_datetime(df['end_date']).dt.strftime('%Y-%m-%d')
data = df.to_json(orient='records', date_format='iso')
print(data)
[{"start_date":"2020-08-10","end_date":"2020-08-16"}]
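Both answers come down to turning the datetime columns into strings before serializing. Here is a minimal, self-contained sketch of that idea, assuming a one-row dataframe built from the values in the output above. The copy named out is an illustrative choice (not from the original answers) so that the original datetime columns remain available for date arithmetic.

```python
import json

import pandas as pd

# Sample frame (assumed); the real df comes from the question.
df = pd.DataFrame({
    "start_date": pd.to_datetime(["2020-08-10"]),
    "end_date": pd.to_datetime(["2020-08-16"]),
})

# Format every datetime column as a YYYY-MM-DD string on a copy,
# leaving the original df untouched.
out = df.copy()
for col in out.select_dtypes(include="datetime").columns:
    out[col] = out[col].dt.strftime("%Y-%m-%d")

json_data = out.to_json(orient="records")
print(json_data)
```

Because the columns are plain strings by the time to_json runs, the date_format argument no longer matters for them.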

Related

Python convert a str date into a datetime with timezone object

In my Django project I have to convert a str variable passed as a date ("2021-11-10") to a timezone-aware datetime object so that I can run an ORM filter on a DateTime field.
In my db, values are stored like this, for example:
2021-11-11 01:18:04.200149+00
I tried:
# test date
df = "2021-11-11"
df = df + " 00:00:00+00"
start_d = datetime.strptime(df, '%Y-%m-%d %H:%M:%S%Z')
but I get an error because the string does not match the format given to strptime.
How can I convert a single date string into a timezone-aware datetime object that starts at midnight of that date?
Many thanks in advance.
That is not how datetime.strptime is meant to be used.
Read a little more about it here; I believe it will help you.
You should pass the month as a str, without the "-".
Good luck.
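A minimal sketch of one way to get a timezone-aware midnight value, assuming the stored values are in UTC (as the +00 offset in the question suggests): parse only the date part, then attach the timezone with replace instead of appending an offset to the string. The variable names are illustrative.

```python
from datetime import datetime, timezone

date_str = "2021-11-11"

# strptime with a date-only format yields midnight of that day (naive).
naive_midnight = datetime.strptime(date_str, "%Y-%m-%d")

# Attach UTC explicitly; this assumes the database values are stored in UTC.
start_d = naive_midnight.replace(tzinfo=timezone.utc)

print(start_d.isoformat())  # 2021-11-11T00:00:00+00:00
```

The resulting start_d is an aware datetime and can be compared against timezone-aware values, which is what a filter on a DateTime field needs.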

convert DD-MMM-YYYY to DD_MM_YYYY in spark

I have a file with a date column whose values look like 01-Feb-2019 and 01-02-2019 02:00:00.
I have to convert these to DD_MM_YYYY format in Spark.
Any suggestions?
I tried the following with no luck:
val r = dfCsvTS02.withColumn("create_dts", date_format($"create_dts", "dd-MM-yyyy hh:mm:ss"))
Is it possible to convert every value to dd-mm-yyyy, regardless of the format the date arrives in?
Simply use the to_timestamp function to parse the date and date_format to format it. Something like this:
val r = dfCsvTS02.withColumn("create_dts", date_format(to_timestamp($"create_dts", "dd-MMM-yyyy").cast("date"), "dd-MM-yyyy"))

AWS Glue - How to exclude rows where string does not match a date format

I have a dataset with a datecreated column. This column is typically in the format 'dd/MM/yy', but sometimes it contains garbage text. Ultimately I want to convert the column to a DATE, with the garbage text becoming NULL.
I have been trying to use resolveChoice, but it results in all null values.
data_res = date_dyf.resolveChoice(specs =
[('datescanned','cast:timestamp')])
Sample data
3,1/1/18,text7
93,this is a test,text8
9,this is a test,text9
82,12/12/17,text10
Try converting the DynamicFrame into a Spark DataFrame and parsing the date with the to_date function:
from awsglue.dynamicframe import DynamicFrame
from pyspark.sql.functions import to_date
df = date_dyf.toDF()  # toDF is a method, so it must be called
parsedDateDf = df.withColumn("datescanned", to_date(df["datescanned"], "dd/MM/yy"))
dyf = DynamicFrame.fromDF(parsedDateDf, glueContext, "convertedDyf")
If a string doesn't match the format, the value is set to null.

convert a string to yyyy/mm/dd datetime type

I tried to convert a string into a datetime object with yyyy/mm/dd format. Using the strptime() function returns yyyy-mm-dd format with datetime.date type. I then used strftime() to convert it to yyyy/mm/dd format, but that returns a string. How can I get a datetime-type object in yyyy/mm/dd format?
from datetime import datetime
datestr = '25/03/2019'
date_obj = datetime.strptime(datestr, '%d/%m/%Y').date()
print(date_obj)
date_formatted = date_obj.strftime('%Y/%m/%d')
print(type(datestr))
print(type(date_formatted))
I hope this helps:
import datetime
date_time_str = '2018-06-29 08:15:27.243860'
date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S.%f')
print('Date:', date_time_obj.date())
print('Time:', date_time_obj.time())
print('Date-time:', date_time_obj)
I also recommend reading this article.
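One point worth spelling out: a date or datetime object stores its components (year, month, day, and so on) and has no display format of its own. The yyyy-mm-dd text seen when printing is just its default string form, so a format such as yyyy/mm/dd only exists once the object is rendered as a string. A short sketch using the values from the question:

```python
from datetime import datetime

datestr = "25/03/2019"
date_obj = datetime.strptime(datestr, "%d/%m/%Y").date()

# The object holds year/month/day fields, not a format string.
print(repr(date_obj))  # datetime.date(2019, 3, 25)

# Formatting always produces a str; apply it only at display time.
formatted = date_obj.strftime("%Y/%m/%d")
print(formatted)  # 2019/03/25

# The same format codes work inside an f-string.
print(f"{date_obj:%Y/%m/%d}")  # 2019/03/25
```

In practice this means keeping the date object for comparisons and arithmetic, and formatting it only where it is shown or written out.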

Datetime Conversion with ValueError

I have a pandas dataframe with columns containing start and stop times in this format: 2016-01-01 00:00:00
I would like to convert these times to datetime objects so that I can subtract one from the other to compute total duration. I'm using the following:
import datetime
df['start_time'] = df['start_time'].apply(lambda x: datetime.datetime.strptime(x, '%Y/%m/%d/%T %I:%M:%S %p'))
However, I get the following ValueError:
ValueError: 'T' is a bad directive in format '%Y/%m/%d/%T %I:%M:%S %p'
The following converts the column to datetime64 dtype, using a format that matches your data. You can then do whatever processing you need on that column.
df['start_time'] = pd.to_datetime(df['start_time'], format="%Y-%m-%d %H:%M:%S")
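If you want to avoid specifying the datetime format explicitly, you can use the following (note that infer_datetime_format is deprecated as of pandas 2.0, where the format is inferred by default):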
df['start_time'] = pd.to_datetime(df['start_time'], infer_datetime_format=True)
The simplest option is to use to_datetime:
df['start_time'] = pd.to_datetime(df['start_time'])
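Once both columns are datetime64, the duration the question asks about is a plain subtraction. The sketch below assumes a second column named stop_time and some sample rows; both are hypothetical, since the question only names start_time.

```python
import pandas as pd

# Sample data (assumed); stop_time is a hypothetical column name.
df = pd.DataFrame({
    "start_time": ["2016-01-01 00:00:00", "2016-01-01 12:30:00"],
    "stop_time": ["2016-01-01 01:15:00", "2016-01-02 12:30:00"],
})

# Convert both string columns to datetime64.
for col in ["start_time", "stop_time"]:
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d %H:%M:%S")

# Subtracting two datetime columns gives a timedelta column.
df["duration"] = df["stop_time"] - df["start_time"]

# Summing the timedeltas gives the total duration across all rows.
total = df["duration"].sum()
print(total)
```

Here the two rows last 1 hour 15 minutes and 24 hours, so total is a Timedelta of 25 hours 15 minutes.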
