Unable to convert Dataframe object to datetime - python-3.x

I have been trying to convert a DataFrame column of dtype object to datetime with the format Y-m-d. My data looks like:
pdi.head()
Date Predicted_Linear_Regression
0 [2005-02-16T00:00:00.000000000] 0.000663
1 [1982-02-03T00:00:00.000000000] 0.000666
2 [1995-07-12T00:00:00.000000000] 0.000665
3 [1995-03-13T00:00:00.000000000] 0.000666
4 [2009-05-20T00:00:00.000000000] 0.000658
I tried converting the Date column to str and then to datetime, but that did not work. Converting it directly did not work either.

Your Date column contains lists of dates, not dates. Extract the first element of each list, then convert to datetime:
pd.to_datetime(df['Date'].str[0])
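A minimal sketch of that approach, assuming each cell holds a one-element list of timestamps (the sample frame below is made up to resemble the question's data and is not the asker's actual `pdi`):

```python
import pandas as pd

# Hypothetical data: every "Date" cell is a one-element list.
df = pd.DataFrame({
    "Date": [[pd.Timestamp("2005-02-16")], [pd.Timestamp("1982-02-03")]],
    "Predicted_Linear_Regression": [0.000663, 0.000666],
})

# .str[0] indexes into each list and returns its first element.
df["Date"] = pd.to_datetime(df["Date"].str[0])

# The column is now datetime64; format it as text only for display.
as_text = df["Date"].dt.strftime("%Y-%m-%d")
print(df.dtypes)
print(as_text.tolist())
```

The column keeps its datetime dtype, and `strftime` is applied only when a Y-m-d string is needed for output.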

If your df.Date is of type list, then try:
df.Date = pd.to_datetime(np.array(list(df.Date)).flatten())
If it's of type str, try:
df.Date = pd.to_datetime(df.Date.str.slice(1,-1))
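If you are not sure which type it is, try the string approach first and fall back to the list approach:
try:
    df.Date = pd.to_datetime(df.Date.str.slice(1, -1))
except Exception:  # fall back when the column does not hold strings
    df.Date = pd.to_datetime(np.array(list(df.Date)).flatten())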

Related

How to convert Excel imported data in format %m%d%y H:M in a dataframe to datetime data?

I have a dataframe where the first rows look like this:
When I list df.iloc[1] it returns a column with those dates, but it says they are type "object". I tried to convert them to string using:
df.iloc[1] = df.iloc[1].astype(str)
It still lists the data type as object. But a string is an object, right? So I tried variations on this to convert to datetime:
df.iloc[1] = pd.to_datetime(df.iloc[1], format='%mm/%dd/%yyyy %H:%M')
error: time data '11/22/2022 5:15' does not match format '%mm/%dd/%yyyy %H:%M' (match)
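The error comes from the format string: strptime-style directives are a single letter each (`%m`, `%d`, `%Y`), so `%mm/%dd/%yyyy` does not describe `11/22/2022 5:15`. A minimal sketch with the corrected format follows; the sample Series is an assumption standing in for the asker's column:

```python
import pandas as pd

# Hypothetical sample values in the same shape as the error message.
s = pd.Series(["11/22/2022 5:15", "11/23/2022 14:05"])

# %m = month, %d = day, %Y = four-digit year, %H = hour (24h), %M = minute.
parsed = pd.to_datetime(s, format="%m/%d/%Y %H:%M")
print(parsed)
```

Note also that `df.iloc[1]` selects the second row; selecting the second column by position is `df.iloc[:, 1]`.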

How to convert excel date to numeric value using Python

How do I convert Excel date format to number in Python? I'm importing a number of Excel files into Pandas dataframe in a loop and some values are formatted incorrectly in Excel. For example, the number column is imported as date and I'm trying to convert this date value into numeric.
Original New
1912-04-26 00:00:00 4500
How do I convert the date value in original to the numeric value in new? I know this code can convert numeric to date, but is there any similar function that does the opposite?
df.loc[0]['Date']= xlrd.xldate_as_datetime(df.loc[0]['Date'], 0)
I tried to specify the data type when I read in the files and also tried to simply change the data type of the column to 'float' but both didn't work.
Thank you.
I found that the number is the count of days since 1900-01-00.
The following code calculates how many days have passed from 1900-01-00 to the given date.
import pandas as pd
from datetime import datetime, timedelta
df = pd.DataFrame(
    {
        'date': ['1912-04-26 00:00:00'],
    }
)
print(df)
# date
#0 1912-04-26 00:00:00
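def date_to_int(given_date):
    # Parse the string, then count whole days since Excel's effective day zero.
    given_date = datetime.strptime(given_date, '%Y-%m-%d %H:%M:%S')
    # 1899-12-30: Excel's 1900 date system counts a nonexistent 1900-02-29,
    # so dates after February 1900 are offset by one extra day.
    base_date = datetime(1900, 1, 1) - timedelta(days=2)
    delta = given_date - base_date
    return delta.days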
df['date'] = df['date'].apply(date_to_int)
print(df)
# date
#0 4500
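The same calculation can be done on the whole column at once. This is a sketch rather than the answerer's code; the column name `serial` and the variable `excel_epoch` are illustrative, and it assumes all dates fall after February 1900 so the 1899-12-30 base applies:

```python
import pandas as pd

df = pd.DataFrame({"date": ["1912-04-26 00:00:00"]})

# Effective day zero of Excel's 1900 date system for dates after 1900-02-28.
excel_epoch = pd.Timestamp("1899-12-30")

# Subtracting a Timestamp from a datetime column yields timedeltas;
# .dt.days extracts the whole-day count.
df["serial"] = (pd.to_datetime(df["date"]) - excel_epoch).dt.days
print(df)
```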

How to convert a column of data in a DataFrame filled with string representation of non-uniformed date formats to datetime?

Let's say:
>>> print(df)
location date
paris 23/02/2010
chicago 3-23-2013
...
new york 04-23-2013
helsinki 13/10/2015
Currently, df["date"] is in str. I want to convert the date column to datetime using
>>> df["date"] = pd.to_datetime(df["date"])
I would get ValueError due to ParserError. This is because the format of the date is inconsistent (i.e. dd/mm/yyyy, then next one is m/dd/yyyy).
If I were to write the code below, it still wouldn't work because the dates are not uniform and the delimiters differ:
>>> df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
The last option that I could think of was to write the code below, which replaces all of the dates that are not formatted like the first date to NaT:
>>> df["date"] = pd.to_datetime(df["date"], errors="coerce")
How do I convert the whole date column to datetime while having the dates not uniform in terms of the delimiters, and the orders of days, months and years?
Use the apply method of pandas, so that each value is parsed on its own:
df['date'] = df.apply(lambda x: pd.to_datetime(x['date']),axis = 1)
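Parsing each value independently leaves ambiguous strings such as 04-05-2013 to the parser's defaults. An alternative sketch, assuming the formats are known in advance (the two formats listed here are inferred from the sample rows, not confirmed by the asker), tries each format in turn and keeps the first successful parse:

```python
import pandas as pd

df = pd.DataFrame({
    "location": ["paris", "chicago", "new york", "helsinki"],
    "date": ["23/02/2010", "3-23-2013", "04-23-2013", "13/10/2015"],
})

# Assumed formats: day-first with slashes, month-first with dashes.
formats = ["%d/%m/%Y", "%m-%d-%Y"]

parsed = None
for fmt in formats:
    # Values that do not match this format become NaT instead of raising.
    attempt = pd.to_datetime(df["date"], format=fmt, errors="coerce")
    # Keep earlier successes and fill the remaining gaps from this attempt.
    parsed = attempt if parsed is None else parsed.fillna(attempt)

df["date"] = parsed
print(df)
```

Any value that matches none of the listed formats stays NaT, which makes leftover problem rows easy to find with `df["date"].isna()`.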

How to format pandas datetime64 type to show only the time and not the date and time

I am trying to change a column type from object to a datetime64 but want it to display only the time as hours:minute.
The column is a string formatted 13:45:00. When I change the data type to datetime64 it now prints it with a made up date (1900-01-01 13:45:00).
I want the column data type to be datetime64 (so I can do comparisons and operations later), but I want it to display only the time in hour:minute format, without the seconds and without the date.
Example - 13:45
Everything I can find in google is about getting only the date to show and maintain the datetime64 datatype, which I was able to do.
I have tried messing with the pd.to_datetime().dt.strftime('%H:%M'). It correctly formats the column but its datatype is object not datetime64.
cycle_trips_df['Checkout Date'] = pd.to_datetime(
    cycle_trips_df['Checkout Date'], infer_datetime_format=True
).dt.normalize() #strftime('%m/%d/%Y') # format='%m/%d/%Y').dt.date
cycle_trips_df['Checkout Time'] = pd.to_datetime(
    cycle_trips_df['Checkout Time'], format='%H:%M:%S'
).dt.strftime('%H:%M')
print(cycle_trips_df.dtypes)
[Output]
Checkout Date datetime64[ns]
Checkout Time object
Use a timedelta rather than a datetime:
In [11]: s = pd.Series(['13:45:00'])
In [12]: pd.to_timedelta(s)
Out[12]:
0 13:45:00
dtype: timedelta64[ns]
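A short sketch of how the timedelta column can serve both needs: comparisons on the real values, and a separate hour:minute string built only for display. The sample values and the names `after_noon` and `display` are illustrative assumptions:

```python
import pandas as pd

s = pd.Series(["13:45:00", "08:05:00"])
td = pd.to_timedelta(s)  # timedelta64 values that support comparison and arithmetic

# Comparison works directly on the timedelta values.
after_noon = td > pd.Timedelta(hours=12)

# Build an HH:MM string for display only; the underlying column stays timedelta.
parts = td.dt.components
display = (
    parts["hours"].astype(str).str.zfill(2)
    + ":"
    + parts["minutes"].astype(str).str.zfill(2)
)
print(after_noon.tolist())
print(display.tolist())
```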
Distinguish between the data and your views of that data. A datetime64 is a datetime64 and will be printed by default as a full date string. You can use strftime to get the time part.
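time_str = "13:45:00" # Your string (named time_str to avoid shadowing the built-in str).
dt64 = pd.to_datetime(time_str) # a pandas Timestamp, the scalar counterpart of datetime64
timestr = dt64.strftime("%H:%M:%S") # extracting the time string from the Timestamp.
You could also use something like the following (it needs from datetime import datetime):
df['time'] = df['time'].apply(lambda x: datetime.strptime(x, "%H:%M:%S").time())
The resulting column will have dtype object.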

Construct a DateTime index from multiple columns of a dataframe

I am parsing a dataframe from a sas7bdat file and I want to convert the index into datetime to resample the data.
I have one column with the Date which is type String and another column of the time which is of type datetime.time. Does anybody know how to convert this to one column of datetime?
I already tried pd.to_datetime like this, but it requires individual columns for year, month and day:
df['TimeIn']=str(df['TimeIn'])
df['datetime']=pd.to_datetime(df[['Date', 'TimeIn']], dayfirst=True)
This gives me a value error:
ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing
If you convert both the date and time column to str then you can concatenate them and then call to_datetime:
In[155]:
df = pd.DataFrame({'Date':['08/05/2018'], 'TimeIn':['10:32:12']})
df
Out[155]:
Date TimeIn
0 08/05/2018 10:32:12
In[156]:
df['new_date'] = pd.to_datetime(df['Date']+' '+df['TimeIn'])
df
Out[156]:
Date TimeIn new_date
0 08/05/2018 10:32:12 2018-08-05 10:32:12
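The output above reads 08/05/2018 month-first, whereas the question passed `dayfirst=True`. The sketch below adapts the same concatenation idea to the question's setup, assuming `TimeIn` holds `datetime.time` objects without microseconds and that dates are day-first; the `value` column and the sample rows are made up for illustration:

```python
import datetime as dt
import pandas as pd

df = pd.DataFrame({
    "Date": ["08/05/2018", "09/05/2018"],
    "TimeIn": [dt.time(10, 32, 12), dt.time(11, 0, 0)],
    "value": [1.0, 2.0],
})

# astype(str) turns each datetime.time into "HH:MM:SS" element by element
# (unlike str(df["TimeIn"]), which stringifies the whole Series at once).
combined = df["Date"] + " " + df["TimeIn"].astype(str)

# An explicit day-first format removes the ambiguity of 08/05/2018.
df["datetime"] = pd.to_datetime(combined, format="%d/%m/%Y %H:%M:%S")

# A DatetimeIndex is what resample() needs.
df = df.set_index("datetime")
print(df)
```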
