How to convert a datetime to %Y-%m-%d format in pandas?

I have a datetime field like 2017-01-15T02:41:38.466Z and would like to convert it to %Y-%m-%d format. How can this be achieved in pandas or Python?
I tried this:
frame['datetime_ordered'] = pd.datetime(frame['datetime_ordered'], format='%Y-%m-%d')
but I get the error:
cannot convert the series to <class 'int'>

The following code worked:
from datetime import datetime
import pandas as pd
# pd.datetime is gone from recent pandas, so call datetime.strptime directly
d_parser = lambda x: datetime.strptime(x, '%Y-%m-%dT%H:%M:%S.%fZ')
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0, parse_dates=['datetime_ordered'], date_parser=d_parser)
    li.append(df)
frame = pd.concat(li, axis=0, ignore_index=True)
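The date_parser argument is deprecated from pandas 2.0 onward, so parsing the column after loading may be more portable. A minimal sketch, assuming two small in-memory CSVs stand in for all_files (the sample rows and the value column are made up for illustration):

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the files listed in all_files
csv_a = "datetime_ordered,value\n2017-01-15T02:41:38.466Z,1\n"
csv_b = "datetime_ordered,value\n2017-01-16T10:05:00.000Z,2\n"

# Read each file with the timestamp column left as text, then combine
frames = [pd.read_csv(StringIO(text), header=0) for text in (csv_a, csv_b)]
frame = pd.concat(frames, axis=0, ignore_index=True)

# Parse the whole column in one vectorized call;
# the trailing Z in the format is matched as a literal character
frame["datetime_ordered"] = pd.to_datetime(
    frame["datetime_ordered"], format="%Y-%m-%dT%H:%M:%S.%fZ"
)
print(frame.dtypes)
```

Parsing once after the concat also avoids calling a Python-level parser for every row of every file.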

import pandas as pd
date_str = "2017-01-15T02:41:38.466Z"
# Parse the ISO 8601 string into a pandas Timestamp
a_date = pd.to_datetime(date_str)
print("date time value", a_date)
# datetime to string with format
print(a_date.strftime('%Y-%m-%d'))
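The error in the question comes from passing a whole Series where a single value is expected. For a DataFrame column, something along these lines should work, assuming the column holds ISO 8601 strings like the one above (the two-row frame and the date_only column name are made up for illustration):

```python
import pandas as pd

frame = pd.DataFrame(
    {"datetime_ordered": ["2017-01-15T02:41:38.466Z", "2017-02-03T11:00:00.000Z"]}
)

# Convert the strings to timestamps in one vectorized call
parsed = pd.to_datetime(frame["datetime_ordered"])

# Render each timestamp as a %Y-%m-%d string through the .dt accessor
frame["date_only"] = parsed.dt.strftime("%Y-%m-%d")
print(frame["date_only"].tolist())
```

Note that strftime returns strings; keep the parsed column around if date arithmetic is needed later.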

Related

How to format date string like 16/08/2018 14.29.30 to Datetime 2018-08-16 14:29:30 Python

How can I convert a date such as "16/08/2018 14.29.30" to the datetime 2018-08-16 14:29:30? I'm having trouble converting '.' to ':'.
import pandas as pd
s_date = '16/08/2018 14.29.30'
s_date = s_date.replace('.',':')
pd.to_datetime(s_date)
Result:
Timestamp('2018-08-16 14:29:30')
Option 1. Use pandas.to_datetime with the format argument to spell out the layout of the string:
import pandas as pd
pd.to_datetime('16/08/2018 14.29.30', format='%d/%m/%Y %H.%M.%S')
Option 2. replace . with :, then pass to to_datetime:
pd.to_datetime('16/08/2018 14.29.30'.replace('.', ':'))
Both options return
Timestamp('2018-08-16 14:29:30')
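Option 2 relies on pandas inferring the day-first order, which works here because 16 cannot be a month. For a value such as 03/04/2018 the default inference reads the month first, so the explicit format of Option 1 is the safer choice for a whole column. A small sketch with made-up sample values:

```python
import pandas as pd

# Made-up sample; the second value is ambiguous without an explicit format
s = pd.Series(["16/08/2018 14.29.30", "03/04/2018 09.15.00"])

# The explicit format fixes the field order as day/month/year
parsed = pd.to_datetime(s, format="%d/%m/%Y %H.%M.%S")
print(parsed)
```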

Pandas plotting graph with timestamp

pandas 0.23.4
python 3.5.3
I have some code that looks like below
import pandas as pd
from datetime import datetime
from matplotlib import pyplot
def dateparse():
    return datetime.strptime("2019-05-28T00:06:20,927", '%Y-%m-%dT%H:%M:%S,%f')
series = pd.read_csv('sample.csv', delimiter=";", parse_dates=True,
                     date_parser=dateparse, header=None)
series.plot()
pyplot.show()
The CSV file looks like below
2019-05-28T00:06:20,167;2070
2019-05-28T00:06:20,426;147
2019-05-28T00:06:20,927;453
2019-05-28T00:06:22,688;2464
2019-05-28T00:06:27,260;216
As you can see 2019-05-28T00:06:20,167 is the timestamp with milliseconds and 2070 is the value that I want plotted.
When I run this the graph gets printed however on the X-Axis I see numbers which is a bit odd. I was expecting to see actual timestamps (like MS Excel). Can someone tell me what I am doing wrong?
You did not set the datetime column as the index. Also, you don't need a date parser; just pass the columns you want to parse:
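from io import StringIO
import pandas as pd
import matplotlib.pyplot as plt
dfstr = '''2019-05-28T00:06:20,167;2070
2019-05-28T00:06:20,426;147
2019-05-28T00:06:20,927;453
2019-05-28T00:06:22,688;2464
2019-05-28T00:06:27,260;216'''
# pd.compat.StringIO no longer exists in recent pandas; io.StringIO does the same job
df = pd.read_csv(StringIO(dfstr), sep=';',
                 header=None, parse_dates=[0])
plt.plot(df[0], df[1])
plt.show()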
Alternatively, setting the timestamp column as the index:
df.set_index(0)[1].plot()
gives a slightly better plot.
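If automatic parsing does not pick up the comma before the milliseconds, the format can be given explicitly. A minimal sketch, assuming the same sample rows; the column names timestamp and value are made up, since the file has no header:

```python
from io import StringIO

import pandas as pd

csv_text = """2019-05-28T00:06:20,167;2070
2019-05-28T00:06:20,426;147
2019-05-28T00:06:20,927;453
2019-05-28T00:06:22,688;2464
2019-05-28T00:06:27,260;216"""

# Column names are made up for readability; the file itself has no header
df = pd.read_csv(StringIO(csv_text), sep=";", header=None, names=["timestamp", "value"])

# Parse with the comma as the fractional-seconds separator
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%dT%H:%M:%S,%f")

# A Series indexed by timestamp, ready for plotting
series = df.set_index("timestamp")["value"]
print(series)
```

Calling series.plot() on the result should then label the x-axis with the timestamps instead of row numbers.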

Keyerror in time/Date Components of datetime - what to do?

I am using a pandas DataFrame with datetime indexing. I know from the xarray documentation that datetime components can be accessed as ds['date.year'], where ds is the xarray DataArray, date is the date index, and year is the component of the dates. The xarray documentation points to datetime components, which in turn leads to DatetimeIndex in the pandas documentation. So I thought of doing the same with pandas, as I really like this feature.
However, it is not working for me. Here is what I did so far:
# Import required modules
import pandas as pd
import numpy as np
# Create DataFrame (name: df)
df=pd.DataFrame({'Date': ['2017-04-01','2017-04-01',
                          '2017-04-02','2017-04-02'],
                 'Time': ['06:00:00','18:00:00',
                          '06:00:00','18:00:00'],
                 'Active': [True,False,False,True],
                 'Value': np.random.rand(4)})
# Combine str() information of Date and Time and format to datetime
df['Date']=pd.to_datetime(df['Date'] + ' ' + df['Time'],format = '%Y-%m-%d %H:%M:%S')
# Make the combined data the index
df = df.set_index(df['Date'])
# Erase the rest, as it is not required anymore
df = df.drop(['Time','Date'], axis=1)
# Show me the first day
df['2017-04-01']
Ok, so this shows me only the first entries. So far, so good.
However
df['Date.year']
results in KeyError: 'Date.year'
I would expect an output like
array([2017,2017,2017,2017])
What am I doing wrong?
EDIT:
I have a workaround, which I am able to go on with, but I am still not satisfied, as this doesn't explain my question. I did not use a pandas DataFrame, but an xarray Dataset and now this works:
# Load modules
import pandas as pd
import numpy as np
import xarray as xr
# Prepare time array
Date = ['2017-04-01','2017-04-01', '2017-04-02','2017-04-02']
Time = ['06:00:00','18:00:00', '06:00:00','18:00:00']
time = [Date[i] + ' ' + Time[i] for i in range(len(Date))]
time = pd.to_datetime(time,format = '%Y-%m-%d %H:%M:%S')
# Create Dataset (name: ds)
ds=xr.Dataset({'time': time,
               'Active': [True,False,False,True],
               'Value': np.random.rand(4)})
ds['time.year']
which gives:
<xarray.DataArray 'year' (time: 4)>
array([2017, 2017, 2017, 2017])
Coordinates:
* time (time) datetime64[ns] 2017-04-01T06:00:00 ... 2017-04-02T18:00:00
As for what you're doing wrong, you are
a) trying to access an index as if it were a column (a Series), and
b) chaining attribute access inside a string: df['Date'] is a single column, while df['Date.year'] looks for a column literally called 'Date.year'.
If your datetime is the index, use .year on the index; if it is a Series (a column), use .dt.year.
df.index.year
# or, if 'Date' is still a column with a proper datetime dtype (in your code it was dropped after becoming the index)
df['Date'].dt.year
Hope that helps.
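Both access paths can be checked on a small frame. This is a sketch modelled on the question's data, built directly with a DatetimeIndex named Date; the variable names are made up for illustration:

```python
import numpy as np
import pandas as pd

stamps = pd.to_datetime(
    ["2017-04-01 06:00:00", "2017-04-01 18:00:00",
     "2017-04-02 06:00:00", "2017-04-02 18:00:00"]
)
df = pd.DataFrame(
    {"Active": [True, False, False, True], "Value": np.random.rand(4)},
    index=pd.DatetimeIndex(stamps, name="Date"),
)

# A DatetimeIndex exposes the components directly
years = df.index.year
print(years.tolist())

# For a datetime column (a Series), go through the .dt accessor
as_column = df.reset_index()
print(as_column["Date"].dt.year.tolist())

# Partial-string row selection goes through .loc
first_day = df.loc["2017-04-01"]
print(len(first_day))
```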

Convert csv time format

I am trying to convert the time format in a CSV file from "21-03-2019 00:10:00" to "2019-03-21 00:10:00". I have spent hours on this and it still doesn't work; I hope you can point out where I went wrong.
I am using Python 3.
import pandas as pd
import datetime
data = pd.read_csv('/Users/dongmintian994410/Downloads/Data/FM02.csv', header=0)
for i in range(0, len(data)):
    row = data.iloc[i]['Date Time']
Now I can print each row's time value, but I don't know how to continue.
I would like to convert the time format from "21-03-2019 00:10:00" to "2019-03-21 00:10:00".
from datetime import datetime
import pandas as pd
data = pd.read_csv('/Users/dongmintian994410/Downloads/Data_Capiatalwater/FM002 2.csv', header=0)
for i in range(0, len(data)):
    row = data.iloc[i]['Date Time']
    datetime_str = row
    # The format string must match the file; dates written with dashes would need '%d-%m-%Y %H:%M:%S'
    datetime_object = datetime.strptime(datetime_str, '%d/%m/%Y %H:%M:%S')
    print(datetime_object)
Finally, I figured this out.
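The row-by-row loop can also be replaced with a vectorized conversion. A minimal sketch, assuming the 'Date Time' column holds day-first strings with dashes as in the question (the two rows are made up to stand in for the CSV):

```python
import pandas as pd

# Made-up rows standing in for the 'Date Time' column of the CSV
data = pd.DataFrame({"Date Time": ["21-03-2019 00:10:00", "22-03-2019 13:45:30"]})

# Parse the day-first strings in one call instead of looping over rows
parsed = pd.to_datetime(data["Date Time"], format="%d-%m-%Y %H:%M:%S")

# Write them back as year-first strings
data["Date Time"] = parsed.dt.strftime("%Y-%m-%d %H:%M:%S")
print(data)
```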

How to convert a datatype of pandas dataframe from str to float in Python3?

import pandas as pd
d=[('Shubham',24),
('Shrikant',58),
('na',34)]
df = pd.DataFrame(d,columns=['Name','Age'])
df.dtypes
Output:
Name object
Age int32
dtype: object
How do I convert the datatype of the 'Name' column to float?
df['Name'].astype(float)
Getting below error:
ValueError: could not convert string to float: 'na'
If you mean converting the names into numbers, then no: as far as I know, a string cannot be turned into a number directly with astype. If you meant to encode the names as integer codes, it goes as follows:
import pandas as pd
d=[('Shubham',24),
('Shrikant',58),
('na',34)]
df = pd.DataFrame(d,columns=['Name','Age'])
df['Name'] = df['Name'].astype('category').cat.codes
print(df.head())
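To make the two cases concrete, the sketch below prints the category codes for the question's data and, separately, shows pd.to_numeric with errors='coerce' for a column that holds numeric text mixed with placeholders. The second column (raw) is hypothetical and not part of the question's data:

```python
import pandas as pd

d = [("Shubham", 24), ("Shrikant", 58), ("na", 34)]
df = pd.DataFrame(d, columns=["Name", "Age"])

# Category codes follow the sorted order of the distinct values
codes = df["Name"].astype("category").cat.codes
print(codes.tolist())

# Hypothetical column of numeric text with a placeholder entry;
# errors="coerce" turns anything unparseable into NaN
raw = pd.Series(["1.5", "na", "3"])
as_float = pd.to_numeric(raw, errors="coerce")
print(as_float.tolist())
```

The codes are arbitrary labels, not measurements, so they should not be treated as meaningful numeric values.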
