Add/Subtract UTC Time to Datetime 'Time' column - python-3.x

I have a sample dataframe as given below.
import pandas as pd
import numpy as np
data = {'InsertedDate': ['2022-01-21 20:13:19.000000', '2022-01-21 20:20:24.000000',
                         '2022-02-02 16:01:49.000000', '2022-02-09 15:01:31.000000'],
        'UTCOffset': ['-05:00', '+02:00', '-04:00', '+06:00']}
df = pd.DataFrame(data)
df['InsertedDate'] = pd.to_datetime(df['InsertedDate'])
df
The 'InsertedDate' is a datetime column, whereas the 'UTCOffset' is a string column.
I want to add the offset to the 'InsertedDate' column and display the final result in a new 'datetime' column.
It should look something like this image shown below.
Any help is greatly appreciated. Thank you!

You can use pd.to_timedelta to parse the offset, then add it to the datetime column.
# to_timedelta needs the [+-]HH:MM:SS format, so append ':00' to fill the :SS part.
df['UTCOffset'] = pd.to_timedelta(df.UTCOffset + ':00')
df['CorrectTime'] = df.InsertedDate + df.UTCOffset
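For reference, a complete runnable sketch of this approach (it strips a leading '+' defensively, since some pandas versions parse signed timedelta strings strictly):

```python
import pandas as pd

df = pd.DataFrame({
    'InsertedDate': ['2022-01-21 20:13:19.000000', '2022-01-21 20:20:24.000000'],
    'UTCOffset': ['-05:00', '+02:00'],
})
df['InsertedDate'] = pd.to_datetime(df['InsertedDate'])

# to_timedelta wants [-]HH:MM:SS, so drop any leading '+' and append the seconds part
offsets = pd.to_timedelta(df['UTCOffset'].str.lstrip('+') + ':00')
df['CorrectTime'] = df['InsertedDate'] + offsets
```

Adding the offset converts the stored UTC timestamp to local clock time; subtract it to go the other way.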

Related

How to convert a string to pandas datetime

I have a column in my pandas DataFrame which is a string, and I want to convert it to a pandas date so that I can sort by it.
import pandas as pd
dat = pd.DataFrame({'col' : ['202101', '202212']})
dat['col'].astype('datetime64[ns]')
However, this raises an error. Could you please help me find the correct way to do this?
I think this code should work.
dat['date'] = pd.to_datetime(dat['col'], format= "%Y%m")
dat['date'] = dat['date'].dt.to_period('M')
dat.sort_values(by = 'date')
If you want to keep the sorted dataframe, pass inplace=True to sort_values.
Your code didn't work because the format string didn't match your dates. If your dates also included a day, for example 20210131 (yyyymmdd), this code would be enough:
dat['date'] = pd.to_datetime(dat['col'], format="%Y%m%d")
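Putting the accepted approach together, a minimal runnable sketch (the extra '202106' row is just sample data to make the sort visible):

```python
import pandas as pd

dat = pd.DataFrame({'col': ['202212', '202101', '202106']})
# %Y%m parses each year-month string to the first day of that month
dat['date'] = pd.to_datetime(dat['col'], format='%Y%m')
dat = dat.sort_values(by='date')
```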

Pandas add new column in csv and save

I have code like:
import pandas as pd
df = pd.read_csv('file.csv')
for id1, id2 in zip(df.iterrows(), df.loc[1:].iterrows()):
    id1[1]['X_Next'] = id2[1]['X']
As you can see, I need each row to hold the next row's column value.
The iteration looks right, but I don't know how to save it back to the csv file.
Can someone help me? Thanks!
IIUC use Series.shift:
df = pd.read_csv('file.csv')
df['X_Next'] = df['X'].shift(-1)
df.to_csv('file1.csv', index=False)
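A small self-contained illustration of what shift(-1) does, with no CSV needed (the column name 'X' follows the question):

```python
import pandas as pd

df = pd.DataFrame({'X': [10, 20, 30]})
# each row gets the next row's X; the last row has nothing after it, so it becomes NaN
df['X_Next'] = df['X'].shift(-1)
```

Note that the introduced NaN forces the shifted column to a float dtype.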

Convert Dataframe Column from Series to Datetime

Unable to convert DataFrame column to date time format.
from datetime import datetime
import pandas as pd
Holidays = pd.DataFrame({'Date':['2016-01-01','2016-01-06','2016-02-09','2016-02-10','2016-03-20'], 'Expenditure':[907.2,907.3,904.8,914.6,917.3]})
Holidays['Date'] = pd.to_datetime(Holidays['Date'])
type(Holidays['Date'])
Output: pandas.core.series.Series
Also tried
Holidays['Date'] = Holidays['Date'].astype('datetime64[ns]')
type(Holidays['Date'])
But same output
Output: pandas.core.series.Series
I think you are getting a bit mixed up. The dtype of Holidays['Date'] is datetime64[ns]; type() reports the container class, which is always pandas.core.series.Series for a column.
Here's how I am checking.
from datetime import datetime
import pandas as pd
Holidays = pd.DataFrame({'Date':['2016-01-01','2016-01-06','2016-02-09','2016-02-10','2016-03-20'], 'Expenditure':[907.2,907.3,904.8,914.6,917.3]})
print ('Before converting : ' , Holidays['Date'].dtypes)
Holidays['Date'] = pd.to_datetime(Holidays['Date'])
print ('After converting : ' ,Holidays['Date'].dtypes)
The output is:
Before converting : object
After converting : datetime64[ns]
I thought I would also share some additional information about types and dtypes. See this link for more: types-and-dtypes
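To make the distinction concrete: type() tells you what kind of object holds the data, while .dtype tells you what the elements are, and only the latter is changed by to_datetime:

```python
import pandas as pd

s = pd.Series(['2016-01-01', '2016-01-06'])
print(type(s), s.dtype)   # the container is a Series, the elements are object (strings)

s = pd.to_datetime(s)
print(type(s), s.dtype)   # still a Series, but the elements are now datetime64[ns]
```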

Python - Filtering Pandas Timestamp Index

Given Timestamp indices with many per day, how can I get a list containing only the last Timestamp of a day? So in case I have such:
import pandas as pd
all = [pd.Timestamp('2016-05-01 10:23:45'),
       pd.Timestamp('2016-05-01 18:56:34'),
       pd.Timestamp('2016-05-01 23:56:37'),
       pd.Timestamp('2016-05-02 03:54:24'),
       pd.Timestamp('2016-05-02 14:32:45'),
       pd.Timestamp('2016-05-02 15:38:55')]
I would like to get:
# End of Day:
EoD = [pd.Timestamp('2016-05-01 23:56:37'),
       pd.Timestamp('2016-05-02 15:38:55')]
Thx in advance!
Try pandas groupby
all = pd.Series(all)
all.groupby([all.dt.year, all.dt.month, all.dt.day]).max()
You get
2016 5 1 2016-05-01 23:56:37
2 2016-05-02 15:38:55
I've created an example dataframe.
import pandas as pd
all = [pd.Timestamp('2016-05-01 10:23:45'),
       pd.Timestamp('2016-05-01 18:56:34'),
       pd.Timestamp('2016-05-01 23:56:37'),
       pd.Timestamp('2016-05-02 03:54:24'),
       pd.Timestamp('2016-05-02 14:32:45'),
       pd.Timestamp('2016-05-02 15:38:55')]
df = pd.DataFrame({'values': 0}, index=all)
Assuming your dataframe is structured like the example and, most importantly, is sorted by index, the code below should help you.
for date in set(df.index.date):
    print(df[df.index.date == date].iloc[-1, :])
For each unique date in your dataframe this returns the last row of that day's slice; since the index is sorted, that is the last record of the day. And hey, it's pythonic. (I believe so, at least.)
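For what it's worth, the loop above can also be expressed without explicit iteration by grouping the frame on its calendar date and keeping the last row of each group (this assumes, like the answer, a sorted index):

```python
import pandas as pd

ts = [pd.Timestamp('2016-05-01 10:23:45'),
      pd.Timestamp('2016-05-01 23:56:37'),
      pd.Timestamp('2016-05-02 03:54:24'),
      pd.Timestamp('2016-05-02 15:38:55')]
df = pd.DataFrame({'values': 0}, index=ts)

# tail(1) keeps the last row of each day's group, preserving the original order
eod = df.groupby(df.index.date).tail(1)
```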

extract information from single cells from pandas dataframe

I'm looking to pull specific information from the table below to use in other functions. For example extracting the volume on 1/4/16 to see if the volume traded is > 1 million. Any thoughts on how to do this would be greatly appreciated.
import pandas as pd
import pandas.io.data as web  # deprecated: recent pandas moved this to pandas_datareader (from pandas_datareader import data as web)
import datetime
start = datetime.datetime(2016,1,1)
end = datetime.date.today()
apple = web.DataReader("AAPL", "yahoo", start, end)
type(apple)
apple.head()
Results: (output table omitted)
The DataReader will return a df with a DatetimeIndex; you can use partial datetime string matching with loc to select the specific row and column:
apple.loc['2016-04-01','Volume']
To test whether this is larger than 1 million, just compare it:
apple.loc['2016-04-01','Volume'] > 1000000
which will return True or False
