Following my initial question here with very constructive answers, I want to customize it for my original dataframe.
The dataframe I need to change was produced by nightframe=new_df.between_time('22:00','04:00'), and its first few rows look like this:
date time diffs criteria 1
datetime
2018-01-05 22:00:00 2018-01-05 22:00:00 0.0 True
2018-01-05 23:00:00 2018-01-05 23:00:00 -1.0 False
2018-01-06 00:00:00 2018-01-06 00:00:00 1.0 True
2018-01-06 01:00:00 2018-01-06 01:00:00 -2.0 False
2018-01-06 02:00:00 2018-01-06 02:00:00 -1.0 True
2018-01-06 03:00:00 2018-01-06 03:00:00 1.0 True
2018-01-06 04:00:00 2018-01-06 04:00:00 1.0 False
2018-01-06 22:00:00 2018-01-06 22:00:00 -1.0 True
I need to assign the date to the previous date if the time is from 00:00 to 04:00. I have tried these codes for my condition and they do not work:
condition = nightframe['time'].isin([0,1,2,3,4])
condition = nightframe['time'].dt.time.isin(\
['00:00','01:00','02:00','03:00','04:00'])
condition = nightframe['time'](['00:00','01:00','02:00','03:00','04:00'])
If the condition works, I suppose that the dataframe I need can be given from: nightframe['date']=np.where(condition, nightframe['date']-pd.Timedelta('1 day'), nightframe['date']) and should give this view:
date time diffs criteria 1
datetime
2018-01-05 22:00:00 2018-01-05 22:00:00 0.0 True
2018-01-05 23:00:00 2018-01-05 23:00:00 -1.0 False
2018-01-06 00:00:00 2018-01-05 00:00:00 1.0 True
2018-01-06 01:00:00 2018-01-05 01:00:00 -2.0 False
2018-01-06 02:00:00 2018-01-05 02:00:00 -1.0 True
2018-01-06 03:00:00 2018-01-05 03:00:00 1.0 True
2018-01-06 04:00:00 2018-01-05 04:00:00 1.0 False
2018-01-06 22:00:00 2018-01-06 22:00:00 -1.0 True
2018-01-06 23:00:00 2018-01-06 23:00:00 1.0 True
2018-01-07 00:00:00 2018-01-06 00:00:00 0.0 False
2018-01-07 01:00:00 2018-01-06 01:00:00 1.0 True
2018-01-07 02:00:00 2018-01-06 02:00:00 0.0 False
2018-01-07 03:00:00 2018-01-06 03:00:00 -1.0 False
2018-01-07 04:00:00 2018-01-06 04:00:00 1.0 True
2018-01-07 22:00:00 2018-01-07 22:00:00 1.0 True
Note: the 'datetime' is the index of my dataframe and the types of the columns of nightframe are:
print(nightframe.dtypes)
date object
time object
diffs float64
criteria 1 object
dtype: object
print(nightframe.index.dtype)
datetime64[ns]
Thank you for your contributions. Here is the code that worked for me.
I chose to keep the 'date' column and create a new 'date2' column so that the two can be compared. To overwrite the original column instead, use 'date' in place of 'date2' in the following:
hours=nightframe.index.hour
condition=hours.isin([0,1,2,3,4])
nightframe['date2']=np.where(condition, \
nightframe['date']-pd.Timedelta('1 day'), \
nightframe['date'])
print(nightframe.head(20))
Output[]:
date time diffs criteria 1 date2
datetime
2018-01-05 13:00:00 2018-01-05 13:00:00 0.0 NaN 2018-01-05
2018-01-05 14:00:00 2018-01-05 14:00:00 -1.0 False 2018-01-05
2018-01-05 15:00:00 2018-01-05 15:00:00 0.0 True 2018-01-05
2018-01-05 16:00:00 2018-01-05 16:00:00 -2.0 False 2018-01-05
2018-01-05 17:00:00 2018-01-05 17:00:00 1.0 True 2018-01-05
2018-01-05 18:00:00 2018-01-05 18:00:00 1.0 False 2018-01-05
2018-01-05 19:00:00 2018-01-05 19:00:00 -1.0 False 2018-01-05
2018-01-05 20:00:00 2018-01-05 20:00:00 0.0 True 2018-01-05
2018-01-05 21:00:00 2018-01-05 21:00:00 -2.0 False 2018-01-05
2018-01-05 22:00:00 2018-01-05 22:00:00 0.0 True 2018-01-05
2018-01-05 23:00:00 2018-01-05 23:00:00 -2.0 False 2018-01-05
2018-01-06 00:00:00 2018-01-06 00:00:00 -2.0 True 2018-01-05
2018-01-06 01:00:00 2018-01-06 01:00:00 0.0 True 2018-01-05
2018-01-06 02:00:00 2018-01-06 02:00:00 -1.0 False 2018-01-05
2018-01-06 03:00:00 2018-01-06 03:00:00 0.0 True 2018-01-05
2018-01-06 04:00:00 2018-01-06 04:00:00 -1.0 False 2018-01-05
2018-01-06 05:00:00 2018-01-06 05:00:00 -2.0 False 2018-01-06
2018-01-06 06:00:00 2018-01-06 06:00:00 -1.0 True 2018-01-06
2018-01-06 07:00:00 2018-01-06 07:00:00 0.0 True 2018-01-06
2018-01-06 08:00:00 2018-01-06 08:00:00 0.0 True 2018-01-06
You can use between_time again to select the rows whose date should be moved back by one day, something like:
nightframe.loc[nightframe.between_time('00:00','04:00').index, 'date'] -= pd.Timedelta(days=1)
If this fails, you may need to convert the date column first with pd.to_datetime:
nightframe['date'] = pd.to_datetime(nightframe['date'])
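The between_time idea above can be tried end to end with the sketch below. It is only an illustration under assumptions: the hourly sample index and the constant 'diffs' column are made up, and 'date' is built as a datetime64 column from the index rather than taken from the original object column.

```python
import pandas as pd

# Hypothetical hourly sample covering two nights; only the index matters here.
idx = pd.date_range('2018-01-05 22:00', '2018-01-07 04:00', freq='h')
nightframe = (
    pd.DataFrame({'diffs': 0.0}, index=idx)
    .between_time('22:00', '04:00')
    .copy()
)

# Calendar date of each row, stored as a datetime64 column.
nightframe['date'] = nightframe.index.normalize()

# Rows between midnight and 04:00 belong to the night that began the day before.
after_midnight = nightframe.between_time('00:00', '04:00').index
nightframe.loc[after_midnight, 'date'] -= pd.Timedelta(days=1)

print(nightframe)
```

Using .loc with the index returned by between_time keeps the assignment on the original frame, so no separate condition array is needed.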
Try this one:
df["date"] = pd.to_datetime(df["date"]) + datetime.timedelta(days=-1) * (pd.to_datetime(df["time"])<="04:00:00")
Related
I have a dataframe which looks like this, and index is datetime64 of numpy:
(index) data
2017-01-01 00:00:00 1
2017-01-01 01:00:00 2
2017-01-01 02:00:00 3
…… ……
2017-01-04 00:00:00 73
2017-01-04 01:00:00 nan
2017-01-04 02:00:00 75
…… ……
Now I want to split the data into windows that are each 72 rows (72 hours) wide and do not overlap, like this:
windows1:
(index) data
2017-01-01 00:00:00 1
2017-01-01 01:00:00 2
2017-01-01 02:00:00 3
…… ……
2017-01-03 23:00:00 72
windows2:
(index) data
2017-01-04 00:00:00 73
# data of 2017-01-04 01:00:00 is nan, removed
2017-01-04 02:00:00 75
…… ……
2017-01-06 23:00:00 144
So how can I achieve this with DataFrame.rolling or Series.rolling? If there is no easy answer, I will use the index itself to solve the problem.
A 72H rolling can be achieved with df.rolling('72H').sum() (or any other function than sum)
But it looks like you don't want a rolling but rather a groupby with floor:
for k, g in df.groupby(df.index.floor('72H')):
    print(f'New group: {k}\n', g.head(), '\n')
output:
New group: 2016-12-31 00:00:00
data
index
2017-01-01 00:00:00 1
2017-01-01 01:00:00 2
2017-01-01 02:00:00 3
2017-01-01 03:00:00 4
2017-01-01 04:00:00 5
New group: 2017-01-03 00:00:00
data
index
2017-01-03 00:00:00 49
2017-01-03 01:00:00 50
2017-01-03 02:00:00 51
2017-01-03 03:00:00 52
2017-01-03 04:00:00 53
To compute, for example, the mean:
df.groupby(df.index.floor('72H')).mean()
data
index
2016-12-31 24.5
2017-01-03 73.0
alternative
group = (df.index-df.index[0])//pd.Timedelta('72H')
df.groupby(group).mean()
Used input:
df = pd.DataFrame({'index': pd.date_range('2017-01-01', '2017-01-05', freq='1H'),
'data': np.arange(1, 98)}).set_index('index')
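Note that floor('72H') aligns the bins to the Unix epoch, which is why the first group above starts on 2016-12-31, whereas the alternative counts windows from the first timestamp. Here is a minimal sketch of the alternative that also drops the NaN row mentioned in the question. The NaN position is copied from the question; the dictionary named windows is only an illustrative way to hold the result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'data': np.arange(1, 98, dtype=float)},
    index=pd.date_range('2017-01-01', '2017-01-05', freq='h'),
)
# Reproduce the missing sample from the question.
df.loc[pd.Timestamp('2017-01-04 01:00'), 'data'] = np.nan

# Window number counted from the first timestamp, not from the Unix epoch.
window = (df.index - df.index[0]) // pd.Timedelta(hours=72)

# One DataFrame per non-overlapping 72-hour window, with NaN rows removed.
windows = {k: g.dropna() for k, g in df.groupby(window)}

for k, g in windows.items():
    print(f'window {k}: {g.index[0]} .. {g.index[-1]} ({len(g)} rows)')
```

With this anchoring, the first window runs from 2017-01-01 00:00 to 2017-01-03 23:00, which matches the layout asked for in the question.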
My dataset df looks like this:
DateTimeVal Open
2017-01-01 17:00:00 5.1532
2017-01-01 17:01:00 5.3522
2017-01-01 17:02:00 5.4535
2017-01-01 17:03:00 5.3567
2017-01-01 17:04:00 5.1512
....
The data is sampled at one-minute intervals.
In my calculation, a single day (24 hours) is defined as:
17:00:00 Sunday to 16:59:00 Monday, and so on for the other days.
What I want to do is find the AVG and STD of each such 24-hour period, from 17:00:00 Sunday to 16:59:00 Monday and so on for every day.
What did I do?
I used rolling to find the AVG, but it works on calendar days rather than on my 17:00-to-16:59 time range:
# day avg
# 7 day rolling avg
df = (
    df.assign(DAY_AVG=df.rolling(window=1*24*60)['Open'].mean())
    # '7DAY_AVG' is not a valid Python identifier, so it is passed via a dict
    .assign(**{'7DAY_AVG': df.rolling(window=7*24*60)['Open'].mean()})
    .groupby(df['DateTimeVal'].dt.date)
    .last()
)
I need help with these 2 things:
How do I find the AVG, and STD between fixed time period?
How do I find the AVG, and STD between fixed time period for 7D rolling and 14 Days rolling?
Use resample with base:
#Create empty dataframe for 2 days
df = pd.DataFrame(index = pd.date_range('2017-07-01', periods=48, freq='1H'))
#Set value equal to 1 from 17:00 to 16:59 next day
df.loc['2017-07-01 17:00:00': '2017-07-02 16:59:59', 'Value'] = 1
print(df)
Output:
Value
2017-07-01 00:00:00 NaN
2017-07-01 01:00:00 NaN
2017-07-01 02:00:00 NaN
2017-07-01 03:00:00 NaN
2017-07-01 04:00:00 NaN
2017-07-01 05:00:00 NaN
2017-07-01 06:00:00 NaN
2017-07-01 07:00:00 NaN
2017-07-01 08:00:00 NaN
2017-07-01 09:00:00 NaN
2017-07-01 10:00:00 NaN
2017-07-01 11:00:00 NaN
2017-07-01 12:00:00 NaN
2017-07-01 13:00:00 NaN
2017-07-01 14:00:00 NaN
2017-07-01 15:00:00 NaN
2017-07-01 16:00:00 NaN
2017-07-01 17:00:00 1.0
2017-07-01 18:00:00 1.0
2017-07-01 19:00:00 1.0
2017-07-01 20:00:00 1.0
2017-07-01 21:00:00 1.0
2017-07-01 22:00:00 1.0
2017-07-01 23:00:00 1.0
2017-07-02 00:00:00 1.0
2017-07-02 01:00:00 1.0
2017-07-02 02:00:00 1.0
2017-07-02 03:00:00 1.0
2017-07-02 04:00:00 1.0
2017-07-02 05:00:00 1.0
2017-07-02 06:00:00 1.0
2017-07-02 07:00:00 1.0
2017-07-02 08:00:00 1.0
2017-07-02 09:00:00 1.0
2017-07-02 10:00:00 1.0
2017-07-02 11:00:00 1.0
2017-07-02 12:00:00 1.0
2017-07-02 13:00:00 1.0
2017-07-02 14:00:00 1.0
2017-07-02 15:00:00 1.0
2017-07-02 16:00:00 1.0
2017-07-02 17:00:00 NaN
2017-07-02 18:00:00 NaN
2017-07-02 19:00:00 NaN
2017-07-02 20:00:00 NaN
2017-07-02 21:00:00 NaN
2017-07-02 22:00:00 NaN
2017-07-02 23:00:00 NaN
Now use resample with base=17:
df.resample('24H', base=17).sum()
Output:
Value
2017-06-30 17:00:00 0.0
2017-07-01 17:00:00 24.0
2017-07-02 17:00:00 0.0
Update for minute sampling:
df = pd.DataFrame({'Value': 0}, index = pd.date_range('2018-10-01', '2018-10-03', freq='1T'))
df.loc['2018-10-01 15:00:00':'2018-10-02 18:59:50', 'Value'] = 1
df.resample('24H', base=17).agg(['sum','mean'])
Output:
Value
sum mean
2018-09-30 17:00:00 120 0.117647
2018-10-01 17:00:00 1440 1.000000
2018-10-02 17:00:00 120 0.285036
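One caveat for newer pandas: the base argument of resample was deprecated in pandas 1.1 and removed in 2.0, with offset as the replacement. The sketch below repeats the minute-sampled example using offset and adds the standard deviation the question asked for. The variable name daily is only illustrative.

```python
import pandas as pd

# Same minute-sampled example as above.
df = pd.DataFrame(
    {'Value': 0},
    index=pd.date_range('2018-10-01', '2018-10-03', freq='min'),
)
df.loc['2018-10-01 15:00:00':'2018-10-02 18:59:50', 'Value'] = 1

# 24-hour bins that start at 17:00 instead of midnight.
daily = (
    df['Value']
    .resample('24h', offset=pd.Timedelta(hours=17))
    .agg(['sum', 'mean', 'std'])
)
print(daily)
```

Each row of daily is labelled with the 17:00 timestamp that opens its bin, so the row for 2018-10-01 17:00 covers 17:00 on 1 October up to 16:59 on 2 October.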
My df dataset looks like this:
time Open
2017-01-03 06:00:00 5.2475
2017-01-03 07:00:00 5.2475
2017-01-03 08:00:00 5.2180
2017-01-03 09:00:00 5.2128
2017-01-03 10:00:00 5.2128
2017-01-04 06:00:00 5.4122
2017-01-04 07:00:00 5.4122
2017-01-04 08:00:00 5.2123
2017-01-04 09:00:00 5.2475
2017-01-04 10:00:00 5.2475
2017-01-05 07:00:00 5.2180
2017-01-05 08:00:00 5.2128
2017-01-05 09:00:00 5.4122
2017-01-05 10:00:00 5.4122
....
I want to filter time values starting from '07:00:00' and include next 3 values
My new df should look like this:
time Open
2017-01-03 07:00:00 5.2475
2017-01-03 08:00:00 5.2180
2017-01-03 09:00:00 5.2128
2017-01-04 07:00:00 5.4122
2017-01-04 08:00:00 5.2123
2017-01-04 09:00:00 5.2475
2017-01-05 07:00:00 5.2180
2017-01-05 08:00:00 5.2128
2017-01-05 09:00:00 5.4122
....
Here, we are not including the '06:00:00' or '10:00:00' since we are only getting the data starting from '07:00:00' and the next 3 values
We need to preserve the order of the original df and just remove unwanted data in between that does not match the criteria of starting from '07:00:00' and 3 values after '07:00:00'
What did I do?
I tried to filter by selecting the time part but it only gives me one value when I do this:
df[df.index.time == datetime.time(7, 0)]
but I want the next 3 values. Doing head(3) does not work:
df[df.index.time == datetime.time(7, 0)].head(3)
Can you please help me?
Use between_time to select rows by time of day:
df = pd.DataFrame(data={"time":["2017-01-03 07:00:00","2017-01-03 06:00:00","2017-01-03 08:00:00","2017-01-03 10:00:00"],
"open":[5,5,5,4]})
df['time'] = pd.to_datetime(df['time'])
df.set_index("time",inplace=True)
res = df.between_time('07:00:00','09:00:00')
print(res)
open
time
2017-01-03 07:00:00 5
2017-01-03 08:00:00 5
In addition to your question, to keep only specific dates:
date_list = ['2017-01-03', '2017-01-02', '2017-01-07']
res = res[res.index.normalize().isin(date_list)]
To exclude the last date, you can do:
res = res[(res.index >= '2017-01-02') & (res.index < '2017-01-07')]
Compare the index times with 07:00 and build a helper Series with Series.cumsum, so that each 07:00 row starts a new group. Remove the rows where the helper is 0, because they come before the first 07:00 match, then take the first three rows of each group with GroupBy.head:
s = pd.Series(df.index.time == datetime.time(7, 0), index=df.index).cumsum()
df = df[s != 0].groupby(s).head(3)
print (df)
Open
time
2017-01-03 07:00:00 5.2475
2017-01-03 08:00:00 5.2180
2017-01-03 09:00:00 5.2128
2017-01-04 07:00:00 5.4122
2017-01-04 08:00:00 5.2123
2017-01-04 09:00:00 5.2475
2017-01-05 07:00:00 5.2180
2017-01-05 08:00:00 5.2128
2017-01-05 09:00:00 5.4122
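To make the helper Series easier to follow, here is a self-contained sketch of the same cumsum and head technique. The two-day sample index and the placeholder 'Open' values are assumptions made for illustration, and the names is_start and group_id are not from the original answer.

```python
import datetime
import pandas as pd

idx = pd.to_datetime([
    '2017-01-03 06:00', '2017-01-03 07:00', '2017-01-03 08:00',
    '2017-01-03 09:00', '2017-01-03 10:00',
    '2017-01-04 06:00', '2017-01-04 07:00', '2017-01-04 08:00',
    '2017-01-04 09:00', '2017-01-04 10:00',
])
df = pd.DataFrame({'Open': range(len(idx))}, index=idx)  # placeholder values

# True on every row whose time of day is exactly 07:00.
is_start = pd.Series(df.index.time == datetime.time(7, 0), index=df.index)

# Running count of 07:00 rows: 0 before the first match, then 1, 2, ...
group_id = is_start.cumsum()

# Drop the rows before the first 07:00, then keep 3 rows per block.
keep = group_id != 0
out = df[keep].groupby(group_id[keep]).head(3)
print(out)
```

Because a block only ends when the next 07:00 row appears, this keeps the first three rows after each 07:00 even if the data is not strictly hourly.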
If you need to filter by hours and by dates, use boolean indexing with isin:
date_list = ['2017-01-03', '2017-01-02', '2017-01-07']
df = df[df.index.hour.isin([7,8,9]) & df.index.floor('d').isin(date_list)]
print (df)
Open
time
2017-01-03 07:00:00 5.2475
2017-01-03 08:00:00 5.2180
2017-01-03 09:00:00 5.2128
Or by times and dates:
date_list = ['2017-01-03', '2017-01-02', '2017-01-07']
times = [datetime.time(7, 0), datetime.time(8, 0), datetime.time(9, 0)]
df = df[np.in1d(df.index.time, times) & df.index.floor('d').isin(date_list)]
print (df)
Open
time
2017-01-03 07:00:00 5.2475
2017-01-03 08:00:00 5.2180
2017-01-03 09:00:00 5.2128
I have 5 time series data in DataFrames and each of them have a different time scale.
For example, data1 is from 4/15 0:00 to 4/16 0:00, data2 is from 9/16 06:30 to 7:00.
All these data are in different DataFrames and I want to draw graphs of them with matplotlib. I want to set the number of x tick labels to 5 and put the date of the data only on the leftmost x tick label. I tried the code below but I couldn't get the graphs I wanted.
fig = plt.figure(figsize=(15, 3))
for i in range(1,6): # because I have 5 DataFrames in 'df_event_num'
    ax = plt.subplot(150+i)
    plt.title('event_num{}'.format(i))
    df_event_num[i-1]['Load_Avg'].plot(color=colors_2018[i-1])
    ax.tick_params(rotation=270)
fig.tight_layout()
And I got a graph like this
Again, I want to set the numbers of x tick labels to 5 and put a date just on the leftmost x tick label on every graph. And hopefully, I want to rotate the characters of x tick labels.
Could anyone teach me how to get graphs I want?
df_event_num has 5 DataFrames and I want to make time series graphs of the column data named 'Load_Avg'.
Here is the sample data of 'df_event_num'.
print(df_event_num[0]['Load_Avg'])
>>>
TIMESTAMP
2018-04-15 00:00:00 406.2
2018-04-15 00:30:00 407.4
2018-04-15 01:00:00 409.6
2018-04-15 01:30:00 403.3
2018-04-15 02:00:00 405.0
2018-04-15 02:30:00 401.8
2018-04-15 03:00:00 401.1
2018-04-15 03:30:00 401.0
2018-04-15 04:00:00 402.3
2018-04-15 04:30:00 402.5
2018-04-15 05:00:00 404.3
2018-04-15 05:30:00 404.7
2018-04-15 06:00:00 417.0
2018-04-15 06:30:00 438.9
2018-04-15 07:00:00 466.4
2018-04-15 07:30:00 476.6
2018-04-15 08:00:00 499.3
2018-04-15 08:30:00 523.1
2018-04-15 09:00:00 550.2
2018-04-15 09:30:00 590.2
2018-04-15 10:00:00 604.4
2018-04-15 10:30:00 622.4
2018-04-15 11:00:00 657.7
2018-04-15 11:30:00 737.2
2018-04-15 12:00:00 775.0
2018-04-15 12:30:00 819.0
2018-04-15 13:00:00 835.0
2018-04-15 13:30:00 848.0
2018-04-15 14:00:00 858.0
2018-04-15 14:30:00 866.0
2018-04-15 15:00:00 874.0
2018-04-15 15:30:00 879.0
2018-04-15 16:00:00 883.0
2018-04-15 16:30:00 889.0
2018-04-15 17:00:00 893.0
2018-04-15 17:30:00 894.0
2018-04-15 18:00:00 895.0
2018-04-15 18:30:00 897.0
2018-04-15 19:00:00 895.0
2018-04-15 19:30:00 898.0
2018-04-15 20:00:00 899.0
2018-04-15 20:30:00 900.0
2018-04-15 21:00:00 903.0
2018-04-15 21:30:00 904.0
2018-04-15 22:00:00 905.0
2018-04-15 22:30:00 906.0
2018-04-15 23:00:00 906.0
2018-04-15 23:30:00 907.0
2018-04-16 00:00:00 909.0
Freq: 30T, Name: Load_Avg, dtype: float64
print(df_event_num[1]['Load_Avg'])
>>>
TIMESTAMP
2018-04-25 06:30:00 1133.0
2018-04-25 07:00:00 1159.0
Freq: 30T, Name: Load_Avg, dtype: float64
print(df_event_num[2]['Load_Avg'])
TIMESTAMP
2018-06-28 09:30:00 925.0
2018-06-28 10:00:00 1008.0
Freq: 30T, Name: Load_Avg, dtype: float64
print(df_event_num[3]['Load_Avg'])
>>>
TIMESTAMP
2018-09-08 00:00:00 769.3
2018-09-08 00:30:00 772.4
2018-09-08 01:00:00 778.3
2018-09-08 01:30:00 787.5
2018-09-08 02:00:00 812.0
2018-09-08 02:30:00 825.0
2018-09-08 03:00:00 836.0
2018-09-08 03:30:00 862.0
2018-09-08 04:00:00 884.0
2018-09-08 04:30:00 905.0
2018-09-08 05:00:00 920.0
2018-09-08 05:30:00 926.0
2018-09-08 06:00:00 931.0
2018-09-08 06:30:00 942.0
2018-09-08 07:00:00 948.0
2018-09-08 07:30:00 956.0
2018-09-08 08:00:00 981.0
Freq: 30T, Name: Load_Avg, dtype: float64
print(df_event_num[4]['Load_Avg'])
>>>
TIMESTAMP
2018-09-30 21:00:00 252.2
2018-09-30 21:30:00 256.5
2018-09-30 22:00:00 264.1
2018-09-30 22:30:00 271.1
2018-09-30 23:00:00 277.7
2018-09-30 23:30:00 310.0
2018-10-01 00:00:00 331.6
2018-10-01 00:30:00 356.3
2018-10-01 01:00:00 397.2
2018-10-01 01:30:00 422.4
2018-10-01 02:00:00 444.2
2018-10-01 02:30:00 464.7
2018-10-01 03:00:00 477.2
2018-10-01 03:30:00 487.2
2018-10-01 04:00:00 494.7
2018-10-01 04:30:00 515.2
2018-10-01 05:00:00 527.6
2018-10-01 05:30:00 537.5
2018-10-01 06:00:00 541.7
Freq: 30T, Name: Load_Avg, dtype: float64
I modified your code a little bit:
You do not need to use range() to loop over, you can iterate directly over the list of DataFrames
Use the created ax subplot to set the data and the title on it.
Create 5 linearly spaced ticks on the x-axis based on the first and last index of the individual dataframe: pd.to_datetime(np.linspace(df.index[0].value, df.index[-1].value, 5))
Use just the last value as a label, and replace all the others with empty strings: ts_names = ['','','','',ts_loc[-1]]
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
colors_2018 = ['red', 'blue', 'green', 'yellow', 'orange', 'brown']
fig = plt.figure(figsize=(15, 4))
for i, df in enumerate(df_event_num): # because I have 5 DataFrames in 'df_event_num'
    ax = plt.subplot(1,5,i+1)
    ax.plot(df['Load_Avg'], color=colors_2018[i])
    # enumerate starts at 0, so add 1 to keep the titles event_num1..event_num5
    ax.set_title('event_num{}'.format(i+1))
    # If the index is not a Timestamp-type already:
    df.index = pd.to_datetime(df.index)
    # x-Axis locations of 5 timestamps
    ts_loc = pd.to_datetime(np.linspace(df.index[0].value, df.index[-1].value, 5))
    ax.set_xticks(ts_loc, minor=False)
    # Names of the timestamps (only last shown)
    ts_names = ['','','','',ts_loc[-1]]
    ax.set_xticklabels(ts_names, rotation="vertical")
fig.tight_layout()
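The question asked for the date on the leftmost tick rather than the last one. A possible variation is sketched below: it only computes tick locations and labels, so it runs without matplotlib. The helper name leftmost_date_ticks and the label formats are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def leftmost_date_ticks(index, n_ticks=5):
    """Return (locations, labels) with the date shown only on the first tick."""
    # Evenly spaced timestamps between the first and last index value.
    locs = pd.date_range(index[0], index[-1], periods=n_ticks).round('s')
    # Time of day on every tick ...
    labels = [ts.strftime('%H:%M') for ts in locs]
    # ... and the full date on the leftmost tick only.
    labels[0] = locs[0].strftime('%Y-%m-%d %H:%M')
    return locs, labels

# Index shaped like the first sample DataFrame in the question.
index = pd.date_range('2018-04-15 00:00', '2018-04-16 00:00', freq='30min')
locs, labels = leftmost_date_ticks(index)
print(labels)
```

Inside the plotting loop, the result could be applied with ax.set_xticks(locs) followed by ax.set_xticklabels(labels, rotation="vertical").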
I would like to run time series analysis on repeated-measures data (time only, no dates) taken overnight from 22:00:00 to 09:00:00 the next morning.
How can I set the time so that the time series starts at 22:00:00? At the moment, even when plotting, it starts at 00:00:00 and ends at 23:00:00, with a flat line between 09:00:00 and 23:00:00.
df = pd.read_csv('1310.csv', parse_dates=True)
df['Time'] = pd.to_datetime(df['Time'])
df['Time'].apply( lambda d : d.time() )
df = df.set_index('Time')
df['2017-05-16 22:00:00'] + pd.Timedelta('-1 day')
Note: The date in the last line of code is automatically added, seen when df['Time'] is executed, so I inserted the same format with date in the last line for 22:00:00.
This is the error:
TypeError: Could not operate Timedelta('-1 days +00:00:00') with block values unsupported operand type(s) for +: 'numpy.ndarray' and 'Timedelta'
You should consider your timestamps as pd.Timedeltas and add a day to the samples before your start time.
Create some example data:
import numpy as np
import pandas as pd
d = pd.date_range(start='22:00:00', periods=12, freq='h')
s = pd.Series(d).dt.time
# pd.np is no longer available in current pandas, so use numpy directly
df = pd.DataFrame(np.random.randn(len(s)), index=s, columns=['value'])
df.to_csv('data.csv')
df
value
22:00:00 -0.214977
23:00:00 -0.006585
00:00:00 0.568259
01:00:00 0.603196
02:00:00 0.358124
03:00:00 0.027835
04:00:00 -0.436322
05:00:00 0.627624
06:00:00 0.168189
07:00:00 -0.321916
08:00:00 0.737383
09:00:00 1.100500
Read in, make index a timedelta, add a day to timedeltas before the start time, then assign back to the index.
df2 = pd.read_csv('data.csv', index_col=0)
df2.index = pd.to_timedelta(df2.index)
s = pd.Series(df2.index)
s[s < pd.Timedelta('22:00:00')] += pd.Timedelta('1d')
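# Anchor the offsets at 1970-01-01; recent pandas raises a TypeError for pd.to_datetime on timedelta data
df2.index = pd.DatetimeIndex(pd.Timestamp(0) + s)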
df2
value
1970-01-01 22:00:00 -0.214977
1970-01-01 23:00:00 -0.006585
1970-01-02 00:00:00 0.568259
1970-01-02 01:00:00 0.603196
1970-01-02 02:00:00 0.358124
1970-01-02 03:00:00 0.027835
1970-01-02 04:00:00 -0.436322
1970-01-02 05:00:00 0.627624
1970-01-02 06:00:00 0.168189
1970-01-02 07:00:00 -0.321916
1970-01-02 08:00:00 0.737383
1970-01-02 09:00:00 1.100500
If you want to set the date of the first day:
df2.index += (pd.Timestamp('2015-06-06') - pd.Timestamp(0))
df2
value
2015-06-06 22:00:00 -0.214977
2015-06-06 23:00:00 -0.006585
2015-06-07 00:00:00 0.568259
2015-06-07 01:00:00 0.603196
2015-06-07 02:00:00 0.358124
2015-06-07 03:00:00 0.027835
2015-06-07 04:00:00 -0.436322
2015-06-07 05:00:00 0.627624
2015-06-07 06:00:00 0.168189
2015-06-07 07:00:00 -0.321916
2015-06-07 08:00:00 0.737383
2015-06-07 09:00:00 1.100500
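The whole approach can also be written without going through a CSV file, as in the sketch below. The time strings and values are made up for illustration, and the anchor date 2015-06-06 is simply the one used above.

```python
import pandas as pd

# Hypothetical overnight readings, time of day only.
times = ['22:00:00', '23:00:00', '00:00:00', '01:00:00', '09:00:00']
values = [1.0, 2.0, 3.0, 4.0, 5.0]

# Treat each time of day as an offset from midnight.
offsets = pd.to_timedelta(pd.Series(times))

# Readings earlier than the 22:00 start belong to the following day.
start = pd.Timedelta(hours=22)
offsets = offsets.where(offsets >= start, offsets + pd.Timedelta(days=1))

# Attach the offsets to the date on which the recording started.
anchor = pd.Timestamp('2015-06-06')
df = pd.DataFrame(
    {'value': values},
    index=pd.DatetimeIndex(anchor + offsets, name='Time'),
)
print(df)
```

The resulting index increases monotonically from 22:00 to 09:00 the next morning, so a plot of df starts at 22:00 and has no flat stretch during the day.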