LSTM prediction model: the loss value doesn't change - keras

I'm trying to implement a simple LSTM prediction model in Keras for timeseries data. I have 10 timeseries with lookback_window=28 and a single feature, and I need to predict the next value (timesteps=28, n_features=1). Here is my model and the way I tried to train it:
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras import callbacks

model = Sequential()
model.add(LSTM(28, batch_input_shape=(49, 28, 1), stateful=True, return_sequences=True))
model.add(LSTM(14, stateful=True))
model.add(Dense(1, activation='relu'))

earlyStopping = callbacks.EarlyStopping(monitor='val_loss', patience=100, verbose=1, mode='auto')
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(train_data, train_y,
                    epochs=1000,
                    callbacks=[earlyStopping],
                    batch_size=49,
                    validation_data=(validation_data, validation_y),
                    verbose=1,
                    shuffle=False)
prediction_result = model.predict(test_data, batch_size=49)
I'm not resetting the states after each epoch and not shuffling, because the order within the timeseries matters and there is a connection between them. The problem is that the loss value sometimes changes slightly after the first epoch and then stays constant; most of the time it doesn't change at all. I tried a different optimizer like RMSprop, changed its learning rate, removed early stopping to let it train longer, changed the batch size and even trained without batching, tried the same model stateless, set shuffle=True, and added more layers to make it deeper, but none of that made any difference! I wonder what I'm doing wrong. Any suggestions?
P.S. My data consists of 10 timeseries, each of length 567:
timeseries#1: 451, 318, 404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, ....
timeseries#2: 304, 274, 150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, ....
...
timeseries#10: 208, 138, 201, 342, 280, 282, 280, 140, 124, 261, 193, .....
My lookback window is 28, so I generated the following sequences with 28 timesteps:
[451, 318, 404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, .... ]
[318, 404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, 56, ....]
[404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, 56, 890, ....]
...
[304, 274, 150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, ....]
[274, 150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, 127, ....]
[150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, 127, 798, ....]
...
[208, 138, 201, 342, 280, 282, 280, 140, 124, 261, 193, .....]
[138, 201, 342, 280, 282, 280, 140, 124, 261, 193, 854, .....]
Then I split my data as follows (data.shape=(5390,28,1); the 5390 samples cover all 10 timeseries):
num_training_ts = int(data.shape[0] / 539 * (1 - config['validation_split_ratio']))
train_size = num_training_ts * 539
train_data = data[:train_size, :, :]
train_y = y[:train_size]
validation_data = data[train_size:-1*539, :, :]
validation_y = y[train_size:-1*539]
test_data = data[-1*539:, :, :] # The last timeseries
test_y = y[-1*539:]
I scaled the data between -1 and 1 using MinMaxScaler, but here for simplicity I'm showing the actual values. In the end I have the following:
train_data.shape=(3234,28,1)
train_y.shape=(3234,)
test_data.shape=(539,28,1)
test_y.shape=(539,)
validation_data.shape=(1617,28,1)
validation_y.shape=(1617,)

When I run into this kind of issue, I first focus on the data: Is it scaled? Do I have enough data for this model?
Then I move on to the model. In your case it seems that all the learning happens in the first iteration, so why don't you try changing the learning rate and the decay of your optimizer?
With Keras that's easy. First define your optimizer (in your code you used 'adam'):
my_adam_optimizer = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
then use it in the compile call:
model.compile(loss='mean_squared_error', optimizer=my_adam_optimizer)
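For example, to try a smaller learning rate with a little decay (the values here are only illustrative, not a recommendation):
my_adam_optimizer = keras.optimizers.Adam(lr=0.0001, decay=1e-6)
model.compile(loss='mean_squared_error', optimizer=my_adam_optimizer)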
UPDATE:
The last relu layer 'cuts off' negative values, so if your target contains negatives the model cannot predict them. Earlier in the topic you said you used MinMaxScaler between -1 and 1, so that is certainly causing problems. By removing the activation parameter you get the default, which is 'linear'.
Removing the relu activation from the last layer should fix the problem!
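A minimal sketch of the corrected output layer (only the last Dense layer changes; everything else stays as in the question):
model.add(Dense(1))  # no activation argument: defaults to linear, so negative scaled targets are reachable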

How can I amend my code to make my answer join together to form a sentence?

I am trying to decipher a list with a Caesar cipher using Python.
ciphertext_3: [225, 233, 228, 172, 160, 237, 249, 160, 228, 229, 225, 242, 160, 247, 225, 244, 243, 239, 238, 172, 160, 244, 232, 225, 244, 160, 237, 239, 243, 244, 160, 239, 230, 160, 249, 239, 245, 242, 160, 227, 239, 238, 227, 236]
Here is my code:
for x in ciphertext_3:
    k = chr(x - 123)
    answer = ''.join(k)
    print(answer)
but my output shows that the letters are not joined together; it prints one letter per line instead.
Try this:
answer = ''
for x in ciphertext_3:
    k = chr(x - 123)
    answer = answer + k
print(answer)
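A slightly more idiomatic alternative with the same logic, using str.join over a generator expression:
answer = ''.join(chr(x - 123) for x in ciphertext_3)
print(answer)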

Embedding: argument indices must be a Tensor, not a list

I am trying to train an RNN, but I am having trouble with my embedding.
I am getting the following error message:
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not list
The forward method starts like this:
def forward(self, word_indices: [int]):
    print("sentences")
    print(len(word_indices))
    print(word_indices)
    word_ind_tensor = torch.tensor(word_indices, device="cpu")
    print(word_ind_tensor)
    print(word_ind_tensor.size())
    embeds_word = self.embedding_word(word_indices)
The output of all of that is:
sentences
29
[261, 15, 5149, 44, 287, 688, 1125, 4147, 9874, 582, 15, 9875, 3, 2, 6732, 34, 2, 6733, 9, 2, 485, 7, 6734, 3, 741, 2, 2179, 1571, 1]
tensor([ 261, 15, 5149, 44, 287, 688, 1125, 4147, 9874, 582, 15, 9875,
3, 2, 6732, 34, 2, 6733, 9, 2, 485, 7, 6734, 3,
741, 2, 2179, 1571, 1])
torch.Size([29])
Traceback (most recent call last):
File "/home/lukas/Documents/HU/Materialen/21SoSe-Studienprojekt/flair-Studienprojekt/TestModel.py", line 68, in <module>
embeddings_storage_mode = "CPU") # change to cuda
File "/home/lukas/Documents/HU/Materialen/21SoSe-Studienprojekt/flair-Studienprojekt/flair/trainers/trainer.py", line 423, in train
loss = self.model.forward_loss(batch_step)
File "/home/lukas/Documents/HU/Materialen/21SoSe-Studienprojekt/flair-Studienprojekt/flair/models/sandbox/srl_tagger.py", line 122, in forward_loss
features = self.forward(word_indices = sent_word_ind, frame_indices = sent_frame_ind)
File "/home/lukas/Documents/HU/Materialen/21SoSe-Studienprojekt/flair-Studienprojekt/flair/models/sandbox/srl_tagger.py", line 147, in forward
embeds_word = self.embedding_word(word_indices)
File "/home/lukas/miniconda3/envs/studienprojekt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/lukas/miniconda3/envs/studienprojekt/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/home/lukas/miniconda3/envs/studienprojekt/lib/python3.7/site-packages/torch/nn/functional.py", line 1724, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not list
I originally initialised the embedding the following way:
self.embedding_word = torch.nn.Embedding(self.word_dict_size, embedding_size)
word_dict_size and embedding_size are both integers.
Is there something obvious I did wrong, or is it a deeper mistake?
You're passing the list word_indices into self.embedding_word, not the tensor word_ind_tensor you just created for that purpose.
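A minimal fix for the last line of the snippet above, using the tensor that was already built:
embeds_word = self.embedding_word(word_ind_tensor)  # pass the Tensor, not the Python list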

ortools vrp does not give me any solution

I want to solve a vehicle routing problem with OR-Tools; both a distance and a duration matrix will be used.
The problem is that when I change the matrices, it no longer gives me any solution!
There are 2 groups of matrices. With the commented-out matrices there is a solution, but with the other group there is not. Do you have any idea why this is happening?
from __future__ import print_function
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp
def create_data_model():
    """Stores the data for the problem."""
    data = {}
    # The commented-out distance matrix below is the one that yields a solution:
    #data['distance_matrix']=[[0, 329, 146, 157, 318, 528, 457, 242, 491, 335, 471, 456, 391, 128, 461, 555, 460], [329, 0, 399, 384, 544, 493, 339, 378, 108, 243, 125, 394, 136, 561, 505, 315, 447], [146, 399, 0, 262, 471, 316, 297, 227, 548, 377, 267, 430, 383, 154, 234, 188, 400], [157, 384, 262, 0, 440, 271, 383, 525, 223, 367, 511, 354, 112, 539, 159, 152, 373], [318, 544, 471, 440, 0, 423, 112, 381, 346, 512, 161, 239, 581, 291, 284, 145, 143], [528, 493, 316, 271, 423, 0, 380, 196, 409, 212, 199, 277, 387, 515, 391, 261, 318], [457, 339, 297, 383, 112, 380, 0, 379, 298, 267, 482, 247, 462, 256, 296, 533, 200], [242, 378, 227, 525, 381, 196, 379, 0, 156, 230, 551, 555, 338, 372, 403, 358, 506], [491, 108, 548, 223, 346, 409, 298, 156, 0, 140, 532, 405, 531, 129, 220, 482, 222], [335, 243, 377, 367, 512, 212, 267, 230, 140, 0, 418, 440, 526, 255, 455, 296, 430], [471, 125, 267, 511, 161, 199, 482, 551, 532, 418, 0, 439, 285, 181, 254, 208, 304], [456, 394, 430, 354, 239, 277, 247, 555, 405, 440, 439, 0, 397, 229, 121, 385, 147], [391, 136, 383, 112, 581, 387, 462, 338, 531, 526, 285, 397, 0, 544, 205, 197, 226], [128, 561, 154, 539, 291, 515, 256, 372, 129, 255, 181, 229, 544, 0, 150, 204, 516], [461, 505, 234, 159, 284, 391, 296, 403, 220, 455, 254, 121, 205, 150, 0, 192, 544], [555, 315, 188, 152, 145, 261, 533, 358, 482, 296, 208, 385, 197, 204, 192, 0, 138], [460, 447, 400, 373, 143, 318, 200, 506, 222, 430, 304, 147, 226, 516, 544, 138, 0]]
    data['distance_matrix']=[[0, 228, 299, 301, 235, 208, 405, 447, 144, 579], [228, 0, 343, 288, 357, 426, 530, 510, 122, 490], [299, 343, 0, 236, 228, 523, 274, 377, 397, 530], [301, 288, 236, 0, 594, 523, 289, 397, 154, 380], [235, 357, 228, 594, 0, 558, 370, 444, 173, 558], [208, 426, 523, 523, 558, 0, 219, 278, 504, 507], [405, 530, 274, 289, 370, 219, 0, 195, 283, 257], [447, 510, 377, 397, 444, 278, 195, 0, 407, 417], [144, 122, 397, 154, 173, 504, 283, 407, 0, 273], [579, 490, 530, 380, 558, 507, 257, 417, 273, 0]]
    data['time_matrix']=[[0, 205, 519, 308, 428, 574, 399, 138, 573, 541], [205, 0, 447, 578, 296, 536, 135, 345, 198, 315], [519, 447, 0, 209, 438, 174, 231, 382, 104, 522], [308, 578, 209, 0, 235, 264, 492, 305, 134, 538], [428, 296, 438, 235, 0, 600, 177, 435, 204, 556], [574, 536, 174, 264, 600, 0, 476, 119, 183, 476], [399, 135, 231, 492, 177, 476, 0, 497, 208, 167], [138, 345, 382, 305, 435, 119, 497, 0, 344, 454], [573, 198, 104, 134, 204, 183, 208, 344, 0, 422], [541, 315, 522, 538, 556, 476, 167, 454, 422, 0]]
    data['cost_matrix']=[[0, 160, 135, 433, 581, 453, 336, 329, 343, 237], [160, 0, 313, 596, 576, 458, 264, 380, 348, 354], [135, 313, 0, 591, 391, 211, 561, 236, 304, 414], [433, 596, 591, 0, 539, 253, 427, 300, 214, 118], [581, 576, 391, 539, 0, 243, 521, 499, 560, 255], [453, 458, 211, 253, 243, 0, 571, 216, 121, 314], [336, 264, 561, 427, 521, 571, 0, 425, 271, 165], [329, 380, 236, 300, 499, 216, 425, 0, 425, 549], [343, 348, 304, 214, 560, 121, 271, 425, 0, 176], [237, 354, 414, 118, 255, 314, 165, 549, 176, 0]]
    data['num_vehicles'] = 4
    data['depot'] = 0
    return data
def print_solution(data, manager, routing, assignment):
    """Prints assignment on console."""
    total_cost, total_distance, total_time = 0, 0, 0
    print('Objective: {}'.format(assignment.ObjectiveValue()))
    distance_dimension = routing.GetDimensionOrDie('Distance')
    time_dimension = routing.GetDimensionOrDie('Time')
    for vehicle_id in range(data['num_vehicles']):
        index = routing.Start(vehicle_id)
        plan_output = 'Route for vehicle {}:\n'.format(vehicle_id)
        route_cost = 0
        route_distance = 0
        route_time = 0
        while not routing.IsEnd(index):
            plan_output += ' {} -> '.format(manager.IndexToNode(index))
            distance_var = distance_dimension.CumulVar(index)
            time_var = time_dimension.CumulVar(index)
            previous_index = index
            index = assignment.Value(routing.NextVar(index))
            route_cost += routing.GetArcCostForVehicle(previous_index, index, vehicle_id)
            route_distance += assignment.Value(distance_var)
            route_time += assignment.Value(time_var)
        plan_output += '{}\n'.format(manager.IndexToNode(index))
        plan_output += 'Cost of the route: {0}\nDistance of the route: {1}m\nTime of route: {2}\n'.format(
            route_cost,
            route_distance,
            route_time)
        print(plan_output)
        total_cost += route_cost
        total_time += route_time
        total_distance += route_distance
    print('Total Cost of all routes: {}\nTotal Distance of all routes: {}\nTotal Time of all routes: {}\n'.format(total_cost, total_distance, total_time))
def get_routes(manager, routing, solution, num_routes):
    """Get vehicle routes from a solution and store them in an array."""
    # Get vehicle routes and store them in a two dimensional array whose
    # i,j entry is the jth location visited by vehicle i along its route.
    routes = []
    for route_nbr in range(num_routes):
        index = routing.Start(route_nbr)
        route = [manager.IndexToNode(index)]
        while not routing.IsEnd(index):
            index = solution.Value(routing.NextVar(index))
            route.append(manager.IndexToNode(index))
        routes.append(route)
    return routes
def main():
    # Instantiate the data problem.
    data = create_data_model()
    # Create the routing index manager.
    manager = pywrapcp.RoutingIndexManager(len(data['cost_matrix']), data['num_vehicles'], data['depot'])
    # Create Routing Model.
    routing = pywrapcp.RoutingModel(manager)

    # Create and register a transit callback.
    def cost_callback(from_index, to_index):
        """Returns the cost between the two nodes."""
        # Convert from routing variable Index to cost matrix NodeIndex.
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        return data['cost_matrix'][from_node][to_node]

    transit_callback_index = routing.RegisterTransitCallback(cost_callback)
    routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)

    # Add Cost constraint.
    routing.AddDimension(
        transit_callback_index,
        0,  # no slack
        3000,  # vehicle maximum travel cost
        False,  # start cumul to zero
        'Cost')
    cost_dimension = routing.GetDimensionOrDie('Cost')
    cost_dimension.SetGlobalSpanCostCoefficient(1000)

    # Add Distance constraint.
    def distance_callback(from_index, to_index):
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        return data['distance_matrix'][from_node][to_node]

    distance_callback_index = routing.RegisterTransitCallback(distance_callback)
    routing.AddDimension(
        distance_callback_index,
        0,
        3000,
        False,
        'Distance')
    distance_dimension = routing.GetDimensionOrDie('Distance')

    # Add Time constraint.
    def time_callback(from_index, to_index):
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        return data['time_matrix'][from_node][to_node]

    time_callback_index = routing.RegisterTransitCallback(time_callback)
    routing.AddDimension(
        time_callback_index,
        0,
        300,
        False,
        'Time')
    time_dimension = routing.GetDimensionOrDie('Time')

    # Setting first solution heuristic.
    search_parameters = pywrapcp.DefaultRoutingSearchParameters()
    search_parameters.first_solution_strategy = (routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
    search_parameters.solution_limit = 100
    search_parameters.time_limit.seconds = 3

    # Solve the problem.
    assignment = routing.SolveWithParameters(search_parameters)
    # Print solution on console.
    if assignment:
        print_solution(data, manager, routing, assignment)
        routes = get_routes(manager, routing, assignment, data['num_vehicles'])
        # Display the routes.
        for i, route in enumerate(routes):
            print('Route', i, route)

if __name__ == '__main__':
    main()
None means the solver did not find a solution. Most likely your limits are too low.
By increasing the limits, it works fine.
For a better understanding, though, it's best to check the solver status: time-limit failures usually point to limits that are set too tight.
In this example the time matrix contains many values greater than 300, yet the maximum cumulative time per vehicle is 300, so there is no feasible solution to this problem.
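As a hedged sketch, one way to inspect the status after solving: routing.status() returns an integer, whose meaning (as I understand the OR-Tools routing documentation) is listed in the comment below:
assignment = routing.SolveWithParameters(search_parameters)
print('Solver status:', routing.status())
# 0 = ROUTING_NOT_SOLVED, 1 = ROUTING_SUCCESS, 2 = ROUTING_FAIL,
# 3 = ROUTING_FAIL_TIMEOUT, 4 = ROUTING_INVALID
Raising the Time dimension capacity in AddDimension from 300 to a value above the longest feasible route time (e.g. 3000) should make the model feasible again.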

stateful autoencoder in Keras

I'm trying to create a stateful autoencoder model. The goal is to make the autoencoder stateful for each timeseries. The data consists of 10 timeseries, each of length 567.
timeseries#1: 451, 318, 404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, ....
timeseries#2: 304, 274, 150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, ....
...
timeseries#10: 208, 138, 201, 342, 280, 282, 280, 140, 124, 261, 193, .....
My lookback window is 28, so I generated the following sequences with 28 timesteps:
[451, 318, 404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, .... ]
[318, 404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, 56, ....]
[404, 199, 225, 158, 357, 298, 339, 155, 135, 239, 306, 56, 890, ....]
...
[304, 274, 150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, ....]
[274, 150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, 127, ....]
[150, 143, 391, 357, 278, 557, 98, 106, 305, 288, 325, 127, 798, ....]
...
[208, 138, 201, 342, 280, 282, 280, 140, 124, 261, 193, .....]
[138, 201, 342, 280, 282, 280, 140, 124, 261, 193, 854, .....]
That gives me 539 sequences for each timeseries. What I need is to make the LSTMs stateful within each timeseries and reset the state after seeing all the sequences from a timeseries. Here is the code I have:
from keras.layers import Input, LSTM, RepeatVector, Reshape
from keras.models import Model

batch_size = 35  # the total number of samples is 5390, which is divisible by 35
timesteps = 28
n_features = 1
hunits = 14
# RepeatVector repeats the context timesteps/hunits = 2 times
epochs = 1000

inputEncoder = Input(batch_shape=(35, 28, 1), name='inputEncoder')
outEncoder, state_h, state_c = LSTM(14, stateful=True, return_state=True, name='outputEncoder')(inputEncoder)
encoder_model = Model(inputEncoder, outEncoder)
context = RepeatVector(2, name='inputDecoder')(outEncoder)
context_reshaped = Reshape((28, 1), name='ReshapeLayer')(context)
outDecoder = LSTM(1, return_sequences=True, stateful=True, name='decoderLSTM')(context_reshaped)
autoencoder = Model(inputEncoder, outDecoder)
autoencoder.compile(loss='mse', optimizer='rmsprop')

for i in range(epochs):
    history = autoencoder.fit(data, data,
                              validation_split=config['validation_split_ratio'],
                              shuffle=False,
                              batch_size=35,
                              epochs=1)
    autoencoder.reset_states()
2 questions:
1- I'm getting this error after the first epoch finishes, and I don't understand how that can happen:
ValueError: Cannot feed value of shape (6, 28, 1) for Tensor u'inputEncoder:0', which has shape '(35, 28, 1)'
2- I don't think this model works the way I want. Here it resets the states after all batches (one epoch), i.e. after all the timeseries have been processed. How should I change it so that the state is kept within a timeseries but reset between timeseries?
The issue comes from the validation_split rate! It is set to 0.33, and when the split happens the model tries to train on 3611 samples, which is not divisible by my batch_size=35. Based on this post I could find the proper number; copying from that post:
def quantize_validation_split(validation_split, sample_count, batch_size):
    batch_count = sample_count / batch_size
    return float(int(batch_count * validation_split)) / batch_count

then you can call model.fit(..., validation_split=quantize_validation_split(0.05, len(X), batch_size)). But it would be cool if Keras did this for you inside fit().
Also, regarding making the autoencoder stateful the way I need: there shouldn't be a reset_states() call at the end of each epoch!
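A hedged sketch of one way to get per-timeseries statefulness (this is not from the original answer): train on one timeseries' sequences at a time and reset states in between. It assumes data is ordered so that each block of 539 consecutive samples belongs to one timeseries, and it requires a batch size that divides 539 (e.g. 49 or 77), so the model's batch_shape would need to change accordingly:
seqs_per_series = 539
for epoch in range(epochs):
    for ts in range(10):
        block = data[ts * seqs_per_series:(ts + 1) * seqs_per_series]
        autoencoder.fit(block, block, batch_size=49, epochs=1, shuffle=False)
        autoencoder.reset_states()  # reset between timeseries, not between epochs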

Python fbprophet - export values from plot_components() for yearly

Any ideas on how to export the yearly seasonal component using the fbprophet library?
The plot_components() function plots the trend, yearly, and weekly components.
I want to obtain the values for yearly only.
Example data:
import pandas as pd
from fbprophet import Prophet
import matplotlib.pyplot as plt
df = pd.DataFrame.from_dict({'ds': ['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
'1949-07', '1949-08', '1949-09', '1949-10', '1949-11', '1949-12',
'1950-01', '1950-02', '1950-03', '1950-04', '1950-05', '1950-06',
'1950-07', '1950-08', '1950-09', '1950-10', '1950-11', '1950-12',
'1951-01', '1951-02', '1951-03', '1951-04', '1951-05', '1951-06',
'1951-07', '1951-08', '1951-09', '1951-10', '1951-11', '1951-12',
'1952-01', '1952-02', '1952-03', '1952-04', '1952-05', '1952-06',
'1952-07', '1952-08', '1952-09', '1952-10', '1952-11', '1952-12',
'1953-01', '1953-02', '1953-03', '1953-04', '1953-05', '1953-06',
'1953-07', '1953-08', '1953-09', '1953-10', '1953-11',
'1953-12',
'1954-01', '1954-02', '1954-03', '1954-04', '1954-05', '1954-06',
'1954-07', '1954-08', '1954-09', '1954-10', '1954-11', '1954-12',
'1955-01', '1955-02', '1955-03', '1955-04', '1955-05', '1955-06',
'1955-07', '1955-08', '1955-09', '1955-10', '1955-11', '1955-12',
'1956-01', '1956-02', '1956-03', '1956-04', '1956-05', '1956-06',
'1956-07', '1956-08', '1956-09', '1956-10', '1956-11', '1956-12',
'1957-01', '1957-02', '1957-03', '1957-04', '1957-05', '1957-06',
'1957-07', '1957-08', '1957-09', '1957-10', '1957-11', '1957-12',
'1958-01', '1958-02', '1958-03', '1958-04', '1958-05', '1958-06',
'1958-07', '1958-08', '1958-09', '1958-10', '1958-11', '1958-12',
'1959-01', '1959-02', '1959-03', '1959-04', '1959-05', '1959-06',
'1959-07', '1959-08', '1959-09', '1959-10', '1959-11', '1959-12',
'1960-01', '1960-02', '1960-03', '1960-04', '1960-05', '1960-06',
'1960-07', '1960-08', '1960-09', '1960-10', '1960-11', '1960-12'],
'y': [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118, 115, 126,
141, 135, 125, 149, 170, 170, 158, 133, 114, 140, 145, 150, 178, 163,
172, 178, 199, 199, 184, 162, 146, 166, 171, 180, 193, 181, 183, 218,
230, 242, 209, 191, 172, 194, 196, 196, 236, 235, 229, 243, 264, 272,
237, 211, 180, 201, 204, 188, 235, 227, 234, 264, 302, 293, 259, 229,
203, 229, 242, 233, 267, 269, 270, 315, 364, 347, 312, 274, 237, 278,
284, 277, 317, 313, 318, 374, 413, 405, 355, 306, 271, 306, 315, 301,
356, 348, 355, 422, 465, 467, 404, 347, 305, 336, 340, 318, 362, 348,
363, 435, 491, 505, 404, 359, 310, 337, 360, 342, 406, 396, 420, 472,
548, 559, 463, 407, 362, 405, 417, 391, 419, 461, 472, 535, 622, 606,
508, 461, 390, 432]})
df['ds'] = pd.to_datetime(df['ds'])
fbprophet plot_components():
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(12, 'm')
fc = model.predict(future)
model.plot_components(fc)
plt.show()
I understand your question to mean "How do we get the values used in the 'yearly' plot above?"
days = (pd.date_range(start='2017-01-01', periods=365) + pd.Timedelta(days=0))
df_y = model.seasonality_plot_df(days)
seas = model.predict_seasonal_components(df_y)
fig,ax = plt.subplots(2, 1, figsize=(8,6))
ax[0].plot(fc['ds'].dt.to_pydatetime(), fc['trend'])
ax[0].grid(alpha=0.5)
ax[0].set_xlabel('ds')
ax[0].set_ylabel('trend')
ax[1].plot(df_y['ds'].dt.to_pydatetime(), seas['yearly'], ls='-', c='#0072B2')
ax[1].set_xlabel('Day of year')
ax[1].set_ylabel('yearly')
ax[1].grid(alpha=0.5)
plt.show()
Plot results: (figure omitted; the top panel shows the trend, the bottom panel the yearly component)
How to obtain the values for yearly only?
The x values are (essentially) an arbitrary one-year time span. The y values come from the functions seasonality_plot_df() and predict_seasonal_components(), which predict the daily seasonality over a one-year span. You retrieve these values from seas['yearly'].
There is a simpler solution in the current version of the library: you can read the values straight from the predicted output fc. The yearly component is available as fc['yearly'], without using the functions in the above solution.
Moreover, if you want the other components, such as the trend, you can use fc['trend'].
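For example, a short usage sketch based on the forecast dataframe fc from the code above:
yearly_values = fc[['ds', 'yearly']]
print(yearly_values.head())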
