I face the following problem with GEKKO: some parameters (.Param) are changing (others not) when solving a model and I cannot determine why.
Background: I am currently trying to translate code from EViews (see gennaro.zezza.it) to python. I use GEKKO to simulate a system consisting out of 11 equations (for now). I do want to use parameters (instead of constants which seem to work perfectly fine) as I need to ('exogenously') change their value over time (and thus need an array).
Example: In the following example, an 'economic system' reacts to new government expenditures. Here, I particularly face problems with "m.alpha1" and "m.alpha2" - if they are introduced as ".Param" their value will change to 1.0 (instead of 0.6 and 0.4) when solving the model. How can I stop GEKKO from doing this? (Again, I want to be able to change, e.g., alpha1 to 0.7 after time x. E.g., lower and upper bounds won't help here.)
Thanks for your help!!
Code:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
import plotly.graph_objects as go
# Initialize model
m = GEKKO(remote=False)
tstart = 1945
tend = 2000
tdur = tend-tstart+1
m.time = np.linspace(0, tend-tstart, tdur)
# Model parameters
m.t = m.Param(value=m.time)
# Exogenous parameters
alpha1_ex = 0.6
alpha2_ex = 0.4
theta_ex = 0.2
w_ex = 1
# -as .Const
m.alpha1 = m.Const(value=alpha1_ex, name='Propensity to consume out of income')
m.alpha2 = m.Const(value=alpha2_ex, name='Propensity to consume out of wealth')
#m.theta = m.Const(value=theta_ex, name='Tax rate')
#m.w = m.Const(value=w_ex, name='Wage rate')
# -as .Param: issues with alpha1 & alpha2
#m.alpha1 = m.Param(value=np.full(tdur,alpha1_ex), name='Propensity to consume out of income')
#m.alpha2 = m.Param(value=np.full(tdur,alpha2_ex), name='Propensity to consume out of wealth')
m.theta = m.Param(value=np.full(tdur,theta_ex), name='Tax rate')
m.w = m.Param(value=np.ones(tdur), name='Wage rate')
# no issues with g_d
m.g_d = m.Param(value=np.zeros(tdur), name='Government goods, demand')
m.g_d[1:] = 20
# Endogenous variables
m.c_d = m.Var(value=0, name='Consumption goods demand by households')
m.c_s = m.Var(value=0, name='Consumption goods supply')
m.g_s = m.Var(value=0, name='Government goods, supply')
m.h_h = m.Var(value=0, name='Cash money held by households')
m.h_s = m.Var(value=0, name='Cash money supplied by government')
m.n_d = m.Var(value=0, name='Demand for labor')
m.n_s = m.Var(value=0, name='Supply for labor')
m.t_d = m.Var(value=0, name='Taxes, "demand"')
m.t_s = m.Var(value=0, name='Taxes, "supply"')
m.y = m.Var(value=0, name='Income (=GDP)')
m.yd = m.Var(value=0, name='Disposable income of households')
# Lag variables
m.h_h_lag = m.Var(value=0, name='Cash money held by households (t-1)')
m.delay(m.h_h,m.h_h_lag,1) # m.h_h_lag = m.h_h(t-1)
m.h_s_lag = m.Var(value=0, name='Cash money supplied by government (t-1)')
m.delay(m.h_s,m.h_s_lag,1)
# Equations
m.Equation(m.c_s == m.c_d)
m.Equation(m.g_s == m.g_d)
m.Equation(m.t_s == m.t_d)
m.Equation(m.n_s == m.n_d)
m.Equation(m.yd == m.w*m.n_s - m.t_s)
m.Equation(m.t_d == m.theta*m.w*m.n_s)
m.Equation(m.c_d == m.alpha1*m.yd + m.alpha2*m.h_h_lag)
m.Equation(m.h_s == m.h_s_lag + m.g_d - m.t_d)
m.Equation(m.h_h == m.h_h_lag + m.yd - m.c_d)
m.Equation(m.y == m.c_s + m.g_s)
m.Equation(m.n_d == m.y/m.w)
# Solve
m.options.IMODE = 4
m.solve(disp=False)
print("Alpha1 = ", m.alpha1.value)
print("Alpha2 = ", m.alpha2.value)
print("Theta = ", m.theta.value)
print("w = ", m.w.value)
# Plot results
fig, axes = plt.subplots(2, 2, sharex=True, figsize=(8, 7))
fig.canvas.manager.set_window_title('Figures Chapter 3')
fig.suptitle('SIM Model - basic')
x_major_ticks = np.arange(0,tdur,5)
axes[0,0].plot(m.time, m.g_d.value, '-', color='black', linewidth=1)
axes[0,0].legend([m.g_d.name],loc=4,fontsize=7)
axes[0,0].grid()
axes[0,0].set_xticks(x_major_ticks)
axes[1,0].plot(m.time, m.y.value, '-', color='red', linewidth=1)
axes[1,0].legend([m.y.name],loc=4,fontsize=7)
axes[1,0].grid()
axes[1,0].set_xlabel('Time (years)')
axes[1,0].set_xticks(x_major_ticks)
axes[0,1].plot(m.time, m.c_d.value, '-', color='blue', linewidth=0.75)
axes[0,1].plot(m.time, m.yd.value, '-', color='green', linewidth=0.75)
axes[0,1].legend([m.c_d.name,m.yd.name],loc=4,fontsize=7)
axes[0,1].grid()
axes[0,1].set_xticks(x_major_ticks)
ln1 = axes[1,1].plot(m.time, m.h_h.value, '-', color='purple', linewidth=0.75)
axes[1,1].tick_params(axis='y', labelcolor='purple')
ax2 = axes[1,1].twinx()
ln2 = ax2.plot(m.time, [a_i - b_i for a_i, b_i in zip(m.h_h, m.h_h_lag)], '-', color='orange', linewidth=0.75)
ax2.tick_params(axis='y', labelcolor='orange')
lns = ln1+ln2
axes[1,1].legend(lns,[m.h_h.name,'Household savings'],loc=4,fontsize=7)
axes[1,1].grid()
axes[1,1].set_xticks(x_major_ticks)
axes[1,1].set_xlabel('Time (years)')
plt.show()
Output #1: with m.alpha1 and m.alpha2 as .const
Alpha1 = 0.6
Alpha2 = 0.4
Theta = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
w = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
Output #2: with m.alpha1 as .param
Alpha1 = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
Alpha2 = 0.4
Theta = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
w = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
The problem is that the name of the variable name='Propensity to consume out of income' is over 25 characters long.
m.alpha1 = m.Param(value=np.full(tdur,alpha1_ex), name='Propensity to consume out of income')
m.alpha2 = m.Param(value=np.full(tdur,alpha2_ex), name='Propensity to consume out of wealth')
The model file is produced correctly (gk_model0.apm) but the data file (gk_model0.csv) header is truncated to 25 characters. The files are accessible with m.open_folder(). The bug is in this line of gk_write_files.py where numbers are output as strings of length 25.
np.savetxt(os.path.join(self._path,file_name), csv_data.T, delimiter=",", fmt='%1.25s')
I've added this as a bug report with tracking on GitHub. One work-around is to use shorter variable names or leave off the variable names.
m.alpha1 = m.Param(value=np.full(tdur,alpha1_ex)) # Propensity to consume out of income
I want to create a while loop in python which will give an output as a list [0.00, 0.05, 0.10, 0.15,...., 1.00]
I tried doing it by following method:
alpha=0
alphalist=list()
while alpha<=1:
alphalist.append(alpha)
alpha+=0.05
print(alphalist)
I got the output as [0, 0.05, 0.1, 0.15000000000000002, 0.2, 0.25, 0.3, 0.35, 0.39999999999999997, 0.44999999999999996, 0.49999999999999994, 0.5499999999999999, 0.6, 0.65, 0.7000000000000001, 0.7500000000000001, 0.8000000000000002, 0.8500000000000002, 0.9000000000000002, 0.9500000000000003]
But What I want is this: [0.00, 0.05, 0.10, 0.15,...., 1.00]
This is the result of floating-point error. 0.05 isn't really the rational number 1/20 to begin with, so any arithmetic involving it may differ from what you expect.
Dividing two integers, rather than starting with a floating-point value, helps mitigate the problem.
>>> [x/100 for x in range(0, 101, 15)]
[0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0]
There are some numbers that can cause imprecisions with the floating number system computers use. You're just seeing an example of that.
What I would do, if you want to continue using the while loop this way, is to add another line with
alpha = round(alpha,2)
I had a situation here, and found that collect_list in spark is not efficient when the item is already a list.
Basically, I tried to calculate the mean of a nested list (the size of each list is guaranteed to be the same). When the data set becomes, for example, 10 M rows, it may produce out of memory errors. Originally, I thought it has something to do with the udf (to calculate the mean). But actually, I found that the aggregation part (collect_list of lists) is the real problem.
What I am doing now is to divide the 10 M rows into multiple blocks (by 'user'), aggregate each block individually, and then union them at the end. Any better suggestion on efficiently dealing with nested data?
For example, the toy example is like this:
data = [('user1','place1', ['place1', 'place2', 'place3'], [0.0, 0.5, 0.4], [0.0, 0.4, 0.3]),
('user1','place2', ['place1', 'place2', 'place3'], [0.7, 0.0, 0.4], [0.6, 0.0, 0.3]),
('user2','place1', ['place1', 'place2', 'place3'], [0.0, 0.4, 0.3], [0.0, 0.3, 0.4]),
('user2','place3', ['place1', 'place2', 'place3'], [0.1, 0.2, 0.0], [0.3, 0.1, 0.0]),
('user3','place2', ['place1', 'place2', 'place3'], [0.3, 0.0, 0.4], [0.2, 0.0, 0.4]),
]
data_df = sparkApp.sparkSession.createDataFrame(data, ['user', 'place', 'places', 'data1', 'data2'])
data_agg = data_df.groupBy('user') \
.agg(f.collect_list('place').alias('place_list'),
f.first('places').alias('places'),
f.collect_list('data1').alias('data1'),
f.collect_list('data1').alias('data2'),
)
import numpy as np
def average_values(sim_vectors):
if len(sim_vectors) == 1:
return sim_vectors[0]
mat = np.array(sim_vectors)
mean_vector = np.mean(mat, axis=0)
return np.round(mean_vector, 3).tolist()
avg_vectors_udf = f.udf(average_values, ArrayType(DoubleType()))
data_agg_ave = data_agg.withColumn('data1', avg_vectors_udf('data1')) \
.withColumn('data2', avg_vectors_udf('data2'))
The result would be:
+-----+----------------+--------------------+-----------------+-----------------+
| user| place_list| places| data1| data2|
+-----+----------------+--------------------+-----------------+-----------------+
|user1|[place1, place2]|[place1, place2, ...|[0.35, 0.25, 0.4]|[0.35, 0.25, 0.4]|
|user3| [place2]|[place1, place2, ...| [0.3, 0.0, 0.4]| [0.3, 0.0, 0.4]|
|user2|[place1, place3]|[place1, place2, ...|[0.05, 0.3, 0.15]|[0.05, 0.3, 0.15]|
+-----+----------------+--------------------+-----------------+-----------------+