TypeError: predict() got an unexpected keyword argument 'return_mean_grad' - scikit-learn

Using scikit-optimize and scikit-learn, I get the following error on calling the gp_minimize function:
TypeError: predict() got an unexpected keyword argument
'return_mean_grad'
I think it may be some version incompatability but they are both latest versions (according to pip)
scikit-optimize 0.9.0
scikit-learn 1.2.0
What could be the problem?
Sample code I am using
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from skopt import gp_minimize
# Define the kernel for the Gaussian process
kernel = Matern(nu=2.5)
# Initialize the Gaussian process regressor
gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10)
# Define the objective function
def f(x):
x1, x2, x3 = x
return -x1**2 - x2**2 - x3**2
# Define the variable bounds
bounds = [(0, 1), (0, 1), (0, 1)]
# Perform the optimization
res = gp_minimize(f, bounds, n_calls=20, random_state=0, verbose=True, n_random_starts=10, acq_func="EI", base_estimator=gpr)
# Print the optimal variables and function value
print("x1 = {:.4f}, x2 = {:.4f}, x3 = {:.4f}, f(x) = {:.4f}".format(res.x[0], res.x[1], res.x[2], res.fun))

Related

Python: Fitting a piecewise polynomial

I am trying to fit a piecewise polynomial function
Code:
import numpy as np
import scipy
from scipy.interpolate import UnivariateSpline, splrep
from scipy.optimize import curve_fit
from matplotlib import pyplot as plt
def piecewise_func(x, X, Y):
"""
cond_l: condition list
func_l: function list
"""
spl = UnivariateSpline(X, Y, k=3, s=0.5)
tck = (spl._data[8], spl._data[9], 3) # tck = (knots, coefficients, degree)
p = scipy.interpolate.PPoly.from_spline(tck)
cond_l = []
func_l = []
for idx, i in enumerate(range(3, len(spl.get_knots()) + 3 - 1)):
cond_l.append([(x >= p.x[i] & x < p.x[i + 1])])
func_l.append([lambda x: p.c[3, i] + p.c[2, i] * x + p.c[1, i] * x ** 2 + p.c[0, i] * x ** 3])
return np.piecewise(x, cond_l, func_l)
if __name__ == '__main__':
xdata = [0.28190937, 0.63429607, 0.91620544, 1.68793236, 2.32350115, 2.95215219, 4.5,
4.78103382, 7.2, 7.53430054, 8.03627018, 9., 9.86212529, 11.25951191, 11.62658532, 11.65598578, 13.90295926]
ydata = [0.36273168, 0.81614628, 1.17887796, 1.4475374, 5.52692706, 2.17548169, 3.55313396, 3.80326533, 7.75556311, 8.30176616, 10.72117182, 11.2499386,
11.72296513, 11.02146624, 14.51260631, 20.59365525, 21.77847853]
spl = UnivariateSpline(xdata, ydata, k=3, s=1)
plt.plot(xdata, ydata, '*')
plt.plot(xdata, spl(xdata))
plt.show()
p, e = curve_fit(piecewise_func, xdata, ydata)
# x_plot = np.linspace(0., 0.15, len(x))
# plt.plot(x, y, "+")
# plt.plot(x, (piecewise_func(x_plot, *p)), 'C3-', lw=3)
I tried the UnivariateSpline function to interpolate, I see the following result
However, I don't want the polynomial curve to pass through all data points. I tried varying the smoothing factor but I am not able to obtain something like the one below.
Expected output:
I'm trying curve fitting (Use UnivariateSpline to fit data tightly) to get the expected output and I have the following issues.
piecewise_func in the code posted returns the piecewise polynomial.
Passing this to curve_fit(piecewise_func, xdata, ydata) returns an error
Error:
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
ValueError: diff requires input that is at least one dimensional
I am not sure what is wrong.
Suggestions on how to get the expected fit will be
of great help.
I would recommend having a closer look at the parameter s in the UnivariateSpline documentation:
s : float or None, optional
Positive smoothing factor used to choose the number of knots. Number of knots will be increased until the smoothing condition is satisfied:
sum((w[i] * (y[i]-spl(x[i])))**2, axis=0) <= s
If s is None, s = len(w) which should be a good value if 1/w[i] is an estimate of the standard deviation of y[i]. If 0, spline will interpolate through all data points. Default is None.
Since you do not set w, this is just a complicated way of saying that s is the least squares error that you allow, i.e., squared errors summed over all the data points. Your value of 1 does not lead to interpolation but it is quite tight compared to what you want to achieve.
Taking
spl = UnivariateSpline(xdata, ydata, k=3, s=10)
you get the following:
Yet closer to your goal is s=100:
So my recommendation is to play around with s and if that proves insufficient, to ask a new question describing what you need more precisely. I haven't had a proper look at the problem with piecewise_func.

Numba jit and Scipy

I have found a few posts on the subject here, but most of them did not have a useful answer.
I have a 3D NumPy dataset [images number, x, y] in which the probability that the pixel belongs to a class is stored as a float (0-1). I would like to correct the wrong segmented pixels (with high performance).
The probabilities are part of a movie in which objects are moving from right to left and possibly back again. The basic idea is that I fit the pixels with a Gaussian function or comparable function and look at around 15-30 images ( [i-15 : i+15 ,x, y] ). It is very probable that if the previous 5 pixels and the following 5 pixels are classified in this class, this pixel also belongs to this class.
To illustrate my problem I add a sample code, the results were calculated without the usage of numba:
from scipy.optimize import curve_fit
from scipy import exp
import numpy as np
from numba import jit
#jit
def fit(size_of_array, outputAI, correct_output):
x = range(size_of_array[0])
for i in range(size_of_array[1]):
for k in range(size_of_array[2]):
args, cov = curve_fit(gaus, x, outputAI[:, i, k])
correct_output[2, i, k] = gaus(2, *args)
return correct_output
#jit
def gaus(x, a, x0, sigma):
return a*exp(-(x-x0)**2/(2*sigma**2))
if __name__ == '__main__':
# output_AI = [imageNr, x, y] example 5, 2, 2
# At position [2][1][1] is the error, the pixels before and after were classified to the class but not this pixel.
# The objects do not move in such a speed, so the probability should be corrected.
outputAI = np.array([[[0.1, 0], [0, 0]], [[0.8, 0.3], [0, 0.2]], [[1, 0.1], [0, 0.2]],
[[0.1, 0.3], [0, 0.2]], [[0.8, 0.3], [0, 0.2]]])
correct_output = np.zeros(outputAI.shape)
# I correct now in this example only all pixels in image 3, in the code a loop runs over the whole 3D array and
# corrects every image and every pixel separately
size_of_array = outputAI.shape
correct_output = fit(size_of_array, outputAI, correct_output)
# numba error: Compilation is falling back to object mode WITH looplifting enabled because Function "fit" failed
# type inference due to: Untyped global name 'curve_fit': cannot determine Numba type of <class 'function'>
print(correct_output[2])
# [[9.88432346e-01 2.10068763e-01]
# [6.02428922e-20 2.07921125e-01]]
# The wrong pixel at position [0][0] was corrected from 0.2 to almost 1, the others are still not assigned
# to the class.
Unfortunately numba does NOT work. I always get the following error:
Compilation is falling back to object mode WITH looplifting enabled because Function "fit" failed type inference due to: Untyped global name 'curve_fit': cannot determine Numba type of <class 'function'>
** ------------------------------------------------------------------------**
Update 04.08.2020
Currently I have this solution for my problem in mind. But I am open for further suggestions.
from scipy.optimize import curve_fit
from scipy import exp
import numpy as np
import time
def fit_without_scipy(input):
x = range(input.size)
x0 = outputAI[i].argmax()
a = input.max()
var = (input - input.mean())**2
return a * np.exp(-(x - x0) ** 2 / (2 * var.mean()))
def fit(input):
x = range(len(input))
try:
args, cov = curve_fit(gaus, x, outputAI[i])
return gaus(x, *args)
except:
return input
def gaus(x, a, x0, sigma):
return a * exp(-(x - x0) ** 2 / (2 * sigma ** 2))
if __name__ == '__main__':
nr = 31
N = 100000
x = np.linspace(0, 30, nr)
outputAI = np.zeros((N, nr))
correct_output = outputAI.copy()
correct_output_numba = outputAI.copy()
perfekt_result = outputAI.copy()
for i in range(N):
perfekt_result[i] = gaus(x, np.random.random(), np.random.randint(-N, 2*N), np.random.random() * np.random.randint(0, 100))
outputAI[i] = perfekt_result[i] + np.random.normal(0, 0.5, nr)
start = time.time()
for i in range(N):
correct_output[i] = fit(outputAI[i])
print("Time with scipy: " + str(time.time() - start))
start = time.time()
for i in range(N):
correct_output_numba[i] = fit_without_scipy(outputAI[i])
print("Time without scipy: " + str(time.time() - start))
for i in range(N):
correct_output[i] = abs(correct_output[i] - perfekt_result[i])
correct_output_numba[i] = abs(correct_output_numba[i] - perfekt_result[i])
print("Mean deviation with scipy: " + str(correct_output.mean()))
print("Mean deviation without scipy: " + str(correct_output_numba.mean()))
Output [with nr = 31 and N = 100000]:
Time with scipy: 193.27853846549988 secs
Time without scipy: 2.782526969909668 secs
Mean deviation with scipy: 0.03508043754489116
Mean deviation without scipy: 0.0419951370808896
In the next step I would try to speed up the code even more with numba. Currently this does not work because of the argmax function.
Curve_fit eventually calls into either least_squares (pure python) or leastsq (C extension). You have three options:
figure out how to make numba-jitted code talk to a C extension which powers leastsq
extract relevant parts of least_squares and numba.jit them
implement the LowLevelCallable support for least_squares or minimize.
None of these is easy. OTOH all of these would be interesting to a wider audience if successful.

numpy code works in REPL, script says type error

Copy and pasting this code into the python3 REPL works, but when I run it as a script, I get a type error.
"""Softmax."""
scores = [3.0, 1.0, 0.2]
import numpy as np
from math import e
def softmax(x):
"""Compute softmax values for each sets of scores in x."""
results = []
x = np.transpose(x)
for j in range(len(x)):
exps = [np.exp(s) for s in x[j]]
_sum = np.sum(np.exp(x[j]))
softmax = [i / _sum for i in exps]
results.append(softmax)
final = np.vstack(results)
return np.transpose(final)
# pass # TODO: Compute and return softmax(x)
print(softmax(scores))
# Plot softmax curves
import matplotlib.pyplot as plt
x = np.arange(-2.0, 6.0, 0.1)
scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
plt.plot(x, softmax(scores).T, linewidth=2)
plt.show()
The error I get running the script via CLI is the following:
bash$ python3 softmax.py
Traceback (most recent call last):
File "softmax.py", line 22, in <module>
print(softmax(scores))
File "softmax.py", line 13, in softmax
exps = [np.exp(s) for s in x[j]]
TypeError: 'numpy.float64' object is not iterable
This kind of crap makes me so nervous about running interpreted code in production with libraries like these, seriously unreliable and undefined behaviour is totally unacceptable IMO.
At the top of your script, you define
scores = [3.0, 1.0, 0.2]
This is the argument in your first call of softmax(scores). When converted to a numpy array, scores is 1-d array with shape (3,).
You pass scores into the function, and then it is converted to a numpy array by the call
x = np.transpose(x)
However, it is still 1-d, with shape (3,). The transpose function swaps dimensions, but it does not add a dimension to a 1-d array. In effect, transpose is a "no-op" when applied to a 1-d array.
Then, in the loop that follows, x[j] is a scalar of type numpy.float64, so it does not make sense to write [np.exp(s) for s in x[j]]. x[j] is a scalar, not a sequence, so you can't iterate over it.
In the bottom part of your script, you redefine scores as
x = np.arange(-2.0, 6.0, 0.1)
scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
Now scores is 2-d array (scores.shape is (3, 80)), so you don't get an error when you call softmax(scores).

Fit CDF with 2 Gaussian using LeastSq

I am trying to fit empirical CDF plot to two Gaussian cdf as it seems that it has two peaks, but it does not work. I fit the curve with leastsq from scipy.optimize and erf function from scipy.special. The fitting only gives constant line at a value of 2. I am not sure in which part of the code that I make mistake. Any pointers will be helpful. Thanks!
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
x = np.array([ 90.64115156, 90.85690063, 91.07264971, 91.28839878,
91.50414786, 91.71989693, 91.93564601, 92.15139508,
92.36714415, 92.58289323, 92.7986423 , 93.01439138,
93.23014045, 93.44588953, 93.6616386 , 93.87738768,
94.09313675, 94.30888582, 94.5246349 , 94.74038397,
94.95613305, 95.17188212, 95.3876312 , 95.60338027,
95.81912935, 96.03487842, 96.2506275 , 96.46637657,
96.68212564, 96.89787472, 97.11362379, 97.32937287,
97.54512194, 97.76087102, 97.97662009, 98.19236917,
98.40811824, 98.62386731, 98.83961639, 99.05536546,
99.27111454, 99.48686361, 99.70261269, 99.91836176,
100.13411084, 100.34985991, 100.56560899, 100.78135806,
100.99710713, 101.21285621])
y = np.array([3.33333333e-04, 3.33333333e-04, 3.33333333e-04, 1.00000000e-03,
1.33333333e-03, 3.33333333e-03, 6.66666667e-03, 1.30000000e-02,
2.36666667e-02, 3.40000000e-02, 5.13333333e-02, 7.36666667e-02,
1.01666667e-01, 1.38666667e-01, 2.14000000e-01, 3.31000000e-01,
4.49666667e-01, 5.50000000e-01, 6.09000000e-01, 6.36000000e-01,
6.47000000e-01, 6.54666667e-01, 6.61000000e-01, 6.67000000e-01,
6.76333333e-01, 6.84000000e-01, 6.95666667e-01, 7.10000000e-01,
7.27666667e-01, 7.50666667e-01, 7.75333333e-01, 7.93333333e-01,
8.11333333e-01, 8.31333333e-01, 8.56333333e-01, 8.81333333e-01,
9.00666667e-01, 9.22666667e-01, 9.37666667e-01, 9.47333333e-01,
9.59000000e-01, 9.70333333e-01, 9.77333333e-01, 9.83333333e-01,
9.90333333e-01, 9.93666667e-01, 9.96333333e-01, 9.99000000e-01,
9.99666667e-01, 1.00000000e+00])
plt.plot(a,b,'r.')
# Fitting with 2 Gaussian
from scipy.special import erf
from scipy.optimize import leastsq
def two_gaussian_cdf(params, x):
(mu1, sigma1, mu2, sigma2) = params
model = 0.5*(1 + erf( (x-mu1)/(sigma1*np.sqrt(2)) )) +\
0.5*(1 + erf( (x-mu2)/(sigma2*np.sqrt(2)) ))
return model
def residual_two_gaussian_cdf(params, x, y):
model = double_gaussian(params, x)
return model - y
params = [5.,2.,1.,2.]
out = leastsq(residual_two_gaussian_cdf,params,args=(x,y))
double_gaussian(out[0],x)
plt.plot(x,two_gaussian_cdf(out[0],x))
which return to this plot
You may find lmfit (see http://lmfit.github.io/lmfit-py/) to be a useful alternative to leastsq here as it provides a higher-level interface to optimization and curve fitting (though still based on scipy.optimize.leastsq). With lmfit, your example might look like this (cutting out the definition of x and y data):
#!/usr/bin/env python
import numpy as np
from scipy.special import erf
import matplotlib.pyplot as plt
from lmfit import Model
# define the basic model. I included an amplitude parameter
def gaussian_cdf(x, amp, mu, sigma):
return (amp/2.0)*(1 + erf( (x-mu)/(sigma*np.sqrt(2))))
# create a model that is the sum of two gaussian_cdfs
# note that a prefix names each component and will be
# applied to the parameter names for each model component
model = Model(gaussian_cdf, prefix='g1_') + Model(gaussian_cdf, prefix='g2_')
# make a parameters object -- a dict with parameter names
# taken from the arguments of your model function and prefix
params = model.make_params(g1_amp=0.50, g1_mu=94, g1_sigma=1,
g2_amp=0.50, g2_mu=98, g2_sigma=1.)
# you can apply bounds to any parameter
#params['g1_sigma'].min = 0 # sigma must be > 0!
# you may want to fix the amplitudes to 0.5:
#params['g1_amp'].vary = False
#params['g2_amp'].vary = False
# run the fit
result = model.fit(y, params, x=x)
# print results
print(result.fit_report())
# plot results, including individual components
comps = result.eval_components(result.params, x=x)
plt.plot(x, y,'r.', label='data')
plt.plot(x, result.best_fit, 'k-', label='fit')
plt.plot(x, comps['g1_'], 'b--', label='g1_')
plt.plot(x, comps['g2_'], 'g--', label='g2_')
plt.legend()
plt.show()
This prints out a report of
[[Model]]
(Model(gaussian_cdf, prefix='g1_') + Model(gaussian_cdf, prefix='g2_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 66
# data points = 50
# variables = 6
chi-square = 0.00626332
reduced chi-square = 1.4235e-04
Akaike info crit = -437.253376
Bayesian info crit = -425.781238
[[Variables]]
g1_amp: 0.65818908 +/- 0.00851338 (1.29%) (init = 0.5)
g1_mu: 93.8438526 +/- 0.01623273 (0.02%) (init = 94)
g1_sigma: 0.54362156 +/- 0.02021614 (3.72%) (init = 1)
g2_amp: 0.34058664 +/- 0.01153346 (3.39%) (init = 0.5)
g2_mu: 97.7056728 +/- 0.06408910 (0.07%) (init = 98)
g2_sigma: 1.24891832 +/- 0.09204020 (7.37%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
C(g1_amp, g2_amp) = -0.892
C(g2_amp, g2_sigma) = 0.848
C(g1_amp, g2_sigma) = -0.744
C(g1_amp, g1_mu) = 0.692
C(g1_amp, g2_mu) = 0.662
C(g1_mu, g2_amp) = -0.607
C(g1_amp, g1_sigma) = 0.571
and a plot like this:
This fit is not perfect, but it should get you started.
Here is how I used the scipy.optimize.differential_evolution module to generate initial parameter estimates for curve fitting. I have coded the sum of squared errors as the target for the genetic algorithm as shown below. This scipy module uses the Latin Hypercube algorithm to ensure a thorough search of parameter space, which requires parameter bounds within which to search. In this case, the parameter bounds are automatically derived from the data so that there is no need to provide them manually in the code.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import warnings
from scipy.optimize import differential_evolution
from scipy.special import erf
# bounds on parameters are set in generate_Initial_Parameters() below
def two_gaussian_cdf(x, mu1, sigma1, mu2, sigma2):
model = 0.5*(1 + erf( (x-mu1)/(sigma1*np.sqrt(2)) )) +\
0.5*(1 + erf( (x-mu2)/(sigma2*np.sqrt(2)) ))
return model
# function for genetic algorithm to minimize (sum of squared error)
# bounds on parameters are set in generate_Initial_Parameters() below
def sumOfSquaredError(parameterTuple):
warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
return np.sum((yData - two_gaussian_cdf(xData, *parameterTuple)) ** 2)
def generate_Initial_Parameters():
# data min and max used for bounds
maxX = max(xData)
minX = min(xData)
maxY = max(yData)
minY = min(yData)
parameterBounds = []
parameterBounds.append([minX, maxX]) # parameter bounds for mu1
parameterBounds.append([minY, maxY]) # parameter bounds for sigma1
parameterBounds.append([minX, maxX]) # parameter bounds for mu2
parameterBounds.append([minY, maxY]) # parameter bounds for sigma2
# "seed" the numpy random number generator for repeatable results
result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
return result.x
xData = np.array([ 90.64115156, 90.85690063, 91.07264971, 91.28839878,
91.50414786, 91.71989693, 91.93564601, 92.15139508,
92.36714415, 92.58289323, 92.7986423 , 93.01439138,
93.23014045, 93.44588953, 93.6616386 , 93.87738768,
94.09313675, 94.30888582, 94.5246349 , 94.74038397,
94.95613305, 95.17188212, 95.3876312 , 95.60338027,
95.81912935, 96.03487842, 96.2506275 , 96.46637657,
96.68212564, 96.89787472, 97.11362379, 97.32937287,
97.54512194, 97.76087102, 97.97662009, 98.19236917,
98.40811824, 98.62386731, 98.83961639, 99.05536546,
99.27111454, 99.48686361, 99.70261269, 99.91836176,
100.13411084, 100.34985991, 100.56560899, 100.78135806,
100.99710713, 101.21285621])
yData = np.array([3.33333333e-04, 3.33333333e-04, 3.33333333e-04, 1.00000000e-03,
1.33333333e-03, 3.33333333e-03, 6.66666667e-03, 1.30000000e-02,
2.36666667e-02, 3.40000000e-02, 5.13333333e-02, 7.36666667e-02,
1.01666667e-01, 1.38666667e-01, 2.14000000e-01, 3.31000000e-01,
4.49666667e-01, 5.50000000e-01, 6.09000000e-01, 6.36000000e-01,
6.47000000e-01, 6.54666667e-01, 6.61000000e-01, 6.67000000e-01,
6.76333333e-01, 6.84000000e-01, 6.95666667e-01, 7.10000000e-01,
7.27666667e-01, 7.50666667e-01, 7.75333333e-01, 7.93333333e-01,
8.11333333e-01, 8.31333333e-01, 8.56333333e-01, 8.81333333e-01,
9.00666667e-01, 9.22666667e-01, 9.37666667e-01, 9.47333333e-01,
9.59000000e-01, 9.70333333e-01, 9.77333333e-01, 9.83333333e-01,
9.90333333e-01, 9.93666667e-01, 9.96333333e-01, 9.99000000e-01,
9.99666667e-01, 1.00000000e+00])
# generate initial parameter values
initialParameters = generate_Initial_Parameters()
# curve fit the data
fittedParameters, niepewnosci = curve_fit(two_gaussian_cdf, xData, yData, initialParameters)
# create values for display of fitted peak function
mu1, sigma1, mu2, sigma2 = fittedParameters
y_fit = two_gaussian_cdf(xData, mu1, sigma1, mu2, sigma2)
plt.plot(xData, yData) # plot the raw data
plt.plot(xData, y_fit) # plot the equation using the fitted parameters
plt.show()
print(fittedParameters)

Multivariate Dirichlet process mixtures for density estimation using pymc3

I want to extend the Austin's example on Dirichlet process mixtures for density estimationto the multivariate case.
The first information about multivariate Gaussian mixture using pymc3 I have found is this issue at Github. People involved in the issue said that there are two different solutions but they don't work for me. For instance, by using the Brandon's Multivariate Extension in a simple model like this:
import numpy as np
import pymc3 as pm
from mvnormal_extension import MvNormal
with pm.Model() as model:
var_x = MvNormal('var_x', mu = 3*np.zeros(2), tau = np.diag(np.ones(2)), shape=2)
trace = pm.sample(100)
I can't obtain the proper Mean around (3,3):
pm.summary(trace)
var_x:
Mean SD MC Error 95% HPD interval
-------------------------------------------------------------------
0.220 1.161 0.116 [-1.897, 2.245]
0.165 1.024 0.102 [-2.626, 1.948]
Posterior quantiles:
2.5 25 50 75 97.5
|--------------|==============|==============|--------------|
-1.897 -0.761 0.486 1.112 2.245
-2.295 -0.426 0.178 0.681 2.634
The other solution can be reproduced thanks to Benavente:
import numpy as np
import pymc3 as pm
import scipy
import theano
from theano import tensor
target_data = np.random.random((500, 16))
N_COMPONENTS = 5
N_SAMPLES, N_DIMS = target_data.shape
# Dirichilet prior.
ALPHA_0 = np.ones(N_COMPONENTS)
# Component means prior.
MU_0 = np.zeros(N_DIMS)
LAMB_0 = 1. * np.eye(N_DIMS)
# Components precision prior.
BETA_0, BETA_1 = 0., 1. # Covariance stds prior uniform limits.
L_0 = 2. # LKJ corr. shape. Larger shape -> more biased to identity.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
# Source: http://stackoverflow.com/q/29759789/1901296
N_ELEMS = N_DIMS * (N_DIMS - 1) / 2
tri_index = np.zeros([N_DIMS, N_DIMS], dtype=int)
tri_index[np.triu_indices(N_DIMS, k=1)] = np.arange(N_ELEMS)
tri_index[np.triu_indices(N_DIMS, k=1)[::-1]] = np.arange(N_ELEMS)
with pm.Model() as model:
# Component weight prior.
pi = pm.Dirichlet('pi', ALPHA_0, testval=np.ones(N_COMPONENTS) / N_COMPONENTS)
#pi_potential = pm.Potential('pi_potential', tensor.switch(tensor.min(pi) < .01, -np.inf, 0))
###################
# Components plate.
###################
# Component means.
mus = [pm.MvNormal('mu_{}'.format(i), MU_0, LAMB_0, shape=N_DIMS)
for i in range(N_COMPONENTS)]
# Component precisions.
#lamb = diag(sigma) * corr(corr_shape) * diag(sigma)
corr_vecs = [
pm.LKJCorr('corr_vec_{}'.format(i), L_0, N_DIMS)
for i in range(N_COMPONENTS)
]
# Transform the correlation vector representations to matrices.
corrs = [
tensor.fill_diagonal(corr_vecs[i][tri_index], 1.)
for i in range(N_COMPONENTS)
]
# Stds for the correlation matrices.
cov_stds = pm.Uniform('cov_stds', BETA_0, BETA_1, shape=(N_COMPONENTS, N_DIMS))
# Finally re-compose the covariance matrices using diag(sigma) * corr * diag(sigma)
# Source http://austinrochford.com/posts/2015-09-16-mvn-pymc3-lkj.html
lambs = []
for i in range(N_COMPONENTS):
std_diag = tensor.diag(cov_stds[i])
cov = std_diag.dot(corrs[i]).dot(std_diag)
lambs.append(tensor.nlinalg.matrix_inverse(cov))
stacked_mus = tensor.stack(mus)
stacked_lambs = tensor.stack(lambs)
#####################
# Observations plate.
#####################
z = pm.Categorical('z', pi, shape=N_SAMPLES)
#theano.as_op(itypes=[tensor.dmatrix, tensor.lvector, tensor.dmatrix, tensor.dtensor3],
otypes=[tensor.dscalar])
def likelihood_op(values, z_values, mu_values, prec_values):
logp = 0.
for i in range(N_COMPONENTS):
indices = z_values == i
if not indices.any():
continue
logp += scipy.stats.multivariate_normal(
mu_values[i], prec_values[i]).logpdf(values[indices]).sum()
return logp
def likelihood(values):
return likelihood_op(values, z, stacked_mus, stacked_lambs)
y = pm.DensityDist('y', likelihood, observed=target_data)
step1 = pm.Metropolis(vars=mus + lambs + [pi])
step2 = pm.ElemwiseCategoricalStep(vars=[z], values=list(range(N_COMPONENTS)))
trace = pm.sample(100, step=[step1, step2])
I have changed in this code pm.ElemwiseCategoricalStep to pm.ElemwiseCategorical and
logp += scipy.stats.multivariate_normal(mu_values[i], prec_values[i]).logpdf(values[indices])
by
logp += scipy.stats.multivariate_normal(mu_values[i], prec_values[i]).logpdf(values[indices]).sum()
but I get this exception:
ValueError: expected an ndarray
Apply node that caused the error: Elemwise{Composite{((i0 + i1) - (i2 + i3))}}[(0, 0)](Sum{acc_dtype=float64}.0, FromFunctionOp{likelihood_op}.0, Sum{acc_dtype=float64}.0, FromFunctionOp{likelihood_op}.0)
Toposort index: 101
Inputs types: [TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar)]
Inputs shapes: [(), (), (), ()]
Inputs strides: [(), (), (), ()]
Inputs values: [array(-127.70516572917249), -13460.012199423296, array(-110.90354888959129), -13234.61313535326]
Outputs clients: [['output']]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
I appreciate any help.
Thanks!

Resources