I want to fit an f(x, y) function using lmfit. The dataset is small and there are many fitting parameters (6 points on the x-axis, 11 points on the y-axis, and 16 unconstrained fitting parameters). Using all the defaults from Model.fit, I cannot obtain a covariance matrix, and during the fitting process the values of the free parameters are not changed at all.
I tried changing the initial values of the parameters. However, when I set up the same kind of problem in OriginPro's Surface Fitting functionality, its Levenberg-Marquardt algorithm manages to fit the data and estimate the errors (although quite large for certain parameters). This suggests there must be some problem with my code, but I can't find where it lies; I'm no Python master.
The MWE is below.
import numpy as np
from lmfit import Model, Parameters
import numdifftools  # importing this or not makes no difference
x, y = np.array([226.5, 361.05, 404.41, 589, 632.8, 1013.98]), np.linspace(0,100,11)
X, Y = np.meshgrid(x, y)
Z = np.array([[1.3945, 1.34896, 1.34415, 1.33432, 1.33306, 1.32612],\
[1.39422, 1.3487, 1.34389, 1.33408, 1.33282, 1.32591],\
[1.39336, 1.34795, 1.34315, 1.33336, 1.33211, 1.32524],\
[1.39208, 1.34682, 1.34205, 1.3323, 1.33105, 1.32424],\
[1.39046, 1.3454, 1.34065, 1.33095, 1.32972, 1.32296],\
[1.38854, 1.34373, 1.33901, 1.32937, 1.32814, 1.32145],\
[1.38636, 1.34184, 1.33714, 1.32757, 1.32636, 1.31974],\
[1.38395, 1.33974, 1.33508, 1.32559, 1.32438, 1.31784],\
[1.38132, 1.33746, 1.33284, 1.32342, 1.32223, 1.31576],\
[1.37849, 1.33501, 1.33042, 1.32109, 1.31991, 1.31353],\
[1.37547, 1.33239, 1.32784, 1.31861, 1.31744, 1.31114]])
# These have to be defined beforehand (otherwise a "parameter names are not defined" error is raised)
a1,a2,a3,a4 = 1.3208, -1.2325E-5, -1.8674E-6, 5.0233E-9
b1,b2,b3,b4 = 5208.2413, -0.5179, -2.284E-2, 6.9608E-5
c1,c2,c3,c4 = -2.5551E8, -18341.336, -920, 2.7729
d1,d2,d3,d4 = 9.3495, 2E-3, 3.6733E-5, -1.2932E-7
# Function to fit
def model(x, y, *args):
    return a1+a2*y+a3*np.power(y,2)+a4*np.power(y,3)+\
        (b1+b2*y+b3*np.power(y,2)+b4*np.power(y,3))/np.power(x,2)+\
        (c1+c2*y+c3*np.power(y,2)+c4*np.power(y,3))/np.power(x,4)+\
        (d1+d2*y+d3*np.power(y,2)+d4*np.power(y,3))/np.power(x,6)
# This is the callable that is passed to Model.fit. M is a (2,N) array
# where N is the total number of data points in Z, which will be ravelled
# to one dimension.
def _model(M, **args):
    x, y = M
    arr = model(x, y, params)
    return arr
# We need to ravel the meshgrids of X, Y points to a pair of 1-D arrays.
xdata = np.vstack((X.ravel(), Y.ravel()))
# Fitting parameters.
fmodel = Model(_model)
params = Parameters()
params.add_many(('a1', 1.3208, True, 1, np.inf, None, None),
                ('a2', -1.2325E-5, True, -np.inf, np.inf, None, None),
                ('a3', -1.8674E-6, True, -np.inf, np.inf, None, None),
                ('a4', 5.0233E-9, True, -np.inf, np.inf, None, None),
                ('b1', 5208.2413, True, -np.inf, np.inf, None, None),
                ('b2', -0.5179, True, -np.inf, np.inf, None, None),
                ('b3', -2.284E-2, True, -np.inf, np.inf, None, None),
                ('b4', 6.9608E-5, True, -np.inf, np.inf, None, None),
                ('c1', -2.5551E8, True, -np.inf, np.inf, None, None),
                ('c2', -18341.336, True, -np.inf, np.inf, None, None),
                ('c3', -920, True, -np.inf, np.inf, None, None),
                ('c4', 2.7729, True, -np.inf, np.inf, None, None),
                ('d1', 9.3495, True, -np.inf, np.inf, None, None),
                ('d2', 2E-3, True, -np.inf, np.inf, None, None),
                ('d3', 3.6733E-5, True, -np.inf, np.inf, None, None),
                ('d4', -1.2932E-7, True, -np.inf, np.inf, None, None))
result = fmodel.fit(Z.ravel(), params, M=xdata)
fit = model(X, Y, result.params)
print(result.covar)
This code results in the covariance being None. I expect that it should be calculable after all, because Origin somehow manages. If needed, I can provide all parameters from Origin's Surface Fitting output.
When plotting the difference between Z and the fit, there is quite a large discrepancy at low x values (which does not happen in Origin).
You are not defining your model function in a way that can be used sensibly by lmfit. You have:
def _model(M, **args):
    x, y = M
    arr = model(x, y, params)
    return arr
def model(x, y, *args):
    return a1+a2*y+a3*np.power(y,2)+a4*np.power(y,3)+\
        (b1+b2*y+b3*np.power(y,2)+b4*np.power(y,3))/np.power(x,2)+\
        (c1+c2*y+c3*np.power(y,2)+c4*np.power(y,3))/np.power(x,4)+\
        (d1+d2*y+d3*np.power(y,2)+d4*np.power(y,3))/np.power(x,6)
model = Model(_model)
Which has a few problems:
args is not used in _model, and params is not defined inside the function, so it will be taken from module level.
Similarly, in model, args is not used and a1, a2, etc. will be taken from the module-level (plain Python) variables, and (importantly!!) these will not be updated during the fit.
In short, your model function never sees varying values for the parameters.
lmfit.Model takes the named function arguments and turns those into parameter names. It does not turn **kws or *positional_args into parameter names. So I think that what you want to do is write a model function like this:
def model(x, y, a1, a2, a3, a4, b1, b2, b3, b4, c1, c2, c3, c4,
          d1, d2, d3, d4):
    return a1+a2*y+a3*np.power(y,2)+a4*np.power(y,3)+\
        (b1+b2*y+b3*np.power(y,2)+b4*np.power(y,3))/np.power(x,2)+\
        (c1+c2*y+c3*np.power(y,2)+c4*np.power(y,3))/np.power(x,4)+\
        (d1+d2*y+d3*np.power(y,2)+d4*np.power(y,3))/np.power(x,6)
Then create a model from that with:
# Note: don't give a function and Model instance the same name!!
my_model = Model(model, independent_vars=('x', 'y'))
With that model defined you can run the fit, without having to ravel your data to one dimension (the independent data in lmfit can be of almost any data type, and data arrays can be multi-dimensional):
result = my_model.fit(Z, params, x=X, y=Y)
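From there the usual lmfit result methods apply; as a minimal sketch (not part of the original answer) for inspecting the outcome:
print(result.fit_report())                           # parameter values, uncertainties, fit statistics
print(result.covar)                                  # covariance matrix (None only if it could not be estimated)
best_fit = my_model.eval(result.params, x=X, y=Y)    # best-fit surface, same shape as Z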
For what it is worth, making such changes works for me in the sense that the fit runs to completion. The fit still gets stuck with some of the parameters not updating from their initial values, but that is sort of a separate question from the mechanics of setting up and running the fit, and is probably due to polynomials being pretty unstable or poor initial estimates.
As an aside: np.power(y,n) can be spelled y**n, and readability counts. Also, numerical stability is sometimes improved by replacing
a + b*x + c*x**2 + d*x**3
with
a + x*(b + x*(c + x*d))
Though I do not know if that would help in your case.
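For illustration, here is a sketch of the same model function with each cubic in y evaluated in that (Horner) form:
def model(x, y, a1, a2, a3, a4, b1, b2, b3, b4, c1, c2, c3, c4,
          d1, d2, d3, d4):
    # each cubic in y written as a + y*(b + y*(c + y*d))
    A = a1 + y*(a2 + y*(a3 + y*a4))
    B = b1 + y*(b2 + y*(b3 + y*b4))
    C = c1 + y*(c2 + y*(c3 + y*c4))
    D = d1 + y*(d2 + y*(d3 + y*d4))
    return A + B/x**2 + C/x**4 + D/x**6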
I want to use Python 3 to build a zero-inflated Poisson model. I found the class statsmodels.discrete.count_model.ZeroInflatedPoisson in the statsmodels library.
I just wonder how to use it. It seems I should do:
ZeroInflatedPoisson(Y_train, X_train).fit()
But when I wanted to make a prediction using X_test, it told me that the length of X_test doesn't match that of X_train.
Or is there another package to fit this model?
Here is the code I used:
import random
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from statsmodels.discrete.count_model import ZeroInflatedPoisson

x1 = [random.randint(0, 1) for i in range(200)]
x2 = [random.randint(1, 2) for i in range(200)]
y = np.random.poisson(lam=2, size=100).tolist()
for i in range(100): y.append(0)
df = pd.DataFrame()
df['x1'] = x1
df['x2'] = x2
df['y'] = y
df_x = df.iloc[:, :-1]
x_train, x_test, y_train, y_test = train_test_split(df_x, df['y'], test_size=0.3)
clf = ZeroInflatedPoisson(endog=y_train, exog=x_train).fit()
clf.predict(x_test)
ValueError: operands could not be broadcast together with shapes (140,) (60,)
I also tried:
clf.predict(x_test, exog=np.ones(len(x_test)))
ValueError: shapes (60,) and (1,) not aligned: 60 (dim 0) != 1 (dim 0)
This looks like a bug to me.
As far as I can see:
If no explanatory variables, exog_infl, are specified for the inflation model, then an array of ones is used to model a constant inflation probability.
However, if exog_infl in predict is None, then it uses model.exog_infl, which is an array of ones with length equal to the training sample.
As a workaround, specifying a 1-D array of ones of the correct length in predict should work.
Try:
clf.predict(test_x, exog_infl=np.ones(len(test_x)))
I guess the same problem will occur if exposure was used in the model, but is not explicitly specified in predict.
I ran into the same problem, which landed me on this thread. As noted by Josef, it seems you need to provide exog_infl with an array of ones of the correct length for predict to work.
However, the code Josef provided doesn't quite produce the array shape that worked for me, so the full line required to generate the required array is actually
clf.predict(test_x, exog_infl=np.ones((len(test_x), 1)))
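Putting the pieces together, here is a minimal end-to-end sketch using the question's toy data (the exog_infl shape below is the one that worked for me, not an officially documented requirement):
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from statsmodels.discrete.count_model import ZeroInflatedPoisson

# toy data: two covariates and a zero-inflated count response
rng = np.random.default_rng(0)
df = pd.DataFrame({'x1': rng.integers(0, 2, 200),
                   'x2': rng.integers(1, 3, 200)})
df['y'] = np.concatenate([rng.poisson(lam=2, size=100),
                          np.zeros(100, dtype=int)])

x_train, x_test, y_train, y_test = train_test_split(df[['x1', 'x2']], df['y'],
                                                    test_size=0.3)
clf = ZeroInflatedPoisson(endog=y_train, exog=x_train).fit()

# pass an explicit column of ones for the inflation part so its length matches x_test
y_pred = clf.predict(x_test, exog_infl=np.ones((len(x_test), 1)))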
I'm trying to define a complex custom likelihood function using pymc3. The likelihood function involves a lot of iteration, and therefore I'm trying to use theano's scan method to define iteration directly within theano. Here's a greatly simplified example that illustrates the challenge that I'm facing. The (fake) likelihood function I'm trying to define is simply the sum of two pymc3 random variables, p and theta. Of course, I could simply return p+theta, but the actual likelihood function I'm trying to write is more complicated, and I believe I need to use theano.scan since it involves a lot of iteration.
import pymc3 as pm
from pymc3 import Model, Uniform, DensityDist
import theano.tensor as T
import theano
import numpy as np
### theano test
theano.config.compute_test_value = 'raise'
X = np.asarray([[1.0,2.0,3.0],[1.0,2.0,3.0]])
### pymc3 implementation
with Model() as bg_model:
    p = pm.Uniform('p', lower=0, upper=1)
    theta = pm.Uniform('theta', lower=0, upper=.2)

    def logp(X):
        f = p + theta
        print("f", f)
        get_ll = theano.function(name='get_ll', inputs=[p, theta], outputs=f)
        print("p keys ", p.__dict__.keys())
        print("theta keys ", theta.__dict__.keys())
        print("p name ", p.name, "p.type ", p.type, "type(p)", type(p), "p.tag", p.tag)
        result = get_ll(p, theta)
        print("result", result)
        return result

    y = pm.DensityDist('y', logp, observed=X)  # Nx4 y = f(f,x,tx,n | p, theta)
When I run this, I get the error:
TypeError: ('Bad input argument to theano function with name "get_ll" at index 0(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')
I understand that the issue occurs in line
result=get_ll(p, theta)
because p and theta are of type pymc3.TransformedRV, and the input to a theano function needs to be a scalar number or a simple numpy array. However, a pymc3 TransformedRV does not seem to have any obvious way of obtaining the current value of the random variable itself.
Is it possible to define a log likelihood function that involves the use of a theano function that takes as input a pymc3 random variable?
The problem is that your th.function get_ll is a compiled theano function, which takes as input numerical arrays. Instead, pymc3 is sending it a symbolic variable (theano tensor). That's why you're getting the error.
As to your solution, you're right in saying that just returning p+theta is the way to go. If you have scans and whatnot in your logp, then you would return the scan variable of interest; there is no need to compile a theano function here. For example, if you wanted to add 1 to each element of a vector (as an impractical toy example), you would do:
def logp(X):
    the_sum, the_sum_upd = th.scan(lambda x: x + 1, sequences=[X])
    return the_sum
That being said, if you need gradients, you would need to calculate your the_sum variable in a theano Op and provide a grad() method along with it (you can see a toy example of that on the answer here). If you do not need gradients, you might be better off doing everything in python (or C, numba, cython, for performance) and using the as_op decorator.
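As a rough illustration of that last option, here is a hedged sketch only (it reuses X from the question, and note that an Op created with as_op has no gradient, so only non-gradient samplers such as Metropolis would work):
import numpy as np
import pymc3 as pm
import theano.tensor as tt
from theano.compile.ops import as_op

# wrap an arbitrary Python function as a Theano Op (no grad() is provided)
@as_op(itypes=[tt.dscalar, tt.dscalar], otypes=[tt.dscalar])
def loglike(p_val, theta_val):
    # p_val and theta_val arrive here as plain float64 values, so any Python/numpy code works
    return np.asarray(p_val + theta_val, dtype='float64')

with pm.Model() as bg_model:
    p = pm.Uniform('p', lower=0, upper=1)
    theta = pm.Uniform('theta', lower=0, upper=0.2)
    y = pm.DensityDist('y', lambda X: loglike(p, theta), observed=X)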
I want to make use of Theano's logistic regression classifier, but I would like to make an apples-to-apples comparison with previous studies I've done to see how deep learning stacks up. I recognize this is probably a fairly simple task if I were more proficient in Theano, but this is what I have so far. From the tutorials on the website, I have the following code:
def errors(self, y):
    # check if y has same dimension of y_pred
    if y.ndim != self.y_pred.ndim:
        raise TypeError(
            'y should have the same shape as self.y_pred',
            ('y', y.type, 'y_pred', self.y_pred.type)
        )
    # check if y is of the correct datatype
    if y.dtype.startswith('int'):
        # the T.neq operator returns a vector of 0s and 1s, where 1
        # represents a mistake in prediction
        return T.mean(T.neq(self.y_pred, y))
I'm pretty sure this is where I need to add the functionality, but I'm not certain how to go about it. What I need is either access to y_pred and y for each and every run (to update my confusion matrix in python) or to have the C++ code handle the confusion matrix and return it at some point along the way. I don't think I can do the former, and I'm unsure how to do the latter. I've done some messing around with an update function along the lines of:
def confuMat(self, y):
    x = T.vector('x')
    classes = T.scalar('n_classes')
    onehot = T.eq(x.dimshuffle(0, 'x'), T.arange(classes).dimshuffle('x', 0))
    oneHot = theano.function([x, classes], onehot)
    yMat = T.matrix('y')
    yPredMat = T.matrix('y_pred')
    confMat = T.dot(yMat.T, yPredMat)
    confusionMatrix = theano.function(inputs=[yMat, yPredMat], outputs=confMat)

    def confusion_matrix(x, y, n_class):
        return confusionMatrix(oneHot(x, n_class), oneHot(y, n_class))

    t = np.asarray(confusion_matrix(y, self.y_pred, self.n_out))
    print(t)
But I'm not completely clear on how to get this to interface with the function in question and give me a numpy array I can work with.
I'm quite new to Theano, so hopefully this is an easy fix for one of you. I'd like to use this classifier as my output layer in a number of configurations, so that I could use the confusion matrix with other architectures.
I suggest a brute-force sort of approach. You need an output for the prediction first; create a function for it.
prediction = theano.function(
    inputs=[index],
    outputs=MLPlayers.predicts,
    givens={
        x: test_set_x[index * batch_size: (index + 1) * batch_size]})
In your test loop, gather the predictions...
labels = labels + test_set_y.eval().tolist()
for mini_batch in xrange(n_test_batches):
    wrong = wrong + int(test_model(mini_batch))
    predictions = predictions + prediction(mini_batch).tolist()
Now create the confusion matrix this way:
correct = 0
confusion = numpy.zeros((outs, outs), dtype=int)
for index in xrange(len(predictions)):
    if labels[index] == predictions[index]:   # use '==', not 'is': identity checks on ints are unreliable
        correct = correct + 1
    confusion[int(predictions[index]), int(labels[index])] += 1
You can find this kind of an implementation in this repository.
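For what it's worth, once labels and predictions are plain integer lists as gathered above, the same matrix can also be built directly with numpy (a sketch, not from the original answer; it assumes labels, predictions and outs from above are in scope):
import numpy as np

labels_arr = np.asarray(labels, dtype=int)
preds_arr = np.asarray(predictions, dtype=int)
confusion = np.zeros((outs, outs), dtype=int)
np.add.at(confusion, (preds_arr, labels_arr), 1)   # rows: predicted class, columns: true class
correct = int((labels_arr == preds_arr).sum())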