In the code snippet below, I am running into some sort of dimension mismatch or matrix singularity issue.
def eval(x,m):
hess = 6*x + m
return hess
x =np.ones((1, 5))
m =np.ones((1, 5))
D = np.linalg.inv(eval(x, m))
print(D)
error:
LinAlgError: Last 2 dimensions of the array must be square
I found my issue. The equation under eval was incorrect. Like hpaulj mentioned, it was returning a singular matrix. With the correct eq it works fine
Related
I'm trying to implement double stochastic normalisation of an N x N x P tensor as described in Section 3.2 in Gong, CVPR 2019. This can be done easily in the N x N case using matrix operations but I am stuck with the 3D tensor case. What I have so far is
def doubly_stochastic_normalise(E):
"""E: n x n x f"""
E = E / torch.sum(E, dim=1, keepdim=True) # normalised across rows
F = E / torch.sum(E, dim=0, keepdim=True) # normalised across cols
E = torch.einsum('ijp,kjp->ikp', E, F)
return E
but I'm wondering if there is a method without einsum.
In this setting, you can always fall back to using torch.matmul (batched matrix multiplication to be more precise). However, this requires you to transpose the axis. Recall the matrix multiplication for two 3D inputs, in einsum notation, it gives us:
bik,bkj->bij
Notice how the k dimension gets reduces. To get to this setting, we need to transpose the inputs of the operator. In your case we have:
ijp ? kjp -> ikp
↓ ↓ ↑
pij # pjk -> pik
This translates to:
>>> (E.permute(2,0,1) # F.permute(2,1,0)).permute(1,2,0)
# ijp ➝ pij kjp ➝ pjk pik ➝ ikp
You can argue your method is not only shorter but also a lot more readable. I would therefore stick with torch.einsum. The reason why the einsum operator is so useful here is because you can perform axes transpositions on the fly.
I am looking for a way to avoid the nested loops in the following snippet, where A and B are two-dimensional arrays, each of shape (m, n) with m, n beeing arbitray positive integers:
import numpy as np
m, n = 5, 2
a = randint(0, 10, (m, n))
b = randint(0, 10, (m, n))
out = np.empty((n, n))
for i in range(n):
for j in range(n):
out[i, j] = np.sum(A[:, i] + B[:, j])
The above logic is roughly equivalent to
np.einsum('ij,ik', A, B)
with the exception that einsum computes the sum of products.
Is there a way, equivalent to einsum, that computes a sum of sums? Or do I have to write an extension for this operation?
einsum needs to perform elementwise multiplication and then it does summing (optional). As such it might not be applicable/needed to solve our case. Read on!
Approach #1
We can leverage broadcasting such that the first axes are aligned
and second axis are elementwise summed after extending dimensions to 3D. Finally, we need summing along the first axis -
(A[:,:,None] + B[:,None,:]).sum(0)
Approach #2
We can simply do outer addition of columnar summations of each -
A.sum(0)[:,None] + B.sum(0)
Approach #3
And hence, bring in einsum -
np.einsum('ij->j',A)[:,None] + np.einsum('ij->j',B)
You can also use numpy.ufunc.outer, specifically here numpy.add.outer after summing along axis 0 as #Divakar mentioned in #approach 2
In [126]: numpy.add.outer(a.sum(0), b.sum(0))
Out[126]:
array([[54, 67],
[43, 56]])
I am currently going through the theano tutorial for logistic regression, very much like discussed in this post:
what-does-negative-log-likelihood-of-logistic-regression-in-theano-look-like. However, the original tutorial uses shared variables W and b, as well as a matrix called input. Input is a n x n_in matrix, W is n_in x n_out and b is a n_out x 1 column vector.
self.W = theano.shared(
value=numpy.zeros(
(n_in, n_out),
dtype=theano.config.floatX
),
name='W',
borrow=True
)
self.b = theano.shared(
value=numpy.zeros(
(n_out,),
dtype=theano.config.floatX
),
name='b',
borrow=True
)
self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
Now, as far as I understood from the documentation of shared variables, the broadcasting pattern of shared variables defaults to false. So why is it that this line of code does not throw an error because of mismatched dimensions?
self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
After all, we are adding a matrix T.dot(input, self.W) to a vector b. Do shared variables broadcast by default after all? Even with broadcasting, the dimension don't add up. T.dot(input, self.W) is n x n_out matrix and b a n_out x 1 vector.
What am I missing?
Shared variables are not broadcastable by default as their shape can change, however, Theano takes care of necessary broadcasting with Tensors when operations like adding scalar to a matrix, adding vector to a matrix etc. are requested. Check the documentation for more details.
Note that input is a symbolic variable not a shared variable. The dimension mismatch would have occurred if second dimension of self.W i.e n_out would be different from first dimension of vector b, which is not the case here, so they perfectly add up, it's the same in numpy as well:
import numpy as np
a = np.asarray([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
b = np.asarray([1,2,3])
print(a + b)
I don't understand why do we need tensor.reshape() function in Theano. It is said in the documentation:
Returns a view of this tensor that has been reshaped as in
numpy.reshape.
As far as I understood, theano.tensor.var.TensorVariable is some entity that is used for computation graphs creation. And it is absolutely independent of shapes. For instance when you create your function you can pass there matrix 2x2 or matrix 100x200. As I thought reshape somehow restricts this variety. But it is not. Suppose the following example:
X = tensor.matrix('X')
X_resh = X.reshape((3, 3))
Y = X_resh ** 2
f = theano.function([X_resh], Y)
print(f(numpy.array([[1, 2], [3, 4]])))
As I understood, it should give an error since I passed matrix 2x2 not 3x3, but it computes element-wise squares perfectly.
So what is the shape of the theano tensor variable and where should we use it?
There is an error in the provided code though Theano fails to point this out.
Instead of
f = theano.function([X_resh], Y)
you should really use
f = theano.function([X], Y)
Using the original code you are actually providing the tensor after the reshape so the reshape command never gets executed. This can be seen by adding
theano.printing.debugprint(f)
which prints
Elemwise{sqr,no_inplace} [id A] '' 0
|<TensorType(float64, matrix)> [id B]
Note that there is no reshape operation in this compiled execution graph.
If one changes the code so that X is used as the input instead of X_resh then Theano throws an error including the message
ValueError: total size of new array must be unchanged Apply node that
caused the error: Reshape{2}(X, TensorConstant{(2L,) of 3})
This is expected because one cannot reshape a tensor with shape (2, 2) (i.e. 4 elements) into a tensor with shape (3, 3) (i.e. 9 elements).
To address the broader question, we can use symbolic expressions in the target shape and those expressions can be functions of the input tensor's symbolic shape. Here's some examples:
import numpy
import theano
import theano.tensor
X = theano.tensor.matrix('X')
X_vector = X.reshape((X.shape[0] * X.shape[1],))
X_row = X.reshape((1, X.shape[0] * X.shape[1]))
X_column = X.reshape((X.shape[0] * X.shape[1], 1))
X_3d = X.reshape((-1, X.shape[0], X.shape[1]))
f = theano.function([X], [X_vector, X_row, X_column, X_3d])
for output in f(numpy.array([[1, 2], [3, 4]])):
print output.shape, output
I am new to theano. I am trying to implement simple linear regression but my program throws following error:
TypeError: ('Bad input argument to theano function with name "/home/akhan/Theano-Project/uog/theano_application/linear_regression.py:36" at index 0(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')
Here is my code:
import theano
from theano import tensor as T
import numpy as np
import matplotlib.pyplot as plt
x_points=np.zeros((9,3),float)
x_points[:,0] = 1
x_points[:,1] = np.arange(1,10,1)
x_points[:,2] = np.arange(1,10,1)
y_points = np.arange(3,30,3) + 1
X = T.vector('X')
Y = T.scalar('Y')
W = theano.shared(
value=np.zeros(
(3,1),
dtype=theano.config.floatX
),
name='W',
borrow=True
)
out = T.dot(X, W)
predict = theano.function(inputs=[X], outputs=out)
y = predict(X) # y = T.dot(X, W) work fine
cost = T.mean(T.sqr(y-Y))
gradient=T.grad(cost=cost,wrt=W)
updates = [[W,W-gradient*0.01]]
train = theano.function(inputs=[X,Y], outputs=cost, updates=updates, allow_input_downcast=True)
for i in np.arange(x_points.shape[0]):
print "iteration" + str(i)
train(x_points[i,:],y_points[i])
sample = np.arange(x_points.shape[0])+1
y_p = np.dot(x_points,W.get_value())
plt.plot(sample,y_p,'r-',sample,y_points,'ro')
plt.show()
What is the explanation behind this error? (didn't got from the error message). Thanks in Advance.
There's an important distinction in Theano between defining a computation graph and a function which uses such a graph to compute a result.
When you define
out = T.dot(X, W)
predict = theano.function(inputs=[X], outputs=out)
you first set up a computation graph for out in terms of X and W. Note that X is a purely symbolic variable, it doesn't have any value, but the definition for out tells Theano, "given a value for X, this is how to compute out".
On the other hand, predict is a theano.function which takes the computation graph for out and actual numeric values for X to produce a numeric output. What you pass into a theano.function when you call it always has to have an actual numeric value. So it simply makes no sense to do
y = predict(X)
because X is a symbolic variable and doesn't have an actual value.
The reason you want to do this is so that you can use y to further build your computation graph. But there is no need to use predict for this: the computation graph for predict is already available in the variable out defined earlier. So you can simply remove the line defining y altogether and then define your cost as
cost = T.mean(T.sqr(out - Y))
The rest of the code will then work unmodified.