X_flatten in numpy - python-3.x

Can someone explain to me how the code below works?
It is mentioned in the Neural Networks and Deep Learning course from Coursera:
A trick when you want to flatten a matrix X of shape (a, b, c, d) to a matrix X_flatten of shape (b×c×d, a) is to use
X_flatten = X.reshape(X.shape[0], -1).T # X.T is the transpose of X
Assume I have X of shape (209, 64, 64, 3). Then when I say
X_flatten = X.reshape(X.shape[0], -1).T
which means
X_flatten = X.reshape(209, -1).T
How is this working? I am really confused about it.

Assume the shape of x is (209,64,64,3).
Then x.reshape(209, -1) will turn it into shape (209, 12288), since it will reshape to have 209 rows and will automatically figure out how many columns are needed (here: 64*64*3 = 12288 columns).
x.reshape(209, -1).T will simply transpose this, so that x_flatten has the final shape (12288, 209).
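A quick way to convince yourself of the resulting shapes (using a random array standing in for the real data):
import numpy as np

X = np.random.rand(209, 64, 64, 3)
X_flatten = X.reshape(X.shape[0], -1).T
print(X.reshape(X.shape[0], -1).shape)  # (209, 12288): each image flattened into one row
print(X_flatten.shape)                  # (12288, 209): one column per image after the transpose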

In other words, reshape(209, -1) flattens the trailing (64, 64, 3) dimensions into a single axis of length 64 * 64 * 3. You can do a test as below:
import numpy as np

stacked_matrix = np.arange(10).reshape(2, 5)
print(stacked_matrix.reshape(1, -1))
# [[0 1 2 3 4 5 6 7 8 9]]

Related

sklearn PCA number of components_

Using sklearn's PCA:
m = np.random.randn(10, 5)
mod = PCA()
mod.fit_transform(m)
mod.components_ will have 5 components, which makes sense to me since there are 5 features in the data.
However if m = np.random.randn(10, 20)
mod.components_ will contain 10 components
Assuming the rows in mod.components_ correspond to the number of features, shouldn't there be 20 components in the second example? Shouldn't there be as many components as features in the data?
From the scikit-learn PCA documentation:
n_components : int, None or string
Number of components to keep. if n_components is not set all components are kept:
n_components == min(n_samples, n_features)
So in the first case min(10, 5) = 5 and the output shape is (5, 5); in the second case min(10, 20) = 10 and the output shape is (10, 20):
from sklearn.decomposition import PCA
import numpy as np
m = np.random.randn(10, 5)
mod = PCA()
mod.fit_transform(m)
print(mod.components_.shape) # (5, 5)
m = np.random.randn(10, 20)
mod = PCA()
mod.fit_transform(m)
print(mod.components_.shape) # (10, 20)
Features vs. components:
Suppose you have a dataset with 3 columns named Age, Sex, and Risk_Factor, and 500 rows. Here the number of features is 3, not 500; 500 is the number of instances/observations. It is not the case that every row is a unique feature; rather, Age, Sex, and Risk_Factor are the features.
Hope everything is clear.
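A small check of the features-vs-rows distinction (using a made-up random dataset with 500 rows and 3 columns standing in for the example above):
import numpy as np
from sklearn.decomposition import PCA

X = np.random.randn(500, 3)   # 500 observations, 3 features (Age, Sex, Risk_Factor)
pca = PCA().fit(X)
print(pca.components_.shape)  # (3, 3): min(500, 3) = 3 components, each a direction over the 3 features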

Building a 4D matrix from 1D array duplicated

In Python, suppose I have a 1D array C of length c, and I want to construct a 4D matrix of dimensions a x b x c x d, such that the array is duplicated along all other axes.
I.e. no matter what the indexes along dimensions 1, 2 and 4 are, array[i][j][k][l] = C[k]
Is there any numpy function to do that? Thanks!
For an array ar, you could use np.broadcast_to to get that higher-dimensional array as a view (hence virtually zero runtime and no memory overhead), like so -
np.broadcast_to(ar[None,None,:,None],(a,b,len(ar),d))
Sample run -
In [115]: ar = np.random.rand(10)
In [116]: a,b,d = 3,4,5
In [117]: np.broadcast_to(ar[None,None,:,None],(a,b,len(ar),d)).shape
Out[117]: (3, 4, 10, 5)
If you need output with its own memory space, append .copy().
Leading new axes (None) are optional. Hence, alternatively -
In [121]: np.broadcast_to(ar[:,None],(a,b,len(ar),d)).shape
Out[121]: (3, 4, 10, 5)
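If a real copy (rather than a broadcast view) is acceptable from the start, np.tile gives the same duplication; a short sketch using the same a, b, d values as above:
import numpy as np

ar = np.random.rand(10)
a, b, d = 3, 4, 5
tiled = np.tile(ar[None, None, :, None], (a, b, 1, d))  # repeat along every axis except the third
print(tiled.shape)                      # (3, 4, 10, 5)
print(np.all(tiled[1, 2, :, 3] == ar))  # True: every slice along axes 0, 1, 3 is a copy of ar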

How can I simplify a nested loop into torch tensor operations?

I'm trying to convert some code I have written in numpy which contains a nested-loop into tensor operations found in PyTorch. However, after trying to implement my own version I'm not getting the same value on the output. I have managed to do the same with a single loop, so I'm not entirely sure what I'm doing wrong.
# (Numpy version)
# calculate kinetic energy
summation = 0.0
for i in range(0, len(k_values)-1):
    summation += (k_values[i]**2.0)*wavefp[i]*(((self.hbar*kp_values[i])**2.0)/(2.0*self.mu))*wavef[i]
Ek = step*(4.0*np.pi)*summation

# (Numpy version)
# calculate potential energy
summation = 0.0
for i in range(0, len(k_values)-1):
    for j in range(0, len(kp_values)-1):
        summation += (k_values[i]**2.0)*wavefp[i]*(kp_values[j]**2.0)*wavef[j]*self.MTV[i,j]
Ep = (step**2.0)*(4.0*np.pi)*(2.0/np.pi)*summation
#####################################################
# (PyTorch version)
# calculate kinetic energy
Ek = step*(4.0*np.pi)*torch.sum( k_values.pow(2)*wavefp.mul(wavef)*((kp_values.mul(self.hbar)).pow(2)/(2.0*self.mu)) )

# (PyTorch version)
# calculate potential energy
summation = 0.0
for i in range(0, len(k_values)-1):
    summation += ((k_values[i].pow(2)).mul(wavefp[i]))*torch.sum( (kp_values.pow(2)).mul(wavef).mul(self.MTV[i,:]) )
Ep = (step**2.0)*(4.0*np.pi)*(2.0/np.pi)*summation
The arrays/tensors k_values, kp_values, wavef, and wavefp have dimensions of (1000, 1). The values self.hbar, self.mu, and step are scalars. The variable self.MTV is a matrix of size (1000, 1000).
I would expect both methods to give the same output, but they don't. The code for calculating the kinetic energy (in both Numpy and PyTorch) gives the same value. However, the potential energy calculations differ, and I'm not entirely sure why.
Many Thanks in advance!
The problem is in the shapes. You have kp_values and wavef in (1000, 1), which need to be converted to (1000,) before the multiplications. The outcome of (kp_values.pow(2)).mul(wavef).mul(MTV[i,:]) is a matrix, but you assumed it is a vector.
So, the following should work.
summation += ((k_values[i].pow(2)).mul(wavefp[i])) * torch.sum(
    (kp_values.squeeze(1).pow(2)).mul(wavef.squeeze(1)).mul(MTV[i,:]))
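To see why the un-squeezed version goes wrong, here is a minimal sketch (with made-up small shapes) showing that a (3, 1) tensor multiplied element-wise by a (3,) tensor broadcasts to a (3, 3) matrix instead of staying a vector:
import torch

a = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1), like kp_values.pow(2).mul(wavef)
row = torch.tensor([10.0, 20.0, 30.0])    # shape (3,), like MTV[i, :]
print(a.mul(row).shape)                   # torch.Size([3, 3]) -- broadcast to a matrix
print(a.squeeze(1).mul(row).shape)        # torch.Size([3])    -- the intended element-wise vector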
And a loop-free Numpy and PyTorch solution would be:
import numpy as np
import torch

step = 1.0
k_values = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
kp_values = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
wavef = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
wavefp = np.random.randint(0, 100, size=(1000, 1)).astype("float") / 100
MTV = np.random.randint(0, 100, size=(1000, 1000)).astype("float") / 100
# Numpy solution
term1 = k_values**2.0 * wavefp # 1000 x 1
temp = kp_values**2.0 * wavef # 1000 x 1
term2 = np.matmul(temp.transpose(1, 0), MTV).transpose(1, 0) # 1000 x 1
summation = np.sum(term1 * term2)
print(summation)
# PyTorch solution (convert the numpy arrays to torch tensors first)
k_values, kp_values, wavef, wavefp, MTV = map(torch.from_numpy, (k_values, kp_values, wavef, wavefp, MTV))
term1 = k_values.pow(2).mul(wavefp) # 1000 x 1
term2 = kp_values.pow(2).mul(wavef).transpose(0, 1).matmul(MTV) # 1 x 1000
summation = torch.sum(term2.transpose(0, 1).mul(term1)) # scalar
print(summation.item())
Output
12660.407492918514
12660.407492918514

How to make this 1 cell LSTM network?

I would like to make an LSTM network that learns to give back the first value of the sequence each time there is a 0 in the sequence, and 0 whenever there is another value.
Example:
x = 9 8 3 1 0 3 4
y = 0 0 0 0 9 0 0
The network memorizes a value and gives it back when it receives a special signal.
I think I can do that with one LSTM cell like this:
(In the figure: the weights are shown in red and the biases inside the gray area.)
Here is my model:
model2=Sequential()
model2.add(LSTM(input_dim=1, output_dim=1, return_sequences = True))
model2.add(TimeDistributed(Dense(output_dim=1, activation='linear')))
model2.compile(loss = "mse", optimizer = "rmsprop")
and here is how I set the weights of my cell; however, I am not sure at all about the order:
# w : weights of x_t
# u : weights of h_{t-1}
# order of array: input_gate, new_input, forget_gate, output_gate
# (Tensorflow order)
w = np.array([[0, 1, 0, -100]], dtype=np.float32)
u = np.array([[1, 0, 0, 0]], dtype=np.float32)
biases = np.array([0, 0, 1, 1], dtype=np.float32)
model2.get_layer('lstm').set_weights([w, u, biases])
Am I right with the weights? Is it as I put it in the figure?
To work, it needs to have the right initial values. How do I set the initial value c of the cell and h of the previous output? I have seen this in the source code:
h_tm1 = states[0] # previous memory state
c_tm1 = states[1] # previous carry state
but I couldn't find how to use that.
Why not do this manually? It's so easy and it's an exact calculation. You don't need weights for that, and it is certainly not differentiable with respect to weights.
Given an input tensor with shape (batch, steps, features):
import keras.backend as K
from keras.layers import Lambda

def processSequence(x):
    initial = x[:, 0:1]
    zeros = K.cast(K.equal(x, 0), K.floatx())
    return initial * zeros

model.add(Lambda(processSequence))
Warning: if you're intending to use this with inputs from other layers, the probability of finding a zero will be so small that this layer will be useless.
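A minimal end-to-end sketch of the same idea (assuming tf.keras; process_sequence is just a renamed copy of processSequence above, and the input shape (7, 1) matches the example sequence):
import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

def process_sequence(x):
    # x has shape (batch, steps, features)
    initial = x[:, 0:1]                        # first value of each sequence
    zeros = K.cast(K.equal(x, 0), K.floatx())  # 1.0 where the input is 0, else 0.0
    return initial * zeros                     # emit the stored value only at the zeros

inp = Input(shape=(7, 1))
out = Lambda(process_sequence)(inp)
model = Model(inp, out)

x = np.array([9, 8, 3, 1, 0, 3, 4], dtype="float32").reshape(1, 7, 1)
print(model.predict(x).reshape(-1))  # expected: [0. 0. 0. 0. 9. 0. 0.]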

Theano Reshaping

I am unable to clearly comprehend theano's reshape. I have an image matrix of shape:
[batch_size, stack1_size, stack2_size, height, width]
, where there are stack2_size stacks of images, each having stack1_size channels. I now want to convert them into the following shape:
[batch_size, stack1_size*stack2_size, 1, height, width]
such that all the stacks are combined into one stack containing all the channels. I am not sure if reshape will do this for me; reshape seems not to keep the pixels in lexicographic order when the mixed dimensions are in the middle. I have been trying to achieve this with a combination of dimshuffle, reshape, and concatenate, but to no avail. I would appreciate some help.
Thanks.
Theano reshape works just like numpy reshape with its default order, i.e. 'C':
‘C’ means to read / write the elements using C-like index order, with
the last axis index changing fastest, back to the first axis index
changing slowest.
Here's an example showing that the image pixels remain in the same order after a reshape via either numpy or Theano.
import numpy
import theano
import theano.tensor

def main():
    batch_size = 2
    stack1_size = 3
    stack2_size = 4
    height = 5
    width = 6
    data = numpy.arange(batch_size * stack1_size * stack2_size * height * width).reshape(
        (batch_size, stack1_size, stack2_size, height, width))
    reshaped_data = data.reshape([batch_size, stack1_size * stack2_size, 1, height, width])
    print(data[0, 0, 0])
    print(reshaped_data[0, 0, 0])
    x = theano.tensor.TensorType('int64', (False,) * 5)()
    reshaped_x = x.reshape((x.shape[0], x.shape[1] * x.shape[2], 1, x.shape[3], x.shape[4]))
    f = theano.function(inputs=[x], outputs=reshaped_x)
    print(f(data)[0, 0, 0])

main()
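If Theano is not available, the same ordering check can be done with numpy alone; a small sketch (variable names shortened, same sizes as above) verifying that every (height, width) image survives the reshape unchanged and lands at channel index i*stack2_size + j:
import numpy as np

batch, s1, s2, h, w = 2, 3, 4, 5, 6
data = np.arange(batch * s1 * s2 * h * w).reshape(batch, s1, s2, h, w)
merged = data.reshape(batch, s1 * s2, 1, h, w)

ok = all(
    np.array_equal(data[b, i, j], merged[b, i * s2 + j, 0])
    for b in range(batch) for i in range(s1) for j in range(s2)
)
print(ok)  # True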
