I'm trying to implement a layer (via a Lambda layer) that performs the following numpy procedure:
def func(x, n):
    return np.concatenate((x[:, :n], np.tile(x[:, n:].mean(axis=0), (x.shape[0], 1))), axis=1)
I'm stuck because I don't know how to get the size of the first dimension of x (which is the batch size). The backend function int_shape(x) returns (None, ...).
So, if I know the batch_size, the corresponding Keras procedure would be:
def func(x, n):
    return K.concatenate([x[:, :n], K.tile(K.mean(x[:, n:], axis=0), [batch_size, 1])], axis=1)
Just as @pitfall says, the second argument of K.tile should be a tensor.
According to the Keras backend docs, K.shape returns a tensor while K.int_shape returns a tuple of int or None entries, so the correct way is to use K.shape. Here is an MWE:
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model
import numpy as np
batch_size = 8
op_len = ip_len = 10
def func(X):
    return K.tile(K.mean(X, axis=0, keepdims=True), (K.shape(X)[0], 1))
ip = Input((ip_len,))
lbd = Lambda(lambda x:func(x))(ip)
model = Model(ip, lbd)
model.summary()
model.compile('adam', loss='mse')
X = np.random.randn(batch_size*100, ip_len)
Y = np.random.randn(batch_size*100, op_len)
#no parameters to train!
#model.fit(X,Y,batch_size=batch_size)
#prediction
np_result = np.tile(np.mean(X[:batch_size], axis=0, keepdims=True),
                    (batch_size, 1))
pred_result = model.predict(X[:batch_size])
print(np.allclose(np_result, pred_result))
You should not use K.int_shape, but something like tf.shape (or its backend wrapper K.shape), which gives you a dynamic shape.
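As a quick illustration (a minimal sketch, assuming the imports from the MWE above):

x = Input((10,))
print(K.int_shape(x))  # (None, 10): the batch size is unknown at graph-build time
print(K.shape(x))      # a symbolic shape tensor; K.shape(x)[0] resolves to the batch size at run time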
New updates
Here is a solution without using tile:
import numpy as np
import keras.backend as K
from keras.layers import *
from keras.models import *
# define the lambda layer
n = 5
MyConcat = Lambda(lambda x: K.concatenate([x[:, :n],
                                           K.ones_like(x[:, n:]) * K.mean(x[:, n:], axis=0)],
                                          axis=1))
# make a dummy testing model
x = Input(shape=(10,))
y = MyConcat(x)
mm = Model(inputs=x, outputs=y)
# test code
a = np.arange(40).reshape([-1,10])
print(a)
[[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24 25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39]]
b = mm.predict(a)
print(b)
[[ 0. 1. 2. 3. 4. 20. 21. 22. 23. 24.]
[10. 11. 12. 13. 14. 20. 21. 22. 23. 24.]
[20. 21. 22. 23. 24. 20. 21. 22. 23. 24.]
[30. 31. 32. 33. 34. 20. 21. 22. 23. 24.]]
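For intuition, the K.ones_like factor simply broadcasts the mean row across the batch; here is a numpy analogue of the same computation (a small sketch reusing a and b from above):

np_b = np.concatenate([a[:, :n], np.ones_like(a[:, n:]) * a[:, n:].mean(axis=0)], axis=1)
print(np.allclose(np_b, b))  # True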
One last thing worth mentioning: in Keras you are not allowed to change the batch size within a layer, i.e. a layer's output batch size MUST equal its input batch size.
Create a functor and give it the batch size at initialization.

class SuperLoss:
    def __init__(self, batch_size):
        self.batch_size = batch_size
    def __call__(self, y_true, y_pred):
        self.batch_size ....
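A minimal sketch of a completed functor; the loss body below is purely illustrative (the original elides it with "...."), here a batch-size-normalized sum of squares:

import keras.backend as K

class SuperLoss:
    def __init__(self, batch_size):
        self.batch_size = batch_size
    def __call__(self, y_true, y_pred):
        # illustrative body: any computation that needs the batch size can use it here
        return K.sum(K.square(y_pred - y_true)) / self.batch_size

# usage: model.compile('adam', loss=SuperLoss(batch_size=8))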
I am new to pytorch, and I have been trying some examples with autograd, to see if I understand it. I am confused about why the following code does not work:
import torch

def Loss(a):
    return a**2

a = torch.tensor(3.0, requires_grad=True)
L = Loss(a)
L.backward()
with torch.no_grad():
    a = a + 1.0
L = Loss(a)
L.backward()
print(a.grad)
Instead of outputting 8.0, we get "RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn".
There are two things to note regarding your code:
You are performing two backpropagations down to the leaf a, which means the gradients accumulate. In other words, you should get a gradient equal to da²/da + d(a+1)²/da, which is 2a + 2(a+1) = 2(2a + 1). If a=3, then a.grad will be equal to 14.
You are using a torch.no_grad context manager, so the reassigned a = a + 1.0 is created without a gradient history: it no longer requires grad and has no grad_fn, which means you cannot backpropagate from any tensor computed from it. That is exactly the RuntimeError you see.
Here is a snippet which yields the desired result, that is 14 as the accumulation of both gradients:
>>> a = torch.tensor(3.0, requires_grad=True)
>>> L = Loss(a)
>>> L.backward()
>>> a.grad
tensor(6.)
>>> L = Loss(a+1)
>>> L.backward()
>>> a.grad
tensor(14.)  # as 6 + 8
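As an aside, if you want only the second gradient (the 8.0 the question expected), clear the accumulated gradient before the second backward pass:

>>> a = torch.tensor(3.0, requires_grad=True)
>>> Loss(a).backward()
>>> a.grad.zero_()  # clear the accumulated 6.0
>>> Loss(a + 1).backward()
>>> a.grad
tensor(8.)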
I have a training set training_set of m observations and n features, and three different validation sets val_a, val_b, and val_c that don't leak information to one another.
I would like to perform hyperparameter tuning via HalvingGridSearchCV, where I fit models on training_set, validate on all three validation sets separately, and then take the score to be the average (or the lowest) of the three.
The reason is that the three validation sets contain observations of the samples at three distinct time points (A, B, C), while the training set contains observations from time point A only. Thus, a model trained on training_set and evaluated on val_a would not necessarily be best for val_b and val_c.
Also, concatenating all of the sets via training_set = pd.concat([training_set, val_a, val_b, val_c]), and then performing a variant of GroupShuffleSplit is non-ideal, as this results in leaking information from different time points to the model.
Thus far here's what I've tried:
import pandas as pd
from sklearn.model_selection import PredefinedSplit
# Assume each dataset has 4 observations.
tf = [-1] * len(training_set)
training_set = pd.concat([training_set, val_a, val_b, val_c])
tf += [0] * len(val_a) + [1] * len(val_b) + [2] * len(val_c)
print("Test fold:", tf)
pds = PredefinedSplit(test_fold = tf)
# gs = HalvingGridSearchCV(estimator = LGBMRegressor(), param_grid = param_grid, cv = pds, scoring = 'r2', refit = False, min_resources = 'exhaust')
for train_index, test_index in pds.split():
    print("TRAIN:", train_index, "TEST:", test_index)
Output:
Test fold: [-1, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
TRAIN: [ 0 1 2 3 8 9 10 11 12 13 14 15] TEST: [4 5 6 7]
TRAIN: [ 0 1 2 3 4 5 6 7 12 13 14 15] TEST: [ 8 9 10 11]
TRAIN: [ 0 1 2 3 4 5 6 7 8 9 10 11] TEST: [12 13 14 15]
As you can see, this generates a 3-fold cross-validation where each validation set is left out once and included in the training set all other times. I know -1 keeps observations out of every test set, but there is no value that keeps observations out of every train set. ):
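One workaround I'm considering (an untested sketch on my side; it relies on scikit-learn's documented behavior that the cv argument also accepts an iterable of (train_indices, test_indices) pairs) is to always train on the training_set rows of the concatenated frame and test on each validation set once:

import numpy as np
n_train = n_a = n_b = n_c = 4  # toy sizes, matching the example above
train_idx = np.arange(n_train)
custom_cv = [
    (train_idx, np.arange(n_train, n_train + n_a)),
    (train_idx, np.arange(n_train + n_a, n_train + n_a + n_b)),
    (train_idx, np.arange(n_train + n_a + n_b, n_train + n_a + n_b + n_c)),
]
# gs = HalvingGridSearchCV(estimator=LGBMRegressor(), param_grid=param_grid,
#                          cv=custom_cv, scoring='r2', refit=False)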
Thank you!
Could someone explain this to me? How does the code below work?
It is mentioned in Neural Networks and Deep Learning on Coursera:
A trick when you want to flatten a matrix X of shape (a, b, c, d) to a matrix X_flatten of shape (b×c×d, a) is to use:
X_flatten = X.reshape(X.shape[0], -1).T # X.T is the transpose of X
Assume I have X of shape (209, 64, 64, 3); then when I say
X_flatten = X.reshape(X.shape[0], -1).T
which means
X_flatten = X.reshape(209, -1).T
How is this working? I am really confused about it.
Assume the shape of x is (209,64,64,3).
Then x.reshape(209, -1) will turn it into shape (209, 12288), since it will reshape to have 209 rows and will automatically figure out how many columns are needed (here: 64*64*3 = 12288 columns).
x.reshape(209, -1).T will simply transpose this, so that the final shape is (12288, 209).
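A quick sanity check (a minimal sketch; any array with the stated shape works):

import numpy as np
X = np.zeros((209, 64, 64, 3))
print(X.reshape(X.shape[0], -1).shape)    # (209, 12288)
print(X.reshape(X.shape[0], -1).T.shape)  # (12288, 209)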
As far as I understand it, reshape(209, -1) flattens each (64, 64, 3) sample into a single row of 64 * 64 * 3 values. You can do a test as below:
import numpy as np
stacked_matrix = np.arange(10).reshape(2, 5)
print(stacked_matrix.reshape(1, -1))
# [[0 1 2 3 4 5 6 7 8 9]]
I got an error when implementing a Residual Network in Keras. Below is the code that gives me the error (it comes from the first line of the final step in the function definition):
Load packages:
import numpy as np
from keras import layers
from keras.layers import Input, Add, Concatenate, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D
from keras.models import Model, load_model
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import preprocess_input
import pydot
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model
from resnets_utils import *
from keras.initializers import glorot_uniform
import scipy.misc
from matplotlib.pyplot import imshow
%matplotlib inline
import keras.backend as K
K.set_image_data_format('channels_last')
K.set_learning_phase(1)
Define the function (it's the first line of the "final step" that gives me the error):
def identity_block(X, f, filters, stage, block):
    """
    Implementation of the identity block as defined in Figure 4

    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network

    Returns:
    X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
    """
    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Save the input value. You'll need this later to add back to the main path.
    X_shortcut = X

    # First component of main path
    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2a', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    # Second component of main path
    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path
    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c', kernel_initializer=glorot_uniform(seed=0))(X)
    print(f'before BatchNormalization: X={X}')
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')
    print(f'after BatchNormalization: X={X}')

    # Final step: Add shortcut value to main path, and pass it through a RELU activation
    X = Add()([X_shortcut, X])
    X = Activation('relu')(X)
    ### END CODE HERE ###

    return X
Call/test the above function:
import tensorflow as tf

tf.reset_default_graph()
with tf.Session() as test:
    np.random.seed(1)
    A_prev = tf.placeholder("float", [3, 4, 4, 6])
    X = np.random.randn(3, 4, 4, 6)
    A = identity_block(A_prev, f = 2, filters = [2, 4, 6], stage = 1, block = 'a')
    test.run(tf.global_variables_initializer())
    out = test.run([A], feed_dict={A_prev: X, K.learning_phase(): 0})
    print("out = " + str(out[0][1][1][0]))
Below is the print message and error message:
before BatchNormalization: X=Tensor("res1a_branch2c/BiasAdd:0", shape=(3, 4, 4, 6), dtype=float32)
after BatchNormalization: X= <keras.layers.normalization.BatchNormalization object at 0x7f169c6d9668>
ValueError: Unexpectedly found an instance of type `<class 'keras.layers.normalization.BatchNormalization'>`. Expected a symbolic tensor instance.
Below is the complete log (in case you need it)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/keras/engine/topology.py in assert_input_compatibility(self, inputs)
424 try:
--> 425 K.is_keras_tensor(x)
426 except ValueError:
/opt/conda/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in is_keras_tensor(x)
399 tf.SparseTensor)):
--> 400 raise ValueError('Unexpectedly found an instance of type `' + str(type(x)) + '`. '
401 'Expected a symbolic tensor instance.')
ValueError: Unexpectedly found an instance of type `<class 'keras.layers.normalization.BatchNormalization'>`. Expected a symbolic tensor instance.
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-6-b3d1050f50dc> in <module>()
5 A_prev = tf.placeholder("float", [3, 4, 4, 6])
6 X = np.random.randn(3, 4, 4, 6)
----> 7 A = identity_block(A_prev, f = 2, filters = [2, 4, 6], stage = 1, block = 'a')
8 test.run(tf.global_variables_initializer())
9 out = test.run([A], feed_dict={A_prev: X, K.learning_phase(): 0})
<ipython-input-5-013941ce79d6> in identity_block(X, f, filters, stage, block)
43
44 # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
---> 45 X = Add()([X_shortcut,X])
46 X = Activation('relu')(X)
47
/opt/conda/lib/python3.6/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
556 # Raise exceptions in case the input is not compatible
557 # with the input_spec specified in the layer constructor.
--> 558 self.assert_input_compatibility(inputs)
559
560 # Collect input shapes to build layer.
/opt/conda/lib/python3.6/site-packages/keras/engine/topology.py in assert_input_compatibility(self, inputs)
429 'Received type: ' +
430 str(type(x)) + '. Full input: ' +
--> 431 str(inputs) + '. All inputs to the layer '
432 'should be tensors.')
433
ValueError: Layer add_1 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.layers.normalization.BatchNormalization'>. Full input: [<tf.Tensor 'Placeholder:0' shape=(3, 4, 4, 6) dtype=float32>, <keras.layers.normalization.BatchNormalization object at 0x7f169c6d9668>]. All inputs to the layer should be tensors.
I guess that I missed something in the final step of the function definition, but I have no idea why I got the error. Could any Keras expert here help me out?
Always remember to pass tensors into layers:
print(f'before BatchNormalization: X={X}');
#X = BatchNormalization(axis=3,name=bn_name_base+'2c') # <--- INCORRECT
X = BatchNormalization(axis=3,name=bn_name_base+'2c')(X) # <--- CORRECT
print(f'after BatchNormalization: X={X}');
The difference between 'CORRECT' and 'INCORRECT' is that the latter yields a layer object, whereas the former evaluates that layer into a tensor by calling it on X.
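To make the distinction concrete, a minimal sketch:

bn = BatchNormalization(axis=3)  # BatchNormalization(...) returns a Layer instance, not a tensor
out = bn(X)                      # calling the layer on a tensor X returns a tensor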
Furthermore, if your identity_block() lacked a return it would throw another error; make sure it ends with return X. Lastly, F1, F2, and F3 are neither defined within the function nor passed as arguments (the usual fix is to unpack them with F1, F2, F3 = filters at the top of the function); you may not see this as an error if they happen to be defined outside the function, e.g. in your local namespace.
I am getting the following error with Python 3.5, Theano, and Keras while running a CNN model in Keras.
TypeError Traceback (most recent call last)
<ipython-input-5-8f63b9541c7f> in <module>()
----> 1 model = get_unet(Adam(lr=1e-5))
<ipython-input-1-1aa31b714bd1> in get_unet_inception_2head(optimizer)
121
122 inputs = Input((1, IMG_ROWS, IMG_COLS), name='main_input')
--> 123 conv1 = inception_block(inputs, 32, batch_mode=2, splitted=splitted, activation=act)
124 #conv1 = inception_block(conv1, 32, batch_mode=2, splitted=splitted, activation=act)
125
<ipython-input-1-1aa31b714bd1> in inception_block(inputs, depth, batch_mode, splitted, activation)
27 actv = activation == 'relu' and (lambda: LeakyReLU(0.0)) or activation == 'elu' and (lambda: ELU(1.0)) or None
28 ---> 29 c1_1 = Convolution2D(depth/4, 1, 1, init='he_normal', border_mode='same')(inputs)
30
31 c2_1 = Convolution2D(depth/8*3, 1, 1, init='he_normal',border_mode='same')(inputs)
...
/opt/anaconda3/lib/python3.5/site-packages/theano/tensor/basic.py in make_node(self, x, shp)
4361 # except when shp is constant and empty
4362 # (in this case, shp.dtype does not matter anymore).
-> 4363 raise TypeError("Shape must be integers", shp, shp.dtype)
4364 assert shp.ndim == 1
4365 if isinstance(shp, TensorConstant):
TypeError: ('Shape must be integers', TensorConstant{[ 1. 8. 1. 1.]}, 'float64')
I am also getting the following warnings:
/opt/anaconda3/lib/python3.5/site-packages/keras/backend/theano_backend.py:145: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
return variable(np.random.normal(loc=0.0, scale=scale, size=shape),
/opt/anaconda3/lib/python3.5/site-packages/keras/backend/theano_backend.py:116: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
return variable(np.zeros(shape), dtype, name)
I think this warning is causing the problem. I have tried everything, like casting to int, etc.
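For what it's worth, a minimal sketch of what I suspect (my assumption): in Python 3, / is true division and returns a float, which would explain the float64 TensorConstant in the error, whereas // keeps an integer:

depth = 32         # illustrative value, not from my actual code
print(depth / 4)   # 8.0 -- a float; Theano rejects non-integer shapes/filter counts
print(depth // 4)  # 8   -- floor division keeps it an int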
Please help me resolve this problem.
Code link: http://ideone.com/Y02erf