How to get the final CNN layer output shape (None, 100)?

I have a simple CNN model for my audio data. The final layer needs to have shape (None, 100) because I need to concatenate it with my text features (currently shape (None, 128)). Here is my code:
from keras.layers import Input, Conv1D, MaxPooling1D, Reshape
from keras.models import Model

inputs_au = Input(shape=(637, 20))
audio_model = Conv1D(100, 3, activation='relu')(inputs_au)
audio_model = MaxPooling1D()(audio_model)
audio_model = Conv1D(100, 3, activation='relu')(audio_model)
audio_model = MaxPooling1D()(audio_model)
audio_model = Conv1D(100, 3, activation='relu')(audio_model)
audio_model = MaxPooling1D()(audio_model)
#audio_model = Reshape((100,))(audio_model)
model_audio = Model(inputs=inputs_au, outputs=audio_model)
model_audio.summary()
The model summary shows that the output of the last MaxPooling1D layer has shape (None, 77, 100), so the Reshape((100,)) layer fails. How can I make the final layer output shape (None, 100)? Please advise.
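One common way to collapse the remaining 77 timesteps is a global pooling layer, which keeps the 100 filters and ends the branch at (None, 100). This is a hedged sketch of that idea, not the only possible fix; Flatten() followed by Dense(100) would also work if a learned projection is preferred.

from keras.layers import Input, Conv1D, MaxPooling1D, GlobalMaxPooling1D
from keras.models import Model

inputs_au = Input(shape=(637, 20))
x = Conv1D(100, 3, activation='relu')(inputs_au)
x = MaxPooling1D()(x)
x = Conv1D(100, 3, activation='relu')(x)
x = MaxPooling1D()(x)
x = Conv1D(100, 3, activation='relu')(x)
# collapse the time dimension (77 steps), keeping the 100 filters -> (None, 100)
x = GlobalMaxPooling1D()(x)
model_audio = Model(inputs=inputs_au, outputs=x)
model_audio.summary()  # last layer: (None, 100), ready to concatenate with the (None, 128) text branch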

Related

Where is the window size in Pytorch LSTM model incorporated?

I have built an LSTM model that takes input data with 3 features and a rolling window size of 18. My model has the layers shown in the code below. What I don't understand is how the rolling window size of 18 is incorporated into the model if the window size is never passed as an argument to the model. And if the model takes input as just one row at a time, is that not equivalent to using a window size of 1?
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMnetwork(nn.Module):
    def __init__(self, input_size=3, hidden_size1=24, hidden_size2=50, hidden_size3=20, output_size=1):
        super().__init__()
        self.hidden_size1 = hidden_size1
        self.hidden_size2 = hidden_size2
        self.hidden_size3 = hidden_size3
        # Add an LSTM and dropout layer:
        self.lstm1 = nn.LSTM(input_size, hidden_size1)
        self.dropout1 = nn.Dropout(p=0.2)
        # Add second LSTM and dropout layer:
        self.lstm2 = nn.LSTM(hidden_size1, hidden_size2)
        self.dropout2 = nn.Dropout(p=0.2)
        # Add a fully-connected layer:
        self.fc1 = nn.Linear(hidden_size2, hidden_size3)
        # Add a fully-connected layer:
        self.fc2 = nn.Linear(hidden_size3, output_size)
        # Initialize h0 and c0:
        self.hidden1 = (torch.zeros(1, 1, self.hidden_size1),
                        torch.zeros(1, 1, self.hidden_size1))
        # Initialize h1 and c1:
        self.hidden2 = (torch.zeros(1, 1, self.hidden_size2),
                        torch.zeros(1, 1, self.hidden_size2))

    def forward(self, seq):
        lstm1_out, self.hidden1 = self.lstm1(seq.view(len(seq), 1, -1), self.hidden1)
        dropout1 = self.dropout1(lstm1_out)
        lstm2_out, self.hidden2 = self.lstm2(dropout1.view(len(dropout1), 1, -1), self.hidden2)
        dropout2 = self.dropout2(lstm2_out)
        fc1_out = F.relu(self.fc1(dropout2))
        fc2_out = self.fc2(fc1_out)
        return fc2_out[-1]
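A hedged reading of the code above, since the question has no answer attached here: the window size is never a constructor argument in PyTorch; it is simply the length of the sequence tensor passed to forward(). The call seq.view(len(seq), 1, -1) turns an (18, 3) window into 18 timesteps of a batch of size 1, and nn.LSTM iterates over that first dimension internally, so the whole window is processed rather than one row at a time. For example:

model = LSTMnetwork()
window = torch.randn(18, 3)   # one rolling window: 18 timesteps, 3 features
out = model(window)           # lstm1 receives shape (18, 1, 3) and unrolls over 18 steps
print(out.shape)              # torch.Size([1, 1]) -- prediction taken from the last timestep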

Fine-tuning a custom Keras model

I have a Keras model which is trained on 5 classes. The final layers of the model look like this:
dr_steps = Dropout(0.25)(Dense(128, activation = 'relu')(gap_dr))
out_layer = Dense(5, activation = 'softmax')(dr_steps)
model = Model(inputs = [in_lay], outputs = [out_layer])
What I want to do is fine-tune this model on an 8-class multilabel problem, but I am not sure how to achieve this. This is what I have tried:
dr_steps = Dropout(0.25)(Dense(128, activation = 'relu')(gap_dr))
out_layer = Dense(t_y.shape[-1], activation = 'softmax')(dr_steps)
model = Model(inputs = [in_lay], outputs = [out_layer])
weights_path = 'weights.best.hdf5'
retina_model.load_weights(weights_path)
model.layers.pop()
output = Dense(8, activation = 'sigmoid')(model.layers[-1].output)
model = Model(inputs = [in_lay], outputs = [output])
loss = 'binary_crossentropy'
model.compile(optimizer=RAdam(), loss=FocalLoss,
              metrics=["binary_accuracy", precision, recall, auc])
but this will raise an error like this
raise ValueError(str(e))
ValueError: Dimension 1 in both shapes must be equal, but are 8 and 5. Shapes are [128,8] and [128,5]. for 'Assign_390' (op: 'Assign') with input shapes: [128,8], [128,5].
Any suggestions on how to fine-tune this model would be very helpful. Thanks in advance.
Here,
model = Model(inputs = [in_lay], outputs = [out_layer])
weights_path = 'weights.best.hdf5'
this out_layer should have the same dimension (5 classes) as described inside weights.best.hdf5.
So t_y.shape[-1] should be 5, not 8, at the moment you call load_weights; the new 8-unit head can only be attached after the weights are loaded.
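A minimal sketch of the order of operations implied by that answer, assuming in_lay and gap_dr are the input and pooling tensors of the original 5-class network (the optimizer and loss below are standard stand-ins; the question's RAdam and FocalLoss can be dropped back in):

# rebuild the original 5-class head so the layer shapes match weights.best.hdf5
dr_steps = Dropout(0.25)(Dense(128, activation='relu')(gap_dr))
out_layer = Dense(5, activation='softmax')(dr_steps)
model = Model(inputs=[in_lay], outputs=[out_layer])
model.load_weights('weights.best.hdf5')

# only now replace the head: branch off the 128-unit layer and add an 8-unit sigmoid output
new_output = Dense(8, activation='sigmoid')(dr_steps)
finetune_model = Model(inputs=[in_lay], outputs=[new_output])
finetune_model.compile(optimizer='adam', loss='binary_crossentropy',
                       metrics=['binary_accuracy'])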

Creating Variable Length Output for RNN in Keras

I'm trying to convert a sequence of length N to a sequence of around length N^2 using a pseudo seq2seq-type model, but I'm not sure how to implement the variable input length in my Keras model.
from keras.layers import Input, LSTM, Bidirectional
from keras.models import Model

def LSTMModel():
    input = Input(shape=(None, num_channels))
    lstm_one = LSTM(75, return_sequences=True)
    lstm_one_output = lstm_one(input)
    BiLSTM = Bidirectional(LSTM(units=100, return_sequences=True, recurrent_dropout=0.1))
    LSTM_outputs = BiLSTM(lstm_one_output)
    output = LSTM(2, return_sequences=False)(LSTM_outputs)
    return Model(input, output)
This code produces a (None, 2) output, but I really want it to be a (None, None^2) output. Is there any way to store the shape within the model and do some operations with it using Keras layers, perhaps with a Lambda function?
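One hedged idea, assuming "around length N^2" means an N*N-length vector per sample: keep return_sequences=True so the time dimension N survives to the last layer, then use a Lambda that combines every pair of timesteps and flattens the result. LSTMModelN2 and pairwise_flatten are hypothetical names used only for this sketch:

from keras import backend as K
from keras.layers import Input, LSTM, Bidirectional, Lambda
from keras.models import Model

def LSTMModelN2(num_channels):
    inp = Input(shape=(None, num_channels))
    x = LSTM(75, return_sequences=True)(inp)
    x = Bidirectional(LSTM(units=100, return_sequences=True, recurrent_dropout=0.1))(x)
    x = LSTM(2, return_sequences=True)(x)          # shape (batch, N, 2)

    def pairwise_flatten(t):
        # contract the feature axis of every pair of timesteps -> (batch, N, N)
        scores = K.batch_dot(t, t, axes=[2, 2])
        # flatten everything except the batch dimension -> (batch, N*N)
        return K.batch_flatten(scores)

    out = Lambda(pairwise_flatten)(x)
    return Model(inp, out)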

ValueError: Error when checking target: expected fc1000 to have shape (30,) but got array with shape (1,)

I was trying to retrain a ResNet50 model to classify given images of animals into 30 different classes. To do this, I made a list containing arrays of the given images, each of dimension (1, 224, 224, 3) after expanding dimensions and preprocessing; the shape of the list (after converting it to a numpy array) was therefore (300, 1, 224, 224, 3), as I initially took only 300 images. For Ytrain, I label-encoded the classes and then one-hot encoded them, giving a numpy array of dimension (300, 30) for the 30 classes. Then I used a DataGenerator with model.fit_generator, passing Xtrain of shape (1, 224, 224, 3) and Ytrain of shape (30,), but got the error:
ValueError: Error when checking target: expected fc1000 to have shape (30,) but got array with shape (1,)
Here is my code:-
import numpy as np
import pandas as pd
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import imagenet_utils
from keras.applications.resnet50 import ResNet50
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

inputShape = (224, 224)
preprocess = imagenet_utils.preprocess_input
df = pd.read_csv('DLBeginner/meta-data/train.csv')
df = df.head(300)
imagesData, target = [], []
c = 0
for images in df['Image_id']:
    filename = args["target"] + '/' + images
    image = load_img(filename, target_size=inputShape)
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    image = preprocess(image)
    imagesData.append(image)
    c += 1
    print('Count = {}, Image > {} '.format(c, images))
imagesData = np.array(imagesData)

labelEncoder = LabelEncoder()
series = df['Animal'][0:300]
integerEncoded = labelEncoder.fit_transform(series)
Hot = OneHotEncoder(sparse=False)
integerEncoded = integerEncoded.reshape(len(integerEncoded), 1)
oneHot = Hot.fit_transform(integerEncoded)

model = ResNet50(include_top=True, classes=30, weights=None)
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])

l = len(imagesData)

def DataGenerator(Xtrain, Ytrain):
    while True:
        for i in range(l):
            arr1 = Xtrain[i]
            arr2 = Ytrain[i]
            print("arr1.shape : {}".format(arr1.shape))
            print("arr2.shape : {}".format(arr2.shape))
            yield (arr1, arr2)
and here is the "fitting part"
generator = DataGenerator(imagesData, oneHot)
model.fit_generator(generator = generator, epochs = 5, steps_per_epoch=l)
Where am I going wrong?
Thanks in advance.
Switching from 'categorical_crossentropy' to 'sparse_categorical_crossentropy' solved it for me.
Just want to add a little more detail.
When you have a multi-class classification problem:
(1) if your targets are one-hot encoded, use categorical_crossentropy;
(2) if your targets are integers, as in the MNIST example, use sparse_categorical_crossentropy. When you use this, TensorFlow converts the integer labels to a one-hot representation under the hood and then classifies the data.
Hope that helps. Thanks!
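A small illustration of the difference, with hypothetical labels for the 30 classes in the question:

import numpy as np

num_classes = 30

# categorical_crossentropy expects one-hot targets, shape (batch, 30)
y_onehot = np.zeros((4, num_classes))
y_onehot[np.arange(4), [3, 7, 0, 29]] = 1.0
# model.compile(loss='categorical_crossentropy', ...)

# sparse_categorical_crossentropy expects integer targets, shape (batch,)
y_int = np.array([3, 7, 0, 29])
# model.compile(loss='sparse_categorical_crossentropy', ...)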

Neural net with non standard input

I want to make a neural net which takes image + image + value as input, performs convolution + pooling on the images, and then applies a linear transform on the results. Can I do that in Keras?
This is architecturally similar to Craig Li's answer, but it uses the (image, image, value) format and a vanilla CNN instead of VGG16. These are three separate branches whose outputs are concatenated after being processed individually, and the resulting concatenated vector is passed through the final layers, so those layers see information from all inputs.
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          Dropout, LeakyReLU, concatenate)
from keras.models import Model

input_1 = Input(data_1.shape[1:], name='input_1')
conv_branch_1 = Conv2D(filters, (kernel_size, kernel_size),
                       activation=LeakyReLU())(input_1)
conv_branch_1 = MaxPooling2D(pool_size=(2, 2))(conv_branch_1)
conv_branch_1 = Flatten()(conv_branch_1)

input_2 = Input(data_2.shape[1:], name='input_2')
conv_branch_2 = Conv2D(filters, (kernel_size, kernel_size),
                       activation=LeakyReLU())(input_2)
conv_branch_2 = MaxPooling2D(pool_size=(2, 2))(conv_branch_2)
conv_branch_2 = Flatten()(conv_branch_2)

value_input = Input(value_data.shape[1:], name='value_input')
fc_branch = Dense(80, activation=LeakyReLU())(value_input)

merged_branches = concatenate([conv_branch_1, conv_branch_2, fc_branch])
merged_branches = Dense(60, activation=LeakyReLU())(merged_branches)
merged_branches = Dropout(0.25)(merged_branches)
merged_branches = Dense(30, activation=LeakyReLU())(merged_branches)
merged_branches = Dense(1, activation='sigmoid')(merged_branches)

model = Model(inputs=[input_1, input_2, value_input], outputs=[merged_branches])
# if binary classification do this, otherwise whatever loss you need
model.compile(optimizer='adam', loss='binary_crossentropy')
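To train it, pass the three arrays in the same order as the model's inputs; labels here is a hypothetical array of binary targets:

model.fit([data_1, data_2, value_data], labels, epochs=10, batch_size=32)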
Suppose your images are RGB, so each has shape (width, height, 3); you can combine two images with numpy like this:
import numpy as np
from PIL import Image

img1 = Image.open('image1.jpg')
img2 = Image.open('image2.jpg')
img1 = img1.resize((width, height))
img2 = img2.resize((width, height))
img1_arr = np.asarray(img1, dtype='int32')
img2_arr = np.asarray(img2, dtype='int32')
# shape of img_arr is (width, height, 6)
img_arr = np.concatenate((img1_arr, img2_arr), axis=2)
Combining two images in this way only increases the number of channels, so we can still do convolution over the first two axes.
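The stacked 6-channel array can then feed an ordinary single-input CNN; a minimal sketch (the filter count here is arbitrary):

from keras.layers import Input, Conv2D
inputs = Input(shape=img_arr.shape)               # the 6-channel stacked image from above
x = Conv2D(32, (3, 3), activation='relu')(inputs)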
UPDATE:
I guess you mean a multi-task model, where you want to merge the two images after convolution; Keras has concatenate() for that.
from keras.layers import Input, Flatten, Dense, concatenate
from keras.models import Model
from keras.applications.vgg16 import VGG16

input_tensor1 = Input(shape=(channels, img_width, img_height))
input_tensor2 = Input(shape=(channels, img_width, img_height))
# Task1 on image1
conv_model1 = VGG16(input_tensor=input_tensor1, weights=None, include_top=False,
                    input_shape=(channels, img_width, img_height))
flatten1 = Flatten()(conv_model1.output)
# Task2 on image2 (if Keras complains about duplicate layer names, give the
# second VGG16's layers unique names before building the merged model)
conv_model2 = VGG16(input_tensor=input_tensor2, weights=None, include_top=False,
                    input_shape=(channels, img_width, img_height))
flatten2 = Flatten()(conv_model2.output)
# Merge the flattened outputs
merged = concatenate([flatten1, flatten2], axis=1)
# add some Dense layers and Dropout here if needed, then the classifier
merged = Dense(classes, activation='softmax')(merged)
final_model = Model(inputs=[input_tensor1, input_tensor2], outputs=merged)
