I want to make a neural net which takes image+image+value as an input and performs convolution+pooling on images and then a linear transform on results. Can I do that in keras?
This is architecturally similar to Craig Li's answer but is in the image, image, value format and does not use VGG16 and just a vanilla CNN. These are 3 separate networks whose outputs are concatenated after being processed individually and the resulting concatenated vector is passed through the final layers, including information from all inputs.
input_1 = Input(data_1.shape[1:], name = 'input_1')
conv_branch_1 = Conv2D(filters, (kernel_size, kernel_size),
activation = LeakyReLU())(conv_branch_1)
conv_branch_1 = MaxPooling2D(pool_size = (2,2))(conv_branch_1)
conv_branch_1 = Flatten()(conv_branch_1)
input_2 = Input(data_2.shape[1:], name = 'input_2')
conv_branch_2 = Conv2D(filters, (kernel_size, kernel_size),
activation = LeakyReLU())(conv_branch_2)
conv_branch_2 = MaxPooling2D(pool_size = (2,2))(conv_branch_2)
conv_branch_2 = Flatten()(conv_branch_2)
value_input = Input(value_data.shape[1:], name = 'value_input')
fc_branch = Dense(80, activation=LeakyReLU())(value_input)
merged_branches = concatenate([conv_branch_1, conv_branch_2, fc_branch])
merged_branches = Dense(60, activation=LeakyReLU())(merged_branches)
merged_branches = Dropout(0.25)(merged_branches)
merged_branches = Dense(30, activation=LeakyReLU())(merged_branches)
merged_branches = Dense(1, activation='sigmoid')(merged_branches)
model = Model(inputs=[input_1, input_2, value_input], outputs=[merged_branches])
#if binary classification do this otherwise whatever loss you need
model.compile(loss='binary_crossentropy')
Suppose your image is RGB type, the shape of the image is (width,height,3), you can combine two images with numpy like:
import numpy as np
from PIL import Image
img1 = Image.open('image1.jpg')
img2 = Image.open('imgae2.jpg')
img1 = img1.resize((width,height))
img2 = img2.resize((width,height))
img1_arr = np.asarray(img1,dtype='int32')
img2_arr = np.asarray(img2,dtype='int32')
#shape of img_arr is (width,height,6)
img_arr = np.concatenate((img1_arr,img2_arr),axis=2)
Combine two images in this way, we only increase the channels, so we can still do convolution on the first two axis.
UPDATE:
I guess you mean Multi-Task Model, you want to merge two images after convolution,Keras has concatenate() can do that.
input_tensor = Input(shape=(channels, img_width, img_height))
# Task1 on image1
conv_model1 = VGG16(input_tensor=input_tensor, weights=None, include_top=False, classes=classes,
input_shape=(channels, img_width, img_height))
conv_output1 = conv_model1.output
flatten1 = Flatten()(conv_output1)
# Task2 on image2
conv_model2 = VGG16(input_tensor=input_tensor, weights=None, include_top=False, classes=classes,
input_shape=(channels, img_width, img_height))
conv_output2 = conv_model2.output
flatten2 = Flatten()(conv_output2)
# Merge the output
merged = concatenate([conv_output1, conv_output2], axis=1)
merged = Dense(classes,activation='softmax')(merged)
# add some Dense layers and Dropout,
final_model = Model(inputs=[input_tensor,input_tensor],outputs=merged)
Related
I am working on one deep learning model where I am trying to combine two different model's output :
The overall structure is like this :
So the first model takes one matrix, for example [ 10 x 30 ]
#input 1
input_text = layers.Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)
model_a = Model(inputs = [input_text] , outputs=embedding)
# shape : [10,50]
Now the second model takes two input matrix :
X_in = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,32])))
M_in = layers.Input(tensor=K.variable(np.random.uniform(1,-1,[10,10]))
md_1 = New_model()([X_in, M_in]) #new_model defined somewhere
model_s = Model(inputs = [X_in, A_in], outputs = md_1)
# shape : [10,50]
I want to make these two matrices trainable like in TensorFlow I was able to do this by :
matrix_a = tf.get_variable(name='matrix_a',
shape=[10,10],
dtype=tf.float32,
initializer=tf.constant_initializer(np.array(matrix_a)),trainable=True)
I am not getting any clue how to make those matrix_a and matrix_b trainable and how to merge the output of both networks then give input.
I went through this question But couldn't find an answer because their problem statement is different from mine.
What I have tried so far is :
#input 1
input_text = layers.Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)
model_a = Model(inputs = [input_text] , outputs=embedding)
# shape : [10,50]
X_in = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,10])))
M_in = layers.Input(tensor=K.variable(np.random.uniform(1,-1,[10,100]))
md_1 = New_model()([X_in, M_in]) #new_model defined somewhere
model_s = Model(inputs = [X_in, A_in], outputs = md_1)
# [10,50]
#tranpose second model output
tranpose = Lambda(lambda x: K.transpose(x))
agglayer = tranpose(md_1)
# concat first and second model output
dott = Lambda(lambda x: K.dot(x[0],x[1]))
kmean_layer = dotter([embedding,agglayer])
# input
final_model = Model(inputs=[input_text, X_in, M_in], outputs=kmean_layer,name='Final_output')
final_model.compile(loss = 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
final_model.summary()
Overview of the model :
Update:
Model b
X = np.random.uniform(0,9,[10,32])
M = np.random.uniform(1,-1,[10,10])
X_in = layers.Input(tensor=K.variable(X))
M_in = layers.Input(tensor=K.variable(M))
layer_one = Model_b()([M_in, X_in])
dropout2 = Dropout(dropout_rate)(layer_one)
layer_two = Model_b()([layer_one, X_in])
model_b_ = Model([X_in, M_in], layer_two, name='model_b')
model a
length = 150
dic_size = 100
embed_size = 12
input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)
embedding = LSTM(5)(embedding)
embedding = Dense(10)(embedding)
model_a = Model(input_text, embedding, name = 'model_a')
I am merging like this:
mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, model_b_.output])
final_model = Model(inputs=[model_b_.input[0],model_b_.input[1],model_a.input], outputs=mult)
Is it right way to matmul two keras model?
I don't know if I am merging the output correctly and the model is correct.
I would greatly appreciate it if anyone kindly gives me some advice on how should I make that matrix trainable and how to merge the model's output correctly then give input.
Thanks in advance!
Trainable weights
Ok. Since you are going to have custom trainable weights, the way to do this in Keras is creating a custom layer.
Now, since your custom layer has no inputs, we will need a hack that will be explained later.
So, this is the layer definition for the custom weights:
from keras.layers import *
from keras.models import Model
from keras.initializers import get as get_init, serialize as serial_init
import keras.backend as K
import tensorflow as tf
class TrainableWeights(Layer):
#you can pass keras initializers when creating this layer
#kwargs will take base layer arguments, such as name and others if you want
def __init__(self, shape, initializer='uniform', **kwargs):
super(TrainableWeights, self).__init__(**kwargs)
self.shape = shape
self.initializer = get_init(initializer)
#build is where you define the weights of the layer
def build(self, input_shape):
self.kernel = self.add_weight(name='kernel',
shape=self.shape,
initializer=self.initializer,
trainable=True)
self.built = True
#call is the layer operation - due to keras limitation, we need an input
#warning, I'm supposing the input is a tensor with value 1 and no shape or shape (1,)
def call(self, x):
return x * self.kernel
#for keras to build the summary properly
def compute_output_shape(self, input_shape):
return self.shape
#only needed for saving/loading this layer in model.save()
def get_config(self):
config = {'shape': self.shape, 'initializer': serial_init(self.initializer)}
base_config = super(TrainableWeights, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
Now, this layer should be used like this:
dummyInputs = Input(tensor=K.constant([1]))
trainableWeights = TrainableWeights(shape)(dummyInputs)
Model A
Having the layer defined, we can start modeling.
First, let's see the model_a side:
#general vars
length = 150
dic_size = 100
embed_size = 12
#for the model_a segment
input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)
#the following two lines are just a resource to reach the desired shape
embedding = LSTM(5)(embedding)
embedding = Dense(50)(embedding)
#creating model_a here is optional, only if you want to use model_a independently later
model_a = Model(input_text, embedding, name = 'model_a')
Model B
For this, we are going to use our TrainableWeights layer.
But first, let's simulate a New_model() as mentioned.
#simulates New_model() #notice the explicit batch_shape for the matrices
newIn1 = Input(batch_shape = (10,10))
newIn2 = Input(batch_shape = (10,30))
newOut1 = Dense(50)(newIn1)
newOut2 = Dense(50)(newIn2)
newOut = Add()([newOut1, newOut2])
new_model = Model([newIn1, newIn2], newOut, name='new_model')
Now the entire branch:
#the matrices
dummyInput = Input(tensor = K.constant([1]))
X_in = TrainableWeights((10,10), initializer='uniform')(dummyInput)
M_in = TrainableWeights((10,30), initializer='uniform')(dummyInput)
#the output of the branch
md_1 = new_model([X_in, M_in])
#optional, only if you want to use model_s independently later
model_s = Model(dummyInput, md_1, name='model_s')
The whole model
Finally, we can join the branches in a whole model.
Notice how I didn't have to use model_a or model_s here. You can do it if you want, but those submodels are not needed, unless you want later to get them individually for other usages. (Even if you created them, you don't need to change the code below to use them, they're already part of the same graph)
#I prefer tf.matmul because it's clear and understandable while K.dot has weird behaviors
mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, md_1])
#final model
model = Model([input_text, dummyInput], mult, name='full_model')
Now train it:
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
model.fit(np.random.randint(0,dic_size, size=(128,length)),
np.ones((128, 10)))
Since the output is 2D now, there is no problem about the 'categorical_crossentropy', my comment was because of doubts on the output shape.
First I read pictures from folders and sub-folders. Then I’m changing images to gray and resizing to 100*200. I want to classify my images to 6 class.
When i want to create my model, I can't use Conv2D, cause I have dimension error but when I use Conv1D, I don’t have any error and neural network is work.
I want to use conv2D because my data is image. What is my problem?
#load train_images
path_spec_train = "/home/narges/dataset/train_24/"
spec_train = glob.glob(path_spec_train + "**/*.png")
spec_train.sort()
X_modify = []
width = 200
height = 100
for spec in spec_train:
specs = cv2.imread(spec)
specs = cv2.cvtColor(specs,cv2.COLOR_BGR2GRAY)
specs = cv2.resize(specs ,(width, height))
specs = specs / np.max(specs)
specs = specs.astype(np.float32)
X_modify.append(specs)
X_train = np.asarray(X_modify,dtype=np.float32)
#=======================================================
#load test_image
path_spec_test = "/home/narges/dataset/test_24/"
spec_test = glob.glob(path_spec_test + "**/*.png")
spec_test.sort()
X_modify_t = []
width = 200
height = 100
for spec_t in spec_test:
specs_test = cv2.imread(spec_t)
specs_test = cv2.cvtColor(specs_test,cv2.COLOR_BGR2GRAY)
specs_test = cv2.resize(specs_test ,(width, height))
specs_test = specs_test / np.max(specs_test)
specs_test = specs_test.astype(np.float32)
X_modify_t.append(specs_test)
X_test = np.asarray(X_modify_t,dtype=np.float32)
#======================================================================
#label
spk_ID = [wavs[i].split('/')[-1].lower() for i in range(number_of_files)]
spk_ID_t = [wavs_t[i].split('/')[-1].lower() for i in range(number_of_files_t)]
label_no = [spk_ID[i].split('_')[-2] for i in range(number_of_files)]
Y_train = np_utils.to_categorical(label_no)
label_no_t = [spk_ID_t[i].split('_')[-2] for i in range(number_of_files_t)]
Y_test = np_utils.to_categorical(label_no_t)
#======================================================================
# Create my model
myinput = layers.Input(shape=(100,200))
conv1 = layers.Conv1D(32, 3, activation='relu', padding='same', strides=2)(myinput)
conv2 = layers.Conv1D(64, 3, activation='relu', padding='same', strides=2)(conv1)
flat = layers.Flatten()(conv2)
out_layer = layers.Dense(6, activation='softmax')(flat)
mymodel = Model(myinput, out_layer)
mymodel.summary()
mymodel.compile(optimizer=keras.optimizers.Adam(), loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
network_history = mymodel.fit(X_train, Y_train, batch_size=128, epochs=100)
pred = np.round(mymodel.predict(X_test))
print(classification_report(Y_test, pred))
You need to add the channels dimension to your input training data:
import numpy as np
X_train = np.expand_dims(X_train, axis=3)
X_test = np.expand_Dims(X_test, axis=3)
Then your data will be 4D, for example, (3312, 100, 200, 1), which is a grayscale image (one channel).
I am implementing a siamese network in which i know how to calculate triplet loss by picking anchor, positive and negative by dividing input in three parts(which is a handcrafted feature vector) and then calculating it at time of training.
anchor_output = ... # shape [None, 128]
positive_output = ... # shape [None, 128]
negative_output = ... # shape [None, 128]
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)
loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
But the problem is when at time of testing i would be having only two files positive and negative then how i would deal with(triplets, as i need one more anchor file but my app only take one picture and compare with in database so only two files in this case), I searched a lot but nobody provided code to deal with this problem only there was code to implement triplet loss but not for whole scenario.
AND I DONT WANT TO USE CONTRASTIVE LOSS
Colab notebook with test code on CIFAR 10:
https://colab.research.google.com/drive/1VgOTzr_VZNHkXh2z9IiTAcEgg5qr19y0
The general idea:
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
img_width = 128
img_height = 128
img_colors = 3
margin = 1.0
VECTOR_SIZE = 32
def triplet_loss(y_true, y_pred):
""" y_true is a dummy value that should be ignored
Uses the inverse of the cosine similarity as a loss.
"""
anchor_vec = y_pred[:, :VECTOR_SIZE]
positive_vec = y_pred[:, VECTOR_SIZE:2*VECTOR_SIZE]
negative_vec = y_pred[:, 2*VECTOR_SIZE:]
d1 = keras.losses.cosine_proximity(anchor_vec, positive_vec)
d2 = keras.losses.cosine_proximity(anchor_vec, negative_vec)
return K.clip(d2 - d1 + margin, 0, None)
def make_image_model():
""" Build a convolutional model that generates a vector
"""
inp = Input(shape=(img_width, img_height, img_colors))
l1 = Conv2D(8, (2, 2))(inp)
l1 = MaxPooling2D()(l1)
l2 = Conv2D(16, (2, 2))(l1)
l2 = MaxPooling2D()(l2)
l3 = Conv2D(16, (2, 2))(l2)
l3 = MaxPooling2D()(l3)
conv_out = Flatten()(l3)
out = Dense(VECTOR_SIZE)(conv_out)
model = Model(inp, out)
return model
def make_siamese_model(img_model):
""" Siamese model input are 3 images base, positive, negative
output is a dummy variable that is ignored for the purposes of loss
calculation.
"""
anchor = Input(shape=(img_width, img_height, img_colors))
positive = Input(shape=(img_width, img_height, img_colors))
negative = Input(shape=(img_width, img_height, img_colors))
anchor_vec = img_model(anchor)
positive_vec = img_model(positive)
negative_vec = img_model(negative)
vecs = Concatenate(axis=1)([anchor_vec, positive_vec, negative_vec])
model = Model([anchor, positive, negative], vecs)
model.compile('adam', triplet_loss)
return model
img_model = make_image_model()
train_model = make_siamese_model(img_model)
img_model.summary()
train_model.summary()
###
train_model.fit(X, dummy_y, ...)
img_model.save('image_model.h5')
###
# In order to use the model
vec_base = img_model.predict(base_image)
vec_test = img_model.predict(test_image)
compare cosine similarity of vec_base and vec_test in order to determine whether base and test are within the acceptable criteria.
I'm trying to run a simple example with a NN using the MNIST dataset provided by tensorflow itself, running on Google Colab. I want to get the raw data and mount by myself the structure that has the data. I'm able to train the NN, but when I try to predict one example from the test set, I get the error
ValueError: Error when checking input: expected dense_input to have shape (784,) but got array with shape (1,).
Could somebody help me with this issue? I'm pretty new to Python and Keras/TensorFlow.
When I run
print(inp.shape)
I get (784,) and not the (1,) as the error says.
I have also tried to evaluate the test set using
test_loss, test_accuracy = model.evaluate(test_input.T)
, but I also get the error
ValueError: Arguments and signature arguments do not match: 25 27.
The source code is the following:
# Importing stuff
import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
import numpy as np
import math
import time
import keras
tf.enable_eager_execution()
# Functions
def normalize(images, labels):
images = tf.cast(images, tf.float32)
images /= 255
return images, labels
# Getting dataset
ds, meta = tfds.load('fashion_mnist', as_supervised=True, with_info=True)
test_ds, train_ds = ds['test'], ds['train']
# Preprocess the data
train_ds = train_ds.map(normalize)
test_ds = test_ds.map(normalize)
num_train_examples = meta.splits['train'].num_examples
num_test_examples = meta.splits['test'].num_examples
# Making the train set
train_input = np.empty(shape=(784, num_train_examples))
train_label = np.empty(shape=(1, num_train_examples))
i = 0
for image, label in train_ds:
image = image.numpy().reshape((784, 1))
train_input[:, i] = image.ravel()
label = label.numpy().reshape(1)
train_label[:, i] = label
i = i + 1;
# Making the test set
test_input = np.empty(shape=(784, num_test_examples))
test_label = np.empty(shape=(1, num_test_examples))
i = 0
for image, label in test_ds:
image = image.numpy().reshape((784, 1))
test_input[:, i] = image.ravel()
label = label.numpy().reshape(1)
test_label[:, i] = label
i = i + 1;
# Network
input_layer = tf.keras.layers.Dense(units=784, input_shape=[784])
h1 = tf.keras.layers.Dense(128, activation=tf.nn.relu)
output_layer = tf.keras.layers.Dense(10, activation=tf.nn.softmax)
model = tf.keras.Sequential([input_layer, h1, output_layer])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_input.T, train_label.T, epochs=3, steps_per_epoch=100, batch_size=1)
test_loss, test_accuracy = model.evaluate(test_input.T)
inp = test_input[:, 0].T
res = model.predict(inp)
All the API functions expect an input shape containing the batch size first. In your case, you're trying to feed only 1 example and thus no batch size is given. You just need to specify your batch size as 1 by reshaping the data.
Using numpy:
res = model.predict(np.reshape(inp, len(inp))
The argument to the predict method now receives the array with shape (1, 784) in your case, specifying the batch size as 1.
When you give the function more examples to evaluate by stacking them on top of one another, the batch size is implicitly given by the shape of the array, so no further transformation is necessary.
I was trying to retrain ResNet50 model to classify given images of animals into 30 different classes. To do this, I made a list containing arrays of given images of dimension(after expanding dimensions and preprocessing it):- (1, 224, 224, 3), thereby the shape of given list(after converting it to numpy array) was (300, 1, 224, 224, 3), as initially i took only 300 images. For Ytrain, I Label encoded the classes and one hot encoded the afterwards. For 30 classes, I had an numpy array of dimension (300, 30). Then I used DataGenerator for model.fit_generator, passing Xtrain of shape (1, 224, 224, 3) and Ytrain of shape (30, ), But got the error:-
ValueError: Error when checking target: expected fc1000 to have shape (30,) but got array with shape (1,)
Here is my code:-
inputShape = (224, 224)
preprocess = imagenet_utils.preprocess_input
df = pd.read_csv('DLBeginner/meta-data/train.csv')
df = df.head(300)
imagesData, target = [], []
c = 0
for images in df['Image_id']:
filename = args["target"] + '/' + images
image = load_img(filename, target_size = inputShape)
image = img_to_array(image)
image = np.expand_dims(image, axis = 0)
image = preprocess(image)
imagesData.append(image)
c += 1
print('Count = {}, Image > {} '.format(c, images))
imagesData = np.array(imagesData)
labelEncoder = LabelEncoder()
series = df['Animal'][0:300]
integerEncoded = labelEncoder.fit_transform(series)
Hot = OneHotEncoder(sparse = False)
integerEncoded = integerEncoded.reshape(len(integerEncoded), 1)
oneHot = Hot.fit_transform(integerEncoded)
model = ResNet50(include_top = True, classes = 30, weights = None)
model.compile(optimizer = 'Adam', loss='categorical_crossentropy', metrics = ['accuracy'])
l = len(imagesData)
def DataGenerator(Xtrain, Ytrain):
while(True):
for i in range(l):
arr1 = Xtrain[i]
arr2 = Ytrain[i]
print("arr1.shape : {}".format(arr1.shape))
print("arr2.shape : {}".format(arr2.shape))
yield(arr1, arr2)
and here is the "fitting part"
generator = DataGenerator(imagesData, oneHot)
model.fit_generator(generator = generator, epochs = 5, steps_per_epoch=l)
Where am I going wrong?
Thanks in advance.
Switching from 'categorical_crossentropy' to 'sparse_categorical_crossentropy' solved it for me.
Just want to add little more details.
When you have multi-class classification problem and
(1) if your targets are one-hot encoded then use categorical_crossentropy
(2) if your targets are integers as in MNIST example, use sparse_categorical_crossentropy. When you use this, Tensorflow under the hood, it will convert data into one-hot encoded and classify the data.
Hope that helps. Thanks!