I am trying to approach a multi-label image classification problem for which I have image data, but I also have some other features, like gender. The issue is that I will not get this information during testing; in other words, during testing only the image will be provided.
My question is: how can I use these extra features to help my image model, which is a convolutional neural network, even though I won't have this information during testing?
Any advice will be helpful. Thanks in advance.
This is a really open-ended question, but I can give you some general guidelines on how this can work.
The Keras Model (functional) API supports multiple inputs as well as merge layers. For example, you can have something like this:
from keras.layers import Input
from keras.layers.merge import Concatenate
from keras.models import Model

image = Input(...)
text = Input(...)
... # apply layers onto image and text

combined = Concatenate()([image, text])
... # apply layers onto combined

model = Model([image, text], [combined])
This way you can have a model that takes multiple inputs and makes use of all of your data sources; Keras has the tools to combine your different inputs and produce one output. The part where this becomes open ended is the architecture.
Right now you should probably pass image through a CNN and then merge that output with text. You will have to tweak the exact specifications, such as how you handle each input, which merge method you use, and how you handle the combined output.
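To make this concrete, here is a minimal runnable sketch of such a two-branch model. The input shapes, layer sizes, and the 10-label sigmoid output are illustrative assumptions, not a recommendation:

from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Concatenate
from keras.models import Model

# Image branch: a tiny CNN (all shapes and sizes are assumed for illustration)
image_in = Input(shape=(64, 64, 3))
x = Conv2D(32, (3, 3), activation='relu')(image_in)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)

# Extra-feature branch: e.g. gender and other tabular features (4 values assumed)
extra_in = Input(shape=(4,))
y = Dense(8, activation='relu')(extra_in)

# Merge both branches and map to a multi-label output (10 labels assumed)
combined = Concatenate()([x, y])
out = Dense(10, activation='sigmoid')(combined)

model = Model(inputs=[image_in, extra_in], outputs=out)
model.compile(optimizer='adam', loss='binary_crossentropy')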
A good example of merge being used is here, where a GAN is given latent noise in the form of an image but also a label that determines what kind of image it should generate. Both the discriminator and the generator use the Multiply merge layer to combine their inputs.
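A sketch of that conditional-GAN pattern; latent_dim and num_classes are assumed values, not taken from any specific repository:

from keras.layers import Input, Embedding, Flatten, Multiply

latent_dim, num_classes = 100, 10
noise = Input(shape=(latent_dim,))
label = Input(shape=(1,), dtype='int32')
label_embedding = Flatten()(Embedding(num_classes, latent_dim)(label))
merged = Multiply()([noise, label_embedding])  # element-wise product of the two inputs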
I'm trying to convert a PyTorch model to Core ML. The model was based on yolov5.
Here is a Netron view of our exported Core ML model.
Currently, the architecture has 3 outputs. You can see one of the outputs in the screenshot, number '740'.
However, we want a different output from Core ML. We need the output before the reshapeStatic and transpose layers. So, in this image, you can see that we need the last convolution layer instead of 740.
Those reshapeStatic and transpose layers were added by the process that converts the net to Core ML; they are not organic layers of yolov5.
Is there any way we can do the conversion to Core ML differently in order to have more control over which layers are output? For example, can we control the output layers in the sample code below?
model = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input_1", shape=example_input.shape)],
    classifier_config=ct.ClassifierConfig(class_labels),
)
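(For what it's worth, one direction we have considered, sketched below, is to wrap the PyTorch model so the traced module already ends at the convolution layers before conversion. This is untested, and both torch_model and the conv_outputs method are hypothetical stand-ins, not real yolov5 names.)

import torch

class RawConvWrapper(torch.nn.Module):
    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, x):
        # Placeholder: `conv_outputs` is NOT a real yolov5 method, just a
        # stand-in for whatever produces the last convolution tensors.
        return self.net.conv_outputs(x)

traced_model = torch.jit.trace(RawConvWrapper(torch_model), example_input)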
Alternatively, is there a way to choose at runtime which values to pull out of the Core ML model? For example, is there a way to specify, in the code below, which layers we want to output?
img = load_image(img_path, resize_to=(img_size, img_size))
# Can we specify here which layers to output?
coreml_out_dict = model.predict({'image': img})
Thanks!
My question is simple: I want to visualize which filters a ConvNet uses in its deep layers to extract the features that lead to the final prediction. By visualize I mean saving them in .png format, like the final-layer filters shown in https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/03/cnn_filters.png , where we can actually see a car in the final-layer filters.
I can visualise the filters of the first convolutional layer with the help provided in my own question, Visualising Keras CNN final trained filters at each layer, but that only shows how to visualise the first layer. The first-layer filters look like random coloured 3x3 pixel images, but I want to see the final-layer filters, like the car filter in the first link.
Even the article with the car filter, https://www.analyticsvidhya.com/blog/2018/03/essentials-of-deep-learning-visualizing-convolutional-neural-networks/ , only has code for the first layer.
The Python library keras-vis is a great tool for visualizing CNNs. It can generate conv filter visualizations, dense layer visualizations, and attention maps. The latest release is quite old (and a little bit buggy), so I recommend installing from master:
pip install git+https://github.com/raghakot/keras-vis.git
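For example, visualizing a deep conv filter via activation maximization might look like this. A minimal sketch: it assumes model is your trained Keras CNN, and the layer name 'block5_conv3' is an assumption you should replace with a layer from your own model:

from vis.visualization import visualize_activation
from vis.utils import utils
import matplotlib.pyplot as plt

# Look up the layer by name, then generate the input image that
# maximally activates filter 0 of that layer.
layer_idx = utils.find_layer_idx(model, 'block5_conv3')
img = visualize_activation(model, layer_idx, filter_indices=0)
plt.imsave('block5_conv3_filter0.png', img.squeeze())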
You can access the weights of the different layers with:
w = model.layers[i].get_weights()[0]  # for a Conv2D layer: shape (rows, cols, in_channels, out_channels)
where i is the index of your layer.
In the case of the picture in the link, I am not sure whether it actually shows the weights or the activation map. You could get the activation map with:
from keras import backend as K
get_output = K.function([model.layers[0].input], [model.layers[i].output])
output_normal = get_output([X])[0][m]
where m is the index of a particular image in the input batch X.
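To save those activation maps in .png format as the question asks, a simple sketch, assuming output_normal has shape (rows, cols, channels):

import matplotlib.pyplot as plt

# Save the first few channels of the activation map as .png images
for c in range(min(16, output_normal.shape[-1])):
    plt.imsave('layer{}_channel{}.png'.format(i, c),
               output_normal[:, :, c], cmap='viridis')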
I have a question regarding the implementation of a custom loss function for my neural network.
I am currently trying to segment cells for a project, and I decided to use a U-Net, as it seems to work quite well. In order to improve my current model, I decided to follow the idea of the original U-Net paper (https://arxiv.org/abs/1505.04597), where the authors implemented a weight map that assigns more weight to pixels located between tightly associated cells, as you can see in this picture: Example of a weight map.
I am currently using Keras for my U-Net, and my problem is that I do not know how to pass my weights to my model without creating any problems. My idea was to create a generator yielding the images together with a 2-channel array containing the labels in the first channel and the weights in the second channel; that way I can easily extract my weights and my labels in my custom loss function.
My code looks like this:
def train_generator(image_generator, label_generator, weight_generator):
    for (img, label, weight) in zip(image_generator, label_generator, weight_generator):
        img, label = adjustData(img, True, label)
        # Stack labels and weights along the channel axis
        label_weights = np.concatenate((label, weight), axis=3)
        # This is the final generator output
        yield (img, label_weights)
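The custom loss function I have in mind would then split the two channels back apart, roughly like this (a sketch, assuming binary cross-entropy as the base loss):

from keras import backend as K

def weighted_binary_crossentropy(y_true_and_weights, y_pred):
    # Channel 0 holds the labels, channel 1 holds the pixel weights
    y_true = y_true_and_weights[..., 0:1]
    weights = y_true_and_weights[..., 1:2]
    bce = K.binary_crossentropy(y_true, y_pred)
    return K.mean(bce * weights)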
As you can see, I build the train_generator from three previously constructed generators, adjust a few things, and then yield my images together with the combined labels and weights.
Then, when I try to fit my model with fit_generator, I get this error: ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays.
I really do not know what to do or how to correctly implement what I want.
Thank you in advance for your answers.
I am doing transfer learning/retraining using the TensorFlow Inception V3 model. I have 6 labels. A given image can be of one single type only, i.e. no multi-label detection is needed. I have three queries:
Which activation function is best for my case? Presently, the retrain.py file provided by TensorFlow uses softmax. What other methods are available (like sigmoid, etc.)?
Which optimiser function should I use? (GradientDescent, Adam, etc.)
I want to identify out-of-scope images, i.e. if a user inputs a random image, my algorithm should say that it does not belong to the described classes. Presently, with 6 classes, it confidently outputs one of those classes, which I do not want. What are possible solutions for this?
Also, what other parameters can we tweak in TensorFlow? My baseline accuracy is 94% and I am looking for something close to 99%.
Since you're doing single-label classification, softmax is the best choice, as it maps your final-layer logits to a probability distribution over the classes. Sigmoid is used for multi-label classification.
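A quick NumPy illustration of the difference, with made-up logits:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])

softmax = np.exp(logits) / np.sum(np.exp(logits))
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax, softmax.sum())  # classes compete for probability mass; sums to 1.0
print(sigmoid)                 # independent per-label probabilities in (0, 1)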
It's almost always better to use a momentum-based optimizer than vanilla gradient descent. There are a number of such optimizers, like Adam or RMSProp. Experiment with them to see what works best; Adam will probably give you the best performance.
You can add an extra label, no_class, so your task becomes a 6+1-label classification problem. You can feed in some random images with no_class as the label. However, the distribution of your random images must match the test-time image distribution, or it won't generalise.
I've implemented a neural network using Keras. After training and testing for final test accuracy on a matrix whose rows contain features (plus the corresponding labels), I have a model which I should be able to use for prediction.
How can I feed a single unseen example, i.e. one feature vector, to the model to obtain a class prediction?
I've looked at their documentation here but could not find a method for it.
What you want is the predict method: it takes a batch of input samples and produces the predictions computed by your network. To feed a single example, you can simply wrap it in a NumPy array with a batch dimension of one.
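A minimal sketch, assuming a trained model with a softmax output and a single feature vector x:

import numpy as np

# x: one unseen feature vector, shape (n_features,)
x_batch = np.expand_dims(x, axis=0)    # add a batch dimension -> (1, n_features)
probs = model.predict(x_batch)         # shape (1, n_classes)
predicted_class = np.argmax(probs[0])  # index of the most likely class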