Darknet Yolov3 - Custom training on pre-trained model - python-3.x

Actually in darknet yolov3 model has coco.names file for labels which include 80 classes.
Now if I want to train a custom model with two labels only, where one label is already there in coco.names and another is not there.
For example I want to train a model to detect for cell phone and dslr camera, so cell phone class already exist in coco.names whereas dslr camera is not there in its labels file.
So can I train custom model using two classes cell phone and dslr camera and give data of only dslr camera for training and it will predict for both dslr camera and cell phone or shall I train with both data of cell phone and dslr images or is there any other way out.
I am a bit new to ML, so any help would be great
Thanks

So you want to fine tune a pre-trained model.
You need to think of classes by just being a set of end nodes of a network, the labels (phone, camera) are just a naming convention for them, and to give us visual guidance.
These nodes are fully connected (with associated weights) to the previous layer of the network, the total number of these intermediate connections varies depending on the number of end nodes (classes) you have.
With the fully trained model, you can't just select the nodes you want, and take out the rest, and add a few more. Because the previous layer (and full network) was trained to give estimates/predictions taking into account a certain number of final nodes.
So basically you need to give a full reset on the last layer (the head), and restart it with the desired number of classes. The idea here, is that you take advantage of the previous training effort on a broader dataset, and fine tune it to your desired data.
Short answer, you need data for both, and need to change the model to accept 2 classes only.
To configure that specific model for the new number of classes and data, I believe you can find some guidance and instructions here

Related

what should be the target in this deep learning image classification problem

I am doing a image classification project using CNN in keras. I have a dataset of about 900 photos of about 70 people .Each person has multiple photos of his different age.
My goal is to predict the correct ID of the person if any one of his photo is in the input.
Here is the glimpse of the data.
My questions are:
What should be my target column ?Is Target 'AGE' or 'ID'? 2-Do I
need to do hot-encoding of the target column? For example if I used
ID as my target,then do I have to do one-hot-encoding of ID column?
If I used ID as my target,then after one-hot-encoding, does it
mean,I will be having 70 classes?
I need information about the
output layer. My goal is to find whether the photo belong to the
same ID or not,so what should be the output layer? Shall I use
softmax with 70 outputs ?
Another question about the output layer
is that can I use a softmax with 70 outputs and then feed it to a
layer of sigmoid with single output ?
You are going to identify the same person using different age images. For example, in the dataset, you have 100 different images of khan and you trained a model. Now you provide the 101st image of khan, the model will detect it. So your target column should be ID.
yes, there are 70 classes and you get one hot encoded vector of 900x70
It should be a softmax layer because the sigmoid layer is used for binary class or multilabel problem. As you have to detect 70 different people from each other, you need a softmax class.
I don't think so, in this way your model would not be capable of telling which person image is this (the one provided as a test)

How do i retrain the model without losing the earlier model data with new set of data

for my current requirement, I'm having a dataset of 10k+ faces from 100 different people from which I have trained a model for recognizing the face(s). The model was trained by getting the 128 vectors from the facenet_keras.h5 model and feeding those vector value to the Dense layer for classifying the faces.
But the issue I'm facing currently is
if want to train one person face, I have to retrain the whole model once again.
How should I get on with this challenge? I have read about a concept called transfer learning but I have no clues about how to implement it. Please give your suggestion on this issue. What can be the possible solutions to it?
With transfer learning you would copy an existing pre-trained model and use it for a different, but similar, dataset from the original one. In your case this would be what you need to do if you want to train the model to recognize your specific 100 people.
If you already did this and you want to add another person to the database without having to retrain the complete model, then I would freeze all layers (set layer.trainable = False for all layers) except for the final fully-connected layer (or the final few layers). Then I would replace the last layer (which had 100 nodes) to a layer with 101 nodes. You could even copy the weights to the first 100 nodes and maybe freeze those too (I'm not sure if this is possible in Keras). In this case you would re-use all the trained convolutional layers etc. and teach the model to recognise this new face.
You can save your training results by saving your weights with:
model.save_weights('my_model_weights.h5')
And load them again later to resume your training after you added a new image to the dataset with:
model.load_weights('my_model_weights.h5')

How to implement multi state LSTM RNN in keras

I have 1000 distinct users and the dataset consists activities of these users over the past 1 year. Total records are over 300K. The inputs for the LSTM RNN has the feature vectors corresponding to these users. The user is also included because behavior of each user may vary from person to person. The network should learn behavior of each user and should be able to predict the next behavior based on the past information of the same user.
How to maintain separate hidden states for each user within an LSTM RNN.
Following blog post is similar to my problem:
https://towardsdatascience.com/multi-state-lstms-for-categorical-features-66cc974df1dc
Update
My dataset looks like:
DATASET
I transformed my dataset into a 3D the numpy array and reshaped it as (No of records, timesteps, n_features).
The questions are:
1) Is it necessary to encode the "user" attribute?
2) what is the correct batch size for this problem? Is it batch = 1000 (no. of distinct users)?
3) Do I need to include each user's data in each batch input to the model?
OR
Please suggest the correct implementation of this problem.
This is just automatic. You don't need to do anything.
The LSTM layer will certainly have a state matrix the size of your batch of users. (Otherwise it wouldn't be useful)

How to use CNN model to detect object recognized by YOLO

Let start by saying that i have 2 pre-trained models (in hdf5 files):
The first model is a YOLO-based model, trained on dataset A, which is used to locate human in any images (note that: a trained images o this model may contain many people inside)
The second model is a CNN model which is used to detect gender of a person (male or female) based on the image which only contains 1 person.
Suppose that i only want to use these 2 models and do not want to re-train or modify anything on the dataset. How could i locate female person in a picture of Dataset A?
A possible solution that i think could work:
First use the first model to detect, that is to create bounding boxes around persons in the images.
Crop the bounding boxes into unique images. Feed those images to the second model to see if that person is Female/Male
However, this solution is slow in performance. So is there anyway that can festen this solution or perform this task in different ways?

Train multiple models with various measures and accumulate predictions

So I have been playing around with Azure ML lately, and I got one dataset where I have multiple values I want to predict. All of them uses different algorithms and when I try to train multiple models within one experiment; it says the “train model can only predict one value”, and there are not enough input ports on the train-model to take in multiple values even if I was to use the same algorithm for each measure. I tried launching the column selector and making rules, but I get the same error as mentioned. How do I predict multiple values and later put the predicted columns together for the web service output so I don’t have to have multiple API’s?
What you would want to do is to train each model and save them as already trained models.
So create a new experiment, train your models and save them by right clicking on each model and they will show up in the left nav bar in the Studio. Now you are able to drag your models into the canvas and have them score predictions where you eventually make them end up in the same output as I have done in my example through the “Add columns” module. I made this example for Ronaldo (Real Madrid CF player) on how he will perform in match after training day. You can see my demo on http://ronaldoinform.azurewebsites.net
For more detailed explanation on how to save the models and train multiple values; you can check out Raymond Langaeian (MSFT) answer in the comment section on this link:
https://azure.microsoft.com/en-us/documentation/articles/machine-learning-convert-training-experiment-to-scoring-experiment/
You have to train models for each variable that you going to predict. Then add all those predicted columns together and get as a single output for the web service.
The algorithms available in ML are only capable of predicting a single variable at a time based on the inputs it's getting.

Resources