what should be the maximum size of input layer? - python-3.x

I am trying to create a multilayer perceptron to train on an image dataset. The images are 300*300, so the input layer has 90000 units. Is this the right way to create it?

90000 is a huge input layer for what I assume is a consumer-grade device. The error is likely TensorFlow running out of RAM.
If you post the whole traceback, I can be much more specific.
In general, for a basic image classification task:
Try feeding the image into a conv net first, with input dimension 300.
Then pool to reduce the spatial dimensions.
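To make the size argument concrete, here is a rough parameter count in plain Python. The 512-unit hidden layer and the 32 filters of size 3x3 on a single-channel image are assumptions chosen for illustration; they do not come from the question.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus biases of a 2D convolution layer."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters


def conv_then_pool_size(size, kernel=3, pool=2):
    """Side length after an unpadded stride-1 convolution and non-overlapping pooling."""
    return (size - kernel + 1) // pool


# Flattened 300*300 image feeding a hypothetical 512-unit hidden layer.
mlp_first_layer = dense_params(300 * 300, 512)
# Hypothetical first conv layer: 32 filters of 3x3 on a single-channel image.
cnn_first_layer = conv2d_params(3, 3, 1, 32)
# Side length of the feature map after one conv + pool block.
side_after_block = conv_then_pool_size(300)

print(mlp_first_layer)   # 46080512
print(cnn_first_layer)   # 320
print(side_after_block)  # 149
```

Under these assumptions the first dense layer alone holds about 46 million parameters, while the first convolution holds a few hundred, and each conv + pool block roughly halves the spatial side length.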

Related

Given inputs and outputs vector, which model is best for predicting unknown data?

I don't have much experience with training neural networks. I have input vectors of 4 variables and corresponding output vectors of 3 variables. I want to create a neural network that takes these inputs and outputs, which have some unknown (possibly nonlinear) correlation between them, and train it, so that when I feed it previously unseen data it predicts the correlated output.
I was wondering:
What type of model should I use in such scenarios? A restricted Boltzmann machine, regression, a GAN, etc.?
What library is easiest to learn and implement such a model with, e.g. TensorFlow, PyTorch, etc.?
If images were involved, which can be processed as FFT arrays, would the model change?
I did find this answer, but I am not satisfied with it.
Please let me know if there are any functions or other points you would like me to know. Any help is much appreciated.
A multilayer perceptron is a good place to start.
Keras is the highest level/easiest to use library I have used.
If you are working with images or spatially structured data a convolutional neural network will probably work best.
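As a rough illustration of the multilayer perceptron suggestion, the sketch below trains a one-hidden-layer network on 4 inputs and 3 outputs using only NumPy. The synthetic data, the 16 hidden units, the tanh activation, the learning rate and the step count are all assumptions made for this example; a library such as Keras would replace the manual gradient code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 4 input variables, 3 nonlinear outputs.
X = rng.uniform(-1.0, 1.0, size=(256, 4))
Y = np.stack([np.sin(X[:, 0]), X[:, 1] * X[:, 2], X[:, 3] ** 2], axis=1)

hidden = 16  # hypothetical hidden-layer width
W1 = rng.normal(0.0, 0.5, size=(4, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.5, size=(hidden, 3))
b2 = np.zeros(3)


def forward(x):
    """Return hidden activations and the 3-variable prediction."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


initial_loss = mse(forward(X)[1], Y)

lr = 0.05
for _ in range(2000):
    h, pred = forward(X)
    grad_pred = 2.0 * (pred - Y) / pred.size  # gradient of the mean squared error
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_z = grad_h * (1.0 - h ** 2)          # derivative of tanh
    grad_W1 = X.T @ grad_z
    grad_b1 = grad_z.sum(axis=0)
    # Plain full-batch gradient descent, updating the arrays in place.
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

final_loss = mse(forward(X)[1], Y)
print(initial_loss, final_loss)
```

The same structure (4 inputs, one or more hidden layers, 3 linear outputs, mean squared error) is what a regression MLP in a higher-level library would express in a few lines.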

How to predict different data via neural network, which is trained on the data with 36x60 size?

I trained a neural network on images of an eye that are shaped 36x60. Does that mean I can only predict the result using a 36x60 image? In my application I have a video stream, which is divided into frames, and for each frame 68 landmark points are predicted. In the eye region I can select the eye points, and using the 'boundingRect' function from OpenCV it is very easy to get a cropped image. But this image is not shaped 36x60. What is the correct way to get 36x60 data that can be used for prediction? Or how can I use a neural network on data of another shape?
Neural networks (insofar as I've encountered) have a fixed input shape, with freedom permitted only in the batch size. This (probably) goes for every amazing neural network you've ever seen. Don't be too afraid of reshaping your image with off-the-shelf sampling to the network's expected input size. Robust computer-vision networks are generally trained on augmented data: randomly scaled, skewed, and otherwise transformed in order to, among other things, broaden the network's ability to handle this unavoidable scaling situation.
There are caveats, of course. An input for prediction should be as similar to the dataset it was trained on as possible, which is to say that a model should be applied to the data for which it was designed. For example, consider an object detection network made for satellite applications. If that same network is then applied to drone imagery, the relative size of objects may be substantially larger than the objects for which the network (specifically its anchor-box sizes) was designed.
Tl;dr: Assuming you're using the right network for the job, don't be afraid to scale your images/frames to fit the network's inputs.
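A minimal sketch of the scaling step, assuming the crop is a NumPy array and the network expects a batch shaped (1, 36, 60, 1). The helper resize_nearest is a hypothetical stand-in written for this example; with OpenCV the equivalent call would be cv2.resize(crop, (60, 36)), where the size argument is given as (width, height).

```python
import numpy as np


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    in_h, in_w = image.shape[:2]
    # For each output row/column, pick the source row/column it maps to.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]


# Hypothetical eye crop whose size differs from the training size.
rng = np.random.default_rng(0)
crop = rng.integers(0, 256, size=(25, 47), dtype=np.uint8)

eye = resize_nearest(crop, 36, 60)
# Add batch and channel axes (assumed layout: batch, height, width, channels).
batch = eye[np.newaxis, ..., np.newaxis].astype(np.float32) / 255.0

print(eye.shape)    # (36, 60)
print(batch.shape)  # (1, 36, 60, 1)
```

Whatever normalisation was applied to the training images (here assumed to be division by 255) should be applied to the resized crop as well.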

which is the most suitable method for training among model.fit(), model.train_on_batch(), model.fit_generator()

I have a training dataset of 600 images with (512*512*1) resolution, categorized into 2 classes (300 images per class). Using some augmentation techniques I have increased the dataset to 10000 images. After the following preprocessing steps:
all_images=np.array(all_images)/255.0
all_images=all_images.astype('float16')
all_images=all_images.reshape(-1,512,512,1)
I saved these images to an H5 file.
I am using an AlexNet architecture for classification, with 3 convolutional layers and 3 overlapping max-pool layers.
I want to know which of the following cases will be best for training using Google Colab where memory size is limited to 12GB.
1. model.fit(x,y,validation_split=0.2)
# For this I have to load all the data into memory, and then applying AlexNet to it will simply cause a resource-exhausted error.
2. model.train_on_batch(x,y)
# For this I have written a script which randomly loads the data batch-wise from the H5 file into memory and trains on that data. I am confused by one property of train_on_batch(), namely that it performs a single gradient update. Will this affect my training procedure, or will it be the same as model.fit()?
3. model.fit_generator()
# This means giving the original image directory to a data generator, which automatically augments the data, and then training with model.fit_generator(). I haven't tried this yet.
Please advise which of these methods is best in my case. I have read many answers Here, Here, and Here about model.fit(), model.train_on_batch() and model.fit_generator() but I am still confused.
model.fit - suitable if you load the data as a numpy array and train without augmentation.
model.fit_generator - if your dataset is too big to fit in memory and/or you want to apply augmentation on the fly.
model.train_on_batch - less common, usually used when training more than one model at a time (a GAN, for example).
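To illustrate the generator route, here is a sketch of a Python generator that reads one batch at a time from a sliceable store. A small in-memory NumPy array stands in for the H5 dataset, and the name batch_generator and the batch size of 4 are assumptions for this example. Note that in recent TensorFlow releases model.fit accepts generators directly and fit_generator is deprecated.

```python
import math

import numpy as np

# Size of the full dataset from the question if loaded at once as float16.
full_bytes = 10000 * 512 * 512 * 1 * np.dtype(np.float16).itemsize
print(full_bytes / 2**30)  # about 4.88 GiB before any activations or gradients


def batch_generator(images, labels, batch_size, rng):
    """Yield (x, y) batches forever, visiting contiguous slices in random order."""
    n = len(images)
    starts = np.arange(0, n, batch_size)
    while True:
        rng.shuffle(starts)
        for start in starts.tolist():
            stop = min(start + batch_size, n)
            # Slicing reads only this batch from the underlying store.
            x = np.asarray(images[start:stop], dtype=np.float32)
            y = np.asarray(labels[start:stop])
            yield x, y


# Small stand-in arrays (assumption) in place of the real H5 datasets.
images = np.zeros((10, 8, 8, 1), dtype=np.float16)
labels = np.arange(10) % 2

gen = batch_generator(images, labels, batch_size=4, rng=np.random.default_rng(0))
steps_per_epoch = math.ceil(len(images) / 4)
epoch_sizes = [len(next(gen)[0]) for _ in range(steps_per_epoch)]
print(sorted(epoch_sizes))  # [2, 4, 4]
```

A generator like this would typically be passed to the training call together with steps_per_epoch, since it never stops on its own.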

joint autoencoder with sharing weight using keras

In this article, I've come across the following network structure:
Figure 1(b). https://wx4.sinaimg.cn/mw690/5396ee05ly1fg9vi5phcbj20vj0kb0ty.jpg
Each layer is a fully connected one.
The weights shared by the two parts are denoted by Wc.
The pairs of the top fully connected layers of dimension 500 are concatenated to create a layer of dimension 1000 which is then used directly to reconstruct the input of size 784.
I want to implement it with Keras, but I am not experienced with Keras.
Any ideas on how to implement this?
Thank you very much!

Training Methodology of CNN in theano with large scale data

I am training a CNN on 1M images with Theano, and I am unsure how to prepare the training data.
My questions are:
When the images are resized to 64*64*3, the size of the whole dataset is about 100G. Should I save the data into a single npy file or into several smaller files? Which is more efficient?
How to decide the number of parameters of the CNN? How about 1M/10 = 100K?
Should I keep the memory cost of a training block plus the CNN parameters below the GPU memory?
My computer has 16G of memory and a Titan GPU.
Thank you very much.
If you're using a NN framework like pylearn2, lasagne, Keras, etc, check the docs to see if there are guidelines for iterating batches off disk from an hdf5 store or similar.
If there's nothing and you don't want to roll your own, the fuel package provides lots of helpful data iteration schemes that can be adapted to models in theano (and probably most of the frameworks; there's a good tutorial in the fuel repository).
As for the parameters, you'll have to cross validate to figure out the best parameters for your data.
And yes, the model size + minibatch size + dropout mask for the batch has to fit within the available VRAM.
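As a sketch of iterating batches off disk without a dedicated data package, NumPy can memory-map a single .npy file so that only the requested slice is read. The helper names, the tiny temporary file and the batch size of 4 are assumptions for this example. The size calculation also suggests that the roughly 100G figure is consistent with 8-byte floats, while storing the same images as uint8 would take about 12.3 GB.

```python
import os
import tempfile

import numpy as np


def dataset_bytes(n_images, height, width, channels, dtype):
    """Raw size in bytes of an image array with the given dtype."""
    return n_images * height * width * channels * np.dtype(dtype).itemsize


size_uint8 = dataset_bytes(1_000_000, 64, 64, 3, np.uint8)
size_float64 = dataset_bytes(1_000_000, 64, 64, 3, np.float64)
print(size_uint8 / 1e9, size_float64 / 1e9)  # 12.288 98.304


def iter_minibatches(path, batch_size):
    """Yield float32 minibatches from a .npy file without loading it whole."""
    data = np.load(path, mmap_mode="r")  # memory-mapped, read-only
    for start in range(0, data.shape[0], batch_size):
        # Only this slice is copied into RAM and converted.
        yield np.asarray(data[start:start + batch_size], dtype=np.float32) / 255.0


# Tiny stand-in file (assumption) in place of the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "images.npy")
    np.save(path, np.zeros((10, 64, 64, 3), dtype=np.uint8))
    batch_shapes = [b.shape for b in iter_minibatches(path, 4)]

print(batch_shapes)
```

Keeping the data on disk as uint8 and converting each minibatch to floats on the fly keeps both the file and the per-batch memory cost small.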
