I'm new to neural networks and PyTorch in particular, so please excuse my question if it turns out to be a simple one. I am creating a simple neural network that can predict the presence of lung cancer based on a given dataset.
I've reached the point where I have to create my input and output tensors with which to train my network. Unfortunately, I've run into an error while creating the tensors, and I'm not sure how to resolve it.
You need to be using vectors/matrices of numbers to create tensors. Right now you seem to be passing strings describing the data rather than the data itself.
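To make that concrete, here is a minimal sketch of turning string fields into numeric tensors. The rows and column meanings below are made up for illustration (the question does not show the actual dataset), so treat them as assumptions.

```python
import torch

# Hypothetical rows as they might come out of a CSV reader: every field is a string.
# Assumed column order: age, smoker, chronic_cough, has_cancer.
raw_rows = [
    ["63", "1", "0", "1"],
    ["45", "0", "0", "0"],
    ["71", "1", "1", "1"],
]

# Convert every field to a number before building tensors.
numeric_rows = [[float(value) for value in row] for row in raw_rows]

data = torch.tensor(numeric_rows, dtype=torch.float32)
inputs = data[:, :-1]    # all columns except the last are features
targets = data[:, -1:]   # last column is the label, kept 2-D for a single output unit
```

Passing `raw_rows` directly to `torch.tensor` would fail because the fields are strings, which is the situation described above.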
I am trying to build a neural network model where the input data has many missing values which are hard to fill in by any means in advance. Therefore, the idea is to train a neural network with only observed data. The data vector fed in the input layer is then a vector with missing values in various positions. The positions of the missing values will not be fixed.
After some searching, I found that TensorFlow has a masking layer, so I inserted one right after the input layer:
from tensorflow import keras
inputs = keras.Input(shape=(inputDim,))
maskingLayer = keras.layers.Masking(mask_value=-999)(inputs)
where the missing values are replaced with -999 during preprocessing. After that, several dense layers are added, and the model is compiled and fit in the usual way.
The problem is that I don't see much effect from the masking layer. Does the masking layer really mask out all the input nodes with value -999, along with the weights and biases connected to them?
I found this post, which asks a similar question:
Not fully connected layer in tensorflow
However, the unwanted links in that post are fixed, whereas in my case I would like to build a layer (next to the input layer) that connects only to the unmasked nodes of the input layer. Is it possible to do this?
Thanks.
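One possible workaround, sketched below with plain NumPy rather than Keras, is to zero out the missing inputs instead of relying on the masking layer. The Keras `Masking` layer is documented as skipping whole timesteps for sequence layers, so it is not obvious that it would drop individual features ahead of dense layers. With zeroed inputs, the weights attached to a missing feature contribute nothing to the output and receive no gradient for that sample. The weight matrix `W`, bias `b`, and layer width here are hypothetical stand-ins for a first dense layer.

```python
import numpy as np

rng = np.random.default_rng(0)
MISSING = -999.0                                  # sentinel value used in the question

x = np.array([[0.5, MISSING, 1.2, MISSING]])      # one sample, two features missing
observed = (x != MISSING).astype(x.dtype)         # 1 where a value is present, 0 where missing

W = rng.normal(size=(4, 3))                       # hypothetical first dense layer: 4 inputs, 3 units
b = np.zeros(3)

x_zeroed = np.where(observed == 1, x, 0.0)        # replace the sentinel with 0
out = x_zeroed @ W + b                            # missing features contribute nothing

# Same result as a layer connected only to the observed features (columns 0 and 2).
out_observed_only = x[:, [0, 2]] @ W[[0, 2], :] + b

# Gradient of sum(out) with respect to W: rows belonging to missing inputs are all zero.
grad_W = x_zeroed.T @ np.ones((1, 3))

# Optional: append the indicator so a network can tell "missing" from a genuine 0.
x_with_mask = np.concatenate([x_zeroed, observed], axis=1)
```

Feeding `x_with_mask` to ordinary dense layers is a design choice, not something the masking layer does for you; whether it helps depends on the data.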
I am trying to solve a sequence completion problem. Suppose we have the ground-truth sequence (1, 2, 4, 7, 6, 8, 10, 12, 18, 20).
The input to our model is an incomplete sequence, e.g. (1, 2, 4, _, _, _, 10, 12, 18, 20). From this incomplete sequence, we want to predict the original (ground-truth) sequence. Which deep learning models can be used to solve this problem?
Is this a problem for an encoder-decoder LSTM architecture?
Note: we have thousands of complete sequences to train and test the model.
Any help is appreciated.
This is not exactly a sequence-to-sequence problem; it is a sequence labeling problem. I would suggest either stacking bidirectional LSTM layers followed by a classifier, or Transformer layers followed by a classifier.
An encoder-decoder architecture requires plenty of data to train properly and is particularly useful when the target sequence can be of arbitrary length that depends only loosely on the source sequence length. It would eventually learn to do the job with enough data, but sequence labeling is a more straightforward formulation.
With sequence labeling, you can set a custom mask over the output, so the model will only predict the missing numbers. An encoder-decoder model would need to learn to copy most of the input first.
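As a rough illustration of the "custom mask over the output" idea, the sketch below computes a cross-entropy loss only at the blanked positions of the example sequence. The vocabulary size and the random `logits` are assumptions standing in for whatever per-position classifier (BiLSTM, Transformer) you end up using.

```python
import numpy as np

truth = np.array([1, 2, 4, 7, 6, 8, 10, 12, 18, 20])   # ground-truth sequence from the question
observed = np.array([1, 1, 1, 0, 0, 0, 1, 1, 1, 1])    # 0 marks the blanked positions
loss_mask = 1 - observed                                # score only the missing positions

vocab_size = 21                                         # assumption: values are integers in 0..20
rng = np.random.default_rng(0)
logits = rng.normal(size=(len(truth), vocab_size))      # stand-in for per-position classifier outputs

def masked_cross_entropy(logits, targets, mask):
    """Mean cross-entropy over the positions where mask == 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)            # stable log-softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_position = -log_probs[np.arange(len(targets)), targets]
    return (per_position * mask).sum() / mask.sum()

loss = masked_cross_entropy(logits, truth, loss_mask)
```

Because observed positions are multiplied by zero, whatever the model outputs there has no effect on the loss, so it never has to learn to copy the known values.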
In your sequence completion task, are you trying to predict next items in a sequence or learn only the missing values?
Training a neural network with missing data is an issue in its own right.
If you're using Keras and an LSTM-type network for your problem, you should consider masking. You can refer to this Stack Overflow thread for more details: Multivariate LSTM with missing values
Regarding predicting the missing values, why not try auto-encoders?
I don't have much experience with training neural networks. My inputs are vectors of 4 variables and the corresponding outputs are vectors of 3 variables. I want to create a neural network that takes these inputs and outputs, which have some unknown (possibly nonlinear) correlation between them, and trains on them, so that when I feed it data it has not been trained on, it predicts the correlated output.
I was wondering,
What type of model should I use in such scenarios? Is it a restricted Boltzmann machine, regression, a GAN, etc.?
What library is easiest to learn and implement for such a model, e.g. TensorFlow, PyTorch, etc.?
If images were involved, which can be processed as FFT arrays, would the model change?
I did find this answer, but I am not satisfied with it.
Please let me know if there are any functions or other points you would like me to know. Any help is much appreciated.
A multilayer perceptron is a good place to start.
Keras is the highest level/easiest to use library I have used.
If you are working with images or spatially structured data a convolutional neural network will probably work best.
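For the 4-input, 3-output case described in the question, a multilayer perceptron in Keras might look roughly like this. The hidden-layer sizes, the training settings, and the synthetic data are arbitrary assumptions, only there so the sketch runs end to end.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small MLP: 4 input variables in, 3 output variables out.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(32, activation="relu"),   # hidden sizes are placeholders
    layers.Dense(32, activation="relu"),
    layers.Dense(3),                       # linear outputs for regression
])
model.compile(optimizer="adam", loss="mse")

# Synthetic stand-in data with a nonlinear input/output relationship.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4)).astype("float32")
Y = np.stack([np.sin(X[:, 0]), X[:, 1] * X[:, 2], X[:, 3] ** 2], axis=1).astype("float32")

model.fit(X, Y, epochs=5, batch_size=32, verbose=0)
pred = model.predict(X[:5], verbose=0)     # one 3-value prediction per input row
```

With real data you would hold out a validation set and tune the layer sizes; five epochs on random data is only enough to show the workflow.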
I have a question, and I am not sure if it's a smart one. I've been reading quite a lot about convolutional neural networks. So far I understand that the output layer could, for example, be a softmax layer for a classification problem, or you could do regression to get a quantitative value. But I was wondering whether it is possible to infer more than one parameter. For example, suppose I have a dataset where the output label is both the price of a house and the size of the house. I know it is not a great example, but I just want to know whether it's possible to predict two different output values in the same output layer of a convolutional neural network. Or do I need two different convolutional neural networks, where one predicts the size of the house and the other predicts the price? In that case, how can we combine the two predictions? And if it can be done in one convolutional neural network, how?
In the cases you mention, the output layer is most likely a dense layer, not a convolutional one. But that's beside the point. If you want multiple outputs, it is common to train multiple output layers: the same convolutional network can feed two separate output layers, which can be trained independently. You then have one neural network with two outputs. The convolutional part is often obtained through transfer learning, and its layers are often frozen so that they are no longer trained. Have a look at the figures of this paper, which show how it can be done.
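A minimal sketch of one convolutional trunk feeding two dense output heads, using the Keras functional API. The input size, layer widths, and head names `price` and `size` are assumptions chosen to match the house example; here both heads are trained jointly with a weighted sum of losses, which is one option among several.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64, 64, 3))               # assumption: 64x64 RGB images

# Shared convolutional trunk.
x = layers.Conv2D(16, 3, activation="relu")(inputs)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Two separate dense heads on top of the same features.
price = layers.Dense(1, name="price")(x)
size = layers.Dense(1, name="size")(x)

model = keras.Model(inputs=inputs, outputs=[price, size])
model.compile(
    optimizer="adam",
    loss={"price": "mse", "size": "mse"},
    loss_weights={"price": 1.0, "size": 1.0},         # relative importance of each output
)
```

If the trunk came from a pretrained model, setting `trainable = False` on its layers before compiling would freeze it so that only the heads are updated.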
In this article, I've come across the following network structure:
Figure 1(b). https://wx4.sinaimg.cn/mw690/5396ee05ly1fg9vi5phcbj20vj0kb0ty.jpg
Each layer is a fully connected one.
The weights shared by the two parts are denoted by Wc.
The pairs of the top fully connected layers of dimension 500 are concatenated to create a layer of dimension 1000 which is then used directly to reconstruct the input of size 784.
I want to implement it with Keras, but I am not experienced with it.
Any ideas on how to implement this?
Thank you very much!
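Without the figure, the exact layout is a guess, but the two Keras mechanisms the description needs are weight sharing (call the same layer instance twice) and concatenation. The sketch below assumes a single 784-dimensional input feeding two parts, each with its own private layer followed by a shared layer standing in for Wc; the private layer widths, the activations, and the loss are all assumptions, not taken from the article.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,))

# Private first layers, one per part (assumed layout; widths are placeholders).
h_a = layers.Dense(500, activation="relu", name="part_a_private")(inputs)
h_b = layers.Dense(500, activation="relu", name="part_b_private")(inputs)

# One layer instance called twice: both calls use the same kernel and bias (Wc).
shared = layers.Dense(500, activation="relu", name="shared_Wc")
top_a = shared(h_a)
top_b = shared(h_b)

# 500 + 500 = 1000 units, used directly to reconstruct the 784-dimensional input.
merged = layers.Concatenate()([top_a, top_b])
recon = layers.Dense(784, activation="sigmoid", name="reconstruction")(merged)

model = keras.Model(inputs, recon)
# Sigmoid + binary cross-entropy assumes inputs scaled to [0, 1].
model.compile(optimizer="adam", loss="binary_crossentropy")
```

Training would then be `model.fit(X, X, ...)`, since the target of the reconstruction is the input itself. If the two parts in the figure take different inputs, replace `inputs` with two `keras.Input` tensors and keep the shared layer as is.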