I am running a CNN in Google Colab using TensorFlow with Keras. However, I received this error:
Negative dimension size caused by subtracting 3 from 2 for '{{node conv2d_11/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](Placeholder, conv2d_11/Conv2D/ReadVariableOp)' with input shapes: [?,2,2,394], [3,3,394,394].
Call arguments received:
• inputs=tf.Tensor(shape=(None, 2, 2, 394), dtype=float32)
Does this have to do with my input data or my parameters? Thanks
Try inserting a number instead of using None when specifying the shape. As said in the documentation here, you run across not-fully-specified shapes when using None.
In this case your model contains many MaxPool or AvgPool layers, so the image size shrinks as the images pass through the layers. According to the error message, the input to conv2d_11 is only 2x2, which is smaller than the 3x3 kernel, and with padding="VALID" the output size would no longer be positive. I think it would help to set the padding parameter of the convolution layers to 'same'. For more details, read about the strides and padding parameters of conv layers.
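The arithmetic behind the error can be sketched in a few lines. This is only an illustration of the usual output-size formulas for one spatial dimension, assuming no dilation; the helper name conv_output_size is hypothetical and not part of TensorFlow.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension (no dilation)."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Zero padding is added so that only the stride reduces the size.
        return math.ceil(input_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# Shapes taken from the error message: 2x2 feature map, 3x3 kernel, stride 1.
valid_size = conv_output_size(2, 3, padding="valid")
same_size = conv_output_size(2, 3, padding="same")
print(valid_size, same_size)
```

A result that is not positive for "valid" means the layer cannot be built, while "same" keeps the 2x2 size.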
How to do inference in batches in PyTorch? How to do inference in parallel to speed up that part of the code.
I've started with the standard way of doing inference:
with torch.no_grad():
    for inputs, labels in dataloader['predict']:
        inputs = inputs.to(device)
        output = model(inputs)
        output = output.to(device)
I've done some research, and the only mention of doing inference in parallel (on the same machine) seems to be with the Dask library: https://examples.dask.org/machine-learning/torch-prediction.html
I'm currently trying to understand that library and create a working example. In the meantime, do you know of a better way?
In PyTorch, input tensors always have the batch dimension as the first dimension. Inference by batch is therefore the default behavior; you just need to make the batch dimension larger than 1.
For example, if your single input is [1, 1], its input tensor is [[1, 1], ] with shape (1, 2). If you have two inputs [1, 1] and [2, 2], generate the input tensor as [[1, 1], [2, 2], ] with shape (2, 2). This is usually done in the batch generator function such as your dataloader.
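A minimal sketch of batched inference, assuming a toy nn.Linear model and a small in-memory dataset as stand-ins for the real model and dataloader (both are hypothetical and only there to make the example runnable):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in model: maps 2 input features to 3 outputs.
model = nn.Linear(2, 3)
model.eval()

# Five samples with two features each, i.e. shape (5, 2).
inputs = torch.tensor([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 5.0]])

# The DataLoader stacks samples along the first (batch) dimension.
loader = DataLoader(TensorDataset(inputs), batch_size=2, shuffle=False)

outputs = []
with torch.no_grad():
    for (batch,) in loader:
        # batch has shape (2, 2), except the last one, which has shape (1, 2).
        outputs.append(model(batch))

# Concatenate the per-batch results back into one tensor of shape (5, 3).
predictions = torch.cat(outputs)
print(predictions.shape)
```

Increasing batch_size lets the model process more samples per forward pass, which is usually the first thing to try before reaching for an external library.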
I want to train an autoencoder using keras where X_train is mxn matrix and y_train is also mxn matrix.
For example:
X_train = np.array(([1, 2],
[3, 4]))
y_train = np.array(([5, 6],
[7, 8]))
I concatenate the two matrices into train_set and save it to a single file, training.npy:
train_set = np.concatenate([X_train, y_train], axis=1)
print(train_set)
array([[1, 2, 5, 6],
[3, 4, 7, 8]])
Later I save it to S3
training_path_input = sess.upload_data('/tmp/training.npy', key_prefix=prefix+'/training')
Now when I fit the model
model.fit({'train': training_path_input })
I wonder how the estimator will find the column indices for X_train and y_train, since y_train is a matrix rather than a vector as in other cases. Is there any way to specify this in the fit() method?
Or is there any alternative way to do it?
The fit method does two things: (1) it copies your data from training_path_input (on S3) to /opt/ml/input/data/<channel> in the SageMaker training instance (/opt/ml/input/data/train in your case), and (2) it launches your code with any hyperparameters you specified. The estimator itself does not split the data into X_train and y_train. Your training code must read the copied files locally, so you need to make sure it knows how to read the type of files you're copying to the machine and how to split them.
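As a rough sketch of what the reading part of a training script could look like: the helper load_xy and its n_features argument are hypothetical, and the sketch assumes the file keeps the name training.npy and that the first n_features columns hold X while the remaining columns hold y. On SageMaker, data_dir would be /opt/ml/input/data/train; the demo below uses a temporary directory so it runs locally.

```python
import os
import tempfile

import numpy as np


def load_xy(data_dir, n_features, filename="training.npy"):
    """Load the concatenated array and split it back into X and y by column."""
    train_set = np.load(os.path.join(data_dir, filename))
    x = train_set[:, :n_features]  # first n_features columns are the inputs
    y = train_set[:, n_features:]  # remaining columns are the targets
    return x, y


# Local demo with the matrices from the question.
demo_dir = tempfile.mkdtemp()
np.save(os.path.join(demo_dir, "training.npy"),
        np.array([[1, 2, 5, 6], [3, 4, 7, 8]]))

X_train, y_train = load_xy(demo_dir, n_features=2)
print(X_train)
print(y_train)
```

The number of input columns has to be known to the script, for example by passing it as a hyperparameter (an assumption here, not something fit() does for you).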
I'm working on a small project to better understand RNNs, in particular LSTM and GRU. I'm not at all an expert, so please bear that in mind.
My data has the following form:
>>> import numpy as np
>>> import pandas as pd
>>> pd.DataFrame([[1, 2, 3],[1, 2, 1], [1, 3, 2],[2, 3, 1],[3, 1, 1],[3, 3, 2],[4, 3, 3]], columns=['person', 'interaction', 'group'])
   person  interaction  group
0       1            2      3
1       1            2      1
2       1            3      2
3       2            3      1
4       3            1      1
5       3            3      2
6       4            3      3
This is just for explanation. We have different persons interacting with different groups in different ways. I've already encoded the various features. The last interaction of a user is always a 3, which means selecting a certain group. In the short example above, person 1 chooses group 2, person 2 chooses group 1, and so on.
My whole data set is much bigger, but I would like to understand the conceptual part first before throwing models at it. The task I would like to learn is: given a sequence of interactions, which group is chosen by the person? More concretely, I would like the output to be a list of all groups (there are 3 groups: 1, 2, 3) sorted by likelihood, with the most likely choice first, followed by the second and third likeliest groups. The loss function is therefore a mean reciprocal rank.
I know that in Keras, GRUs/LSTMs can handle variable-length input. So my three questions are:
The input is of the format:
(samples, timesteps, features)
writing high level code:
import keras.layers as L
import keras.models as M
model_input = L.Input(shape=(?, None, 2))
timestep=None should imply the varying size and 2 is for the feature interaction and group. But what about the samples? How do I define the batches?
For the output, I'm a bit puzzled about what it should look like in this example. I think for each last interaction of a person I would like to have a list of length 3. Assuming I've set up the output
model_output = L.LSTM(3, return_sequences=False)
I then want to compile it. Is there a way of using the mean reciprocal rank?
model.compile('adam', '?')
I know the questions are fairly high level, but I would like to understand first the big picture and start to play around. Any help would therefore be appreciated.
The concept you've drawn in your question is a pretty good start already. I'll add a few things to make it work, as well as a code example below:
You can specify LSTM(n_hidden, input_shape=(None, 2)) directly, instead of inserting an extra Input layer; the batch dimension is to be omitted for the definition.
Since your model is going to perform some kind of classification (based on time series data), the final layer is what we'd expect from "normal" classification as well: a Dense(num_classes, activation='softmax'). Chaining the LSTM and the Dense layer together will first pass the time series input through the LSTM layer and then feed its output (whose size is determined by the number of hidden units) into the Dense layer. activation='softmax' computes a score for each class (we're going to use one-hot encoding in a data preprocessing step, see the code example below). This means the class scores are not ordered, but you can always order them via np.argsort or pick the top one via np.argmax.
Categorical crossentropy loss is suited for comparing the classification score, so we'll use that one: model.compile(loss='categorical_crossentropy', optimizer='adam').
Since the number of interactions, i.e. the length of the model input, varies from sample to sample, we'll use a batch size of 1 and feed in one sample at a time.
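To illustrate the ordering step mentioned above, here is a small sketch that turns softmax scores into a ranked list of groups and computes the mean reciprocal rank from them. The scores are made-up numbers, and the helper mean_reciprocal_rank is hypothetical; here it is used as an evaluation measure computed after prediction, not as the training loss.

```python
import numpy as np


def mean_reciprocal_rank(scores, true_idx):
    """Mean of 1 / rank of the true class, where rank 1 is the highest score."""
    order = np.argsort(-scores, axis=1)  # class indices, best first
    # Position of the true class within each row's ranking (1-based).
    ranks = np.argmax(order == true_idx[:, None], axis=1) + 1
    return float(np.mean(1.0 / ranks))


# Made-up softmax outputs for two persons and three groups.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.6, 0.3, 0.1]])
true_idx = np.array([2, 0])  # zero-based indices of the chosen groups

# Groups are labelled 1..3, so add 1 to the zero-based class indices.
ranked_groups = np.argsort(-scores, axis=1) + 1
mrr = mean_reciprocal_rank(scores, true_idx)
print(ranked_groups)
print(mrr)
```

For the first person the chosen group is ranked second (reciprocal rank 0.5) and for the second person it is ranked first (reciprocal rank 1.0).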
The following is a sample implementation based on the above considerations. Note that I modified your sample data a bit in order to provide more "reasoning" behind group choices. Also, each person needs to perform at least one interaction before choosing a group (i.e. the input sequence cannot be empty); if this is not the case for your data, then introducing an additional no-op interaction (e.g. 0) can help.
import pandas as pd
import tensorflow as tf
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(10, input_shape=(None, 2))) # LSTM for arbitrary length series.
model.add(tf.keras.layers.Dense(3, activation='softmax')) # Softmax for class probabilities.
model.compile(loss='categorical_crossentropy', optimizer='adam')
# Example interactions:
# * 1: Likes the group,
# * 2: Dislikes the group,
# * 3: Chooses the group.
df = pd.DataFrame([
    [1, 1, 3],
    [1, 1, 3],
    [1, 2, 2],
    [1, 3, 3],
    [2, 2, 1],
    [2, 2, 3],
    [2, 1, 2],
    [2, 3, 2],
    [3, 1, 1],
    [3, 1, 1],
    [3, 1, 1],
    [3, 2, 3],
    [3, 2, 2],
    [3, 3, 1]],
    columns=['person', 'interaction', 'group']
)
data = [person[1][['interaction', 'group']].values for person in df.groupby('person')]
x_train = [x[:-1] for x in data]
y_train = tf.keras.utils.to_categorical([x[-1, 1]-1 for x in data]) # Expects class labels from 0 to n (-> subtract 1).
print(x_train)
print(y_train)
class TrainGenerator(tf.keras.utils.Sequence):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        # Need to expand arrays to have batch size 1.
        return self.x[index][None, :, :], self.y[index][None, :]

# model.fit accepts Sequence objects directly (fit_generator is deprecated in TF 2.x).
model.fit(TrainGenerator(x_train, y_train), epochs=1000)
pred = [model.predict(x[None, :, :]).ravel() for x in x_train]
for p, y in zip(pred, y_train):
    print(p, y)
And the corresponding sample output:
[...]
Epoch 1000/1000
3/3 [==============================] - 0s 40ms/step - loss: 0.0037
[0.00213619 0.00241093 0.9954529 ] [0. 0. 1.]
[0.00123938 0.99718493 0.00157572] [0. 1. 0.]
[9.9632275e-01 7.5039308e-04 2.9268670e-03] [1. 0. 0.]
Using custom generator expressions: According to the documentation we can use any generator to yield the data. The generator is expected to yield batches of the data and loop over the whole data set indefinitely. When using tf.keras.utils.Sequence we do not need to specify the parameter steps_per_epoch as this will default to len(train_generator). Hence, when using a custom generator, we shall provide this parameter as well:
import itertools as it
# model.fit also accepts plain generators (fit_generator is deprecated in TF 2.x).
model.fit(((x_train[i % len(x_train)][None, :, :],
            y_train[i % len(y_train)][None, :]) for i in it.count()),
          epochs=1000,
          steps_per_epoch=len(x_train))
I've been implementing an autoencoder which receives as inputs vectors that consist only of 0 and 1, such as [1, 0, 1, 0, 1, 0, ...].
Likewise, I have another autoencoder that receives as inputs vectors that consist of values between 0 and 1, such as [0.123, 1, 0.9, 0.01, 0.9, ...]. In both cases each vector element is the input value of a node. The activation function of the hidden layers is relu, and that of the output layer is sigmoid.
I've seen some examples of autoencoders where adam/adadelta is used as the optimizer and binary_crossentropy is used as the loss function. For that reason I used adadelta and binary_crossentropy in both autoencoders, but I'm not sure whether this is the correct configuration for both cases.