I am having trouble reshaping my data to fit a convolutional neural network. I've tried many solutions but still can't get it to work. The dataset contains 800 rows and 271 columns (the last column contains the class label). There are 9 classes in total. Below is my code:
dataset = pd.read_csv('train.csv')
X = dataset.iloc[:, 0:270].values
y = dataset.iloc[:, 270].values
print("X Shape: " + str(X.shape))  # ---> (804, 270)
# *** Reshaping variables here
X_train, X_test, y_train, y_test = train_test_split(X_reshaped, Y_reshaped, test_size = 0.20)
model = Sequential()
model.add(Convolution1D(64, kernel_size=(10), input_shape=(X_train.shape[1],X_train.shape[2])))
model.add(Activation('relu'))
model.add(MaxPooling1D(3))
model.add(Flatten())
model.add(Dense(100))
model.add(Dropout(0.5))
model.add(Dense(9))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(X_train,y_train,validation_data=(X_test,y_test))
print(str(model.evaluate(X_test, y_test)))
Is there any way to successfully reshape the variables for training the model? Thanks!
Convolution1D requires an input of the form
(samples, steps, input_dim)
Right now you are passing
(samples,input_dim)
You need to reshape the data depending on how the timesteps are arranged in the 800 rows.
For example, if the 800 rows are 80 samples of 10 timesteps each, with the 10 timesteps of the first sample followed by the 10 of the next and so on,
then you need to reshape it as (80,10,270).
Convolution1D is meant for processing temporal data, and you do not seem to have any. You need to split your data into a number of samples and timesteps.
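As a rough sketch of the reshape described above, here are two possible layouts using NumPy only. The random array stands in for the real data, and both layouts are assumptions about how the rows are organised: which one applies (if either) depends on what the rows and columns actually mean.

```python
import numpy as np

n_rows, n_features = 800, 270
# Placeholder for the real feature matrix (assumption: 800 rows, 270 feature columns)
X = np.random.rand(n_rows, n_features).astype("float32")

# Layout A (assumption): rows are stored as consecutive blocks of 10 timesteps per sample
timesteps = 10
X_blocks = X.reshape(n_rows // timesteps, timesteps, n_features)  # (samples, steps, input_dim)

# Layout B (assumption): every row is its own sample and the 270 columns
# are treated as 270 steps of a single channel
X_single_channel = X.reshape(n_rows, n_features, 1)  # (samples, steps, input_dim)

print(X_blocks.shape)          # (80, 10, 270)
print(X_single_channel.shape)  # (800, 270, 1)
```

With layout A there would be one label per block of 10 rows rather than one per row, so the labels would need the same grouping. Note also that with only 10 steps, a kernel_size of 10 leaves a single output step, so the MaxPooling1D(3) layer in the question would likely need adjusting.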
First of all, I'm very new to ML and I have this task:
I need to build an ML model that gives each client a list of the 10 professions that best fit their data: bachelor's degree type, favourite subjects, favourite professional sectors, etc.
My team and I have already extracted the info from an SQL database, created a dataframe with all the relevant information, and one-hot encoded it. This is the result:
df_all_dumm.shape
(773, 1029)
So I have 773 clients and 1029 columns (a lot of them, but we thought it was necessary because all the columns were numeric categorical). Most of the columns are OHE profession columns (from 99 to 998), which contain "1" if the profession has been suggested to the client or "0" if not.
I'm a bit lost as to whether this dataset approach is fine for multi-label classification and which method to use (NN, RandomTrees classifier, scikit-multilearn models...). I have already tried some multi-learn models like MLkNN or BRkNNaClassifier, but the results are very poor (F1 score = 0.1 - 0.2).
This is the dataset (it doesn't contain any private data, so I think there is no problem uploading it. Also, I don't know if this is the proper way to paste a link, sorry again):
https://drive.google.com/file/d/1nID4q7EfpoiNKdWz6N4FRUgIQEniwFRV/view?usp=sharing
EDIT:
I have created a Sequential Keras model:
# Slicing target columns from the rest of the df
data = pd.read_csv('df_all_dumm.csv')
data_c = data.copy()
data_in = data_c.copy()
# Target columns: the one-hot encoded professions
data_out = data_c.iloc[:, 99:999]
# Inputs: everything except the targets and the id column
data_in = data_in.drop(columns=data_out.columns)
data_in = data_in.drop(columns=['id'])
X_train, X_test, y_train, y_test = train_test_split(data_in,
                                                    data_out,
                                                    test_size = 0.3,
                                                    random_state = 42)
print("{0:2.2f}% of data in train set".format(len(X_train)/len(data.index)*100))
print("{0:2.2f}% of data in test set".format(len(X_test)/len(data.index)*100))
# Dataframes to tensors
X_train_tf = tf.convert_to_tensor(X_train.values)
X_test_tf = tf.convert_to_tensor(X_test.values)
y_train_tf = tf.convert_to_tensor(y_train.values)
y_test_tf = tf.convert_to_tensor(y_test.values)
from numpy import asarray
from sklearn.datasets import make_multilabel_classification
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
# get the model
def get_model(n_inputs, n_outputs):
    model = Sequential()
    model.add(Dense(512, input_dim=n_inputs, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(600, input_dim=n_inputs, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(700, input_dim=n_inputs, activation='relu'))
    model.add(Dense(900, input_dim=n_inputs, activation='relu'))
    model.add(Dense(n_outputs, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
# load dataset
X, y = X_train_tf, y_train_tf
n_inputs, n_outputs = X.shape[1], y.shape[1]
# get model
model = get_model(n_inputs, n_outputs)
# fit the model on all data
model.fit(X, y, validation_split=0.33, epochs=100, batch_size=10)
prec = model.evaluate(X_test_tf, y_test_tf)[1]
print("The network's accuracy is: {} %".format(round(prec*100,2)))
The network's accuracy is: 4.74 %
So, this is the model we created. I think the main problem here is that we have 900 different output labels and our data input size is 500...
We also thought about first applying a clustering algorithm to group the professions into, say, 5 clusters, then training a NN for every cluster and predicting with those.
I have training data in the form of numpy arrays that I will use in a ConvLSTM.
The dimensions of the array are as follows.
trainX = (5000, 200, 5), where 5000 is the number of samples, 200 is the number of timesteps per sample, and 8 is the number of features per timestep (samples, timesteps, features).
Out of these 8 features, 3 remain the same throughout all timesteps in a sample (in other words, these features are directly related to the sample), for example day of the week, month number, and weekday (these change from sample to sample). To reduce the complexity, I want to keep these three features separate from the initial training set and merge them with the output of the ConvLSTM layer before applying the dense layer for classification (softmax activation). E.g.
the initial training set dimensions would be (7000, 200, 5) and the auxiliary input to be merged would have dimensions (7000, 3), because these 3 features are directly related to the sample. How can I implement this using Keras?
Below is the code I wrote using the Functional API, but I don't know how to merge these two inputs.
#trainX.shape=(7000,200,5)
#trainy.shape=(7000,4)
#testX.shape=(3000,200,5)
#testy.shape=(3000,4)
#trainMetadata.shape=(7000,3)
#testMetadata.shape=(3000,3)
verbose, epochs, batch_size = 1, 50, 256
samples, n_features, n_outputs = trainX.shape[0], trainX.shape[2], trainy.shape[1]
n_steps, n_length = 4, 50
input_shape = (n_steps, 1, n_length, n_features)
model_input = Input(shape=input_shape)
clstm1 = ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu',return_sequences = True)(model_input)
clstm1 = BatchNormalization()(clstm1)
clstm2 = ConvLSTM2D(filters=128, kernel_size=(1,3), activation='relu',return_sequences = False)(clstm1)
conv_output = BatchNormalization()(clstm2)
metadata_input = Input(shape=trainMetadata.shape)
merge_layer = np.concatenate([metadata_input, conv_output])
dense = Dense(100, activation='relu', kernel_regularizer=regularizers.l2(l=0.01))(merge_layer)
dense = Dropout(0.5)(dense)
output = Dense(n_outputs, activation='softmax')(dense)
model = Model(inputs=merge_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit([trainX, trainMetadata], trainy, validation_data=([testX, testMetadata], testy), epochs=epochs, batch_size=batch_size, verbose=verbose)
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
y = model.predict(testX)
But I am getting a ValueError at the merge_layer statement:
ValueError: zero-dimensional arrays cannot be concatenated
What you are describing cannot be done with Keras's Sequential model.
You need to use the Model class (see the Keras guide to the Model class API).
With this API you can build the complex model you are looking for.
An example of how to use it is given in "How to Use the Keras Functional API for Deep Learning".
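For illustration, here is a minimal sketch of a two-input model built with the Model class. It assumes TensorFlow's bundled Keras is installed; the layer sizes, the input names "sequence" and "metadata", and the random stand-in data are all made up for the example and are not taken from the question. The key points are that the merge uses a Keras layer (Concatenate) rather than a NumPy function, that the convolutional output is flattened before merging, and that both Input tensors are passed to Model.

```python
import numpy as np
from tensorflow.keras.layers import (Input, ConvLSTM2D, BatchNormalization,
                                     Flatten, Concatenate, Dense, Dropout)
from tensorflow.keras.models import Model

# Assumed sizes: 200 timesteps split into 4 sub-sequences of 50, 5 features,
# 3 per-sample metadata features, 4 output classes
n_steps, n_length, n_features = 4, 50, 5
n_meta, n_outputs = 3, 4

# Branch 1: the time series, shaped (n_steps, rows=1, n_length, n_features)
seq_input = Input(shape=(n_steps, 1, n_length, n_features), name="sequence")
x = ConvLSTM2D(filters=8, kernel_size=(1, 3), activation="relu")(seq_input)
x = BatchNormalization()(x)
x = Flatten()(x)  # flatten to (batch, features) so it can be concatenated

# Branch 2: per-sample metadata; shape excludes the batch dimension
meta_input = Input(shape=(n_meta,), name="metadata")

# Merge with a Keras layer, not np.concatenate
merged = Concatenate()([x, meta_input])
dense = Dense(100, activation="relu")(merged)
dense = Dropout(0.5)(dense)
output = Dense(n_outputs, activation="softmax")(dense)

# The model takes both inputs, in the order they will be fed
model = Model(inputs=[seq_input, meta_input], outputs=output)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random stand-in data: (samples, 200, 5) reshaped to the 5D ConvLSTM2D layout
demo_x = np.random.rand(2, 200, n_features).astype("float32")
demo_x = demo_x.reshape((2, n_steps, 1, n_length, n_features))
demo_meta = np.random.rand(2, n_meta).astype("float32")
preds = model.predict([demo_x, demo_meta], verbose=0)
print(preds.shape)
```

The same list of two arrays would be passed to fit, evaluate, and predict, since the model now expects both inputs every time.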
My task is to detect defective items in a factory, i.e. to classify goods as defective or fine. This leads to a problem where one class dominates the other (one class is 99.7% of the data), because defective items are very rare. Training accuracy is 0.9971 and validation accuracy is 0.9970, which sounds amazing.
But the problem is that the model predicts class 0 (fine goods) for everything. That means it fails to identify any defective goods.
How can I solve this problem? I have checked other questions and tried their suggestions, but the situation is unchanged. The data has 122400 rows and 5 features.
In the end, the confusion matrix on the test set looks like this:
array([[30508, 0],
[ 92, 0]], dtype=int64)
which does a terrible job.
My code is as below:
le = LabelEncoder()
y = le.fit_transform(y)
ohe = OneHotEncoder(sparse=False)
y = y.reshape(-1,1)
y = ohe.fit_transform(y)
scaler = StandardScaler()
x = scaler.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.25, random_state = 777)
#DNN Modelling
epochs = 15
batch_size =128
Learning_rate_optimizer = 0.001
model = Sequential()
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu',
                input_shape=(5,)))
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(8,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(2,
                kernel_initializer='glorot_uniform',
                activation='softmax'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr = Learning_rate_optimizer),
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
y_pred = model.predict(x_test)
confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
Thank you
It sounds like you have a highly imbalanced dataset, so the model is only learning to classify fine goods.
You can try one of the approaches listed here:
https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
A good first attempt would be to take roughly equal portions of data from both classes, split them into train, validation, and test sets, train the classifier, and then test it thoroughly on your complete dataset. You can also try data augmentation on the under-represented class to get more data from the same set. Keep iterating, and maybe even change your loss function to suit your situation.
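As a minimal sketch of the "roughly equal portions" idea, the helper below randomly undersamples the majority class with NumPy. The function name undersample_majority and the toy data are made up for the example, and it assumes integer class labels (i.e. before one-hot encoding).

```python
import numpy as np

def undersample_majority(x, y, seed=0):
    """Keep as many rows of every class as the rarest class has (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    # For each class, pick n_min row indices at random without replacement
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)  # mix the classes so batches are not ordered by label
    return x[keep], y[keep]

# Toy data with a similar imbalance: 30 defective items out of 10000
toy_rng = np.random.default_rng(1)
x = toy_rng.normal(size=(10000, 5))
y = np.zeros(10000, dtype=int)
y[:30] = 1

x_bal, y_bal = undersample_majority(x, y)
print(np.bincount(y_bal))  # [30 30]
```

The balanced subset would be used for training only; the confusion matrix should still be computed on data with the original class proportions, since that is what the model will see in practice.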
Binary classification problem: I want to have one input layer (optional), one Conv1D layer, and then an output layer of 1 neuron predicting either 1 or 0.
Here is my model:
x_train = np.expand_dims(x_train,axis=1)
x_valid = np.expand_dims(x_valid,axis=1)
#x_train = x_train.reshape(x_train.shape[0], 1, x_train.shape[1])
#x_valid = x_train.reshape(x_valid.shape[0], 1, x_train.shape[1])
model = Sequential()
#hidden layer
model.add(Convolution1D(filters = 1, kernel_size = (3),input_shape=(1,x_train.shape[2])))
#output layer
model.add(Flatten())
model.add(Dense(1, activation = 'softmax'))
sgd = SGD(lr=0.01, nesterov=True, decay=1e-6, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
print('model compiled successfully')
model.fit(x_train, y_train, nb_epoch = nb_epochs, validation_data=(x_valid,y_valid), batch_size=100)
Input shape: x_train.shape = (5,1,133906), which is (batch, steps, channels) respectively. The steps dimension was added through expand_dims. The actual size is (5,133906): 5 samples of time series data of length 133906, sampled irregularly, sometimes at 2 ms and sometimes at 5 ms.
Error Message: ValueError: Negative dimension size caused by subtracting 3 from 1 for 'conv1d_1/convolution/Conv2D' (op: 'Conv2D') with input shapes: [?,1,1,133906], [1,3,133906,1].
How do I resolve this issue? What should the size of x_train and the input_size argument passed inside Conv1D be?
Convolution1D layers take input in the format [batch, steps, channels].
With the default 'valid' padding, the length of the convolution window (kernel size) cannot be larger than the number of steps.
Therefore, if you want to use your defined input shape of
x_train.shape = (5,1,133906)
you need to change the kernel size to 1,
i.e. change the Convolution1D line to
model.add(Convolution1D(filters = 1, kernel_size = 1,input_shape=(1,x_train.shape[2])))
However, this will only make your example run. Depending on your goals, data type, etc., you might want to try different combinations of kernel size and input dimensions to obtain the best results.
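The arithmetic behind the error can be sketched in a few lines. The helper name conv1d_valid_output_length is made up for the example; it applies the usual output-length formula for a 1D convolution with 'valid' padding. The second half shows, on a shortened stand-in array, the alternative of putting the time axis in the steps dimension, which is an assumption about what is intended and not necessarily the right layout for this data.

```python
import numpy as np

def conv1d_valid_output_length(steps, kernel_size, stride=1):
    """Output steps of a 1D convolution with 'valid' padding (hypothetical helper)."""
    return (steps - kernel_size) // stride + 1

# 1 step with a kernel of 3: no valid window fits, hence the negative dimension error
print(conv1d_valid_output_length(steps=1, kernel_size=3))       # -1
# 1 step with a kernel of 1: exactly one output step
print(conv1d_valid_output_length(steps=1, kernel_size=1))       # 1
# Time axis used as steps: plenty of room for a kernel of 3
print(conv1d_valid_output_length(steps=133906, kernel_size=3))  # 133904

# Stand-in for the (5, 133906) data, shortened to keep the example light
x = np.zeros((5, 1000), dtype="float32")
as_channels = np.expand_dims(x, axis=1)  # (5, 1, 1000): 1 step, 1000 channels
as_steps = np.expand_dims(x, axis=2)     # (5, 1000, 1): 1000 steps, 1 channel
print(as_channels.shape, as_steps.shape)
```

With the second layout the layer's input_shape would be (steps, 1) instead of (1, channels), and a kernel size of 3 would slide along the time axis.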
I have a single training batch of 600 sequential points (x(t), y(t)) with x(t) being a 25 dimensional vector and y(t) being my target (1 dim). I would like to train an LSTM to predict how the series would continue given a few additional x(t) [t> 600]. I tried the following model:
model = Sequential()
model.add(LSTM(128, input_shape = (600,25), batch_size = 1, activation= 'tanh', return_sequences = True))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=20, verbose=2)
prediction = model.predict(testX, batch_size = 1)
Fitting works fine, but I keep getting the following error at the prediction step:
Error when checking : expected lstm_46_input to have shape (1, 600, 25) but got array with shape (1, 10, 25)
What am I missing?
Here are my shapes:
trainX.shape = (1,600,25)
trainY.shape = (1,600,1)
testX.shape = (1,10,25)
According to the Keras documentation, the input of LSTM (or any RNN) layers should have shape (batch_size, timesteps, input_dim), whereas your input shape is
trainX.shape = (1,600,25)
This means that for training you are passing only one sample with 600 timesteps and 25 features per timestep. But I have a feeling that you actually have 600 training samples, each with 25 timesteps and 1 feature per timestep. I guess your input shape (trainX) should be 600 x 25 x 1, and the training target (trainY) should be 600 x 1. If my assumption is right, then your test data should have shape 10 x 25 x 1. The first LSTM layer should be written as
model.add(LSTM(128, input_shape = (25,1), batch_size = 1, activation= 'tanh', return_sequences = False))
If your training data is in fact (1,600,25), this means you are unrolling the LSTM feedback 600 times, so the first input has an impact on the 600th input. If this is what you want, you can use the Keras function "pad_sequences" to append zeros to the test matrix so that it has the shape (1,600,25). The network should predict zeros for the padded steps, and you will need to add 590 zeros to your testY.
If you only want, say, the 10 previous timesteps to affect your current Y prediction, then you will want to turn your trainX into shape (590,10,25). The input line will be something like:
model.add(LSTM(n_hid, stateful=True, return_sequences=False, batch_input_shape=(1,nTS,x_train.shape[2])))
The processing to get it into that form could be something like this:
import numpy as np

def formatTS(XX, yy, window_length):
    # One training sample per sliding window of window_length consecutive rows
    x_train = np.zeros((XX.shape[0]-window_length, window_length, XX.shape[1]))
    for i in range(x_train.shape[0]):
        x_train[i] = XX[i:i+window_length,:]
    # The target for each window is the y value that follows it
    y_train = yy[window_length:]
    return x_train, y_train
Then your testing will work just fine since it is already in the shape (1,10,25).
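To make the shapes concrete, here is a small self-contained run of the windowing function on placeholder arrays. The arange data is made up, and it assumes the original series is stored as a 2D array of shape (600, 25) with targets of shape (600, 1), i.e. without the leading batch dimension of 1.

```python
import numpy as np

def formatTS(XX, yy, window_length):
    # One sample per sliding window of window_length consecutive rows
    x_train = np.zeros((XX.shape[0] - window_length, window_length, XX.shape[1]))
    for i in range(x_train.shape[0]):
        x_train[i] = XX[i:i + window_length, :]
    # The target for each window is the y value that follows it
    y_train = yy[window_length:]
    return x_train, y_train

# Placeholder series: 600 timesteps, 25 features, 1 target per timestep
XX = np.arange(600 * 25, dtype="float64").reshape(600, 25)
yy = np.arange(600, dtype="float64").reshape(600, 1)

x_train, y_train = formatTS(XX, yy, window_length=10)
print(x_train.shape)  # (590, 10, 25)
print(y_train.shape)  # (590, 1)
```

Each x_train[i] holds rows i to i+9 of the series and is paired with the target at row i+10, so the first 10 targets are not used for training.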