How to build a deep learning text classifier using convolutional neural networks (Python)

What are the steps I would need to take to build a deep learning text classifier, more specifically a text classifier that identifies the author (authorship attribution) of a set of unlabeled texts? The model I am looking at using is a word-level CNN (convolutional neural network), which has proven very successful at tasks such as text classification. I am looking to build this model in Python.
I am new to deep learning, so any resources and information are appreciated.

An end-to-end example demonstrating how to use a convolutional neural network for text classification with tensorflow.keras is shown below:
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

max_features = 10000  # vocabulary size: keep the 10,000 most frequent words
max_len = 500         # truncate/pad every review to 500 tokens

print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

model = Sequential()
model.add(layers.Embedding(max_features, 128, input_length=max_len))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1, activation='sigmoid'))  # sigmoid output for binary classification
model.summary()

model.compile(optimizer=RMSprop(learning_rate=1e-4),  # 'lr' is deprecated in favor of 'learning_rate'
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit(x_train, y_train, epochs=10, batch_size=128, validation_split=0.2)
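After training, a quick sanity check on the held-out test set; a minimal sketch reusing the variables from the listing above:
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)
# probability that the first test review is positive
print(model.predict(x_test[:1]))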
For more information, please refer to Section 6.4, "Sequence processing with convnets", in the book Deep Learning with Python by François Chollet, the creator of Keras.
Hope this helps. Happy Learning!

Related

Issues with eli5 for feature importance in Keras

The end goal is to determine the important features in a neural network model built with tf.keras or KerasRegressor. The logic has been explained in this question, which uses eli5 to introduce noise into each variable and measure the change in the outcome.
I have been attempting for hours to implement this with no luck.
Question:
Why won't eli5 for feature importance work on either of my models?
My Current Error:
ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets
I have built the model in both tf.keras and KerasRegressor after reading somewhere that eli5 doesn't work with tensorflow.keras. I admit I do not truly understand the difference between KerasRegressor and tf.keras.
Code for Keras Regressor Model:
def base_model():
    # 1 - Instantiate the model
    modelNEW = keras.Sequential()
    # 2 - Specify the shape of the first layer
    modelNEW.add(layers.Dense(512, activation='relu', input_shape=ourInputShape))
    # 3 - Add the output layer; softmax returns an array of probability scores,
    #     and in this case we have to predict one of CSCANCEL, MEMBERCANCEL, ACTIVE
    modelNEW.add(layers.Dense(3, activation='softmax'))
    modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    return modelNEW

# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
    EarlyStopping(patience=2)
]

yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)

# shuffle takes a boolean, not the string 'True'
currentModel = KerasRegressor(build_fn=base_model, epochs=100, batch_size=50, shuffle=True)
history = currentModel.fit(xTrain, yTrain)
Code for tf.Keras model:
The only change is the model name:
modelNEW = keras.Sequential()
modelNEW.add(layers.Dense(512, activation='relu', input_shape=ourInputShape))
# softmax returns an array of probability scores, and in this case we have to
# predict one of CSCANCEL, MEMBERCANCEL, ACTIVE
modelNEW.add(layers.Dense(3, activation='softmax'))
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
    EarlyStopping(patience=2)
]

yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)

history = modelNEW.fit(xTrain, yTrain, epochs=100, batch_size=50, shuffle=True)
Attempting to implement eli5:
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance
from eli5 import show_weights
perm = PermutationImportance(currentModel, scoring="accuracy", random_state=1).fit(xTrain,yTrain)
eli5.show_weights(perm, feature_names = xTrain.columns.tolist())
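For reference, the ValueError above typically means the scorer received one-hot (multilabel-indicator) targets while the regressor produced continuous outputs, which a string scorer like "accuracy" cannot compare. A minimal sketch of one possible workaround, assuming the one-hot yTrain from above; the onehot_accuracy scorer is an illustration, not part of the original post:
import numpy as np
from sklearn.metrics import accuracy_score

def onehot_accuracy(estimator, X, y):
    # collapse one-hot targets and continuous predictions to class indices
    return accuracy_score(np.argmax(y, axis=1),
                          np.argmax(estimator.predict(X), axis=1))

perm = PermutationImportance(currentModel, scoring=onehot_accuracy, random_state=1).fit(xTrain, yTrain)
eli5.show_weights(perm, feature_names=xTrain.columns.tolist())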

Is it possible to extend python imageai pretrained model for more classes?

I am working on a project which uses imageai with YOLOv3, which works fast and accurately for my purpose. However, this model can detect only 80 classes; I want some of them, but I also want to add some more classes.
I referred to https://imageai.readthedocs.io/en/latest/customdetection/index.html to train my own custom model with 3 more classes. However, I am unable to detect the 80 classes that were provided by YOLOv3. Is there a way to generate a model that extends the existing YOLOv3 and can detect all 80 classes plus the extra classes I want?
P.S. I am new to tensorflow and imageai so I don't know too much. Please bear with me.
I have not yet found a way to extend an existing model, but I can assure you that training your own model is far more efficient than using all the classes no one wants.
If it still interests you, this person had a similar question: Loading a trained Keras model and continue training
This is his finished code example:
"""
Model by: http://machinelearningmastery.com/
"""
import numpy
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import load_model
numpy.random.seed(7)
def baseline_model():
model = Sequential()
model.add(Dense(num_pixels, input_dim=num_pixels, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
if __name__ == '__main__':
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
# build the model
model = baseline_model()
#Partly train model
dataset1_x = X_train[:3000]
dataset1_y = y_train[:3000]
model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100-scores[1]*100))
#Save partly trained model
model.save('partly_trained.h5')
del model
#Reload model
model = load_model('partly_trained.h5')
#Continue training
dataset2_x = X_train[3000:]
dataset2_y = y_train[3000:]
model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100-scores[1]*100))

Keras Embedding layer output dimensionality

I am confused by the output dimensionality specified in the embedding layer in this code snippet:
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.layers import Dense
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN
max_features = 10000
maxlen = 500
batch_size = 32
print('Loading data...')
(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features)
print(len(input_train), 'train sequences')
print(len(input_test), 'test sequences')
print('Pad sequences (samples x time)')
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)
print('input_train shape:', input_train.shape)
print('input_test shape:', input_test.shape)
print(input_train)
model = Sequential()
model.add(Embedding(max_features, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
Since the max_features is 10000, shouldn't the Embedding have an output dimensionality of 10000?
max_features is the number of words, not the dimensionality. In your embedding layer you have 10,000 words that are each represented as an embedding of dimension 32.
The output dimensionality of the embedding is the size of the vector used to represent each word. In your case, a 32-dimensional vector represents each of the 10k words you might get in your dataset.
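A quick shape check makes this concrete; a minimal standalone sketch, independent of the IMDB data:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

m = Sequential()
m.add(Embedding(10000, 32))      # vocabulary of 10000 words, 32-dim vectors
sample = np.array([[5, 20, 7]])  # one sequence of three word indices
print(m.predict(sample).shape)   # (1, 3, 32): one 32-dim vector per word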

Keras imdb sentiment model - how to predict sentiment of new sentences?

I'm working my way through the Deep Learning with Python book where there is an example for learning word embeddings for sentiment:
from keras.datasets import imdb
from keras import preprocessing
max_features = 10000
maxlen = 20
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = preprocessing.sequence.pad_sequences(x_test, maxlen=maxlen)
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense
model = Sequential()
model.add(Embedding(10000, 8, input_length=maxlen))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.summary()
history = model.fit(x_train, y_train,
epochs=10,
batch_size=32,
validation_split=0.2)
I would like to pass in a sentence and make a prediction on the sentiment. My first thought was to pass in an array of indices (because that's how the words are represented in the model, if I have understood correctly), such as:
import numpy as np
# does this reflect a really bad review?
model.predict(np.array([[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,]]))
[out] array([[ 0.0066505]], dtype=float32)
# does this reflect a really good review?
model.predict(np.array([[9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999,9999 ]]))
[out] array([[ 0.64767915]], dtype=float32)
How can I pass in a list of words instead of indices? I.e. how can I retrieve a list of word indices for my new sentence?
Update - I tried to tokenize some words:
# assumes word_index = imdb.get_word_index() and
# from keras.preprocessing.text import text_to_word_sequence
def index(word):
    if word in word_index:
        return word_index[word]
    else:
        return "0"

def sequences(words):
    words = text_to_word_sequence(words)
    seqs = [[index(word) for word in words if word != "0"]]
    return preprocessing.sequence.pad_sequences(seqs, maxlen=maxlen)
bad_seq = sequences("Rubbish terrible awful dreadful hate stinks")
good_seq = sequences("Awesome recommended brilliant best")
print("bad movie: " + str(model.predict(bad_seq))) # 0.00759153
print("good movie: " + str(model.predict(good_seq))) # 0.00423771
The sentiment is very similar, which suggests that my tokenizing approach does not work.
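One likely reason, for reference: imdb.load_data shifts every word index by 3 (0 is padding, 1 marks the start of a sequence, 2 is out-of-vocabulary), while imdb.get_word_index returns the unshifted indices. A minimal sketch of an encoder that applies the offset; this encode helper is an illustration, not part of the original post:
from keras import preprocessing
from keras.datasets import imdb
from keras.preprocessing.text import text_to_word_sequence

word_index = imdb.get_word_index()
max_features = 10000
maxlen = 20

def encode(sentence):
    # load_data reserves 0 (padding), 1 (start marker) and 2 (out-of-vocabulary),
    # so the raw word_index values are shifted by 3 in the training data
    ids = [word_index.get(w, -3) + 3 for w in text_to_word_sequence(sentence)]
    ids = [i if 2 < i < max_features else 2 for i in ids]  # clamp unknown/rare words to OOV
    return preprocessing.sequence.pad_sequences([ids], maxlen=maxlen)

bad_seq = encode("Rubbish terrible awful dreadful hate stinks")
print("bad movie: " + str(model.predict(bad_seq)))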

How do I test my own handwritten digits, or an image from the MNIST dataset, using a CLI

I'm studying machine learning and I'm totally new to this. I have been given a task to build a simple command-line program that takes in a handwritten digit image
and outputs a prediction of which digit the computer thinks the image contains, using the MNIST dataset. I found code that uses Keras.
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
batch_size = 128
num_classes = 10
epochs = 20
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=batch_size, epochs=epochs,
                    verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
After I execute this code, how can I turn it into a simple CLI program that receives a picture and predicts which digit it most likely contains?
For example, I saw in one YouTube tutorial how to identify flowers (rose, daisy, dandelion, sunflower, and tulip) by only executing:
# In Docker
python /tf_files/label_image.py /tf_files/flower_photos/daisy/21652746_cc379e0eea_m.jpg
after restarting Docker, and it will show the computer's confidence. So what command can I use to test my own image, or one image from the MNIST dataset, and get a prediction?
It looks like this code is learning how to identify the digits, but when it's finished the model disappears. If you want to be able to use the model later, you'll want to try model.save(filepath). (More information on how to save and load is here: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model)
Then you could create a separate script, image_label.py for example, that loads the model and runs the second argument through the network. You're going to need to do some preprocessing of handwritten image files to run them through a network trained for MNIST images. If you want to test it on MNIST sample images it might be a little easier.
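As a rough sketch of what such a script could look like (the file name mnist_model.h5 and the preprocessing are assumptions; MNIST digits are light-on-dark, so a typical scan needs inverting):
# image_label.py -- minimal sketch; assumes the trained model was saved
# beforehand with model.save('mnist_model.h5')
import sys
import numpy as np
from PIL import Image
from keras.models import load_model

model = load_model('mnist_model.h5')

# load the image, convert to grayscale, and scale to the 28x28 MNIST size
img = Image.open(sys.argv[1]).convert('L').resize((28, 28))
x = 255 - np.asarray(img, dtype='float32')  # invert: MNIST digits are white on black
x = (x / 255).reshape(1, 784)               # flatten, matching the training input

print('Predicted digit:', int(np.argmax(model.predict(x), axis=1)[0]))
Run it as, for example: python image_label.py my_digit.png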
