How to use a portion of the data? - python-3.x

I want to randomly use a portion of the MNIST dataset. Can you help me, please? Right now the output shape (i.e. Out) is 60000, but I want about 2000:
import matplotlib.pyplot as plt
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784) / 255
x_test = x_test.reshape(10000, 784) / 255
x_train.shape # Out: (60000, 784)

Just take a slice of x_train:
new_x_train = x_train[:2000]
If the data in x_train is ordered (e.g. all digits of class 0, then class 1, etc.), you should first shuffle the data and then slice it:
import numpy as np
indices = np.arange(x_train.shape[0])
np.random.shuffle(indices)
x_train = x_train[indices]
Read more about slicing in the NumPy documentation.
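If you also need the labels to stay aligned with the sampled images, here is a minimal sketch that draws a random 2000-example subset with np.random.choice (this variant is my addition, not part of the original answer):
import numpy as np
from keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()
x_train = x_train.reshape(60000, 784) / 255

# Draw 2000 random indices without replacement and apply them to both
# the images and the labels, so the pairs stay matched up.
subset = np.random.choice(x_train.shape[0], size=2000, replace=False)
x_small, y_small = x_train[subset], y_train[subset]
print(x_small.shape)  # (2000, 784)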

Related

Numpy Python: Exception: Data must be 1-dimensional

Getting the exception Exception: Data must be 1-dimensional using NumPy in Python 3.7.
The same code works for others but not in my case. Below is my code; please help.
import numpy as np
import pandas as pd  # needed for pd.read_csv below
from sklearn import linear_model
from sklearn.model_selection import train_test_split
import seaborn as sns
from sklearn import metrics
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.read_csv('./Data/new-data.csv', index_col=False)
x_train, x_test, y_train, y_test = train_test_split(df['Hours'], df['Marks'], test_size=0.2, random_state=42)
sns.jointplot(x=df['Hours'], y=df['Marks'], data=df, kind='reg')
x_train = np.reshape(x_train, (-1,1))
x_test = np.reshape(x_test, (-1,1))
y_train = np.reshape(y_train, (-1,1))
y_test = np.reshape(y_test, (-1,1))
#
print('Train - Predictors shape', x_train.shape)
print('Test - Predictors shape', x_test.shape)
print('Train - Target shape', y_train.shape)
print('Test - Target shape', y_test.shape)
The expected output should be:
Train - Predictors shape (80, 1)
Test - Predictors shape (20, 1)
Train - Target shape (80, 1)
Test - Target shape (20, 1)
Instead, I am getting the exception Exception: Data must be 1-dimensional.
I think you need to call np.reshape on the underlying NumPy array rather than on the Pandas Series; you can do this using .values:
x_train = np.reshape(x_train.values, (-1, 1))
Repeat the same idea for the next three lines.
Or, if you are on a recent version of Pandas >= 0.24, to_numpy is preferred:
x_train = np.reshape(x_train.to_numpy(), (-1, 1))
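Putting the suggestion together, the four reshape lines from the question would become (a sketch that continues the question's variables, assuming Pandas >= 0.24 for to_numpy):
x_train = np.reshape(x_train.to_numpy(), (-1, 1))
x_test = np.reshape(x_test.to_numpy(), (-1, 1))
y_train = np.reshape(y_train.to_numpy(), (-1, 1))
y_test = np.reshape(y_test.to_numpy(), (-1, 1))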
numpy.squeeze() removes all dimensions of size 1 from a NumPy array:
x_train = numpy.squeeze(x_train)
This converts an (80, 1) array to (80,).

image preprocessing is not working in vgg16

I am learning image classification using transfer learning (VGG16), and I am using the built-in Fashion-MNIST dataset from Keras.
from keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
To preprocess the data for VGG16, I used the commands below, importing preprocess_input from keras.applications.vgg16:
X_train = preprocess_input(x_train)
X_test = preprocess_input(x_test)
train_features = vgg16.predict(np.array(X_train), batch_size=256, verbose=1)
test_features = vgg16.predict(np.array(X_test), batch_size=256, verbose=1)
but I am getting the error below:
ValueError: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (60000, 28, 28)
I am using Keras 2.2.4 and pip 19.0.3.
The Fashion-MNIST dataset contains grayscale images, i.e. images with only a single channel, while VGG16 was trained on RGB images with 3 channels. As the error says, you cannot feed VGG16 single-channel input. To use VGG16 with the Fashion-MNIST dataset you have to turn the images into three-channel images. You can process your X_train and X_test as follows using np.stack:
import numpy as np
X_train = np.stack((X_train,)*3, axis=-1)
X_test = np.stack((X_test,)*3, axis=-1)
VGG16 accepts a minimum input size of 32 and a maximum of 224 (this can be seen in the Keras source). To reshape the data to meet the minimum, we can do:
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1) # (N, 28, 28, 1)
x_train = np.pad(x_train, ((0,0),(2,2),(2,2),(0,0)), 'constant', constant_values=(0, 0)) # pad to the minimum size: (N, 32, 32, 1)
x_train = np.stack((x_train,)*3, axis=-1) # (N, 32, 32, 1, 3)
x_train = x_train[:,:,:,0,:] # (N, 32, 32, 3)
y_train = keras.utils.to_categorical(y_train, num_classes)
The result can be used directly with .fit(), .evaluate() and .predict() in Keras, without converting it into tensor data or writing generators.
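For reference, a minimal end-to-end sketch that combines the two answers (the include_top=False choice and the variable names are my assumptions, not taken from the question):
import numpy as np
from keras.datasets import fashion_mnist
from keras.applications.vgg16 import VGG16, preprocess_input

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

# Pad the 28x28 grayscale images to 32x32 and repeat the single channel three times
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_train = np.pad(x_train, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')
x_train = np.repeat(x_train, 3, axis=-1)  # (N, 32, 32, 3)
x_train = preprocess_input(x_train.astype('float32'))

# Convolutional base only; include_top=False is what allows the 32x32 input size
base = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
train_features = base.predict(x_train[:1000], batch_size=256, verbose=1)  # subset to keep memory modest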

ValueError: cannot resize this array: it does not own its data

ValueError: cannot resize this array: it does not own its data
from keras.datasets import cifar10
import numpy as np
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train1 = X_train.copy().ravel()
y_train1 = y_train.copy().ravel()
X_train2 = X_train1.resize(64*64*500)
y_train2 = y_train1.resize(64*64*500)
X_train = X_train2.resize(64*64*500).reshape(64, 64, 1)
y_train = y_train2.resize(64*64*500).reshape(64, 64, 1)
Why am I getting this error after explicitly copying the data? How can I fix this?
Use np.resize from NumPy, which returns a new array, instead of the in-place ndarray.resize method. Change the lines to:
X_train2 = np.resize(X_train1, 64*64*500)
y_train2 = np.resize(y_train1, 64*64*500)
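The underlying issue is that ndarray.resize works in place, returns None, and requires the array to own its data, while ravel usually returns a view that does not. np.resize, by contrast, always returns a new array. A small illustration (the array sizes are just for demonstration):
import numpy as np

a = np.arange(12).reshape(3, 4)
v = a.ravel()           # a view on a's buffer: it does not own its data

b = np.resize(v, 6)     # fine: builds and returns a new 6-element array
# v.resize(6)           # raises "cannot resize this array: it does not own its data"
print(b.shape)          # (6,)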

Error in creating h5 file (hdf file)

With the code below I have saved the model's weights in mnist_weights1234.h5, and I want to create a file like mnist_weights1234.h5 with the same layer configuration:
from __future__ import print_function  # the __future__ import must come before other imports
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import numpy as np
from sklearn.model_selection import train_test_split
batch_size = 128
num_classes = 3
epochs = 1
# input image dimensions
img_rows, img_cols = 28, 28
#Just for reducing data set
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x1_train=x_train[y_train==0]; y1_train=y_train[y_train==0]
x1_test=x_test[y_test==0];y1_test=y_test[y_test==0]
x2_train=x_train[y_train==1];y2_train=y_train[y_train==1]
x2_test=x_test[y_test==1];y2_test=y_test[y_test==1]
x3_train=x_train[y_train==2];y3_train=y_train[y_train==2]
x3_test=x_test[y_test==2];y3_test=y_test[y_test==2]
X=np.concatenate((x1_train,x2_train,x3_train,x1_test,x2_test,x3_test),axis=0)
Y=np.concatenate((y1_train,y2_train,y3_train,y1_test,y2_test,y3_test),axis=0)
# the data, shuffled and split between train and test sets
x_train, x_test, y_train, y_test = train_test_split(X,Y)
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(1, kernel_size=(2, 2),
                 activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(16, 16)))
model.add(Flatten())
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.save_weights('mnist_weights1234.h5')
Now I want to create a file like mnist_weights.h5, so I use the code below and get an error:
import h5py
hf = h5py.File('mnist_weights12356.h5', 'w')
hf.create_dataset('conv2d_2/conv2d_2/bias', data=weights[0])
hf.create_dataset('conv2d_2/conv2d_2/kernel', data=weights[1])
hf.create_dataset('dense_2/dense_2/bias', data=weights[2])
hf.create_dataset('dense_2/dense_2/kernel', data=weights[3])
hf.create_dataset('flatten_2', data=None)
hf.create_dataset('max_pooling_2d_2', data=None)
hf.close()
But I am getting the following error: TypeError: One of data, shape or dtype must be specified.
How can I solve this problem?
If you want to use weights that are in numpy arrays, simply set the weights in the layers:
model.get_layer('conv2d_2').set_weights([weights[1],weights[0]])
model.get_layer('dense_2').set_weights([weights[3],weights[2]])
If your arrays are stored in files:
array = numpy.load('arrayfile.npy')
You can save the entire model weights as numpy arrays:
numpy.save('weights.npy', model.get_weights())
model.set_weights(numpy.load('weights.npy'))
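Note that model.get_weights() returns a list of arrays, so NumPy stores it as an object array; on recent NumPy versions you may need to pass allow_pickle=True to numpy.load to read it back.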
The error message has your solution. In these lines:
hf.create_dataset('flatten_2', data=None)
hf.create_dataset('max_pooling_2d_2', data=None)
You are passing data=None. To create a dataset, the HDF5 library needs a minimum amount of information: as the error says, you must specify either a dtype (the data type of the dataset's elements), a non-None data parameter (from which shape and dtype are inferred), or a shape parameter. You are giving none of these, so the error is correct. Just provide enough information in the create_dataset call for a dataset to be created.
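For illustration, here is a sketch of create_dataset calls that do provide enough information (the names, shapes and dtypes are hypothetical):
import numpy as np
import h5py

with h5py.File('example_weights.h5', 'w') as hf:
    # shape and dtype are inferred from the data argument
    hf.create_dataset('conv2d_2/conv2d_2/kernel', data=np.zeros((2, 2, 1, 1), dtype='float32'))
    # or give an explicit shape and dtype for an (initially zero-filled) dataset
    hf.create_dataset('dense_2/dense_2/bias', shape=(3,), dtype='float32')
    # layers without weights (Flatten, MaxPooling2D) are better represented as groups
    hf.create_group('flatten_2')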

Image plotting - after processing

I am studying the Keras package for deep learning, and found a nice code example at https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py that integrates image pre-processing (e.g. rotations and shifts). I was wondering: is there an easy way to plot the training images after pre-processing, to observe the impact of these rotations and shifts?
You can save the generated images to the disk by giving save_to_dir='path_to_dir' to the flow() function of the data generator.
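A minimal sketch of that approach (the directory name and augmentation parameters are illustrative):
import os
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), _ = mnist.load_data()
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32')

os.makedirs('augmented_previews', exist_ok=True)  # save_to_dir expects an existing directory
datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.1)

# Write one augmented batch to disk, then stop
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9,
                                     save_to_dir='augmented_previews',
                                     save_prefix='aug', save_format='png'):
    break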
Yes, it is possible to plot the images. For example, in the case of the MNIST dataset:
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True)
datagen.fit(X_train)
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9):
    # grid of 3x3 images
    for i in range(0, 9):
        pyplot.subplot(330 + 1 + i)
        pyplot.imshow(X_batch[i].reshape(28, 28), cmap=pyplot.get_cmap('gray'))
    pyplot.show()
    break
For more details, refer to this link.
