I want to classify the CIFAR-10 dataset using a hierarchical SVM. I know a CNN would be the best choice, but I need to preprocess this data and then use a hierarchical SVM. I saw a post on hierarchical classification with SVM, but I am still confused about how to do it for CIFAR-10. I tried the following code for one level of the hierarchy, but it does not satisfy me, as I am only getting an accuracy of 90%. See the code below. Any help would be highly appreciated.
rootFolder = 'cifar10Train';
categories = {'Deer','Dog','Frog','Cat','truck','ship','airplane','horse',...
'bird','automobile'};
imds = imageDatastore(fullfile(rootFolder, categories), 'LabelSource',...
'foldernames');
%Load test data
rootFolder = 'cifar10Test';
imds_test = imageDatastore(fullfile(rootFolder, categories), ...
'LabelSource', 'foldernames');
% Hierarchical SVM
% data generation suffix 'T' is used for test dataset
Y = imds.Labels;
YT = imds_test.Labels;
L = length(Y);
LT = length(YT);
X = zeros(32*32*3,L);
XT = zeros(32*32*3,LT);
% New labels for hierarchy
Y1 = (Y=='Deer');
Y2 = (Y=='Dog');
Y3 = (Y=='Frog');
Y4 = (Y=='Cat');
Y5 = (Y=='truck');
Y6 = (Y=='ship');
Y7 = (Y=='airplane');
Y8 = (Y=='horse');
Y9 = (Y=='bird');
Y10= (Y=='automobile');
% for test dataset
Y1T = (YT=='Deer');
Y2T = (YT=='Dog');
Y3T = (YT=='Frog');
Y4T = (YT=='Cat');
Y5T = (YT=='truck');
Y6T = (YT=='ship');
Y7T = (YT=='airplane');
Y8T = (YT=='horse');
Y9T = (YT=='bird');
Y10T= (YT=='automobile');
% train samples
for i = 1:L
    img = readimage(imds, i);
    X(:,i) = double(img(:));
end
% test data
for i = 1:LT
    img = readimage(imds_test, i);
    XT(:,i) = double(img(:));
end
%First Linear classification
c1 = fitclinear(X',Y1);
pred1 = predict(c1,XT');
Acc = sum(pred1==Y1T)/LT;
I have also written a more compact version of the code. But the thing that confuses me is that I will have 9 classifiers, not one. Should I expect a single classifier that can classify all classes hierarchically?
function [Classifier, Accuracy] = HSVM_mine(Num_of_Classes, filename_train, filename_test)
% training data: rows are samples, columns are features
[Xtrain, Ytrain, Xtest, Ytest] = data_generate(filename_train, filename_test);
Accuracy = zeros(Num_of_Classes, 1);
Classifier = cell(Num_of_Classes, 1);
Labels = {'class 1', 'class 2', 'class 3', ...};   % one label per class (list elided)
m = size(Xtrain, 1)/Num_of_Classes;   % training samples per class
n = size(Xtest, 1)/Num_of_Classes;    % test samples per class
for i = 1:Num_of_Classes
    % classifier i separates class i from the classes that come after it
    X1 = Xtrain((1+(i-1)*m):end, :);
    Y1 = (Ytrain((1+(i-1)*m):end) == Labels{i});
    XT = Xtest((1+(i-1)*n):end, :);
    YT = (Ytest((1+(i-1)*n):end) == Labels{i});
    Classifier{i} = fitclinear(X1, Y1);
    Pred = predict(Classifier{i}, XT);
    Accuracy(i) = sum(Pred == YT)/length(YT);
end
end
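To make my confusion concrete, this is roughly how I picture the 9 binary classifiers being chained at prediction time (sketched in Python with scikit-learn purely for illustration; train_hierarchy, predict_hierarchy and class_order are names I made up, not part of my MATLAB code):
import numpy as np
from sklearn.svm import LinearSVC

def train_hierarchy(X, y, class_order):
    # One binary classifier per level: "is this class_order[k], or something further down?"
    classifiers = []
    for k, label in enumerate(class_order[:-1]):      # the last class needs no classifier
        mask = np.isin(y, class_order[k:])            # only samples not yet separated out
        clf = LinearSVC().fit(X[mask], y[mask] == label)
        classifiers.append(clf)
    return classifiers

def predict_hierarchy(classifiers, x, class_order):
    # Walk down the hierarchy until some level answers "yes"
    for clf, label in zip(classifiers, class_order[:-1]):
        if clf.predict(x.reshape(1, -1))[0]:
            return label
    return class_order[-1]                            # fell through every level
So what I am really asking is whether the "hierarchical SVM" is meant to be this collection of per-level classifiers plus the chaining logic, or a single model object.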
I am trying to implement parts of Facebook's Prophet with some help from this example:
https://github.com/luke14free/pm-prophet/blob/master/pmprophet/model.py
This is going well :), but I am having some problems with a dot product that I don't understand. Note that I am only implementing the linear trend.
import numpy as np
import pandas as pd
import pymc3 as pm
import theano

ds = pd.to_datetime(df['dagindex'], format='%d-%m-%y')

m = pm.Model()
changepoint_prior_scale = 0.05
n_changepoints = 25
changepoints = pd.date_range(
    start=pd.to_datetime(ds.min()),
    end=pd.to_datetime(ds.max()),
    periods=n_changepoints + 2
)[1:-1]

with m:
    # priors
    sigma = pm.HalfCauchy('sigma', 10, testval=1)
    # trend
    growth = pm.Normal('growth', 0, 10)
    prior_changepoints = pm.Laplace('changepoints', 0, changepoint_prior_scale,
                                    shape=len(changepoints))

    y = np.zeros(len(df))

    # indexes x_i for the changepoints
    s = [np.abs((ds - i).values).argmin() for i in changepoints]
    g = growth
    x = np.arange(len(ds))
    # delta
    d = prior_changepoints

    regression = x * g

    base_piecewise_regression = []
    for i in s:
        local_x = x.copy()[:-i]
        local_x = np.concatenate([np.zeros(i), local_x])
        base_piecewise_regression.append(local_x)

    piecewise_regression = np.array(base_piecewise_regression)

    # this dot product doesn't work?
    piecewise_regression = pm.math.dot(theano.shared(piecewise_regression).T, d)
    # If I comment out the line above and use this one as the dot product, it works fine:
    # piecewise_regression = (piecewise_regression.T * d[None, :]).sum(axis=-1)

    regression += piecewise_regression
    y += regression

    obs = pm.Normal('y',
                    mu=(y - df.gebruikers.mean()) / df.gebruikers.std(),
                    sd=sigma,
                    observed=(df.gebruikers - df.gebruikers.mean()) / df.gebruikers.std())

    start = pm.find_MAP(maxeval=10000)
    trace = pm.sample(500, step=pm.NUTS(), start=start)
If I run the snippet above with
piecewise_regression = (piecewise_regression.T * d[None, :]).sum(axis=-1)
the model works as expected. However, I cannot get it to work with a dot product; the NUTS sampler doesn't sample at all.
piecewise_regression = pm.math.dot(theano.shared(piecewise_regression).T, d)
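Mathematically, both forms should give the same numbers; here is a quick NumPy check I did outside the model (with made-up shapes) just to convince myself of that equivalence:
import numpy as np

# shapes analogous to the model: (n_changepoints, len(ds)) matrix and (n_changepoints,) delta
A = np.random.randn(25, 365)
d_test = np.random.randn(25)

dot_version = np.dot(A.T, d_test)                   # what pm.math.dot computes
sum_version = (A.T * d_test[None, :]).sum(axis=-1)  # the broadcast-and-sum form that works
print(np.allclose(dot_version, sum_version))        # True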
EDIT
The problem still occurs with theano.shared. I’ve got a minimal working example:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

np.random.seed(5)

n_changepoints = 10
t = np.arange(1000)
s = np.sort(np.random.choice(t, size=n_changepoints, replace=False))
a = (t[:, None] > s) * 1
real_delta = np.random.normal(size=n_changepoints)
y = np.dot(a, real_delta) * t

with pm.Model():
    sigma = pm.HalfCauchy('sigma', 10, testval=1)
    delta = pm.Laplace('delta', 0, 0.05, shape=n_changepoints)
    g = tt.dot(a, delta) * t
    obs = pm.Normal('obs',
                    mu=(g - y.mean()) / y.std(),
                    sd=sigma,
                    observed=(y - y.mean()) / y.std())
    trace = pm.sample(500)
It seems to have something to do with the size of the matrix a. NUTS doesn't sample if I start with
t = np.arange(1000)
However, the example above does sample when I reduce the size of t to:
t = np.arange(100)
Using "scipy.optimize.curve_fit" we can determine the fit parameters for a curve fit on x and y using
popt, pcov = curve_fit(func, xdata, ydata)
In the documentation for this function, they state that: To compute one standard deviation errors on the parameters use
perr = np.sqrt(np.diag(pcov))
Here's a link to the documentation I was reading: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
What if I want to compute something more general than simply one standard deviation on the errors of the parameters? In particular, what if I'm looking for, say, two standard deviations (a 95% confidence interval on the parameters)?
To be clear, I'm not looking for a 10+ line solution; I already know how to compute these errors in a "hackish" way for a linear function:
def get_slope_params(data1, data2):
    x_mean = mean(data1)
    y_mean = mean(data2)
    N = len(data1)

    sum_xy = 0
    for (x, y) in zip(data1, data2):
        sum_xy = sum_xy + x*y

    sum_xsq = 0
    for x in data1:
        sum_xsq = sum_xsq + x*x

    b = (sum_xy - N*x_mean*y_mean)/(sum_xsq - N*x_mean**2)
    a = y_mean - b*x_mean
    return (a, b)

# 95%
def get_slope_params_uncertainties(data1, data2):
    N = len(data1)
    a, b = get_slope_params(data1, data2)
    y_approx = a + b*data1

    s_eps = 0
    for (y, y_app) in zip(data2, y_approx):
        s_eps = s_eps + (y - y_app)**2
    s_eps = np.sqrt(s_eps/(N-2))

    s_x = np.sqrt(cov(data1, data1))

    delta_b = (1/np.sqrt(N-1))*(s_eps/s_x)*sp.stats.t.ppf(1-0.05/2, N-2)
    delta_a = mean(data1)*delta_b
    return delta_a, delta_b
What I'd like is a function already implemented entirely by scipy.
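For reference, this is the kind of thing I mean, as a minimal sketch built on curve_fit itself (assuming the parameter errors can be treated as t-distributed with N - p degrees of freedom; fit_with_confidence is a name I made up):
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def fit_with_confidence(func, xdata, ydata, alpha=0.05):
    popt, pcov = curve_fit(func, xdata, ydata)
    perr = np.sqrt(np.diag(pcov))            # one-standard-deviation errors, as in the docs
    dof = max(len(xdata) - len(popt), 1)     # degrees of freedom N - p
    tval = stats.t.ppf(1 - alpha/2, dof)     # critical value for the requested interval
    return popt, tval * perr                 # half-widths of the (1 - alpha) confidence intervals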
I want to implement the loss function defined here.
I use FCN-VGG16 to obtain a map x and add an activation layer (x is the output of the FCN-VGG16 net), and then apply some operations to get the extracted features.
co_map = Activation('sigmoid')(x)
#add mean values
img = Lambda(AddMean, name = 'addmean')(img_input)
#img map multiply
img_o = Lambda(HighLight, name='highlightlayer1')([img, co_map])
img_b = Lambda(HighLight, name='highlightlayer2')([img, 1-co_map])
extractor = ResNet50(weights = 'imagenet', include_top = False, pooling = 'avg')
extractor.trainable = False
extractor.summary()
o_feature = extractor(img_o)
b_feature = extractor(img_b)
loss = Lambda(co_attention_loss,name='name')([o_feature,b_feature])
model = Model(inputs=img_input, outputs= loss ,name='generator')
The error I get is at this line: model = Model(inputs=img_input, outputs=loss, name='generator')
I think it is because the way I calculate the loss makes it not an accepted output for Keras models.
def co_attention_loss(args):
    loss = []
    o_feature, b_feature = args
    c = 2048
    for i in range(5):
        for j in range(i, 5):
            if i != j:
                print("feature shape : " + str(o_feature.shape))
                d1 = K.sum(K.pow(o_feature[i] - o_feature[j], 2))/c
                d2 = K.sum(K.pow(o_feature[i] - b_feature[i], 2))
                d3 = K.sum(K.pow(o_feature[j] - b_feature[j], 2))
                d4 = d2 + d3/(2*c)
                p = K.exp(-d1)/K.sum([K.exp(-d1), K.exp(-d4)])
                loss.append(-K.log(p))
    return K.sum(loss)
How can I modify my loss function to make this work?
loss = Lambda(co_attention_loss,name='name')([o_feature,b_feature])
means that the args you pass in are a list, but you unpack args as a tuple:
o_feature,b_feature = args
You could change the loss code to:
def co_attention_loss(args):
    loss = []
    o_feature = args[0]
    b_feature = args[1]
    c = 2048
    for i in range(5):
        for j in range(i, 5):
            if i != j:
                print("feature shape : " + str(o_feature.shape))
                d1 = K.sum(K.pow(o_feature[i] - o_feature[j], 2))/c
                d2 = K.sum(K.pow(o_feature[i] - b_feature[i], 2))
                d3 = K.sum(K.pow(o_feature[j] - b_feature[j], 2))
                d4 = d2 + d3/(2*c)
                p = K.exp(-d1)/K.sum([K.exp(-d1), K.exp(-d4)])
                loss.append(-K.log(p))
    return K.sum(loss)
NOTE: not tested.
I am training multiclass logistic regression for handwriting recognition. For function minimization I am using fmin_tnc.
I have implemented the gradient function as follows:
def gradient(theta, *args):
    X, y, lamda = args
    m = np.size(X, 0)
    h = X.dot(theta)
    grad = (1/m) * X.T.dot(sigmoid(h) - y)
    grad[1:np.size(grad), ] = grad[1:np.size(grad), ] + (lamda/m)*theta[1:np.size(theta), ]
    # flattened because fmin_tnc accepts a list of gradients
    return grad.flatten()
This yields correct gradient values for the small example provided below:
theta_t = np.array([[-2],[-1],[1],[2]]);
X_t = np.array([[1,0.1,0.6,1.1],[1,0.2,0.7,1.2],[1,0.3,0.8,1.3],
[1,0.4,0.9,1.4],[1,0.5,1,1.5]])
y_t = np.array([[1],[0],[1],[0],[1]])
lamda_t = 3
But when using the check_grad function from scipy, it gives an error of 0.6222474393497573.
I am not able to trace why this is happening. Perhaps because of this, fmin_tnc is not performing any optimization and always returns optimized parameters equal to the initial parameters.
The fmin_tnc function call is as follows:
optimize.fmin_tnc(func=lrcostfunction, x0=initial_theta, fprime=gradient,
                  args=(X, tmp_y.flatten(), lamda))
Since y and theta are passed as 1-d arrays of shape (n,), they should be converted to 2-d arrays of shape (n, 1), because the 2-d form is what the gradient implementation uses.
The correct implementation is as follows:
def gradient(theta, *args):
    # again, y and theta are reshaped for the same reason
    X, y, lamda = args
    l = np.size(X, 1)
    theta = np.reshape(theta, (l, 1))
    m = np.size(X, 0)
    y = np.reshape(y, (m, 1))
    h = sigmoid(X.dot(theta))
    grad = (1/m) * X.T.dot(h - y)
    grad[1:np.size(grad), ] = grad[1:np.size(grad), ] + (lamda/m)*theta[1:np.size(theta), ]
    return grad.ravel()
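With the reshaping in place, a quick sanity check along these lines (a sketch only; it assumes lrcostfunction from the question takes the same (theta, X, y, lamda) arguments as gradient) should report an error close to zero instead of 0.62:
from scipy import optimize

# check_grad compares the analytic gradient with a finite-difference estimate;
# theta and y are flattened because check_grad and fmin_tnc work with 1-d arrays
err = optimize.check_grad(lrcostfunction, gradient, theta_t.flatten(),
                          X_t, y_t.flatten(), lamda_t)
print(err)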
trainingSet = imageSet(imgFolder1, 'recursive');

% Using HOG features
cellSize = [4 4];
hogFeatureSize = length(hog_4x4);   % hog_4x4 is assumed to be a HOG feature vector computed earlier on a sample image

%% Train a Digit Classifier
trainingFeatures = [];
trainingLabels = [];

for digit = 1:numel(trainingSet)
    numImages = trainingSet(digit).Count;
    features = zeros(numImages, hogFeatureSize, 'single');
    for i = 1:numImages
        img = read(trainingSet(digit), i);
        % Apply pre-processing steps
        lvl = graythresh(img);
        img = im2bw(img, lvl);
        features{i} = extractHOGFeatures(img, 'CellSize', cellSize);
    end
    labels = repmat(trainingSet(digit).Description, numImages, 1);
    trainingFeatures = [trainingFeatures; features];
    trainingLabels = [trainingLabels; labels];
end

Classifier = fitcecoc(trainingFeatures, trainingLabels);
I am trying to train an SVM classifier with the code above.
I get an error at the classifier line, saying:
Error in fitcecoc: obj=fit(temp, X, Y);
...X must be a numeric matrix
I don't know what to do. Please help.