How to get the equation of a decision boundary in a MATLAB SVM plot? - svm

My data:
y n Rh y2
1 1 1.166666667 1
-1 2 0.5 1
-1 3 0.333333333 1
-1 4 0.166666667 1
1 5 1.666666667 2
1 6 1.333333333 1
-1 7 0.333333333 1
-1 8 0.333333333 1
1 9 0.833333333 1
1 10 2.333333333 2
1 11 1 1
-1 12 0.166666667 1
1 13 0.666666667 1
1 14 0.833333333 1
1 15 0.833333333 1
-1 16 0.333333333 1
-1 17 0.166666667 1
1 18 2 2
1 19 0.833333333 1
1 20 1.333333333 1
1 21 1.333333333 1
-1 22 0.166666667 1
-1 23 0.166666667 1
-1 24 0.333333333 1
-1 25 0.166666667 1
-1 26 0.166666667 1
-1 27 0.333333333 1
-1 28 0.166666667 1
-1 29 0.166666667 1
-1 30 0.5 1
1 31 0.833333333 1
-1 32 0.166666667 1
-1 33 0.333333333 1
-1 34 0.166666667 1
-1 35 0.166666667 1
My code:
data=xlsread('btpdata.xlsx',1)
A = data(1:end,2:3)
B = data(1:end,1)
svmStruct = svmtrain(A,B,'showplot',true)
hold on
C = data(1:end,2:3)
D = data(1:end,4)
svmStruct = svmtrain(C,D,'showplot',true)
hold off
How can I get the approximate equations of the black lines in the given MATLAB plot?

It depends on which package you used, but since this is a linear Support Vector Machine there are more or less two options:
Your trained SVM contains the equation of the line in a property coefs (sometimes called w or weights) and b (or intercept), so your line is <coefs, X> + b = 0.
Your SVM contains alphas (dual coefficients, Lagrange multipliers), and then coefs = SUM_i alphas_i * y_i * SV_i, where SV_i is the i-th support vector (the ones in circles on your plot) and y_i is its label (-1 or +1). Sometimes the alphas are already multiplied by y_i, in which case coefs = SUM_i alphas_i * SV_i.
If you are trying to get the equation from the actual plot (image), then you can only read it off; it is more or less y = 0.6, meaning that coefs = [0 1] and b = -0.6. An image-analysis-based approach (for an arbitrary plot of this kind) would require:
detecting the image part (object detection)
reading the ticks/scale (OCR + object detection) <- this would actually be the hardest part
filtering out everything non-black and performing linear regression on the points left, then transforming through the scale detected earlier.

I have had the same problem. To build the linear equation (y = mx + c) of the decision boundary you need the gradient (m) and the y-intercept (c). The boundary is the set of points where betas(1)*x + betas(2)*y + Bias = 0, so SVMStruct.Bias supplies the offset term. The gradient is determined by the SVM beta weights, which SVMStruct does not contain, so you need to calculate them from the alphas (which are included in SVMStruct):
alphas = SVMStruct.Alpha;
SV = SVMStruct.SupportVectors;
betas = alphas' * SV;            % 1-by-2 weight vector, summed over the support vectors
m = -betas(1)/betas(2)           % gradient
c = -SVMStruct.Bias/betas(2)     % y-intercept
By the way, if your SVM has scaled the data, then I think you will need to unscale it.
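The weight formula quoted in the answers above (coefs = SUM_i alphas_i * SV_i when the alphas already carry the label sign) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name boundary_line and the numbers are made up, the alphas are assumed to be signed, and the data is assumed to be unscaled.

```python
def boundary_line(signed_alphas, support_vectors, bias):
    """Return (slope, intercept) of the line w1*x1 + w2*x2 + bias = 0.

    Assumes each alpha is already multiplied by its label (+1 or -1).
    """
    w1 = sum(a * sv[0] for a, sv in zip(signed_alphas, support_vectors))
    w2 = sum(a * sv[1] for a, sv in zip(signed_alphas, support_vectors))
    if w2 == 0:
        raise ValueError("vertical boundary: x2 is not a function of x1")
    # Solve w1*x1 + w2*x2 + bias = 0 for x2.
    return -w1 / w2, -bias / w2


# Made-up values for illustration only (not taken from the question's data).
signed_alphas = [0.5, -0.5]
support_vectors = [(1.0, 2.0), (1.0, 0.0)]
bias = -1.0

slope, intercept = boundary_line(signed_alphas, support_vectors, bias)
print(slope, intercept)
```

With these toy values the boundary is the horizontal line halfway between the two support vectors, which is what one would expect from a maximum-margin separator.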

Related

How to recognize a [1,X,X,X,1] repeating pattern in a pandas Series

I have a boolean column in a CSV file, for example:
1 1
2 0
3 0
4 0
5 1
6 1
7 1
8 0
9 0
10 1
11 0
12 0
13 1
14 0
15 1
You can see here that 1 is repeating every 5 lines.
I want to recognize this repeating pattern [1,0,0,0] as soon as the number of repetitions is above 10, in Python (I have ~20,000 rows per file).
The pattern can start at any position.
How could I manage this in Python, avoiding if ...?
import numpy as np
import pandas as pd

# Generate 20000 random 0s and 1s
data = pd.Series(np.random.randint(0, 2, 20000))
# Keep the indices of the 1s
idx = data[data > 0].index
# Check whether the distance from each 1 to the next 1 is exactly 4,
# e.g. if positions 2 and 6 both hold a 1, then 6 - 2 = 4
found = []
for i, v in enumerate(idx):
    if i == len(idx) - 1:
        break
    next_value = idx[i + 1]
    if (next_value - v) == 4:
        found.append(v)
print(found)
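The loop above records individual positions where the next 1 is four rows away, but the question also asks for the pattern to repeat a minimum number of times. A minimal sketch of that extra step follows. The function name find_periodic_runs and its parameters are hypothetical, and it assumes the pattern means a 1 followed by exactly three 0s, so that consecutive 1s are exactly gap rows apart.

```python
def find_periodic_runs(values, gap=4, min_repeats=10):
    """Return (start_position, repeat_count) for each run of 1s spaced `gap` apart.

    A run is reported only if it contains at least `min_repeats` consecutive
    gaps of the requested size.
    """
    ones = [i for i, v in enumerate(values) if v]
    runs = []
    start, count = None, 0
    for current, following in zip(ones, ones[1:]):
        if following - current == gap:
            if count == 0:
                start = current  # first 1 of a new candidate run
            count += 1
        else:
            if count >= min_repeats:
                runs.append((start, count))
            count = 0
    if count >= min_repeats:  # a run may extend to the end of the data
        runs.append((start, count))
    return runs


# Twelve repetitions of the pattern, preceded by a few unrelated values.
sample = [0, 1, 1] + [1, 0, 0, 0] * 12 + [1]
print(find_periodic_runs(sample))
```

Passing min_repeats=11 would express "strictly above 10". The function accepts any iterable of 0/1 values, so a pandas Series can be passed directly.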

Use # (Copy) as selection or filter on 2-d array

The J primitive Copy (#) can be used as a filter function, such as
k =: i.8
(k>3) # k
4 5 6 7
That's essentially
0 0 0 0 1 1 1 1 # i.8
The question is: if the right-hand side of # is a 2-d or higher-rank array, how can a selection be made using #, if that is possible at all? For example:
k =: 2 4 $ i.8
(k > 3) # k
I got a length error.
What is the right way to make such a selection?
You can use the appropriate verb rank to get something like a 2d-selection:
(2 | k) #"1 1 k
1 3
5 7
but the requested axes have to be filled with 0s (or !.) to keep the correct shape:
(k > 3) #("1 1) k
0 0 0 0
4 5 6 7
(k > 2) #("1 1) k
3 0 0 0
4 5 6 7
You have to define "select" more precisely for dimensions > 1, because now you have a structure. How do you discard values? Do you keep empty "cells"? Do you replace them with 0s? Is the structure important for the result?
If, for example, you only need the "values where", then just ravel (,) the array:
(,k > 2) # ,k
3 4 5 6 7
If you need to "replace where", then you can use amend (}):
u =: 4 :'I. , 5 > y' NB. indices where 5 > y
0 u } k
0 0 0 0
0 5 6 7
z =: 3 2 4 $ i.25
u =: 4 :'I. , (5 > y) +. (0 = 3|y)' NB. indices where 5>y or 3 divides y
_999 u } z
_999 _999 _999 _999
_999    5 _999    7

   8 _999   10   11
_999   13   14 _999

  16   17 _999   19
  20 _999   22   23
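For readers who think in NumPy rather than J, the two interpretations discussed above ("values where" versus "replace where") have rough analogues. This is only an analogy under the assumption that NumPy is available. It is not J code, and the variable names are made up for illustration.

```python
import numpy as np

# Same 2-by-4 table as k =: 2 4 $ i.8
k = np.arange(8).reshape(2, 4)

# "Values where": the 2-d structure is discarded, similar to (,k > 2) # ,k
values_where = k[k > 2]

# "Replace where": the shape is kept and matching cells are overwritten,
# similar to amending with the indices where 5 > y
replaced = np.where(k < 5, 0, k)

print(values_where)
print(replaced)
```

The same design question applies in both languages: a boolean mask over a table either flattens the result or needs a fill value to preserve the shape.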

Trying to learn Python; code leads to infinite loop and I can't figure out why?

I am trying to learn Python and I am trying to run a random walk that plots the points. I have tried debugging this myself but I cannot figure out where it is going wrong. I apologise since this seems like a really simple problem, but I am getting frustrated.
One file rw_visual.py sets things up and then calls the other file random_walk.py to generate the points in the walk.
rw_visual.py: [code was posted as an image; not available]
random_walk.py: [code was posted as an image; not available]
In debugging, rw_visual.py seems to run until it tries to run the command "rw.fill_walk()" and then it hangs. This tells me that there is something wrong in the while loop in random_walk.py causing this. As hard as I try, I cannot figure it out, though.
Sorry for the very basic question.
Python indentation implies scope. With the indentation of your while loop (and everything it should contain) corrected, I think the code below produces the results you're looking for. I left out the "graphical" part and just printed the x and y coordinates produced by the random walk; you can take over the graphical part from here.
from random import choice

class RandomWalk():
    def __init__(self, num_points=50):
        self.num_points = num_points
        self.x_values = [0]
        self.y_values = [0]

    def fill_walk(self):
        while len(self.x_values) < self.num_points:
            x_direction = choice([1, -1])
            x_distance = choice([0,1,2,3,4])
            x_step = x_direction * x_distance
            y_direction = choice([1, -1])
            y_distance = choice([0,1,2,3,4])
            y_step = y_direction * y_distance
            if x_step == 0 and y_step == 0:
                continue
            next_x = self.x_values[-1] + x_step
            next_y = self.y_values[-1] + y_step
            print (str(next_x) + " " + str(next_y))
            self.x_values.append(next_x)
            self.y_values.append(next_y)

rw = RandomWalk()
rw.fill_walk()
RESULTS
-2 -3
1 0
-2 0
-1 1
-1 -3
1 -1
4 0
0 0
0 4
0 5
3 5
1 3
1 4
1 3
-2 4
-3 7
0 7
1 7
-2 5
-2 1
-3 1
-1 0
-4 3
-3 5
0 9
3 7
3 4
-1 5
1 8
4 10
6 11
6 7
9 9
13 10
12 10
12 11
9 9
12 10
16 11
15 7
14 6
14 3
16 2
18 2
15 0
13 -2
12 -1
8 1
12 1
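Since the original code was posted as an image, the exact mistake is not visible here. The sketch below is a guess at the kind of indentation slip the answer describes, not a reconstruction of the asker's file. The function names are hypothetical, and the max_iterations guard exists only so the buggy version terminates in a demo.

```python
def buggy_fill(num_points, max_iterations=1000):
    """The append sits outside the loop, so the loop condition never changes."""
    values = [0]
    iterations = 0
    while len(values) < num_points and iterations < max_iterations:
        iterations += 1
    values.append(values[-1] + 1)  # dedented by mistake: runs once, after the loop
    return len(values), iterations


def fixed_fill(num_points, max_iterations=1000):
    """The append is indented into the loop body, so the list grows each pass."""
    values = [0]
    iterations = 0
    while len(values) < num_points and iterations < max_iterations:
        iterations += 1
        values.append(values[-1] + 1)
    return len(values), iterations


print(buggy_fill(5))  # only stops because of the guard
print(fixed_fill(5))  # stops once the list has 5 entries
```

Without the guard, the first version would spin forever, which matches the hang reported at rw.fill_walk().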

Does convolution in Theano rotate the filters?

I have a 3-channel 5-by-5 image like this:
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
And a 3-channel 3-by-3 filter like this:
10 20 30 0.1 0.2 0.3 1 2 3
40 50 60 0.4 0.5 0.6 4 5 6
70 80 90 0.7 0.8 0.9 7 8 9
When I convolve the image with the filter, I expect this output:
369.6 514.8 316.8
435.6 594. 356.4
211.2 277.2 158.4
However, Theano (using keras) gives me this output:
158.4 277.2 211.2
356.4 594. 435.6
316.8 514.8 369.6
It seems the output is rotated by 180 degrees. I wonder why this happens and how I can get the correct answer. Here is my test code:
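# Imports for the old Keras API used in this snippet
import numpy as np
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, ZeroPadding2D
from keras.optimizers import SGD

def SimpleNet(weight_array,biases_array):
    model = Sequential()
    model.add(ZeroPadding2D(padding=(1,1),input_shape=(3,5,5)))
    model.add(Convolution2D(1, 3, 3, weights=[weight_array,biases_array],border_mode='valid',subsample=(2,2)))
    return model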
im = np.asarray([
1,1,1,1,1,
1,1,1,1,1,
1,1,1,1,1,
1,1,1,1,1,
1,1,1,1,1,
2,2,2,2,2,
2,2,2,2,2,
2,2,2,2,2,
2,2,2,2,2,
2,2,2,2,2,
3,3,3,3,3,
3,3,3,3,3,
3,3,3,3,3,
3,3,3,3,3,
3,3,3,3,3])
weight_array = np.asarray([
10,20,30,
40,50,60,
70,80,90,
0.1,0.2,0.3,
0.4,0.5,0.6,
0.7,0.8,0.9,
1,2,3,
4,5,6,
7,8,9])
im = np.reshape(im,[1,3,5,5])
weight_array = np.reshape(weight_array,[1,3,3,3])
biases_array = np.zeros(1)
model = SimpleNet(weight_array,biases_array)
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy')
out = model.predict(im)
print(out.shape)
print(out)
This is the definition of convolution. It has the advantage that if you convolve an image that consists only of zeros except for a single 1 somewhere, the convolution places a copy of the filter at that position.
Theano does exactly these convolutions, as defined mathematically. This implies flipping the filters (the operation is filter[:, :, ::-1, ::-1]) before taking dot products with the image patches. For a 2-D filter, flipping both spatial axes coincides with a rotation by 180 degrees, but flipping and rotating are not the same operation in general.
It appears that what you are looking for is cross-correlation, which takes dot products with the non-flipped versions of the filters at each point of the image. To get that result here, you can flip the weights yourself (weight_array[:, :, ::-1, ::-1]) before passing them to the layer.
See also this answer, in which theano.tensor.nnet.conv2d is shown to do exactly the same thing as its scipy counterpart.
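The difference between the two operations can be checked with plain NumPy on the question's own numbers. The sketch below is not Theano or Keras code. The helper strided_correlate is a hypothetical name, and it hard-codes the same zero padding of 1 and stride of 2 used in the question's model.

```python
import numpy as np


def strided_correlate(image, kernel, stride=2, pad=1):
    """Cross-correlate a (C, H, W) image with a (C, kh, kw) kernel, summing over channels."""
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)), mode="constant")
    _, kh, kw = kernel.shape
    out_h = (padded.shape[1] - kh) // stride + 1
    out_w = (padded.shape[2] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            # Dot product of the (non-flipped) kernel with one image patch
            out[i, j] = np.sum(padded[:, r:r + kh, c:c + kw] * kernel)
    return out


# Image channels filled with 1, 2 and 3; filter channels as in the question.
image = np.stack([np.full((5, 5), v, dtype=float) for v in (1, 2, 3)])
base = np.arange(1, 10, dtype=float).reshape(3, 3)
kernel = np.stack([10 * base, 0.1 * base, base])

correlation = strided_correlate(image, kernel)
# True convolution is cross-correlation with the spatially flipped kernel.
convolution = strided_correlate(image, kernel[:, ::-1, ::-1])

print(correlation)
print(convolution)
```

The first result reproduces the output the asker expected and the second reproduces what Theano returned, which is consistent with the explanation that the library flips the filter.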

Empty square for legend for stackplot

I'm trying to generate a stack plot of version data using matplotlib. I have that portion working and displaying properly, but I'm unable to get the legend to display anything other than an empty square in the corner.
ra_ys = np.asarray(ra_ys)
# Going to generate a stack plot of the version stats
fig = plt.figure()
ra_plot = fig.add_subplot(111)
# Our x axis is going to be the dates, but we need them as numbers
x = [date2num(date) for date in dates]
# Plot the data
ra_plot.stackplot(x, ra_ys)
# Setup our legends
ra_plot.legend(ra_versions) #Also tried converting to a tuple
ra_plot.set_title("blah blah words")
print(ra_versions)
# Only want x ticks on the dates we supplied, and want them to display AS dates
ra_plot.set_xticks(x)
ra_plot.set_xticklabels([date.strftime("%m-%d") for date in dates])
plt.show()
ra_ys is a multidimensional array:
[[ 2 2 2 2 2 2 2 2 2 2 1]
[ 1 1 1 1 1 1 1 1 1 1 1]
[ 1 1 1 1 1 1 1 1 1 1 1]
[53 52 51 50 50 49 48 48 48 48 47]
[18 19 20 20 20 20 21 21 21 21 21]
[ 0 0 12 15 17 18 19 19 19 19 22]
[ 5 5 3 3 3 3 3 3 3 3 3]
[ 4 4 3 3 2 2 2 2 2 2 2]
[14 14 6 4 3 3 2 2 2 2 2]
[ 1 1 1 1 1 1 1 1 1 1 1]
[ 1 1 1 1 1 1 1 1 1 1 1]
[ 1 1 1 1 1 1 1 1 1 1 1]
[ 2 2 2 2 2 2 2 2 2 2 2]
[ 1 1 1 1 1 1 1 1 1 1 1]
[ 1 1 1 1 1 1 1 1 1 1 1]
[ 3 3 2 2 2 2 2 2 2 2 2]]
x is some dates: [734969.0, 734970.0, 734973.0, 734974.0, 734975.0, 734976.0, 734977.0, 734978.0, 734979.0, 734980.0, 734981.0]
ra_versions is a list: ['4.5.2', '4.5.7', '4.5.8', '5.0.0', '5.0.1', '5.0.10', '5.0.7', '5.0.8', '5.0.9', '5.9.105', '5.9.26', '5.9.27', '5.9.29', '5.9.31', '5.9.32', '5.9.34']
Am I doing something wrong? Can stack plots not have legends?
EDIT: I tried to print the handles and labels for the plot and got two empty lists ([] []):
handles, labels = theplot.get_legend_handles_labels()
print(handles,labels)
I then tested the same figure using the following code for a proxy handle, and it worked. So it looks like the lack of handles is the problem.
p = plt.Rectangle((0, 0), 1, 1, fc="r")
theplot.legend([p], ['test'])
So now the question is, how can I generate a variable number of proxy handles that match the colors of my stack plot?
This is the final (cleaner) approach to getting the legend. Since there are no handles, I generate proxy artists for each line. It's theoretically capable of handling cases where colors are reused, but it'll be confusing.
import os
import sys
import matplotlib.pyplot as plt
from matplotlib.dates import date2num

def plot_version_data(title, dates, versions, version_ys, savename=None):
    print("Prepping plot for \"{0}\"".format(title))
    fig = plt.figure()
    theplot = fig.add_subplot(111)
    # Our x axis is going to be the dates, but we need them as numbers
    x = [date2num(date) for date in dates]
    # Use these colors
    colormap = "bgrcmy"
    theplot.stackplot(x, version_ys, colors=colormap)
    # Make some proxy artists for the legend
    p = []
    i = 0
    for _ in versions:
        p.append(plt.Rectangle((0, 0), 1, 1, fc=colormap[i]))
        i = (i + 1) % len(colormap)
    theplot.legend(p, versions)
    theplot.set_ylabel(versions) # Cheating way to handle the legend
    theplot.set_title(title)
    # Setup the X axis - rotate to keep from overlapping, display like Oct-16,
    # make sure there's no random whitespace on either end
    plt.xticks(rotation=315)
    theplot.set_xticks(x)
    theplot.set_xticklabels([date.strftime("%b-%d") for date in dates])
    plt.xlim(x[0],x[-1])
    if savename:
        print("Saving output as \"{0}\"".format(savename))
        fig.savefig(os.path.join(sys.path[0], savename))
    else:
        plt.show()
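The manual index bookkeeping used above to reuse colors (i = (i + 1) % len(colormap)) can also be written with itertools. The sketch below is a small, hypothetical helper (assign_colors is not part of matplotlib) that only computes which color each label gets; it does no plotting itself.

```python
from itertools import cycle, islice


def assign_colors(labels, palette="bgrcmy"):
    """Pair each label with a color, wrapping around when the palette runs out."""
    return list(islice(cycle(palette), len(labels)))


# First eight version labels from the question, used purely as sample input.
versions = ['4.5.2', '4.5.7', '4.5.8', '5.0.0', '5.0.1', '5.0.10', '5.0.7', '5.0.8']
colors = assign_colors(versions)
print(list(zip(versions, colors)))
```

Each returned color could then be passed as fc to plt.Rectangle, as in the answer, to build one proxy artist per label. As the answer notes, labels beyond the palette length share colors, so the legend becomes ambiguous for long version lists.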
