How can I implement this locally-connected architecture in Keras?

My input data looks like this:
> x <- rnorm(10*9, sd = 10) %>% matrix(10) %>% round
> colnames(x) <- c(paste0(2014, c("a","b", "c")), paste0(2015, c("a","b", "c")), paste0(2016, c("a","b", "c")))
> x
2014a 2014b 2014c 2015a 2015b 2015c 2016a 2016b 2016c
[1,] 1 -11 3 3 6 5 17 5 15
[2,] 9 8 0 -1 10 8 -3 -11 6
[3,] -6 22 -3 1 -1 -4 -3 11 -9
[4,] 10 -15 0 -2 4 14 11 -11 3
[5,] 5 4 5 5 15 -9 2 5 1
[6,] -24 16 9 -7 2 -12 1 18 -2
[7,] 1 13 5 -14 1 -10 15 -1 14
[8,] -8 4 4 -15 -1 -20 -6 14 5
[9,] 10 19 -15 15 -4 3 -1 -11 8
[10,] 10 -11 -9 -1 16 3 24 -8 4
My outcome variable is continuous (i.e.: this is a regression problem).
I want to fit a model with an architecture that looks like this:
Basically, I've got granular data from separate years that aggregate to form a set of annual phenomena, which may themselves interact. If I had enough data, I could just fit a bunch of fully-connected layers. But those would be inefficient with my modest sample size.
This isn't exactly a conv net, because I don't want the "tiles" to overlap.
I also want to apply both dropout and a global L2 penalty.
I'm new to Keras, but not to neural nets. How can I implement this, and how is it referred to in Keras terminology?

You can use the functional API to have multiple inputs and create that computation graph. Something along the lines of:
from keras.layers import Input, Dense, concatenate
from keras.models import Model

# one 3-feature input per year
inputs = [Input(shape=(3,)) for _ in range(3)]
latents = []
for i in range(3):
    # per-year "tile": each year gets its own small stack of Dense layers
    latent = Dense(3, activation='relu')(inputs[i])
    latent = Dense(3, activation='relu')(latent)
    latents.append(latent)

merged = concatenate(latents)
out = Dense(4, activation='relu')(merged)
out = Dense(4, activation='relu')(out)
out = Dense(1)(out)  # linear output for regression

model = Model(inputs=inputs, outputs=out)
Your architecture diagram assumes a fixed number of year inputs, in this case 3 years. If you have a variable number of years, you have to use shared Dense layers, wrapping them in TimeDistributed so the same Dense layers are applied to every year before merging:
from keras.layers import Input, Dense, TimeDistributed, Flatten

inp = Input(shape=(3, 3))  # this time the input is a 2-D array: 3 years x 3 features
latent = TimeDistributed(Dense(3, activation='relu'))(inp)  # apply the same Dense to every year
latent = TimeDistributed(Dense(3, activation='relu'))(latent)
merged = Flatten()(latent)
out = ...  # same stack of Dense layers as above
This time the Dense layers are shared across years; they essentially use the same weights for every year.
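The question also asks about dropout and a global L2 penalty; neither is shown above, but here is a minimal sketch of how they could be added to the fixed-year version (the 0.01 L2 strength and 0.25 dropout rate are placeholder values, not from the question):

from keras.layers import Input, Dense, Dropout, concatenate
from keras.models import Model
from keras.regularizers import l2

reg = l2(0.01)  # applied per layer; together these act as an L2 penalty on all weights

inputs = [Input(shape=(3,)) for _ in range(3)]
latents = []
for inp in inputs:
    h = Dense(3, activation='relu', kernel_regularizer=reg)(inp)
    h = Dropout(0.25)(h)
    h = Dense(3, activation='relu', kernel_regularizer=reg)(h)
    latents.append(h)

merged = concatenate(latents)
out = Dense(4, activation='relu', kernel_regularizer=reg)(merged)
out = Dropout(0.25)(out)
out = Dense(1)(out)

model = Model(inputs=inputs, outputs=out)
model.compile(optimizer='adam', loss='mse')  # regression loss

(Keras also has a LocallyConnected1D layer, which applies unshared weights per patch and gives non-overlapping tiles when its stride equals its kernel size, but the functional-API approach above is the more explicit way to build the diagrammed architecture.)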


Random forest prediction values

Having a dataset like this:
y x size type total_neighbours res
113040 29 1204 15 3 2 0
66281 52 402 9 3 3 0
32296 21 1377 35 0 3 0
48367 3 379 139 0 4 0
33501 1 66 17 0 3 0
... ... ... ... ... ... ...
131230 39 1002 439 3 4 6
131237 40 1301 70 1 2 1
131673 26 1124 365 1 2 1
131678 27 1002 629 3 3 6
131684 28 1301 67 1 2 1
I would like to use a random forest to predict the value of the res column (res can only take integer values between 0 and 6).
I'm doing it like this:
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

labels = np.array(features['res'])
features = features.drop('res', axis=1)
features = np.array(features)
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.25, random_state=42)
rf = RandomForestRegressor(n_estimators=1000, random_state=42)
rf.fit(train_features, train_labels)
predictions = rf.predict(test_features)
The predictions I get are the following:
array([1.045e+00, 4.824e+00, 4.608e+00, 1.200e-01, 5.982e+00, 3.660e-01,
4.659e+00, 5.239e+00, 5.982e+00, 1.524e+00])
I have no experience in this field, so I don't quite understand the predictions.
How do I interpret them?
Is there any way to limit the predictions to the valid res values (integers between 0 and 6)?
Thanks
As @MaxNoe said, I had a misconception about the model: I was using a regression to predict a discrete variable.
RandomForestClassifier gives the expected output.
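For completeness, a minimal sketch of the classifier version, reusing the train/test split from the question's code (the hyper-parameters are the same ones the question used):

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=1000, random_state=42)
clf.fit(train_features, train_labels)
predictions = clf.predict(test_features)           # now integer class labels in 0..6
probabilities = clf.predict_proba(test_features)   # per-class probabilities, if scores are needed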

Python dataframe find difference between min and max in a row

I have a dataframe of many columns as given below
df =
index P1 Q1 W1 P2 Q2 W2 P3 Q3 W3
0 1 -1 2 3 0 -4 -4 4 0
1 2 -5 8 9 3 -7 -8 9 6
2 -4 -5 3 4 5 -6 -7 8 8
I want to compute the row-wise difference between the max and min of the P columns, i.e. something like:
df['P_dif'] = max (P1,P2,P3) - min (P1,P2,P3)
My expected output
df =
index P1 Q1 W1 P2 Q2 W2 P3 Q3 W3 P_dif
0 1 -1 2 3 0 -4 -4 4 0 7 # 3-(-4)
1 2 -5 8 9 3 -7 -8 9 6 17 # 9-(-8)
2 -4 -5 3 4 5 -6 -7 8 8 11 # 4-(-7)
My present code
df['P_dif'] = df[df.columns[::3]].apply(lambda g: g.max()-g.min())
My present output
print(df['P_dif'])
NaN
NaN
NaN
The NaN values appear because the lambda you're applying operates on columns rather than rows: the result is a Series indexed by the column names (P1, P2, P3), and when it is assigned to df['P_dif'] none of those labels match the row index, so pandas fills the whole column with NaN. The column-wise behaviour is visible in the following transcript:
>>> import pandas
>>> data = [[1,-1,2,3,0,-4,-4,4,0],[2,-5,8,9,3,-7,-8,9,6],[-4,-5,3,4,5,-6,-7,8,8]]
>>> df=pandas.DataFrame(data,columns=['P1','Q1','W1','P2','Q2','W2','P3','Q3','W3'])
>>> df
P1 Q1 W1 P2 Q2 W2 P3 Q3 W3
0 1 -1 2 3 0 -4 -4 4 0
1 2 -5 8 9 3 -7 -8 9 6
2 -4 -5 3 4 5 -6 -7 8 8
>>> df[df.columns[::3]].apply(lambda g: g.max()-g.min())
P1 6 # 2 - -4 -> 6
P2 6 # 9 - 3 -> 6
P3 4 # -4 - -8 -> 4
Note that the output is indexed by P1, P2 and P3 (the comments on the right are mine): it shows the max-min difference of each column rather than of each row.
You can get the information you need with the following:
>>> import numpy
>>> numpy.ptp(numpy.array(df[['P1', 'P2', 'P3']]), axis=1)
array([7, 17, 11], dtype=int64)
I don't doubt that someone more familiar with Pandas and NumPy could improve on that, so feel free to edit this answer if that's the case.
You can use DataFrame.max and DataFrame.min with axis=1 to calculate the max and min values across the selected columns:
computed_cols = df.loc[:, ['P1', 'P2', 'P3']]
df['P_dif'] = computed_cols.max(axis=1) - computed_cols.min(axis=1)
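If there are more P columns than the three shown, a small variation of the same idea selects them by name pattern instead of listing them (this assumes all and only the wanted columns are named P followed by a number):

p_cols = df.filter(regex=r'^P\d+$')             # selects P1, P2, P3, ...
df['P_dif'] = p_cols.max(axis=1) - p_cols.min(axis=1)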

generate normalized discrete values for feature engineering

There is a dataframe with one column storing discrete values, shown as follows. I would like to create another column storing the normalized values. For instance, for 4050 the corresponding entry would be 4. Are there any efficient ways to do that instead of writing my own function? In sklearn, are there any functions for generating normalized values?
Based on your comment:
there are around 20 different values, and the range is from 1000 to 9999, so I would like to use every 1000 as a category
This isn't really normalization in the strict sense of the word. However, to do that, you can easily use floor division (//):
df['new_column'] = df['values']//1000
For example:
>>> df
values
0 2021
1 8093
2 9870
3 4508
4 2645
5 1441
6 8888
7 8921
8 7292
9 8571
>>> df['new_column'] = df['values']//1000
>>> df
values new_column
0 2021 2
1 8093 8
2 9870 9
3 4508 4
4 2645 2
5 1441 1
6 8888 8
7 8921 8
8 7292 7
9 8571 8
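As for the sklearn part of the question: scikit-learn's KBinsDiscretizer does equal-width binning, but its bin edges come from the observed min and max rather than from fixed multiples of 1000, so it only approximates the floor-division grouping above. A sketch, assuming the same df:

from sklearn.preprocessing import KBinsDiscretizer

# 9 equal-width bins; with data spanning roughly 1000-9999 this is close to the //1000 grouping
est = KBinsDiscretizer(n_bins=9, encode='ordinal', strategy='uniform')
df['bin'] = est.fit_transform(df[['values']]).ravel().astype(int)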

Does convolution in Theano rotate the filters?

I have a 3-channel 5-by-5 image like this (the three channels are shown side by side):
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
And a 3-channel 3-by-3 filter like this (again, channels side by side):
10 20 30 0.1 0.2 0.3 1 2 3
40 50 60 0.4 0.5 0.6 4 5 6
70 80 90 0.7 0.8 0.9 7 8 9
When I convolve the image with the filter, I expect this output:
369.6 514.8 316.8
435.6 594. 356.4
211.2 277.2 158.4
However, Theano (using keras) gives me this output:
158.4 277.2 211.2
356.4 594. 435.6
316.8 514.8 369.6
It seems the output is rotated 180 degrees. I wonder why this happens and how I can get the expected answer. Here is my test code:
import numpy as np
from keras.models import Sequential
from keras.layers import ZeroPadding2D, Convolution2D
from keras.optimizers import SGD

def SimpleNet(weight_array, biases_array):
    model = Sequential()
    model.add(ZeroPadding2D(padding=(1,1), input_shape=(3,5,5)))
    model.add(Convolution2D(1, 3, 3, weights=[weight_array, biases_array], border_mode='valid', subsample=(2,2)))
    return model
im = np.asarray([
1,1,1,1,1,
1,1,1,1,1,
1,1,1,1,1,
1,1,1,1,1,
1,1,1,1,1,
2,2,2,2,2,
2,2,2,2,2,
2,2,2,2,2,
2,2,2,2,2,
2,2,2,2,2,
3,3,3,3,3,
3,3,3,3,3,
3,3,3,3,3,
3,3,3,3,3,
3,3,3,3,3])
weight_array = np.asarray([
10,20,30,
40,50,60,
70,80,90,
0.1,0.2,0.3,
0.4,0.5,0.6,
0.7,0.8,0.9,
1,2,3,
4,5,6,
7,8,9])
im = np.reshape(im,[1,3,5,5])
weight_array = np.reshape(weight_array,[1,3,3,3])
biases_array = np.zeros(1)
model = SimpleNet(weight_array,biases_array)
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy')
out = model.predict(im)
print(out.shape)
print(out)
This is the definition of convolution. It has the advantage that if you convolve an image that consists of only zeros except for one single 1 somewhere, the convolution will place a copy of the filter at that position.
Theano does exactly these convolutions, as defined mathematically. This implies flipping the filters (the operation is filter[:, :, ::-1, ::-1]) before taking dot products with the image patches. For a 2-D kernel, flipping both spatial axes is the same as rotating it by 180 degrees, which is why your output looks rotated.
It appears that what you are looking for is cross-correlation, which is taking dot products with the non-flipped versions of the filters at each point of the image.
See also this answer in which theano.tensor.nnet.conv2d is shown to do exactly the same thing as the scipy counterpart.
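The flip relationship is easy to check outside Theano. Here is a small sketch using scipy (not the Keras/Theano code from the question) showing that convolution equals cross-correlation with a flipped kernel:

import numpy as np
from scipy.signal import convolve2d, correlate2d

kernel = np.array([[10., 20., 30.],
                   [40., 50., 60.],
                   [70., 80., 90.]])
image = np.arange(25, dtype=float).reshape(5, 5)

conv = convolve2d(image, kernel, mode='full')
corr = correlate2d(image, kernel[::-1, ::-1], mode='full')  # correlate with the flipped kernel
print(np.allclose(conv, corr))  # True: convolution == cross-correlation with a flipped kernel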

How to get Equation of a decision boundary in matlab svm plot?

my data
y n Rh y2
1 1 1.166666667 1
-1 2 0.5 1
-1 3 0.333333333 1
-1 4 0.166666667 1
1 5 1.666666667 2
1 6 1.333333333 1
-1 7 0.333333333 1
-1 8 0.333333333 1
1 9 0.833333333 1
1 10 2.333333333 2
1 11 1 1
-1 12 0.166666667 1
1 13 0.666666667 1
1 14 0.833333333 1
1 15 0.833333333 1
-1 16 0.333333333 1
-1 17 0.166666667 1
1 18 2 2
1 19 0.833333333 1
1 20 1.333333333 1
1 21 1.333333333 1
-1 22 0.166666667 1
-1 23 0.166666667 1
-1 24 0.333333333 1
-1 25 0.166666667 1
-1 26 0.166666667 1
-1 27 0.333333333 1
-1 28 0.166666667 1
-1 29 0.166666667 1
-1 30 0.5 1
1 31 0.833333333 1
-1 32 0.166666667 1
-1 33 0.333333333 1
-1 34 0.166666667 1
-1 35 0.166666667 1
My code:
data=xlsread('btpdata.xlsx',1.)
A = data(1:end,2:3)
B = data(1:end,1)
svmStruct = svmtrain(A,B,'showplot',true)
hold on
C = data(1:end,2:3)
D = data(1:end,4)
svmStruct = svmtrain(C,D,'showplot',true)
hold off
How can I get the approximate equations of these black lines in the given MATLAB plot?
It depends on which package you used, but as it is a linear Support Vector Machine there are more or less two options:
Your trained svm contains the equation of the line in a property coefs (sometimes called w or weights) and b (or intercept), so your line is <coefs, X> + b = 0
Your svm contains alphas (dual coefficients, Lagrange multipliers), and then coefs = SUM_i alphas_i * y_i * SV_i, where SV_i is the i-th support vector (the ones in circles on your plot) and y_i is its label (-1 or +1). Sometimes the alphas are already multiplied by y_i, in which case coefs = SUM_i alphas_i * SV_i.
If you are trying to get the equation from the actual plot (image), then you can only read it off (and it is more or less y = 0.6, meaning that coefs = [0 1] and b = -0.6). An image-analysis-based approach (for an arbitrary plot of this kind) would require:
detecting image part (object detection)
reading the ticks/scale (OCR + object detection) <- this would be actually the hardest part
filtering out everything non-black and performing linear regression on the points left, then transforming through the scale detected earlier.
I have had the same problem. To build the linear equation (y = mx + b) of the decision boundary you need the gradient (m) and the y-intercept (b). SVMStruct.Bias is the b-term. The gradient is determined by the SVM beta weights, which SVMStruct does not contain, so you need to calculate them from the alphas (which are included in SVMStruct):
alphas = SVMStruct.Alpha;
SV = SVMStruct.SupportVectors;
betas = sum(alphas.*SV);
m = betas(1)/betas(2)
By the way, if your SVM has scaled the data, then I think you will need to unscale it.
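The same reconstruction can be sanity-checked in Python with scikit-learn (a toy sketch, not the MATLAB data from the question; all data and values below are made up). In scikit-learn, dual_coef_ already stores alpha_i * y_i, so summing over the support vectors reproduces coef_, and the boundary w·x + b = 0 rearranges into y = mx + c:

import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.randn(40, 2)
y = np.where(X[:, 1] > 0.6, 1, -1)         # toy labels: boundary roughly at y = 0.6

clf = SVC(kernel='linear').fit(X, y)
w = clf.dual_coef_ @ clf.support_vectors_   # = sum_i alpha_i * y_i * SV_i
print(np.allclose(w, clf.coef_))            # True
m = -w[0, 0] / w[0, 1]                      # slope of w1*x + w2*y + b = 0
c = -clf.intercept_[0] / w[0, 1]            # y-intercept
print("y = %.3f * x + %.3f" % (m, c))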
