Keras custom layer/constraint to implement equal weights - python-3.x

I would like to create a layer in Keras such that:
y = Wx + c
where W is a block matrix of the form
W = [[A, B], [B, A]]
A and B are square matrices, and c is a bias vector with repeated elements.
How can I implement these restrictions? I was thinking it could either be implemented in MyLayer.build() when initializing the weights, or as a constraint where I can specify certain indices to be equal, but I am unsure how to do either.

You can define such a W using the Concatenate layer.
import keras.backend as K
from keras.layers import Concatenate
A = K.placeholder(shape=(2, 2))
B = K.placeholder(shape=(2, 2))
row1 = Concatenate()([A, B])           # [A, B] side by side -> shape (2, 4)
row2 = Concatenate()([B, A])           # [B, A] side by side -> shape (2, 4)
W = Concatenate(axis=0)([row1, row2])  # stack the two rows  -> shape (4, 4)
Example evaluation:
import numpy as np
get_W = K.function(outputs=[W], inputs=[A, B])
get_W([np.eye(2), np.ones((2,2))])
Returns
[array([[1., 0., 1., 1.],
        [0., 1., 1., 1.],
        [1., 1., 1., 0.],
        [1., 1., 0., 1.]], dtype=float32)]
To adapt this to your exact problem, change the placeholders' shape argument to your real block size. The addition of the bias c and the multiplication Wx are then straightforward.
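If you would rather bake the constraint into a trainable layer, here is a minimal sketch of a custom layer along these lines. It assumes each block is filled with a single repeated scalar (one weight a for all of A, one b for all of B, and one repeated bias value c), which is only my reading of the question; if your blocks are tied differently, only build() and call() change.
import keras.backend as K
from keras.layers import Layer

class BlockEqualWeights(Layer):
    # computes y = x.W + c with W = [[A, B], [B, A]] built from three scalars
    def __init__(self, block_size=2, **kwargs):
        self.block_size = block_size
        super(BlockEqualWeights, self).__init__(**kwargs)

    def build(self, input_shape):
        # only three trainable parameters in total
        self.a = self.add_weight(name='a', shape=(1,), initializer='uniform', trainable=True)
        self.b = self.add_weight(name='b', shape=(1,), initializer='uniform', trainable=True)
        self.c = self.add_weight(name='c', shape=(1,), initializer='zeros', trainable=True)
        super(BlockEqualWeights, self).build(input_shape)

    def call(self, x):
        n = self.block_size
        # tile the scalars into the tied blocks
        A = K.reshape(K.tile(self.a, [n * n]), (n, n))
        B = K.reshape(K.tile(self.b, [n * n]), (n, n))
        row1 = K.concatenate([A, B], axis=1)
        row2 = K.concatenate([B, A], axis=1)
        W = K.concatenate([row1, row2], axis=0)  # (2n, 2n); symmetric here, so x.W equals W.x per row
        bias = K.tile(self.c, [2 * n])
        return K.dot(x, W) + bias

    def compute_output_shape(self, input_shape):
        return (input_shape[0], 2 * self.block_size)
All entries of W and of the bias then share just three trainable parameters, which is the "equal weights" constraint from the title.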

Related

Scikit-learn preprocessing: cannot understand the output using the min_frequency argument in the OneHotEncoder class

Consider the below array t. When using min_frequency kwarg in the OneHotEncoder class, I cannot understand why the category snake is still present when transforming a new array. There are 2/40 events of this label. Should the shape of e be (4,3) instead?
sklearn.__version__ == '1.1.1'
t = np.array([['dog'] * 8 + ['cat'] * 20 + ['rabbit'] * 10 +
              ['snake'] * 2], dtype=object).T
enc = OneHotEncoder(min_frequency=4/40,
                    sparse=False).fit(t)
print(enc.infrequent_categories_)
# [array(['snake'], dtype=object)]
e = enc.transform(np.array([['dog'], ['cat'], ['dog'], ['snake']]))
array([[0., 1., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 0., 1.]])  # snake is present?
Check out enc.get_feature_names_out():
array(['x0_cat', 'x0_dog', 'x0_rabbit', 'x0_infrequent_sklearn'],
      dtype=object)
"snake" isn't considered its own category anymore, but is lumped into the infrequent category. If you added other rare categories, they'd be assigned to that same column, and if you additionally set handle_unknown="infrequent_if_exist", unseen categories would also be encoded into it.

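For instance, a small sketch that refits the encoder from the question with handle_unknown="infrequent_if_exist"; the "lizard" label is made up for illustration:
import numpy as np
from sklearn.preprocessing import OneHotEncoder

t = np.array([['dog'] * 8 + ['cat'] * 20 + ['rabbit'] * 10 +
              ['snake'] * 2], dtype=object).T

# unseen labels are routed into the same infrequent column instead of raising an error
enc = OneHotEncoder(min_frequency=4/40, sparse=False,
                    handle_unknown="infrequent_if_exist").fit(t)

print(enc.get_feature_names_out())
# ['x0_cat' 'x0_dog' 'x0_rabbit' 'x0_infrequent_sklearn']

print(enc.transform(np.array([['snake'], ['lizard']], dtype=object)))
# [[0. 0. 0. 1.]
#  [0. 0. 0. 1.]]
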
How do I mask a feed forward layer based on tensor in pytorch?

I have a really simple network with 2 inputs (x and m).
x is size 100
m is size 3
My network is simply...
f_1 = linear_layer(x)
f_2 = linear_layer(f_1)
f_3 = linear_layer(f_1)
f_4 = linear_layer(f_1)
f_5 = softmax(linear_layer(sum(f_2, f_3, f_4)))
Based on the vector m, I want to zero out and ignore f_2, f_3, f_4 in the final sum and in the resulting gradient calculation. Is there a way to create a mask based on the vector m to achieve this?
Ok, here is how you do it; a more generic version using a comprehension over the layers is sketched after the gradient printout below:
# example input and output
x = torch.ones(5)
y = torch.zeros(3)
# mask tensor
mask = torch.tensor([0, 1, 0])
# initial layer
z0 = torch.nn.Linear(5, 5)
# layers to potentially mask
z1 = torch.nn.Linear(5, 3)
z2 = torch.nn.Linear(5, 3)
z3 = torch.nn.Linear(5, 3)
# defines how the data passes through the layers, specific mask element is applied to each of the maskable layers
layer1_output = z0(x)
layer2_output = mask[0]*z1(layer1_output) + mask[1]*z2(layer1_output) + mask[2]*z3(layer1_output)
# loss function
loss = torch.nn.functional.binary_cross_entropy_with_logits(layer2_output, y)
# run it and see
loss.backward()
print(z0.weight.grad)
print(z1.weight.grad)
print(z2.weight.grad)
print(z3.weight.grad)
As shown below, the masking tensor is effective in selecting which sub-networks take part in the computation and the gradient, based on their mask element:
tensor([[ 0.0354,  0.0354,  0.0354,  0.0354,  0.0354],
        [-0.0986, -0.0986, -0.0986, -0.0986, -0.0986],
        [-0.0372, -0.0372, -0.0372, -0.0372, -0.0372],
        [-0.0168, -0.0168, -0.0168, -0.0168, -0.0168],
        [-0.0133, -0.0133, -0.0133, -0.0133, -0.0133]])
tensor([[-0., 0., 0., -0., 0.],
        [-0., 0., 0., -0., 0.],
        [-0., 0., 0., -0., 0.]])
tensor([[-0.0422,  0.1314,  0.1108, -0.1644,  0.0906],
        [-0.0240,  0.0747,  0.0630, -0.0934,  0.0515],
        [-0.0251,  0.0781,  0.0659, -0.0977,  0.0539]])
tensor([[-0., 0., 0., -0., 0.],
        [-0., 0., 0., -0., 0.],
        [-0., 0., 0., -0., 0.]])
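For more branches, the same idea can be written generically; this is only a sketch of the comprehension-based version mentioned at the top (the name maskable_layers is mine, not from the original code):
import torch

x = torch.ones(5)
y = torch.zeros(3)
mask = torch.tensor([0., 1., 0.])

z0 = torch.nn.Linear(5, 5)
# any number of parallel branches, collected in a list
maskable_layers = [torch.nn.Linear(5, 3) for _ in range(3)]

h = z0(x)
# weight each branch by its mask element and sum the results
out = sum(m * layer(h) for m, layer in zip(mask, maskable_layers))

loss = torch.nn.functional.binary_cross_entropy_with_logits(out, y)
loss.backward()
for i, layer in enumerate(maskable_layers):
    # branches whose mask element is 0 receive an all-zero gradient
    print(i, layer.weight.grad)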

How to map element in pytorch tensor to id?

Given a tensor:
A = torch.tensor([2., 3., 4., 5., 6., 7.])
Then, give each element in A an id:
id = torch.arange(A.shape[0], dtype = torch.int) # tensor([0,1,2,3,4,5])
In other words, id of 2. in A is 0 and id of 3. in A is 1:
2. -> 0
3. -> 1
4. -> 2
5. -> 3
6. -> 4
7. -> 5
Then, I have a new tensor:
B = torch.tensor([3., 6., 6., 5., 4., 4., 4.])
Is there any way in PyTorch to map each element in B to its id?
In other words, I want to obtain tensor([1, 4, 4, 3, 2, 2, 2]), in which each element is the id of the corresponding element in B.
What you ask can be done by slowly iterating over the whole B tensor, checking each of its elements against all elements of A, and then retrieving the index of each match:
In [*]: for x in B:
   ...:     print(torch.where(x==A)[0][0])
   ...:
   ...:
tensor(1)
tensor(4)
tensor(4)
tensor(3)
tensor(2)
tensor(2)
tensor(2)
Here I used torch.where to find all the True elements in the matrix x==A, where x takes the value of each element of B in turn. This is really slow, but it allows you to add some functionality to deal with cases where some elements of B do not appear in A.
The fast and dirty method to get what you want with linear algebra operations is:
In [*]: (B.view(-1,1) == A).int().argmax(dim=1)
Out[*]: tensor([1, 4, 4, 3, 2, 2, 2])
This trick takes advantage of the fact that argmax returns the first 'max' index of each vector in dim=1.
Big warning here: if an element does not exist in A, no error will be raised and the result will silently be 0 for every such element.
In [*]: C = torch.tensor([100, 1000, 1, 3, 9999])
In [*]: (C.view(-1,1) == A).int().argmax(dim=1)
Out[*]: tensor([0, 0, 0, 1, 0])
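If you need to guard against that silent 0, one option (my own addition, not part of the original trick) is to derive a membership mask from the same comparison table, using the C and A from above:
matches = C.view(-1, 1) == A         # (len(C), len(A)) boolean comparison table
found = matches.any(dim=1)           # which elements of C actually occur in A
ids = matches.int().argmax(dim=1)    # same trick as above
ids[~found] = -1                     # flag missing elements instead of silently returning 0
# ids -> tensor([-1, -1, -1,  1, -1])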
I don't think there is such a function in PyTorch to map a tensor.
It seems quite unreasonable to solve this by comparing each value from B against the values of A.
Here are two possible solutions to solve this problem.
Using a dictionary as a map
You can use a dictionary. It's not much of a pure-PyTorch solution, but it will most probably be the fastest and safest way...
Just create a dict to map each element to an id, then use it to map B:
>>> map = {x.item(): i for i, x in enumerate(A)}
>>> torch.tensor([map[x.item()] for x in B])
tensor([1, 4, 4, 3, 2, 2, 2])
Change of basis approach
An alternative only using torch.Tensors. This will require the values you want to map - the content of A - to be integers because they will be used to index a tensor.
Encode the content of A into one-hot encodings:
>>> A_enc = torch.zeros((int(A.max())+1,)*2)
>>> A_enc[A, torch.arange(A.shape[0])] = 1
>>> A_enc
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0.]])
We'll use A_enc as our basis to map integers:
>>> v = torch.argmax(A_enc, dim=1)
tensor([0, 0, 0, 1, 2, 3, 4, 5])
Now, given an integer, for instance x=3, we can encode it as a one-hot vector: x_enc = [0, 0, 0, 1, 0, 0, 0, 0]. Then we use v to map it: a simple dot product <v, x_enc> gives 1, which is the desired result (the first element of the mapped B). Instead of mapping one x_enc at a time, we compute the matrix multiplication between v and the encoded B: first encode B, then compute v @ B_enc:
>>> B_enc = torch.zeros(A_enc.shape[0], B.shape[0])
>>> B_enc[B, torch.arange(B.shape[0])] = 1
>>> B_enc
tensor([[0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 1., 1.],
        [0., 0., 0., 1., 0., 0., 0.],
        [0., 1., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.]])
>>> v @ B_enc.long()
tensor([1, 4, 4, 3, 2, 2, 2])
Note - you will have to define your tensors with Long type.
There is a similar issue for numpy, so my answer is heavily inspired by their solution. I will compare some of the mentioned methods using perfplot, and I will also generalize the problem to applying an arbitrary mapping to a tensor (yours is just a specific case).
For the analysis, I will assume the mapping contains all the unique elements in the tensor, and that the number of elements to map is small and constant.
import torch
def apply(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    return b.clone().apply_(lambda x: mapping.__getitem__(x))

def bucketize(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    # From `https://stackoverflow.com/questions/13572448`.
    palette, key = zip(*mapping.items())
    key = torch.tensor(key)
    palette = torch.tensor(palette)
    index = torch.bucketize(b.ravel(), palette)
    remapped = key[index].reshape(b.shape)
    return remapped

def iterate(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    return torch.tensor([mapping[x.item()] for x in b])

def argmax(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    return (b.view(-1, 1) == a).int().argmax(dim=1)

if __name__ == "__main__":
    import perfplot

    a = torch.arange(2, 8)
    ids = torch.arange(0, 6)

    perfplot.show(
        setup=lambda n: torch.randint(2, 8, (n,)),
        kernels=[
            lambda x: apply(a, ids, x),
            lambda x: bucketize(a, ids, x),
            lambda x: iterate(a, ids, x),
            lambda x: argmax(a, ids, x),
        ],
        labels=["apply", "bucketize", "iterate", "argmax"],
        n_range=[2 ** k for k in range(25)],
        xlabel="len(a)",
    )
Running this yields the following plot:
Hence, depending on the number of elements in your tensor, you can pick either the argmax method (with the caveats mentioned and the restriction that you have to map the values from 0 to N), apply, or bucketize.
Now, if we increase the number of elements to be mapped to, let's say, tens of thousands, i.e. a = torch.arange(2, 10002) and ids = torch.arange(0, 10000), we get the following results:
This means the speed advantage of bucketize only becomes visible for larger arrays, but it still outperforms the other methods (the argmax run was killed, so I had to remove it).
Lastly, if the tensor you want to map contains values that are not keys of the mapping, we can simply seed the dictionary with every unique value mapped to itself and then overlay the real mapping:
mapping = {x.item(): x.item() for x in torch.unique(b)}
mapping.update({k.item(): v.item() for k, v in zip(a, ids)})
Now, if the number of unique elements you want to map is orders of magnitude larger than the array itself, computing this may shift the value of n at which bucketize becomes faster than apply (since for apply you can simply replace mapping.__getitem__(x) with mapping.get(x, x)).
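As a small illustration of that last point (a sketch of my own, not part of the benchmark above), the apply variant with an identity fallback would look like this:
def apply_with_fallback(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    # values of b that are not keys of the mapping are left unchanged
    # (apply_ works on CPU tensors only, like apply above)
    return b.clone().apply_(lambda x: mapping.get(x, x))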
I guess there is an easier way: create an array as a mapper, cast your tensors back to np.ndarray first, and then index into it.
import numpy as np
a_array = A.numpy().astype(int)
b_array = B.numpy().astype(int)
mapper = np.zeros(10)
for i, x in enumerate(a_array):
    mapper[x] = i
out = torch.Tensor(mapper[b_array])

Fitting Poisson distribution on a histogram

Despite the overwhelming number of posts on fitting a Poisson distribution to a histogram, none of them has worked for me, even after following all of them.
I'm looking to fit a Poisson distribution to this histogram, which I've plotted as follows:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.misc import factorial
def poisson(t, rate, scale):  # scale is added here so the y-axis
    # of the fit fits the height of the histogram
    return (scale*(rate**t/factorial(t))*np.exp(-rate))
lifetimes = 1/np.random.poisson((1/550e-6), size=100000)
hist, bins = np.histogram(lifetimes, bins=50)
width = 0.8*(bins[1]-bins[0])
center = (bins[:-1]+bins[1:])/2
plt.bar(center, hist, align='center', width=width, label = 'Normalised data')
popt, pcov = curve_fit(poisson, center, hist, bounds=(0.001, [2000, 7000]))
plt.plot(center, poisson(center, *popt), 'r--', label='Poisson fit')
# import pdb; pdb.set_trace()
plt.legend(loc = 'best')
plt.tight_layout()
The histogram I get looks like this:
I gave the guess for scale as 7000 to scale the distribution to the same height as the y-axis of the histogram I plotted, and a guess of 2000 for the rate parameter, since 2000 > 1/550e-6. As you can see, the fitted red dotted line is 0 at every point. Weirdly, pdb.set_trace() tells me that poisson(center, *popt) gives a list of 0 values.
126 plt.plot(center, poisson(center, *popt), 'r--', label='Poisson fit')
127 import pdb; pdb.set_trace()
--> 128 plt.legend(loc = 'best')
129 plt.tight_layout()
130
ipdb>
ipdb> poisson(center, *popt)
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Which doesn't make sense. What I want is to fit a Poisson distribution to the histogram so that it finds the best coefficients of the Poisson distribution equation. I suspected it might have to do with the fact that I am plotting a histogram of lifetimes, which is technically randomly sampled data from the inverse of the Poisson distribution. So I tried to compute the Jacobian of the distribution so I could make a change of variables, but it still won't work. I feel like I'm missing something here that is not coding related but rather mathematics related.
Your calculation is rounding to zero. With a rate of 2000 and a scale of 7000, your poisson formula reduces to:
7000 * 2000^t / (e^2000 * t!)
Using Stirling's approximation t! ~ (2*pi*t)^(1/2) * (t/e)^t you get:
poisson(t) ~ [7000 * 2000^t] / [sqrt(2*pi*t) * e^(2000-t) * t^t]
I used Python to get the first couple of values of poisson(t):
poisson(1) -> 0
poisson(2) -> 0
poisson(3) -> 0
For small t these values are genuinely of the order of e^(-2000), which is roughly 10^-869 and far below the smallest positive double (about 5e-324), so NumPy rounds them to exactly 0. In fact np.exp(-rate) already underflows to 0.0, so the whole product is 0 for every t you pass in. A Poisson pmf with rate 2000 only has appreciable mass near t ≈ 2000, while your bin centres are tiny lifetime values (around 5e-4), so over the plotted range the fitted curve is identically zero.
Sorry for the formatting. They won't let me post TeX yet.
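If you want to check this numerically without the hand-rolled formula, here is a quick sketch with scipy.stats (just an illustration of the underflow, not a fix for the fitting problem):
import numpy as np
from scipy.stats import poisson as poisson_dist

rate = 2000.0
print(np.exp(-rate))                 # 0.0 -- exp(-2000) underflows in double precision
print(poisson_dist.logpmf(3, rate))  # about -1979, perfectly representable in log space
print(poisson_dist.pmf(3, rate))     # 0.0 again once exponentiated
print(poisson_dist.pmf(2000, rate))  # ~0.0089 -- the pmf only has mass near t = rate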

Unable to transform string column to categorical matrix using Keras and Sklearn

I am trying to build a simple Keras model, with Python 3.6 on macOS, to predict house prices in a given range, but I fail to transform the output into a category matrix. I am using this dataset from Kaggle.
I've created a new column in the dataframe with different price ranges as strings to serve as the target output of my model, then used keras.utils and scikit-learn's LabelEncoder to try to create the output binary matrix, but I keep getting the error:
ValueError: invalid literal for int() with base 10: '0 - 50000'
Here is my code:
import pandas as pd
import numpy as np
from keras.layers import Dense
from keras.models import Sequential, load_model
from keras.callbacks import EarlyStopping
from keras.utils import to_categorical, np_utils
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
seed = 7
np.random.seed(seed)
data = pd.read_csv("Melbourne_housing_FULL.csv")
data.fillna(0, inplace=True)
price_range = 50000
bins = np.arange(0, 12000000, price_range)
labels = ['{} - {}'.format(i + 1, j) for i, j in zip(bins[:-1], bins[1:])]
#correct first value
labels[0] = '0 - 50000'
for item in labels:
    str(item)
print (labels[:10])
['0 - 50000', '50001 - 100000', '100001 - 150000', '150001 - 200000',
'200001 - 250000', '250001 - 300000', '300001 - 350000', '350001 - 400000',
'400001 - 450000', '450001 - 500000']
data['PriceRange'] = pd.cut(data.Price,
                            bins=bins,
                            labels=labels,
                            right=True,
                            include_lowest=True)
#print(data.PriceRange.value_counts())
output_len = len(labels)
print(output_len)
Everything is correct here until I run the next piece:
predictors = data.drop(['Suburb', 'Address', 'SellerG', 'CouncilArea',
'Propertycount', 'Date', 'Type', 'Price', 'PriceRange'], axis=1).as_matrix()
target = data['PriceRange']
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(target)
encoded_Y = encoder.transform(target)
target = np_utils.to_categorical(data.PriceRange)
n_cols = predictors.shape[1]
And I get the ValueError: invalid literal for int() with base 10: '0 - 50000'
Can someone help me here? I don't really understand what I am doing wrong.
Many thanks
It's because np_utils.to_categorical expects y to be of integer dtype, but you have strings. Either convert them into integers by giving each category a key, i.e.:
cats = data.PriceRange.values.categories
di = dict(zip(cats,np.arange(len(cats))))
#{'0 - 50000': 0,
# '10000001 - 10050000': 200,
# '1000001 - 1050000': 20,
# '100001 - 150000': 2,
# '10050001 - 10100000': 201,
# '10100001 - 10150000': 202,
target = np_utils.to_categorical(data.PriceRange.map(di))
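Alternatively (a small sketch reusing the LabelEncoder already fitted in the question), you can pass the integer-encoded labels instead of the raw strings:
encoded_Y = encoder.transform(target)         # integers 0 .. n_classes-1
target = np_utils.to_categorical(encoded_Y)   # binary class matrix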
Or, since you are using pandas, you can use pd.get_dummies to get the one-hot encoding.
onehot = pd.get_dummies(data.PriceRange)
target_labels = onehot.columns
target = onehot.as_matrix()
array([[ 1., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
...,
[ 0., 0., 0., ..., 0., 0., 0.],
[ 1., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.]])
With only one line of code:
tf.keras.utils.to_categorical(data.PriceRange.factorize()[0])
