Numbers of hidden layers and units in AutoKeras dense block - keras
I am training a model with AutoKeras. So far my best model is this:
structured_data_block_1/normalize:false
structured_data_block_1/dense_block_1/use_batchnorm:true
structured_data_block_1/dense_block_1/num_layers:2
structured_data_block_1/dense_block_1/units_0:32
structured_data_block_1/dense_block_1/dropout:0
structured_data_block_1/dense_block_1/units_1:32
dense_block_2/use_batchnorm:true
dense_block_2/num_layers:2
dense_block_2/units_0:128
dense_block_2/dropout:0
dense_block_2/units_1:16
dense_block_3/use_batchnorm:false
dense_block_3/num_layers:1
dense_block_3/units_0:32
dense_block_3/dropout:0
dense_block_3/units_1:32
regression_head_1/dropout:0
optimizer:"adam"
learning_rate:0.1
dense_block_2/units_2:32
structured_data_block_1/dense_block_1/units_2:256
dense_block_3/units_2:128
My first dense_block_1 has 2 layers (num_layers:2), so how can it have three units/neurons entries? It says units_0:32, units_1:32 and units_2:256, which implies to me that there are three layers, so why is num_layers:2?
If I wanted to recreate the above model with the code below, how would I do it properly?
import autokeras as ak

input_node = ak.StructuredDataInput()
output_node = ak.StructuredDataBlock(categorical_encoding=False, normalize=False)(input_node)
output_node = ak.DenseBlock()(output_node)
output_node = ak.DenseBlock()(output_node)
output_node = ak.RegressionHead()(output_node)
Thanks for any input.
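For reference, a minimal sketch of the same graph with the two explicit DenseBlocks pinned to the values the search found. This assumes an AutoKeras version whose DenseBlock and RegressionHead accept num_layers, num_units, use_batchnorm and dropout arguments (num_units is not exposed in some older releases, and it fixes a single width for every layer in a block, so dense_block_2's mixed 128/16 widths cannot be reproduced exactly this way):

import autokeras as ak

input_node = ak.StructuredDataInput()
output_node = ak.StructuredDataBlock(categorical_encoding=False, normalize=False)(input_node)
output_node = ak.DenseBlock(num_layers=2, use_batchnorm=True, dropout=0.0)(output_node)                 # dense_block_2
output_node = ak.DenseBlock(num_layers=1, num_units=32, use_batchnorm=False, dropout=0.0)(output_node)  # dense_block_3
output_node = ak.RegressionHead(dropout=0.0)(output_node)

reg = ak.AutoModel(inputs=input_node, outputs=output_node, max_trials=1, overwrite=True)
# reg.fit(x_train, y_train)  # x_train / y_train are placeholders for the actual data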
Related
Computing gradient twice for two different losses in Pytorch
I want to compute the gradients twice for two different losses in the same iteration.

Code:

batch_output0, batch_output1 = get_output_from_model(model=model, data=batch[0])
train_loss0 = loss_fun0(batch_output0, batch_labels0.float().view(-1, 1))
train_loss0.backward()
grad0_conv_w = model.conv1.conv1.weight.grad

batch_output0, batch_output1 = get_output_from_model(model=model, data=batch[0])
train_loss1 = loss_fun1(batch_output1, batch_labels1.float().view(-1, 1))
train_loss1.backward()
grad1_conv_w = model.conv1.conv1.weight.grad

Outputs:

train_loss0: tensor(0.6950, grad_fn=<BinaryCrossEntropyBackward>)
train_loss1: tensor(25.5431, grad_fn=<MseLossBackward>)
Grad0: tensor([-2.4883e-05, 3.7842e-05, 1.2635e-04, ..., -1.6413e-04, -1.8419e-04, -1.7884e-04])
Grad1: tensor([-2.4883e-05, 3.7842e-05, 1.2635e-04, ..., -1.6413e-04, -1.8419e-04, -1.7884e-04])

You may note that even though the two losses are quite different, the gradients for the corresponding losses are exactly the same. Please help me to diagnose the problem. Thank you.
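A minimal, self-contained sketch of what typically produces this symptom (the tiny two-output model below is a hypothetical stand-in, not the asker's code): param.grad is a buffer that backward() accumulates into, and both grad variables hold a reference to that same buffer, so the gradients need to be cloned and zeroed between the two backward calls.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                    # stand-in for the real two-output model
x = torch.randn(8, 4)
labels0 = torch.randint(0, 2, (8, 1)).float()
labels1 = torch.randn(8, 1)

out = model(x)
loss0 = nn.functional.binary_cross_entropy_with_logits(out[:, :1], labels0)
loss0.backward(retain_graph=True)          # keep the graph for the second backward
grad0 = model.weight.grad.clone()          # clone: .grad by itself is just a reference

model.zero_grad()                          # otherwise the second backward accumulates on top
loss1 = nn.functional.mse_loss(out[:, 1:], labels1)
loss1.backward()
grad1 = model.weight.grad.clone()

print(torch.allclose(grad0, grad1))        # False: the two gradients now genuinely differ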
Gradients vanishing despite using Kaiming initialization
I was implementing a conv block in PyTorch with an activation function (PReLU). I used Kaiming initialization to initialize all my weights and set all the biases to zero. However, as I tested these blocks (by stacking 100 such conv and activation blocks on top of each other), I noticed that the output values I am getting are of the order of 10^(-10). Is this normal, considering I am stacking up to 100 layers? Adding a small bias to each layer fixes the problem, but in Kaiming initialization the biases are supposed to be zero.

Here is the conv block code:

import numpy as np
import torch
import torch.nn as nn
from collections import Iterable


def convBlock(
    input_channels, output_channels, kernel_size=3, padding=None, activation="prelu"
):
    """Initializes a conv block using Kaiming Initialization"""
    padding_par = 0
    if padding == "same":
        padding_par = same_padding(kernel_size)  # same_padding() is defined elsewhere in my code
    conv = nn.Conv2d(input_channels, output_channels, kernel_size, padding=padding_par)
    relu_negative_slope = 0.25
    act = None
    if activation == "prelu" or activation == "leaky_relu":
        nn.init.kaiming_normal_(conv.weight, a=relu_negative_slope, mode="fan_in")
        if activation == "prelu":
            act = nn.PReLU(init=relu_negative_slope)
        else:
            act = nn.LeakyReLU(negative_slope=relu_negative_slope)
    if activation == "relu":
        nn.init.kaiming_normal_(conv.weight, nonlinearity="relu")
        act = nn.ReLU()
    nn.init.constant_(conv.bias.data, 0)
    block = nn.Sequential(conv, act)
    return block


def flatten(lis):
    for item in lis:
        if isinstance(item, Iterable) and not isinstance(item, str):
            for x in flatten(item):
                yield x
        else:
            yield item


def Sequential(args):
    flattened_args = list(flatten(args))
    return nn.Sequential(*flattened_args)

This is the test code:

ls = []
for i in range(100):
    ls.append(convBlock(3, 3, 3, "same"))
model = Sequential(ls)
test = np.ones((1, 3, 5, 5))
model(torch.Tensor(test))

And the output I am getting is:

tensor([[[[-1.7771e-10, -3.5088e-10,  5.9369e-09,  4.2668e-09,  9.8803e-10],
          [ 1.8657e-09, -4.0271e-10,  3.1189e-09,  1.5117e-09,  6.6546e-09],
          [ 2.4237e-09, -6.2249e-10, -5.7327e-10,  4.2867e-09,  6.0034e-09],
          [-1.8757e-10,  5.5446e-09,  1.7641e-09,  5.7018e-09,  6.4347e-09],
          [ 1.2352e-09, -3.4732e-10,  4.1553e-10, -1.2996e-09,  3.8971e-09]],

         [[ 2.6607e-09,  1.7756e-09, -1.0923e-09, -1.4272e-09, -1.1840e-09],
          [ 2.0668e-10, -1.8130e-09, -2.3864e-09, -1.7061e-09, -1.7147e-10],
          [-6.7161e-10, -1.3440e-09, -6.3196e-10, -8.7677e-10, -1.4851e-09],
          [ 3.1475e-09, -1.6574e-09, -3.4180e-09, -3.5224e-09, -2.6642e-09],
          [-1.9703e-09, -3.2277e-09, -2.4733e-09, -2.3707e-09, -8.7598e-10]],

         [[ 3.5573e-09,  7.8113e-09,  6.8232e-09,  1.2285e-09, -9.3973e-10],
          [ 6.6368e-09,  8.2877e-09,  9.2108e-10,  9.7531e-10,  7.0011e-10],
          [ 6.6954e-09,  9.1019e-09,  1.5128e-08,  3.3151e-09,  2.1899e-10],
          [ 1.2152e-08,  7.7002e-09,  1.6406e-08,  1.4948e-08, -6.0882e-10],
          [ 6.9930e-09,  7.3222e-09, -7.4308e-10,  5.2505e-09,  3.4365e-09]]]],
       grad_fn=<PreluBackward>)
Amazing question (and welcome to StackOverflow)! Research paper for quick reference.

TLDR

- Try wider networks (64 channels)
- Add Batch Normalization after the activation (or even before, it shouldn't make much difference); see the sketch after this answer
- Add residual connections (shouldn't improve much over batch norm, last resort)

Please check these in this order and leave a comment on what (and if) any of that worked in your case (as I'm also curious).

Things you do differently

- Your neural network is very deep, yet very narrow (81 parameters per layer only!). Because of that, one cannot reliably create those weights from a normal distribution, as the sample is just too small. Try wider networks, 64 channels or more.
- You are trying a much deeper network than they did. From the section "Comparison Experiments": "We conducted comparisons on a deep but efficient model with 14 weight layers" (actually 22 was also tested in comparison with Xavier). That was due to the release date of this paper (2015) and the hardware limitations "back in the days" (let's say).

Is this normal?

The approach itself is quite strange with layers of this depth, at least currently:

- each conv block is usually followed by an activation like ReLU and Batch Normalization (which normalizes the signal and helps with exploding/vanishing signals)
- networks of this depth (even of half the depth of what you've got) usually also use residual connections (though this is not directly linked to vanishing/small signal, and is more connected to the degradation problem of even deeper networks, like 1000 layers)
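A minimal, self-contained sketch of the first two suggestions (wider blocks plus Batch Normalization after the activation); the block below is an illustrative stand-in, not the asker's exact convBlock:

import torch
import torch.nn as nn

def conv_bn_block(in_ch, out_ch, kernel_size=3, negative_slope=0.25):
    # Same Kaiming init as before, but the block ends with BatchNorm.
    conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
    nn.init.kaiming_normal_(conv.weight, a=negative_slope, mode="fan_in")
    nn.init.zeros_(conv.bias)
    return nn.Sequential(
        conv,
        nn.PReLU(init=negative_slope),
        nn.BatchNorm2d(out_ch),  # re-normalizes the signal so it cannot keep shrinking layer after layer
    )

# 100 blocks again, but 64 channels wide instead of 3.
blocks = [conv_bn_block(3, 64)] + [conv_bn_block(64, 64) for _ in range(99)]
model = nn.Sequential(*blocks)
out = model(torch.ones(1, 3, 5, 5))
print(out.abs().mean())  # stays roughly O(1) instead of ~1e-10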
BERT zero layer fixed word embeddings [duplicate]
I know that BERT has a total vocabulary size of 30522, which contains words and subwords. I want to get the initial input embeddings of BERT. So my requirement is to get the table of size [30522, 768], which I can index by token id to get its embedding. Where can I get this table?
The BertModels have get_input_embeddings(): import torch from transformers import BertModel, BertTokenizer tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') bert = BertModel.from_pretrained('bert-base-uncased') token_embedding = {token: bert.get_input_embeddings()(torch.tensor(id)) for token, id in tokenizer.get_vocab().items()} print(len(token_embedding)) print(token_embedding['[CLS]']) Output: 30522 tensor([ 1.3630e-02, -2.6490e-02, -2.3503e-02, -7.7876e-03, 8.5892e-03, -7.6645e-03, -9.8808e-03, 6.0184e-03, 4.6921e-03, -3.0984e-02, 1.8883e-02, -6.0093e-03, -1.6652e-02, 1.1684e-02, -3.6245e-02, 8.3482e-03, -1.2112e-03, 1.0322e-02, 1.6692e-02, -3.0354e-02, -1.2372e-02, -2.5173e-02, -8.9602e-03, 8.1994e-03, -2.0011e-02, -1.5901e-02, -3.8394e-03, 1.4241e-03, 7.0500e-03, 1.6092e-03, -2.7764e-03, 9.4931e-03, -2.2768e-02, 1.9317e-02, -1.3442e-02, -2.3763e-02, -1.4617e-02, 9.7735e-03, -2.2428e-03, 3.0642e-02, 6.7829e-03, -2.6471e-03, -1.8553e-02, -1.2363e-02, 7.6489e-03, -2.5461e-03, -3.1498e-01, 6.3761e-03, 4.8914e-02, -7.7636e-03, 6.0919e-02, 2.1346e-02, -3.9741e-02, 2.2853e-01, 2.6502e-02, -1.0144e-03, -7.8480e-03, -1.9995e-03, 1.7057e-02, -3.3270e-02, 4.5421e-03, 6.1751e-03, -1.0077e-01, -2.0973e-02, -1.4512e-04, -9.6657e-03, 1.0871e-02, -1.4786e-02, 2.6437e-04, 2.1166e-02, 1.6492e-02, -5.1928e-03, -1.1857e-02, -9.9159e-03, -1.4363e-02, -1.2405e-02, -1.2973e-02, 2.6778e-02, -1.0986e-02, 1.0572e-02, -2.5566e-02, 5.2494e-03, 1.5890e-02, -5.1504e-03, -7.5859e-03, 2.0259e-02, -7.0155e-03, 1.6359e-02, 1.7487e-02, 5.4297e-03, -8.6403e-03, 2.8821e-02, -7.8964e-03, 1.9259e-02, 2.3868e-02, -4.3472e-03, 5.5662e-02, -2.1940e-02, 4.1779e-03, -5.7216e-03, 2.6712e-02, -5.0371e-03, 2.4923e-02, -1.3429e-02, -8.4337e-03, 9.8188e-02, -1.2940e-03, 1.2865e-02, -1.5930e-03, 3.6437e-03, 1.5569e-02, 1.8620e-02, -9.0643e-03, -1.9740e-02, 1.0530e-02, -2.7359e-03, -7.5283e-03, 1.1492e-03, 2.6162e-03, -6.2757e-03, -8.6096e-03, 6.6221e-01, -3.2235e-03, -4.1309e-02, 3.3047e-03, -2.5040e-03, 1.2838e-04, -6.8073e-03, 6.0291e-03, -9.8468e-03, 8.0641e-03, -1.9815e-03, 2.5801e-02, 5.7429e-03, -1.0712e-02, 2.9176e-02, 5.9414e-03, 2.4795e-02, -1.7887e-02, 7.3183e-01, 1.0964e-02, 5.9942e-03, -4.6157e-02, 4.0131e-02, -9.7481e-03, -8.9496e-01, 1.6385e-02, -1.9816e-03, 1.4691e-02, -1.9837e-02, -1.7611e-02, -4.5263e-04, -1.8605e-02, -1.5660e-02, -1.0709e-02, 1.8016e-02, -3.4149e-03, -1.2632e-02, 4.2877e-03, -3.9169e-01, 1.0016e-02, -1.0955e-02, 4.5133e-03, -5.1150e-03, 4.9968e-03, 1.7852e-02, 1.1313e-02, 2.6519e-03, 3.3658e-01, -1.8168e-02, 1.3170e-02, 7.3927e-03, 5.2521e-03, -9.6230e-03, 1.2844e-02, 4.1554e-01, -9.7247e-03, -4.2439e-03, 5.5287e-04, 1.8271e-02, -1.3889e-03, -2.0502e-03, -8.1946e-03, -6.5979e-06, -7.2764e-04, -1.4625e-03, -6.9872e-03, -6.9633e-03, -8.0701e-03, 1.9936e-02, 4.8370e-03, 8.6883e-03, -4.9246e-02, -2.0028e-02, 1.4124e-03, 1.0444e-02, -1.1236e-02, -4.4654e-03, -2.0491e-02, -2.7654e-02, -3.7079e-02, 1.3215e-02, 6.9498e-02, -3.1109e-02, 7.0562e-03, 1.0887e-02, -7.8090e-03, -1.0501e-02, -4.8735e-03, -6.8399e-04, 1.4717e-02, 4.4342e-03, 1.6012e-02, -1.0427e-02, -2.5767e-02, -2.2699e-01, 8.6569e-02, 2.3453e-02, 4.6362e-02, 3.5609e-03, 2.1353e-02, 2.3703e-02, -2.0252e-02, 2.1580e-02, 7.2652e-03, 2.0933e-01, 1.2108e-02, 1.0869e-02, 7.0568e-03, -3.1132e-02, 2.0505e-02, 3.2248e-03, -2.2724e-03, 5.5342e-03, 3.0563e-03, 1.9542e-02, 1.2827e-03, 1.5952e-02, -1.5458e-02, -3.8455e-03, -4.9417e-03, -1.0446e-02, 7.0516e-03, 2.2467e-03, -9.3643e-03, 1.9163e-02, 1.4239e-02, -1.5816e-02, 8.7413e-03, 2.4737e-02, 
-7.3777e-03, -4.0975e-02, 9.4948e-03, 1.4700e-02, 2.6819e-02, 1.0706e-02, 1.0621e-02, -7.1816e-03, -8.5402e-03, 1.2261e-02, -4.8679e-03, -9.6136e-03, 7.8765e-04, 3.8504e-02, -7.7485e-03, -6.5018e-03, 3.4352e-03, 2.2931e-04, 5.7456e-03, -4.8441e-03, -9.0898e-03, 8.6298e-03, 5.4740e-03, 2.2274e-02, -2.1218e-02, -2.6795e-02, -3.5337e-03, 1.0785e-02, 1.2475e-02, -6.1160e-03, 1.0729e-02, -9.7955e-03, 1.8543e-02, -6.0488e-03, -4.5744e-03, 2.7089e-03, 1.5632e-02, -1.2928e-02, -3.0778e-03, -1.0325e-02, -7.9550e-03, -6.3065e-02, 2.1062e-02, -6.6717e-03, 8.4616e-03, 1.4475e-02, 1.1477e-01, -2.2838e-02, -3.7491e-02, -3.6218e-02, -3.1994e-02, -8.9252e-03, 3.1720e-02, -1.1260e-02, -1.2980e-01, -1.0315e-03, -4.7242e-03, -2.0092e-02, -9.4521e-01, -2.2178e-02, -4.4297e-04, 1.9711e-02, 3.3402e-02, -1.0513e-02, 1.4492e-02, -1.9697e-02, -9.8452e-03, -1.7347e-02, 2.3472e-02, 7.6570e-02, 1.9504e-02, 9.3617e-03, 8.2672e-03, -1.0471e-02, -1.9932e-03, 2.0000e-02, 2.0485e-02, 1.0977e-02, 1.7720e-02, 1.3532e-02, 7.3682e-03, 3.4906e-04, 1.8772e-03, 1.9976e-02, -3.2041e-02, -8.9169e-03, 1.2900e-02, -1.3331e-02, 6.6207e-03, -5.7063e-03, -1.1482e-02, 8.3907e-03, -6.4162e-03, 1.5816e-02, 7.8921e-03, 4.4177e-03, 2.2568e-02, 1.0239e-02, -3.0194e-04, 1.3294e-02, -2.1606e-02, 3.8832e-03, 2.4475e-02, 4.3808e-02, -2.1031e-03, -1.2163e-02, -4.0786e-02, 1.5565e-02, 1.4750e-02, 1.6645e-02, 2.8083e-02, 1.8920e-03, -1.4733e-04, -2.6208e-02, 2.3780e-02, 1.8657e-04, -2.2931e-03, 3.0334e-03, -1.7294e-02, -2.3001e-02, 8.6004e-03, -3.3497e-02, 2.5660e-02, -1.9225e-02, -2.7186e-02, -2.1020e-02, -3.5213e-02, -1.8228e-03, -8.2840e-03, 1.1212e-02, 1.0387e-02, -3.4194e-01, -1.9705e-03, 1.1558e-02, 5.1976e-03, 7.4498e-03, 5.7142e-03, 2.8401e-02, -7.7551e-03, 1.0682e-02, -1.2657e-02, -1.8065e-02, 2.6681e-03, 3.3947e-03, -4.5565e-02, -2.1170e-02, -1.7830e-02, 3.4679e-03, -2.2051e-02, -5.4176e-03, -1.1517e-02, -3.4155e-02, -3.0335e-03, -1.3915e-02, 6.2173e-03, -1.1101e-02, -1.5308e-02, 9.2188e-03, -7.5665e-03, 6.5685e-03, 8.0935e-03, 3.1139e-03, -5.5047e-03, -3.1347e-02, 2.2140e-02, 1.0865e-02, -2.7849e-02, -4.9580e-03, 1.8804e-03, 1.0007e-01, -1.8013e-03, -4.8792e-03, 1.5534e-02, -2.0179e-02, -1.2351e-02, -1.3871e-02, 1.1439e-02, -9.0208e-03, 1.2580e-02, -2.5973e-02, -2.0398e-02, -1.9464e-03, 4.3189e-03, 2.0707e-02, 5.0029e-03, -1.0679e-02, 1.2298e-02, 1.0269e-02, 2.2228e-02, 2.9754e-02, -2.6392e-03, 1.9286e-02, -1.5137e-02, 2.1914e-01, 1.3030e-02, -7.4460e-03, -9.6818e-04, 2.9736e-02, 9.8722e-03, -5.6688e-03, 4.2518e-03, 1.8941e-02, -6.3909e-03, 8.0590e-03, -6.7893e-03, 6.0878e-03, -5.3970e-03, 7.5776e-04, 1.1374e-03, -5.0035e-03, -1.6159e-03, 1.6764e-02, 9.1251e-03, 1.3020e-02, -1.0368e-02, 2.2141e-02, -2.5411e-03, -1.5227e-02, 2.3444e-02, 8.4076e-04, -1.1465e-01, 2.7017e-03, -4.4961e-03, 2.9762e-04, -3.9612e-03, 8.9038e-05, 2.8683e-02, 5.0068e-03, 1.6509e-02, 7.8983e-04, 5.7728e-03, 3.2685e-02, -1.0457e-01, 1.2989e-02, 1.1278e-02, 1.1943e-02, 1.5258e-02, -6.2411e-04, 1.0682e-04, 1.2087e-02, 7.2984e-03, 2.7758e-02, 1.7572e-02, -6.0345e-03, 1.7211e-02, 1.4121e-02, 6.4663e-02, 9.1813e-03, 3.2555e-03, -3.2667e-02, 2.9132e-02, -1.7770e-02, 1.5302e-03, -2.9944e-02, -2.0706e-02, -3.6528e-03, -1.5497e-02, 1.5223e-02, -1.4751e-02, -2.2381e-02, 6.9636e-03, -8.0838e-03, -2.4583e-03, -2.0677e-02, 8.8132e-03, -6.9554e-04, 1.6965e-02, 1.8535e-01, 3.5843e-04, 1.0812e-02, -4.2391e-03, 8.1779e-03, 3.4144e-02, -1.8996e-03, 2.9939e-03, 3.6898e-04, -1.0144e-02, -5.7416e-03, -5.7676e-03, 1.7565e-01, -1.5793e-03, -2.6617e-02, -1.2572e-02, 3.0421e-04, 
-1.2132e-02, -1.4168e-02, 1.2154e-02, 8.4700e-03, -1.6284e-02, 2.6983e-03, -6.8554e-03, 2.7829e-01, 2.4060e-02, 1.1130e-02, 7.6095e-04, 3.1341e-01, 2.1668e-02, 1.0277e-02, -3.0065e-02, -8.3565e-03, 5.2488e-03, -1.1287e-02, -1.8266e-02, 1.1814e-02, 1.2662e-02, 2.9036e-04, 7.0254e-04, -1.4084e-02, 1.2925e-02, 3.9504e-03, -7.9568e-03, 3.2794e-02, 7.3839e-03, 2.4609e-02, 9.6109e-03, -8.7206e-03, 9.2571e-03, -3.5850e-03, -8.9996e-03, 2.3120e-03, -1.8475e-02, -1.9610e-02, 1.1994e-02, 6.7156e-03, 1.9903e-02, 3.0703e-02, -4.9538e-03, -6.1673e-02, -6.4986e-03, -2.1317e-02, -3.3650e-03, 2.3200e-03, -6.2224e-03, 3.7458e-03, 1.1542e-02, -1.0181e-02, -8.4711e-03, 1.1603e-02, -5.6247e-03, -1.0220e-02, -8.6501e-04, -1.2285e-02, -8.7487e-03, -1.1265e-02, 1.6322e-02, 1.5160e-02, 1.8882e-02, 5.1557e-03, -8.8616e-03, 4.2153e-03, -1.9450e-02, -8.7365e-03, -9.7867e-03, 1.1667e-02, 5.0613e-03, 2.8221e-03, -7.1795e-03, 9.3306e-03, -4.9663e-02, 1.7708e-02, -2.0959e-02, -3.3989e-02, 2.2581e-03, 5.1748e-03, -1.0133e-01, 2.1052e-03, 5.5644e-03, 1.3607e-03, 8.8388e-03, 1.0244e-02, -3.8072e-03, 5.9209e-03, 6.7993e-03, 1.1594e-02, -1.1802e-02, -2.4233e-03, -5.1504e-03, -1.1903e-02, 1.4075e-02, -4.0701e-03, -2.9465e-02, -1.7579e-03, 4.3654e-03, 1.0429e-02, 3.7096e-02, 8.6493e-03, 1.5871e-02, 1.8034e-02, -3.2165e-03, -2.1941e-02, 2.6274e-02, -7.6941e-03, -5.9618e-03, -1.4179e-02, 8.0281e-03, 1.1293e-02, -6.6936e-05, 1.2899e-02, 1.0056e-02, -6.3919e-04, 2.0299e-02, 3.1528e-03, -4.8988e-03, 3.2754e-03, -1.1003e-01, 1.8414e-02, 2.2272e-03, -2.2185e-02, -4.8672e-03, 1.9643e-03, 3.0928e-02, -8.9599e-03, -1.1446e-02, -1.3794e-02, 7.1943e-03, -5.8965e-03, 2.2605e-03, -2.6114e-02, -5.6616e-03, 6.5073e-03, 9.2219e-02, -6.7243e-03, 4.4427e-04, 7.2846e-03, -1.1021e-02, 7.8802e-04, -3.8878e-03, 1.0489e-02, 9.2883e-03, 1.8895e-02, 2.1808e-02, 6.2590e-04, -2.6519e-02, 7.0343e-04, -2.9067e-02, -9.1515e-03, 1.0418e-03, 8.3222e-03, -8.7548e-03, -2.0637e-03, -1.1450e-02, -8.8985e-04, -4.4062e-03, 2.3629e-02, -2.7221e-02, 3.2008e-02, 6.6325e-03, -1.1302e-02, -1.0138e-03, -1.6902e-01, -8.4473e-03, 2.8536e-02, 1.4117e-03, -1.2136e-02, -1.4781e-02, 4.9960e-03, 3.3916e-02, 5.2710e-03, 1.7382e-02, -4.6315e-03, 1.1680e-02, -9.1395e-03, 1.8310e-02, 1.2321e-02, -2.4871e-02, 1.1535e-02, 5.0308e-03, 5.5028e-03, -7.2184e-03, -5.5210e-03, 1.7085e-02, 5.7236e-03, 1.7463e-03, 1.9969e-03, 6.1670e-03, 2.9347e-03, 1.3946e-02, -1.9984e-03, 1.0091e-02, 1.0388e-03, -6.1902e-03, 3.0905e-02, 6.6038e-03, -9.1223e-02, -1.8411e-02, 5.4185e-03, 2.4396e-02, 1.5696e-02, -1.2742e-02, 1.8126e-02, -2.6138e-02, 1.1170e-02, -1.3058e-02, -1.9386e-02, -5.9828e-03, 1.9176e-02, 1.9962e-03, -2.1538e-03, 3.3003e-02, 1.8407e-02, -5.9498e-03, -3.2533e-03, -1.8917e-02, -1.5897e-02, -4.7057e-03, 5.4162e-03, -3.0037e-02, 8.6773e-03, -1.7942e-03, 6.6826e-03, -1.1929e-02, -1.4076e-02, 1.6709e-02, 1.6860e-03, -3.3842e-03, 8.6805e-03, 7.1340e-03, 1.5147e-02], grad_fn=<EmbeddingBackward>)
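A shorter, hedged alternative to building the dictionary above: the whole [30522, 768] table is just the weight matrix of the model's input embedding layer, so it can be taken as one tensor and indexed by token id directly.

import torch
from transformers import BertModel, BertTokenizer

bert = BertModel.from_pretrained('bert-base-uncased')
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

embedding_table = bert.get_input_embeddings().weight  # shape: torch.Size([30522, 768])
print(embedding_table.shape)

cls_id = tokenizer.convert_tokens_to_ids('[CLS]')     # look the id up rather than hard-coding it
print(embedding_table[cls_id])                        # same vector as token_embedding['[CLS]'] above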
To get a context-sensitive word embedding for a given input sentence/text, here is the code:

import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel


def get_word_idx(sent: str, word: str):
    return sent.split(" ").index(word)


def get_hidden_states(encoded, token_ids_word, model, layers):
    """Push input IDs through model. Stack and sum `layers` (last four by default).
    Select only those subword token outputs that belong to our word of interest
    and average them."""
    with torch.no_grad():
        output = model(**encoded)
    # Get all hidden states
    states = output.hidden_states
    # Stack and sum all requested layers
    output = torch.stack([states[i] for i in layers]).sum(0).squeeze()
    # Only select the tokens that constitute the requested word
    word_tokens_output = output[token_ids_word]
    return word_tokens_output.mean(dim=0)


def get_word_vector(sent, idx, tokenizer, model, layers):
    """Get a word vector by first tokenizing the input sentence, getting all token idxs
    that make up the word of interest, and then `get_hidden_states`."""
    encoded = tokenizer.encode_plus(sent, return_tensors="pt")
    # get all token idxs that belong to the word of interest
    token_ids_word = np.where(np.array(encoded.word_ids()) == idx)
    return get_hidden_states(encoded, token_ids_word, model, layers)


def main(layers=None):
    # Use last four layers by default
    layers = [-4, -3, -2, -1] if layers is None else layers
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModel.from_pretrained("bert-base-cased", output_hidden_states=True)

    sent = "I like cookies ."
    idx = get_word_idx(sent, "cookies")
    word_embedding = get_word_vector(sent, idx, tokenizer, model, layers)

    return word_embedding


if __name__ == '__main__':
    main()

More details can be found here.
Top 4 Prediction Using Keras Model
I made my own Keras CNN and used the code below to predict. The prediction give all the 143 prediction while I only want the four major classes with the highest percentage. Code: preds = model.predict(imgs) for cls in train_generator.class_indices: x = preds[0][train_generator.class_indices[cls]] x_pred = "{:.1%}".format(x) value = (cls+":"+ x_pred) print (value) Prediction: Acacia_abyssinica:0.0% Acacia_kirkii:0.0% Acacia_mearnsii:0.0% Acacia_melanoxylon:0.0% Acacia_nilotica:0.0% Acacia_polyacantha:0.0% Acacia_senegal:0.0% Acacia_seyal:0.0% Acacia_xanthophloea:0.0% Afrocarpus_falcatus:0.0% Afzelia_quanzensis:0.0% Albizia_gummifera:0.0% Albizia_lebbeck:0.0% Allanblackia_floribunda:0.0% Artocarpus_heterophyllus:0.0% Azadirachta_indica:0.0% Balanites_aegyptiaca:0.0% Bersama_abyssinica:0.0% Bischofia_javanica:0.0% Brachylaena_huillensis:0.0% Bridelia_micrantha:0.0% Calodendron_capensis:0.0% Calodendrum_capense:0.0% Casimiroa_edulis:0.0% Cassipourea_malosana:0.0% Casuarina_cunninghamiana:0.0% Casuarina_equisetifolia:4.8% Catha_edulis:0.0% Cathium_Keniensis:0.0% Ceiba_pentandra:39.1% Celtis_africana:0.0% Chionanthus_battiscombei:0.0% Clausena_anisat:0.0% Clerodendrum_johnstonii:0.0% Combretum_molle:0.0% Cordia_africana:0.0% Cordia_africana_Cordia:0.0% Cotoneaster_Pannos:0.0% Croton_macrostachyus:0.0% Croton_megalocarpus:0.0% Cupressus_lusitanica:0.0% Cussonia_Spicata:0.2% Cussonia_holstii:0.0% Diospyros_abyssinica:0.0% Dodonaea_angustifolia:0.0% Dodonaea_viscosa:0.0% Dombeya_goetzenii:0.0% Dombeya_rotundifolia:0.0% Dombeya_torrida:0.0% Dovyalis_abyssinica:0.0% Dovyalis_macrocalyx:0.0% Drypetes_gerrardii:0.0% Ehretia_cymosa:0.0% Ekeber_Capensis:0.0% Erica_arborea:0.0% Eriobotrya_japonica:0.0% Erythrina_abyssinica:0.0% Eucalyptus_camaldulensis:0.0% Eucalyptus_globulus:55.9% Eucalyptus_grandis:0.0% Eucalyptus_grandis_saligna:0.0% Eucalyptus_hybrids:0.0% Eucalyptus_saligna:0.0% Euclea_divinorum:0.0% Ficus_indica:0.0% Ficus_natalensi:0.0% Ficus_sur:0.0% Ficus_sycomorus:0.0% Ficus_thonningii:0.0% Flacourtia_indica:0.0% Flacourtiaceae:0.0% Fraxinus_pennsylvanica:0.0% Grevillea_robusta:0.0% Hagenia_abyssinica:0.0% Jacaranda_mimosifolia:0.0% Juniperus_procera:0.0% Kigelia_africana:0.0% Macaranga_capensis:0.0% Mangifera_indica:0.0% Manilkara_Discolor:0.0% Markhamia_lutea:0.0% Maytenus_senegalensis:0.0% Melia_volkensii:0.0% Meyna_tetraphylla:0.0% Milicia_excelsa:0.0% Moringa_Oleifera:0.0% Murukku_Trichilia_emetica:0.0% Myrianthus_holstii:0.0% Newtonia_buchananii:0.0% Nuxia_congesta:0.0% Ochna_holstii:0.0% Ochna_ovata:0.0% Ocotea_usambarensis:0.0% Olea_Europaea:0.0% Olea_africana:0.0% Olea_capensis:0.0% Olea_hochstetteri:0.0% Olea_welwitschii:0.0% Osyris_lanceolata:0.0% Persea_americana:0.0% Pinus_radiata:0.0% Podocarpus _falcatus:0.0% Podocarpus_latifolius:0.0% Polyscias_fulva:0.0% Polyscias_kikuyuensis:0.0% Pouteria_adolfi_friedericii:0.0% Prunus_africana:0.0% Psidium_guajava:0.0% Rauvolfia_Vomitoria:0.0% Rhus_natalensis:0.0% Rhus_vulgaris:0.0% Schinus_molle:0.0% Schrebera_alata:0.0% Sclerocarya_birrea:0.0% Scolopia_zeyheri:0.0% Senna_siamea:0.0% Sinarundinaria_alpina:0.0% Solanum_mauritianum:0.0% Spathodea_campanulata:0.0% Strychnos_usambare:0.0% Syzygium_afromontana:0.0% Syzygium_cordatum:0.0% Syzygium_cuminii:0.0% Syzygium_guineense:0.0% Tamarindus_indica:0.0% Tarchonanthus_camphoratus:0.0% Teclea_Nobilis:0.0% Teclea_simplicifolia:0.0% Terminalia_brownii:0.0% Terminalia_mantaly:0.0% Toddalia_asiatica:0.0% Trema_Orientalis:0.0% Trichilia_emetica:0.0% Trichocladus_ellipticus:0.0% 
Trimeria_grandifolia:0.0% Vangueria_madagascariensis:0.0% Vepris_nobilis:0.0% Vepris_simplicifolia:0.0% Vernonia_auriculifera:0.0% Vitex_keniensis:0.0% Warburgia_ugandensis:0.0% Zanthoxylum_gilletii:0.0% Mahogany_tree:0.0%
You can just get all your predictions, sort them and take the top four:

preds = model.predict(imgs)

sorted_preds = []
for cls in train_generator.class_indices:
    x = preds[0][train_generator.class_indices[cls]]
    x_pred = "{:.1%}".format(x)
    sorted_preds.append([x, x_pred, cls])

top_4 = sorted(sorted_preds, reverse=True)[:4]
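An equivalent, self-contained sketch of the same "top 4" selection with NumPy's argsort; the random scores and placeholder class names below are dummies standing in for model.predict(imgs) and train_generator.class_indices:

import numpy as np

rng = np.random.default_rng(0)
scores = rng.random(143)
scores /= scores.sum()                            # stand-in for the 143 softmax outputs
class_names = [f"class_{i}" for i in range(143)]  # stand-in for the class labels

top4_idx = np.argsort(scores)[::-1][:4]           # indices of the 4 highest scores
for i in top4_idx:
    print(f"{class_names[i]}: {scores[i]:.1%}")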
Reducing Model file size in LIBSVM
I want to reduce the model file size . Can we reduce it by reducing the number of digits in the weights of the model file. The number of classes in my model file is around 3800 and the number of features is around 357000. Here is some excerpt from the model file. Can I reduce the number of digits in these weights. solver_type L2R_L2LOSS_SVC_DUAL nr_class 3821 nr_feature 357021 bias -1.000000000000000 w -0.6298615183549175 -0.6884816945277815 -0.9850473581929793 -0.2730180225739936 -0.4444522939544599 -0.3045368061994185 -0.6752904784743610 -0.4936186126242763 -0.8167435931134331 -0.8747648882598349 -0.4980187300672689 -0.8255372912521536 -0.3329812532124196 -0.1751416471640286 -0.7447656595877303 -0.4240569914873799 -0.9004909961812873 -0.9857813112641359 -0.3674085365663847 -0.4819407419877990 -0.3645238468547681 -0.5827397105860186 -0.7290781581209491 -0.8615229165775795 -0.3975308017493017 -0.6522787326004871 -0.9846626520798610 -0.5583216247458188 -0.9488816092738117 -0.6469158771901011 -0.2306256734853684 -0.2940612946888093 -0.6895719661937446 -0.3041407180695167 -0.5602587606930518 -0.4434458835686698 -0.3960629365410545 -0.7512211790407204 -0.6082476608695304 -1.336132842955273 -0.6057066303450040 -0.5726087731282288 -0.4918814547677718 -0.7606578865363953 -0.2951659264868926 -0.3881680788359501 -0.3109241231671961 -0.7078707491799914 -0.3623625688446360 -0.4430137729068305 -0.9279271098475936 -0.2290838088700753 -0.3870980678621480 -0.8000332693180561 -0.7964744879675550 -0.4950551119251316 -0.5201500981458075 -0.6654200978736288 -0.9037766341356712 -0.5921799507740539 -0.4552915755388566 -0.8048467444625557 -0.08638961422716016 -0.3175800991399296 -0.8889281355804046 -0.8889673432972257 0.009443893188055608 -0.3033030733905986 -0.6063958370642328 -0.7781676697747630 -0.9969339455729528 -0.7847641855193951 -0.3709450948897945 -0.9293821956430142 -0.6711216076980766 -0.6472048031763484 -0.2844660995208588 -0.4547657013618363 -0.3093274839631762 -0.8264594986328345 -0.2693948669009715 -0.5691246530468883 -0.5816949288414970 -0.7988407843132017 -0.5846410991542126 -0.6102733673192773 -0.9474472897104326 -0.4619018809588187 -0.6922626991585266 -0.8529509393486879 -0.9341690394723746 -0.2048861760333368 -0.5763255438056814 -0.4753823007333206 -0.9847858814169310 -0.6084670508904806 -0.6097889096385636 -0.1558026578670219 -0.5407452525949980 -0.8426597160875828 -0.5728578082647764 -0.6254655056167889 -0.5002570985981800 -0.5660289375686121 -0.6966970933117435 -0.3595184568720410 -0.8869769517170271 -0.8293060581021244 -0.7660244640066636 -0.9191108227612158 -0.7495472111112249 -0.3250789003708131 -0.8545862221106031 -0.9847863669982040 -0.9862358540926807 -0.9843872487122278 -0.3764841688606632 -0.6665806111063707 -0.6998869717621219 -0.8398491506346015 -0.7498849663083538 -0.2584536929034274 -0.8798094698402976 -0.8659064866640068 -0.8540212609217359 -0.4705628403387491 -0.9848057457322186 -0.5870303872290659 -0.9105115844147157 -0.6855534064105064 -0.7447256224770895 -0.9845164901161550 -0.9267803381073205 -0.6874399094864110 -0.9868490844056681 -0.9871049327408159 -0.9127271706215343 -0.8894132571749456 -0.7481430771200624 -0.7661512147794380 -0.4619076734386954 -0.3463253354355214 -0.7324122395130058 -0.7198934949704492 -0.3869971300152642 -0.3580173602243875 -0.8144411145869335 -0.4708508640578066 -0.7583061726079500 -0.6102585014526588 -0.2323551831668570 -0.7124730357532248 -0.6407019387626708 -0.8770555543363814 -0.7747723882503575 -0.8880529094965369 -0.5221765657051773 
-0.8927103129537772 -0.8873570244928761 -0.6814118942525524 -0.4812414843861851 -0.07723442473878635 -0.3004215736435181 -0.7901826925719376 -0.6000050603345796 -0.9391488020802135 -0.6130019120301854 -0.6519260224181763 -0.6312423953207323 -0.6236684911320279 -0.8319901021019791 -0.9846585341126538 -0.8241847119432536 -0.9849733862258551 0.03619613868867930 -0.9402473523400392 -0.4963043182116479 -0.06988396609313940 -0.6160025364808686 -0.9485679374403244 -0.9552678112333591 -0.2951058860501357 -0.9871232492575841 -0.2801466899229405 -0.5623043303
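A hedged sketch of the idea raised in the question itself (rewriting the weights with fewer digits). It assumes a LIBLINEAR-style text model file in which everything after the line containing just "w" is whitespace-separated weight values; note that rounding the weights may change predictions slightly, and the path names are hypothetical:

def truncate_model_weights(src_path, dst_path, digits=6):
    # Copy the header lines unchanged, then rewrite each weight with fewer significant digits.
    with open(src_path) as src, open(dst_path, "w") as dst:
        in_weights = False
        for line in src:
            if in_weights:
                values = (f"{float(v):.{digits}g}" for v in line.split())
                dst.write(" ".join(values) + "\n")
            else:
                dst.write(line)
                if line.strip() == "w":
                    in_weights = True

# Example call with hypothetical file names:
# truncate_model_weights("model.txt", "model_small.txt", digits=6)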