Numbers of hidden layers and units in AutoKeras dense block - keras
I am training a model with AutoKeras. So far my best model is this:
structured_data_block_1/normalize:false
structured_data_block_1/dense_block_1/use_batchnorm:true
structured_data_block_1/dense_block_1/num_layers:2
structured_data_block_1/dense_block_1/units_0:32
structured_data_block_1/dense_block_1/dropout:0
structured_data_block_1/dense_block_1/units_1:32
dense_block_2/use_batchnorm:true
dense_block_2/num_layers:2
dense_block_2/units_0:128
dense_block_2/dropout:0
dense_block_2/units_1:16
dense_block_3/use_batchnorm:false
dense_block_3/num_layers:1
dense_block_3/units_0:32
dense_block_3/dropout:0
dense_block_3/units_1:32
regression_head_1/dropout:0
optimizer:"adam"
learning_rate:0.1
dense_block_2/units_2:32
structured_data_block_1/dense_block_1/units_2:256
dense_block_3/units_2:128
My first dense_block_1 has 2 layers (num_layers:2), so how can it have three units/neurons entries? It says units_0:32, units_1:32 and units_2:256, which implies to me that there are three layers, so why is num_layers:2?
If I wanted to recreate the above model with the code below, how would I do it properly?
import autokeras as ak

input_node = ak.StructuredDataInput()
output_node = ak.StructuredDataBlock(categorical_encoding=False, normalize=False)(input_node)
output_node = ak.DenseBlock()(output_node)
output_node = ak.DenseBlock()(output_node)
output_node = ak.RegressionHead()(output_node)
Thanks for any input.
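For reference, a minimal sketch of the same graph with the two explicit DenseBlocks pinned to the values the search found. This assumes an AutoKeras version whose DenseBlock and RegressionHead accept num_layers, num_units, use_batchnorm and dropout arguments (num_units is not exposed in some older releases, and it fixes a single width for every layer in a block, so dense_block_2's mixed 128/16 widths cannot be reproduced exactly this way):

import autokeras as ak

input_node = ak.StructuredDataInput()
output_node = ak.StructuredDataBlock(categorical_encoding=False, normalize=False)(input_node)
output_node = ak.DenseBlock(num_layers=2, use_batchnorm=True, dropout=0.0)(output_node)                 # dense_block_2
output_node = ak.DenseBlock(num_layers=1, num_units=32, use_batchnorm=False, dropout=0.0)(output_node)  # dense_block_3
output_node = ak.RegressionHead(dropout=0.0)(output_node)

reg = ak.AutoModel(inputs=input_node, outputs=output_node, max_trials=1, overwrite=True)
# reg.fit(x_train, y_train)  # x_train / y_train are placeholders for the actual data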
Related
Computing gradient twice for two different losses in Pytorch
I want to compute the gradients twice for two different losses in the same iteration.

Code:

batch_output0, batch_output1 = get_output_from_model(model=model, data=batch[0])
train_loss0 = loss_fun0(batch_output0, batch_labels0.float().view(-1, 1))
train_loss0.backward()
grad0_conv_w = model.conv1.conv1.weight.grad

batch_output0, batch_output1 = get_output_from_model(model=model, data=batch[0])
train_loss1 = loss_fun1(batch_output1, batch_labels1.float().view(-1, 1))
train_loss1.backward()
grad1_conv_w = model.conv1.conv1.weight.grad

Outputs:

train_loss0: tensor(0.6950, grad_fn=<BinaryCrossEntropyBackward>)
train_loss1: tensor(25.5431, grad_fn=<MseLossBackward>)
Grad0: tensor([-2.4883e-05, 3.7842e-05, 1.2635e-04, ..., -1.6413e-04, -1.8419e-04, -1.7884e-04])
Grad1: tensor([-2.4883e-05, 3.7842e-05, 1.2635e-04, ..., -1.6413e-04, -1.8419e-04, -1.7884e-04])

You may note that even though the two losses are quite different, the gradients for the corresponding losses are exactly the same. Please help me to diagnose the problem. Thank you.
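A minimal, self-contained sketch of what typically produces this symptom (the tiny two-output model below is a hypothetical stand-in, not the asker's code): param.grad is a buffer that backward() accumulates into, and both grad variables hold a reference to that same buffer, so the gradients need to be cloned and zeroed between the two backward calls.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                    # stand-in for the real two-output model
x = torch.randn(8, 4)
labels0 = torch.randint(0, 2, (8, 1)).float()
labels1 = torch.randn(8, 1)

out = model(x)
loss0 = nn.functional.binary_cross_entropy_with_logits(out[:, :1], labels0)
loss0.backward(retain_graph=True)          # keep the graph for the second backward
grad0 = model.weight.grad.clone()          # clone: .grad by itself is just a reference

model.zero_grad()                          # otherwise the second backward accumulates on top
loss1 = nn.functional.mse_loss(out[:, 1:], labels1)
loss1.backward()
grad1 = model.weight.grad.clone()

print(torch.allclose(grad0, grad1))        # False: the two gradients now genuinely differ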
Gradients vanishing despite using Kaiming initialization
I was implementing a conv block in PyTorch with an activation function (PReLU). I used Kaiming initialization to initialize all my weights and set all the biases to zero. However, as I tested these blocks (by stacking 100 such conv and activation blocks on top of each other), I noticed that the output values I am getting are of the order of 10^(-10). Is this normal, considering I am stacking up to 100 layers? Adding a small bias to each layer fixes the problem, but in Kaiming initialization the biases are supposed to be zero.

Here is the conv block code:

import numpy as np
import torch
import torch.nn as nn
from collections import Iterable


def convBlock(
    input_channels, output_channels, kernel_size=3, padding=None, activation="prelu"
):
    """Initializes a conv block using Kaiming Initialization"""
    padding_par = 0
    if padding == "same":
        padding_par = same_padding(kernel_size)  # same_padding() is defined elsewhere in my code
    conv = nn.Conv2d(input_channels, output_channels, kernel_size, padding=padding_par)
    relu_negative_slope = 0.25
    act = None
    if activation == "prelu" or activation == "leaky_relu":
        nn.init.kaiming_normal_(conv.weight, a=relu_negative_slope, mode="fan_in")
        if activation == "prelu":
            act = nn.PReLU(init=relu_negative_slope)
        else:
            act = nn.LeakyReLU(negative_slope=relu_negative_slope)
    if activation == "relu":
        nn.init.kaiming_normal_(conv.weight, nonlinearity="relu")
        act = nn.ReLU()
    nn.init.constant_(conv.bias.data, 0)
    block = nn.Sequential(conv, act)
    return block


def flatten(lis):
    for item in lis:
        if isinstance(item, Iterable) and not isinstance(item, str):
            for x in flatten(item):
                yield x
        else:
            yield item


def Sequential(args):
    flattened_args = list(flatten(args))
    return nn.Sequential(*flattened_args)

This is the test code:

ls = []
for i in range(100):
    ls.append(convBlock(3, 3, 3, "same"))
model = Sequential(ls)
test = np.ones((1, 3, 5, 5))
model(torch.Tensor(test))

And the output I am getting is:

tensor([[[[-1.7771e-10, -3.5088e-10,  5.9369e-09,  4.2668e-09,  9.8803e-10],
          [ 1.8657e-09, -4.0271e-10,  3.1189e-09,  1.5117e-09,  6.6546e-09],
          [ 2.4237e-09, -6.2249e-10, -5.7327e-10,  4.2867e-09,  6.0034e-09],
          [-1.8757e-10,  5.5446e-09,  1.7641e-09,  5.7018e-09,  6.4347e-09],
          [ 1.2352e-09, -3.4732e-10,  4.1553e-10, -1.2996e-09,  3.8971e-09]],

         [[ 2.6607e-09,  1.7756e-09, -1.0923e-09, -1.4272e-09, -1.1840e-09],
          [ 2.0668e-10, -1.8130e-09, -2.3864e-09, -1.7061e-09, -1.7147e-10],
          [-6.7161e-10, -1.3440e-09, -6.3196e-10, -8.7677e-10, -1.4851e-09],
          [ 3.1475e-09, -1.6574e-09, -3.4180e-09, -3.5224e-09, -2.6642e-09],
          [-1.9703e-09, -3.2277e-09, -2.4733e-09, -2.3707e-09, -8.7598e-10]],

         [[ 3.5573e-09,  7.8113e-09,  6.8232e-09,  1.2285e-09, -9.3973e-10],
          [ 6.6368e-09,  8.2877e-09,  9.2108e-10,  9.7531e-10,  7.0011e-10],
          [ 6.6954e-09,  9.1019e-09,  1.5128e-08,  3.3151e-09,  2.1899e-10],
          [ 1.2152e-08,  7.7002e-09,  1.6406e-08,  1.4948e-08, -6.0882e-10],
          [ 6.9930e-09,  7.3222e-09, -7.4308e-10,  5.2505e-09,  3.4365e-09]]]],
       grad_fn=<PreluBackward>)
Amazing question (and welcome to StackOverflow)! Research paper for quick reference.

TLDR

- Try wider networks (64 channels)
- Add Batch Normalization after the activation (or even before, it shouldn't make much difference); see the sketch after this answer
- Add residual connections (shouldn't improve much over batch norm, last resort)

Please check these in this order and leave a comment on what (and if) any of that worked in your case (as I'm also curious).

Things you do differently

- Your neural network is very deep, yet very narrow (81 parameters per layer only!). Because of that, one cannot reliably create those weights from a normal distribution, as the sample is just too small. Try wider networks, 64 channels or more.
- You are trying a much deeper network than they did. From the section "Comparison Experiments": "We conducted comparisons on a deep but efficient model with 14 weight layers" (actually 22 was also tested in comparison with Xavier). That was due to the release date of this paper (2015) and the hardware limitations "back in the days" (let's say).

Is this normal?

The approach itself is quite strange with layers of this depth, at least currently:

- each conv block is usually followed by an activation like ReLU and Batch Normalization (which normalizes the signal and helps with exploding/vanishing signals)
- networks of this depth (even of half the depth of what you've got) usually also use residual connections (though this is not directly linked to vanishing/small signal, and is more connected to the degradation problem of even deeper networks, like 1000 layers)
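A minimal, self-contained sketch of the first two suggestions (wider blocks plus Batch Normalization after the activation); the block below is an illustrative stand-in, not the asker's exact convBlock:

import torch
import torch.nn as nn

def conv_bn_block(in_ch, out_ch, kernel_size=3, negative_slope=0.25):
    # Same Kaiming init as before, but the block ends with BatchNorm.
    conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
    nn.init.kaiming_normal_(conv.weight, a=negative_slope, mode="fan_in")
    nn.init.zeros_(conv.bias)
    return nn.Sequential(
        conv,
        nn.PReLU(init=negative_slope),
        nn.BatchNorm2d(out_ch),  # re-normalizes the signal so it cannot keep shrinking layer after layer
    )

# 100 blocks again, but 64 channels wide instead of 3.
blocks = [conv_bn_block(3, 64)] + [conv_bn_block(64, 64) for _ in range(99)]
model = nn.Sequential(*blocks)
out = model(torch.ones(1, 3, 5, 5))
print(out.abs().mean())  # stays roughly O(1) instead of ~1e-10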
BERT zero layer fixed word embeddings [duplicate]
I know that BERT has a total vocabulary size of 30522, which contains words and subwords. I want to get the initial input embeddings of BERT. So my requirement is to get the table of size [30522, 768], which I can index by token id to get its embedding. Where can I get this table?
The BertModels have get_input_embeddings(): import torch from transformers import BertModel, BertTokenizer tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') bert = BertModel.from_pretrained('bert-base-uncased') token_embedding = {token: bert.get_input_embeddings()(torch.tensor(id)) for token, id in tokenizer.get_vocab().items()} print(len(token_embedding)) print(token_embedding['[CLS]']) Output: 30522 tensor([ 1.3630e-02, -2.6490e-02, -2.3503e-02, -7.7876e-03, 8.5892e-03, -7.6645e-03, -9.8808e-03, 6.0184e-03, 4.6921e-03, -3.0984e-02, 1.8883e-02, -6.0093e-03, -1.6652e-02, 1.1684e-02, -3.6245e-02, 8.3482e-03, -1.2112e-03, 1.0322e-02, 1.6692e-02, -3.0354e-02, -1.2372e-02, -2.5173e-02, -8.9602e-03, 8.1994e-03, -2.0011e-02, -1.5901e-02, -3.8394e-03, 1.4241e-03, 7.0500e-03, 1.6092e-03, -2.7764e-03, 9.4931e-03, -2.2768e-02, 1.9317e-02, -1.3442e-02, -2.3763e-02, -1.4617e-02, 9.7735e-03, -2.2428e-03, 3.0642e-02, 6.7829e-03, -2.6471e-03, -1.8553e-02, -1.2363e-02, 7.6489e-03, -2.5461e-03, -3.1498e-01, 6.3761e-03, 4.8914e-02, -7.7636e-03, 6.0919e-02, 2.1346e-02, -3.9741e-02, 2.2853e-01, 2.6502e-02, -1.0144e-03, -7.8480e-03, -1.9995e-03, 1.7057e-02, -3.3270e-02, 4.5421e-03, 6.1751e-03, -1.0077e-01, -2.0973e-02, -1.4512e-04, -9.6657e-03, 1.0871e-02, -1.4786e-02, 2.6437e-04, 2.1166e-02, 1.6492e-02, -5.1928e-03, -1.1857e-02, -9.9159e-03, -1.4363e-02, -1.2405e-02, -1.2973e-02, 2.6778e-02, -1.0986e-02, 1.0572e-02, -2.5566e-02, 5.2494e-03, 1.5890e-02, -5.1504e-03, -7.5859e-03, 2.0259e-02, -7.0155e-03, 1.6359e-02, 1.7487e-02, 5.4297e-03, -8.6403e-03, 2.8821e-02, -7.8964e-03, 1.9259e-02, 2.3868e-02, -4.3472e-03, 5.5662e-02, -2.1940e-02, 4.1779e-03, -5.7216e-03, 2.6712e-02, -5.0371e-03, 2.4923e-02, -1.3429e-02, -8.4337e-03, 9.8188e-02, -1.2940e-03, 1.2865e-02, -1.5930e-03, 3.6437e-03, 1.5569e-02, 1.8620e-02, -9.0643e-03, -1.9740e-02, 1.0530e-02, -2.7359e-03, -7.5283e-03, 1.1492e-03, 2.6162e-03, -6.2757e-03, -8.6096e-03, 6.6221e-01, -3.2235e-03, -4.1309e-02, 3.3047e-03, -2.5040e-03, 1.2838e-04, -6.8073e-03, 6.0291e-03, -9.8468e-03, 8.0641e-03, -1.9815e-03, 2.5801e-02, 5.7429e-03, -1.0712e-02, 2.9176e-02, 5.9414e-03, 2.4795e-02, -1.7887e-02, 7.3183e-01, 1.0964e-02, 5.9942e-03, -4.6157e-02, 4.0131e-02, -9.7481e-03, -8.9496e-01, 1.6385e-02, -1.9816e-03, 1.4691e-02, -1.9837e-02, -1.7611e-02, -4.5263e-04, -1.8605e-02, -1.5660e-02, -1.0709e-02, 1.8016e-02, -3.4149e-03, -1.2632e-02, 4.2877e-03, -3.9169e-01, 1.0016e-02, -1.0955e-02, 4.5133e-03, -5.1150e-03, 4.9968e-03, 1.7852e-02, 1.1313e-02, 2.6519e-03, 3.3658e-01, -1.8168e-02, 1.3170e-02, 7.3927e-03, 5.2521e-03, -9.6230e-03, 1.2844e-02, 4.1554e-01, -9.7247e-03, -4.2439e-03, 5.5287e-04, 1.8271e-02, -1.3889e-03, -2.0502e-03, -8.1946e-03, -6.5979e-06, -7.2764e-04, -1.4625e-03, -6.9872e-03, -6.9633e-03, -8.0701e-03, 1.9936e-02, 4.8370e-03, 8.6883e-03, -4.9246e-02, -2.0028e-02, 1.4124e-03, 1.0444e-02, -1.1236e-02, -4.4654e-03, -2.0491e-02, -2.7654e-02, -3.7079e-02, 1.3215e-02, 6.9498e-02, -3.1109e-02, 7.0562e-03, 1.0887e-02, -7.8090e-03, -1.0501e-02, -4.8735e-03, -6.8399e-04, 1.4717e-02, 4.4342e-03, 1.6012e-02, -1.0427e-02, -2.5767e-02, -2.2699e-01, 8.6569e-02, 2.3453e-02, 4.6362e-02, 3.5609e-03, 2.1353e-02, 2.3703e-02, -2.0252e-02, 2.1580e-02, 7.2652e-03, 2.0933e-01, 1.2108e-02, 1.0869e-02, 7.0568e-03, -3.1132e-02, 2.0505e-02, 3.2248e-03, -2.2724e-03, 5.5342e-03, 3.0563e-03, 1.9542e-02, 1.2827e-03, 1.5952e-02, -1.5458e-02, -3.8455e-03, -4.9417e-03, -1.0446e-02, 7.0516e-03, 2.2467e-03, -9.3643e-03, 1.9163e-02, 1.4239e-02, -1.5816e-02, 8.7413e-03, 2.4737e-02, 
-7.3777e-03, -4.0975e-02, 9.4948e-03, 1.4700e-02, 2.6819e-02, 1.0706e-02, 1.0621e-02, -7.1816e-03, -8.5402e-03, 1.2261e-02, -4.8679e-03, -9.6136e-03, 7.8765e-04, 3.8504e-02, -7.7485e-03, -6.5018e-03, 3.4352e-03, 2.2931e-04, 5.7456e-03, -4.8441e-03, -9.0898e-03, 8.6298e-03, 5.4740e-03, 2.2274e-02, -2.1218e-02, -2.6795e-02, -3.5337e-03, 1.0785e-02, 1.2475e-02, -6.1160e-03, 1.0729e-02, -9.7955e-03, 1.8543e-02, -6.0488e-03, -4.5744e-03, 2.7089e-03, 1.5632e-02, -1.2928e-02, -3.0778e-03, -1.0325e-02, -7.9550e-03, -6.3065e-02, 2.1062e-02, -6.6717e-03, 8.4616e-03, 1.4475e-02, 1.1477e-01, -2.2838e-02, -3.7491e-02, -3.6218e-02, -3.1994e-02, -8.9252e-03, 3.1720e-02, -1.1260e-02, -1.2980e-01, -1.0315e-03, -4.7242e-03, -2.0092e-02, -9.4521e-01, -2.2178e-02, -4.4297e-04, 1.9711e-02, 3.3402e-02, -1.0513e-02, 1.4492e-02, -1.9697e-02, -9.8452e-03, -1.7347e-02, 2.3472e-02, 7.6570e-02, 1.9504e-02, 9.3617e-03, 8.2672e-03, -1.0471e-02, -1.9932e-03, 2.0000e-02, 2.0485e-02, 1.0977e-02, 1.7720e-02, 1.3532e-02, 7.3682e-03, 3.4906e-04, 1.8772e-03, 1.9976e-02, -3.2041e-02, -8.9169e-03, 1.2900e-02, -1.3331e-02, 6.6207e-03, -5.7063e-03, -1.1482e-02, 8.3907e-03, -6.4162e-03, 1.5816e-02, 7.8921e-03, 4.4177e-03, 2.2568e-02, 1.0239e-02, -3.0194e-04, 1.3294e-02, -2.1606e-02, 3.8832e-03, 2.4475e-02, 4.3808e-02, -2.1031e-03, -1.2163e-02, -4.0786e-02, 1.5565e-02, 1.4750e-02, 1.6645e-02, 2.8083e-02, 1.8920e-03, -1.4733e-04, -2.6208e-02, 2.3780e-02, 1.8657e-04, -2.2931e-03, 3.0334e-03, -1.7294e-02, -2.3001e-02, 8.6004e-03, -3.3497e-02, 2.5660e-02, -1.9225e-02, -2.7186e-02, -2.1020e-02, -3.5213e-02, -1.8228e-03, -8.2840e-03, 1.1212e-02, 1.0387e-02, -3.4194e-01, -1.9705e-03, 1.1558e-02, 5.1976e-03, 7.4498e-03, 5.7142e-03, 2.8401e-02, -7.7551e-03, 1.0682e-02, -1.2657e-02, -1.8065e-02, 2.6681e-03, 3.3947e-03, -4.5565e-02, -2.1170e-02, -1.7830e-02, 3.4679e-03, -2.2051e-02, -5.4176e-03, -1.1517e-02, -3.4155e-02, -3.0335e-03, -1.3915e-02, 6.2173e-03, -1.1101e-02, -1.5308e-02, 9.2188e-03, -7.5665e-03, 6.5685e-03, 8.0935e-03, 3.1139e-03, -5.5047e-03, -3.1347e-02, 2.2140e-02, 1.0865e-02, -2.7849e-02, -4.9580e-03, 1.8804e-03, 1.0007e-01, -1.8013e-03, -4.8792e-03, 1.5534e-02, -2.0179e-02, -1.2351e-02, -1.3871e-02, 1.1439e-02, -9.0208e-03, 1.2580e-02, -2.5973e-02, -2.0398e-02, -1.9464e-03, 4.3189e-03, 2.0707e-02, 5.0029e-03, -1.0679e-02, 1.2298e-02, 1.0269e-02, 2.2228e-02, 2.9754e-02, -2.6392e-03, 1.9286e-02, -1.5137e-02, 2.1914e-01, 1.3030e-02, -7.4460e-03, -9.6818e-04, 2.9736e-02, 9.8722e-03, -5.6688e-03, 4.2518e-03, 1.8941e-02, -6.3909e-03, 8.0590e-03, -6.7893e-03, 6.0878e-03, -5.3970e-03, 7.5776e-04, 1.1374e-03, -5.0035e-03, -1.6159e-03, 1.6764e-02, 9.1251e-03, 1.3020e-02, -1.0368e-02, 2.2141e-02, -2.5411e-03, -1.5227e-02, 2.3444e-02, 8.4076e-04, -1.1465e-01, 2.7017e-03, -4.4961e-03, 2.9762e-04, -3.9612e-03, 8.9038e-05, 2.8683e-02, 5.0068e-03, 1.6509e-02, 7.8983e-04, 5.7728e-03, 3.2685e-02, -1.0457e-01, 1.2989e-02, 1.1278e-02, 1.1943e-02, 1.5258e-02, -6.2411e-04, 1.0682e-04, 1.2087e-02, 7.2984e-03, 2.7758e-02, 1.7572e-02, -6.0345e-03, 1.7211e-02, 1.4121e-02, 6.4663e-02, 9.1813e-03, 3.2555e-03, -3.2667e-02, 2.9132e-02, -1.7770e-02, 1.5302e-03, -2.9944e-02, -2.0706e-02, -3.6528e-03, -1.5497e-02, 1.5223e-02, -1.4751e-02, -2.2381e-02, 6.9636e-03, -8.0838e-03, -2.4583e-03, -2.0677e-02, 8.8132e-03, -6.9554e-04, 1.6965e-02, 1.8535e-01, 3.5843e-04, 1.0812e-02, -4.2391e-03, 8.1779e-03, 3.4144e-02, -1.8996e-03, 2.9939e-03, 3.6898e-04, -1.0144e-02, -5.7416e-03, -5.7676e-03, 1.7565e-01, -1.5793e-03, -2.6617e-02, -1.2572e-02, 3.0421e-04, 
-1.2132e-02, -1.4168e-02, 1.2154e-02, 8.4700e-03, -1.6284e-02, 2.6983e-03, -6.8554e-03, 2.7829e-01, 2.4060e-02, 1.1130e-02, 7.6095e-04, 3.1341e-01, 2.1668e-02, 1.0277e-02, -3.0065e-02, -8.3565e-03, 5.2488e-03, -1.1287e-02, -1.8266e-02, 1.1814e-02, 1.2662e-02, 2.9036e-04, 7.0254e-04, -1.4084e-02, 1.2925e-02, 3.9504e-03, -7.9568e-03, 3.2794e-02, 7.3839e-03, 2.4609e-02, 9.6109e-03, -8.7206e-03, 9.2571e-03, -3.5850e-03, -8.9996e-03, 2.3120e-03, -1.8475e-02, -1.9610e-02, 1.1994e-02, 6.7156e-03, 1.9903e-02, 3.0703e-02, -4.9538e-03, -6.1673e-02, -6.4986e-03, -2.1317e-02, -3.3650e-03, 2.3200e-03, -6.2224e-03, 3.7458e-03, 1.1542e-02, -1.0181e-02, -8.4711e-03, 1.1603e-02, -5.6247e-03, -1.0220e-02, -8.6501e-04, -1.2285e-02, -8.7487e-03, -1.1265e-02, 1.6322e-02, 1.5160e-02, 1.8882e-02, 5.1557e-03, -8.8616e-03, 4.2153e-03, -1.9450e-02, -8.7365e-03, -9.7867e-03, 1.1667e-02, 5.0613e-03, 2.8221e-03, -7.1795e-03, 9.3306e-03, -4.9663e-02, 1.7708e-02, -2.0959e-02, -3.3989e-02, 2.2581e-03, 5.1748e-03, -1.0133e-01, 2.1052e-03, 5.5644e-03, 1.3607e-03, 8.8388e-03, 1.0244e-02, -3.8072e-03, 5.9209e-03, 6.7993e-03, 1.1594e-02, -1.1802e-02, -2.4233e-03, -5.1504e-03, -1.1903e-02, 1.4075e-02, -4.0701e-03, -2.9465e-02, -1.7579e-03, 4.3654e-03, 1.0429e-02, 3.7096e-02, 8.6493e-03, 1.5871e-02, 1.8034e-02, -3.2165e-03, -2.1941e-02, 2.6274e-02, -7.6941e-03, -5.9618e-03, -1.4179e-02, 8.0281e-03, 1.1293e-02, -6.6936e-05, 1.2899e-02, 1.0056e-02, -6.3919e-04, 2.0299e-02, 3.1528e-03, -4.8988e-03, 3.2754e-03, -1.1003e-01, 1.8414e-02, 2.2272e-03, -2.2185e-02, -4.8672e-03, 1.9643e-03, 3.0928e-02, -8.9599e-03, -1.1446e-02, -1.3794e-02, 7.1943e-03, -5.8965e-03, 2.2605e-03, -2.6114e-02, -5.6616e-03, 6.5073e-03, 9.2219e-02, -6.7243e-03, 4.4427e-04, 7.2846e-03, -1.1021e-02, 7.8802e-04, -3.8878e-03, 1.0489e-02, 9.2883e-03, 1.8895e-02, 2.1808e-02, 6.2590e-04, -2.6519e-02, 7.0343e-04, -2.9067e-02, -9.1515e-03, 1.0418e-03, 8.3222e-03, -8.7548e-03, -2.0637e-03, -1.1450e-02, -8.8985e-04, -4.4062e-03, 2.3629e-02, -2.7221e-02, 3.2008e-02, 6.6325e-03, -1.1302e-02, -1.0138e-03, -1.6902e-01, -8.4473e-03, 2.8536e-02, 1.4117e-03, -1.2136e-02, -1.4781e-02, 4.9960e-03, 3.3916e-02, 5.2710e-03, 1.7382e-02, -4.6315e-03, 1.1680e-02, -9.1395e-03, 1.8310e-02, 1.2321e-02, -2.4871e-02, 1.1535e-02, 5.0308e-03, 5.5028e-03, -7.2184e-03, -5.5210e-03, 1.7085e-02, 5.7236e-03, 1.7463e-03, 1.9969e-03, 6.1670e-03, 2.9347e-03, 1.3946e-02, -1.9984e-03, 1.0091e-02, 1.0388e-03, -6.1902e-03, 3.0905e-02, 6.6038e-03, -9.1223e-02, -1.8411e-02, 5.4185e-03, 2.4396e-02, 1.5696e-02, -1.2742e-02, 1.8126e-02, -2.6138e-02, 1.1170e-02, -1.3058e-02, -1.9386e-02, -5.9828e-03, 1.9176e-02, 1.9962e-03, -2.1538e-03, 3.3003e-02, 1.8407e-02, -5.9498e-03, -3.2533e-03, -1.8917e-02, -1.5897e-02, -4.7057e-03, 5.4162e-03, -3.0037e-02, 8.6773e-03, -1.7942e-03, 6.6826e-03, -1.1929e-02, -1.4076e-02, 1.6709e-02, 1.6860e-03, -3.3842e-03, 8.6805e-03, 7.1340e-03, 1.5147e-02], grad_fn=<EmbeddingBackward>)
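A shorter, hedged alternative to building the dictionary above: the whole [30522, 768] table is just the weight matrix of the model's input embedding layer, so it can be taken as one tensor and indexed by token id directly.

import torch
from transformers import BertModel, BertTokenizer

bert = BertModel.from_pretrained('bert-base-uncased')
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

embedding_table = bert.get_input_embeddings().weight  # shape: torch.Size([30522, 768])
print(embedding_table.shape)

cls_id = tokenizer.convert_tokens_to_ids('[CLS]')     # look the id up rather than hard-coding it
print(embedding_table[cls_id])                        # same vector as token_embedding['[CLS]'] above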
To get a context-sensitive word embedding for a given input sentence/text, here is the code:

import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel


def get_word_idx(sent: str, word: str):
    return sent.split(" ").index(word)


def get_hidden_states(encoded, token_ids_word, model, layers):
    """Push input IDs through model. Stack and sum `layers` (last four by default).
    Select only those subword token outputs that belong to our word of interest
    and average them."""
    with torch.no_grad():
        output = model(**encoded)
    # Get all hidden states
    states = output.hidden_states
    # Stack and sum all requested layers
    output = torch.stack([states[i] for i in layers]).sum(0).squeeze()
    # Only select the tokens that constitute the requested word
    word_tokens_output = output[token_ids_word]
    return word_tokens_output.mean(dim=0)


def get_word_vector(sent, idx, tokenizer, model, layers):
    """Get a word vector by first tokenizing the input sentence, getting all token idxs
    that make up the word of interest, and then `get_hidden_states`."""
    encoded = tokenizer.encode_plus(sent, return_tensors="pt")
    # get all token idxs that belong to the word of interest
    token_ids_word = np.where(np.array(encoded.word_ids()) == idx)
    return get_hidden_states(encoded, token_ids_word, model, layers)


def main(layers=None):
    # Use last four layers by default
    layers = [-4, -3, -2, -1] if layers is None else layers
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModel.from_pretrained("bert-base-cased", output_hidden_states=True)

    sent = "I like cookies ."
    idx = get_word_idx(sent, "cookies")
    word_embedding = get_word_vector(sent, idx, tokenizer, model, layers)

    return word_embedding


if __name__ == '__main__':
    main()

More details can be found here.
Top 4 Prediction Using Keras Model
I made my own Keras CNN and used the code below to predict. The prediction give all the 143 prediction while I only want the four major classes with the highest percentage. Code: preds = model.predict(imgs) for cls in train_generator.class_indices: x = preds[0][train_generator.class_indices[cls]] x_pred = "{:.1%}".format(x) value = (cls+":"+ x_pred) print (value) Prediction: Acacia_abyssinica:0.0% Acacia_kirkii:0.0% Acacia_mearnsii:0.0% Acacia_melanoxylon:0.0% Acacia_nilotica:0.0% Acacia_polyacantha:0.0% Acacia_senegal:0.0% Acacia_seyal:0.0% Acacia_xanthophloea:0.0% Afrocarpus_falcatus:0.0% Afzelia_quanzensis:0.0% Albizia_gummifera:0.0% Albizia_lebbeck:0.0% Allanblackia_floribunda:0.0% Artocarpus_heterophyllus:0.0% Azadirachta_indica:0.0% Balanites_aegyptiaca:0.0% Bersama_abyssinica:0.0% Bischofia_javanica:0.0% Brachylaena_huillensis:0.0% Bridelia_micrantha:0.0% Calodendron_capensis:0.0% Calodendrum_capense:0.0% Casimiroa_edulis:0.0% Cassipourea_malosana:0.0% Casuarina_cunninghamiana:0.0% Casuarina_equisetifolia:4.8% Catha_edulis:0.0% Cathium_Keniensis:0.0% Ceiba_pentandra:39.1% Celtis_africana:0.0% Chionanthus_battiscombei:0.0% Clausena_anisat:0.0% Clerodendrum_johnstonii:0.0% Combretum_molle:0.0% Cordia_africana:0.0% Cordia_africana_Cordia:0.0% Cotoneaster_Pannos:0.0% Croton_macrostachyus:0.0% Croton_megalocarpus:0.0% Cupressus_lusitanica:0.0% Cussonia_Spicata:0.2% Cussonia_holstii:0.0% Diospyros_abyssinica:0.0% Dodonaea_angustifolia:0.0% Dodonaea_viscosa:0.0% Dombeya_goetzenii:0.0% Dombeya_rotundifolia:0.0% Dombeya_torrida:0.0% Dovyalis_abyssinica:0.0% Dovyalis_macrocalyx:0.0% Drypetes_gerrardii:0.0% Ehretia_cymosa:0.0% Ekeber_Capensis:0.0% Erica_arborea:0.0% Eriobotrya_japonica:0.0% Erythrina_abyssinica:0.0% Eucalyptus_camaldulensis:0.0% Eucalyptus_globulus:55.9% Eucalyptus_grandis:0.0% Eucalyptus_grandis_saligna:0.0% Eucalyptus_hybrids:0.0% Eucalyptus_saligna:0.0% Euclea_divinorum:0.0% Ficus_indica:0.0% Ficus_natalensi:0.0% Ficus_sur:0.0% Ficus_sycomorus:0.0% Ficus_thonningii:0.0% Flacourtia_indica:0.0% Flacourtiaceae:0.0% Fraxinus_pennsylvanica:0.0% Grevillea_robusta:0.0% Hagenia_abyssinica:0.0% Jacaranda_mimosifolia:0.0% Juniperus_procera:0.0% Kigelia_africana:0.0% Macaranga_capensis:0.0% Mangifera_indica:0.0% Manilkara_Discolor:0.0% Markhamia_lutea:0.0% Maytenus_senegalensis:0.0% Melia_volkensii:0.0% Meyna_tetraphylla:0.0% Milicia_excelsa:0.0% Moringa_Oleifera:0.0% Murukku_Trichilia_emetica:0.0% Myrianthus_holstii:0.0% Newtonia_buchananii:0.0% Nuxia_congesta:0.0% Ochna_holstii:0.0% Ochna_ovata:0.0% Ocotea_usambarensis:0.0% Olea_Europaea:0.0% Olea_africana:0.0% Olea_capensis:0.0% Olea_hochstetteri:0.0% Olea_welwitschii:0.0% Osyris_lanceolata:0.0% Persea_americana:0.0% Pinus_radiata:0.0% Podocarpus _falcatus:0.0% Podocarpus_latifolius:0.0% Polyscias_fulva:0.0% Polyscias_kikuyuensis:0.0% Pouteria_adolfi_friedericii:0.0% Prunus_africana:0.0% Psidium_guajava:0.0% Rauvolfia_Vomitoria:0.0% Rhus_natalensis:0.0% Rhus_vulgaris:0.0% Schinus_molle:0.0% Schrebera_alata:0.0% Sclerocarya_birrea:0.0% Scolopia_zeyheri:0.0% Senna_siamea:0.0% Sinarundinaria_alpina:0.0% Solanum_mauritianum:0.0% Spathodea_campanulata:0.0% Strychnos_usambare:0.0% Syzygium_afromontana:0.0% Syzygium_cordatum:0.0% Syzygium_cuminii:0.0% Syzygium_guineense:0.0% Tamarindus_indica:0.0% Tarchonanthus_camphoratus:0.0% Teclea_Nobilis:0.0% Teclea_simplicifolia:0.0% Terminalia_brownii:0.0% Terminalia_mantaly:0.0% Toddalia_asiatica:0.0% Trema_Orientalis:0.0% Trichilia_emetica:0.0% Trichocladus_ellipticus:0.0% 
Trimeria_grandifolia:0.0% Vangueria_madagascariensis:0.0% Vepris_nobilis:0.0% Vepris_simplicifolia:0.0% Vernonia_auriculifera:0.0% Vitex_keniensis:0.0% Warburgia_ugandensis:0.0% Zanthoxylum_gilletii:0.0% Mahogany_tree:0.0%
You can just get all your predictions, sort them and take the top four:

preds = model.predict(imgs)

sorted_preds = []
for cls in train_generator.class_indices:
    x = preds[0][train_generator.class_indices[cls]]
    x_pred = "{:.1%}".format(x)
    sorted_preds.append([x, x_pred, cls])

top_4 = sorted(sorted_preds, reverse=True)[:4]
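An equivalent, self-contained sketch of the same "top 4" selection with NumPy's argsort; the random scores and placeholder class names below are dummies standing in for model.predict(imgs) and train_generator.class_indices:

import numpy as np

rng = np.random.default_rng(0)
scores = rng.random(143)
scores /= scores.sum()                            # stand-in for the 143 softmax outputs
class_names = [f"class_{i}" for i in range(143)]  # stand-in for the class labels

top4_idx = np.argsort(scores)[::-1][:4]           # indices of the 4 highest scores
for i in top4_idx:
    print(f"{class_names[i]}: {scores[i]:.1%}")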
Reducing Model file size in LIBSVM
I want to reduce the model file size . Can we reduce it by reducing the number of digits in the weights of the model file. The number of classes in my model file is around 3800 and the number of features is around 357000. Here is some excerpt from the model file. Can I reduce the number of digits in these weights. solver_type L2R_L2LOSS_SVC_DUAL nr_class 3821 nr_feature 357021 bias -1.000000000000000 w -0.6298615183549175 -0.6884816945277815 -0.9850473581929793 -0.2730180225739936 -0.4444522939544599 -0.3045368061994185 -0.6752904784743610 -0.4936186126242763 -0.8167435931134331 -0.8747648882598349 -0.4980187300672689 -0.8255372912521536 -0.3329812532124196 -0.1751416471640286 -0.7447656595877303 -0.4240569914873799 -0.9004909961812873 -0.9857813112641359 -0.3674085365663847 -0.4819407419877990 -0.3645238468547681 -0.5827397105860186 -0.7290781581209491 -0.8615229165775795 -0.3975308017493017 -0.6522787326004871 -0.9846626520798610 -0.5583216247458188 -0.9488816092738117 -0.6469158771901011 -0.2306256734853684 -0.2940612946888093 -0.6895719661937446 -0.3041407180695167 -0.5602587606930518 -0.4434458835686698 -0.3960629365410545 -0.7512211790407204 -0.6082476608695304 -1.336132842955273 -0.6057066303450040 -0.5726087731282288 -0.4918814547677718 -0.7606578865363953 -0.2951659264868926 -0.3881680788359501 -0.3109241231671961 -0.7078707491799914 -0.3623625688446360 -0.4430137729068305 -0.9279271098475936 -0.2290838088700753 -0.3870980678621480 -0.8000332693180561 -0.7964744879675550 -0.4950551119251316 -0.5201500981458075 -0.6654200978736288 -0.9037766341356712 -0.5921799507740539 -0.4552915755388566 -0.8048467444625557 -0.08638961422716016 -0.3175800991399296 -0.8889281355804046 -0.8889673432972257 0.009443893188055608 -0.3033030733905986 -0.6063958370642328 -0.7781676697747630 -0.9969339455729528 -0.7847641855193951 -0.3709450948897945 -0.9293821956430142 -0.6711216076980766 -0.6472048031763484 -0.2844660995208588 -0.4547657013618363 -0.3093274839631762 -0.8264594986328345 -0.2693948669009715 -0.5691246530468883 -0.5816949288414970 -0.7988407843132017 -0.5846410991542126 -0.6102733673192773 -0.9474472897104326 -0.4619018809588187 -0.6922626991585266 -0.8529509393486879 -0.9341690394723746 -0.2048861760333368 -0.5763255438056814 -0.4753823007333206 -0.9847858814169310 -0.6084670508904806 -0.6097889096385636 -0.1558026578670219 -0.5407452525949980 -0.8426597160875828 -0.5728578082647764 -0.6254655056167889 -0.5002570985981800 -0.5660289375686121 -0.6966970933117435 -0.3595184568720410 -0.8869769517170271 -0.8293060581021244 -0.7660244640066636 -0.9191108227612158 -0.7495472111112249 -0.3250789003708131 -0.8545862221106031 -0.9847863669982040 -0.9862358540926807 -0.9843872487122278 -0.3764841688606632 -0.6665806111063707 -0.6998869717621219 -0.8398491506346015 -0.7498849663083538 -0.2584536929034274 -0.8798094698402976 -0.8659064866640068 -0.8540212609217359 -0.4705628403387491 -0.9848057457322186 -0.5870303872290659 -0.9105115844147157 -0.6855534064105064 -0.7447256224770895 -0.9845164901161550 -0.9267803381073205 -0.6874399094864110 -0.9868490844056681 -0.9871049327408159 -0.9127271706215343 -0.8894132571749456 -0.7481430771200624 -0.7661512147794380 -0.4619076734386954 -0.3463253354355214 -0.7324122395130058 -0.7198934949704492 -0.3869971300152642 -0.3580173602243875 -0.8144411145869335 -0.4708508640578066 -0.7583061726079500 -0.6102585014526588 -0.2323551831668570 -0.7124730357532248 -0.6407019387626708 -0.8770555543363814 -0.7747723882503575 -0.8880529094965369 -0.5221765657051773 
-0.8927103129537772 -0.8873570244928761 -0.6814118942525524 -0.4812414843861851 -0.07723442473878635 -0.3004215736435181 -0.7901826925719376 -0.6000050603345796 -0.9391488020802135 -0.6130019120301854 -0.6519260224181763 -0.6312423953207323 -0.6236684911320279 -0.8319901021019791 -0.9846585341126538 -0.8241847119432536 -0.9849733862258551 0.03619613868867930 -0.9402473523400392 -0.4963043182116479 -0.06988396609313940 -0.6160025364808686 -0.9485679374403244 -0.9552678112333591 -0.2951058860501357 -0.9871232492575841 -0.2801466899229405 -0.5623043303
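A hedged sketch of the idea raised in the question itself (rewriting the weights with fewer digits). It assumes a LIBLINEAR-style text model file in which everything after the line containing just "w" is whitespace-separated weight values; note that rounding the weights may change predictions slightly, and the path names are hypothetical:

def truncate_model_weights(src_path, dst_path, digits=6):
    # Copy the header lines unchanged, then rewrite each weight with fewer significant digits.
    with open(src_path) as src, open(dst_path, "w") as dst:
        in_weights = False
        for line in src:
            if in_weights:
                values = (f"{float(v):.{digits}g}" for v in line.split())
                dst.write(" ".join(values) + "\n")
            else:
                dst.write(line)
                if line.strip() == "w":
                    in_weights = True

# Example call with hypothetical file names:
# truncate_model_weights("model.txt", "model_small.txt", digits=6)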