How to print the weights of a Keras Embedding layer?

import gensim

# Load Google's pre-trained Word2Vec model.
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
embed = model.get_keras_embedding()
print(embed.weights)
# output: []
How can I print the weights of the embedding layer in Keras? Currently it prints an empty list.
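A likely explanation, not stated in the original post: a Keras layer only creates its weight variables once it is built, i.e. once it has been called on an input or added to a model, so embed.weights is still empty right after get_keras_embedding(). A minimal sketch, assuming the standalone keras package that gensim's helper uses:
from keras.layers import Input

# The layer has no weight variables until it is built by being called on an
# input; only then is the word2vec matrix attached as the layer's weights.
_ = embed(Input(shape=(1,), dtype='int32'))   # build the layer
print(embed.get_weights()[0].shape)           # word2vec matrix, e.g. (vocab_size, 300)
print(embed.weights)                          # now non-empty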

Related

How to concatenate BERT-like sentence representation and word embeddings - Keras & huggingface

I am following this Keras tutorial to combine Hugging Face transformers with other layers:
https://keras.io/examples/nlp/text_extraction_with_bert/
I want to concatenate the transformer embedding layer (included in the tutorial) with some regular word embeddings:
encoder = TFBertModel.from_pretrained("bert-base-uncased")
input_ids = layers.Input(shape=(max_transformer_len,), dtype=tf.int32)
token_type_ids = layers.Input(shape=(max_transformer_len,), dtype=tf.int32)
attention_mask = layers.Input(shape=(max_transformer_len,), dtype=tf.int32)

# Sub-word (token-level) representations from the transformer.
embedding1 = encoder(
    input_ids, token_type_ids=token_type_ids, attention_mask=attention_mask)[0]

# Regular word-level embeddings.
input_wordembedding = Input(shape=(max_sentence_len,), dtype='int32',
                            name='we_input')
embedding2 = Embedding(output_dim=wordembedding_VECTOR_SIZE,
                       input_dim=wordembedding_VOCAB_SIZE,
                       input_length=max_sentence_len,
                       weights=[emb_matrix],
                       name='emb1')(input_wordembedding)
z = Concatenate(name='merged')([embedding1, embedding2])
My problem is that embedding1 contains sub-word representations while embedding2 contains word representations. I therefore want to max-pool the sub-word vectors in embedding1 so that I obtain one transformer vector per word.
Does anyone know how to approach this problem in Keras? If it is impossible to solve in Keras, is it doable in PyTorch? How?
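One possible approach in Keras, sketched below under the assumption that you can precompute, for every word, the positions of its sub-word tokens in the transformer sequence; the extra input subword_ids and the length max_subwords_per_word are hypothetical, not from the original post. The idea is to gather each word's sub-word vectors from embedding1, max-pool over the sub-word axis, and only then concatenate with embedding2 so the sequence lengths match.
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical extra input: for each word, the indices of its sub-word tokens
# in the transformer sequence (padded, e.g. by repeating the last index).
subword_ids = layers.Input(shape=(max_sentence_len, max_subwords_per_word), dtype=tf.int32)

def pool_subwords(args):
    token_emb, ids = args
    # (batch, words, subwords, hidden): pick each word's sub-word vectors
    gathered = tf.gather(token_emb, ids, batch_dims=1)
    # max-pool over the sub-word axis -> one transformer vector per word
    return tf.reduce_max(gathered, axis=2)

word_level_bert = layers.Lambda(pool_subwords)([embedding1, subword_ids])
z = layers.Concatenate(name='merged')([word_level_bert, embedding2])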

How to load BertForSequenceClassification model weights into a BertForTokenClassification model?

Initially, I fine-tuned a BERT base cased model on a text classification dataset, using the BertForSequenceClassification class for this.
from transformers import BertForSequenceClassification, AdamW, BertConfig

# Load BertForSequenceClassification, the pretrained BERT model with a single
# linear classification layer on top.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",         # Use the 12-layer BERT model, with an uncased vocab.
    num_labels=2,                # The number of output labels--2 for binary classification.
                                 # You can increase this for multi-class tasks.
    output_attentions=False,     # Whether the model returns attentions weights.
    output_hidden_states=False,  # Whether the model returns all hidden-states.
)
Now I want to use this fine-tuned BERT model's weights for Named Entity Recognition, which requires the BertForTokenClassification class. I'm unable to figure out how to load the fine-tuned weights into the new model created with BertForTokenClassification.
Thanks in advance.
You can take the weights of the bert submodule inside the first model and load them into the bert submodule inside the second:
new_model = BertForTokenClassification(config=config)
new_model.bert.load_state_dict(model.bert.state_dict())
This worked for me:
new_model = BertForTokenClassification.from_pretrained('/config path')
new_model.bert.load_state_dict(model.bert.state_dict())
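An alternative sketch, assuming the standard transformers save/load API: save the fine-tuned sequence-classification model to disk and reload it with the token-classification class; from_pretrained reuses the shared BERT encoder weights and randomly initializes only the new token-classification head. The directory name and num_ner_labels below are placeholders, not from the original post.
from transformers import BertForTokenClassification

model.save_pretrained("finetuned-bert-seqcls")            # placeholder directory
new_model = BertForTokenClassification.from_pretrained(
    "finetuned-bert-seqcls", num_labels=num_ner_labels)   # num_ner_labels: your NER label count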

How to combine a POS tag feature with the word vector from a pretrained gensim word2vec model and use it in a Keras embedding layer

I have already pretrained a word2vec model in gensim. In Keras, I want to combine the word vector obtained from the pretrained word2vec model with that word's POS tag feature, which I encode as a one-hot vector. I think this requires an embedding matrix, so I want to build an embedding layer in Keras that achieves this and can feed further layers (LSTM). Can you tell me in detail how to do this?
Just merge the two features. Take care of the shapes when merging: the output of your word embedding layer has 3 dimensions, while the POS input has 2, so you have to reshape the POS tensor to 3 dimensions before merging.
from keras.layers import Input, Concatenate, Reshape
from keras.layers import Embedding

max_len = 100

word_in = Input(shape=(max_len,))
emb_word = Embedding(1000, 300, input_length=max_len)(word_in)  # (batch, max_len, 300)

pos_ohe = Input(shape=(max_len,))
pos_in = Reshape((max_len, 1))(pos_ohe)  # add a third dimension: (batch, max_len, 1)

x = Concatenate()([emb_word, pos_in])  # (batch, max_len, 301)
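Since the question mentions pretrained gensim vectors and a one-hot POS encoding, here is a sketch along the same lines; w2v (a trained gensim Word2Vec model) and n_pos (the number of POS tags) are assumptions, not from the post:
import numpy as np
from keras.models import Model
from keras.layers import Input, Embedding, Concatenate, LSTM

emb_matrix = w2v.wv.vectors                  # assumed gensim model; shape (vocab_size, vector_size)
vocab_size, vector_size = emb_matrix.shape

word_in = Input(shape=(max_len,))
pos_in = Input(shape=(max_len, n_pos))       # one-hot POS tags, already 3-D

emb_word = Embedding(vocab_size, vector_size,
                     weights=[emb_matrix],
                     input_length=max_len,
                     trainable=False)(word_in)   # keep the word2vec vectors fixed
x = Concatenate()([emb_word, pos_in])            # (batch, max_len, vector_size + n_pos)
x = LSTM(128)(x)
model = Model([word_in, pos_in], x)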

How to extract features from a layer of a pretrained ResNet model in Keras

I trained a model with Resnet3D and I want to extract the activations (features) of a layer. I plan to use them with an SVM classifier. How can I extract these features and put them into a NumPy array?
Load the weights with Keras:
model = Resnet3DBuilder.build_resnet_18((128, 96, 96, 3), nClass[0])
model.load_weights('drive/app/models/3d_resnet_modelq.hdf5')
Extract a layer:
dns = model.layers[-1].output
Now what should I do?
If you just want to visualise the features, in pure Keras you can define a Model with the desired layer as output:
from keras.models import Model
model_cut = Model(inputs=model.inputs, outputs=model.layers[-1].output)
features = model_cut.predict(x) # Assuming you have your images in x
Note that in order for this to work, model must have been compiled at least once.
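Since the question mentions an SVM, a short sketch of the next step with scikit-learn; x_train, y_train and x_test are assumed to exist and are not from the original post:
from sklearn.svm import SVC

# Extract features for the training and test images, then flatten them.
train_features = model_cut.predict(x_train)
test_features = model_cut.predict(x_test)

clf = SVC(kernel='rbf')
clf.fit(train_features.reshape(len(train_features), -1), y_train)
predictions = clf.predict(test_features.reshape(len(test_features), -1))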

How to implement word2vec CBOW in Keras with a shared Embedding layer and negative sampling?

I want to create a word embedding pretraining network which adds something on top of word2vec CBOW. Therefore, I'm trying to implement word2vec CBOW first. Since I'm very new to keras, I'm unable to figure out how to implement CBOW in it.
Initialization:
I have calculated the vocabulary and have the mapping of word to integers.
Input to the (yet to be implemented) network:
A list of 2*k + 1 integers (representing the central word and 2*k words in context)
Network Specification
A shared Embedding layer should take this list of integers and give their corresponding vector outputs. Further, a mean of the 2*k context vectors is to be taken (I believe this can be done using add_node(layer, name, inputs=[2*k vectors], merge_mode='ave')).
It will be very helpful if anyone can share a small code-snippet of this.
P.S.: I was looking at word2veckeras, but couldn't follow its code because it also uses gensim.
UPDATE 1:
I want to share the embedding layer in the network. The embedding layer should be able to take the context words (2*k) as well as the current word. I can do this by taking all 2*k + 1 word indices in the input and writing a custom lambda function which will do the needful. But after that I also want to add a negative sampling network, for which I'll have to take the embeddings of more words and compute their dot product with the context vector. Can someone provide an example where the Embedding layer is a shared node in the Graph() network?
Graph() has been deprecated in Keras.
Any arbitrary network can be created using the Keras functional API.
Following is demo code which creates a word2vec CBOW model with negative sampling, tested on randomized inputs:
from keras import backend as K
import numpy as np
from keras.models import Model
from keras.layers import Input, Lambda, merge
from keras.layers.embeddings import Embedding

k = 3               # context window size
context_size = 2*k
neg = 5             # number of negative samples

# generate a weight matrix for the embeddings (row i is filled with the value i)
embedding = []
for i in range(10):
    embedding.append(np.full(100, i))
embedding = np.array(embedding)
print(embedding)

# Creating the CBOW model
word_index = Input(shape=(1,))
context = Input(shape=(context_size,))
negative_samples = Input(shape=(neg,))

shared_embedding_layer = Embedding(input_dim=10, output_dim=100, weights=[embedding])

word_embedding = shared_embedding_layer(word_index)
context_embeddings = shared_embedding_layer(context)
negative_words_embedding = shared_embedding_layer(negative_samples)

# average the context embeddings to get the CBOW vector
cbow = Lambda(lambda x: K.mean(x, axis=1), output_shape=(100,))(context_embeddings)

# dot products of the CBOW vector with the target word and with the negative samples
# (legacy Keras 1.x `merge` API, as in the original answer)
word_context_product = merge([word_embedding, cbow], mode='dot')
negative_context_product = merge([negative_words_embedding, cbow], mode='dot', concat_axis=-1)

model = Model(input=[word_index, context, negative_samples],
              output=[word_context_product, negative_context_product])
model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])

# test on randomized inputs
input_context = np.random.randint(10, size=(1, context_size))
input_word = np.random.randint(10, size=(1,))
input_negative = np.random.randint(10, size=(1, neg))

print("word, context, negative samples")
print(input_word.shape, input_word)
print(input_context.shape, input_context)
print(input_negative.shape, input_negative)

output_dot_product, output_negative_product = model.predict([input_word, input_context, input_negative])
print("word cbow dot product")
print(output_dot_product.shape, output_dot_product)
print("cbow negative dot product")
print(output_negative_product.shape, output_negative_product)
Hope it helps!
UPDATE 1:
I've completed the code and uploaded it here
You could try something like this. Here I've initialized the embedding matrix to a fixed value. For an input array of shape (1, 6) you'll get an output of shape (1, 100), where each of the 100 values is the average of the 6 input embeddings.
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Lambda
from keras.layers.embeddings import Embedding

model = Sequential()

k = 3               # context window size
context_size = 2*k

# generate a weight matrix for the embeddings (row i is filled with the value i)
embedding = []
for i in range(10):
    embedding.append(np.full(100, i))
embedding = np.array(embedding)
print(embedding)

model.add(Embedding(input_dim=10, output_dim=100, input_length=context_size, weights=[embedding]))
model.add(Lambda(lambda x: K.mean(x, axis=1), output_shape=(100,)))
model.compile('rmsprop', 'mse')

input_array = np.random.randint(10, size=(1, context_size))
print(input_array.shape)

output_array = model.predict(input_array)
print(output_array.shape)
print(output_array[0])