I have a list of embeddings. The list contains N lists with M embeddings (tensors) each.
list_embd = [[M embeddings], [M embeddings], ...]
(Each embedding is a tensor with size (1,512))
What I want to do is create a tensor of size (N, M), where each "cell" is one embedding.
Tried this with a numpy array:
array = np.zeros((n, m))
for i in range(n):
    for j in range(m):
        array[i, j] = list_embd[i][j]
But still got errors.
In PyTorch I tried to concat all M embeddings into one tensor of size (1, M), and then concat all rows. But when I concat two of those M embeddings along dim 1, I get a tensor of shape (1, 1024) instead of (1, 2).
final = torch.tensor([])
for i in range(n):
    interm = torch.tensor([])
    for j in range(m):
        interm = torch.cat((interm, list_embd[i][j]), 0)
    final = torch.cat((final, interm), 1)
Any ideas or suggestions?
I need a matrix with the embeddings in each cell.
You can use torch.cat and torch.stack to create a final 3D tensor of shape (N, M, 512):
final = torch.stack([torch.cat(sub_list, dim=0) for sub_list in list_embd], dim=0)
First, you use torch.cat to create a list of N 2D tensors of shape (M, 512) from each list of M embeddings. Then torch.stack is used to stack these N 2D matrices into a single 3D tensor final.
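As a quick sanity check (a minimal sketch with dummy data, not from the original post), this reproduces the shapes involved:
import torch

N, M = 4, 3
# N lists of M embeddings, each of shape (1, 512)
list_embd = [[torch.randn(1, 512) for _ in range(M)] for _ in range(N)]

final = torch.stack([torch.cat(sub_list, dim=0) for sub_list in list_embd], dim=0)
print(final.shape)  # torch.Size([4, 3, 512])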
I have tensor t with shape (Batch_Size x Dims) and another tensor v with shape (Vocab_Size x Dims). I'd like to produce a tensor d with shape (Batch_Size x Vocab_Size), such that d[i,j] = norm(t[i] - v[j]).
Doing this for a single tensor (no batches) is trivial: d = torch.norm(v - t), since t would be broadcast. How can I do this when the tensors have batches?
Insert unitary dimensions into v and t to make them (1 x Vocab_Size x Dims) and (Batch_Size x 1 x Dims) respectively. Next, take the broadcasted difference to get a tensor of shape (Batch_Size x Vocab_Size x Dims). Pass that to torch.norm along with the optional dim=2 argument so that the norm is taken along the last dimension. This will result in the desired (Batch_Size x Vocab_Size) tensor of norms.
d = torch.norm(v.unsqueeze(0) - t.unsqueeze(1), dim=2)
Edit: As pointed out by @KonstantinosKokos in the comments, due to the broadcasting rules used by numpy and pytorch, the leading unitary dimension on v does not need to be explicit. I.e. you can use
d = torch.norm(v - t.unsqueeze(1), dim=2)
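A minimal sketch with dummy shapes (the sizes here are illustrative, not from the original question) to confirm the resulting shape:
import torch

batch_size, vocab_size, dims = 8, 1000, 64
t = torch.randn(batch_size, dims)
v = torch.randn(vocab_size, dims)

# (batch_size, 1, dims) - (vocab_size, dims) broadcasts to (batch_size, vocab_size, dims)
d = torch.norm(v - t.unsqueeze(1), dim=2)
print(d.shape)  # torch.Size([8, 1000])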
I want to sample a tensor of probability distributions with shape (N, C, H, W), where dimension 1 (size C) contains normalized probability distributions with C possibilities. Is there a pytorch function to efficiently sample all the distributions in the tensor in parallel? I just need to sample each distribution once, so the result could either be a one-hot tensor with the same shape or a tensor of indices with shape (N, 1, H, W).
I couldn't find a single function to do this sampling, but I was able to sample the tensor in several steps: compute the cumulative probabilities, sample each point independently, and then pick the first point that sampled a 1 along the distribution dimension:
reverse_cumulative = torch.flip(torch.cumsum(torch.flip(probabilities, [1]), dim=1), [1])
cumulative = probabilities / reverse_cumulative
sampled = (torch.rand(cumulative.shape, device=device()) <= cumulative)
idxs = sampled * one_hot
idxs[~sampled] = self.tile_count
sampled_idxs = idxs.min(dim=1).indices
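For reference, a self-contained sketch of the same idea (the original snippet relies on one_hot, self.tile_count and device() from the surrounding code; here I assume one_hot stands for a broadcastable tensor of channel indices and tile_count equals C):
import torch

def sample_along_dim1(probabilities):
    # probabilities: (N, C, H, W), normalized along dim 1.
    num_classes = probabilities.shape[1]
    # Probability of channel i conditioned on channels < i not having been picked.
    reverse_cumulative = torch.flip(torch.cumsum(torch.flip(probabilities, [1]), dim=1), [1])
    conditional = probabilities / reverse_cumulative.clamp_min(1e-12)
    # Independent Bernoulli draw per channel; the last channel's conditional probability is 1.
    sampled = torch.rand_like(conditional) <= conditional
    # Non-sampled positions get index C so the first sampled channel wins the min.
    channel_idx = torch.arange(num_classes, device=probabilities.device).view(1, -1, 1, 1)
    idxs = torch.where(sampled, channel_idx, torch.full_like(channel_idx, num_classes))
    # The first channel that sampled True is the categorical sample: shape (N, H, W).
    return idxs.min(dim=1).values

probs = torch.softmax(torch.randn(2, 5, 4, 4), dim=1)
print(sample_along_dim1(probs).shape)  # torch.Size([2, 4, 4])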
Dual Encoder LSTM
I want to implement this model in the TensorFlow Keras API. I am confused about how to implement the sigmoid(CMR) function in Keras. How do I merge the outputs of both LSTMs and compute the above function?
RNN here means LSTM
C and R are sentences encoded into a fixed dimension by the two LSTMs. Then they are passed through the function sigmoid(CMR). We can assume that R and C are both 256-dimensional and M is a 256 x 256 matrix. The matrix M is learned during training.
Assuming you only consider the final output of the LSTMs and not the whole sequence, the shape of the output of each LSTM model would be (batch_size, 256).
Now, we have the following vectors and their shapes:
C: (batch_size, 256)
R: (batch_size, 256)
M: (256, 256).
The simplest case is for batch_size = 1. Then,
C: (1, 256)
R: (1, 256)
So, mathematically, C^T M R would in practice be computed as C M R^T, and give you a tensor of shape (1, 1), which is effectively a single scalar.
In code, this is straightforward:
def compute_cmr(c, m, r):
    r = tf.transpose(r, [1, 0])
    output = tf.matmul(c, m)
    output = tf.matmul(output, r)
    return output
However, if your batch_size is greater than 1, things can get tricky. My approach (using eager execution) is to unstack along the batch axis, process each example individually, and then restack. It may not be the most efficient way, but it works flawlessly and the time overhead is usually negligible.
Here's how you can do it:
def compute_cmr(c, m, r):
    outputs = []
    c_list = tf.unstack(c, axis=0)
    r_list = tf.unstack(r, axis=0)
    for batch_number in range(len(c_list)):
        r = tf.expand_dims(r_list[batch_number], axis=1)
        c = tf.expand_dims(c_list[batch_number], axis=0)
        output = tf.matmul(c, m)
        output = tf.matmul(output, r)
        outputs.append(output)
    return tf.stack(outputs, axis=0)
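As an aside (not part of the original answer), the same batched bilinear form can be written without the Python loop using tf.einsum; a minimal sketch, assuming c and r have shape (batch_size, 256) and m has shape (256, 256):
import tensorflow as tf

def compute_cmr_vectorized(c, m, r):
    # For each batch element b: c[b] @ m @ r[b], one scalar per example.
    scores = tf.einsum('bi,ij,bj->b', c, m, r)
    # Apply the sigmoid from the sigmoid(CMR) formulation.
    return tf.sigmoid(scores)

c = tf.random.normal((4, 256))
r = tf.random.normal((4, 256))
m = tf.random.normal((256, 256))
print(compute_cmr_vectorized(c, m, r).shape)  # (4,)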
I have PyTorch code which generates a PyTorch tensor in each iteration of a for loop, all of the same size. I want to assign each of those tensors to a row of a new tensor, which will hold all the tensors at the end. In other words, something like this:
for i=1:N:
    X = torch.Tensor([[1,2,3], [3,2,5]])
    # Y is a pytorch tensor
    Y[i] = X
I wonder how I can implement this with Pytorch.
You can concatenate the tensors using torch.cat:
tensors = []
for i in range(N):
    X = torch.tensor([[1,2,3], [3,2,5]])
    tensors.append(X)
Y = torch.cat(tensors, dim=0)  # dim 0 is the rows of the tensor
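As an aside (not part of the original answer): torch.cat with dim=0 flattens everything into a single (2*N, 3) matrix. If each X should instead remain its own slice, so that Y[i] recovers the i-th tensor, torch.stack is the variant to try:
Y = torch.stack(tensors, dim=0)  # shape (N, 2, 3); Y[i] is the i-th X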
I'm working on a project where I have to predict the future states of a 1D vector with y entries. I'm trying to do this using an ANN setup with LSTM units in combination with a convolution layer. The method I'm using is based on the method used in a pre-release paper. The suggested setup is as follows:
In the picture c is the 1D vector with y entries. The ANN gets the n previous states as an input and produces o next states as an output.
Currently, my ANN setup looks like this:
# Assuming the standalone Keras API; with tf.keras the imports come from tensorflow.keras instead.
from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, Conv1D

inputLayer = Input(shape = (n, y))
encoder = LSTM(200)(inputLayer)
x = RepeatVector(1)(encoder)
decoder = LSTM(200, return_sequences=True)(x)
x = Conv1D(y, 4, activation = 'linear', padding = 'same')(decoder)
model = Model(inputLayer, x)
Here n is the length of the input sequences and y is the length of the state array. As can be seen, I'm repeating the encoded vector only 1 time, as I'm trying to predict only 1 time step into the future. Is this the right way to set up the above-mentioned network?
Furthermore, I have a numpy array (data) with a shape of (Sequences, Time Steps, State Variables) to train with. I was trying to divide it into randomly selected batches with a generator like this:
def BatchGenerator(batch_size, n, y, data):
    # Infinite loop.
    while True:
        # Allocate a new array for the batch of input-signals.
        x_shape = (batch_size, n, y)
        x_batch = np.zeros(shape=x_shape, dtype=np.float16)
        # Allocate a new array for the batch of output-signals.
        y_shape = (batch_size, 1, y)
        y_batch = np.zeros(shape=y_shape, dtype=np.float16)
        # Fill the batch with random sequences of data.
        for i in range(batch_size):
            # Select a random sequence.
            seq_idx = np.random.randint(data.shape[0])
            # Get a random start-index.
            # This points somewhere into the training-data.
            start_idx = np.random.randint(data.shape[1] - n)
            # Copy the sequences of data starting at this index.
            # Each sample inside x_batch has a shape of [n, y].
            x_batch[i, :, :] = data[seq_idx, start_idx:start_idx+n, :]
            # Each sample inside y_batch has a shape of [1, y] (as we predict only 1 time step ahead).
            y_batch[i, :, :] = data[seq_idx, start_idx+n, :]
        yield (x_batch, y_batch)
The problem is that it gives an error if I use a batch_size greater than 1. Could anyone help me set this data up in a way that it can be used optimally to train my neural network?
The model is now trained using:
generator = BatchGenerator(batch_size, n, y, data)
model.fit_generator(generator = generator, steps_per_epoch = steps_per_epoch, epochs = epochs)
Thanks in advance!