I have two tensors a and b which are of different dimensions. a is of shape [100,100] and b is of the shape [100,3,10]. I want to concatenate these two tensors.
For example:
a = torch.randn(100,100)
tensor([[ 1.3236, 2.4250, 1.1547, ..., -0.7024, 1.0758, 0.2841],
[ 1.6699, -1.2751, -0.0120, ..., -0.2290, 0.9522, -0.4066],
[-0.3429, -0.5260, -0.7748, ..., -0.5235, -1.8952, 1.2944],
...,
[-1.3465, 1.2641, 1.6785, ..., 0.5144, 1.7024, -1.0046],
[-0.7652, -1.2940, -0.6964, ..., 0.4661, -0.3998, -1.2428],
[-0.4720, -1.0981, -2.3715, ..., 1.6423, 0.0560, 1.0676]])
The tensor b is as follows:
tensor([[[ 0.4747, -1.9529, -0.0448, ..., -0.9694, 0.8009, -0.0610],
[ 0.5160, 0.0810, 0.1037, ..., -1.7519, -0.3439, 1.2651],
[-0.5975, -0.2000, -1.6451, ..., 1.3082, -0.4023, -0.3105]],
...,
[[ 0.4747, -1.9529, -0.0448, ..., -0.9694, 0.8009, -0.0610],
[ 0.1939, 1.0365, -0.0927, ..., -2.4948, -0.2278, -0.2390],
[-0.5975, -0.2000, -1.6451, ..., 1.3082, -0.4023, -0.3105]]],
dtype=torch.float64, grad_fn=<CopyBackwards>)
I want to concatenate them such that the first row of tensor a, of size [100], is concatenated with the first row of tensor b, of size [3,10], and likewise for every other row, so that the output has size [100,130]. In other words, considering just the first row of a and b, the corresponding output row (of size [130]) should look as follows:
[ 1.3236, 2.4250, 1.1547, ..., -0.7024, 1.0758, 0.2841, 0.4747, -1.9529, -0.0448, ..., -0.9694, 0.8009, -0.0610, 0.5160, 0.0810, 0.1037, ..., -1.7519, -0.3439, 1.2651, -0.5975, -0.2000, -1.6451, ..., 1.3082, -0.4023, -0.3105]
To do this, I applied unsqueeze to tensor a so that the two tensors have the same number of dimensions:
a = a.unsqueeze(1)
When I call torch.cat([a,b]), I still get an error. Can somebody help me solve this?
Thanks in advance.
Reshape b so that its last two dimensions are flattened (shape [100, 30]), then concatenate it with a using torch.cat along dim 1:
torch.cat((a, b.reshape(100, -1)), dim=1)
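As a minimal sketch of the reshape-then-concatenate approach, the shapes can be checked as follows. The random inputs and the explicit cast of b to a's dtype are assumptions for illustration (b in the question is float64 while torch.randn produces float32 by default).

```python
import torch

a = torch.randn(100, 100)                          # shape [100, 100]
b = torch.randn(100, 3, 10, dtype=torch.float64)   # shape [100, 3, 10]

# Flatten the trailing [3, 10] dimensions of b into one dimension of size 30,
# and cast to a's dtype so both inputs to torch.cat agree.
b_flat = b.reshape(b.shape[0], -1).to(a.dtype)     # shape [100, 30]

# Concatenate along dim=1 so that row i of a is followed by row i of b.
out = torch.cat((a, b_flat), dim=1)                # shape [100, 130]
print(out.shape)
```

The unsqueeze attempt from the question fails because a becomes [100, 1, 100] while b is [100, 3, 10]; torch.cat requires all dimensions other than the concatenation dimension to match.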
I have a use case where I have to compute the FFT of a given tensor. The FFT is applied to each of the 10 rows (that is, along the last dimension), which gives a tensor of shape (10, 11) after the FFT.
# Random data-
x = torch.rand((10, 20))
# Compute RFFT of 'x'-
x_fft = torch.fft.rfft(x)
# Sanity check-
x.shape, x_fft.shape
# (torch.Size([10, 20]), torch.Size([10, 11]))
# FFT for the first 2 rows are-
x_fft[:2, :]
'''
tensor([[12.2561+0.0000j, 0.7551-1.2075j, 1.1119-0.0458j, -0.2814-1.5266j,
1.4083-0.7302j, 0.6648+0.3311j, 0.3969+0.0632j, -0.8031-0.1904j,
-0.4206+0.9066j, -0.2149+0.9160j, 0.4800+0.0000j],
[ 9.8967+0.0000j, -0.5100-0.2377j, -0.6344+2.2406j, 0.4584-1.0705j,
0.2235+0.4788j, -0.3923+0.8205j, -1.0372-0.0292j, -1.6368+0.5517j,
1.5093+0.0419j, 0.5755-1.2133j, 2.9269+0.0000j]])
'''
# The goal is to have, for each row, a 1-D vector (of size = 22) as follows.
# So, for the first row, the desired 1-D vector (size = 22) is-
'''
[12.2561, 0.0000, 0.7551, -1.2075, 1.1119, -0.0458, -0.2814, -1.5266,
1.4083, -0.7302, 0.6648, 0.3311, 0.3969, 0.0632, -0.8031, -0.1904,
-0.4206, 0.9066, -0.2149, 0.9160, 0.4800, 0.0000]
'''
Here, you are taking the real and imaginary components and placing them adjacent to each other.
Adjacent means:
[a_1_real, a_1_imag, a_2_real, a_2_imag, a_3_real, a_3_imag, ....., a_n_real, a_n_imag]
Since for each row, you get 11 FFT complex numbers, a_n = a_11.
How to go about it?
Your question seems to come down to how to interleave two tensors. Given two tensors x and y (here, presumably the real and imaginary parts of the first two rows), you can do so with a combination of stack, transpose and reshape:
>>> torch.stack((x,y),1).transpose(1,2).reshape(2,-1)
tensor([[ 1.1547e+01, 0.0000e+00, 1.3786e+00, -8.1970e-01, -3.2118e-02,
-2.3900e-02, -3.2898e-01, -3.4610e-01, -1.7916e-01, 1.2308e+00,
-5.4203e-01, 1.2580e-01, 8.5273e-01, 8.9980e-01, -2.7096e+00,
-3.8060e-01, 3.0016e-01, -4.5240e-01, -7.7809e-02, 4.5630e-01,
-4.5805e-03, 0.0000e+00],
[ 1.1106e+01, 0.0000e+00, 1.3362e-01, 1.3830e-01, -7.4233e-01,
7.7570e-01, -9.9461e-01, 1.0834e+00, 1.6952e+00, 5.2920e-01,
-1.1884e+00, -2.5970e-01, -8.7958e-01, 4.3180e-01, -9.3039e-01,
8.8130e-01, -1.0048e+00, 1.2823e+00, 2.0595e-01, -6.5170e-01,
1.7209e+00, 0.0000e+00]])
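For the specific case of a complex tensor, a possible shortcut is torch.view_as_real, which exposes the real and imaginary parts as a trailing dimension of size 2. The sketch below assumes the same random setup as the question (the seed and the variable names pairs, interleaved, re, im and stacked are illustrative) and checks that it agrees with the stack/transpose/reshape recipe applied to all 10 rows.

```python
import torch

torch.manual_seed(0)
x = torch.rand((10, 20))
x_fft = torch.fft.rfft(x)                          # complex, shape (10, 11)

# view_as_real appends a trailing dimension of size 2 holding (real, imag).
pairs = torch.view_as_real(x_fft)                  # shape (10, 11, 2)

# Flattening the last two dimensions interleaves real and imaginary parts.
interleaved = pairs.reshape(x_fft.shape[0], -1)    # shape (10, 22)

# Same result with the stack/transpose/reshape recipe from the answer.
re, im = x_fft.real, x_fft.imag
stacked = torch.stack((re, im), 1).transpose(1, 2).reshape(re.shape[0], -1)

print(interleaved.shape, torch.equal(interleaved, stacked))
```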
I have a PyTree params (in my case a nested dictionary) containing my parameters of a neural network. My goal is to compute the diagonal entries of the Hessian of a loss function with respect to the parameters and store it in a PyTree of the same structure as the parameters.
When I call jax.hessian(loss_fn)(params, data), I get (as expected) an even more nested dictionary containing the full Hessian.
How can I transform this dictionary to get the desired PyTree with diagonal entries?
To be more concrete: let's say I have only one layer in my network and params is given by
params:
    'linear':
        'w': DeviceArray() of shape [5 x 1]
        'b': DeviceArray() of shape [1]
The returned Hessian has the keys and shape given by
hessian:
    'linear':
        'b':
            'linear':
                'b': (1, 1),
                'w': (1, 5, 1),
        'w':
            'linear':
                'b': (5, 1, 1),
                'w': (5, 1, 5, 1)
As far as I understand it, I need the entries
jnp.diag(hessian['linear']['b']['linear']['b'])
as the diagonal hessian for the bias and
jnp.diag(jnp.squeeze(hessian['linear']['w']['linear']['w']))
as the diagonal hessian for the weights. (However, the squeeze may only work for 1 dim outputs...)
How can I automate this transformation in order to work for more complex models with multiple layers?
I know that this does not scale to huge networks, I need it for testing purposes of optimizers.
I ran into the exact same problem. Unfortunately, working with Pytrees in Jax can be awkward. I was also looking at a way to construct the diagonal Hessian entry-for-entry, since that could yield a practical method.
I now have the following:
from typing import Any, Optional, Sequence

import jax
import jax.numpy as jnp


def ravelled_diagonal_indices(dims: Sequence[int]) -> jnp.ndarray:
    # Get the indices for the diagonal elements of a flattened square matrix.
    return (dims[0] + 1) * jnp.arange(dims[0])


# Alias to reduce clutter.
_diag_idx = ravelled_diagonal_indices


def tree_matrix_diagonal(tree: Any, reference: Optional[Any] = None) -> Any:
    """Utility function for extracting the diagonal of a Pytree of jax.numpy.array objects.

    The Pytree is assumed to be square in its children and in its array objects.

    Parameters
    ----------
    tree : Any
        Pytree of jax.numpy.array objects for which the number of Pytree leaves and
        the sizes of each constituent array is square.
    reference : Any, default = None
        The intended structure for the diagonal of `tree`. For example, this can be
        the Pytree with which `tree` could have been created through e.g., an outer-product
        or the Hessian of a function.

    Returns
    -------
    diag : Any
        Pytree containing the flattened diagonals of `tree` if no reference was provided.
        Otherwise, the diagonal elements are shaped according to the structure of `reference`.
    """
    # jax.tree_util.* is used throughout; the top-level aliases jax.tree_leaves and
    # jax.tree_multimap are no longer available in newer JAX releases.
    flat = jax.tree_util.tree_leaves(tree)
    h = jnp.sqrt(len(flat)).astype(int)
    _idx = _diag_idx((h,))
    # Keep only the diagonal blocks (the Hessian of each leaf with respect to itself).
    block_diag = [flat[i] for i in _idx]

    # Extract the diagonal of each (flattened) square block.
    flat_diagonal = lambda w: w.ravel()[_diag_idx((jnp.sqrt(w.size).astype(int),))]
    diag = jax.tree_util.tree_map(flat_diagonal, block_diag)

    if reference is not None:
        # Reshape the diagonal Pytree to reference Pytree structure and shape
        diag_tree = jax.tree_util.tree_unflatten(jax.tree_util.tree_structure(reference), diag)
        diag = jax.tree_util.tree_map(lambda a, b: a.reshape(jnp.shape(b)), diag_tree, reference)

    return diag
When I try this out on the Hessian of a very simple MLP:
params
>> {'dense/~/affine': {'weights': DeviceArray([[ 1. , 1. ],
[ 0.546326 , -0.77997607]], dtype=float32)},
'dense_1/~/affine': {'weights': DeviceArray([[ 1. ],
[-0.5155028],
[ 0.9487318]], dtype=float32)}}
hessian
>> {'dense/~/affine': {'weights': {'dense/~/affine': {'weights': DeviceArray([[[[[-0.02324889, 0.04278728],
[ 0.00814307, -0.01498652]],
[[ 0.04278728, -0.07874574],
[-0.01498652, 0.0275812 ]]],
[[[ 0.00814307, -0.01498652],
[-0.00285216, 0.00524912]],
[[-0.01498652, 0.0275812 ],
[ 0.00524912, -0.00966049]]]]], dtype=float32)},
'dense_1/~/affine': {'weights': DeviceArray([[[[[ 0.04509945],
[ 0.15897979],
[ 0.05742025]],
[[-0.08300105],
[-0.06711845],
[ 0.01683405]]],
[[[-0.01579637],
[-0.05568369],
[-0.02011181]],
[[ 0.02907166],
[ 0.02350867],
[-0.00589623]]]]], dtype=float32)}}},
'dense_1/~/affine': {'weights': {'dense/~/affine': {'weights': DeviceArray([[[[[ 0.04509945, -0.08300105],
[-0.01579637, 0.02907165]]],
[[[ 0.15897979, -0.06711845],
[-0.0556837 , 0.02350867]]],
[[[ 0.05742024, 0.01683406],
[-0.02011181, -0.00589624]]]]], dtype=float32)},
'dense_1/~/affine': {'weights': DeviceArray([[[[[-0.08748633],
[-0.07074545],
[-0.11138687]]],
[[[-0.07074545],
[-0.05720801],
[-0.09007253]]],
[[[-0.11138687],
[-0.09007251],
[-0.14181684]]]]], dtype=float32)}}}}
Then, the function returns:
tree_matrix_diagonal(hessian, reference=params)
>> {'dense/~/affine': {'weights': DeviceArray([[-0.02324889, -0.07874574],
[-0.00285216, -0.00966049]], dtype=float32)},
'dense_1/~/affine': {'weights': DeviceArray([[-0.08748633],
[-0.05720801],
[-0.14181684]], dtype=float32)}}
Upon visual inspection, you can see that the returned elements are indeed the diagonal elements of hessian cast to the canonical structure of params.
Funnily enough, for the Gauss-Newton approximation to the Hessian the procedure is much simpler. Simply take the element-wise square of the Jacobians :).
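An alternative sketch, not taken from the answer above, is to flatten the parameters into a single vector with jax.flatten_util.ravel_pytree, take the Hessian with respect to that vector, and map its diagonal back onto the parameter structure. The loss function, the data and the helper name hessian_diagonal below are hypothetical and only serve to illustrate the idea; like the original approach, this materializes the full Hessian and so only suits small test models.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree


def loss_fn(params, data):
    # Hypothetical toy loss: mean squared error of a single linear layer.
    x, y = data
    pred = x @ params["linear"]["w"] + params["linear"]["b"]
    return jnp.mean((pred - y) ** 2)


def hessian_diagonal(loss_fn, params, data):
    # Flatten the PyTree into one vector; `unravel` restores the structure.
    flat_params, unravel = ravel_pytree(params)

    def flat_loss(flat):
        return loss_fn(unravel(flat), data)

    # Full (n, n) Hessian with respect to the flat parameter vector.
    full_hessian = jax.hessian(flat_loss)(flat_params)
    # Its diagonal has length n, so it can be unravelled like the parameters.
    return unravel(jnp.diag(full_hessian))


params = {"linear": {"w": jnp.ones((5, 1)), "b": jnp.zeros((1,))}}
x = jnp.arange(15.0).reshape(3, 5)
y = jnp.ones((3, 1))
diag = hessian_diagonal(loss_fn, params, (x, y))
print(jax.tree_util.tree_map(jnp.shape, diag))
```

Because the diagonal is computed on the flat vector, no per-leaf squeeze or reshape logic is needed, whatever the nesting of params.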
I have the following code segment to generate random samples. The generated samples are stored in a list, where each entry of the list is a tensor with two elements. I would like to extract the first element from all tensors in the list, and likewise the second element from all tensors in the list. How can I perform this kind of tensor slice operation?
import torch
import pyro
import pyro.distributions as dist
num_samples = 250
# note that both covariance matrices are diagonal
mu1 = torch.tensor([0., 5.])
sig1 = torch.tensor([[2., 0.], [0., 3.]])
dist1 = dist.MultivariateNormal(mu1, sig1)
samples1 = [pyro.sample('samples1', dist1) for _ in range(num_samples)]
samples1
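I'd recommend torch.cat with a list comprehension. Note that t[0] is a zero-dimensional tensor, which torch.cat cannot concatenate, so slice with t[0:1] to keep each piece one-dimensional:
col1 = torch.cat([t[0:1] for t in samples1])
col2 = torch.cat([t[1:2] for t in samples1])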
Docs for torch.cat: https://pytorch.org/docs/stable/generated/torch.cat.html
ALTERNATIVELY
You could turn your list of 1D tensors into a single big 2D tensor using torch.stack, then do a normal slice:
samples1_t = torch.stack(samples1)
col1 = samples1_t[:, 0] # : means all rows
col2 = samples1_t[:, 1]
Docs for torch.stack: https://pytorch.org/docs/stable/generated/torch.stack.html
I should mention that PyTorch tensors support unpacking out of the box: you can unpack the first axis into multiple variables without any extra work. Here torch.stack outputs a tensor of shape (rows, cols), so we just need to transpose it to (cols, rows) and unpack:
>>> c1, c2 = torch.stack(samples1).T
So you get c1 and c2 shaped (rows,):
>>> c1
tensor([0.6433, 0.4667, 0.6811, 0.2006, 0.6623, 0.7033])
>>> c2
tensor([0.2963, 0.2335, 0.6803, 0.1575, 0.9420, 0.6963])
Other answers that suggest .stack() or .cat() are perfectly fine from a PyTorch perspective.
However, since the context of the question involves pyro, may I add the following:
Since you are drawing IID samples with
[pyro.sample('samples1', dist1) for _ in range(num_samples)]
A better way to do it with pyro is
dist1 = dist.MultivariateNormal(mu1, sig1).expand([num_samples])
This tells pyro that the distribution is batched with a batch size of num_samples. Sampling from this will produce
>> dist1.sample()
tensor([[-0.8712, 6.6087],
[ 1.6076, -0.2939],
[ 1.4526, 6.1777],
...
[-0.0168, 7.5085],
[-1.6382, 2.1878]])
Now it's easy to solve your original question. Just slice it like
samples = dist1.sample()
samples[:, 0] # all first elements
samples[:, 1] # all second elements
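The batched-sampling idea can also be sketched with plain torch.distributions, which pyro.distributions wraps. This is only an illustration under that assumption: it passes a sample shape to sample() rather than using expand, the seed is arbitrary, and the names col1, col2, c1 and c2 are illustrative.

```python
import torch
from torch.distributions import MultivariateNormal

torch.manual_seed(0)
num_samples = 250

mu1 = torch.tensor([0., 5.])
sig1 = torch.tensor([[2., 0.], [0., 3.]])
dist1 = MultivariateNormal(mu1, covariance_matrix=sig1)

# Draw all samples in one call: the result has shape (num_samples, 2).
samples = dist1.sample((num_samples,))

# Split the two columns; unbind along dim=1 returns one tensor per column.
col1, col2 = samples.unbind(dim=1)

# Equivalent unpacking for a list of per-sample tensors, as in the question.
sample_list = [dist1.sample() for _ in range(5)]
c1, c2 = torch.stack(sample_list).T

print(samples.shape, col1.shape, c1.shape)
```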
For Python 3.8 and TensorFlow 2.5, I have a 3-D tensor of shape (3, 3, 3) where the goal is to compute the L2-norm for each of the three (3, 3) square matrices. The code that I came up with is:
a = tf.random.normal(shape = (3, 3, 3))
a.shape
# TensorShape([3, 3, 3])
a.numpy()
'''
array([[[-0.30071023, 0.9958398 , -0.77897555],
[-1.4251901 , 0.8463568 , -0.6138699 ],
[ 0.23176959, -2.1303613 , 0.01905925]],
[[-1.0487134 , -0.36724553, -1.0881581 ],
[-0.12025198, 0.20973174, -2.1444907 ],
[ 1.4264063 , -1.5857363 , 0.31582597]],
[[ 0.8316077 , -0.7645084 , 1.5271858 ],
[-0.95836663, -1.868056 , -0.04956183],
[-0.16384012, -0.18928945, 1.04647 ]]], dtype=float32)
'''
I am using axis = 2 since the 3rd axis should contain three 3x3 square matrices. The output I get is:
tf.math.reduce_euclidean_norm(input_tensor = a, axis = 2).numpy()
'''
array([[1.299587 , 1.7675754, 2.1430166],
[1.5552354, 2.158075 , 2.15614 ],
[1.8995634, 2.1001325, 1.0759989]], dtype=float32)
'''
How are these values computed? The formula for computing L2-norm is this. What am I missing?
Also, I was expecting three L2-norm values, one for each of the three (3, 3) matrices. The code I have to achieve this is:
tf.math.reduce_euclidean_norm(a[0]).numpy()
# 3.0668826
tf.math.reduce_euclidean_norm(a[1]).numpy()
# 3.4241767
tf.math.reduce_euclidean_norm(a[2]).numpy()
# 3.0293021
Is there a better way to get this without having to explicitly refer to each index of tensor 'a'?
Thanks!
The formula you linked for computing the L2 norm looks correct. What you have is basically this:
np.sqrt(np.sum((a[0]**2)))
# 3.0668826
np.sqrt(np.sum((a[1]**2)))
# 3.4241767
np.sqrt(np.sum((a[2]**2)))
# 3.0293021
This can be vectorized by the following:
np.sqrt(np.sum(a**2, axis=(1,2)))
Output:
array([3.0668826, 3.4241767, 3.0293021], dtype=float32)
This is effectively the same as using np.linalg.norm (or tf.math.reduce_euclidean_norm if you want to use TensorFlow):
np.linalg.norm(a, ord=None, axis=(1,2))
Output:
array([3.0668826, 3.4241767, 3.0293021], dtype=float32)
Per the documentation, the default ord=None with a pair of axes computes the Frobenius norm, which is the L2 norm of the flattened matrix. The axis keyword specifies which dimensions to reduce, as in the vectorized snippet above.
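To address how the axis = 2 values in the question arise: reducing over a single axis takes the norm of each row vector along that axis, so the result has shape (3, 3). For instance, the first value 1.299587 is the norm of the row a[0, 0]. A small NumPy sketch (random data with an arbitrary seed, not the tensor from the question) illustrates the difference between the two reductions:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3, 3)).astype(np.float32)

# axis=2 reduces only the last axis: one norm per row vector -> shape (3, 3).
row_norms = np.sqrt(np.sum(a ** 2, axis=2))

# axis=(1, 2) reduces both matrix axes: one norm per matrix -> shape (3,).
matrix_norms = np.linalg.norm(a, axis=(1, 2))

# Each per-matrix norm equals the L2 norm of that matrix's three row norms.
recombined = np.sqrt(np.sum(row_norms ** 2, axis=1))

print(row_norms.shape, matrix_norms.shape)
```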
I am trying to use NearestNeighbors from scikit-learn. I wrote an example to understand how this function works.
from sklearn.neighbors import NearestNeighbors
samples = [[0.2, 0], [0.5, 0.1], [0.4,0.4]]
neigh = NearestNeighbors(n_neighbors=2,metric='mahalanobis')
neigh.fit(samples)
print(neigh.kneighbors([[272,7522752]])) # use any point to test
The code above works well and correctly computes the 2 nearest points.
But when I try to use my own dataset, an error occurs. The dataset is a 9959 x 384 matrix, which I store in a variable named training_data. It is printed below:
[[ 0.069915 0.020142 0.070054 ..., 0.333937 0.477351 0.055993]
[ 0.131826 0.038203 0.131573 ..., 0.353589 0.426197 0.048557]
[ 0.130338 0.02595 0.130351 ..., 0.315951 0.32355 0.098884]
...,
[ 0.053331 0.023395 0.0534 ..., 0.366064 0.404756 0.066217]
[ 0.063554 0.021197 0.063671 ..., 0.235945 0.439595 0.105366]
[ 0.123632 0.045492 0.12322 ..., 0.308702 0.437344 0.040144]]
When I run the code above with samples replaced by training_data, I get the following error:
LinAlgError: 0-dimensional array given. Array must be at least two-dimensional
Please help me solve this problem, thanks a lot!