PyTorch: new_ones vs ones

In PyTorch, what is the difference between new_ones() and ones()? For example,
x2.new_ones(3,2, dtype=torch.double)
vs
torch.ones(3,2, dtype=torch.double)

For the sake of this answer, I am assuming that your x2 is a previously defined torch.Tensor. If we then head over to the PyTorch documentation, we can read the following on new_ones():
Returns a Tensor of size size filled with 1. By default, the
returned Tensor has the same torch.dtype and torch.device as this
tensor.
Whereas ones()
Returns a tensor filled with the scalar value 1, with the shape
defined by the variable argument sizes.
So, essentially, new_ones() lets you quickly create a new torch.Tensor filled with ones on the same device and with the same data type as a previously existing tensor, whereas ones() creates a torch.Tensor from scratch (also filled with ones).

new_ones()
# defining the tensor along with device to run on. (Assuming CUDA hardware is available)
x = torch.rand(5, 3, device="cuda")
new_ones() works with an existing tensor: y will inherit the datatype from x and will run on the same device as x.
y = x.new_ones(2, 2)
print(y)
Output:
tensor([[1., 1.],
        [1., 1.]], device='cuda:0')
ones()
# defining tensor. By default it will run on CPU.
x = torch.ones(5, 3)
print(x)
Output:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
ones() defines a tensor of the given size filled with 1s (as shown in the example) and does not depend on any existing tensor, whereas new_ones() works from an existing tensor, inheriting properties such as datatype and device from it while defining a new tensor of the given size.
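To tie this back to the snippet in the question, here is a small sketch (x2 is just an assumed pre-existing tensor): x2.new_ones(3, 2, dtype=torch.double) behaves like torch.ones(3, 2) with the dtype and device borrowed from x2, unless you override them explicitly as the question does.
import torch

x2 = torch.rand(5, 3)                                   # assumed pre-existing tensor (float32, CPU)
a = x2.new_ones(3, 2)                                   # inherits float32 and CPU from x2
b = torch.ones(3, 2, dtype=x2.dtype, device=x2.device)  # the explicit equivalent of the line above
c = x2.new_ones(3, 2, dtype=torch.double)               # dtype overridden, device still taken from x2
print(a.dtype, b.dtype, c.dtype)                        # torch.float32 torch.float32 torch.float64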

Related

Does swapping the batch axis have an effect on performance in PyTorch?

I know that usually the batch dimension is axis zero, and I imagine this has a reason: The underlying memory for each item in the batch is contiguous.
My model calls a function that becomes simpler if I have another dimension in the first axis, so that I can use x[k] instead of x[:, k].
Results from arithmetic operations seem to keep the same memory layout:
x = torch.ones(2,3,4).transpose(0,1)
y = torch.ones_like(x)
u = (x + 1)
v = (x + y)
print(x.stride(), u.stride(), v.stride())
When I create additional variables, I create them with torch.zeros and then transpose, so that the largest stride goes to axis 1 as well.
e.g.
a,b,c = torch.zeros(
(3, x.shape[1], ADDITIONAL_DIM, x.shape[0]) + x.shape[2:]
).transpose(1,2)
This will create three tensors with the same batch size x.shape[1].
In terms of memory locality, would it make any difference to have
a,b,c = torch.zeros(
(x.shape[1], 3, ADDITIONAL_DIM, x.shape[0]) + x.shape[2:]
).permute(1,2,0, ...)
instead.
Should I care about this at all?
TL;DR: Slices seemingly contain less information... but in fact share the identical storage buffer with the original tensor. Since permute doesn't affect the underlying memory layout, both operations are essentially equivalent.
Those two are essentially the same: the underlying data storage buffer is kept identical, and only the metadata, i.e. how you interact with that buffer (strides and shape), changes.
Let us look at a simple example:
>>> x = torch.ones(2,3,4).transpose(0,1)
>>> x_ptr = x.data_ptr()
>>> x.shape, x.stride(), x_ptr
(torch.Size([3, 2, 4]), (4, 12, 1), 94674451667072)
We have kept the data pointer for our 'base' tensor in x_ptr:
Slicing on the second axis:
>>> y = x[:, 0]
>>> y.shape, y.stride(), x_ptr == y.data_ptr()
(torch.Size([3, 4]), (4, 1), True)
As you can see, x and x[:, k] share the same storage.
Permuting the first two axes then slicing on the first one:
>>> z = x.permute(1, 0, 2)[0]
>>> z.shape, z.stride(), x_ptr == z.data_ptr()
(torch.Size([3, 4]), (4, 1), True)
Here again, you notice that x.data_ptr is the same as z.data_ptr.
In fact, you can even go from y to x's representation using torch.as_strided:
>>> torch.as_strided(y, size=x.shape, stride=x.stride())
tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.]]])
Same with z:
>>> torch.as_strided(z, size=x.shape, stride=x.stride())
Both will return a tensor with the same content as x, but neither makes a copy: torch.as_strided builds a new view over the same storage with the given size and stride. These two lines were just to illustrate how we can still 'get back' to x from a slice of x; we can recover the apparent content by changing the tensor's metadata.
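If you want to check or control memory layout explicitly, PyTorch's is_contiguous() and contiguous() are the relevant tools; a short sketch in the same spirit as the snippets above:
>>> x = torch.ones(2, 3, 4).transpose(0, 1)
>>> x.is_contiguous()
False
>>> xc = x.contiguous()               # allocates a fresh, row-major copy
>>> xc.is_contiguous()
True
>>> x.data_ptr() == xc.data_ptr()     # contiguous() had to copy the data
False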

How to add to pytorch tensor at indices?

I have to admit, I'm a bit confused by the scatter* and index* operations - I'm not sure any of them do exactly what I'm looking for, which is very simple:
Given some 2-D tensor
z = tensor([[1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])
And a list (or tensor?) of 2-d indexes:
inds = tensor([[0, 0],
               [1, 1],
               [1, 2]])
I want to add a scalar to z at those indexes (and do it efficiently):
znew = z.something_add(inds, 3)
->
znew = tensor([[4., 1., 1., 1.],
               [1., 4., 4., 1.],
               [1., 1., 1., 1.]])
If I have to I can make that scalar a tensor of whatever shape (where all elements = 3), but I'd rather not...
You must provide two lists to your indexing: the first with the row positions and the second with the column positions. In your example, that would be:
z[[0, 1, 1], [0, 1, 2]] += 3
torch.Tensor indexing follows Numpy. See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details.
This code achieves what you want:
z_new = z.clone() # copy the tensor
z_new[inds[:, 0], inds[:, 1]] += 3 # modify selected indices of new tensor
In PyTorch, you can index each axis of a tensor with another tensor.
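One caveat with the += fancy-indexing approach: if inds lists the same position twice, the scalar is only added once. Here is a sketch using Tensor.index_put_ with accumulate=True, which, as I understand it, does accumulate duplicate indices:
rows, cols = inds[:, 0], inds[:, 1]
z_new = z.clone()
z_new.index_put_((rows, cols), torch.tensor(3.), accumulate=True)  # adds 3 at every (row, col) pair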

Scipy sparse.kron gives non-sparse matrix

I am getting unexpected non-sparse results when using the kron method of Scipy's sparse module. Specifically, matrix elements that are equal to zero after performing the kronecker product are being kept in the result, and I'd like to understand what I should do to ensure the output is still fully sparse.
Here's an example of what I mean, taking the kronecker product of two copies of the identity:
import scipy.sparse as sp
s = sp.eye(2)
S = sp.kron(s,s)
S
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 8 stored elements (blocksize = 2x2) in Block Sparse Row format>
print(S)
(0, 0) 1.0
(0, 1) 0.0
(1, 0) 0.0
(1, 1) 1.0
(2, 2) 1.0
(2, 3) 0.0
(3, 2) 0.0
(3, 3) 1.0
The sparse matrix S should only contain the 4 (diagonal) non-zero entries, but here it also has other entries that are equal to zero. Any pointers on what I am doing wrong would be much appreciated.
In "Converting from sparse to dense to sparse again decreases density after constructing sparse matrix" I point out that sparse.kron produces, by default, a BSR format matrix. That's what your display shows. Those extra zeros are part of the dense blocks.
If you specify another format, kron will not produce those zeros:
In [672]: sparse.kron(s,s,format='csr')
Out[672]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in Compressed Sparse Row format>
In [673]: _.A
Out[673]:
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
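If you would rather keep the kron call as-is and clean up afterwards, here is a sketch of one option (relying on the documented behaviour of eliminate_zeros, which removes explicitly stored zero entries in place):
S = sp.kron(s, s)        # BSR by default, stores the zeros inside each 2x2 block
S_csr = S.tocsr()        # expand the blocks into individual stored entries
S_csr.eliminate_zeros()  # drop entries whose stored value is exactly 0
print(S_csr.nnz)         # expect 4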

Keras custom layer/constraint to implement equal weights

I would like to create a layer in Keras such that:
y = Wx + c
where W is a block matrix of the form [[A, B], [B, A]] (given as an image in the original post), A and B are square matrices whose elements were likewise shown in images, and c is a bias vector with repeated elements.
How can I implement these restrictions? I was thinking it could either be implemented in the MyLayer.build() when initializing weights or as a constraint where I can specify certain indices to be equal but I am unsure how to do so.
You can define such a W using the Concatenate layer.
import keras.backend as K
from keras.layers import Concatenate

A = K.placeholder()
B = K.placeholder()
row1 = Concatenate()([A, B])           # [A | B], joined along the last axis
row2 = Concatenate()([B, A])           # [B | A]
W = Concatenate(axis=0)([row1, row2])  # stack the two block rows vertically
Example evaluation:
import numpy as np
get_W = K.function(outputs=[W], inputs=[A, B])
get_W([np.eye(2), np.ones((2,2))])
Returns
[array([[1., 0., 1., 1.],
        [0., 1., 1., 1.],
        [1., 1., 1., 0.],
        [1., 1., 0., 1.]], dtype=float32)]
To pin down the exact shapes you can use the placeholders' shape argument. The addition of the bias c and the multiplication by the input are quite straightforward.
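For the layer itself, here is a minimal sketch of a custom layer (the name TiedBlockDense and the reading that A, B and c each repeat a single scalar are my assumptions, not something stated in the question) that builds W = [[A, B], [B, A]] from two trainable scalars and a repeated bias:
import keras.backend as K
from keras.layers import Layer

class TiedBlockDense(Layer):
    """Sketch of y = W x + c with W = [[A, B], [B, A]], assuming A and B
    each repeat a single trainable scalar and c repeats one scalar."""
    def __init__(self, block_size, **kwargs):
        super(TiedBlockDense, self).__init__(**kwargs)
        self.block_size = block_size

    def build(self, input_shape):
        # one trainable scalar per block, plus one for the repeated bias
        self.a = self.add_weight(name='a', shape=(1,), initializer='glorot_uniform', trainable=True)
        self.b = self.add_weight(name='b', shape=(1,), initializer='glorot_uniform', trainable=True)
        self.c = self.add_weight(name='c', shape=(1,), initializer='zeros', trainable=True)
        super(TiedBlockDense, self).build(input_shape)

    def call(self, x):
        n = self.block_size
        A = K.tile(K.reshape(self.a, (1, 1)), (n, n))  # every element of A equals a
        B = K.tile(K.reshape(self.b, (1, 1)), (n, n))  # every element of B equals b
        row1 = K.concatenate([A, B], axis=1)           # [A | B]
        row2 = K.concatenate([B, A], axis=1)           # [B | A]
        W = K.concatenate([row1, row2], axis=0)        # (2n, 2n) block matrix
        bias = K.tile(self.c, 2 * n)                   # repeated bias of length 2n
        return K.dot(x, K.transpose(W)) + bias

    def compute_output_shape(self, input_shape):
        return (input_shape[0], 2 * self.block_size)
Because only the three scalars a, b and c are trained, the equality constraints on W and c hold by construction.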

What is the difference between dtype= and .astype() in numpy?

Context: I would like to use numpy ndarrays with float32 instead of float64.
Edit: Additional context - I'm concerned about how numpy is executing these calls because they will be happening repeatedly as part of a backpropagation routine in a neural net. I'd like the net to carry out all addition/subtraction/multiplication/division in float32 for validation purposes, as I want to compare results with another group's work. It seems like initialization for methods like randn will always go from float64 -> float32 with .astype() casting. Once my ndarray is of type float32 if i use np.dot for example will those multiplications happen in float32? How can I verify?
The documentation is not clear to me - http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
I figured out I can just add .astype('float32') to the end of a numpy call, for example, np.random.randn(y, 1).astype('float32').
I also see that dtype=np.float32 is an option, for example, np.zeros(5, dtype=np.float32). However, trying np.random.randn((y, 1), dtype=np.float32) returns the following error:
b = np.random.randn((3,1), dtype=np.float32)
TypeError: randn() got an unexpected keyword argument 'dtype'
What is the difference between declaring the type as float32 using dtype and using .astype()?
Both b = np.zeros(5, dtype=np.float32) and b = np.zeros(5).astype('float32') when evaluated with:
print(b)
print(type(b))
print(b[0])
print(type(b[0]))
prints:
[ 0. 0. 0. 0. 0.]
<class 'numpy.ndarray'>
0.0
<class 'numpy.float32'>
Let's see if I can address some of the confusion I'm seeing in the comments.
Make an array:
In [609]: x=np.arange(5)
In [610]: x
Out[610]: array([0, 1, 2, 3, 4])
In [611]: x.dtype
Out[611]: dtype('int32')
The default for arange is to make an int32.
astype is an array method; it can be used on any array:
In [612]: x.astype(np.float32)
Out[612]: array([ 0., 1., 2., 3., 4.], dtype=float32)
arange also takes a dtype parameter
In [614]: np.arange(5, dtype=np.float32)
Out[614]: array([ 0., 1., 2., 3., 4.], dtype=float32)
Whether it created the int array first and converted it, or made the float32 directly, isn't any concern to me. This is a basic operation, done in compiled code.
I can also give it a float stop value, in which case it will give me a float array - the default float type.
In [615]: np.arange(5.0)
Out[615]: array([ 0., 1., 2., 3., 4.])
In [616]: _.dtype
Out[616]: dtype('float64')
zeros is similar; the default dtype is float64, but with a parameter I can change that. Since its primary task is to allocate memory, and it doesn't have to do any calculation, I'm sure it creates the desired dtype right away, without further conversion. But again, this is compiled code, and I shouldn't have to worry about what it is doing under the covers.
In [618]: np.zeros(5)
Out[618]: array([ 0., 0., 0., 0., 0.])
In [619]: _.dtype
Out[619]: dtype('float64')
In [620]: np.zeros(5,dtype=np.float32)
Out[620]: array([ 0., 0., 0., 0., 0.], dtype=float32)
randn involves a lot of calculation, and evidently it is compiled to work with the default float type. It does not take a dtype. But since the result is an array, it can be cast with astype.
In [623]: np.random.randn(3)
Out[623]: array([-0.64520949, 0.21554705, 2.16722514])
In [624]: _.dtype
Out[624]: dtype('float64')
In [625]: __.astype(np.float32)
Out[625]: array([-0.64520949, 0.21554704, 2.16722512], dtype=float32)
Let me stress that astype is a method of an array. It takes the values of the array and produces a new array with the desired dtype. It does not act retroactively (or in-place) on the array itself, or on the function that created that array.
The effect of astype is often (always?) the same as a dtype parameter, but the sequence of actions is different.
In https://stackoverflow.com/a/39625960/901925 I describe a sparse matrix creator that takes a dtype parameter, and implements it with an astype method call at the end.
When you do calculations such as dot or *, numpy tries to match the output dtype to the inputs' dtypes. In the case of mixed types it goes with the higher-precision alternative.
In [642]: np.arange(5,dtype=np.float32)*np.arange(5,dtype=np.float64)
Out[642]: array([ 0., 1., 4., 9., 16.])
In [643]: _.dtype
Out[643]: dtype('float64')
In [644]: np.arange(5,dtype=np.float32)*np.arange(5,dtype=np.float32)
Out[644]: array([ 0., 1., 4., 9., 16.], dtype=float32)
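To answer the "how can I verify?" part of the question directly, you can inspect the dtype of the result, or ask numpy for the promoted type with result_type; a quick sketch:
a = np.random.randn(3, 3).astype(np.float32)
b = np.random.randn(3, 3).astype(np.float32)
print(np.dot(a, b).dtype)                      # float32 - the product of two float32 arrays stays float32
print(np.result_type(np.float32, np.float64))  # float64 - mixed inputs are promoted to the higher precision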
There are casting rules. One way to look those up is with the can_cast function:
In [649]: np.can_cast(np.float64,np.float32)
Out[649]: False
In [650]: np.can_cast(np.float32,np.float64)
Out[650]: True
It is possible in some calculations that it will cast the 32 to 64, do the calculation, and then cast back to 32. The purpose would be to avoid rounding errors. But I don't know how you find that out from the documentation or tests.
import numpy as np

arr1 = np.array([25, 56, 12, 85, 34, 75])
arr2 = np.array([42, 3, 86, 32, 856, 46])

arr1.astype(np.complex)                  # returns a converted copy; arr1 itself is unchanged
print(arr1)
print(type(arr1[0]))
print(arr1.astype(np.complex))
arr2 = np.array(arr2, dtype='complex')   # rebuilds arr2 with the new dtype
print(arr2)
print(type(arr2[0]))
Output for the above:
[25 56 12 85 34 75]
<class 'numpy.int64'>
[25.+0.j 56.+0.j 12.+0.j 85.+0.j 34.+0.j 75.+0.j]
[ 42.+0.j 3.+0.j 86.+0.j 32.+0.j 856.+0.j 46.+0.j]
<class 'numpy.complex128'>
It can be seen that astype returns a converted copy and leaves the original array unchanged (as with ordinary type casting), whereas constructing the array with dtype='complex' gives arr2 the new type from the start.
.astype() copies the data.
>>> a = np.ones(3, dtype=float)
>>> a
array([ 1., 1., 1.])
>>> b = a.astype(int)
>>> b
array([1, 1, 1])
>>> np.may_share_memory(a, b)
False
Note that astype() copies the data even if the dtype is actually the same:
>>> c = a.astype(float)
>>> np.may_share_memory(a, c)
False
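A small addendum, using the copy argument documented for astype: with copy=False the method may return the original array when no conversion is actually required.
>>> d = a.astype(float, copy=False)
>>> np.may_share_memory(a, d)
True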
