PyTorch tensor and its transpose have different storage - pytorch

I was reading the book Deep Learning with PyTorch and was trying out an example which shows that a tensor and its transpose share the same storage.
However, when I tried it out on my local machine, I can see that the storage is different for both. I just wanted to understand why this might be the case here.
The code I tried and the output are below:
>>> points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
>>> points_t = torch.transpose(points,0,1)
>>> points_t
tensor([[4., 5., 2.],
        [1., 3., 1.]])
>>> id(points.storage())==id(points_t.storage())
False
>>> id(points.storage())
2796700202176
>>> id(points_t.storage())
2796700201888
My Python version is 3.9.7 and my PyTorch version is 1.11.0.

You need to compare the data pointers of the storages instead of taking the id of the storage objects.
>>> points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
>>> points_t = torch.transpose(points,0,1)
>>> points_t
tensor([[4., 5., 2.],
        [1., 3., 1.]])
>>> points.storage().data_ptr() == points_t.storage().data_ptr()
True
The reason you are getting False for the id comparison is that the Python wrapper objects returned by .storage() are distinct objects, but the underlying storage (the memory allocated to keep the data) is the same.
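You can see the same thing without the transpose: every call to .storage() builds a new Python wrapper object, so even two calls on the same tensor have different ids while the data pointer stays the same (a minimal sketch):
>>> s1 = points.storage()
>>> s2 = points.storage()
>>> id(s1) == id(s2)                # a fresh wrapper object is created on each call
False
>>> s1.data_ptr() == s2.data_ptr()  # but both wrap the same underlying memory
True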

Related

How to evaluate a pyTorch/DGL tensor

From a DGL graph I want to see the adjacency matrix with
adjM = g.adjacency_matrix()
adjM
and I get the following which is fine:
tensor(indices=tensor([[0, 0, 0, 1],
                       [1, 2, 3, 3]]),
       values=tensor([1., 1., 1., 1.]),
       size=(4, 4), nnz=4, layout=torch.sparse_coo)
Now I want to have the adjacency matrix and the node values each by itself. I imagine something of this kind:
adjMatrix = adjM.indices # or
adjMatrix = adjM[0]
nodeValues = adjM.values # or
nodeValues = adjM[1]
But this form is not accepted by PyTorch/DGL.
My beginner's questions:
how do I do this correctly and successfully? and
is there a tutorial for a newbie? (I have searched a lot just for this detail!)
Check the DGL documentation for the usage of adj(). As the docs say, the return value is the adjacency matrix, and its type is a SparseTensor.
I noticed that the output you posted is a SparseTensor.
You can try the following to get the entire adjacency matrix.
I create a DGL graph g and get the adjacency matrix as adj:
g = dgl.graph(([0, 1, 2], [1, 2, 3]))
adj = g.adj()
adj
output is:
tensor(indices=tensor([[0, 1, 2],
                       [1, 2, 3]]),
       values=tensor([1., 1., 1.]),
       size=(4, 4), nnz=3, layout=torch.sparse_coo)
We can see that adj is stored as a sparse tensor, and the sparse format is COO. We can use the following code to verify that adj is a sparse tensor:
adj.is_sparse
output :
True
so we can use to_dense() to get the dense adjacency matrix
adj.to_dense()
the result is:
tensor([[0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 0., 1.],
        [0., 0., 0., 0.]])
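If, as in the original question, you want the indices and the values separately rather than the dense matrix, a sparse COO tensor exposes them directly. A minimal sketch, assuming adj is a torch sparse COO tensor as in the output above (coalesce() is called first, since indices()/values() require a coalesced tensor):
adj = adj.coalesce()
print(adj.indices())   # tensor([[0, 1, 2],
                       #         [1, 2, 3]])
print(adj.values())    # tensor([1., 1., 1.])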
When you have a problem with DGL you can check the Deep Graph Library Tutorials and Documentation.

Strange behaviour of np.rint function [duplicate]

numpy's rint doesn't seem to be consistent in how it deals with xxx.5:
In [2]: np.rint(1.5)
Out[2]: 2.0
In [3]: np.rint(10.5)
Out[3]: 10.0
1.5 is rounded up while 10.5 is rounded down. Is there a reason for this? Is it just an artifact of the inaccuracy of floats?
Edit
Is there a way to get the desired functionality where n.5 is rounded up i.e. to n+1 for both n = even or odd?
So, this kind of behavior (as noted in the comments) is a very traditional form of rounding, seen in the round-half-to-even method, also known (according to David Heffernan) as banker's rounding. The numpy documentation around this behavior implies that this is the type of rounding being used, but also implies that there may be issues with the way numpy interacts with the IEEE floating point format (shown below).
Notes
-----
For values exactly halfway between rounded decimal values, Numpy
rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0,
-0.5 and 0.5 round to 0.0, etc. Results may also be surprising due
to the inexact representation of decimal fractions in the IEEE
floating point standard [1]_ and errors introduced when scaling
by powers of ten.
Whether or not that is the case, I honestly don't know. I do know that large portions of the numpy core are still written in FORTRAN 77, which predates the IEEE standard (set in 1984), but I don't know enough FORTRAN 77 to say whether or not there's some issue with the interface here.
If you're looking to just round up regardless, the np.ceil function (ceiling function in general), will do this. If you're looking for the opposite (always rounding down), the np.floor function will achieve this.
This is in fact exactly the rounding specified by the IEEE floating point standard IEEE 754 (1985 and 2008). It is intended to make rounding unbiased. In normal probability theory, a random number between two integers has zero probability of being exactly N + 0.5, so it shouldn't matter how you round it because that case never happens. But in real programs, numbers are not random and N + 0.5 occurs quite often. (In fact, you have to round 0.5 every time a floating point number loses 1 bit of precision!) If you always round 0.5 up to the next largest number, then the average of a bunch of rounded numbers is likely to be slightly larger than the average of the unrounded numbers: this bias or drift can have very bad effects on some numerical algorithms and make them inaccurate.
The reason rounding to even is better than rounding to odd is that the last digit is guaranteed to be zero, so if you have to divide by 2 and round again, you don't lose any information at all.
In summary, this kind of rounding is the best that mathematicians have been able to devise, and you should WANT it under most circumstances. Now all we need to do is get schools to start teaching it to children.
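To see the bias argument concretely, here is a small sketch comparing banker's rounding with always-round-half-up on a run of exact halves (expected output in the comments):
import numpy as np

halves = np.arange(0.5, 100.5, 1.0)      # 0.5, 1.5, ..., 99.5
print(halves.mean())                     # 50.0
print(np.rint(halves).mean())            # 50.0 -- ties to even: no drift
print(np.floor(halves + 0.5).mean())     # 50.5 -- ties always up: drifts upward by half a unit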
Numpy rounding does round towards even, but the other rounding modes can be expressed using a combination of operations.
>>> a=np.arange(-4,5)*0.5
>>> a
array([-2. , -1.5, -1. , -0.5, 0. , 0.5, 1. , 1.5, 2. ])
>>> np.floor(a) # Towards -inf
array([-2., -2., -1., -1., 0., 0., 1., 1., 2.])
>>> np.ceil(a) # Towards +inf
array([-2., -1., -1., -0., 0., 1., 1., 2., 2.])
>>> np.trunc(a) # Towards 0
array([-2., -1., -1., -0., 0., 0., 1., 1., 2.])
>>> a+np.copysign(0.5,a) # Shift away from 0
array([-2.5, -2. , -1.5, -1. , 0.5, 1. , 1.5, 2. , 2.5])
>>> np.trunc(a+np.copysign(0.5,a)) # 0.5 towards higher magnitude round
array([-2., -2., -1., -1., 0., 1., 1., 2., 2.])
In general, numbers of the form n.5 can be accurately represented by binary floating point (they are m.1 in binary, as 0.5=2**-1), but calculations expected to reach them might not. For instance, negative powers of ten are not exactly represented:
>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)
>>> [10**n * 10**-n for n in range(20)]
[1, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
0.9999999999999999, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
Numpy uses banker's rounding, so .5 is rounded to the nearest even number. If you always want to round .5 up but round .4 down:
np.rint(np.nextafter(a, a+1))
or if you always want to round .5 down and .4 down but .6 up:
np.rint(np.nextafter(a, a-1))
NB this also works with np.around if you want the same logic but not integers.
>>> a = np.array([1, 1.5, 2, 2.5, 3, 3.5])
>>> np.rint(a)
array([1., 2., 2., 2., 3., 4.])
>>> np.rint(np.nextafter(a, a+1))
array([1., 2., 2., 3., 3., 4.])
>>> np.rint(np.nextafter(a, a-1))
array([1., 1., 2., 2., 3., 3.])
What is happening?
nextafter gives the next representable number in a direction, so this is enough to push the number off 'exactly' 2.5.
Note this is different to ceil and floor.
>>> np.ceil(a)
array([1., 2., 2., 3., 3., 4.])
>>> np.floor(a)
array([1., 1., 2., 2., 3., 3.])
An answer to your edit:
y = int(np.floor(n + 0.5))
The built-in round function seems to do what you want, although it only works on scalars (note that this relies on Python 2's round, which rounds halves away from zero; Python 3's built-in round also uses banker's rounding):
def correct_round(x):
    try:
        y = [round(z) for z in x]
    except TypeError:
        y = round(x)
    return y
and then to verify:
print correct_round([-2.5,-1.5,-0.5,0.5,1.5,2.5])
> [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
Not sure it's the most efficient solution, but it works:
signs = np.sign(arr)
tmp = signs * arr
arr = np.floor(tmp + 0.5)
arr = arr * signs
A round-half-up function for scalars, lists and numpy arrays:
import numpy as np

def round_half_up(x):
    round_lambda = lambda z: (int(z > 0) - int(z < 0)) * int(abs(z) + 0.5)
    if isinstance(x, (list, np.ndarray, np.generic)):
        return np.vectorize(round_lambda)(x)
    else:
        return round_lambda(x)
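A quick usage check of the function above (expected output shown in the comments):
print(round_half_up(2.5))                                # 3
print(round_half_up([-2.5, -1.5, 0.5, 1.5, 2.5]))        # [-3 -2  1  2  3]
print(round_half_up(np.array([0.4, 0.5, 1.5])))          # [0 1 2]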
This worked for me:
def my_round(a):
return np.round(a)*(a-np.floor(a)!=0.5) + np.ceil(a)*(a-np.floor(a)==0.5)
>>> my_round([0.5, 1.5, 2.5, 3.5])
array([1., 2., 3., 4.])

How to add to pytorch tensor at indices?

I have to admit, I'm a bit confused by the scatter* and index* operations - I'm not sure any of them do exactly what I'm looking for, which is very simple:
Given some 2-D tensor
z = tensor([[1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])
And a list (or tensor?) of 2-d indexes:
inds = tensor([[0, 0],
               [1, 1],
               [1, 2]])
I want to add a scalar to z at those indexes (and do it efficiently):
znew = z.something_add(inds, 3)
->
znew = tensor([[4., 1., 1., 1.],
               [1., 4., 4., 1.],
               [1., 1., 1., 1.]])
If I have to I can make that scalar a tensor of whatever shape (where all elements = 3), but I'd rather not...
You must provide two lists to your indexing: the first with the row positions and the second with the column positions. In your example, it would be:
z[[0, 1, 1], [0, 1, 2]] += 3
torch.Tensor indexing follows Numpy. See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details.
This code achieves what you want:
z_new = z.clone() # copy the tensor
z_new[inds[:, 0], inds[:, 1]] += 3 # modify selected indices of new tensor
In PyTorch, you can index each axis of a tensor with another tensor.
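One caveat with the advanced-indexing += above: if the same index pair appears more than once in inds, the scalar is only added once for that position. If you need duplicate indices to accumulate, index_put_ with accumulate=True is an alternative; a minimal sketch (assumes torch is imported and z, inds are as in the question):
znew = z.clone()
vals = torch.full((inds.shape[0],), 3.0)  # one value per index pair
znew.index_put_((inds[:, 0], inds[:, 1]), vals, accumulate=True)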

How to divide a list of arrays into subarrays?

I have a list a with the following element values. In my code, I created it like this:
import numpy as np
from copy import deepcopy

a = []
b = np.zeros(3)
c = []
for i in range(0, 4):
    b[0] = i + 1
    b[1] = i + 2
    b[2] = i + 3
    c.append(deepcopy(b))
a.append(c)
c = []
print(a)
Output:
[[array([1., 2., 3.]), array([2., 3., 4.]), array([3., 4., 5.]), array([4., 5., 6.])]]
The list above is an example like the ones I get in my data.
I tried to make an array:
b=np.array(a)
array([[[1., 2., 3.],
        [2., 3., 4.],
        [3., 4., 5.],
        [4., 5., 6.]]])
b.shape
(1,4,3)
But I want to make b of shape (4, 1, 3), so that when I access it:
b[0] gives [1,2,3]
b[1] gives [2,3,4]
b[2] gives [3,4,5]
b[3] gives [4,5,6]
There's a built-in function for this:
b = np.vstack(a)
EDITED
After using np.vstack(a):
b = b.reshape(4, 1, 3)
This gives the required result:
b[0] -> [[1., 2., 3.]]
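Putting the two steps together, with the shape at each stage in the comments (this assumes a is the nested list built in the question):
b = np.vstack(a)        # shape (4, 3); b[0] is already array([1., 2., 3.])
b = b.reshape(4, 1, 3)  # shape (4, 1, 3), matching the shape asked for
print(b[0])             # [[1. 2. 3.]]
print(b[3])             # [[4. 5. 6.]]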
EDIT:
Use orli's answer, it's easier to type!
Using basic Python 3.
import numpy as np
from copy import deepcopy

a = []
b = np.zeros(3)
c = []
for i in range(0, 4):
    b[0] = i + 1
    b[1] = i + 2
    b[2] = i + 3
    c.append(deepcopy(b))
a.append(c)

res = []
for r in a:
    for c in r:
        rw = []
        for e in c.tolist():
            rw.append(e)
        res.append(rw)
print(res)
Yields:
[[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0], [4.0, 5.0, 6.0]]
Maybe I'm missing something but you should be able to get the result as:
b = np.array(a[0])
print(b[0]) # [1. 2. 3.]
print(b[1]) # [2. 3. 4.]
print(b[2]) # [3. 4. 5.]
print(b[3]) # [4. 5. 6.]
To preserve a 3D array:
b = np.array(a[0]).reshape(4, 1, 3)
print(b[0]) #=> [[1. 2. 3.]]
print(b[1]) #=> [[2. 3. 4.]]
print(b[2]) #=> [[3. 4. 5.]]
print(b[3]) #=> [[4. 5. 6.]]

What is the difference between dtype= and .astype() in numpy?

Context: I would like to use numpy ndarrays with float32 instead of float64.
Edit: Additional context - I'm concerned about how numpy is executing these calls because they will be happening repeatedly as part of a backpropagation routine in a neural net. I'd like the net to carry out all addition/subtraction/multiplication/division in float32 for validation purposes, as I want to compare results with another group's work. It seems like initialization for methods like randn will always go from float64 -> float32 with .astype() casting. Once my ndarray is of type float32 if i use np.dot for example will those multiplications happen in float32? How can I verify?
The documentation is not clear to me - http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
I figured out I can just add .astype('float32') to the end of a numpy call, for example, np.random.randn(y, 1).astype('float32').
I also see that dtype=np.float32 is an option, for example, np.zeros(5, dtype=np.float32). However, trying np.random.randn((y, 1), dtype=np.float32) returns the following error:
b = np.random.randn((3,1), dtype=np.float32)
TypeError: randn() got an unexpected keyword argument 'dtype'
What is the difference between declaring the type as float32 using dtype and using .astype()?
Both b = np.zeros(5, dtype=np.float32) and b = np.zeros(5).astype('float32'), when evaluated with:
print(b)
print(type(b))
print(b[0])
print(type(b[0]))
print:
[ 0.  0.  0.  0.  0.]
<class 'numpy.ndarray'>
0.0
<class 'numpy.float32'>
Let's see if I can address some of the confusion I'm seeing in the comments.
Make an array:
In [609]: x=np.arange(5)
In [610]: x
Out[610]: array([0, 1, 2, 3, 4])
In [611]: x.dtype
Out[611]: dtype('int32')
The default for arange is to make an int32.
astype is an array method; it can be used on any array:
In [612]: x.astype(np.float32)
Out[612]: array([ 0., 1., 2., 3., 4.], dtype=float32)
arange also takes a dtype parameter
In [614]: np.arange(5, dtype=np.float32)
Out[614]: array([ 0., 1., 2., 3., 4.], dtype=float32)
Whether it creates the int array first and converts it, or makes the float32 directly, isn't any concern to me. This is a basic operation, done in compiled code.
I can also give it a float stop value, in which case it will give me a float array - the default float type.
In [615]: np.arange(5.0)
Out[615]: array([ 0., 1., 2., 3., 4.])
In [616]: _.dtype
Out[616]: dtype('float64')
zeros is similar; the default dtype is float64, but with a parameter I can change that. Since its primary task is to allocate memory, and it doesn't have to do any calculation, I'm sure it creates the desired dtype right away, without further conversion. But again, this is compiled code, and I shouldn't have to worry about what it is doing under the covers.
In [618]: np.zeros(5)
Out[618]: array([ 0., 0., 0., 0., 0.])
In [619]: _.dtype
Out[619]: dtype('float64')
In [620]: np.zeros(5,dtype=np.float32)
Out[620]: array([ 0., 0., 0., 0., 0.], dtype=float32)
randn involves a lot of calculation, and evidently it is compiled to work with the default float type. It does not take a dtype. But since the result is an array, it can be cast with astype.
In [623]: np.random.randn(3)
Out[623]: array([-0.64520949, 0.21554705, 2.16722514])
In [624]: _.dtype
Out[624]: dtype('float64')
In [625]: __.astype(np.float32)
Out[625]: array([-0.64520949, 0.21554704, 2.16722512], dtype=float32)
Let me stress that astype is a method of an array. It takes the values of the array and produces a new array with the desired dtype. It does not act retroactively (or in-place) on the array itself, or on the function that created that array.
The effect of astype is often (always?) the same as a dtype parameter, but the sequence of actions is different.
In https://stackoverflow.com/a/39625960/901925 I describe a sparse matrix creator that takes a dtype parameter, and implements it with an astype method call at the end.
When you do calculations such as dot or *, it tries to match the output dtype with inputs. In the case of mixed types it goes with the higher precision alternative.
In [642]: np.arange(5,dtype=np.float32)*np.arange(5,dtype=np.float64)
Out[642]: array([ 0., 1., 4., 9., 16.])
In [643]: _.dtype
Out[643]: dtype('float64')
In [644]: np.arange(5,dtype=np.float32)*np.arange(5,dtype=np.float32)
Out[644]: array([ 0., 1., 4., 9., 16.], dtype=float32)
There are casting rules. One way to look those up is with can_cast function:
In [649]: np.can_cast(np.float64,np.float32)
Out[649]: False
In [650]: np.can_cast(np.float32,np.float64)
Out[650]: True
It is possible in some calculations that it will cast the 32 to 64, do the calculation, and then cast back to 32. The purpose would be to avoid rounding errors. But I don't know how you find that out from the documentation or tests.
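A quick way to check what an operation will hand back, related to the np.dot question above, is to look at np.result_type (which applies NumPy's promotion rules) and at the dtype of an actual result; a minimal sketch (variable names are illustrative, and this only shows the result dtype, not any internal accumulation precision):
import numpy as np

a32 = np.random.randn(3, 3).astype(np.float32)
b32 = np.random.randn(3, 3).astype(np.float32)

print(np.result_type(a32, b32))                    # float32 -- promoted dtype for these inputs
print(np.dot(a32, b32).dtype)                      # float32 -- the product stays in float32
print(np.dot(a32, b32.astype(np.float64)).dtype)   # float64 -- mixed inputs promote to float64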
arr1 = np.array([25, 56, 12, 85, 34, 75])
arr2 = np.array([42, 3, 86, 32, 856, 46])

arr1.astype(np.complex128)   # np.complex is a deprecated alias; use np.complex128
print(arr1)
print(type(arr1[0]))
print(arr1.astype(np.complex128))

arr2 = np.array(arr2, dtype='complex')
print(arr2)
print(type(arr2[0]))
Output for the above:
[25 56 12 85 34 75]
<class 'numpy.int64'>
[25.+0.j 56.+0.j 12.+0.j 85.+0.j 34.+0.j 75.+0.j]
[ 42.+0.j 3.+0.j 86.+0.j 32.+0.j 856.+0.j 46.+0.j]
<class 'numpy.complex128'>
It can be seen that astype changes the type only temporarily, in the sense that it returns a new array and leaves arr1 itself unchanged (as in normal type casting), whereas constructing the array with a dtype argument, as for arr2, sets its type permanently.
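Put differently, astype never modifies the array it is called on; to keep the converted values you have to assign the result back. A small sketch (arr3 is just an illustrative name):
arr3 = np.array([25, 56, 12])
arr3.astype(np.complex128)         # returns a new complex array; arr3 itself is unchanged
print(arr3.dtype)                  # int64 (or int32, depending on the platform)
arr3 = arr3.astype(np.complex128)  # reassign to keep the converted array
print(arr3.dtype)                  # complex128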
.astype() copies the data.
>>> a = np.ones(3, dtype=float)
>>> a
array([ 1., 1., 1.])
>>> b = a.astype(int)
>>> b
array([1, 1, 1])
>>> np.may_share_memory(a, b)
False
Note that astype() copies the data even if the dtype is actually the same:
>>> c = a.astype(float)
>>> np.may_share_memory(a, c)
False

Resources