Array conforming shape of a given variable - python-3.x

I need to do some calculations with a NetCDF file.
I have two variables with the following dimensions and sizes:
A [time | 1] x [lev | 12] x [lat | 84] x [lon | 228]
B [lev | 12]
What I need is to produce a new array, C, with shape (1,12,84,228), in which the contents of B are propagated along all the other dimensions of A.
Usually, this is easily done in NCL with the conform function. I am not sure what the equivalent is in Python.
Thank you.

The numpy.broadcast_to function can do this, although in this case B first needs two trailing size-1 dimensions added to it to satisfy the NumPy broadcasting rules:
>>> import numpy
>>> B = numpy.arange(12)
>>> B
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> B = B.reshape(12, 1, 1)
>>> B.shape
(12, 1, 1)
>>> C = numpy.broadcast_to(B, (1, 12, 84, 228))
>>> C.shape
(1, 12, 84, 228)
>>> C[0, :, 0, 0]
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> C[-1, :, -1, -1]
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
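The reshape-then-broadcast steps above can be wrapped in a small helper that loosely mimics what NCL's conform does for a 1-D variable. This is only a sketch: the name conform_like and its axis argument are hypothetical and not part of NumPy, and it assumes B is 1-D and matches the length of the chosen axis.

```python
import numpy as np

def conform_like(values, target_shape, axis):
    """Broadcast 1-D `values` along `axis` of an array with `target_shape`.

    Hypothetical helper for illustration; not a NumPy function.
    """
    values = np.asarray(values)
    # Size-1 everywhere except on the axis that `values` corresponds to.
    shape = [1] * len(target_shape)
    shape[axis] = values.shape[0]
    return np.broadcast_to(values.reshape(shape), target_shape)

A = np.zeros((1, 12, 84, 228))   # stand-in for the NetCDF variable A
B = np.arange(12)                # stand-in for the lev-sized variable B
C = conform_like(B, A.shape, axis=1)
print(C.shape)  # (1, 12, 84, 228)
```

numpy.broadcast_to returns a read-only view, so call C.copy() if the result needs to be modified. For plain arithmetic such as A * B.reshape(1, 12, 1, 1), NumPy broadcasts automatically and C does not have to be built at all.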

Related

When I use set(list_a + list_b) it returns a dictionary. Do sets naturally return dictionaries?

I'm doing some beginner python exercises and one of them is to remove duplicates from a list. I've successfully done it, but the strange thing is that it is returning a dictionary instead of a list.
This is my code.
import random
a = []
b = []
for i in range(0,20):
    n = random.randint(0,10)
    a.append(n)
for i in range(0,20):
    n = random.randint(0,10)
    b.append(n)
print(sorted(a))
print(sorted(b))
c = set(list(a+b))
print(c)
and this is what it's spitting out
[0, 0, 1, 1, 1, 1, 2, 3, 4, 4, 6, 6, 7, 7, 7, 8, 9, 9, 10, 10]
[0, 1, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 6, 7, 8, 9, 9, 10, 10, 10]
{0, 1, 2, 3, 4, 6, 7, 8, 9, 10}
thanks in advance!
{0, 1, 2, 3, 4, 6, 7, 8, 9, 10} is a set, not a dictionary. A dictionary would be printed as {key: value, key: value, ...}.
Try print(type(c)) and you'll see it prints <class 'set'> rather than <class 'dict'>
Also try the following
s = {1,2,3}
print(type(s))
d = {'a':1,'b':2,'c':3}
print(type(d))
You'll see that the types are different.
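Since the exercise asks for a list without duplicates, here is a short sketch of converting the set back to a list. The variable names and sample values are made up for illustration. It also shows the one case where braces really do mean a dictionary: empty braces.

```python
a = [3, 1, 3, 2]
b = [2, 5, 1]

unique = set(a + b)          # a + b is already a list, so list() is not needed
print(type(unique))          # <class 'set'>

as_list = sorted(unique)     # sorted() returns a new list
print(as_list)               # [1, 2, 3, 5]

print(type({}))              # <class 'dict'>  (empty braces are a dict)
print(type(set()))           # <class 'set'>   (this is the empty set)
```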

PyTorch: How to insert before a certain element

I have a 2D tensor and, for each row, I want to insert a new element e before the first occurrence of a specified value v. There is no guarantee that every row contains v; if a row does not, e should simply be appended to it.
Example: Suppose e is 0 and v is 10. Given the tensor
[[9, 6, 5, 4, 10],
[8, 7, 3, 5, 5],
[4, 9, 10, 10, 10]]
I want to get
[[9, 6, 5, 4, 0, 10],
[8, 7, 3, 5, 5, 0],
[4, 9, 0, 10, 10, 10]]
Is there a Torch-style way to do this? In the worst case I can treat this as a plain Python problem, but I think that solution would be rather slow.
I haven't yet found a full PyTorch solution. I'll keep looking, but here is somewhere to start:
>>> import torch
>>> v, e = 10, 0
>>> v, e = torch.tensor([v]), torch.tensor([e])
>>> x = torch.tensor([[ 9, 6, 5, 4, 10],
...                   [ 8, 7, 3, 5, 5],
...                   [ 4, 9, 10, 10, 10],
...                   [10, 9, 7, 10, 2]])
To deal with the edge case where v is not found in one of the rows you can add a temporary column to x. This will ensure every row has a value v in it. We will use x_ as a helper tensor:
>>> x_ = torch.cat([x, v.repeat(x.size(0))[:, None]], axis=1)
>>> x_
tensor([[ 9, 6, 5, 4, 10, 10],
[ 8, 7, 3, 5, 5, 10],
[ 4, 9, 10, 10, 10, 10],
[10, 9, 7, 10, 2, 10]])
Find the indices of the first value v on each row:
>>> bp = (x_ == v).int().argmax(axis=1)
>>> bp
tensor([4, 5, 2, 0])
Finally, the easiest way to insert values at different positions in each row is with a list comprehension:
>>> torch.stack([torch.cat([xi[:bpi], e, xi[bpi:]]) for xi, bpi in zip(x, bp)])
tensor([[ 9, 6, 5, 4, 0, 10],
[ 8, 7, 3, 5, 5, 0],
[ 4, 9, 0, 10, 10, 10],
[ 0, 10, 9, 7, 10, 2]])
Edit - If v cannot occur in the first position, then there is no need for x_: an argmax of 0 can only mean that v was not found, so the insertion index for that row is set to the row length:
>>> x
tensor([[ 9, 6, 5, 4, 10],
[ 8, 7, 3, 5, 5],
[ 4, 9, 10, 10, 10]])
>>> bp = (x == v).int().argmax(axis=1)
>>> bp[bp == 0] = x.size(1)
>>> torch.stack([torch.cat([xi[:bpi], e, xi[bpi:]]) for xi, bpi in zip(x, bp)])
tensor([[ 9, 6, 5, 4, 0, 10],
[ 8, 7, 3, 5, 5, 0],
[ 4, 9, 0, 10, 10, 10]])
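The list comprehension above still loops over the rows in Python. One possible way to avoid that loop is to build an index tensor and use torch.gather. The following is a sketch only: the function name insert_before_first is hypothetical, and it assumes x is a 2D integer tensor and that v and e are plain Python scalars rather than the one-element tensors used above.

```python
import torch

def insert_before_first(x, v, e):
    """Insert e before the first v in each row of 2-D x; append if v is absent.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    n_rows, n_cols = x.shape
    is_v = x == v
    first = is_v.int().argmax(dim=1)
    # Rows without v get the row length as insertion position (append).
    pos = torch.where(is_v.any(dim=1), first, torch.full_like(first, n_cols))
    pos = pos[:, None]
    cols = torch.arange(n_cols + 1, device=x.device).expand(n_rows, -1)
    # Output column j reads input column j before the gap and j - 1 after it.
    # The clamp only affects the gap column itself, which is overwritten below.
    src = (cols - (cols > pos).long()).clamp(max=n_cols - 1)
    out = torch.gather(x, 1, src)
    out[cols == pos] = e
    return out

x = torch.tensor([[ 9, 6,  5,  4, 10],
                  [ 8, 7,  3,  5,  5],
                  [ 4, 9, 10, 10, 10],
                  [10, 9,  7, 10,  2]])
result = insert_before_first(x, v=10, e=0)
print(result)
```

For the four-row tensor used earlier this should give the same rows as the list-comprehension version, including the row where v is in the first position and the row where v is missing.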

Adding 2 numpy nd.array

I have two numpy.ndarray objects, A and B, with the following shapes:
A = (500000, 784), B = (500000,). I need to combine them so that B, which holds the labels, is added as the 785th column of A without changing the row order of the data,
i.e. A ends up with shape (500000, 785).
np.append(A.T,[B.T], axis=0).T
For example:
import numpy as np
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,9,11]])
B = np.array([4,5,3,6])
np.append(A.T,[B.T], axis=0).T
Output:
array([[ 1, 2, 3, 4],
[ 4, 5, 6, 5],
[ 7, 8, 9, 3],
[10, 9, 11, 6]])
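As an alternative to the double transpose, np.column_stack appends a 1-D array as a new column directly, and np.hstack does the same once B has been given a trailing axis. A small sketch on the same example data (the names C and D are just for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 9, 11]])
B = np.array([4, 5, 3, 6])

# column_stack treats the 1-D array B as a column.
C = np.column_stack((A, B))
print(C.shape)  # (4, 4)

# Equivalent: turn B into shape (4, 1) and stack horizontally.
D = np.hstack((A, B[:, None]))
print(np.array_equal(C, D))  # True
```

Both calls return a new array and leave A unchanged, so with the original data the result would have to be assigned back, for example A = np.column_stack((A, B)).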

How to get indices of a specific number in an array?

I want to pick the indices of number 8 without knowing its position in the array.
a = np.arange(10)
You can use np.where:
>>> import numpy as np
>>> a = np.array([1,4,8,2,6,7,9,8,7,8,8,9,1,0])
>>> a
array([1, 4, 8, 2, 6, 7, 9, 8, 7, 8, 8, 9, 1, 0])
>>> np.where(a==8)[0]
array([ 2, 7, 9, 10], dtype=int64)
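For a 1-D array, np.flatnonzero returns the same indices without the [0] step, and np.nonzero generalises to more dimensions by returning one index array per axis. A brief sketch (the 2-D array m is made up for illustration):

```python
import numpy as np

a = np.array([1, 4, 8, 2, 6, 7, 9, 8, 7, 8, 8, 9, 1, 0])

# Indices of every element equal to 8 in a 1-D array.
idx = np.flatnonzero(a == 8)
print(idx.tolist())  # [2, 7, 9, 10]

# For a 2-D array, np.nonzero gives row indices and column indices separately.
m = np.array([[8, 1], [2, 8]])
rows, cols = np.nonzero(m == 8)
print(list(zip(rows.tolist(), cols.tolist())))  # [(0, 0), (1, 1)]
```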

Error:setting an array element with a sequence

if triangles is None:
    tridata = mesh['face'].data['vertex_indices']
    print(tridata)
    print(type(tridata))
    print(tridata.dtype)
    triangles = plyfile.make2d(tridata)
This raises an error: setting an array element with a sequence.
I checked the contents, type and dtype of tridata:
[array([ 0, 5196, 10100], dtype=int32)
array([ 0, 2850, 10103], dtype=int32)
array([ 0, 3112, 10102], dtype=int32) ...
array([ 2849, 10076, 5728], dtype=int32)
array([ 2849, 10099, 8465], dtype=int32)
array([ 2849, 10098, 8602], dtype=int32)]
<class 'numpy.ndarray'>
object
ValueError: setting an array element with a sequence.
I don't know what is wrong.
Here is the code of the function "make2d":
def make2d(array, cols=None, dtype=None):
    '''
    Make a 2D array from an array of arrays. The `cols' and `dtype'
    arguments can be omitted if the array is not empty.
    '''
    if (cols is None or dtype is None) and not len(array):
        raise RuntimeError("cols and dtype must be specified for empty "
                           "array")
    if cols is None:
        cols = len(array[0])
    if dtype is None:
        dtype = array[0].dtype
    return _np.fromiter(array, [('_', dtype, (cols,))],
                        count=len(array))['_']
Where's this code from? The use of a compound dtype in fromiter is tricky.
In [102]: dt1=np.dtype([('_',int,(4,))])
In [103]: dt2=np.dtype('i,i,i,i')
In [104]: x = np.arange(12).reshape(3,4)
In [105]: np.fromiter(x, dt1)
....
ValueError: setting an array element with a sequence.
In [106]: np.fromiter(x, dt2)
...
ValueError: setting an array element with a sequence.
If I flatten the array, it works - except values are replicated:
In [107]: np.fromiter(x.ravel(), dt1)
Out[107]:
array([([ 0, 0, 0, 0],), ([ 1, 1, 1, 1],), ([ 2, 2, 2, 2],),
([ 3, 3, 3, 3],), ([ 4, 4, 4, 4],), ([ 5, 5, 5, 5],),
([ 6, 6, 6, 6],), ([ 7, 7, 7, 7],), ([ 8, 8, 8, 8],),
([ 9, 9, 9, 9],), ([10, 10, 10, 10],), ([11, 11, 11, 11],)],
dtype=[('_', '<i8', (4,))])
Converting the array to a nested list works:
In [108]: np.fromiter(x.tolist(), dt1)
Out[108]:
array([([ 0, 1, 2, 3],), ([ 4, 5, 6, 7],), ([ 8, 9, 10, 11],)],
dtype=[('_', '<i8', (4,))])
In [109]: np.fromiter(x.tolist(), dt2)
....
ValueError: setting an array element with a sequence.
But if I make it a list of tuples, I can create this structured array. List of tuples is the normal way of filling a structured array.
In [110]: np.fromiter([tuple(i) for i in x.tolist()], dt2)
Out[110]:
array([(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)],
dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<i4')])
But with an object dtype array, none of these tricks work:
In [111]: a
Out[111]:
array([array([0, 1, 2, 3]), array([5, 6, 7, 8]), array([10, 11, 12, 13])],
dtype=object)
I can make an array with dt1 using assignment to an initialized array:
In [123]: b = np.zeros((3,), dt1)
In [124]: b
Out[124]:
array([([0, 0, 0, 0],), ([0, 0, 0, 0],), ([0, 0, 0, 0],)],
dtype=[('_', '<i8', (4,))])
In [125]: b['_']=x
In [126]: b
Out[126]:
array([([ 0, 1, 2, 3],), ([ 4, 5, 6, 7],), ([ 8, 9, 10, 11],)],
dtype=[('_', '<i8', (4,))])
I can also iteratively fill it from the array of arrays:
In [128]: for i in range(3):
...: b['_'][i]=a[i]
...:
In [129]: b
Out[129]:
array([([ 0, 1, 2, 3],), ([ 5, 6, 7, 8],), ([10, 11, 12, 13],)],
dtype=[('_', '<i8', (4,))])
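If the goal is simply a regular 2D array, and every inner array has the same length, stacking the elements may be enough and avoids the compound dtype entirely. This is a sketch of that alternative, not what plyfile itself does; the sample rows are copied from the question's printout and the object array is rebuilt by hand here.

```python
import numpy as np

# Rebuild a 1-D object array whose elements are equal-length int32 arrays,
# similar to the `tridata` printed in the question.
rows = [[0, 5196, 10100], [0, 2850, 10103], [2849, 10098, 8602]]
tridata = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    tridata[i] = np.array(row, dtype=np.int32)

# Stack the inner arrays into one regular 2-D array.
triangles = np.stack(list(tridata))
print(triangles.shape)  # (3, 3)
print(triangles.dtype)  # int32
```

np.stack raises a ValueError if the inner arrays differ in length, which for a mesh would indicate faces that are not all triangles.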
