How to add a column to the front of a NumPy array? - python-3.x

I want to add a column x0, which I created with shape (1, 10), to the front of an existing NumPy array X of shape (10, 3), so that the resulting array X_new has shape (10, 4).
x0 = np.ones((1,np.shape(X)[0]))
X = np.array([[1500,1,2],[1700,3,3],[2000,2,2],[2400,2,3],[2700,3,3],[3000,3,4],[3100,2,3],[3300,3,4],[3500,4,5],[3600,3,4]])
Desired output:
X_new = np.array([[1,1500,1,2],[1,1700,3,3],[1,2000,2,2],[1,2400,2,3],[1,2700,3,3],[1,3000,3,4],[1,3100,2,3],[1,3300,3,4],[1,3500,4,5],[1,3600,3,4]])
I have tried np.concatenate and np.hstack, but I cannot get the desired array.
Please help.
Thank you.

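You are using the wrong shape for x0: it must be a column of shape (10, 1), not a row of shape (1, 10). Once you fix that, you can use np.hstack. Passing dtype=X.dtype keeps the result an integer array; a plain np.ones would produce floats and turn the whole result into a float array:
import numpy as np
X = np.array([[1500,1,2],[1700,3,3],[2000,2,2],[2400,2,3],[2700,3,3],[3000,3,4],[3100,2,3],[3300,3,4],[3500,4,5],[3600,3,4]])
x0 = np.ones((X.shape[0], 1), dtype=X.dtype)  # column of ones, shape (10, 1)
x_new = np.hstack([x0, X])  # shape (10, 4)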
x_new
array([[   1, 1500,    1,    2],
       [   1, 1700,    3,    3],
       [   1, 2000,    2,    2],
       [   1, 2400,    2,    3],
       [   1, 2700,    3,    3],
       [   1, 3000,    3,    4],
       [   1, 3100,    2,    3],
       [   1, 3300,    3,    4],
       [   1, 3500,    4,    5],
       [   1, 3600,    3,    4]])
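As an alternative to building x0 by hand, np.insert can place a constant column at a given index. The sketch below assumes a shortened three-row sample of X rather than the full array from the question.

```python
import numpy as np

# Shortened sample of the array from the question (assumption: 3 rows instead of 10).
X = np.array([[1500, 1, 2], [1700, 3, 3], [2000, 2, 2]])

# Insert the scalar 1 before column index 0; it is broadcast down every row.
X_new = np.insert(X, 0, 1, axis=1)

print(X_new.shape)  # (3, 4)
print(X_new)
```

np.insert casts the inserted value to the dtype of X, so the result stays an integer array here.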

Related

3D matrix addition python

I am trying to add two 3D matrices, but the third loop is not starting from 0.
The shape of each matrix is (2, 3, 3).
Code:
for i in range(0,r):
    for j in range(0,c):
        for l in range(0,k):
            sum[i][j][k]=A1[i][j][k]+A2[i][j][k]
Output:
IndexError: index 3 is out of bounds for axis 0 with size 3
For element-wise addition of two matrices, you can simply use the + operator between two numpy arrays:
import numpy as np
# create two matrices of random integers
matrix1 = np.random.randint(10, size=(2,3,3))
matrix2 = np.random.randint(10, size=(2,3,3))
# add the two matrices element-wise
sum_matrix = matrix1 + matrix2
print(matrix1, matrix2, sum_matrix, sep='\n__________\n')
I don't get an IndexError with the code below. Could you post your whole code?
This is my code:
arr1 = [[[2, 4, 8], [7, 7, 1], [4, 9, 0]], [[5, 0, 0], [3, 8, 6], [0, 5, 8]]]
arr2 = [[[3, 8, 0], [1, 5, 2], [0, 3, 9]], [[9, 7, 7], [1, 2, 5], [1, 1, 3]]]
sumArr = [[[0, 0, 0], [0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0],[0, 0, 0]]]
for i in range(2):  # can also use range(0, 2)
    for j in range(3):
        for k in range(3):
            sumArr[i][j][k] = arr1[i][j][k] + arr2[i][j][k]
print(sumArr)
By the way, is it necessary to use a for loop?
If not, you can use the NumPy library.
import numpy as np
Convert your nested lists to NumPy arrays, then add them.
arr1 = [[[2, 4, 8], [7, 7, 1], [4, 9, 0]], [[5, 0, 0], [3, 8, 6], [0, 5, 8]]]
arr2 = [[[3, 8, 0], [1, 5, 2], [0, 3, 9]], [[9, 7, 7], [1, 2, 5], [1, 1, 3]]]
m1 = np.array(arr1)
m2 = np.array(arr2)
print("M1: \n", m1)
print("M2: \n", m2)
print("Sum: \n", m1 + m2)
You iterate with l in the third loop, but you index the lists with k. Since k is the size of the last axis (3 here), your code tries to access index 3, which doesn't exist, and you get the IndexError.
Use this:
for i in range(0, r):
    for j in range(0, c):
        for l in range(0, k):
            sum[i][j][l] = A1[i][j][l] + A2[i][j][l]
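If you want to stay with plain lists, iterating over the elements with zip avoids index variables altogether, so a loop variable cannot be confused with a size. This is only a sketch: the helper name add_nested is hypothetical, and it assumes both inputs are nested lists of the same shape.

```python
def add_nested(a, b):
    """Element-wise sum of two equally shaped 3D nested lists (hypothetical helper)."""
    return [
        [
            [x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(plane_a, plane_b)
        ]
        for plane_a, plane_b in zip(a, b)
    ]


arr1 = [[[2, 4, 8], [7, 7, 1], [4, 9, 0]], [[5, 0, 0], [3, 8, 6], [0, 5, 8]]]
arr2 = [[[3, 8, 0], [1, 5, 2], [0, 3, 9]], [[9, 7, 7], [1, 2, 5], [1, 1, 3]]]

total = add_nested(arr1, arr2)
print(total[0][0])  # [5, 12, 8]
```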

How to efficiently repeat tensor elements a variable number of times in PyTorch?

For example, if I have a tensor A = [[1,1,1], [2,2,2], [3,3,3]] and B = [1,2,3], how do I get C = [[1,1,1], [2,2,2], [2,2,2], [3,3,3], [3,3,3], [3,3,3]], and how do I do this batch-wise?
My current element-wise solution btw (takes forever...):
def get_char_context(valid_embeds, words_lens):
    chars_contexts = []
    for ve, wl in zip(valid_embeds, words_lens):
        for idx, (e, l) in enumerate(zip(ve, wl)):
            if idx == 0:
                chars_context = e.view(1,-1).repeat(l, 1)
            else:
                chars_context = torch.cat((chars_context, e.view(1,-1).repeat(l, 1)),0)
        chars_contexts.append(chars_context)
    return chars_contexts
I'm doing this to add BERT word embeddings to a char-level seq2seq task...
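Use repeat_interleave:
import torch
A = torch.tensor([[1, 1, 1], [2, 2, 2], [3, 3, 3]])  # the tensor from the question
B = torch.tensor([1, 2, 3])  # number of repeats for each row of A
C = A.repeat_interleave(B, dim=0)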
EDIT:
The above works fine if A is a single 2D tensor. To repeat all (2D) tensors in a batch in the same manner, this is a simple workaround:
A = torch.tensor([[[1, 1, 1], [2, 2, 2], [3, 3, 3]],
                  [[1, 2, 3], [4, 5, 6], [2, 2, 2]]])  # A has 2 tensors, each of shape (3, 3)
B = torch.tensor([1, 2, 3])  # repeat count for each row of every tensor in the batch
A1 = A.reshape(-1, A.shape[2])  # flatten the batch into rows, shape (6, 3)
B1 = B.repeat(A.shape[0])  # tile the repeat counts once per tensor in the batch
C = A1.repeat_interleave(B1, dim=0).reshape(A.shape[0], -1, A.shape[2])
C is:
tensor([[[1, 1, 1],
[2, 2, 2],
[2, 2, 2],
[3, 3, 3],
[3, 3, 3],
[3, 3, 3]],
[[1, 2, 3],
[4, 5, 6],
[4, 5, 6],
[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]])
As you can see, each tensor in the batch is repeated in the same manner.
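When every tensor in the batch uses the same repeat counts, as in the example above, the reshape round trip may not be needed: repeat_interleave accepts a dim argument, so it can be applied along the row axis of the batch directly. A minimal sketch, assuming the same A and B as above:

```python
import torch

A = torch.tensor([[[1, 1, 1], [2, 2, 2], [3, 3, 3]],
                  [[1, 2, 3], [4, 5, 6], [2, 2, 2]]])  # batch of shape (2, 3, 3)
B = torch.tensor([1, 2, 3])  # one repeat count per row; length must equal A.shape[1]

# Repeat along dim=1 (the row axis inside each batch entry).
C = A.repeat_interleave(B, dim=1)

print(C.shape)  # torch.Size([2, 6, 3])
```

This does not cover the case where each batch entry needs different repeat counts; the rows would then have different lengths and would need padding or a list of tensors.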

Function to generate incremental weights based on np.select conditions

Objective: Define a function that uses flags (1, 2, 3) as conditions that trigger different weights (.2, .4, 0). The output is a new df containing only the weights.
np.select raises this error:
TypeError: invalid entry 0 in condlist: should be boolean ndarray
Image shows desired output as "incremental weight output"
import pandas as pd
import numpy as np
flags = pd.DataFrame({'Date': ['2020-01-01','2020-02-01','2020-03-01'],
'flag_1': [1, 2, 3],
'flag_2': [1, 1, 1],
'flag_3': [2, 1, 2],
'flag_4': [3, 1, 3],
'flag_5' : [1, 2, 2],
'flag_6': [2, 1, 2],
'flag_7': [1, 1, 1],
'flag_8': [1, 1, 1],
'flag_9': [3, 3, 2]})
flags = flags.set_index('Date')
def inc_weights(dfin, wt1, wt2, wt3):
    dfin = pd.DataFrame(dfin.iloc[:,::-1])
    dfout = pd.DataFrame()
    conditions = [1,2,3]
    choices = [wt1,wt2,wt3]
    dfout=np.select(conditions, choices, default=np.nan)
    return(dfout.iloc[:,::-1])
inc_weights = inc_weights(flags, .2, .4, 0)
print(inc_weights)
np.select is unnecessary here. A simple solution is to use df.replace with a mapping dict that maps each flag to its weight.
import pandas as pd
import numpy as np
flags = pd.DataFrame({'Date': ['2020-01-01','2020-02-01','2020-03-01'],
'flag_1': [1, 2, 3],
'flag_2': [1, 1, 1],
'flag_3': [2, 1, 2],
'flag_4': [3, 1, 3],
'flag_5' : [1, 2, 2],
'flag_6': [2, 1, 2],
'flag_7': [1, 1, 1],
'flag_8': [1, 1, 1],
'flag_9': [3, 3, 2]})
flags = flags.set_index('Date')
print(flags)
def inc_weights(dfin, wt1, wt2, wt3):
    dfin = pd.DataFrame(dfin.iloc[:,::-1])
    dfout = pd.DataFrame()
    mapping = {1:wt1,2:wt2,3:wt3}
    dfout=dfin.replace(mapping)
    return(dfout.iloc[:,::-1])
inc_weights = inc_weights(flags, .2, .4, 0)
print(inc_weights)
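For reference, the TypeError in the question comes from passing the integers 1, 2, 3 as the condition list: np.select expects each condition to be a boolean array, such as the result of comparing the flags with a value. The sketch below shows one way that could look. The function name weights_from_flags and the three-column sample are assumptions made for illustration, not part of the original post.

```python
import numpy as np
import pandas as pd

# Small sample with the same structure as the question's flags frame (assumption).
flags = pd.DataFrame(
    {"flag_1": [1, 2, 3], "flag_2": [1, 1, 1], "flag_3": [2, 1, 2]},
    index=pd.Index(["2020-01-01", "2020-02-01", "2020-03-01"], name="Date"),
)


def weights_from_flags(dfin, wt1, wt2, wt3):
    """Map flags 1/2/3 to weights with np.select (hypothetical helper)."""
    values = dfin.to_numpy()
    # Each condition is a boolean array with the same shape as the data.
    conditions = [values == 1, values == 2, values == 3]
    choices = [wt1, wt2, wt3]
    out = np.select(conditions, choices, default=np.nan)
    # np.select returns a plain ndarray, so restore the index and column labels.
    return pd.DataFrame(out, index=dfin.index, columns=dfin.columns)


weights = weights_from_flags(flags, 0.2, 0.4, 0)
print(weights)
```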

Merge two tensors in PyTorch

Tensor a:
tensor([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
Tensor b:
tensor([4,4,4,4])
Question 1:
How to merge two tensors and get result c:
tensor([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
Question 2: How to divide tensor c and get original a and b.
Question 1: Merge two tensors -
c = torch.cat((a, b.unsqueeze(1)), 1)
c
>>> tensor([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
First, we use torch.unsqueeze to add a dimension of size 1 to b, turning it from shape (4,) into shape (4, 1), so that it has the same number of dimensions as a. Then torch.cat concatenates a and b along dimension 1.
Question 2:
a = c[:, :-1]
a
>>> tensor([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
b = c[:, -1]
b
>>> tensor([4, 4, 4, 4])
You have to slightly modify tensor b:
a = torch.tensor([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
b = torch.tensor([4,4,4,4])
b = b.reshape(1, 4)
Then you get your "joined" tensor:
c = torch.cat((a, torch.t(b)), 1)
And to split it back (b1 has shape (1, 4), like the reshaped b):
a1 = c[:,:-1]
b1 = torch.t(c[:,-1:])
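Another way to undo the concatenation is torch.split, which takes the column widths of the pieces. The following is a sketch under the assumption that a and b are the tensors from the question; the names a2 and b2 are introduced only for this example.

```python
import torch

a = torch.tensor([[1, 2, 3]] * 4)   # shape (4, 3)
b = torch.tensor([4, 4, 4, 4])      # shape (4,)

# Merge: give b a trailing dimension so it becomes a (4, 1) column.
c = torch.cat((a, b.unsqueeze(1)), dim=1)  # shape (4, 4)

# Split back: the first a.shape[1] columns belong to a, the last one to b.
a2, b2 = torch.split(c, [a.shape[1], 1], dim=1)
b2 = b2.squeeze(1)  # back to shape (4,)

print(torch.equal(a2, a), torch.equal(b2, b))  # True True
```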

Idiom for getting contiguous copies

The documentation of numpy.broadcast_arrays introduces an idiom.
However, the idiom gives exactly the same output as the original command.
What is the meaning of "getting contiguous copies instead of non-contiguous views"?
https://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast_arrays.html
x = np.array([[1,2,3]])
y = np.array([[1],[2],[3]])
np.broadcast_arrays(x, y)
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
Here is a useful idiom for getting contiguous copies instead of non-contiguous views.
[np.array(a) for a in np.broadcast_arrays(x, y)]
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
To understand the difference, try writing into the new arrays.
Let's begin with the contiguous copies.
>>> import numpy as np
>>> x = np.array([[1,2,3]])
>>> y = np.array([[1],[2],[3]])
>>>
>>> xc, yc = [np.array(a) for a in np.broadcast_arrays(x, y)]
>>> xc
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
We can modify an element and nothing unexpected will happen.
>>> xc[0, 0] = 0
>>> xc
array([[0, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> x
array([[1, 2, 3]])
Now, let's try the same with the broadcasted arrays:
>>> xb, yb = np.broadcast_arrays(x, y)
>>> xb
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
Although we only write to the top left element ...
>>> xb[0, 0] = 0
... the entire left column will change ...
>>> xb
array([[0, 2, 3],
[0, 2, 3],
[0, 2, 3]])
... and also the input array.
>>> x
array([[0, 2, 3]])
This means that broadcast_arrays does not create entirely new objects. It returns views of the original arrays, so the results share memory with the inputs and are generally not contiguous. Calling np.array(a) on each view, as the idiom does, makes a new copy that owns its own contiguous block of memory.
You can check this as follows:
arr = np.broadcast_arrays(x, y)
In [144]: arr
Out[144]:
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
In [145]: x
Out[145]: array([[1, 2, 3]])
In [146]: arr[0][0] = 0
In [147]: arr
Out[147]:
[array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
In [148]: x
Out[148]: array([[0, 0, 0]])
As you can see, writing to arr changes both its own elements and the original x array.
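The difference can also be inspected without writing to anything, which may be preferable because newer NumPy versions warn about writing to broadcast results. A small sketch using np.shares_memory, strides, and the contiguity flags:

```python
import numpy as np

x = np.array([[1, 2, 3]])
y = np.array([[1], [2], [3]])

# Views returned by broadcast_arrays, and copies made with the idiom.
xb, yb = np.broadcast_arrays(x, y)
xc, yc = [np.array(a) for a in np.broadcast_arrays(x, y)]

print(np.shares_memory(xb, x))   # True: xb is a view of x
print(np.shares_memory(xc, x))   # False: xc is an independent copy

# A stride of 0 means every row of xb points at the same memory.
print(xb.strides[0])             # 0

# The view is not contiguous in either memory order.
print(xb.flags["C_CONTIGUOUS"], xb.flags["F_CONTIGUOUS"])  # False False

# The copy is contiguous (in C or Fortran order).
print(xc.flags["C_CONTIGUOUS"] or xc.flags["F_CONTIGUOUS"])  # True
```

If C order specifically is required, np.ascontiguousarray(xb) is an alternative to np.array(xb).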
