I have the array below:
import numpy as np
a = np.array([[7412, 33, 2],
[2, 7304, 83],
[3, 101, 7237]])
I would like to extract only the lower off-diagonal elements (those strictly below the main diagonal) from the array above and put them in a vector.
I tried np.extract(~a, a), but it extracts all elements.
The desired output for the example above is [2, 3, 101].
Any insight would be helpful.
You can use np.tril_indices or np.tri:
import numpy as np
a = np.array([[7412, 33, 2],
[2, 7304, 83],
[3, 101, 7237]])
n, m = a.shape
# Option 1
out = a[ np.tril_indices(n=n, k=-1, m=m) ]
# Option 2 (should have equivalent output)
out = a[ np.tri(N=n, M=m, k=-1, dtype=bool) ]
out:
array([ 2, 3, 101])
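As for why np.extract(~a, a) returned everything: on an integer array, ~ is bitwise NOT rather than a zero/nonzero test, so none of the entries of ~a is zero here and every element counts as selected. The sketch below checks that, and also shows a third way to build the mask by comparing row and column indices (the names rows, cols and lower are only for illustration):

```python
import numpy as np

a = np.array([[7412, 33, 2],
              [2, 7304, 83],
              [3, 101, 7237]])

# On integers, ~ is bitwise NOT (~x == -x - 1), so no entry becomes 0 here.
everything = np.extract(~a, a)
print(everything.size == a.size)  # all elements were selected

# Boolean mask that is True strictly below the main diagonal (row > column).
rows, cols = np.indices(a.shape)
lower = a[rows > cols]
print(lower)
```

The mask is read in row-major order, so the elements come out in the same order as with np.tril_indices.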
I want to add a constant to all the elements of a matrix except the diagonal elements.
e.g., matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Desired output: 10 added to every element except the diagonal elements
matrix = np.array([[1, 12, 13],
[14, 5, 16],
[17, 18, 9]])
How can I exclude the diagonal elements from this operation?
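I would add the number everywhere, then subtract an identity matrix multiplied by that number, like this:
import numpy as np
x = 10  # number to add
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
# Add x to every element, then remove it again on the diagonal.
# dtype=int keeps the result an integer array (np.identity defaults to float).
matrix2 = matrix + x - (np.identity(len(matrix), dtype=int) * x)
print(matrix2)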
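Another option, sketched here as an alternative rather than as part of the answer above, is a boolean mask that is False on the diagonal. The names off_diagonal and result are only for illustration:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# True everywhere except on the main diagonal.
off_diagonal = ~np.eye(len(matrix), dtype=bool)

# Work on a copy so the original matrix is left unchanged.
result = matrix.copy()
result[off_diagonal] += 10
print(result)
```

This keeps the integer dtype of the input and avoids building a second numeric matrix just to cancel the diagonal.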
I am trying to roll only the first n elements from my numpy axis instead of all. However, I am at a loss on how to accomplish this.
import numpy as np
foo = np.random.rand(32,3,16,16)
#Foo is a batch of 32 images, with 3 channels and a height, width of 16
print("Foo Shape = ", foo.shape)
#Foo Shape = (32, 3, 16, 16)
I would like to roll only the first element along the second axis by 1 step, i.e., roll the first channel of each image by 1.
np.roll(foo, 1, 1)
The code above rolls all the elements of the second axis (channel dimension) by 1, instead of just rolling the first element. I couldn't find any numpy functionality that helps with this issue.
Select only the elements you want with a slice, roll them, and assign the result back:
>>> import numpy as np
>>> arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> print(arr)
[[[1 2]
[3 4]]
[[5 6]
[7 8]]]
>>> arr[:, 0] = np.roll(arr[:, 0], shift=1, axis=1)
>>> print(arr)
[[[2 1]
[3 4]]
[[6 5]
[7 8]]]
>>>
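Applied to the batch from the question, the same idea might look like the sketch below. It assumes that "roll by 1" means shifting along the width (last) axis; use axis=-2 to shift along the height instead. The seeded generator and the original copy are only there to make the check reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)
foo = rng.random((32, 3, 16, 16))  # batch, channels, height, width
original = foo.copy()

# foo[:, 0] selects channel 0 of every image, shape (32, 16, 16).
# Assumption: the roll is along the width axis.
foo[:, 0] = np.roll(foo[:, 0], shift=1, axis=-1)

# Channels 1 and 2 are untouched.
print(np.array_equal(foo[:, 1:], original[:, 1:]))
# In channel 0 every column has moved one position to the right.
print(np.array_equal(foo[:, 0, :, 1:], original[:, 0, :, :-1]))
```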
To feed data into an LSTM network that predicts remaining useful life (RUL), I need to create a 3D numpy array of shape (number of machines, number of sequences, number of variables).
I tried to combine solutions from Stack Overflow and managed to create a prototype (shown below).
import numpy as np
import tensorflow as tf
import pandas as pd
df = pd.DataFrame({'ID': [1, 1, 2, 3, 3, 3, 3],
'V1': [1, 2, 2, 3, 3, 4, 2],
'V2': [4, 2, 3, 2, 1, 5, 1],
})
df_desired_result = np.array([[[1, 4], [2, 2], [-99, -99]],
[[2, 3], [-99, -99], [-99, -99]],
[[3, 2], [3, 1], [4, 5]]])
max_len = df['ID'].value_counts().max()
def pad_df(df, cols, max_seq, group_col= 'ID'):
array_for_pad = np.array(list(df[cols].groupby(df[group_col]).apply(pd.DataFrame.as_matrix)))
padded_array = tf.keras.preprocessing.sequence.pad_sequences(array_for_pad,
padding='post',
maxlen=max_seq,
value=-99
)
return padded_array
#testing prototype
pad_df(df, ['V1', 'V2'], max_len)
But when I apply the code above to my data, the right-padding is applied correctly, yet all the values are set to 0.0.
I can't fully figure out this behaviour. I did notice that the first line of my function returns an array of nested arrays for 'array_for_pad'.
Here is a screenshot of the result: [image: result padding]
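One thing that may be worth checking, although the question does not confirm it, is that pad_sequences defaults to dtype='int32', so float inputs between 0 and 1 would be truncated to 0 unless a float dtype is passed. Independently of that, the padding can be sketched with pandas and NumPy alone. The helper name pad_groups is hypothetical, and with the sample data the longest group has four rows, so the result has shape (3, 4, 2):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 3, 3, 3, 3],
                   'V1': [1, 2, 2, 3, 3, 4, 2],
                   'V2': [4, 2, 3, 2, 1, 5, 1]})

def pad_groups(df, cols, group_col='ID', pad_value=-99):
    """Stack each group's rows into one 3D array, right-padded with pad_value."""
    # One 2D array per group, in sorted order of the group key.
    groups = [g[cols].to_numpy() for _, g in df.groupby(group_col)]
    max_len = max(len(g) for g in groups)
    # Start from an array full of the padding value, keeping the data dtype.
    out = np.full((len(groups), max_len, len(cols)), pad_value,
                  dtype=groups[0].dtype)
    for i, g in enumerate(groups):
        out[i, :len(g)] = g  # copy the real rows, leave the rest padded
    return out

padded = pad_groups(df, ['V1', 'V2'])
print(padded.shape)
print(padded)
```

Because the output array takes its dtype from the data, float columns stay float instead of being cast to integers.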
I have a pandas dataframe in which one column contains two-dimensional vectors. I would like to group by one of the other columns and add the vectors up.
I have tried groupby followed by sum, as shown in the code below, but the result concatenates the vectors instead of adding them element-wise (as np.add would).
import pandas as pd
data = pd.DataFrame({'label': ['A', 'B', 'A'], 'label2' : ['X', 'Y', 'Z'],
'output' : [[[1,2,3,4],[5,6,7,8]] ,[[9,10,11,12],[13,14,15,16]],[[17,18,19,20],[21,22,23,24]]] })
data_grouped = data.groupby('label')['output'].sum()
I would like to group by 'label' and have the outputs aggregated. Since each output is a two-dimensional vector, I would like the vectors to be added, not combined. Therefore, my expectation is to have:
label A: output is [[18,20,22,24],[26,28,30,32]]
label B: output is [[9,10,11,12],[13,14,15,16]]
but I am getting:
label A: [[1, 2, 3, 4], [5, 6, 7, 8], [17, 18, 19, 20],[21,22,23,24]]
label B: [[9, 10, 11, 12], [13, 14, 15, 16]]
The solution
import pandas as pd
import numpy as np
data = pd.DataFrame({'label': ['A', 'B', 'A'], 'label2' : ['X', 'Y', 'Z'],
'output' : [[[1,2,3,4],[5,6,7,8]] ,[[9,10,11,12],[13,14,15,16]],[[17,18,19,20],[21,22,23,24]]] })
data['output'] = data['output'].map(np.array)
data_grouped = data[['label', 'output']].groupby('label').sum()
print(data_grouped)
>>> output
>>> label
>>> A [[18, 20, 22, 24], [26, 28, 30, 32]]
>>> B [[9, 10, 11, 12], [13, 14, 15, 16]]
The explanation
Your output column contains Python lists. The + operator on two lists concatenates them:
print([1, 2] + [3, 4])
>>> [1, 2, 3, 4]
print([[1], [2]] + [[3], [4]])
>>> [[1], [2], [3], [4]]
data['output'].map(np.array) turns your 2D lists into 2D numpy arrays. The + operator on numpy arrays (which is what sum() uses) adds the values element-wise, i.e., the values at the same position in both arrays.
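The difference can be checked directly on the two 'A' rows from the question. This is a small sketch, and the variable names are only for illustration:

```python
import numpy as np

first = [[1, 2, 3, 4], [5, 6, 7, 8]]          # row with label2 == 'X'
third = [[17, 18, 19, 20], [21, 22, 23, 24]]  # row with label2 == 'Z'

# Lists: + concatenates, giving four inner lists.
concatenated = first + third
print(concatenated)

# Arrays: + adds element-wise, keeping the (2, 4) shape.
summed = np.array(first) + np.array(third)
print(summed)
```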
I have a 2D tensor and an index tensor. The 2D tensor has a batch dimension and a dimension with 3 values. The index tensor selects exactly 1 of the 3 values in each row. What is the "best" way to produce a slice containing just the elements selected by the index tensor?
t = torch.tensor([[1,2,3], [4,5,6], [7,8,9]])
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
i = torch.tensor([0,0,1], dtype=torch.int64)
tensor([0, 0, 1])
Expected output...
tensor([1, 4, 8])
One way is to index with a list of row indices and a list of column indices at the same time:
import torch
t = torch.tensor([[1,2,3], [4,5,6], [7,8,9]])
col_i = [0, 0, 1]
row_i = range(3)
print(t[row_i, col_i])
# tensor([1, 4, 8])
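If the index is already a tensor, as in the question, torch.gather is another option. The sketch below is an alternative to the indexing above, not a replacement for it, and the names picked and same are only for illustration:

```python
import torch

t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
i = torch.tensor([0, 0, 1], dtype=torch.int64)

# gather expects an index with the same number of dimensions as t,
# so turn i into a column of shape (3, 1), then drop that axis again.
picked = t.gather(1, i.unsqueeze(1)).squeeze(1)
print(picked)

# The same selection with integer array indexing on tensors.
same = t[torch.arange(t.shape[0]), i]
print(same)
```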