I have a problem in Python. I am creating two NumPy arrays from dict entries, and I want to join those two arrays in a specific way, like this:
# create array with classes
probVec = filePickle['classID']
a = np.empty([0, 1])
for x in np.nditer(probVec):
    a = np.append(a, x)
timeVec = filePickle['start']
timeVec = np.asarray(timeVec)
b = np.empty([0, 1])
for x in np.nditer(timeVec):
    b = np.append(b, x)
# create input-vectors for clustering
c = np.vstack((b, a)).transpose()
Now I want to join them in a more specific way: take only specific items of the array "probVec" and join them with the corresponding entries of the array "timeVec", like this:
for x in np.nditer(probVec):
    if x == 3.0:
        a = np.append(a, x)
for x in np.nditer(timeVec):
    # pseudocode: append to b the x values that have the same indices
    # as the ones appended in the first loop
    b = ...
Because both arrays contain values corresponding to each other they have the same length. So my goal is something like this:
probVec = [2.0, 1.0, 3.0, 3.0, 4.0, 3.0...]
timeVec = [t1, t2, t3, t4, t5, t6...]
c = [[3.0 t3]
     [3.0 t4]
     [3.0 t6]
     ...]
I just don't know what's the best way to realize that.
Using a comparison operator on an array, like a == 3.0, you get a boolean array that can be used for indexing, selecting the rows where the condition is true.
In [87]: a = np.random.randint(low=1, high=4, size=10) # example data
In [88]: a
Out[88]: array([3, 1, 3, 1, 1, 3, 2, 2, 2, 2])
In [89]: b = np.arange(10)
In [90]: c = np.column_stack((a, b))
In [91]: c[a == 3]
Out[91]:
array([[3, 0],
       [3, 2],
       [3, 5]])
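Applied to the arrays from the question, the same boolean-mask idea could look like the sketch below. The values in probVec and timeVec are made-up stand-ins for filePickle['classID'] and filePickle['start'], which are not shown in the question; the name mask is also just illustrative.

```python
import numpy as np

# Hypothetical stand-ins for filePickle['classID'] and filePickle['start'].
probVec = np.array([2.0, 1.0, 3.0, 3.0, 4.0, 3.0])
timeVec = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

# Boolean array that is True wherever the class equals 3.0.
mask = probVec == 3.0

# Use the same mask on both arrays so the rows stay aligned,
# then stack the selected values as two columns.
c = np.column_stack((probVec[mask], timeVec[mask]))
print(c)
```

Because one mask indexes both arrays, no explicit loop or index bookkeeping is needed, and the np.append loops from the question can be dropped entirely.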
I have a CSV file that I read using pandas. I would like to make a comparison between some of the columns and then use the outcome of the comparison to make a decision. An example of the data is shown below.
A  B             C           D
6  [5, 3, 4, 1]  -4.2974843  [-5.2324843, -5.2974843, -6.2074043, -6.6974803]
2  [3, 6, 4, 7]  -6.4528433  [-6.2324843, -7.0974845, -7.2034041, -7.6974804]
3  [6, 2, 4, 5]  -3.5322451  [-4.3124440, -4.9073840, -5.2147042, -6.1904800]
1  [4, 3, 6, 2]  -5.9752843  [-5.2324843, -5.2974843, -6.2074043, -6.6974803]
7  [2, 3, 4, 1]  -1.2974652  [-3.1232843, -4.2474643, -5.2074043, -6.1994802]
5  [1, 3, 7, 2]  -9.884843   [-8.0032843, -8.0974843, -9.2074043, -9.6904603]
4  [7, 3, 1, 4]  -2.3984843  [-7.2324843, -8.2094845, -9.2044013, -9.7914001]
Here is the code I am using:
n_A = data['A']
n_B = data['B']
n_C = data['C']
n_D = data['D']
result_compare = []
for w, e in enumerate(n_A):
    for ro, ver in enumerate(n_B):
        for row, m in enumerate(n_C):
            for r, t in enumerate(n_D):
                if ro == w:
                    if r == row:
                        if row == ro:
                            if r == 0:
                                if t[r] > m:
                                    b = ver[r]
                                    result_compare.append(b)
                                else:
                                    b = e
                                    result_compare.append(b)
                            elif r >= 0:
                                q = r - r
                                if t[q] > m:
                                    b = ver[q]
                                    result_compare.append(b)
                                else:
                                    b = e
                                    result_compare.append(b)
I only needed the columns required for the comparison, which is why I selected them like this:
n_A = data['A']
n_B = data['B']
n_C = data['C']
n_D = data['D']
The expected result is:
result_compare = [6, 3, 3, 4, 7, 1, 4]
The values in D are arranged in descending order, which is why the first element of each list is the one that gets compared. When the first element of the list in D is greater than the value in C, we choose the first element of the list in B; otherwise we choose A. I would like an efficient way to do this, since my code takes a long time to produce results, especially on large data.
I would do this in your case:
data['newRow'] = data.apply(lambda row: row["B"][0] if row["D"][0] > row["C"] else row['A'], axis=1)
And if you need it as a list by the end:
list(data['newRow'])
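As a rough check, the sketch below runs that rule on the sample rows from the question. It assumes that the list-like cells in B and D arrive as strings (which is typically what pd.read_csv gives for such cells) and parses them with ast.literal_eval first; if your columns already hold real lists, that step can be skipped.

```python
import ast
import pandas as pd

# Sample rows copied from the question. B and D are stored as strings on the
# assumption that this is how list-like cells come back from pd.read_csv.
data = pd.DataFrame({
    "A": [6, 2, 3, 1, 7, 5, 4],
    "B": ["[5, 3, 4, 1]", "[3, 6, 4, 7]", "[6, 2, 4, 5]", "[4, 3, 6, 2]",
          "[2, 3, 4, 1]", "[1, 3, 7, 2]", "[7, 3, 1, 4]"],
    "C": [-4.2974843, -6.4528433, -3.5322451, -5.9752843,
          -1.2974652, -9.884843, -2.3984843],
    "D": ["[-5.2324843, -5.2974843, -6.2074043, -6.6974803]",
          "[-6.2324843, -7.0974845, -7.2034041, -7.6974804]",
          "[-4.3124440, -4.9073840, -5.2147042, -6.1904800]",
          "[-5.2324843, -5.2974843, -6.2074043, -6.6974803]",
          "[-3.1232843, -4.2474643, -5.2074043, -6.1994802]",
          "[-8.0032843, -8.0974843, -9.2074043, -9.6904603]",
          "[-7.2324843, -8.2094845, -9.2044013, -9.7914001]"],
})

# Turn the string cells into real Python lists.
for col in ("B", "D"):
    data[col] = data[col].apply(ast.literal_eval)

# Row-wise rule from the question: B[0] if D[0] > C, otherwise A.
data["newRow"] = data.apply(
    lambda row: row["B"][0] if row["D"][0] > row["C"] else row["A"], axis=1
)

result_compare = data["newRow"].tolist()
print(result_compare)  # [6, 3, 3, 4, 7, 1, 4]
```

This makes a single pass over the rows instead of four nested loops over every column, which is where the original code spends its time.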
In PyTorch, the unique operation (with return_counts=True) works like this:
[1,1,2,2,3,3] => ([1,2,3],[2,2,2])
Is there a reverse operation for torch.unique()?
That is, given the unique values and their counts, return the original list, like this:
([1,2,3],[2,2,2]) => [1,1,2,2,3,3]
If you include the return_inverse parameter, unique also returns, for each element of the original input, the index where it ended up in the returned unique list. You can then use take to create a new tensor.
test_tensor = torch.tensor([1,1,2,2,3,3])
items, inverse, counts = test_tensor.unique(return_counts=True, return_inverse=True)
new_tensor = torch.take(items, inverse)
assert new_tensor.eq(test_tensor).all() # true
EDIT:
If you only have the unique values and the counts, this code should give you what you want using repeat. I am not sure whether a pure PyTorch function exists.
test_tensor = torch.tensor([1,1,2,2,3,3])
items, counts = test_tensor.unique(return_counts=True)
new_tensor = torch.tensor([], dtype=torch.int64)
for item, count in zip(items, counts):
    new_tensor = torch.cat((new_tensor, item.repeat(count)))
assert new_tensor.eq(test_tensor).all() # true
You probably want torch.repeat_interleave(). You can use it like this:
>>> x = torch.tensor([1, 1, 2, 3, 3, 3])
>>> v, c = torch.unique(x, return_counts=True)
>>> v, c
(tensor([1, 2, 3]), tensor([2, 1, 3]))
>>> torch.repeat_interleave(v, c)
tensor([1, 1, 2, 3, 3, 3])
I have a pair of 2D NumPy arrays, something like this:
(array([[7.948829 , 3.7127783, 3.6365926, 3.4607997]], dtype=float32),
 array([[ 5, 15,  7, 39]]))
The first array holds distances and the second holds indices. Is there a way to filter the first array by a certain threshold and also delete the corresponding entries from the index array?
Do you mean like this?
import numpy as np
a = np.array([7.948829 , 3.7127783, 3.6365926, 3.4607997])
b = np.array([ 5, 15, 7, 39])
c = b[a>3.7]
d = a[a>3.7]
print(f'c = \n{c}')
print(f'd = \n{d}')
output:
c =
[ 5 15]
d =
[7.948829 3.7127783]
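The same approach also seems to work on the 2D, shape (1, 4) arrays from the question. The sketch below computes the mask once and reuses it for both arrays; the names distances, indices, threshold and mask are illustrative, and 3.7 is just the example threshold used above.

```python
import numpy as np

# Arrays as shown in the question (each has shape (1, 4)).
distances = np.array([[7.948829, 3.7127783, 3.6365926, 3.4607997]],
                     dtype=np.float32)
indices = np.array([[5, 15, 7, 39]])

threshold = 3.7  # example value, not from the question

# One boolean mask, applied to both arrays so they stay aligned.
mask = distances > threshold
kept_distances = distances[mask]
kept_indices = indices[mask]

print(kept_indices)
print(kept_distances)
```

Note that boolean indexing flattens the result to 1D, so kept_distances and kept_indices each have shape (2,) here rather than (1, 2).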
I have a numpy array:
a = np.array([0, 1, 2, 3, 4, 5])
I would like to iterate over each element and compute a statistic using all the elements except the one at the current index. Computing the statistic requires looping over each element.
# preallocate one slot per element; assigning into an empty array would raise IndexError
stats = np.zeros(len(a))
for ind1, x in enumerate(a):
    s = 0
    for ind2, y in enumerate(a):
        if ind2 != ind1:
            s = s + compute_stat(y)
    stats[ind1] = s
It looks like I might mask the current index, however, I would then need to reset the mask within each loop.
a = np.array([0, 1, 2, 3, 4, 5])
a = np.ma.array(a)
for ind1, x in enumerate(a):
    a[ind1] = np.ma.masked
What is the best way to use a nested loop in this case to iterate a computation over (n-1) elements of the same array?
Thank you
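One possible way to avoid the nested loop, sketched under assumptions: compute_stat is not defined in the question, so a hypothetical squaring function stands in for it here. If the statistic really is a sum of per-element terms, as in the loop above, each result is the total minus the element's own term. For statistics that do not decompose like that, np.delete can drop the current index instead.

```python
import numpy as np

def compute_stat(y):
    # Hypothetical placeholder for the undefined function in the question.
    return y ** 2

a = np.array([0, 1, 2, 3, 4, 5])

# The loop sums compute_stat over every other element, so each result
# equals the grand total minus the element's own contribution.
per_element = compute_stat(a)
stats = per_element.sum() - per_element

# More general fallback: remove index i with np.delete, then compute
# the statistic on the remaining n-1 elements.
stats_general = np.array(
    [compute_stat(np.delete(a, i)).sum() for i in range(len(a))]
)

print(stats)          # [55 54 51 46 39 30]
print(stats_general)  # [55 54 51 46 39 30]
```

The first form needs no masking and no resetting between iterations; the second still loops over indices but leaves the original array untouched.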
I have a theano tensor and I would like to clip its values, but each index to a different range.
For example, if I have a vector [a,b,c] , I want to clip a to [0,1] , clip b to [2,3] and c to [3,5].
How can I do that efficiently?
Thanks!
The theano.tensor.clip operation supports symbolic minimum and maximum values so you can pass three tensors, all of the same shape, and it will perform an element-wise clip of the first with respect to the second (minimum) and third (maximum).
This code shows two variations on this theme. v1 requires the minimum and maximum values to be passed as separate vectors while v2 allows the minimum and maximum values to be passed more like a list of pairs, represented as a two column matrix.
import theano
import theano.tensor as tt

def v1():
    # minimum and maximum values passed as two separate vectors
    x = tt.vector()
    min_x = tt.vector()
    max_x = tt.vector()
    y = tt.clip(x, min_x, max_x)
    f = theano.function([x, min_x, max_x], outputs=y)
    print(f([2, 1, 4], [0, 2, 3], [1, 3, 5]))

def v2():
    # minimum and maximum values passed as the columns of one matrix
    x = tt.vector()
    min_max = tt.matrix()
    y = tt.clip(x, min_max[:, 0], min_max[:, 1])
    f = theano.function([x, min_max], outputs=y)
    print(f([2, 1, 4], [[0, 1], [2, 3], [3, 5]]))

def main():
    v1()
    v2()

main()
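For readers without Theano installed, the element-wise semantics can be illustrated with a NumPy analogue. This is not Theano code: it swaps in np.clip, which likewise accepts array-valued bounds, and the names x, min_x, max_x and min_max simply mirror the ones above.

```python
import numpy as np

x = np.array([2.0, 1.0, 4.0])

# Variation 1: separate minimum and maximum vectors.
min_x = np.array([0.0, 2.0, 3.0])
max_x = np.array([1.0, 3.0, 5.0])
y1 = np.clip(x, min_x, max_x)

# Variation 2: a two-column matrix of (min, max) pairs, one row per element.
min_max = np.array([[0.0, 1.0], [2.0, 3.0], [3.0, 5.0]])
y2 = np.clip(x, min_max[:, 0], min_max[:, 1])

print(y1)  # [1. 2. 4.]
print(y2)  # [1. 2. 4.]
```

Each element is clipped against its own pair of bounds: 2 is pulled down to 1, 1 is pushed up to 2, and 4 already lies inside [3, 5].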