I need to make a function which would compare each value in a list and then set each value accordingly. Code follows:
actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
def treshold(element, value):
    if element >= value:
        element == 1
    else:
        element == 0
treshold(actions, 0.5)
This code however results in the following error:
TypeError: '>=' not supported between instances of 'list' and 'float'
I understand what this error says, however I do not know how to fix that.
A compact way of doing this, as pointed out by user202729, is with a list comprehension. The key is that you need to apply the comparison to each entry in the list. If you want to operate on the whole list at once, you could consider using numpy.
actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
def treshold(element, value):
    # Compare every entry of the list that was passed in
    thresholded_list = [int(a >= value) for a in element]
    return thresholded_list
This function is essentially shorthand for
def treshold_long(element_list, value):
    thresholded_list = []
    for element in element_list:
        if element >= value:
            thresholded_list.append(1)
        else:
            thresholded_list.append(0)
    return thresholded_list
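The numpy route mentioned above could look roughly like the following. This is only a sketch: `threshold_array` is a name made up for illustration, and it assumes the input can be converted to a numeric array.

```python
import numpy as np

def threshold_array(values, cutoff):
    # Hypothetical helper: compare the whole array at once, then turn
    # the boolean mask into 0/1 integers
    arr = np.asarray(values)
    return (arr >= cutoff).astype(int)

actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
print(threshold_array(actions, 0.5))  # [0 0 0 1 0 0 1 0 0]
```

The comparison is applied elementwise, so no explicit loop is needed.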
Thanks to user202729 I have discovered list comprehensions.
actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
treshold = 0.5
actions = [1 if i>=treshold else 0 for i in actions]
print(actions)
This basically solves my problem. I also thank user3235916 for a valid function.
I have a matrix (2d numpy ndarray, to be precise):
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
And I want to roll each row of A independently, according to roll values in another array:
r = np.array([2, 0, -1])
That is, I want to do this:
print(np.array([np.roll(row, x) for row, x in zip(A, r)]))
[[0 0 4]
[1 2 3]
[0 5 0]]
Is there a way to do this efficiently? Perhaps using fancy indexing tricks?
Sure, you can do it using advanced indexing. Whether it is the fastest way probably depends on your array size (if your rows are large, it may not be):
rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
# Use always a negative shift, so that column_indices are valid.
# (could also use the modulo operation)
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:, np.newaxis]
result = A[rows, column_indices]
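The modulo operation mentioned in the comment can be sketched as follows. This variant leaves `r` untouched instead of modifying it in place; `roll_rows_modulo` is a hypothetical name used only for this illustration.

```python
import numpy as np

def roll_rows_modulo(A, r):
    # Hypothetical variant: wrap the column indices with % so that
    # negative shifts are handled without changing r
    n_cols = A.shape[1]
    rows, cols = np.ogrid[:A.shape[0], :n_cols]
    cols = (cols - np.asarray(r)[:, np.newaxis]) % n_cols
    return A[rows, cols]

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
print(roll_rows_modulo(A, r))
```

For the example above this should give the same rows as calling `np.roll` on each row separately.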
numpy.lib.stride_tricks.as_strided stricks (abbrev pun intended) again!
Speaking of fancy indexing tricks, there's the infamous np.lib.stride_tricks.as_strided. The idea is to take a slice from the first column up to the second-to-last one and concatenate it at the end of the array. This ensures that we can stride in the forward direction as needed to leverage np.lib.stride_tricks.as_strided, and thus avoid the need to actually roll back. That's the whole idea!
Now, in terms of the actual implementation, we would use scikit-image's view_as_windows to elegantly use np.lib.stride_tricks.as_strided under the hood. Thus, the final implementation would be -
from skimage.util.shape import view_as_windows as viewW
def strided_indexing_roll(a, r):
    # Concatenate with sliced to cover all rolls
    a_ext = np.concatenate((a,a[:,:-1]),axis=1)
    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = a.shape[1]
    return viewW(a_ext,(1,n))[np.arange(len(r)), (n-r)%n,0]
Here's a sample run -
In [327]: A = np.array([[4, 0, 0],
...: [1, 2, 3],
...: [0, 0, 5]])
In [328]: r = np.array([2, 0, -1])
In [329]: strided_indexing_roll(A, r)
Out[329]:
array([[0, 0, 4],
[1, 2, 3],
[0, 5, 0]])
Benchmarking
# @seberg's solution
def advindexing_roll(A, r):
    rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
    r[r < 0] += A.shape[1]
    column_indices = column_indices - r[:,np.newaxis]
    return A[rows, column_indices]
Let's do some benchmarking on an array with a large number of rows and columns -
In [324]: np.random.seed(0)
...: a = np.random.rand(10000,1000)
...: r = np.random.randint(-1000,1000,(10000))
# @seberg's solution
In [325]: %timeit advindexing_roll(a, r)
10 loops, best of 3: 71.3 ms per loop
# Solution from this post
In [326]: %timeit strided_indexing_roll(a, r)
10 loops, best of 3: 44 ms per loop
In case you want a more general solution (dealing with any shape and with any axis), I modified @seberg's solution:
def indep_roll(arr, shifts, axis=1):
    """Apply an independent roll for each dimension of a single axis.

    Parameters
    ----------
    arr : np.ndarray
        Array of any shape.
    shifts : np.ndarray
        How many shifting to use for each dimension. Shape: `(arr.shape[axis],)`.
    axis : int
        Axis along which elements are shifted.
    """
    arr = np.swapaxes(arr,axis,-1)
    # Copy the grids into a list so the last one can be replaced
    # (ogrid may return a tuple)
    all_idcs = list(np.ogrid[tuple(slice(0,n) for n in arr.shape)])
    # Convert to a positive shift
    shifts[shifts < 0] += arr.shape[-1]
    all_idcs[-1] = all_idcs[-1] - shifts[:, np.newaxis]
    result = arr[tuple(all_idcs)]
    arr = np.swapaxes(result,-1,axis)
    return arr
I implement a pure numpy.lib.stride_tricks.as_strided solution as follows
from numpy.lib.stride_tricks import as_strided
def custom_roll(arr, r_tup):
    m = np.asarray(r_tup)
    arr_roll = arr[:, [*range(arr.shape[1]),*range(arr.shape[1]-1)]].copy() #need `copy`
    strd_0, strd_1 = arr_roll.strides
    n = arr.shape[1]
    result = as_strided(arr_roll, (*arr.shape, n), (strd_0 ,strd_1, strd_1))
    return result[np.arange(arr.shape[0]), (n-m)%n]
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
out = custom_roll(A, r)
Out[789]:
array([[0, 0, 4],
[1, 2, 3],
[0, 5, 0]])
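If raw as_strided feels too easy to get wrong, the same windowing idea can be sketched with `numpy.lib.stride_tricks.sliding_window_view`. This assumes NumPy 1.20 or newer, where that function is available; `window_roll` is a hypothetical name chosen for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_roll(a, r):
    n = a.shape[1]
    # Extend each row so that every rolled version appears as a window
    a_ext = np.concatenate((a, a[:, :-1]), axis=1)
    # windows has shape (rows, n, n): one length-n window per start column
    windows = sliding_window_view(a_ext, n, axis=1)
    # Pick, for each row, the window whose start matches the requested roll
    return windows[np.arange(a.shape[0]), (n - np.asarray(r)) % n]

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
print(window_roll(A, r))
```

The windows are a read-only view, and the final advanced-indexing step returns a fresh array, so the input is not modified.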
By using a fast Fourier transform we can apply a transformation in the frequency domain and then use the inverse fast Fourier transform to obtain the row shift.
So this is a pure numpy solution that takes only one line:
import numpy as np
from numpy.fft import fft, ifft
# The row shift function using the fast Fourier transform
# rshift(A,r) where A is a 2D array, r the row shift vector
def rshift(A,r):
    return np.real(ifft(fft(A,axis=1)*np.exp(2*1j*np.pi/A.shape[1]*r[:,None]*np.r_[0:A.shape[1]][None,:]),axis=1).round())
This will apply a left shift, but we can simply negate the exponent of the exponential to turn the function into a right shift function:
ifft(fft(...)*np.exp(-2*1j...)
It can be used like this:
# Example:
A = np.array([[1,2,3,4],
[1,2,3,4],
[1,2,3,4]])
r = np.array([1,-1,3])
print(rshift(A,r))
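One way to convince yourself of the shift direction is to compare the FFT approach against np.roll with negated shifts. The sketch below restates the one-liner in a more spread-out form under the made-up name `fft_left_shift`; because of the rounding step it is only meant for integer-valued data.

```python
import numpy as np
from numpy.fft import fft, ifft

def fft_left_shift(A, r):
    n = A.shape[1]
    # Multiplying frequency k of row i by exp(2j*pi*k*r[i]/n) shifts
    # that row to the left by r[i] positions
    phase = np.exp(2j * np.pi / n * r[:, None] * np.arange(n)[None, :])
    shifted = ifft(fft(A, axis=1) * phase, axis=1)
    # Drop the (numerically tiny) imaginary part and round off float noise
    return np.real(shifted).round()

A = np.array([[1, 2, 3, 4],
              [1, 2, 3, 4],
              [1, 2, 3, 4]])
r = np.array([1, -1, 3])
expected = np.array([np.roll(row, -s) for row, s in zip(A, r)])
print(np.array_equal(fft_left_shift(A, r), expected))  # True
```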
Building on divakar's excellent answer, you can easily apply this logic to a 3D array (which was the problem that brought me here in the first place). Here's an example: basically, flatten your data, roll it, and reshape it afterwards:
def applyroll_30(cube, threshold=25, offset=500):
    flattened_cube = cube.copy().reshape(cube.shape[0]*cube.shape[1], cube.shape[2])
    roll_matrix = calc_roll_matrix_flattened(flattened_cube, threshold, offset)
    rolled_cube = strided_indexing_roll(flattened_cube, roll_matrix, cube_shape=cube.shape)
    rolled_cube = rolled_cube.reshape(cube.shape[0], cube.shape[1], cube.shape[2])
    return rolled_cube
def calc_roll_matrix_flattened(cube_flattened, threshold, offset):
    """ Calculates the number of positions along the time axis we need to shift
    elements in order to trig the data.
    We return a 1D numpy array with X*Y elements
    """
    # argmax(...) finds the first position along the time axis where we are above threshold
    roll_matrix = np.argmax(cube_flattened > threshold, axis=1) + offset
    # ensure we don't have index out of bound
    roll_matrix[roll_matrix>cube_flattened.shape[1]] = cube_flattened.shape[1]
    return roll_matrix
def strided_indexing_roll(cube_flattened, roll_matrix_flattened, cube_shape):
    # Negate the shifts, otherwise we shift in the wrong direction for my application
    roll_matrix_flattened = -1 * roll_matrix_flattened
    # Concatenate with sliced to cover all rolls
    a_ext = np.concatenate((cube_flattened, cube_flattened[:, :-1]), axis=1)
    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = cube_flattened.shape[1]
    result = viewW(a_ext,(1,n))[np.arange(len(roll_matrix_flattened)), (n - roll_matrix_flattened) % n, 0]
    result = result.reshape(cube_shape)
    return result
Divakar's answer doesn't do justice to how much more efficient this is on a large cube of data. I timed it on 400x400x2000 data stored as int8: an equivalent for-loop takes ~5.5 seconds, Seberg's answer ~3.0 seconds, and strided_indexing.... ~0.5 seconds.
x=[1280.0, 2050.0, 709.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
num1=0
den1=0
num2=0
den2=0
for i in range(0,3):
    num1=num1+x[i]
    den1=den1+1
del i
for i in range(0,6):
    num2=num2+x[i]
    den2=den2+1
avgc1= num1/den1
avgc2= num2/den2
val = (100* avgc1 / avgc2)
print(val)
The value of the variable val should be 200, but I get 199.99999999999997. Could someone please help me understand the reason?
At the same time, if I try the following, it returns 200.
y=4039.0
x1=y/3
x2=y/6
x3=100*x1/x2
print(x3)
I get 199.99999999999997 for both (Python version 3.7.1). The issue is due to rounding errors in floating point arithmetic.
You can do as @Josh Friedlander said and use the double //, but this will result in floor division which may not be what you want. To maintain higher accuracy you can try using numpy for division.
import numpy as np
y=4039
x1=np.divide(y,3)
x2=np.divide(y,6)
x3=100*np.divide(x1,x2)
print(x3)
Returns
200.0
Works for your other case too:
x=[1280.0, 2050.0, 709.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
num1=0
den1=0
num2=0
den2=0
for i in range(0,3):
    num1=num1+x[i]
    den1=den1+1
del i
for i in range(0,6):
    num2=num2+x[i]
    den2=den2+1
avgc1= np.divide(num1,den1)
avgc2= np.divide(num2,den2)
val = (100* np.divide(avgc1,avgc2))
print(val)
Returns
200.0
This is using np.__version__ 1.15.4 for reference.
Edit
As noted by @Mark Dickinson, order of operations is important. Putting parentheses around the division with pure Python will result in 200.0 without using numpy.
x=[1280.0, 2050.0, 709.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
num1=0
den1=0
num2=0
den2=0
for i in range(0,3):
    num1=num1+x[i]
    den1=den1+1
del i
for i in range(0,6):
    num2=num2+x[i]
    den2=den2+1
avgc1= num1 / den1
avgc2= num2 / den2
# use parentheses to perform division first
val = (100* (avgc1 / avgc2))
print(val)
num1/den1 and num2/den2 are computed with floating-point arithmetic. This includes rounding exact mathematical results to values representable in floating-point.
The result is that avgc1 and avgc2 may differ from their ideal mathematical values, and so does their quotient.
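The effect of the evaluation order can be reproduced in a few lines of plain Python. This is a small sketch using the same numbers as the question; the variable names `left_to_right` and `grouped` are introduced here only for illustration.

```python
y = 4039.0
x1 = y / 3   # rounded to the nearest representable double
x2 = y / 6   # rounded the same way; exactly half of x1

# Without parentheses Python evaluates (100 * x1) / x2, so the product
# is rounded first and that error carries into the quotient
left_to_right = 100 * x1 / x2

# With parentheses the quotient x1 / x2 is exactly 2.0, and 100 * 2.0
# is exactly representable
grouped = 100 * (x1 / x2)

print(x1 == 2 * x2)    # True
print(left_to_right)   # 199.99999999999997
print(grouped)         # 200.0
```

Each individual operation is correctly rounded; the two results differ only because the intermediate rounding happens at a different point.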
Help please!!
I was trying to create a column 'Segment' based on the condition:
if 'Pro_vol' >1 and 'Cost' >=43 then append 1
if 'Pro_vol' ==1 and 'Cost' >=33 then append 1
or append 0
Below is the code for data:
df = pd.DataFrame({'ID':[1,2,3,4,5,6,7,8,9,10],
                   'Pro_vol':[1,2,3,1,5,1,2,1,4,5],
                   'Cost' : [12.34,13.55,34.00, 19.15,13.22,22.34,33.55,44.00, 29.15,53.22]})
I tried a code:
Segment=[]
for i in df['Pro_vol']:
    if i >1:
        Segment.append(1)
        for j in df['Cost']:
            if j>=43:
                Segment.append(1)
            elif i==1:
                Segment.append(1)
            elif j>=33:
                Segment.append(1)
            else:
                Segment.append(0)
df['Segment']=Segment
And it was giving me an error:
ValueError: Length of values does not match length of index
I don't know any other way to try to find an answer!!
You may consider np.where
np.where(((df.Cost>=33)&(df.Pro_vol==1))|((df.Cost>=43)&(df.Pro_vol>1)),1,0)
Out[538]: array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])
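As for why the loop version failed: a nested loop over both columns appends more than one value per row, so the list ends up longer than the DataFrame. A row-wise version has to pair each Pro_vol with the Cost from the same row and append exactly once. Here is a minimal sketch in plain Python, where the two lists stand in for the DataFrame columns (an assumption made to keep the example free of pandas):

```python
pro_vol = [1, 2, 3, 1, 5, 1, 2, 1, 4, 5]
cost = [12.34, 13.55, 34.00, 19.15, 13.22, 22.34, 33.55, 44.00, 29.15, 53.22]

# zip pairs the values row by row, so exactly one flag is produced per row
segment = [
    1 if (p > 1 and c >= 43) or (p == 1 and c >= 33) else 0
    for p, c in zip(pro_vol, cost)
]
print(segment)  # [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
```

The resulting list has the same length as the columns, so it could be assigned to a new column without the length mismatch.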
I'm a student, and I'm trying to figure out what I have done wrong. Could anyone spot what I might have done?
super(Fighter, self).__init__(modelPath, parentNode, nodeName, 0, 0, 0, 3.0, )
TypeError: __init__() takes at most 3 arguments (8 given)
Code:
class Fighter(ShowBase, object):
    fighterCount = 0
    def __init__(self, modelPath, parentNode, nodeName, posVec, traverser, scaleVec = 1.0):
        super(Fighter, self).__init__(modelPath, parentNode, nodeName, 0, 0, 0, 3.0, )
        self.modelNode.setScale(scaleVec)
        self.modelNode.setPos(posVec)
        self.trav = traverser
        self.origin = render.attachNewNode("origin")
        self.origin.setPos(0, 0, 0)
        self.origin.hide()
        self.setKeyBindings()
        self.hud = Hud("./Tools/Hud.x", self.modelNode, "Hud", (0, 10, 0))
I need your help to fix my code. I try to append a value to a list in a dictionary.
def distance(x1, y1, x2, y2):
    dis=((x1-x2)**2) + ((y1-y2)**2)
    return dis
def cluster_member_formation2(arrCH, arrN, k):
    dicCH = dict.fromkeys(arrCH,[])
    arrE = []
    for j in range(len(arrCH)):
        d_nya = distance(arrN[1][0], arrN[1][1], arrN[arrCH[j]][0], arrN[arrCH[j]][1])
        arrE.append(d_nya)
    minC = min(arrE)
    ind = arrE.index(minC)
    x = arrCH[ind]
    dicCH[x].append(1)
    print(arrE, minC, ind, x, dicCH)
arrCH=[23, 35]
arrN={0:[23, 45, 2, 0], 1:[30,21,2,0], 23:[12, 16, 2, 0], 35:[48, 77, 2, 0]}
cluster_member_formation2(arrCH, arrN, 1)
The output:
[349, 3460] 349 0 23 {35: [1], 23: [1]}
I am trying to calculate the distance between node 1 and every node in arrCH, and then take the minimum distance. The output shows that arrE is [349, 3460], so the minimum is 349. 349 has index 0, so I look up arrCH at index 0, which gives arrCH[0]=23. Finally, I want dicCH[23].append(1) to update only that key, so that the result is
{35: [], 23: [1]}
But why does my code update all the keys, 35 and 23?
I hope someone can help me.
Thank you..
classmethod fromkeys(seq[, value])
Create a new dictionary with keys
from seq and values set to value.
All of your dictionary values reference the same single list instance ([]) which you provide as a value to the fromkeys function.
You could use a dictionary comprehension, as seen in this answer.
dicCH = {key: [] for key in arrCH}
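The difference between the two constructions can be seen in a short, self-contained sketch. The names `shared` and `independent` are made up for this illustration.

```python
arrCH = [23, 35]

# fromkeys evaluates [] once, so every key points at the same list object
shared = dict.fromkeys(arrCH, [])
shared[23].append(1)
print(shared)                    # {23: [1], 35: [1]}
print(shared[23] is shared[35])  # True

# The comprehension evaluates [] once per key, giving separate lists
independent = {key: [] for key in arrCH}
independent[23].append(1)
print(independent)               # {23: [1], 35: []}
```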