Python 3 array average calculation - python-3.x

x=[1280.0, 2050.0, 709.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
num1=0
den1=0
num2=0
den2=0
for i in range(0,3):
    num1=num1+x[i]
    den1=den1+1
del i
for i in range(0,6):
    num2=num2+x[i]
    den2=den2+1
avgc1= num1/den1
avgc2= num2/den2
val = (100* avgc1 / avgc2)
print(val)
The value of the variable val should be 200, but I get 199.99999999999997. Could someone please help me understand the reason?
At the same time, if I try the following, it returns 200.
y=4039.0
x1=y/3
x2=y/6
x3=100*x1/x2
print(x3)

I get 199.99999999999997 for both (Python version 3.7.1). The issue is due to rounding errors in floating point arithmetic.
You can do as @Josh Friedlander said and use the double slash //, but this results in floor division, which may not be what you want. To get the expected result you can try using numpy for the division.
import numpy as np
y=4039
x1=np.divide(y,3)
x2=np.divide(y,6)
x3=100*np.divide(x1,x2)
print(x3)
Returns
200.0
Works for your other case too:
x=[1280.0, 2050.0, 709.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
num1=0
den1=0
num2=0
den2=0
for i in range(0,3):
    num1=num1+x[i]
    den1=den1+1
del i
for i in range(0,6):
    num2=num2+x[i]
    den2=den2+1
avgc1= np.divide(num1,den1)
avgc2= np.divide(num2,den2)
val = (100* np.divide(avgc1,avgc2))
print(val)
Returns
200.0
This is using np.__version__ 1.15.4 for reference.
Edit
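As noted by @Mark Dickinson, order of operations is important. Python evaluates 100 * avgc1 / avgc2 from left to right, so the multiplication is rounded before the division happens. Putting parentheses around the division in pure Python results in 200.0 without using numpy.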
x=[1280.0, 2050.0, 709.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
num1=0
den1=0
num2=0
den2=0
for i in range(0,3):
    num1=num1+x[i]
    den1=den1+1
del i
for i in range(0,6):
    num2=num2+x[i]
    den2=den2+1
avgc1= num1 / den1
avgc2= num2 / den2
# use parentheses to perform division first
val = (100* (avgc1 / avgc2))
print(val)

num1/den1 and num2/den2 are computed with floating-point arithmetic. This includes rounding exact mathematical results to values representable in floating-point.
The result is that avgc1 and avgc2 may differ from their ideal mathematical values, and so does their quotient.
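The rounding can be made visible with a short check. This is only a sketch using the numbers from the question; `Fraction` from the standard library is my own addition, used to show the exact rational result for comparison.

```python
from fractions import Fraction
import math

y = 4039.0
x1 = y / 3   # rounded to the nearest representable double
x2 = y / 6   # exactly half of x1: halving only changes the exponent

# Dividing first gives exactly 2.0, so the product is exactly 200.0.
divide_first = 100 * (x1 / x2)

# Multiplying first rounds 100 * x1 before the division, which can leave
# the result a tiny distance away from 200.
multiply_first = 100 * x1 / x2

# Exact rational arithmetic involves no rounding at all.
exact = 100 * Fraction(4039, 3) / Fraction(4039, 6)

print(divide_first, multiply_first, exact)
print(math.isclose(multiply_first, 200.0))
```

When a result like this has to be compared against an expected value, `math.isclose` is usually a safer test than `==`.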

Related

Roll of different amount along a single axis in a 3D matrix [duplicate]

I have a matrix (2d numpy ndarray, to be precise):
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
And I want to roll each row of A independently, according to roll values in another array:
r = np.array([2, 0, -1])
That is, I want to do this:
print(np.array([np.roll(row, x) for row,x in zip(A, r)]))
[[0 0 4]
 [1 2 3]
 [0 5 0]]
Is there a way to do this efficiently? Perhaps using fancy indexing tricks?
Sure, you can do it using advanced indexing. Whether it is the fastest way probably depends on your array size (if your rows are large it may not be):
rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
# Use always a negative shift, so that column_indices are valid.
# (could also use the modulo operation)
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:, np.newaxis]
result = A[rows, column_indices]
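The snippet above modifies r in place. A minimal self-contained variant that uses the modulo operation instead, and so leaves r untouched, might look like this (the function name `roll_rows` is my own, not from the answer):

```python
import numpy as np

def roll_rows(A, r):
    """Roll each row of the 2D array A by the matching entry of r, like np.roll."""
    n_rows, n_cols = A.shape
    rows = np.arange(n_rows)[:, np.newaxis]   # shape (n_rows, 1)
    cols = np.arange(n_cols)[np.newaxis, :]   # shape (1, n_cols)
    # Output column j is read from input column (j - shift) mod n_cols.
    src = (cols - np.asarray(r)[:, np.newaxis]) % n_cols
    return A[rows, src]

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
result = roll_rows(A, r)
print(result)
```

For the sample data this should agree with the row-by-row np.roll loop from the question.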
numpy.lib.stride_tricks.as_strided stricks (abbrev pun intended) again!
Speaking of fancy indexing tricks, there's the infamous np.lib.stride_tricks.as_strided. The idea is to take a slice from the first column up to the second-to-last one and concatenate it at the end. This ensures that we can stride in the forward direction as needed to leverage np.lib.stride_tricks.as_strided, and thus avoid the need to actually roll back. That's the whole idea!
Now, in terms of actual implementation, we would use scikit-image's view_as_windows to elegantly use np.lib.stride_tricks.as_strided under the hood. Thus, the final implementation would be -
from skimage.util.shape import view_as_windows as viewW
def strided_indexing_roll(a, r):
    # Concatenate with sliced to cover all rolls
    a_ext = np.concatenate((a,a[:,:-1]),axis=1)
    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = a.shape[1]
    return viewW(a_ext,(1,n))[np.arange(len(r)), (n-r)%n,0]
Here's a sample run -
In [327]: A = np.array([[4, 0, 0],
     ...:               [1, 2, 3],
     ...:               [0, 0, 5]])
In [328]: r = np.array([2, 0, -1])
In [329]: strided_indexing_roll(A, r)
Out[329]:
array([[0, 0, 4],
       [1, 2, 3],
       [0, 5, 0]])
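If scikit-image is not available, the same windowing idea can probably be expressed with NumPy alone. The sketch below assumes NumPy 1.20 or later, where `numpy.lib.stride_tricks.sliding_window_view` exists; the function name `windowed_roll` is hypothetical and this is not the code that was benchmarked in this answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windowed_roll(a, r):
    """Roll each row of a by the matching entry of r using sliding windows."""
    n = a.shape[1]
    # Append all but the last column so every rolled row is a contiguous window.
    a_ext = np.concatenate((a, a[:, :-1]), axis=1)
    # windows[i, s] is a_ext[i, s:s+n]; the result has shape (rows, n, n).
    windows = sliding_window_view(a_ext, n, axis=1)
    # A roll by r corresponds to the window starting at column (n - r) mod n.
    return windows[np.arange(len(r)), (n - r) % n]

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
rolled = windowed_roll(A, r)
print(rolled)
```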
Benchmarking
# @seberg's solution
def advindexing_roll(A, r):
    rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
    r[r < 0] += A.shape[1]
    column_indices = column_indices - r[:,np.newaxis]
    return A[rows, column_indices]
Let's do some benchmarking on an array with a large number of rows and columns -
In [324]: np.random.seed(0)
     ...: a = np.random.rand(10000,1000)
     ...: r = np.random.randint(-1000,1000,(10000))
# @seberg's solution
In [325]: %timeit advindexing_roll(a, r)
10 loops, best of 3: 71.3 ms per loop
# Solution from this post
In [326]: %timeit strided_indexing_roll(a, r)
10 loops, best of 3: 44 ms per loop
In case you want a more general solution (dealing with any shape and any axis), I modified @seberg's solution:
def indep_roll(arr, shifts, axis=1):
    """Apply an independent roll for each dimension of a single axis.

    Parameters
    ----------
    arr : np.ndarray
        Array of any shape.
    shifts : np.ndarray
        How many positions to shift for each dimension. Shape: `(arr.shape[axis],)`.
    axis : int
        Axis along which elements are shifted.
    """
    arr = np.swapaxes(arr,axis,-1)
    # list() so that the last index array can be replaced below
    all_idcs = list(np.ogrid[tuple(slice(0,n) for n in arr.shape)])
    # Convert to a positive shift (note: this modifies shifts in place)
    shifts[shifts < 0] += arr.shape[-1]
    all_idcs[-1] = all_idcs[-1] - shifts[:, np.newaxis]
    result = arr[tuple(all_idcs)]
    arr = np.swapaxes(result,-1,axis)
    return arr
I implement a pure numpy.lib.stride_tricks.as_strided solution as follows
from numpy.lib.stride_tricks import as_strided
def custom_roll(arr, r_tup):
    m = np.asarray(r_tup)
    arr_roll = arr[:, [*range(arr.shape[1]),*range(arr.shape[1]-1)]].copy() #need `copy`
    strd_0, strd_1 = arr_roll.strides
    n = arr.shape[1]
    result = as_strided(arr_roll, (*arr.shape, n), (strd_0 ,strd_1, strd_1))
    return result[np.arange(arr.shape[0]), (n-m)%n]
A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
out = custom_roll(A, r)
Out[789]:
array([[0, 0, 4],
       [1, 2, 3],
       [0, 5, 0]])
By using a fast Fourier transform, we can apply a transformation in the frequency domain and then use the inverse fast Fourier transform to obtain the row shift.
So this is a pure numpy solution that takes only one line:
import numpy as np
from numpy.fft import fft, ifft
# The row shift function using the fast Fourier transform
# rshift(A,r) where A is a 2D array, r the row shift vector
def rshift(A,r):
    return np.real(ifft(fft(A,axis=1)*np.exp(2*1j*np.pi/A.shape[1]*r[:,None]*np.r_[0:A.shape[1]][None,:]),axis=1).round())
This will apply a left shift, but we can simply negate the exponent of the exponential to turn the function into a right shift function:
ifft(fft(...)*np.exp(-2*1j...)
It can be used like this:
# Example:
A = np.array([[1,2,3,4],
              [1,2,3,4],
              [1,2,3,4]])
r = np.array([1,-1,3])
print(rshift(A,r))
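To check the shift direction, the result can be compared against np.roll. The sketch below is my own restatement of the one-liner with a hypothetical `sign` parameter added to switch direction; the final rounding means it is only appropriate for integer-valued data.

```python
import numpy as np
from numpy.fft import fft, ifft

def fft_roll(A, r, sign=1):
    """Shift each row of A by r via the FFT: sign=+1 shifts left, sign=-1 shifts right."""
    n = A.shape[1]
    # Multiplying the spectrum by a linear phase is a circular shift in the row.
    phase = np.exp(sign * 2j * np.pi / n * r[:, None] * np.arange(n)[None, :])
    return np.real(ifft(fft(A, axis=1) * phase, axis=1)).round()

A = np.array([[1, 2, 3, 4],
              [1, 2, 3, 4],
              [1, 2, 3, 4]])
r = np.array([1, -1, 3])

left = fft_roll(A, r, sign=1)    # matches np.roll(row, -shift)
right = fft_roll(A, r, sign=-1)  # matches np.roll(row, shift)
print(left)
print(right)
```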
Building on Divakar's excellent answer, you can easily apply this logic to a 3D array (which was the problem that brought me here in the first place). Here's an example: basically flatten your data, roll it, and reshape it afterwards:
import numpy as np
from skimage.util.shape import view_as_windows as viewW
def applyroll_30(cube, threshold=25, offset=500):
    flattened_cube = cube.copy().reshape(cube.shape[0]*cube.shape[1], cube.shape[2])
    roll_matrix = calc_roll_matrix_flattened(flattened_cube, threshold, offset)
    rolled_cube = strided_indexing_roll(flattened_cube, roll_matrix, cube_shape=cube.shape)
    rolled_cube = rolled_cube.reshape(cube.shape[0], cube.shape[1], cube.shape[2])
    return rolled_cube
def calc_roll_matrix_flattened(cube_flattened, threshold, offset):
    """ Calculates the number of positions along the time axis we need to shift
    elements in order to trigger the data.
    We return a 1D numpy array of shape (X*Y,)
    """
    # argmax(...) finds the first position along the time axis where we are above threshold
    roll_matrix = np.argmax(cube_flattened > threshold, axis=1) + offset
    # ensure we don't have index out of bound
    roll_matrix[roll_matrix>cube_flattened.shape[1]] = cube_flattened.shape[1]
    return roll_matrix
def strided_indexing_roll(cube_flattened, roll_matrix_flattened, cube_shape):
    # Negate the shifts, otherwise we shift in the wrong direction for my application
    roll_matrix_flattened = -1 * roll_matrix_flattened
    # Concatenate with sliced to cover all rolls
    a_ext = np.concatenate((cube_flattened, cube_flattened[:, :-1]), axis=1)
    # Get sliding windows; use advanced-indexing to select appropriate ones
    n = cube_flattened.shape[1]
    result = viewW(a_ext,(1,n))[np.arange(len(roll_matrix_flattened)), (n - roll_matrix_flattened) % n, 0]
    result = result.reshape(cube_shape)
    return result
Divakar's answer doesn't do justice to how much more efficient this is on a large cube of data. I timed it on 400x400x2000 data stored as int8. An equivalent for-loop takes ~5.5 seconds, Seberg's answer ~3.0 seconds and strided_indexing.... ~0.5 seconds.
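For the 3D case, a different technique avoids the flatten and reshape steps: build the source indices along the last axis and gather them with `np.take_along_axis`. This is a sketch of that alternative, not the strided approach timed above; the function name `roll_last_axis` and the random test data are my own assumptions.

```python
import numpy as np

def roll_last_axis(cube, shifts):
    """Roll cube along its last axis, with one shift per leading index.

    shifts must have shape cube.shape[:-1].
    """
    n = cube.shape[-1]
    # Output position j is read from input position (j - shift) mod n.
    idx = (np.arange(n) - shifts[..., np.newaxis]) % n
    return np.take_along_axis(cube, idx, axis=-1)

rng = np.random.default_rng(0)
cube = rng.integers(0, 100, size=(2, 3, 5))
shifts = rng.integers(-5, 5, size=(2, 3))
rolled = roll_last_axis(cube, shifts)
print(rolled.shape)
```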

Manipulating a Python list with a threshold value

I need to make a function which would compare each value in a list and then set each value accordingly. Code follows:
actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
def treshold(element, value):
    if element >= value:
        element == 1
    else:
        element == 0
treshold(actions, 0.5)
This code however results in the following error:
TypeError: '>=' not supported between instances of 'list' and 'float'
I understand what this error says, however I do not know how to fix that.
A compact way of doing this, as pointed out by user202729, is with a list comprehension. The key is that you need to do the comparison for each entry in the list. If you want to run it on the whole list at once, you could consider using numpy.
actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
def treshold(element, value):
    # iterate over the list that was passed in, not the global one
    thresholded_list = [int(a>=value) for a in element]
    return thresholded_list
This function is essentially shorthand for
def treshold_long(element_list, value):
    thresholded_list = []
    for element in element_list:
        if element >= value:
            thresholded_list.append(1)
        else:
            thresholded_list.append(0)
    return thresholded_list
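The numpy route mentioned above could look roughly like this: the comparison is applied to the whole array at once and the boolean result is converted to integers. The variable name `thresholded` is my own.

```python
import numpy as np

actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]

# The comparison runs element-wise and yields booleans; astype(int) maps them to 1/0.
thresholded = (np.asarray(actions) >= 0.5).astype(int)
print(thresholded.tolist())  # [0, 0, 0, 1, 0, 0, 1, 0, 0]
```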
Thanks to user202729 I have discovered list comprehensions.
actions = [0, 0, 0, 0.5, 0, 0.3, 0.8, 0, 0.00000000156]
treshold = 0.5
actions = [1 if i>=treshold else 0 for i in actions]
print(actions)
This basically solves my problem. I also thank user3235916 for a valid function.

Segment/profiling in python

Help please!!
I was trying to create a column 'Segment' based on the condition:
if 'Pro_vol' >1 and 'Cost' >=43 then append 1
if 'Pro_vol' ==1 and 'Cost' >=33 then append 1
or append 0
Below is the code for data:
df = pd.DataFrame({'ID':[1,2,3,4,5,6,7,8,9,10],
                   'Pro_vol':[1,2,3,1,5,1,2,1,4,5],
                   'Cost' : [12.34,13.55,34.00, 19.15,13.22,22.34,33.55,44.00, 29.15,53.22]})
I tried a code:
Segment=[]
for i in df['Pro_vol']:
    if i >1:
        Segment.append(1)
        for j in df['Cost']:
            if j>=43:
                Segment.append(1)
    elif i==1:
        Segment.append(1)
    elif j>=33:
        Segment.append(1)
    else:
        Segment.append(0)
df['Segment']=Segment
And it was giving me an error:
ValueError: Length of values does not match length of index
I don't know any other way to try to find an answer!!
You may consider np.where
np.where(((df.Cost>=33)&(df.Pro_vol==1))|((df.Cost>=43)&(df.Pro_vol>1)),1,0)
Out[538]: array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])
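The length error in the question comes from looping over the two columns separately, so more than one value can be appended per row. As a plain-Python illustration of the same rule applied row by row (the list names are my own, with the values copied from the question's DataFrame):

```python
pro_vol = [1, 2, 3, 1, 5, 1, 2, 1, 4, 5]
cost = [12.34, 13.55, 34.00, 19.15, 13.22, 22.34, 33.55, 44.00, 29.15, 53.22]

segment = []
# zip pairs each row's Pro_vol with its own Cost, so exactly one value is appended per row.
for vol, c in zip(pro_vol, cost):
    if (vol > 1 and c >= 43) or (vol == 1 and c >= 33):
        segment.append(1)
    else:
        segment.append(0)

print(segment)  # [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
```

The result has one entry per row, so it can be assigned to a new column without a length mismatch.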

Solving matrix equations with some Boundary Conditions

How can I solve a system of linear equations with some Boundary conditions, using Numpy?
Ax=B
Where x is a column vector with, let's say x1=0.
For different iterations BCs are going to be different, so different variables of vector x going to be zero.
[A] and [B] are known.
Here is an example from my FEM course:
{F} Is the column vector of known values
[k] is the stiffness matrix with the known values
{U} is the displacement column vector where U1 and U3 are known to be zero, but U2 and U4 need to be found.
Here is an example:
This would result in these values:
Naturally this would reduce to a 2x2 matrix equation, but because the BCs would be different for different elements, I'm looking for some numpy matrix equation solver where I can tell it that some of the unknowns must take a certain value and nothing else.
Is there something similar to np.linalg.solve() with conditions to it?
Thank you.
The matrix k in your example is invertible. That means there is one and only one solution; you cannot choose any of the Us. This is the solution:
import numpy as np
k = np.array(((1000, 0, -1000, 0),
              (0, 3000, 0, -3000),
              (-100, 0, 3000, -2000),
              (0, -3000, -2000, 5000)))
F = np.array((0, 0, 0, 5000))
U = np.linalg.solve(k, F)
print(U)
# # or:
# k_inv = np.linalg.inv(k)
# U = k_inv.dot(F)
# [ 5.55555556 8.05555556 5.55555556 8.05555556]
The same in Sage:
k = matrix(((1000, 0, -1000, 0),
            (0, 3000, 0, -3000),
            (-100, 0, 3000, -2000),
            (0, -3000, -2000, 5000)))
F = vector((0, 0, 0, 5000))
U = k.inverse() * F
# (50/9, 145/18, 50/9, 145/18)
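If the goal is instead to impose the boundary conditions from the question (U1 and U3 prescribed), one common approach is to solve only the rows and columns of the free unknowns and move the prescribed values to the right-hand side. The sketch below assumes that interpretation; `solve_with_fixed` is a hypothetical helper, not a numpy function, and it reuses the k and F from this answer.

```python
import numpy as np

def solve_with_fixed(k, F, fixed, fixed_values=None):
    """Solve k @ U = F on the free unknowns, with U[fixed] prescribed (default 0)."""
    k = np.asarray(k, dtype=float)
    F = np.asarray(F, dtype=float)
    fixed = np.asarray(fixed)
    free = np.setdiff1d(np.arange(len(F)), fixed)

    U = np.zeros(len(F))
    if fixed_values is not None:
        U[fixed] = fixed_values

    # Move the contribution of the prescribed values to the right-hand side,
    # then solve the reduced system for the free unknowns only.
    rhs = F[free] - k[np.ix_(free, fixed)] @ U[fixed]
    U[free] = np.linalg.solve(k[np.ix_(free, free)], rhs)
    return U

k = np.array(((1000, 0, -1000, 0),
              (0, 3000, 0, -3000),
              (-100, 0, 3000, -2000),
              (0, -3000, -2000, 5000)))
F = np.array((0, 0, 0, 5000))

# U1 and U3 (indices 0 and 2) are fixed at zero.
U = solve_with_fixed(k, F, fixed=[0, 2])
print(U)  # [0.  2.5 0.  2.5]
```

Because `fixed` is just a list of indices, different boundary conditions per iteration only require passing a different list.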

How can I execute code only if a variable is an integer?

I am trying to create a decimal to binary converter. The user inputs their value, and the amount is divided by two each time and added to the invertedbinary list. The amount is then converted back into an integer, to be divided by two again and so on.
value = int(input("Please enter the decimal value to be converted to binary."))
invertedbinary = []
while value >= 1:
    value = (value/2)
    invertedbinary.append(value)
    value = int(value)
    print (invertedbinary)
for n,i in enumerate(invertedbinary):
    if i == isinstance(invertedbinary,int):
        invertedbinary[n]=0
    else:
        invertedbinary[n]=1
print (invertedbinary)
Let's say I input the number seventeen. This is the output:
[8.5]
[8.5, 4.0]
[8.5, 4.0, 2.0]
[8.5, 4.0, 2.0, 1.0]
[8.5, 4.0, 2.0, 1.0, 0.5]
[1, 1, 1, 1, 1]
So we can tell that from the last line of ones, my isinstance attempt did not work. What I want to be able to do, is that if the amount is anynumber.5 then to display it as a 1, and if it is a whole number to display as a zero. So what it should look like is [1, 0, 0, 0, 1]. Ones for each float value, and zeroes for the integers.
What can I use instead of isinstance to achieve this?
For anyone wondering, I've called it invertedbinary because when printed invertedbinary needs to be flipped around and then printed as a string to display the correct binary value.
You can always check whether the rounded value is equal to the value...
if round(x) == x:
    pass  # x is a whole number
else:
    pass  # x has a fractional part
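Applied to the converter from the question, the check could be done inside the loop. This is a minimal sketch that uses `float.is_integer()` in place of the `round` comparison; the function name `to_inverted_binary` is my own, and it assumes a positive integer input.

```python
def to_inverted_binary(value):
    """Return the binary digits of a positive integer, least significant first."""
    inverted = []
    while value >= 1:
        half = value / 2
        # A fractional half (x.5) means the current value was odd, so the bit is 1.
        inverted.append(0 if half.is_integer() else 1)
        value = int(half)
    return inverted

bits = to_inverted_binary(17)
print(bits)  # [1, 0, 0, 0, 1]

# Flip the list and join it to get the usual binary string.
binary = "".join(str(b) for b in reversed(bits))
print(binary)  # 10001
```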
