How can I solve a system of linear equations with some boundary conditions, using NumPy?
Ax = B
where x is a column vector and, let's say, x1 = 0.
For different iterations the BCs are going to be different, so different variables of the vector x are going to be zero.
[A] and [B] are known.
Here is an example from my FEM course:
{F} is the column vector of known values
[k] is the stiffness matrix with the known values
{U} is the displacement column vector where U1 and U3 are known to be zero, but U2 and U4 need to be found.
Here is an example:
This would result in these values:
Naturally this would reduce to a 2x2 matrix equation, but because for different elements the BCs would be different, I'm looking for some NumPy matrix equation solver where I can let it know that some of the unknowns must be a certain value and nothing else.
Is there something similar to np.linalg.solve() with conditions to it?
Thank you.
The matrix k in your example is invertible. That means there is one and only one solution; you cannot choose any of the Us. This is the solution:
import numpy as np

k = np.array(((1000, 0, -1000, 0),
              (0, 3000, 0, -3000),
              (-100, 0, 3000, -2000),
              (0, -3000, -2000, 5000)))
F = np.array((0, 0, 0, 5000))

U = np.linalg.solve(k, F)
print(U)
# [5.55555556 8.05555556 5.55555556 8.05555556]

# or:
# k_inv = np.linalg.inv(k)
# U = k_inv.dot(F)
The same in Sage:
k = matrix(((1000, 0, -1000, 0),
            (0, 3000, 0, -3000),
            (-100, 0, 3000, -2000),
            (0, -3000, -2000, 5000)))
F = vector((0, 0, 0, 5000))
U = k.inverse() * F
# (50/9, 145/18, 50/9, 145/18)
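If in other cases some entries of U really are prescribed (the usual FEM boundary-condition situation), the standard approach is to move the known values to the right-hand side and solve a reduced system for the free DOFs only. A minimal NumPy sketch of that idea (the helper name and the choice of fixed indices are only illustrative):

import numpy as np

def solve_with_fixed_dofs(k, F, fixed, fixed_values=None):
    """Solve k @ U = F with some entries of U prescribed (e.g. zero)."""
    n = k.shape[0]
    fixed = np.asarray(fixed)
    fixed_values = np.zeros(len(fixed)) if fixed_values is None else np.asarray(fixed_values, float)
    free = np.setdiff1d(np.arange(n), fixed)

    U = np.zeros(n)
    U[fixed] = fixed_values
    # Move the contribution of the known DOFs to the right-hand side,
    # then solve the reduced system for the free DOFs only.
    rhs = np.asarray(F, float)[free] - k[np.ix_(free, fixed)] @ fixed_values
    U[free] = np.linalg.solve(k[np.ix_(free, free)], rhs)
    return U

# e.g. with U1 = U3 = 0 (indices 0 and 2) prescribed:
# U = solve_with_fixed_dofs(k, F, fixed=[0, 2])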
I am trying to optimize a function that maximizes the correlation between two (pandas) time series arrays (X and Y). This is done using three parameters (a, b, c) and a third time series array (Z). The Z array is used to reindex the values in the X array (based on the parameters a, b, c) in such a way as to maximize the correlation of the reindexed X array (Xnew) with the Y array.
Below is some pseudo-code to demonstrate what I am trying to do. I have attempted this using LMfit and scipy.optimize, but I am not sure how to make this task work in those packages. For example, in LMfit, if I try to minimize the MyOpt function (which passes back a single value of the correlation metric), it complains that I have more parameters than outputs. However, if I pass back the time series of the correlation metric (diff), the parameter values remain fixed at their input values.
I know the reindexing function I am using works, because rather crude methods similar to the code below give significant changes in the mean (diff) metric passed back.
My knowledge of these optimization packages is not up to scratch for this job, so if anyone has a suggestion on how to tackle this, I would be grateful.
import numpy as np
import pandas as pd

def GetNewIndex(Z, a, b, c):
    old_index = np.arange(0, len(Z))
    index_adj = some_func(a, b, c)
    new_index = old_index + index_adj
    max_old = np.max(old_index)
    new_index[new_index > max_old] = max_old
    new_index[new_index < 0] = 0
    return new_index

def MyOpt(params, X, Y, Z):
    a = params['A']
    b = params['B']
    c = params['C']
    # estimate lag (in samples) based on ambient RH
    new_index = GetNewIndex(Z, a, b, c)
    # assign old values to new locations and convert back to a pandas Series
    Xnew = np.take(X.values, new_index)
    Xnew = pd.Series(Xnew, index=X.index)
    cc = Y.rolling(1201, center=True).corr(Xnew)
    cc = cc.interpolate(limit_direction='both', limit_area=None)
    diff = 1 - np.abs(cc)
    return np.mean(diff)
# ==================================================
X = some long pandas time series data
Y = some long pandas time series data
Z = some long pandas time series data

As = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
Bs = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
Cs = [5, 6, 5, 6, 5, 6, 5, 6, 5, 6, 5, 6]

outs = []
for A, B, C in zip(As, Bs, Cs):
    params = {'A': A, 'B': B, 'C': C}
    out = MyOpt(params, X, Y, Z)
    outs.append(out)
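For reference, a common pattern for handing a scalar objective like MyOpt to scipy.optimize.minimize is to wrap it so the optimizer sees a flat parameter vector. A minimal, untested sketch (the initial guess and method are placeholders, not a tuned recipe for this data):

from scipy.optimize import minimize

def objective(p, X, Y, Z):
    # minimize() passes a flat array of parameters; repack it into the dict MyOpt expects
    params = {'A': p[0], 'B': p[1], 'C': p[2]}
    return MyOpt(params, X, Y, Z)  # already returns a single scalar (mean of diff)

x0 = [1, 0, 5]  # initial guess for (a, b, c)
res = minimize(objective, x0, args=(X, Y, Z), method='Nelder-Mead')
print(res.x, res.fun)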
I was recently working on a Codeforces problem.
So, I was using SymPy to solve this.
My code is:
from sympy import *

x, y = symbols("x, y", integer=True)
m, n = input().split(" ")
sol = solve([x**2 + y - int(n), y**2 + x - int(m)], [x, y])
print(sol)
What I wanted to do:
Filter only the positive integer values from the SymPy output.
Ex: If I put 14 28 in the terminal, it gives me tons of results, but I just want it to show [(5, 3)].
I don't think that this is the intended way to solve the code force problem (I think you're just supposed to loop over the possible values for one of the variables).
I'll show how to make use of SymPy here anyway though. Your problem is a Diophantine system of equations. Although SymPy has a Diophantine solver, it only works for individual equations rather than systems.
Usually, though, the idea of using a CAS for something like this is to symbolically find a general result that then helps you write faster concrete numerical code. Here are your equations with m and n as arbitrary symbols:
In [62]: x, y, m, n = symbols('x, y, m, n')
In [63]: eqs = [x**2 + y - n, y**2 + x - m]
Using the polynomial resultant we can eliminate either x or y from this system to obtain a quartic polynomial for the remaining variable:
In [31]: py = resultant(eqs[0], eqs[1], x)
In [32]: py
Out[32]: m**2 - 2*m*y**2 - n + y + y**4
While there is a general quartic formula that SymPy can use (if you use solve or roots here), it is too complicated to be useful for a problem like the one you are describing. Instead, though, the rational root theorem tells us that an integer root for y must be a divisor of the constant term:
In [33]: py.coeff(y, 0)
Out[33]: m**2 - n
Therefore the possible values for y are:
In [64]: yvals = divisors(py.coeff(y, 0).subs({m:14, n:28}))
In [65]: yvals
Out[65]: [1, 2, 3, 4, 6, 7, 8, 12, 14, 21, 24, 28, 42, 56, 84, 168]
Since x is m - y**2 the corresponding values for x are:
In [66]: solve(eqs[1], x)
Out[66]: [m - y**2]
In [67]: xvals = [14 - yv**2 for yv in yvals]
In [68]: xvals
Out[68]: [13, 10, 5, -2, -22, -35, -50, -130, -182, -427, -562, -770, -1750, -3122, -7042, -28210]
The candidate solutions are then given by:
In [69]: candidates = [(xv, yv) for xv, yv in zip(xvals, yvals) if xv > 0]
In [70]: candidates
Out[70]: [(13, 1), (10, 2), (5, 3)]
From there you can test which values are solutions:
In [74]: eqsmn = [eq.subs({m:14, n:28}) for eq in eqs]
In [75]: [c for c in candidates if all(eq.subs(zip([x,y],c))==0 for eq in eqsmn)]
Out[75]: [(5, 3)]
The algorithmically minded will probably see from the above example how to make a much more efficient way of implementing the solver.
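For completeness, a sketch of that more efficient approach: loop over the divisors of m**2 - n as candidate values of y, set x = m - y**2, and keep the pairs that satisfy the first equation. The helper name is mine and the m**2 == n edge case is ignored:

from sympy import divisors

def solve_positive(m, n):
    # y must divide m**2 - n (rational root theorem on the resultant),
    # and then x is forced to be m - y**2
    solutions = []
    for y in divisors(m**2 - n):
        x = m - y**2
        if x > 0 and x**2 + y == n:
            solutions.append((x, y))
    return solutions

print(solve_positive(14, 28))  # [(5, 3)]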
I've figured out the answer to my question! At first, I was trying to filter the result from solve(), but there is an easy way to do this.
Pseudo code:
solve() gives the intersection points of both parabolic equations as a list.
I just need to filter() out the other types of values, which in my case are <sympy.core.add.Add>.
def rem(_list):
    return list(filter(lambda v: type(v) != Add, _list))
Yes, you could also keep only the integer results instead, e.g. with isinstance(v, Integer).
Final code:
from sympy import *

# The other values were of <sympy.core.add.Add> type, so I just defined a
# function to filter out these specific types from my list.
def rem(_list):
    return list(filter(lambda v: type(v) != Add, _list))

x, y = symbols("x, y", integer=True, negative=False)
output = []
m, n = input().split(' ')

# I need to solve these 2 equations separately; otherwise, my defined
# function will not work without a loop.
solX = rem(solve((x + (int(n) - x**2)**2 - int(m)), x))
solY = rem(solve((int(m) - y**2)**2 + y - int(n), y))

if len(solX) == 0 or len(solY) == 0:
    print(0)
else:
    output.extend(solX)  # using "extend" to add multiple values to the list
    output.extend(solY)
    print(int(len(output) / 2))  # results come in pairs, so divide the length of the list by 2
Why I did it this way:
I tried to solve it the algorithmic way, but it still had some float numbers, and I just wanted to skip the loop thing here again!
As SymPy's solve() had already found the values, I skipped the other way and focused on filtering!
Sadly, the Codeforces compiler shows a runtime error! I guess it can't import SymPy. However, it works fine in VS Code.
I have a matrix (2d numpy ndarray, to be precise):
import numpy as np

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
And I want to roll each row of A independently, according to roll values in another array:
r = np.array([2, 0, -1])
That is, I want to do this:
print(np.array([np.roll(row, x) for row, x in zip(A, r)]))
[[0 0 4]
[1 2 3]
[0 5 0]]
Is there a way to do this efficiently? Perhaps using fancy indexing tricks?
Sure, you can do it using advanced indexing; whether it is the fastest way probably depends on your array size (if your rows are large, it may not be):
rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
# Always use a negative shift, so that column_indices are valid
# (could also use a modulo operation).
r[r < 0] += A.shape[1]
column_indices = column_indices - r[:, np.newaxis]
result = A[rows, column_indices]
numpy.lib.stride_tricks.as_strided stricks (abbrev pun intended) again!
Speaking of fancy indexing tricks, there's the infamous np.lib.stride_tricks.as_strided. The idea/trick would be to append a sliced portion, starting from the first column up to the second-to-last one, at the end. This ensures that we can stride in the forward direction as needed to leverage np.lib.stride_tricks.as_strided and thus avoid the need to actually roll back. That's the whole idea!
Now, in terms of the actual implementation, we would use scikit-image's view_as_windows to elegantly use np.lib.stride_tricks.as_strided under the hood. Thus, the final implementation would be:
from skimage.util.shape import view_as_windows as viewW

def strided_indexing_roll(a, r):
    # Concatenate with a sliced portion to cover all rolls
    a_ext = np.concatenate((a, a[:, :-1]), axis=1)
    # Get sliding windows; use advanced indexing to select the appropriate ones
    n = a.shape[1]
    return viewW(a_ext, (1, n))[np.arange(len(r)), (n - r) % n, 0]
Here's a sample run -
In [327]: A = np.array([[4, 0, 0],
...: [1, 2, 3],
...: [0, 0, 5]])
In [328]: r = np.array([2, 0, -1])
In [329]: strided_indexing_roll(A, r)
Out[329]:
array([[0, 0, 4],
[1, 2, 3],
[0, 5, 0]])
Benchmarking
# @seberg's solution
def advindexing_roll(A, r):
    rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
    r[r < 0] += A.shape[1]
    column_indices = column_indices - r[:, np.newaxis]
    return A[rows, column_indices]
Let's do some benchmarking on an array with a large number of rows and columns:
In [324]: np.random.seed(0)
...: a = np.random.rand(10000,1000)
...: r = np.random.randint(-1000,1000,(10000))
# @seberg's solution
In [325]: %timeit advindexing_roll(a, r)
10 loops, best of 3: 71.3 ms per loop
# Solution from this post
In [326]: %timeit strided_indexing_roll(a, r)
10 loops, best of 3: 44 ms per loop
In case you want a more general solution (dealing with any shape and any axis), I modified @seberg's solution:
def indep_roll(arr, shifts, axis=1):
    """Apply an independent roll for each dimension of a single axis.

    Parameters
    ----------
    arr : np.ndarray
        Array of any shape.
    shifts : np.ndarray
        How much to shift each slice. Shape: `(arr.shape[axis],)`.
    axis : int
        Axis along which elements are shifted.
    """
    arr = np.swapaxes(arr, axis, -1)
    all_idcs = np.ogrid[[slice(0, n) for n in arr.shape]]
    # Convert to a positive shift
    shifts[shifts < 0] += arr.shape[-1]
    all_idcs[-1] = all_idcs[-1] - shifts[:, np.newaxis]
    result = arr[tuple(all_idcs)]
    arr = np.swapaxes(result, -1, axis)
    return arr
I implemented a pure numpy.lib.stride_tricks.as_strided solution as follows:
from numpy.lib.stride_tricks import as_strided

def custom_roll(arr, r_tup):
    m = np.asarray(r_tup)
    arr_roll = arr[:, [*range(arr.shape[1]), *range(arr.shape[1] - 1)]].copy()  # need `copy`
    strd_0, strd_1 = arr_roll.strides
    n = arr.shape[1]
    result = as_strided(arr_roll, (*arr.shape, n), (strd_0, strd_1, strd_1))
    return result[np.arange(arr.shape[0]), (n - m) % n]

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
out = custom_roll(A, r)
Out[789]:
array([[0, 0, 4],
[1, 2, 3],
[0, 5, 0]])
By using a fast Fourier transform we can apply a transformation in the frequency domain and then use the inverse fast Fourier transform to obtain the row shift.
So this is a pure NumPy solution that takes only one line:
import numpy as np
from numpy.fft import fft, ifft

# The row-shift function using the fast Fourier transform:
# rshift(A, r), where A is a 2D array and r is the row-shift vector
def rshift(A, r):
    return np.real(ifft(fft(A, axis=1) * np.exp(2 * 1j * np.pi / A.shape[1] * r[:, None] * np.r_[0:A.shape[1]][None, :]), axis=1).round())
This will apply a left shift, but we can simply negate the exponent of the exponential to turn the function into a right-shift function:
ifft(fft(...)*np.exp(-2*1j...)
It can be used like that:
# Example:
A = np.array([[1, 2, 3, 4],
              [1, 2, 3, 4],
              [1, 2, 3, 4]])
r = np.array([1, -1, 3])
print(rshift(A, r))
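As a quick sanity check against the expected output from the original question (note that, as written, positive entries of r shift to the left, so reproducing np.roll-style right shifts means negating r):

A = np.array([[4, 0, 0],
              [1, 2, 3],
              [0, 0, 5]])
r = np.array([2, 0, -1])
print(rshift(A, -r))
# [[0. 0. 4.]
#  [1. 2. 3.]
#  [0. 5. 0.]]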
Building on Divakar's excellent answer, you can apply this logic to a 3D array easily (which was the problem that brought me here in the first place). Here's an example: basically flatten your data, roll it, and reshape it afterwards:
import numpy as np
from skimage.util.shape import view_as_windows as viewW

def applyroll_30(cube, threshold=25, offset=500):
    flattened_cube = cube.copy().reshape(cube.shape[0] * cube.shape[1], cube.shape[2])
    roll_matrix = calc_roll_matrix_flattened(flattened_cube, threshold, offset)
    rolled_cube = strided_indexing_roll(flattened_cube, roll_matrix, cube_shape=cube.shape)
    rolled_cube = rolled_cube.reshape(cube.shape[0], cube.shape[1], cube.shape[2])
    return rolled_cube

def calc_roll_matrix_flattened(cube_flattened, threshold, offset):
    """Calculates the number of positions along the time axis we need to shift
    elements in order to trigger the data.
    Returns a 1D numpy array of X*Y elements.
    """
    # argmax(...) finds the position in the cube (3d) where we are above threshold
    roll_matrix = np.argmax(cube_flattened > threshold, axis=1) + offset
    # ensure we don't have an index out of bounds
    roll_matrix[roll_matrix > cube_flattened.shape[1]] = cube_flattened.shape[1]
    return roll_matrix

def strided_indexing_roll(cube_flattened, roll_matrix_flattened, cube_shape):
    # negate the shifts, otherwise we shift in the wrong direction for my application
    roll_matrix_flattened = -1 * roll_matrix_flattened
    # Concatenate with a sliced portion to cover all rolls
    a_ext = np.concatenate((cube_flattened, cube_flattened[:, :-1]), axis=1)
    # Get sliding windows; use advanced indexing to select the appropriate ones
    n = cube_flattened.shape[1]
    result = viewW(a_ext, (1, n))[np.arange(len(roll_matrix_flattened)), (n - roll_matrix_flattened) % n, 0]
    result = result.reshape(cube_shape)
    return result
Divakar's answer doesn't do justice to how much more efficient this is on a large cube of data. I've timed it on 400x400x2000 data formatted as int8. An equivalent for-loop takes ~5.5 seconds, Seberg's answer ~3.0 seconds, and strided_indexing_roll ~0.5 seconds.
How can I compare two arrays with different sizes but with some floats that are approximate? For example:
# I have two arrays
a = np.array( [-2.83, -2.54, ..., 0.05, ..., 2.54, 2.83] )
b = np.array( [-3.0, -2.9, -2.8, ..., -0.1, 0.0, 0.1, ..., 2.9, 3.0] )
# wherein len( b ) > len( a )
What I need is the index where (considering those two values from both lists)
math.isclose( -2.54, -2.5, rel_tol=1e-1) == True
The answer that I need is something like
list_of_index_of_b = [1, 5, ..., -2]
Here list_of_index_of_b is a list with the "coordinates" where that specific element of b is approximately equal to some element of a. Not all elements of a have an approximate match in b. Also:
len(list_of_index_of_b) == len(a)
You can use broadcasting. This creates an array of the pairwise differences between every element in a and b, which you can then check against the specified tolerance.
Of course this is computationally inefficient from a complexity standpoint, since you construct an array of size |a|*|b| and compare every pairwise distance against the tolerance, even if one of the differences is already small enough. That said, if one of |a| or |b| is relatively small, then this approach can be quite fast since it is pure NumPy and requires no loops.
import numpy as np

a = np.array([1, 5, 6, 7])
b = np.array([1.1, 2, 3, 4.8, 4.9, 5, 8])
rtol = 0.15

# pairwise absolute differences, shape (len(b), len(a))
diff = np.abs(a - b[:, None])
# tolerance relative to b; note b[:, None] so the shapes broadcast correctly
mask2d = diff / np.abs(b[:, None]) < rtol
mask = np.any(mask2d, axis=1)
This can be combined into a single line:
indices = np.where(np.any(np.abs(a - b[:, None]) / np.abs(b[:, None]) < rtol, axis=1))
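Note that np.isclose broadcasts the same way and uses a tolerance relative to its second argument (|a - b| <= atol + rtol*|b|), so the 2D mask can also be built directly:

# 2D boolean mask of shape (len(b), len(a)); tolerance is relative to b
mask2d = np.isclose(a, b[:, None], rtol=rtol, atol=0)
indices = np.where(mask2d.any(axis=1))[0]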
I feel this must be very basic but I cannot find a simple way.
I am using python3
I have many data files with x,y data where x goes from 0 to 140 (floating).
Let's say
0, 2.1
0.5,3.5
0.8,3.2
...
I want to import values of x within the range 25.4 to 28.1 and their corresponding values in y. Every file might have a different length, so the value x > 25.4 might appear in a different row.
I am looking for something equivalent to the following command in gnuplot:
set xrange [25.4:28.1]
This time I cannot use gnuplot because the data processing requires more than the capabilities of gnuplot.
I imported the data with Pandas but I cannot set a range.
Thank you.
r = range(start, stop, step) is the pattern for this in Python.
So, for example, to get:
r == [0, 1, 2]
You would write:
r = [x for x in range(3)]
And to get:
r == [0, 5, 10]
You would write:
r = [x for x in range(0, 11, 5)]
This doesn't get you very far because:
r = [0, .2, 4.3, 6.3]
r = [x for x in r if x in range(3, 10)]
# r == []
But you can do:
r = [0, .2, 4.3, 6.3]
r = [x for x in r if ((x > 3) & (x < 10))]
# r == [4.3, 6.3]
Pandas and NumPy give you a much more concise way of doing this. Consider the following demo of .between:
import pandas as pd
import io
text = io.StringIO("""Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413""")
df = pd.read_csv(text, sep=r'\s+')
df = df[df["Close"].between(449, 452)] # between
df
So for your df you can do the same: df = df[df["x"].between(25.4, 28.1)]
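Applied to the question's files, that might look like the following (assuming two comma-separated columns and no header; the file name and column names are placeholders):

import pandas as pd

df = pd.read_csv("data.txt", header=None, names=["x", "y"])
subset = df[df["x"].between(25.4, 28.1)]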