This question is about python/numpy, but it may apply to other languages as well.
How can the following code be improved to safely clamp large float values to the
maximum int64 value during conversion? (Ideally, it should still be efficient.)
import numpy as np
def int64_from_clipped_float64(x, dtype=np.int64):
x = np.round(x)
x = np.clip(x, np.iinfo(dtype).min, np.iinfo(dtype).max)
# The problem is that np.iinfo(dtype).max is imprecisely approximated as a
# float64, and the approximation leads to overflow in the conversion.
return x.astype(dtype)
for x in [-3.6, 0.4, 1.7, 1e18, 1e25]:
x = np.array(x, dtype=np.float64)
print(f'x = {x:<10} result = {int64_from_clipped_float64(x)}')
# x = -3.6 result = -4
# x = 0.4 result = 0
# x = 1.7 result = 2
# x = 1e+18 result = 1000000000000000000
# x = 1e+25 result = -9223372036854775808
The problem is that the largest np.int64 is 263 - 1, which is not representable in floating point. The same issue doesn't happen on the other end, because -263 is exactly representable.
So do the clipping half in float space (for detection) and in integer space (for correction):
def int64_from_clipped_float64(x, dtype=np.int64):
assert x.dtype == np.float64
limits = np.iinfo(dtype)
too_small = x <= np.float64(limits.min)
too_large = x >= np.float64(limits.max)
ix = x.astype(dtype)
ix[too_small] = limits.min
ix[too_large] = limits.max
return ix
Here is a generalization of the answer by orlp# to safely clip-convert from
arbitrary floats to arbitrary integers, and to support scalar values as input.
The function is also useful for the conversion of np.float32 to np.int32
because it avoids the creation of intermediate np.float64 values,
as seen in the timing measurements.
def int_from_float(x, dtype=np.int64):
x = np.asarray(x)
assert issubclass(x.dtype.type, np.floating)
input_is_scalar = x.ndim == 0
x = np.atleast_1d(x)
imin, imax = np.iinfo(dtype).min, np.iinfo(dtype).max
fmin, fmax = x.dtype.type((imin, imax))
too_small = x <= fmin
too_large = x >= fmax
ix = x.astype(dtype)
ix[too_small] = imin
ix[too_large] = imax
return ix.item() if input_is_scalar else ix
print(int_from_float(np.float32(3e9), dtype=np.int32)) # 2147483647
print(int_from_float(np.float32(5e9), dtype=np.uint32)) # 4294967295
print(int_from_float(np.float64(1e25), dtype=np.int64)) # 9223372036854775807
a = np.linspace(0, 5e9, 1_000_000, dtype=np.float32).reshape(1000, 1000)
%timeit int_from_float(np.round(a), dtype=np.int32)
# 100 loops, best of 3: 3.74 ms per loop
%timeit np.clip(np.round(a), np.iinfo(np.int32).min, np.iinfo(np.int32).max).astype(np.int32)
# 100 loops, best of 3: 5.56 ms per loop
Related
I would like to solve the above formulation in Scipy and solve it using milp(). For a given graph (V, E), f_ij and x_ij are the decision variables. f_ij is the flow from i to j (it can be continuous). x_ij is the number of vehicles from i to j. p is the price. X is the available number vehicles in a region. c is the capacity.
I have difficulty in translating the formulation to Scipy milp code. I would appreciate it if anyone could give me some pointers.
What I have done:
The code for equation (1):
f_obj = [p[i] for i in Edge]
x_obj = [0]*len(Edge)
obj = f_obj + v_obj
Integrality:
f_cont = [0 for i in Edge] # continous
x_int = [1]*len(Edge) # integer
integrality = f_cont + x_int
Equation (2):
def constraints(self):
b = []
A = []
const = [0]*len(Edge) # for f_ij
for i in v: # for x_ij
for e in Edge:
if e[0] == i:
const.append(1)
else:
const.append(0)
A.append(const)
b.append(self.accInit[i])
const = [0]*len(Edge) # for f_ij
return A, b
Equation (4):
[(0, demand[e]) for e in Edge]
I'm going to do some wild guessing, given how much you've left open to interpretation. Let's assume that
this is a maximisation problem, since the minimisation problem is trivial
Expression (1) is actually the maximisation objective function, though you failed to write it as such
p and d are floating-point vectors
X is an integer vector
c is a floating-point scalar
the graph edges, since you haven't described them at all, do not matter for problem setup
The variable names are not well-chosen and hide what they actually contain. I demonstrate potential replacements.
import numpy as np
from numpy.random._generator import Generator
from scipy.optimize import milp, Bounds, LinearConstraint
import scipy.sparse
from numpy.random import default_rng
rand: Generator = default_rng(seed=0)
N = 20
price = rand.uniform(low=0, high=10, size=N) # p
demand = rand.uniform(low=0, high=10, size=N) # d
availability = rand.integers(low=0, high=10, size=N) # X aka. accInit
capacity = rand.uniform(low=0, high=10) # c
c = np.zeros(2*N) # f and x
c[:N] = -price # (1) f maximized with coefficients of 'p'
# x not optimized
CONTINUOUS = 0
INTEGER = 1
integrality = np.empty_like(c, dtype=int)
integrality[:N] = CONTINUOUS # f
integrality[N:] = INTEGER # x
upper = np.empty_like(c)
upper[:N] = demand # (4) f
upper[N:] = availability # (2) x
eye_N = scipy.sparse.eye(N)
A = scipy.sparse.hstack((-eye_N, capacity*eye_N)) # (3) 0 <= -f + cx
result = milp(
c=c, integrality=integrality,
bounds=Bounds(lb=np.zeros_like(c), ub=upper),
constraints=LinearConstraint(lb=np.zeros(N), A=A),
)
print(result.message)
flow = result.x[:N]
vehicles = result.x[N:].astype(int)
I am doing some numerical analysis exercise where I need calculate solution of linear system using a specific algorithm. My answer differs from the answer of the book by some decimal places which I believe is due to rounding errors. Is there a way where I can automatically set arithmetic to round eight decimal places after each arithmetic operation? The following is my python code.
import numpy as np
A1 = [4, -1, 0, 0, -1, 4, -1, 0,\
0, -1, 4, -1, 0, 0, -1, 4]
A1 = np.array(A1).reshape([4,4])
I = -np.identity(4)
O = np.zeros([4,4])
A = np.block([[A1, I, O, O],
[I, A1, I, O],
[O, I, A1, I],
[O, O, I, A1]])
b = np.array([1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6])
def conj_solve(A, b, pre=False):
n = len(A)
C = np.identity(n)
if pre == True:
for i in range(n):
C[i, i] = np.sqrt(A[i, i])
Ci = np.linalg.inv(C)
Ct = np.transpose(Ci)
x = np.zeros(n)
r = b - np.matmul(A, x)
w = np.matmul(Ci, r)
v = np.matmul(Ct, w)
alpha = np.dot(w, w)
for i in range(MAX_ITER):
if np.linalg.norm(v, np.infty) < TOL:
print(i+1, "steps")
print(x)
print(r)
return
u = np.matmul(A, v)
t = alpha/np.dot(v, u)
x = x + t*v
r = r - t*u
w = np.matmul(Ci, r)
beta = np.dot(w, w)
if np.abs(beta) < TOL:
if np.linalg.norm(r, np.infty) < TOL:
print(i+1, "steps")
print(x)
print(r)
return
s = beta/alpha
v = np.matmul(Ct, w) + s*v
alpha = beta
print("Max iteration exceeded")
return x
MAX_ITER = 1000
TOL = 0.05
sol = conj_solve(A, b, pre=True)
Using this, I get 2.55516527 as first element of array which should be 2.55613420.
OR, is there a language/program where I can specify the precision of arithmetic?
Precision/rounding during the calculation is unlikely to be the issue.
To test this I ran the calculation with precisions that bracket the precision you are aiming for: once with np.float64, and once with np.float32. Here is a table of the printed results, their approximate decimal precision, and the result of the calculation (ie, the first printed array value).
numpy type decimal places result
-------------------------------------------------
np.float64 15 2.55516527
np.float32 6 2.5551653
Given that these are so much in agreement, I doubt an intermediate precision of 8 decimal places is going to give an answer that's not between these two results (ie, 2.55613420 that's off in the 4th digit).
This isn't part isn't part of my answer, but is a comment on using mpmath. The questioner suggested it in the comments, and it was my first thought too, so I ran a quick test to see if it behaved how I expected with low precision calculations. It didn't, so I abandoned it (but I'm not an expert with it).
Here's my test function, basically multiplying 1/N by N and 1/N repeatedly to emphasise the error in 1/N.
def precision_test(dps=100, N=19, t=mpmath.mpf):
with mpmath.workdps(dps):
x = t(1)/t(N)
print(x)
y = x
for i in range(10000):
y *= x
y *= N
print(y)
This works as expected with, eg, np.float32:
precision_test(dps=2, N=3, t=np.float32)
# 0.33333334
# 0.3334327041164994
Note that the error has propagated into more significant digits, as expected.
But with mpmath, I could never get that to happen (testing with a range of dps and a various prime N values):
precision_test(dps=2, N=3)
# 0.33
# 0.33
Because of this test, I decided mpmath is not going to give normal results for low precision calculations.
TL;DR:
mpmath didn't behave how I expected at low precision so I abandoned it.
I have a code that works perfectly well but I wish to speed up the time it takes to converge. A snippet of the code is shown below:
def myfunction(x, i):
y = x + (min(0, target[i] - data[i, :]x))*data[i]/(norm(data[i])**2))
return y
rows, columns = data.shape
start = time.time()
iterate = 0
iterate_count = []
norm_count = []
res = 5
x_not = np.ones(columns)
norm_count.append(norm(x_not))
iterate_count.append(0)
while res > 1e-8:
for row in range(rows):
y = myfunction(x_not, row)
x_not = y
iterate += 1
iterate_count.append(iterate)
norm_count.append(norm(x_not))
res = abs(norm_count[-1] - norm_count[-2])
print('Converge at {} iterations'.format(iterate))
print('Duration: {:.4f} seconds'.format(time.time() - start))
I am relatively new in Python. I will appreciate any hint/assistance.
Ax=b is the problem we wish to solve. Here, 'A' is the 'data' and 'b' is the 'target'
Ugh! After spending a while on this I don't think it can be done the way you've set up your problem. In each iteration over the row, you modify x_not and then pass the updated result to get the solution for the next row. This kind of setup can't be vectorized easily. You can learn the thought process of vectorization from the failed attempt, so I'm including it in the answer. I'm also including a different iterative method to solve linear systems of equations. I've included a vectorized version -- where the solution is updated using matrix multiplication and vector addition, and a loopy version -- where the solution is updated using a for loop to demonstrate what you can expect to gain.
1. The failed attempt
Let's take a look at what you're doing here.
def myfunction(x, i):
y = x + (min(0, target[i] - data[i, :] # x)) * (data[i] / (norm(data[i])**2))
return y
You subtract
the dot product of (the ith row of data and x_not)
from the ith row of target,
limited at zero.
You multiply this result with the ith row of data divided my the norm of that row squared. Let's call this part2
Then you add this to the ith element of x_not
Now let's look at the shapes of the matrices.
data is (M, N).
target is (M, ).
x_not is (N, )
Instead of doing these operations rowwise, you can operate on the entire matrix!
1.1. Simplifying the dot product.
Instead of doing data[i, :] # x, you can do data # x_not and this gives an array with the ith element giving the dot product of the ith row with x_not. So now we have data # x_not with shape (M, )
Then, you can subtract this from the entire target array, so target - (data # x_not) has shape (M, ).
So far, we have
part1 = target - (data # x_not)
Next, if anything is greater than zero, set it to zero.
part1[part1 > 0] = 0
1.2. Finding rowwise norms.
Finally, you want to multiply this by the row of data, and divide by the square of the L2-norm of that row. To get the norm of each row of a matrix, you do
rownorms = np.linalg.norm(data, axis=1)
This is a (M, ) array, so we need to convert it to a (M, 1) array so we can divide each row. rownorms[:, None] does this. Then divide data by this.
part2 = data / (rownorms[:, None]**2)
1.3. Add to x_not
Finally, we're adding each row of part1 * part2 to the original x_not and returning the result
result = x_not + (part1 * part2).sum(axis=0)
Here's where we get stuck. In your approach, each call to myfunction() gives a value of part1 that depends on target[i], which was changed in the last call to myfunction().
2. Why vectorize?
Using numpy's inbuilt methods instead of looping allows it to offload the calculation to its C backend, so it runs faster. If your numpy is linked to a BLAS backend, you can extract even more speed by using your processor's SIMD registers
The conjugate gradient method is a simple iterative method to solve certain systems of equations. There are other more complex algorithms that can solve general systems well, but this should do for the purposes of our demo. Again, the purpose is not to have an iterative algorithm that will perfectly solve any linear system of equations, but to show what kind of speedup you can expect if you vectorize your code.
Given your system
data # x_not = target
Let's define some variables:
A = data.T # data
b = data.T # target
And we'll solve the system A # x = b
x = np.zeros((columns,)) # Initial guess. Can be anything
resid = b - A # x
p = resid
while (np.abs(resid) > tolerance).any():
Ap = A # p
alpha = (resid.T # resid) / (p.T # Ap)
x = x + alpha * p
resid_new = resid - alpha * Ap
beta = (resid_new.T # resid_new) / (resid.T # resid)
p = resid_new + beta * p
resid = resid_new + 0
To contrast the fully vectorized approach with one that uses iterations to update the rows of x and resid_new, let's define another implementation of the CG solver that does this.
def solve_loopy(data, target, itermax = 100, tolerance = 1e-8):
A = data.T # data
b = data.T # target
rows, columns = data.shape
x = np.zeros((columns,)) # Initial guess. Can be anything
resid = b - A # x
resid_new = b - A # x
p = resid
niter = 0
while (np.abs(resid) > tolerance).any() and niter < itermax:
Ap = A # p
alpha = (resid.T # resid) / (p.T # Ap)
for i in range(len(x)):
x[i] = x[i] + alpha * p[i]
resid_new[i] = resid[i] - alpha * Ap[i]
# resid_new = resid - alpha * A # p
beta = (resid_new.T # resid_new) / (resid.T # resid)
p = resid_new + beta * p
resid = resid_new + 0
niter += 1
return x
And our original vector method:
def solve_vect(data, target, itermax = 100, tolerance = 1e-8):
A = data.T # data
b = data.T # target
rows, columns = data.shape
x = np.zeros((columns,)) # Initial guess. Can be anything
resid = b - A # x
resid_new = b - A # x
p = resid
niter = 0
while (np.abs(resid) > tolerance).any() and niter < itermax:
Ap = A # p
alpha = (resid.T # resid) / (p.T # Ap)
x = x + alpha * p
resid_new = resid - alpha * Ap
beta = (resid_new.T # resid_new) / (resid.T # resid)
p = resid_new + beta * p
resid = resid_new + 0
niter += 1
return x
Let's solve a simple system to see if this works first:
2x1 + x2 = -5
−x1 + x2 = -2
should give a solution of [-1, -3]
data = np.array([[ 2, 1],
[-1, 1]])
target = np.array([-5, -2])
print(solve_loopy(data, target))
print(solve_vect(data, target))
Both give the correct solution [-1, -3], yay! Now on to bigger things:
data = np.random.random((100, 100))
target = np.random.random((100, ))
Let's ensure the solution is still correct:
sol1 = solve_loopy(data, target)
np.allclose(data # sol1, target)
# Output: False
sol2 = solve_vect(data, target)
np.allclose(data # sol2, target)
# Output: False
Hmm, looks like the CG method doesn't work for badly conditioned random matrices we created. Well, at least both give the same result.
np.allclose(sol1, sol2)
# Output: True
But let's not get discouraged! We don't really care if it works perfectly, the point of this is to demonstrate how amazing vectorization is. So let's time this:
import timeit
timeit.timeit('solve_loopy(data, target)', number=10, setup='from __main__ import solve_loopy, data, target')
# Output: 0.25586539999994784
timeit.timeit('solve_vect(data, target)', number=10, setup='from __main__ import solve_vect, data, target')
# Output: 0.12008900000000722
Nice! A ~2x speedup simply by avoiding a loop while updating our solution!
For larger systems, this will be even better.
for N in [10, 50, 100, 500, 1000]:
data = np.random.random((N, N))
target = np.random.random((N, ))
t_loopy = timeit.timeit('solve_loopy(data, target)', number=10, setup='from __main__ import solve_loopy, data, target')
t_vect = timeit.timeit('solve_vect(data, target)', number=10, setup='from __main__ import solve_vect, data, target')
print(N, t_loopy, t_vect, t_loopy/t_vect)
This gives us:
N t_loopy t_vect speedup
00010 0.002823 0.002099 1.345390
00050 0.051209 0.014486 3.535048
00100 0.260348 0.114601 2.271773
00500 0.980453 0.240151 4.082644
01000 1.769959 0.508197 3.482822
I want to solve this equation without any Modules(NumPy, Sympy... etc.)
Px + Qy = W
(ex. 5x + 6y = 55)
Thanks.
It is a very crude way to do this, but you can use brute-force technique, as I said in comment under your question. It can probably be optimized a lot, gives only int outputs, but overall shows the method:
import numpy as np
# Provide the equation:
print("Provide a, b and c to evaluate in equation of form {ax + by - c = 0}")
a = float(input("a: "))
b = float(input("b: "))
c = float(input("c: "))
x_range = int(input("x-searching range (-a, a): "))
y_range = int(input("y-searching range (-b, b): "))
error = float(input("maximum accepted error from the exact solution: "))
x_range = np.arange(-x_range, x_range, 1)
y_range = np.arange(-y_range, y_range, 1)
for x in x_range:
for y in y_range:
if -error <= a * x + b * y - c <= error:
print("Got an absolute error of {} or less with numbers x = {} and y = {}.".format(error, x, y))
Example output for a = 1, b = 2, c = 3, x_range = 10, y_range = 10, error = 0.001:
Got an error of 0.001 or less with numbers x = -9 and y = 6.
Got an error of 0.001 or less with numbers x = -7 and y = 5.
Got an error of 0.001 or less with numbers x = -5 and y = 4.
Got an error of 0.001 or less with numbers x = -3 and y = 3.
Got an error of 0.001 or less with numbers x = -1 and y = 2.
Got an error of 0.001 or less with numbers x = 1 and y = 1.
Got an error of 0.001 or less with numbers x = 3 and y = 0.
Got an error of 0.001 or less with numbers x = 5 and y = -1.
Got an error of 0.001 or less with numbers x = 7 and y = -2.
Got an error of 0.001 or less with numbers x = 9 and y = -3.
I am using numpy, but not a built-in function to solve the equation itself, just to create an array. This can be done without it, of course.
There are thousands of ways to solve an equation with python.
One of those is:
def myfunc (x=None, y=None):
return ((55-6*y)/5.0) if y else ((55-5*x)/6.0)
print(myfunc(x=10)) # OUTPUT: 0.833333333333, y value for x == 10
print(myfunc(y=42)) # OUTPUT: -39.4, x value for y == 42
You simply define inside a function the steps required to solve the equation.
In our example, if we have y value we subtract 6*y to 55 then we divide by 5.0 (we add .0 to have a float as result), otherwise (means we have x) we subtract 5*x from 55 and then we divide by 6.0
with the same principle, you can generalize:
def myfunc (x=None, y=None, P=None, Q=None, W=None):
if not W:
return P*x + Q*y
elif not x:
return (W-Q*y)/float(P)
elif not y:
return (W-P*x)/float(Q)
elif not P:
return (W-Q*y)/float(x)
elif not Q:
return (W-P*x)/float(y)
print(myfunc(x=10, P=5, Q=6, W=55)) # OUTPUT: 0.833333333333, y value for x == 10
print(myfunc(y=42, P=5, Q=6, W=55)) # OUTPUT: -39.4, x value for y == 42
check this QA for some other interesting ways to approach this problem
Instructions: Compute and store R=1000 random values from 0-1 as x. moving_window_average(x, n_neighbors) is pre-loaded into memory from 3a. Compute the moving window average for x for the range of n_neighbors 1-9. Store x as well as each of these averages as consecutive lists in a list called Y.
My solution:
R = 1000
n_neighbors = 9
x = [random.uniform(0,1) for i in range(R)]
Y = [moving_window_average(x, n_neighbors) for n_neighbors in range(1,n_neighbors)]
where moving_window_average(x, n_neighbors) is a function as follows:
def moving_window_average(x, n_neighbors=1):
n = len(x)
width = n_neighbors*2 + 1
x = [x[0]]*n_neighbors + x + [x[-1]]*n_neighbors
# To complete the function,
# return a list of the mean of values from i to i+width for all values i from 0 to n-1.
mean_values=[]
for i in range(1,n+1):
mean_values.append((x[i-1] + x[i] + x[i+1])/width)
return (mean_values)
This gives me an error, Check your usage of Y again. Even though I've tested for a few values, I did not get yet why there is a problem with this exercise. Did I just misunderstand something?
The instruction tells you to compute moving averages for all neighbors ranging from 1 to 9. So the below code should work:
import random
random.seed(1)
R = 1000
x = []
for i in range(R):
num = random.uniform(0,1)
x.append(num)
Y = []
Y.append(x)
for i in range(1,10):
mov_avg = moving_window_average(x, n_neighbors=i)
Y.append(mov_avg)
Actually your moving_window_average(list, n_neighbors) function is not going to work with a n_neighbors bigger than one, I mean, the interpreter won't say a thing, but you're not delivering correctness on what you have been asked.
I suggest you to use something like:
def moving_window_average(x, n_neighbors=1):
n = len(x)
width = n_neighbors*2 + 1
x = [x[0]]*n_neighbors + x + [x[-1]]*n_neighbors
mean_values = []
for i in range(n):
temp = x[i: i+width]
sum_= 0
for elm in temp:
sum_+= elm
mean_values.append(sum_ / width)
return mean_values
My solution for +100XP
import random
random.seed(1)
R=1000
Y = list()
x = [random.uniform(0, 1) for num in range(R)]
for n_neighbors in range(10):
Y.append(moving_window_average(x, n_neighbors))