docplex.cp.model is slower than an exhaustive search

I am working on a combinatorial optimisation problem and realised that CPLEX is taking a significant time to run. Here is a toy example, using the Python API for docplex:
import numpy as np
from docplex.cp.model import CpoModel
N = 5000
S = 10
k = 2
u_i = np.random.rand(N)[:,np.newaxis]
u_ij = np.random.rand(N*S).reshape(N, S)
beta = np.random.rand(N)[:,np.newaxis]
m = CpoModel(name = 'model')
R = range(0, S)
idx = [(j) for j in R]
I = m.binary_var_dict(idx)
m.add_constraint(m.sum(I[j] for j in R)<= k)
total_rev = m.sum(beta[i,0] / ( 1 + u_i[i,0]/sum(I[j] * u_ij[i,j] for j in R) ) for i in range(N) )
m.maximize(total_rev)
sol=m.solve(agent='local')
sol.print_solution()
for i in R:
    if sol[I[i]] == 1:
        print('i : ' + str(i))
Part of the output is as follows:
Model constraints: 1, variables: integer: 10, interval: 0, sequence: 0
Solve status: Optimal
Search status: SearchCompleted, stop cause: SearchHasNotBeenStopped
Solve time: 76.14 sec
-------------------------------------------------------------------------------
Objective values: (1665.58,), bounds: (1665.74,), gaps: (9.27007e-05,)
Variables:
+ 10 anonymous variables
I tried the same with an exhaustive search:
import numpy as np
import pandas as pd
from itertools import combinations,permutations,product
import time
start = time.time()
results = []
for K_i in range(1, k+1):  # K
    comb = list(combinations(range(S), K_i))
    A = len(comb)
    for a in range(A):  # A
        comb_i = comb[a]
        I = np.repeat(0, S).reshape(-1, 1)
        I[comb_i, 0] = 1
        u_j = np.matmul(u_ij, I)
        total_rev = np.sum(beta / (1 + u_i/u_j))
        results.append({'comb_i': comb_i, 'total_rev': total_rev})
end = time.time()
time_elapsed = end - start
print('time_elapsed : ', str(time_elapsed))
results = pd.DataFrame(results)
opt_results = results[results['total_rev'] == max(results['total_rev'].values)]
print(opt_results)
Output:
time_elapsed : 0.012971639633178711
comb_i total_rev
23 (1, 6) 1665.581329
As you can see, CPLEX is more than 1000 times slower than the exhaustive search. Is there a way to improve the CPLEX model's performance?

If you change
sol = m.solve(agent='local')
to
sol = m.solve(agent='local', SearchType='DepthFirst')
you'll get the optimal solution faster.
NB: proving optimality can sometimes take a long time with CP Optimizer.
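If proving optimality is what dominates the run time, you can also cap the search with a time limit; TimeLimit (in seconds) is a standard CP Optimizer parameter, so something along these lines returns the best solution found within the limit (the 10-second value is just an example):
sol = m.solve(agent='local', SearchType='DepthFirst', TimeLimit=10)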

For this particular problem:
sol=m.solve(agent='local', SearchType='DepthFirst', Workers=1)
should help out a lot.

Related

Translating a mixed-integer programming formulation to Scipy

I would like to solve the above formulation in SciPy using milp(). For a given graph (V, E), f_ij and x_ij are the decision variables: f_ij is the flow from i to j (it can be continuous) and x_ij is the number of vehicles from i to j. p is the price, X is the available number of vehicles in a region, and c is the capacity.
I have difficulty translating the formulation into SciPy milp code. I would appreciate it if anyone could give me some pointers.
What I have done:
The code for equation (1):
f_obj = [p[i] for i in Edge]
x_obj = [0]*len(Edge)
obj = f_obj + x_obj
Integrality:
f_cont = [0 for i in Edge] # continuous
x_int = [1]*len(Edge) # integer
integrality = f_cont + x_int
Equation (2):
def constraints(self):
    b = []
    A = []
    const = [0]*len(Edge)  # for f_ij
    for i in v:  # for x_ij
        for e in Edge:
            if e[0] == i:
                const.append(1)
            else:
                const.append(0)
        A.append(const)
        b.append(self.accInit[i])
        const = [0]*len(Edge)  # for f_ij
    return A, b
Equation (4):
[(0, demand[e]) for e in Edge]
I'm going to do some wild guessing, given how much you've left open to interpretation. Let's assume that
this is a maximisation problem, since the minimisation problem is trivial
Expression (1) is actually the maximisation objective function, though you failed to write it as such
p and d are floating-point vectors
X is an integer vector
c is a floating-point scalar
the graph edges, since you haven't described them at all, do not matter for problem setup
The variable names are not well-chosen and hide what they actually contain. I demonstrate potential replacements below.
import numpy as np
from numpy.random._generator import Generator
from scipy.optimize import milp, Bounds, LinearConstraint
import scipy.sparse
from numpy.random import default_rng
rand: Generator = default_rng(seed=0)
N = 20
price = rand.uniform(low=0, high=10, size=N) # p
demand = rand.uniform(low=0, high=10, size=N) # d
availability = rand.integers(low=0, high=10, size=N) # X aka. accInit
capacity = rand.uniform(low=0, high=10) # c
c = np.zeros(2*N) # f and x
c[:N] = -price # (1) f maximized with coefficients of 'p'
# x not optimized
CONTINUOUS = 0
INTEGER = 1
integrality = np.empty_like(c, dtype=int)
integrality[:N] = CONTINUOUS # f
integrality[N:] = INTEGER # x
upper = np.empty_like(c)
upper[:N] = demand # (4) f
upper[N:] = availability # (2) x
eye_N = scipy.sparse.eye(N)
A = scipy.sparse.hstack((-eye_N, capacity*eye_N)) # (3) 0 <= -f + cx
result = milp(
    c=c, integrality=integrality,
    bounds=Bounds(lb=np.zeros_like(c), ub=upper),
    constraints=LinearConstraint(lb=np.zeros(N), A=A),
)
print(result.message)
flow = result.x[:N]
vehicles = result.x[N:].astype(int)
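One small addition: result.x is only meaningful when the solve succeeds, so it is worth guarding on the standard OptimizeResult fields before unpacking, for example:
if result.success:
    print("objective value:", -result.fun)  # negate: we minimised -price @ f
else:
    print("no feasible solution found:", result.message)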

DOcplexException: Expression xx cannot be used as divider of xxx

I am new to CPLEX and was trying to find an example where the decision variable appears in the denominator of the objective function, but couldn't find one. This is my optimisation problem; I have tried the following in Python 3:
from docplex.mp.model import Model
import numpy as np
N = 1000
S = 10
k = 2
u_i = np.random.rand(N)[:,np.newaxis]
u_ij = np.random.rand(N*S).reshape(N, S)
beta = np.random.rand(N)[:,np.newaxis]
m = Model(name = 'model')
R = range(1, S+1)
idx = [(j) for j in R]
I = m.binary_var_dict(idx)
m.add_constraint(m.sum(I[j] for j in R)<= k)
total_rev = m.sum(beta[i,0] / ( 1 + u_i[i,0]/sum(I[j] * u_ij[j,i-1] for j in R) ) for i in range(N) )
m.maximize(total_rev)
sol = m.solve()
sol.display()
However, I'm getting the following error when running the line
total_rev = m.sum(beta[i,0] / ( 1 + u_i[i,0]/sum(I[j] * u_ij[j,i-1] for j in R) ) for i in range(N) )
Error :
DOcplexException: Expression 0.564x1+0.057x2+0.342x3+0.835x4+0.452x5+0.802x6+0.324x7+0.763x8+0.264x9+0.226x10 cannot be used as divider of 0.17966220449798675
Can you please help me to overcome this error?
Since your objective is not linear, you should use CP Optimizer (the docplex.cp model) rather than the MP model:
from docplex.cp.model import CpoModel
import numpy as np
N = 10
S = 10
k = 2
u_i = np.random.rand(N)[:,np.newaxis]
u_ij = np.random.rand(N*S).reshape(N, S)
beta = np.random.rand(N)[:,np.newaxis]
m = CpoModel(name = 'model')
R = range(1, S)
idx = [(j) for j in R]
I = m.binary_var_dict(idx)
m.add_constraint(m.sum(I[j] for j in R)<= k)
total_rev = m.sum(beta[i,0] / ( 1 + u_i[i,0]/sum(I[j] * u_ij[j,i-1] for j in R) ) for i in range(N) )
m.maximize(total_rev)
sol=m.solve()
for i in R:
    print(sol[I[i]])
works fine.
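If you also want the objective value, the CPO solve result exposes it; assuming the model above, something like:
print(sol.get_objective_values())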

Chudnovsky algorithm python incorrect decimals

My goal is to get 100,000 or 200,000 correct decimals of Pi in Python. For this I have tried the Chudnovsky algorithm, but I've run into some issues along the way.
First, the program only gives me 29 digits instead of the 50 I want for testing correctness. I know this is a small issue, but I don't understand what I've done wrong.
Second, only the first 14 decimals are correct. After those, the digits no longer match the published values of Pi. How do I get many more correct decimals?
And last, how do I let my code run on all 4 of the threads I have? I've tried using Pool, but it doesn't seem to work (I checked with the Windows Task Manager).
This is my code:
from math import *
from decimal import Decimal, localcontext
from multiprocessing import Pool
import time
k = 0
s = 0
c = Decimal(426880*sqrt(10005))
if __name__ == '__main__':
    start = time.time()
    pi = 0
    with localcontext() as ctx:
        ctx.prec = 50
        with Pool(None) as pool:
            for k in range(0, 500):
                m = Decimal((factorial(6 * k)) / (factorial(3 * k) * Decimal((factorial(k) ** 3))))
                l = Decimal((545140134 * k) + 13591409)
                x = Decimal((-262537412640768000) ** k)
                subPi = Decimal(((m*l)/x))
                s = s + subPi
            print(c*(s**-1))
    print(time.time() - start)
In addition to the small details discussed in the comments and proposed by @mark-dickinson, I think I've fixed the multithreading, but I haven't had a chance to test it; let me know if it works properly.
UPDATE: the problems after the 28th digit were due to sq and c being assigned before the decimal context change. Reassigning their values after changing the context precision solved the problem.
from math import *
import decimal
from decimal import Decimal, localcontext
from multiprocessing import Pool
import time
k = 0
s = 0
sq = Decimal(10005).sqrt() #useless here
c = Decimal(426880*sq) #useless here
def calculate():
    global s, k
    for k in range(0, 500):
        m = Decimal((factorial(6 * k)) / (factorial(3 * k) * Decimal((factorial(k) ** 3))))
        l = Decimal((545140134 * k) + 13591409)
        x = Decimal((-262537412640768000) ** k)
        subPi = Decimal((m*l)/x)
        s = s + subPi
    print(c*(s**-1))
if __name__ == '__main__':
    start = time.time()
    pi = 0
    decimal.getcontext().prec = 100  # change the precision to increase the result digits
    sq = Decimal(10005).sqrt()
    c = Decimal(426880*sq)
    pool = Pool()
    result = pool.apply_async(calculate)
    result.get()
    print(time.time() - start)
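To check how many of the printed digits actually match, comparing against a high-precision reference value of Pi is handy; a minimal sketch using the mpmath package (assuming it is installed):
from mpmath import mp
mp.dps = 100          # same number of digits as the Decimal context above
print(+mp.pi)         # reference value of Pi to compare against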

Performance issues of second/third/etc factorial program

I'm having some performance issues with my program that calculates the second, third, fourth, etc. factorial.
e.g. 11!! is 11*9*7*5*3*1
e.g. 111!!! is 111*108*105*...
I expect exponential time scaling, but what I'm getting is more aggressive than I expected:
The 6th factorial takes 0.2 seconds
The 7th factorial takes 21 seconds
The 8th factorial takes so long I haven't let it finish
I've noticed a little over half the time is spent printing, due to the int-to-string conversion. I've tried counting the digits by repeatedly floor-dividing by 10 instead, but that took even longer.
Is there anything I can do to improve performance?
import argparse
import datetime
import math
import MyFormatter
def first_n_digits(num, n):
    return num // 10 ** (int(math.log(num, 10)) - n + 1)
start = datetime.datetime.now()
parser = argparse.ArgumentParser(
    formatter_class=MyFormatter.MyFormatter,
    description="Calcs x factorial",
    usage="",
)
parser.add_argument("-n", "--number", type=int)
args = parser.parse_args()
s = ""
for i in range(0, args.number):
    s = s + "1"
n = 1
s = int(s)
arr = []
while s > 0:
    arr.append(s)
    s -= args.number
n = math.prod(arr)
fnd = str(first_n_digits(n,3))
print("{}.{}{}e{}".format(fnd[0], fnd[1], fnd[2], str(len(str(n))-1)))
end = datetime.datetime.now()
print(end-start)
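A quick way to confirm that the printing really dominates is to time the product and the digit conversion separately; a minimal sketch, assuming arr has been built as in the script above:
import math
import time

t0 = time.perf_counter()
n = math.prod(arr)           # the multifactorial product itself
t1 = time.perf_counter()
digits = len(str(n))         # int-to-string conversion, the suspected bottleneck
t2 = time.perf_counter()
print("product: %.2fs, str(): %.2fs, %d digits" % (t1 - t0, t2 - t1, digits))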

Monte Carlo Optimization

I am trying to do a Monte Carlo minimisation to solve for the parameters of a given equation. My equation has 4 parameters, so a full grid over n candidate values per parameter means about n**4 evaluations; when I tried n = 100, I saw that it is not a good idea to search the whole parameter space.
Here is my code:
import sys
import numpy as np
#import matplotlib.pyplot as plt
#import pandas as pd
import random
# returns the sum of squared residuals for the given parameters
def currentFunc(x, alpha1, alpha2, alpha3, alpha4):
    term = -(x/alpha4)
    term_Norm = term
    expoterm = np.exp(term_Norm)
    #print('check term: x: %0.10f %0.10f exp: %0.10f' % (x,term_Norm,expoterm) )
    return -alpha1*((alpha2/(alpha3 + expoterm)) - 1)
def sumsquarecurr(x, y, a1, a2, a3, a4):
    xsize = len(x)
    ysize = len(y)
    sumsqdiff = 0
    if xsize != ysize:
        print("check your X and Y length, exiting ...")
        sys.exit(0)
    for i in range(ysize):
        diff = y[i] - currentFunc(x[i], a1, a2, a3, a4)
        sumsqdiff += diff*diff
    return sumsqdiff
# number of random draws (this affects the accuracy of the Monte Carlo method)
n = 10
a_rnad = []
b_rnad = []
c_rnad = []
d_rnad = []
for i in range(n):
    #random.seed(555)
    xtemp = random.uniform(0.0, 2.0)
    print('check %.4f ' % (xtemp))
    a_rnad.append(xtemp)  # note: the same draw is appended to all four lists
    b_rnad.append(xtemp)
    c_rnad.append(xtemp)
    d_rnad.append(xtemp)
Yfit=[-7,-5,-3,-1,1,3,5,7]
Xfit=[8.077448e-07,6.221196e-07,4.231292e-07,1.710039e-07,-4.313762e-05,-8.248818e-05,-1.017410e-04,-1.087409e-04]
# placeholder for the parameters and the minimum sum of squares
#[alpha1,alpha2,alpha3,alpha4,min]
minparam = [0,0,0,0,99999999999.0]
for j in range(len(a_rnad)):
    for i in range(len(b_rnad)):
        for k in range(len(c_rnad)):
            for m in range(len(d_rnad)):
                minsumsqdiff_temp = sumsquarecurr(Xfit, Yfit, a_rnad[j], b_rnad[i], c_rnad[k], d_rnad[m])
                print('alpha1: %.4f alpha2: %.4f alpha3: %.4f alpha4: %.4f min: %0.4f' % (a_rnad[j], b_rnad[i], c_rnad[k], d_rnad[m], minsumsqdiff_temp))
                if minsumsqdiff_temp < minparam[4]:
                    minparam[0] = a_rnad[j]
                    minparam[1] = b_rnad[i]
                    minparam[2] = c_rnad[k]
                    minparam[3] = d_rnad[m]
                    minparam[4] = minsumsqdiff_temp
print('minimization: alpha1: %.4f alpha2: %.4f alpha3: %.4f alpha4: %.4f min: %0.4f' % (minparam[0], minparam[1], minparam[2], minparam[3], minparam[4]))
Question:
Is there a way to make this algorithm run faster (for example, by cutting down the search/parameter space)?
I feel I am reinventing the wheel. Does anyone know of a Python module that can do what I am trying to do?
Thanks in advance for your help.
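For the second question, SciPy's least-squares fitting is one option; a minimal sketch using scipy.optimize.curve_fit with the currentFunc defined above (the initial guess p0 and maxfev are my assumptions, not tuned values):
import numpy as np
from scipy.optimize import curve_fit

x = np.array(Xfit)
y = np.array(Yfit)
# curve_fit minimises the same sum of squared residuals over alpha1..alpha4
popt, pcov = curve_fit(currentFunc, x, y, p0=[1.0, 1.0, 1.0, 1.0], maxfev=10000)
print('fitted parameters:', popt)
print('residual sum of squares:', np.sum((y - currentFunc(x, *popt))**2))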
