I am new to CPLEX and I was trying to find an example where the decision variable is in the denominator of the objective function, but couldn't. My optimisation problem (as implemented below): maximise sum_i beta_i / (1 + u_i / sum_j I_j * u_ij), subject to sum_j I_j <= k, with the I_j binary.
I have tried the following in Python 3:
from docplex.mp.model import Model
import numpy as np

N = 1000
S = 10
k = 2

u_i = np.random.rand(N)[:, np.newaxis]
u_ij = np.random.rand(N * S).reshape(N, S)
beta = np.random.rand(N)[:, np.newaxis]

m = Model(name='model')
R = range(1, S + 1)
idx = [(j) for j in R]
I = m.binary_var_dict(idx)
m.add_constraint(m.sum(I[j] for j in R) <= k)

total_rev = m.sum(beta[i, 0] / (1 + u_i[i, 0] / sum(I[j] * u_ij[i, j - 1] for j in R)) for i in range(N))
m.maximize(total_rev)

sol = m.solve()
sol.display()
However, I'm getting the following error when running this line:
total_rev = m.sum(beta[i, 0] / (1 + u_i[i, 0] / sum(I[j] * u_ij[i, j - 1] for j in R)) for i in range(N))
Error:
DOcplexException: Expression 0.564x1+0.057x2+0.342x3+0.835x4+0.452x5+0.802x6+0.324x7+0.763x8+0.264x9+0.226x10 cannot be used as divider of 0.17966220449798675
Can you please help me to overcome this error?
Since your objective is not linear, you should use CPO (CP Optimizer) within CPLEX:
from docplex.cp.model import CpoModel
import numpy as np

N = 10
S = 10
k = 2

u_i = np.random.rand(N)[:, np.newaxis]
u_ij = np.random.rand(N * S).reshape(N, S)
beta = np.random.rand(N)[:, np.newaxis]

m = CpoModel(name='model')
R = range(1, S + 1)
idx = [(j) for j in R]
I = m.binary_var_dict(idx)
m.add_constraint(m.sum(I[j] for j in R) <= k)

total_rev = m.sum(beta[i, 0] / (1 + u_i[i, 0] / sum(I[j] * u_ij[i, j - 1] for j in R)) for i in range(N))
m.maximize(total_rev)

sol = m.solve()
for i in R:
    print(sol[I[i]])
This works fine.
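As a side note, the objective value should also be available from the CPO solution, via sol.get_objective_values().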
Related
I am working on a combinatorial optimisation problem and realised that CPLEX is taking a significant time to run. Here is a toy example:
I am using the Python API, docplex.
import numpy as np
from docplex.cp.model import CpoModel

N = 5000
S = 10
k = 2

u_i = np.random.rand(N)[:, np.newaxis]
u_ij = np.random.rand(N * S).reshape(N, S)
beta = np.random.rand(N)[:, np.newaxis]

m = CpoModel(name='model')
R = range(0, S)
idx = [(j) for j in R]
I = m.binary_var_dict(idx)
m.add_constraint(m.sum(I[j] for j in R) <= k)

total_rev = m.sum(beta[i, 0] / (1 + u_i[i, 0] / sum(I[j] * u_ij[i, j] for j in R)) for i in range(N))
m.maximize(total_rev)

sol = m.solve(agent='local')
sol.print_solution()

for i in R:
    if sol[I[i]] == 1:
        print('i : ' + str(i))
Part of the output is as follows:
Model constraints: 1, variables: integer: 10, interval: 0, sequence: 0
Solve status: Optimal
Search status: SearchCompleted, stop cause: SearchHasNotBeenStopped
Solve time: 76.14 sec
-------------------------------------------------------------------------------
Objective values: (1665.58,), bounds: (1665.74,), gaps: (9.27007e-05,)
Variables:
+ 10 anonymous variables
I tried the same with an exhaustive search:
import numpy as np
import pandas as pd
from itertools import combinations, permutations, product
import time

start = time.time()
results = []
for K_i in range(1, k + 1):  # K
    comb = list(combinations(range(S), K_i))
    A = len(comb)
    for a in range(A):  # A
        comb_i = comb[a]
        I = np.repeat(0, S).reshape(-1, 1)
        I[comb_i, 0] = 1
        u_j = np.matmul(u_ij, I)
        total_rev = np.sum(beta / (1 + u_i / u_j))
        results.append({'comb_i': comb_i, 'total_rev': total_rev})
end = time.time()

time_elapsed = end - start
print('time_elapsed : ', str(time_elapsed))

results = pd.DataFrame(results)
opt_results = results[results['total_rev'] == max(results['total_rev'].values)]
print(opt_results)
Output:
time_elapsed : 0.012971639633178711
comb_i total_rev
23 (1, 6) 1665.581329
As you can see, CPLEX is several thousand times slower than the exhaustive search (76 s versus 0.013 s). Is there a way to improve the CPLEX run?
If you change
sol = m.solve(agent='local')
to
sol = m.solve(agent='local', SearchType='DepthFirst')
you'll get the optimal solution faster.
NB: proving optimality can sometimes take a long time with CP Optimizer.
For this particular problem:
sol = m.solve(agent='local', SearchType='DepthFirst', Workers=1)
should help out a lot.
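Roughly speaking, CP Optimizer's default restart-based search spends much of its time tightening the bound to prove optimality; with a search space this small, a plain depth-first enumeration tends to finish faster, and Workers=1 avoids the overhead of coordinating parallel workers.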
I'm trying to write a program that approximates the integral of e^(x^2) between 0 and 1, based on this integral formula (the composite Simpson's 3/8 rule):
integral from a to b of f(x) dx ≈ (3h/8) * [f(x_0) + 3f(x_1) + 3f(x_2) + 2f(x_3) + 3f(x_4) + 3f(x_5) + 2f(x_6) + ... + f(x_n)], where h = (b - a)/n and n is a multiple of 3.
I've written the code below, but it keeps giving the wrong answer (other methods give 1.46 as the answer; this one gives 1.006).
I think there may be a problem with the two for loops that do the Riemann sum, or with the way I've written the formula. I also tried rewriting the formula in other ways, but had no success.
Any kind of help is appreciated.
import math
import numpy as np

def f(x):
    y = np.exp(x**2)
    return y

a = float(input("What is the lower limit? \n"))
b = float(input("What is the upper limit? \n"))
n = int(input("What is the number of intervals? "))

x = np.zeros([n+1])
y = np.zeros([n])
z = np.zeros([n])
h = (b - a) / n
print(h)

x[0] = a
x[n] = b
suma1 = 0
suma2 = 0

for i in np.arange(1, n):
    x[i] = x[i-1] + h
    suma1 = suma1 + f(x[i])
    alfa = (x[i] - x[i-1]) / 3

for i in np.arange(0, n):
    y[i] = (x[i-1] + alfa)
    suma2 = suma2 + f(y[i])
    z[i] = y[i] + alfa

int3 = ((b-a)/(8*n)) * (f(x[0]) + f(x[n]) + (3*(suma2 + f(z[i]))) + (2*(suma1)))
print(int3)
I'm not a math major, but I remember helping a friend with this rule for something about waterplane area for ships.
Here's an implementation based on Wikipedia's description of Simpson's 3/8 rule:
import numpy as np

# f as defined in the question
def f(x):
    return np.exp(x**2)

# The input parameters
a, b, n = 0, 1, 10

# Divide the interval into 3*n sub-intervals,
# and hence 3*n+1 endpoints
x = np.linspace(a, b, 3*n + 1)
y = f(x)

# The weights for each group of 4 points
w = [1, 3, 3, 1]

result = 0
for i in range(0, 3*n, 3):
    # Calculate the area, 4 points at a time
    result += (x[i+3] - x[i]) / 8 * (y[i:i+4] * w).sum()

# result = 1.4626525814387632
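(For reference, the exact value is (sqrt(pi)/2) * erfi(1) ≈ 1.4626517, so even n = 10 already agrees to about five decimal places.)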
You can do it using numpy.vectorize (based on this Wikipedia post):
a, b, n = 0, 1, 10**6
h = (b - a) / n
x = np.linspace(0, n, n+1)*h + a
fv = np.vectorize(f)

(
    3*h/8 * (
        f(x[0]) +
        3 * fv(x[np.mod(np.arange(len(x)), 3) != 0]).sum() +  # skip every 3rd index
        2 * fv(x[::3]).sum() +  # get every 3rd index
        f(x[-1])
    )
)
# Output: 1.462654874404461
If you use numpy's built-in functions (which I think is always possible), performance will improve considerably:
a, b, n = 0, 1, 10**6
h = (b - a) / n
x = np.exp(np.square(np.linspace(0, n, n+1)*h + a))

(
    3*h/8 * (
        x[0] +
        3 * x[np.mod(np.arange(len(x)), 3) != 0].sum() +
        2 * x[::3].sum() +
        x[-1]
    )
)
# Output: 1.462654874404461
I have a task: count how many pairs (i, j) satisfy array_1[i] + array_1[j] > array_2[i] + array_2[j].
This is my code:
import numpy as np
import pandas as pd

n = 200000
series_1 = np.random.randint(low=1, high=1000, size=n)
series_1_T = series_1.reshape(n, 1)
series_2 = np.random.randint(low=1, high=1000, size=n)
series_2_T = series_2.reshape(n, 1)

def differ(x):
    count = 0
    tabel_1 = series_1 + series_1_T[x:x+2000]
    tabel_2 = series_2 + series_2_T[x:x+2000]
    diff = tabel_1[tabel_1 > tabel_2].shape[0]
    count += diff
    return count

arr = pd.DataFrame(data=np.arange(0, n, 2000), columns=["numbers"])
count_each_run = arr["numbers"].apply(differ)  # this takes about 8 min 40 s
print(count_each_run.sum())
Are there any ways to speed this up?
If you don't run into memory errors, you can do:

import numpy as np

n = 200_000
s1 = np.random.randint(low=1, high=1000, size=(n, 1))
s2 = np.random.randint(low=1, high=1000, size=(n, 1))

t1 = s1 + s1.T
t2 = s2 + s2.T
tot = np.sum(t1 > t2)
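(Note that for n = 200,000 the matrix t1 alone is 200,000 x 200,000 int64 values, i.e. roughly 320 GB, so on most machines you will need the batched version below.)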
Otherwise you can create batches; again depending on what you can fit in memory, you can batch over one or both axes:

n = 200_000
s1 = np.random.randint(low=1, high=1000, size=(n, 1))
s2 = np.random.randint(low=1, high=1000, size=(n, 1))

bs = 10_000  # batch size
tot = 0
for i in range(0, n, bs):
    for j in range(0, n, bs):
        t1 = s1[i:i+bs] + s1[j:j+bs].T
        t2 = s2[i:i+bs] + s2[j:j+bs].T
        tot += np.sum(t1 > t2)
If you need more speed, you can try something like Numba or Cython; see the sketch below.
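For instance, here is a minimal Numba sketch (assuming numba is installed; count_pairs is a hypothetical helper name, and it exploits the fact that the condition is equivalent to d[i] + d[j] > 0 with d = series_1 - series_2):

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def count_pairs(d):
    # d[i] = series_1[i] - series_2[i]; the original condition
    # series_1[i] + series_1[j] > series_2[i] + series_2[j]
    # is equivalent to d[i] + d[j] > 0
    n = d.shape[0]
    total = 0
    for i in prange(n):  # outer loop runs in parallel
        c = 0
        for j in range(n):
            if d[i] + d[j] > 0:
                c += 1
        total += c  # scalar reduction, supported inside prange
    return total

s1 = np.random.randint(1, 1000, size=200_000)
s2 = np.random.randint(1, 1000, size=200_000)
print(count_pairs((s1 - s2).astype(np.int64)))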
I want to solve the linear programming problem (LPP) for my max function and its constraints below.
I used the following PuLP code in Python 3.7.
import random
import pulp
import pandas as pd

L1 = [5, 10, 15]
L2 = [1, 2, 3]
L3 = [5, 6, 7]

n = 10
set_I = range(2, n-1)
set_J = range(2, n)
c = {(i, j): random.normalvariate(0, 1) for i in set_I for j in set_J}
a = {(i, j): random.normalvariate(0, 5) for i in set_I for j in set_J}
l = {(i, j): random.randint(0, 10) for i in set_I for j in set_J}
u = {(i, j): random.randint(10, 20) for i in set_I for j in set_J}
b = {j: random.randint(0, 30) for j in set_J}
e = {0 or 1 or 0.5}
I = L1
P = L2
C = L3

opt_model = pulp.LpProblem(name="LPP")

# if x is Continuous
x_vars = {(i, j):
          pulp.LpVariable(cat=pulp.LpContinuous,
                          lowBound=l[i, j], upBound=u[i, j],
                          name="x_{0}_{1}".format(i, j))
          for i in set_I for j in set_J}
# if x is Binary
x_vars = {(i, j):
          pulp.LpVariable(cat=pulp.LpBinary, name="x_{0}_{1}".format(i, j))
          for i in set_I for j in set_J}
# if x is Integer
x_vars = {(i, j):
          pulp.LpVariable(cat=pulp.LpInteger,
                          lowBound=l[i, j], upBound=u[i, j],
                          name="x_{0}_{1}".format(i, j))
          for i in set_I for j in set_J}

# Less-than-or-equal constraints
constraints = {j:
               pulp.LpConstraint(
                   e=pulp.lpSum(a[i, j] * x_vars[i, j] for i in set_I),
                   sense=pulp.LpConstraintLE,
                   rhs=b[j],
                   name="constraint_{0}".format(j))
               for j in set_J}
# >= constraints
constraints = {j:
               pulp.LpConstraint(
                   e=pulp.lpSum(a[i, j] * x_vars[i, j] for i in set_I),
                   sense=pulp.LpConstraintGE,
                   rhs=b[j],
                   name="constraint_{0}".format(j))
               for j in set_J}
# == constraints
constraints = {j:
               pulp.LpConstraint(
                   e=pulp.lpSum(a[i, j] * x_vars[i, j] for i in set_I),
                   sense=pulp.LpConstraintEQ,
                   rhs=b[j],
                   name="constraint_{0}".format(j))
               for j in set_J}

objective = pulp.lpSum(x_vars[i, j] * ((e * I * (C[i])) + (1 - e) * P(i))
                       for i in set_I
                       for j in set_J)

# for maximization
opt_model.sense = pulp.LpMaximize
opt_model.setObjective(objective)

# solving with CBC
opt_model.solve()
# solving with Glpk
opt_model.solve(solver=GLPK_CMD())

opt_df = pd.DataFrame.from_dict(x_vars, orient="index",
                                columns=["variable_object"])
opt_df.index = pd.MultiIndex.from_tuples(opt_df.index,
                                         names=["column_i", "column_j"])
opt_df.reset_index(inplace=True)
opt_df["solution_value"] = opt_df["variable_object"].apply(lambda item: item.varValue)
opt_df.drop(columns=["variable_object"], inplace=True)
opt_df.to_csv("./optimization_solution.csv")
I am a beginner with LPP and PuLP, so I have implemented only the first two equations, to the best of my knowledge. The code also gives me the following error:
TypeError: can't multiply sequence by non-int of type 'set'
How can I add the constraints of equations 3 and 4 to my code and resolve this error? Please also tell me whether my code is correct, and where I should modify it to satisfy the max function. Thanks in advance.
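For what it's worth, the quoted TypeError can be reproduced in isolation: 0 or 1 or 0.5 evaluates to 1, so e = {0 or 1 or 0.5} makes e the one-element set {1}, and multiplying the list I by that set in e * I * (C[i]) raises exactly "can't multiply sequence by non-int of type 'set'". Presumably e is meant to be a plain number (for example e = 0.5), and P(i) is presumably meant to be the list lookup P[i].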
I need to optimize a non-convex problem (maximum likelihood), and when I try local optimization algorithms such as BFGS or Nelder-Mead, they fail to find the extremum; I frequently get a saddle point instead.
You can download data from here.
import numpy as np
import csv
from scipy.stats import norm

f = open('data.csv', 'r')
reader = csv.reader(f)
headers = next(reader)
column = {}
for h in headers:
    column[h] = []
for row in reader:
    for h, v in zip(headers, row):
        column[h].append(float(v))

ini = [-0.0002, -0.01, .002, -0.09, -0.04, 0.01, -0.02, -.0004]
# NB: x and Coef_headers are not defined in this snippet
for i in range(0, len(x[0])):
    ini.append(float(x[0][i]))
x_header = list(Coef_headers)

N = 19  # no. of observations
I = 4
P = 7
Yobs = np.zeros(N)
Yobs[:] = column['size']
X = np.zeros((N, P))
X[:, 0] = column['costTon']
X[:, 1] = column['com1']
X[:, 2] = column['com3']
X[:, 3] = column['com4']
X[:, 4] = column['com5']
X[:, 5] = column['night']
X[:, 6] = 1  # constant

def myfunction(B):
    beta = B[0.299, 18.495, 2.181, 2.754, 3.59, 2.866, -12.846]
    theta = 30
    U = np.zeros((N, I))
    mm = np.zeros(I)
    u = np.zeros((N, I))
    F = np.zeros((N, I))
    G = np.zeros(N)
    l = 0
    s1 = np.expm1(-theta)
    for n in range(0, N):
        m = 0
        U[n, 0] = B[0]*column['cost_van'][n] + B[4]*column['cap_van'][n]
        U[n, 1] = B[1] + B[5]*column['ex'][n] + B[8]*column['dist'][n] + B[0]*column['cost_t'][n] + B[4]*column['cap_t'][n]
        U[n, 2] = B[2] + B[6]*column['ex'][n] + B[9]*column['dist'][n] + B[0]*column['cost_Ht'][n] + B[4]*column['cap_Ht'][n]
        U[n, 3] = B[3] + B[7]*column['ex'][n] + B[10]*column['dist'][n] + B[0]*column['cost_tr'][n] + B[4]*column['cap_tr'][n]
        for i in range(0, I):
            mm[i] = np.exp(U[n, i])
        m = sum(mm)
        for i in range(0, I):
            u[n, i] = 1/(1 + np.exp(U[n, i] - np.log(m - np.exp(U[n, i]))))
            F[n, i] = np.expm1(-u[n, i]*theta)
    CDF = np.zeros(N)
    Y = X.dot(beta)
    resid = 0
    for n in range(0, N):
        resid = resid + (np.square(Yobs[n] - Y[n]))
    SSR = resid / N
    dof = N - P - 1
    s2 = resid/dof  # MSE, or variance: the mean squared error of residuals
    for n in range(0, N):
        CDF[n] = norm.cdf((Yobs[n]+1), SSR, s2) - norm.cdf((Yobs[n]-1), SSR, s2)
        G[n] = np.expm1(-CDF[n]*theta)
        k = column['Choice_Veh'][n]-1
        l = l + (np.log10(1 + (F[n, k]*G[n]/s1))/(-theta))
    loglikelihood = np.log10(l)
    return -loglikelihood

rranges = np.repeat(slice(-10, 10, 1), 11, axis=0)
a = rranges

from scipy import optimize
resbrute = optimize.brute(myfunction, rranges, full_output=True, finish=optimize.fmin)
print("# global minimum:", resbrute[0])
print("function value at global minimum :", resbrute[1])
Now I have decided to go for a grid search and tried scipy.optimize.brute, but I get the error below. In fact, my real problem has 47 variables; I decreased it to 31 to make it work, but it still doesn't. Please help.
File "C:\...\site-packages\numpy\core\numeric.py", line 1906, in indices
res = empty((N,)+dimensions, dtype=dtype)
ValueError: array is too big.
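For context on the error (my reading of it): each slice(-10, 10, 1) contains 20 grid points, and rranges holds 11 of them, so brute tries to materialise a dense grid of 20**11 ≈ 2 x 10**14 points (it evaluates the function over a full numpy.mgrid), which is far more than numpy can allocate; with 31 or 47 variables the grid would be astronomically larger still.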