What is the CP/MILP problem name for the assignment of agents to tasks with fixed start time and end time? - constraint-programming

I'm trying to solve a Constraint Satisfaction Optimisation Problem that assigns agents to tasks. However, different then the basic Assignment Problem, a agent can be assigned to many tasks if the tasks do not overlap.
Each task has a fixed start_time and end_time.
The agents are assigned to the tasks according to some unary&binary constraints.
Variables = set of tasks
Domain = set of compatible agents (for each variable)
Constraints = unary&binary
Optimisation fct = some liniar function
An example of the problem: the allocation of parking space (or teams) for trucks for which we know the arrival and departure time.
I'm interested if there is in the literature a precise name for these type of problems. I presume it is some kind of assignment problem. Also, if you ever approach the problem, how do you solve it?
Thank you.

I would interpret this as: rectangular assignment-problem with conflicts
which is arguably much more hard (NP-hard in general) than the polynomially-solvable assignment-problem.
The demo shown in the other answer might work and ortools' cp-sat is great, but i don't see a good reason to use discrete-time based reasoning here like it's done: interval-variables, edge-finding and co. based scheduling constraints (+ conflict-analysis / explanations). This stuff is total overkill and the overhead will be noticable. I don't see any need to reason about time, but just about time-induced conflicts.
Edit: One could label those two approaches (linked + proposed) as compact formulation and extended formulation. Extended formulations usually show stronger relaxations and better (solving) results as long as scalability is not an issue. Compact approaches might become more viable again with bigger data (bit it's hard to guess here as scheduling-propagators are not that cheap).
What i would propose:
(1) Formulate an integer-programming model following the basic assignment-problem formulation + adaptions to make it rectangular -> a worker is allowed to tackle multiple tasks while all tasks are tackled (one sum-equality dropped)
see wiki
(2) Add integrality = mark variables as binary -> because the problem is not satisfying total unimodularity anymore
(3) Add constraints to forbid conflicts
(4) Add constraints: remaining stuff (e.g. compatibility)
Now this is all straightforward, but i would propose one non-naive improvement in regards to (3):
The conflicts can be interpreted as stable-set polytope
Your conflicts are induced by a-priori defined time-windows and their overlappings (as i interpret it; this is the core assumption behind this whole answer)
This is an interval graph (because of time-windows)
All interval graphs are chordal
Chordal graphs allow enumeration of all max-cliques in poly-time (implying there are only polynomial many)
python: networkx.algorithms.chordal.chordal_graph_cliques
The set (enumeration) of all maximal cliques define the facets of the stable-set polytope
Those (a constraint for each element in the set) we add as constraints!
(The stable-set polytope on the graph in use here would also allow very very powerful semidefinite-relaxations but it's hard to foresee in which cases this would actually help due to SDPs being much more hard to work with: warmstart within tree-search; scalability; ...)
This will lead to a poly-size integer-programming problem which should be very very good when using a good IP-solver (commercials or if open-source needed: Cbc > GLPK).
Small demo about (3)
import itertools
import networkx as nx
# data: inclusive, exclusive
# --------------------------
time_windows = [
(2, 7),
(0, 10),
(6, 12),
(12, 20),
(8, 12),
(16, 20)
]
# helper
# ------
def is_overlapping(a, b):
return (b[1] > a[0] and b[0] < a[1])
# raw conflicts
# -------------
binary_conflicts = []
for a, b in itertools.combinations(range(len(time_windows)), 2):
if is_overlapping(time_windows[a], time_windows[b]):
binary_conflicts.append( (a, b) )
# conflict graph
# --------------
G = nx.Graph()
G.add_edges_from(binary_conflicts)
# maximal cliques
# ---------------
max_cliques = nx.chordal_graph_cliques(G)
print('naive constraints: raw binary conflicts')
for i in binary_conflicts:
print('sum({}) <= 1'.format(i))
print('improved constraints: clique-constraints')
for i in max_cliques:
print('sum({}) <= 1'.format(list(i)))
Output:
naive constraints: raw binary conflicts
sum((0, 1)) <= 1
sum((0, 2)) <= 1
sum((1, 2)) <= 1
sum((1, 4)) <= 1
sum((2, 4)) <= 1
sum((3, 5)) <= 1
improved constraints: clique-constraints
sum([1, 2, 4]) <= 1
sum([0, 1, 2]) <= 1
sum([3, 5]) <= 1
Fun facts:
Commercial integer-programming solvers and maybe even Cbc might even try to do the same reasoning about clique-constraints to some degree although without the assumption of chordality where it's an NP-hard problem
ortools' cp-sat solver has also a code-path for this (again: general NP-hard case)
Should trigger when expressing the conflict-based model (much harder to decide on this exploitation on general discrete-time based scheduling models)
Caveats
Implementation / Scalability
There are still open questions like:
duplicating max-clique constraints over each worker vs. merging them somehow
be more efficient/clever in finding conflicts (sorting)
will it scale to the data: how big will the graph be / how many conflicts and constraints from those do we need
But those things usually follow instance-statistics (aka "don't decide blindly").

I don't know a name for the specific variant you're describing - maybe others would. However, this indeed seems a good fit for a CP/MIP solver; I would go with the OR-Tools CP-SAT solver, which is free, flexible and usually works well.
Here's a reference implementation with Python, assuming each vehicle requires a team assigned to it with no overlaps, and that the goal is to minimize the number of teams in use.
The framework allows to directly model allowed / forbidden assignments (check out the docs)
from ortools.sat.python import cp_model
model = cp_model.CpModel()
## Data
num_vehicles = 20
max_teams = 10
# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]
### variables
# t, v is true iff vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}
#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v], team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v))
for t in range(max_teams) for v in range(num_vehicles)}
team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]
## constraints
# non overlap for each team
for t in range(max_teams):
model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)
# what teams are in use?
for t in range(max_teams):
model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])
#symmetry breaking - use teams in-order
for t in range(max_teams-1):
model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())
# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))
solver = cp_model.CpSolver()
# optional
# solver.parameters.log_search_progress = True
# solver.parameters.num_search_workers = 8
# solver.parameters.max_time_in_seconds = 5
result_status = solver.Solve(model)
if (result_status == cp_model.INFEASIBLE):
print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):
print('Feasible (non optimal) result found')
else:
print('No feasible solution found under constraints within time')
# Output:
#
# Optimal result found, required teams=7
EDIT:
#sascha suggested a beautiful approach for analyzing the (known in advance) time window overlaps, which would make this solvable as an assignment problem.
So while the formulation above might not be the optimal one for this (although it could be, depending on how the solver works), I've tried to replace the no-overlap conditions with the max-clique approach suggested - full code below.
I did some experiments with moderately large problems (100 and 300 vehicles), and it seems empirically that on smaller problems (~100) this does improve by some - about 15% on average on the time to optimal solution; but I could not find a significant improvement on the larger (~300) problems. This might be either because my formulation is not optimal; because the CP-SAT solver (which is also a good IP solver) is smart enough; or because there's something I've missed :)
Code:
(this is basically the same code from above, with the logic to support using the network approach instead of the no-overlap one copied from #sascha's answer):
from timeit import default_timer as timer
from ortools.sat.python import cp_model
model = cp_model.CpModel()
run_start_time = timer()
## Data
num_vehicles = 300
max_teams = 300
USE_MAX_CLIQUES = True
# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]
if (USE_MAX_CLIQUES):
## Max-cliques analysis
# for the max-clique approach
time_windows = [(interval_starts[i], interval_ends[i]) for i in range(num_vehicles)]
def is_overlapping(a, b):
return (b[1] > a[0] and b[0] < a[1])
# raw conflicts
# -------------
binary_conflicts = []
for a, b in itertools.combinations(range(len(time_windows)), 2):
if is_overlapping(time_windows[a], time_windows[b]):
binary_conflicts.append( (a, b) )
# conflict graph
# --------------
G = nx.Graph()
G.add_edges_from(binary_conflicts)
# maximal cliques
# ---------------
max_cliques = nx.chordal_graph_cliques(G)
##
### variables
# t, v is true iff point vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}
#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v], team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v))
for t in range(max_teams) for v in range(num_vehicles)}
team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]
## constraints
# non overlap for each team
if (USE_MAX_CLIQUES):
overlap_constraints = [list(l) for l in max_cliques]
for t in range(max_teams):
for l in overlap_constraints:
model.Add(sum(team_assignments[t, v] for v in l) <= 1)
else:
for t in range(max_teams):
model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)
# what teams are in use?
for t in range(max_teams):
model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])
#symmetry breaking - use teams in-order
for t in range(max_teams-1):
model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())
# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))
solver = cp_model.CpSolver()
# optional
solver.parameters.log_search_progress = True
solver.parameters.num_search_workers = 8
solver.parameters.max_time_in_seconds = 120
result_status = solver.Solve(model)
if (result_status == cp_model.INFEASIBLE):
print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):
print('Feasible (non optimal) result found, required teams=%i' % (solver.ObjectiveValue()))
else:
print('No feasible solution found under constraints within time')
print('run time: %.2f sec ' % (timer() - run_start_time))

Related

Python Quadruple For Performance

I have code that reads data and grabs specific data from an object's fields.
How can I eliminate the quadruple for loop here? Its performance seems quite slow.
data = readnek(filename) # read in data
bigNum=200000
for myNodeVal in range(0, 7): # all 6 elements.
cs_coords = np.ones((bigNum, 2)) # initialize data
counter = 0
for iel in range(bigNum):
for ix in range(0,7):
for iy in range(0,7):
z = data.elem[iel].pos[2, myNodeVal, iy, ix]
x = data.elem[iel].pos[0, myNodeVal, iy, ix]
y = data.elem[iel].pos[1, myNodeVal, iy, ix]
cs_coords[counter, 0:2] = [x, y]
counter += 1
You can remove the two innermost loops using a transposed view that is reshaped so to build a block of 49 [x, y] values then assigned to cs_coords in a vectorized way. The access to z can be removed for better performance (since the Python interpreter optimize nearly nothing). Here is an (untested) example:
data = readnek(filename) # read in data
bigNum=200000
for myNodeVal in range(0, 7): # all 6 elements.
cs_coords = np.ones((bigNum, 2)) # initialize data
counter = 0
for iel in range(bigNum):
arr = data.elem[iel].pos
view_x = arr[0, myNodeVal, 0:7, 0:7].T
view_y = arr[1, myNodeVal, 0:7, 0:7].T
cs_coords[counter:counter+49] = np.hstack([view_x.reshape(-1, 1), view_y.reshape(-1, 1)])
counter += 49
Note that the initial code is probably flawed since cs_coords.shape[0] is bigNum and counter will be bigNum * 49. You certainly need to use the shape (bigNum*49, 2) instead so to avoid out of bound errors.
Note the above code is still far from being optimal since it will create many small arrays and Numpy is not optimized to deal with very small arrays (CPython neither). It is hard to do much better without more information on data. Using Numba or Cython can certainly help a lot to speed up this code. Still, even with such tool, the code will not be very efficient since the memory access pattern is inefficient (bad cache locality) and the overall code will be memory-bound.

Using multivalue DDs to solve multistate reliability quantification

#DCTLib, do you recall this discussion below? You suggested a recursive equation, which was the right approach.
Cudd_PrintMinterm, accessing the individual minterms in the sum of products
Now, I am considering multistate reliability, where we can have either not fail or fail to n-1 different states, with n >= 2. Tulip-dd implements MDDs as described in:
https://github.com/tulip-control/dd/blob/master/doc.md#multi-valued-decision-diagrams-mdd
https://github.com/tulip-control/dd/issues/71
https://github.com/tulip-control/dd/issues/66
In the diagrams in the drawings below, we have defined an MDD declared by:
aut.declare_variable(x=(0,3))
u = aut.add_expr(‘x=i’)
Each value/state of the multi-value variable (MSV) x, x=0, x=1, x=2, or x=3 leads to a specific BDD as shown in the diagrams at the bottom, taking a four-state variable x as example here. The notation is that state 0 represents the normal state and x can fail to different states 1, 2, and 3. The failure probabilities are assigned in table below. In the BDDs below, we (and tulip as well) use the binary coding with two bits x_1 and x_0 to represent each state/value of the MSV. The least significant bit (LSB), i.e., x_0, is always the ancestor. Each of the BDD diagrams below is a representation of a specific value, or state.
To quantify the BDD of a specific state, i.e., the top node, we must know probabilities of binary variables x_0 and x_1 taking different branches (then or else) in the BDD. These branch probabilities are not given directly but need to be calculated according to the BDD structure.
The key here is that the child node probabilities and the branch probabilities of the parent node must be known prior to the calculation of the parent node probability.
In the previous BDD quantification, we knew the probabilities of branches from node x_1 to leaf nodes when calculating node x_1 probability. We did not need to know how node x_1 was connected to node x_0.
Now, for this four-state variable x, we need to know how node x_1 is connected to node x_0, the binary variable representing the least significant bit, to determine the probabilities of branches from node x_1 to leaf nodes. The question is how to implement this?
Here’s one attempt that falls short:
import numpy as np
from omega.symbolic import temporal as trl
aut = trl.Automaton()
# Example of function that returns branch probabilities
def prntr(v, pvars):
assert sum(pvars)==1.0,"Probs must add to 1!"
if (v.high).var == 'x_1':
if (v.low) == aut.true:
return pvars[1] + pvars[3], pvars[1]
else:
return pvars[1] + pvars[3], pvars[3]
if (v.low).var == 'x_1':
if (v.low).negated:
return pvars[1] + pvars[3], pvars[0]
else:
return pvars[1] + pvars[3], pvars[2]
aut.declare_variables(x=(0, 3))
u=aut.add_expr('x = 3')
pvars = np.array([0.1, 0.2, 0.3, 0.4])
prntr(u,pvars)
The key here is that the child node probabilities and the branch probabilities of the parent node must be known prior to the calculation of the parent node probability.
Yes, exactly. In this case, a fully recursive bottom-up computation, like normally done with BDDs, will not work for the reason that you wrote.
However, the approach will start to work again when you treat the variables that together form a state to be a block. So in your recursive function for the probability calculation, whenever you encounter a variable for a block, you treat the node and the successor nodes for the same state component as a block and only recurse when you encounter a node not belonging to the block.
Note that this approach requires that the variables for the state appear continuously in the variable ordering. For the CUDD library, you can constrain the automatic variable reordering to guarantee this.
The following code is a modification of yours implementing this idea:
#!/usr/bin/env python3
import numpy as np
from omega.symbolic import temporal as trl
aut = trl.Automaton()
# Example of function that returns branch probabilities
# Does not yet use result caching and does not yet support assigning probabilities
# to more than one state variable set
def prntr(v, pvars):
assert abs(sum(pvars)-1.0)<=0.0001,"Probs must add to 1!"
if v == aut.true:
return 1.0
elif v == aut.false:
return 0.0
if v.var in ["x_0","x_1"]:
thisSum = 0.0
# Compute probabilities
for i,p in enumerate(pvars):
# Find successor
# Assuming that x_2 always comes after x_1
thisV = v
negate = thisV.negated
if thisV.var == 'x_0':
if i & 1:
thisV = thisV.high
else:
thisV = thisV.low
negate = negate ^ thisV.negated
if thisV.var == 'x_1':
if i & 2:
thisV = thisV.high
else:
thisV = thisV.low
if negate:
thisSum += p*prntr(~thisV, pvars)
else:
thisSum += p*prntr(thisV, pvars)
return thisSum
# TODO: What is the semantics for variables outside of the current block?
return 0.5*prntr(v.high, pvars) + 0.5*prntr(v.low, pvars)
pvars = np.array([0.1, 0.2, 0.3, 0.4])
aut.declare_variables(x=(0, 3))
u= aut.add_expr('x = 0')
print(prntr(u,pvars))
u2 = aut.add_expr('x = 3') | aut.add_expr('x = 2')
print(prntr(u2,pvars))

Using Pyomo Kernel to solve Disjunctive models

I am trying to recreate a problem in the "Pyomo - Optimization Modeling in Python" book using the pyomo kernel instead of the environ. The problem is on page 163 and called "9.4 A mixing problem with semi-continuous variables." For those without the book, here it is:
The following model illustrates a simple mixing problem with three semi-continuous
variables (x1, x2, x3) which represent quantities that are mixed to meet a volumetric
constraint. In this simple example, the number of sources is minimized:
from pyomo.environ import *
from pyomo.gdp import *
L = [1,2,3]
U = [2,4,6]
index = [0,1,2]
model = ConcreteModel()
model.x = Var(index, within=Reals, bounds=(0,20))
# Each disjunct is a semi-continuous variable
# x[k] == 0 or L[k] <= x[k] <= U[k]
def d_rule(block, k, i):
m = block.model()
if i == 0:
block.c = Constraint(expr=m.x[k] == 0)
else:
block.c = Constraint(expr=L[k] <= m.x[k] <= U[k])
model.d = Disjunct(index, [0,1], rule=d_rule)
# There are three disjunctions
def D_rule(block, k):
model = block.model()
return [model.d[k,0], model.d[k,1]]
model.D = Disjunction(index, rule=D_rule)
# Minimize the number of x variables that are nonzero
model.o = Objective(expr=sum(model.d[k,1].indicator_var for k in index))
# Satisfy a demand that is met by these variables
model.c = Constraint(expr=sum(model.x[k] for k in index) >= 7)
I need to refactor this problem to work in the pyomo kernel, but the kernel is not yet compatible with the pyomo gdp used to transform disjunctive models to linear ones. Has anyone ran into this problem, and if so did you find a good method to solve disjunctive models in the pyomo kernel?
I have a partial rewrite of pyomo.gdp that I could make available on a public github branch (probably working, but lacks testing). However, I am weary of investing more time in rewrites like this, as the better approach would be to re-implement the standard pyomo.environ api on top of kernel, which would make all of the extensions compatible.
With that being said, If there are collaborators willing to share in some of the development and testing, I would be happy help complete the kernel-gdp version I've started. If you want to discuss this further, it would probably be best to open an issue on the Pyomo Github page.

mutual visibility of nodes on a grid

I have a simple grid, and I need check two nodes for mutual visibility. All walls and nodes coordinations is known. I need check two nodes for mutual visibility.
I have tried use vectors, but I didn't get acceptable result. This algorithm works, but it bad fit in my program, because of this i must do transformations of data to get acceptable result.
I used this code for check nodes for mutual visibility:
def finding_vector_grid(start, goal):
distance = [start[0]-goal[0], start[1]-goal[1]]
norm = math.sqrt(distance[0] ** 2 + distance[1] ** 2)
if norm == 0: return [1, 1]
direction = [(distance[0]/norm), (distance[1]/norm)]
return direction
def finding_vector_path(start, goal):
path = [start]
direction = finding_vector_grid((start[0]*cell_width, start[1]*cell_height),
(goal[0]*cell_width, goal[1]*cell_height))
x, y = start[0]*cell_width, start[1]*cell_height
point = start
while True:
if point not in path and in_map(point):
path.append(point)
elif not in_map(point):
break
x -= direction[0]
y -= direction[1]
point = (x//cell_width, y//cell_height)
return path
def vector_obstacles_clean(path, obstacles):
result = []
for node in path:
if node in obstacles:
result.append(node)
break
result.append(node)
return result
for example:
path = finding_vector_path((0, 0), (0, 5))
path = vector_obstacles_clean(path, [(0, 3)])
in_map - check if point not abroad map frontiers;
start, goal - tuples width x and y coords;
cell_width, cell_height - int variables with node width and height in pixels (I use pygame for visualization graph).
I have not any problems with this method, but it works not with graphs, it works "by itself", it not quite the that I need to. I am not good at English, please forgive me :)
The code you posted seems perfectly nice,
and your question doesn't clarify what needs improving.
Rather than doing FP arithmetic on vectors,
you might prefer to increment an integer X or Y pointer
one pixel at a time.
Consider using Bresenham's line algorithm,
which enumerates pixels in the line of sight
between start and goal.
The key observation is that for a given slope
it notices whether X or Y will increment faster,
and loops on that index.

Minizinc lazyfd solution ignores constraint

The problem is to find a schedule for some people to play golf (or whatever) in groups of a fixed size.
We have to guarantee that every player is only in one group at a time.
Here is my code:
int: gr; % number of groups
int: sz; % size of groups
int: we; % number of weeks
int: n=gr*sz; % number of players
set of int: G=1..gr; % set of group indices
set of int: P=1..n; % set of players
set of int: W=1..we; % set of weeks
% test instance
gr = 2;
sz = 2;
we = 2;
array[G,W] of var set of P: X;
%X[g,w] is the set of people that form group with index g in week w
% forall group x, |x| = sz
constraint forall (g in G, w in W)
(card (X[g,w]) = sz);
% one person cannot play in two groups simultaneously
constraint forall (g in G, w in W, p in X[g,w], g2 in (g+1..gr))
(not(p in X[g2,w]));
solve satisfy;
My problem now is that if I use the G12 lazyfd solver, i.e.
$ minizinc -b lazy this.mzn
I get
X = array2d(1..2 ,1..2 ,[1..2, 1..2, 1..2, 1..2]);
----------
which seems to ignore my second constraint.
On the other hand, using G12 without the lazy option, i.e.
$ minizinc this.mzn
yields
X = array2d(1..2 ,1..2 ,[1..2, 1..2, 3..4, 3..4]);
----------
which is correct.
G12 MIP and Gecode also give a correct result.
How is this possible? And how can I use the lazy solver such that I can rely on it? Or is it just my installation that is messed up somehow?
G12/lazyFD is known to be broken in various places. The problem is that the G12 solvers are no longer being developed and will most likely be removed from the distributions soon.
I would offer Chuffed as an alternative. Chuffed is FD solver written in C++ with lazy clause generation. It should be correct and will perform better then the G12 solver (at least when the solutions are correct).
Chuffed and other MiniZinc solvers can be found on the software page of the MiniZinc website.

Resources