Need help for defining appropriate constraints - constraint-programming

I'm very new to constraint programming and am trying to find some real situations to test it on.
I found one that I think may be solved with CP.
Here it is:
I have a group of kids that I have to assign to some activities.
These kids fill in a form where they specify 3 choices in order of preference.
Each activity has a maximum number of participants, so the idea is to find a solution where the choices are respected as well as possible without exceeding the maximum.
So, as a first approach, I defined variables for the kids with [1,2,3] as the domain (the link between the choice number, the activity and the child being known somewhere else).
But then, I don't really know how to define relevant constraints so I have all the permutation (very long) and then, I have to give a note to each (adding the numbers of choices to get the min) and eliminate results with too big groups.
I think there must be a good way to do this using CP but I can't figure it out.
Can someone help me?
Thanks

I'm not sure that I understand everything in your description, for example "so I have all the permutation (very long)" and "i have to give a note to each (adding the numbers of choices to get the min)". That said, here is a simple encoding of what I think would be a model of your problem, or at least a starter.
It's written in MiniZinc and is shown below with a small example of 6 kids and 4 activities. The full model (including variants of some constraints) is here as well: http://hakank.org/minizinc/max_activity.mzn
Description of the variables:
"x" is an array of decision variables containing the selected activity for each kid. "scores" is the scores (1, 2, or 3 depending on which activity that was selected) for the selected activity, and "total_score" just sums the "scores" array.
include "globals.mzn";
int: num_kids;
array[1..num_kids, 1..3] of int: prefs;
int: num_activities;
array[1..num_activities] of int: activity_size;
% decision variables
array[1..num_kids] of var 1..num_activities: x; % the selected activity
array[1..num_kids] of var 1..num_activities: scores;
var int: total_score = sum(scores);
solve maximize total_score;
constraint
  forall(k in 1..num_kids) (
    % select one of the preferred activities
    let {
      var 1..3: p
    } in
      x[k] = prefs[k,p] /\
      scores[k] = 4-p % score for the selected activity
  )
  /\ % ensure size of the activities
  global_cardinality_low_up(x, [i | i in 1..num_activities], [0 | i in 1..num_activities], activity_size)
;
output [
"x : ", show(x), "\n",
"scores: ", show(scores), "\n",
"total_score: ", show(total_score), "\n",
];
%
% some small fake data
%
num_kids = 6;
num_activities = 4;
% Activity preferences for each kid
prefs = array2d(1..num_kids, 1..3,
[
1,2,3,
4,2,1,
2,1,4,
4,2,1,
3,2,4,
4,1,3
]);
% max size of activity
activity_size = [2,2,2,3];
The solution of this problem instance is:
x : [1, 4, 2, 4, 3, 4]
scores: [3, 3, 3, 3, 3, 3]
total_score: 18
This is a unique solution.
Using a slightly smaller activity_size ([2,2,2,2]) we get another optimal solution (total_score = 17), since there can be just 2 kids in activity #4 (kid #6 is here forced to take activity #1 instead)
x : [1, 4, 2, 4, 3, 1]
scores: [3, 3, 3, 3, 3, 2]
total_score: 17
There are two other possible selections for the second variant, namely
x : [1, 4, 2, 2, 3, 4]
scores: [3, 3, 3, 2, 3, 3]
total_score: 17
----------
x : [1, 2, 2, 4, 3, 4]
scores: [3, 2, 3, 3, 3, 3]
total_score: 17
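To cross-check the results above without a CP solver, here is a minimal brute-force sketch in Python. It is not the MiniZinc model itself: it simply enumerates every combination of preference ranks for this small instance, which only works because 3^6 = 729 combinations is tiny. The function name `best_assignments` is an assumption made for illustration.

```python
from collections import Counter
from itertools import product

# Same fake data as in the MiniZinc model: three ranked choices per kid.
PREFS = [
    [1, 2, 3],
    [4, 2, 1],
    [2, 1, 4],
    [4, 2, 1],
    [3, 2, 4],
    [4, 1, 3],
]


def best_assignments(prefs, activity_size):
    """Return (best total score, list of all optimal assignments)."""
    best_score, best = -1, []
    # ranks[k] is the 0-based preference rank chosen for kid k
    for ranks in product(range(3), repeat=len(prefs)):
        x = [prefs[k][p] for k, p in enumerate(ranks)]
        counts = Counter(x)
        # skip assignments that overfill any activity
        if any(counts[a] > activity_size[a - 1] for a in counts):
            continue
        # 3 - p with 0-based p equals 4 - p with the model's 1-based p
        score = sum(3 - p for p in ranks)
        if score > best_score:
            best_score, best = score, [x]
        elif score == best_score:
            best.append(x)
    return best_score, best


if __name__ == "__main__":
    print(best_assignments(PREFS, [2, 2, 2, 3]))
    print(best_assignments(PREFS, [2, 2, 2, 2]))
```

For realistic numbers of kids this enumeration grows exponentially, which is exactly where the constraint model pays off.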
Update: I also did a Picat model using the same principal approach: http://hakank.org/picat/max_activity.pi .
Update 2: The above model assumes that every kid gets one of their preferred activities. When this assumption is not met, one has to handle it somehow instead of just returning "UNSATISFIABLE" as an answer. One way is to assign some other, non-preferred activity to the kid, which yields a score of 0. This is done in this model: http://hakank.org/minizinc/max_activity2.mzn
The changes compared to the original model are small:
the domain of "scores" is 0..num_activities
we add a disjunction "\/ scores[k] = 0" to the forall loop that selects the activity
Since this is a maximization problem a score of 0 will not be used unless it is necessary.
I also added a sanity check that there are enough activities for all kids:
constraint
assert(sum(activity_size) >= num_kids, "There is not activities enough for all kids.")
;

Related

Is there an algorithm that can be parallelized for the "unique" problem?

We have a long (about 100,000) two-dimension numpy array.
Like:
A_in =
[[1, 2, 3, 4, 3, 2, 1, …, 100000],
[2, 3, 3, 5, 4, 3, 1, …, 100000]] (edge_index_cpu in code)
You can treat one column as one group here. Every number means a point, one column means the line between these two points.
We need get output, like:
A_out =
[[1, 3, 4, …],
[2, 3, 5, …]] (new_edge_indices in code)
and index of these output values in the original array, like:
Idx_out =
[0, 2, 3]
An output group cannot have any intersection with any of the previously kept groups. In addition, if a previous group has been removed (like [[2],[3]] above), then the removed group is not used to calculate the intersection (thus, [[3], [3]] is kept).
It can easily be implemented with a for loop. But because the data is too large for a for loop, we would like to ask for an algorithm that can be parallelized for this problem.
I have tried to use numpy's unique operator from a flatten version of A_in
([1, 2, 2, 3, 3, 3, 4, 5, 3, 4, 2, 3, 1, 1, …]). But it cannot meet this “if the previous group has been removed (like [[2],[3]] above), then the removed group will not be used to calculate the intersection (thus, [[3], [3]] is kept)”.
We want to handle a graph containing edges and points.
edge_index_cpu = edge_index.cpu()
for edge_idx in edge_argsort.tolist():
    source = edge_index_cpu[0, edge_idx].item()
    if source not in nodes_remaining:
        continue
    target = edge_index_cpu[1, edge_idx].item()
    if target not in nodes_remaining:
        continue
    new_edge_indices.append(edge_idx)
    cluster[source] = i
    nodes_remaining.remove(source)
    if source != target:
        cluster[target] = i
        nodes_remaining.remove(target)
    i += 1
# The remaining nodes are simply kept.
for node_idx in nodes_remaining:
    cluster[node_idx] = i
    i += 1
cluster = cluster.to(x.device)
I would not parallelize just yet, as your problem can be solved in O(n), which should be fast enough.
definitions
Let's consider we have this:
const int pnts=1000000; // max points
const int lins=1000000; // number of lines
int lin[2][lins]; // lines
bool his[pnts]; // histogram of points (used edge?)
int out[pnts],outs=0; // result out[outs]
I am C++/GL oriented, so I use indexes starting from zero! I used fixed-size static arrays, rather than dynamic allocation or list templates, so it's easy to understand.
histogram
Create a histogram for the points used. It is simply a table holding one counter or value for each possible point index. As we do not need to know how many times a point is used, I chose bool, so it is just a true/false value that tells us whether the point is already used or not.
So clear this table at the start with false:
for (i=0;i<pnts;i++) his[i]=0;
process lines data
Simply process all lines in their order and update the histogram for each point. So take the point indexes p0/p1 from lin[0/1][i] and test whether both points are still unused:
p0=lin[0][i];
p1=lin[1][i];
if ((!his[p0])&&(!his[p1])){ his[p0]=true; his[p1]=true; add i to result }
If both are unused, add i to the result and mark p0 and p1 as used in the histogram. As you can see, each line is handled in O(1), so the whole pass is O(n). I assume you were using a linear search inside the for loop until now, making your version O(n^2).
Here is a small O(n) C++ example of this (sorry, I am not a Python coder):
void compute()
{
    const int pnts=1000000; // max points
    const int lins=1000000; // number of lines
    // static storage: arrays this large may not fit on a default-sized stack
    static int lin[2][lins]; // lines
    static bool his[pnts]; // histogram of points (used edge?)
    static int out[pnts]; // result out[outs]
    int outs=0;
    int i,p0,p1;
    // generate data
    // (Randomize()/Random(n) are not standard C++; Random(n) is assumed to return 0..n-1)
    Randomize();
    for (i=0;i<lins;i++)
    {
        lin[0][i]=Random(pnts);
        lin[1][i]=Random(pnts);
    }
    // clear histogram
    for (i=0;i<pnts;i++) his[i]=false;
    // compute result O(lins)
    for (i=0;i<lins;i++) // process all lines
    {
        p0=lin[0][i]; // first point of line
        p1=lin[1][i]; // second point of line
        if ((!his[p0])&&(!his[p1])) // both unused yet?
        {
            his[p0]=true; // set them as used
            his[p1]=true;
            out[outs]=i; // add new edge to result list
            outs++;
        }
    }
    // here out[outs] holds the result
}
The runtime is linear, and on my machine it took ~10 ms, so there is no need for parallelization.
In case bool is not a single bit, you can pack the histogram into unsigned integers using their bits (for example, pack 32 points into a single 32-bit int variable) to save memory. In that case 1M points result in a 125000-byte table, which is not a problem these days.
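The bit-packing idea can be sketched in Python as well. This is only an illustration under assumptions: the class name `BitSet` and its methods are hypothetical, and it packs 8 flags per byte in a `bytearray` instead of 32 per integer, which gives the same table size.

```python
class BitSet:
    """Fixed-size set of point indexes, stored as one bit per point."""

    def __init__(self, size):
        # round up to whole bytes; all bits start cleared
        self.bits = bytearray((size + 7) // 8)

    def test(self, i):
        """Return True if point i is marked as used."""
        return (self.bits[i >> 3] >> (i & 7)) & 1 == 1

    def set(self, i):
        """Mark point i as used."""
        self.bits[i >> 3] |= 1 << (i & 7)


if __name__ == "__main__":
    used = BitSet(1000000)
    used.set(10)
    print(len(used.bits), used.test(10), used.test(11))
```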
When I feed your data to the code:
int lin[2][lins]= // lines
{
{ 1, 2, 3, 4, 3, 2, 1 },
{ 2, 3, 3, 5, 4, 3, 1 },
};
I got this result:
{ 0, 2, 3 }
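Since the question is in Python, here is a rough port of the same single-pass idea. It is a sketch, not the asker's code: `select_edges` and its parameters are hypothetical names, the edges are visited in plain column order (the loop in the question visits them in `edge_argsort` order instead), and plain lists stand in for the NumPy/PyTorch arrays.

```python
def select_edges(sources, targets, num_points):
    """Return indexes of edges whose two endpoints were both unused so far."""
    used = bytearray(num_points)  # one flag per point, all initially 0
    kept = []
    for idx, (p0, p1) in enumerate(zip(sources, targets)):
        if not used[p0] and not used[p1]:
            used[p0] = 1
            used[p1] = 1  # harmless when p0 == p1
            kept.append(idx)
    return kept


if __name__ == "__main__":
    src = [1, 2, 3, 4, 3, 2, 1]
    dst = [2, 3, 3, 5, 4, 3, 1]
    print(select_edges(src, dst, num_points=6))
```

Each edge costs a constant number of array lookups, so the pass stays linear in the number of edges.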

how to iterate two different lists parallelly, converges to one

a = [1, 2, 7, 5, 11]
b = [3, 4, 5, 11]
The above example corresponds to
(1)-->(2)-->(7)-->(5)-->(11)
                 /
(3)-->(4)-------
Here node 5 is the merging point of the two lists.
For the case of a guaranteed common tail:
Iterate both lists in the reverse direction, from their ends, until a difference is found. Remember the indexes where the common tail starts.
Now, if needed, traverse the beginning of both lists up to the junction point.
An alternative way, if a strict ordering exists:
At every step, advance in the list whose current element is smaller.
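The reverse-iteration approach for a guaranteed common tail can be sketched as follows. The function name `common_tail_start` is an assumption for illustration, and the sketch compares element values, which suits plain Python lists; with real linked nodes you would compare node identity instead.

```python
def common_tail_start(a, b):
    """Return (i, j): the indexes in a and b where the shared tail begins."""
    i, j = len(a), len(b)
    # walk backwards from both ends while the elements still agree
    while i > 0 and j > 0 and a[i - 1] == b[j - 1]:
        i -= 1
        j -= 1
    return i, j


if __name__ == "__main__":
    a = [1, 2, 7, 5, 11]
    b = [3, 4, 5, 11]
    i, j = common_tail_start(a, b)
    # separate heads, then the common tail
    print(a[:i], b[:j], a[i:])
```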

SolverStudio how to reference 1 column in a 2D list in a for loop(PuLP)

I have 2 data sets x1 and x2. I want to be able to get a total sum of all the products of x1 and x2 only in the rows where the From column has Auckland in it.
The final answer should be (5*1) + (2*1) + (3*1) + (4*1) or 14. The PuLP code that I wrote to do this is given below
# Import PuLP modeller functions
from pulp import *
varFinal = sum([x1[a] * x2[a] for a in Arcs if a == Nodes[0]])
print Nodes[0]
print Arcs[0]
Final = varFinal
The output that gets printed to the console is
Auckland
('Auckland', 'Albany')
I realise that my final value is zero because Arcs[some number] does not equal Nodes[some number]. Is there any way to change the code so my final value is 14?
Any help is appreciated.
Welcome to Stack Overflow! Because you've only posted part of your code, I have to guess at what data types you're using. From the output, I'm guessing your Nodes are strings, and your Arcs are tuples of strings.
Your attempt is very close: you want the From column to have Auckland in it. You can index into a tuple the same way you would into a list, so you want to do: a[0] == Nodes[0].
Below is a self-contained example with the first bit of your data, which outputs the following (note that I've changed to Python 3.x print calls, with parentheses):
Output:
Auckland
('Auckland', 'Albany')
14
Code:
# Import PuLP modeller functions
from pulp import *
# Data
Nodes = ['Auckland',
'Wellington',
'Hamilton',
'Kansas City',
'Christchuch',
'Albany',
'Whangarei',
'Rotorua',
'New Plymouth']
Arcs = [('Auckland','Albany'),
('Auckland','Hamilton'),
('Auckland','Kansas City'),
('Auckland','Christchuch'),
('Wellington','Hamilton'),
('Hamilton','Albany'),
('Kansas City','Whangarei'),
('Christchuch','Rotorua')]
x1_vals = [1, 2, 3, 4, 5, 9, 11, 13]
x2_vals = [5, 1, 1, 1, 1, 1, 1, 1]
x1 = dict((Arcs[i], x1_vals[i]) for i in range(len(Arcs)))
x2 = dict((Arcs[i], x2_vals[i]) for i in range(len(Arcs)))
varFinal = sum([x1[a] * x2[a] for a in Arcs if a[0] == Nodes[0]])
print(Nodes[0])
print(Arcs[0])
print(varFinal)
For future reference, answers are most likely to be forthcoming if you include code which others can try to run (without external data dependencies), that way people can try to run it, fix it, and re-post it.

how to change the type of constraint's arguments in ortools

I don't know whether what I'm asking is possible. I am using ortools to solve an optimization problem, and I know that when defining constraints the arguments should be of type double, like this:
constraints[i] = solver.Constraint(0.0, 10.0)
But my problem is that I don't want to use this type of argument when creating constraints. For example, I want to pass a list.
So I wrote this in my code:
constraints[i] = solver.Constraint([1,2,3,...])
And I got this error:
return _pywraplp.Solver_Constraint(self, *args)
NotImplementedError: Wrong number or type of arguments for overloaded
function 'Solver_Constraint'.
Possible C/C++ prototypes are:
operations_research::MPSolver::MakeRowConstraint(double,double)
operations_research::MPSolver::MakeRowConstraint()
operations_research::MPSolver::MakeRowConstraint(double,double,std::string
const &)
operations_research::MPSolver::MakeRowConstraint(std::string const &)
Is there any way to change the type of condition's argument?
My Assumptions
your constraint expression is "a sum of some lists", meaning something along the lines of what the NumPy library does: e.g., if you have two lists of values, [1, 2, 3] and [4, 5, 6], their sum would be element-wise, s.t. [1, 2, 3] + [4, 5, 6] = [1+4, 2+5, 3+6] = [5, 7, 9].
your "list constraint" is also element-wise; e.g., [x1, x2, x3] <= [1, 2, 3] means x1 <= 1, x2 <= 2 and x3 <= 3.
you're using the GLOP Linear Solver. (Everything I say below applies to the ILP/CP/CP-SAT solvers, but some of the particular method names/other details are different.)
My Answer
The thing is, ortools only lets you set scalar values (like numbers) as variables; you can't make a "list variable", so to speak.
Therefore, you'll have to make a list of scalar variables that effectively represents the same thing.
For example, let's say you wanted your "list variable" to be a list of values, each one subjected to a particular constraint which you have stored in a list. Let's say you have a list of upper bounds:
upper_bounds = [1, 2, 3, ..., n]
And you have several lists of solver variables like so:
vars1 = [
    # variable bounds here are chosen arbitrarily; set them to your purposes
    solver.NumVar(0, solver.infinity(), 'x{0}'.format(i))
    for i in range(n)
]
vars2 = [...] # you define any other variable lists in the same way
Then, you would make a list of constraint objects, one constraint for each upper bound in your list:
constraints = [
    solver.Constraint(0, ubound)
    for ubound in upper_bounds
]
And you insert the variables into your constraints however is dictated for your problem:
# Example expression: 0 <= X1 - X2 + 0.5*X3 <= UBOUND
for i in range(n):
    constraints[i].SetCoefficient(vars1[i], 1)
    constraints[i].SetCoefficient(vars2[i], -1)
    constraints[i].SetCoefficient(vars3[i], 0.5)
Hope this helps! I recommend taking (another, if you already have) look at the examples for your particular solver. The one for GLOP can be found here.

State Machine Capable of All Transitions

I am trying to design a state machine which will traverse all possible transitions between states. However, the state machine cannot move from a given state back to itself. From the diagram below, I have worked out that given the number of states (N), the number of transitions is equal to N^2 - N.
Any ideas on how to approach this please?
After having misunderstood the problem the first time, here is another attempt.
So we want to traverse the graph in one go, and we are not allowed to use the same transition twice. The trick is probably to leave a transition unused so that we can get back to the starting state.
states = 4 # Select number of states
path = [0] # Start in state 0 (must be zero)
def walk(path):
    home_state = path[-1]
    for i in range(home_state + 2, states):
        # We leave a state out that we go to next
        path.append(i)
        path.append(home_state)
    if home_state + 1 < states:
        path.append(home_state + 1)
        walk(path)
        path.append(home_state)
walk(path)
print(path)
should give
[0, 2, 0, 3, 0, 1, 3, 1, 2, 3, 2, 1, 0]
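To gain some confidence that such a path really uses each of the N^2 - N transitions exactly once, one can check it mechanically. The sketch below wraps the same recursion in a function and adds a checker; the names `build_path` and `covers_all_transitions` are assumptions introduced here, not part of the answer above.

```python
def build_path(states):
    """Build the walk described above for the given number of states."""
    path = [0]

    def walk():
        home = path[-1]
        # visit every state except home + 1 and come straight back
        for i in range(home + 2, states):
            path.append(i)
            path.append(home)
        # then move on to home + 1, recurse, and return home afterwards
        if home + 1 < states:
            path.append(home + 1)
            walk()
            path.append(home)

    walk()
    return path


def covers_all_transitions(path, states):
    """True if path uses every transition a -> b (a != b) exactly once."""
    steps = list(zip(path, path[1:]))
    expected = {(a, b) for a in range(states) for b in range(states) if a != b}
    return len(steps) == len(expected) and set(steps) == expected


if __name__ == "__main__":
    p = build_path(4)
    print(p, covers_all_transitions(p, 4))
```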

Resources