I am trying to recreate a problem from the book "Pyomo - Optimization Modeling in Python" using the pyomo kernel instead of pyomo.environ. The problem is on page 163 and is called "9.4 A mixing problem with semi-continuous variables." For those without the book, here it is:
The following model illustrates a simple mixing problem with three semi-continuous
variables (x1, x2, x3) which represent quantities that are mixed to meet a volumetric
constraint. In this simple example, the number of sources is minimized:
from pyomo.environ import *
from pyomo.gdp import *
L = [1,2,3]
U = [2,4,6]
index = [0,1,2]
model = ConcreteModel()
model.x = Var(index, within=Reals, bounds=(0,20))
# Each disjunct is a semi-continuous variable
# x[k] == 0 or L[k] <= x[k] <= U[k]
def d_rule(block, k, i):
    m = block.model()
    if i == 0:
        block.c = Constraint(expr=m.x[k] == 0)
    else:
        block.c = Constraint(expr=L[k] <= m.x[k] <= U[k])
model.d = Disjunct(index, [0,1], rule=d_rule)
# There are three disjunctions
def D_rule(block, k):
    model = block.model()
    return [model.d[k,0], model.d[k,1]]
model.D = Disjunction(index, rule=D_rule)
# Minimize the number of x variables that are nonzero
model.o = Objective(expr=sum(model.d[k,1].indicator_var for k in index))
# Satisfy a demand that is met by these variables
model.c = Constraint(expr=sum(model.x[k] for k in index) >= 7)
I need to refactor this problem to work with the pyomo kernel, but the kernel is not yet compatible with the pyomo.gdp transformations used to convert disjunctive models into linear ones. Has anyone run into this problem, and if so, did you find a good method for solving disjunctive models in the pyomo kernel?
I have a partial rewrite of pyomo.gdp that I could make available on a public GitHub branch (probably working, but it lacks testing). However, I am wary of investing more time in rewrites like this, as the better approach would be to re-implement the standard pyomo.environ API on top of kernel, which would make all of the extensions compatible.
With that being said, if there are collaborators willing to share in some of the development and testing, I would be happy to help complete the kernel-gdp version I've started. If you want to discuss this further, it would probably be best to open an issue on the Pyomo GitHub page.
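In the meantime, since this particular disjunction is simple (each x[k] is either zero or inside [L[k], U[k]]), one workaround is to skip pyomo.gdp and write the Big-M reformulation by hand. Below is a minimal sketch in the kernel API, assuming a MIP solver such as glpk is installed; the binary y[k] plays the role of d[k,1].indicator_var, and the big-M is exact here because U[k] already bounds x[k]:
import pyomo.kernel as pmo
L = [1, 2, 3]
U = [2, 4, 6]
index = [0, 1, 2]
m = pmo.block()
m.x = pmo.variable_list(pmo.variable(lb=0, ub=20) for k in index)
# y[k] = 1 selects L[k] <= x[k] <= U[k]; y[k] = 0 forces x[k] == 0
m.y = pmo.variable_list(pmo.variable(domain=pmo.Binary) for k in index)
m.lower = pmo.constraint_list(pmo.constraint(m.x[k] >= L[k]*m.y[k]) for k in index)
m.upper = pmo.constraint_list(pmo.constraint(m.x[k] <= U[k]*m.y[k]) for k in index)
m.demand = pmo.constraint(sum(m.x[k] for k in index) >= 7)
m.o = pmo.objective(sum(m.y[k] for k in index), sense=pmo.minimize)
pmo.SolverFactory("glpk").solve(m)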
I want to solve the following optimization problem using Gekko in Python 3.7 on Windows.
Original Problem
Here, the x_s are continuous variables, and D and Epsilon are deterministic parameters.
However, since a min() function appears in the objective, I remove it using binary variables (z1, z2), and the problem then becomes a MINLP, as follows.
Modified problem
With Gekko,
(1) Can both original problem & modified problem be solved?
(2) How can I code the summation in the objective function, and the parameters D and epsilon, in Gekko?
Thanks in advance.
Both problems should be feasible with Gekko but the original appears easier to solve. Here are a few suggestions for the original problem:
Use m.Maximize() for the objective
Use sum() for the inner summation and m.sum() for outer summation for the objective function. I switch to m.sum() when the summation would create an expression that is over 15,000 characters. Using sum() creates one long expression and m.sum() breaks the summation into pieces but takes longer to compile.
Use m.min3() for the min(Dt,xs) terms, or slack variables s with x[i]+s[i]=D[i]. It appears that Dt (size 30) is an upper bound, but it has different dimensions than xs (size 100). Slack variables are much more efficient than using binary variables.
D = np.zeros(100)  # parameter array sized to match x (fill with data)
x = m.Array(m.Var,100,lb=0,ub=2000000)
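As a minimal sketch of that reformulation with hypothetical data: an auxiliary variable t[i] stands in for min(D[i],x[i]); it is bounded above by both terms, and maximizing its sum pushes each t[i] up to the true minimum, so no binary variables are needed.
from gekko import GEKKO
import numpy as np
m = GEKKO(remote=False)
n = 5
D = np.array([3.0, 1.0, 4.0, 1.0, 5.0])  # hypothetical demand values
x = m.Array(m.Var, n, lb=0, ub=10)
t = m.Array(m.Var, n)                    # t[i] replaces min(D[i], x[i])
m.Equations([t[i] <= D[i] for i in range(n)])
m.Equations([t[i] <= x[i] for i in range(n)])
m.Equation(m.sum(list(x)) <= 12)         # hypothetical resource budget
m.Maximize(m.sum(list(t)))
m.solve(disp=False)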
The modified problem has 6000 binary variables and 100 continuous variables. There are 2^6000 potential combinations of those variables so it may take a while to solve, even with the efficient branch and bound method of APOPT. Here are a few suggestions for the modified problem:
Use matrix multiplications when possible. Below is an example of matrix operations with Gekko.
from gekko import GEKKO
import numpy as np
m = GEKKO(remote=False)
ni = 3; nj = 2; nk = 4
# solve AX=B
A = m.Array(m.Var,(ni,nj),lb=0)
X = m.Array(m.Var,(nj,nk),lb=0)
AX = np.dot(A,X)
B = m.Array(m.Var,(ni,nk),lb=0)
# equality constraints
m.Equations([AX[i,j]==B[i,j] for i in range(ni) \
                             for j in range(nk)])
m.Equation(5==m.sum([m.sum([A[i][j] for i in range(ni)]) \
                           for j in range(nj)]))
m.Equation(2==m.sum([m.sum([X[i][j] for i in range(nj)]) \
                           for j in range(nk)]))
# objective function
m.Minimize(m.sum([m.sum([B[i][j] for i in range(ni)]) \
                        for j in range(nk)]))
m.solve()
print(A)
print(X)
print(B)
Declare z1 and z2 variables as integer type with integer=True. Here is more information on using the integer type.
Solve locally with m=GEKKO(remote=False). The processing time will be large and the public server resets connections and deletes jobs every day. Switch to local mode to avoid a potential disruption.
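A minimal sketch combining these last two suggestions, with hypothetical sizes:
from gekko import GEKKO
m = GEKKO(remote=False)   # local solve avoids public-server resets
m.options.SOLVER = 1      # APOPT, the mixed-integer solver
# binary variables: integer type with bounds [0, 1]
z1 = m.Array(m.Var, 30, lb=0, ub=1, integer=True)
z2 = m.Array(m.Var, 30, lb=0, ub=1, integer=True)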
@DCTLib, do you recall this discussion below? You suggested a recursive equation, which was the right approach.
Cudd_PrintMinterm, accessing the individual minterms in the sum of products
Now I am considering multistate reliability, where a component either does not fail or fails into one of n-1 different states, with n >= 2. Tulip-dd implements MDDs as described in:
https://github.com/tulip-control/dd/blob/master/doc.md#multi-valued-decision-diagrams-mdd
https://github.com/tulip-control/dd/issues/71
https://github.com/tulip-control/dd/issues/66
In the diagrams below, we have defined an MDD declared by:
aut.declare_variable(x=(0,3))
u = aut.add_expr('x=i')
Each value/state of the multi-valued variable (MSV) x (x=0, x=1, x=2, or x=3) leads to a specific BDD, as shown in the diagrams at the bottom, taking a four-state variable x as the example here. The convention is that state 0 represents the normal state and that x can fail into the different states 1, 2, and 3. The failure probabilities are assigned in the table below. In the BDDs below, we (and tulip as well) use binary coding with two bits x_1 and x_0 to represent each state/value of the MSV. The least significant bit (LSB), i.e., x_0, is always the ancestor. Each of the BDD diagrams below represents a specific value, or state.
To quantify the BDD of a specific state, i.e., the top node, we must know probabilities of binary variables x_0 and x_1 taking different branches (then or else) in the BDD. These branch probabilities are not given directly but need to be calculated according to the BDD structure.
The key here is that the child node probabilities and the branch probabilities of the parent node must be known prior to the calculation of the parent node probability.
In the previous BDD quantification, we knew the probabilities of branches from node x_1 to leaf nodes when calculating node x_1 probability. We did not need to know how node x_1 was connected to node x_0.
Now, for this four-state variable x, we need to know how node x_1 is connected to node x_0, the binary variable representing the least significant bit, to determine the probabilities of branches from node x_1 to leaf nodes. The question is how to implement this?
Here’s one attempt that falls short:
import numpy as np
from omega.symbolic import temporal as trl
aut = trl.Automaton()
# Example of function that returns branch probabilities
def prntr(v, pvars):
    assert sum(pvars) == 1.0, "Probs must add to 1!"
    if (v.high).var == 'x_1':
        if (v.low) == aut.true:
            return pvars[1] + pvars[3], pvars[1]
        else:
            return pvars[1] + pvars[3], pvars[3]
    if (v.low).var == 'x_1':
        if (v.low).negated:
            return pvars[1] + pvars[3], pvars[0]
        else:
            return pvars[1] + pvars[3], pvars[2]
aut.declare_variables(x=(0, 3))
u=aut.add_expr('x = 3')
pvars = np.array([0.1, 0.2, 0.3, 0.4])
prntr(u,pvars)
The key here is that the child node probabilities and the branch probabilities of the parent node must be known prior to the calculation of the parent node probability.
Yes, exactly. In this case, a fully recursive bottom-up computation, like normally done with BDDs, will not work for the reason that you wrote.
However, the approach will start to work again when you treat the variables that together form a state to be a block. So in your recursive function for the probability calculation, whenever you encounter a variable for a block, you treat the node and the successor nodes for the same state component as a block and only recurse when you encounter a node not belonging to the block.
Note that this approach requires that the variables for the state appear continuously in the variable ordering. For the CUDD library, you can constrain the automatic variable reordering to guarantee this.
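For illustration, a minimal sketch of pinning the order with the dd package (assuming the compiled dd.cudd backend; variable names mirror the bit encoding above). Disabling dynamic reordering keeps x_0 and x_1 adjacent in the declared order:
from dd import cudd as _bdd
bdd = _bdd.BDD()
bdd.declare('x_0', 'x_1')        # declaration order fixes the initial level order
bdd.configure(reordering=False)  # automatic reordering cannot split the block
u = bdd.add_expr(r'x_0 /\ ~ x_1')  # e.g., the encoding of one state of x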
The following code is a modification of yours implementing this idea:
#!/usr/bin/env python3
import numpy as np
from omega.symbolic import temporal as trl
aut = trl.Automaton()
# Example of function that returns branch probabilities
# Does not yet use result caching and does not yet support assigning probabilities
# to more than one state variable set
def prntr(v, pvars):
    assert abs(sum(pvars)-1.0) <= 0.0001, "Probs must add to 1!"
    if v == aut.true:
        return 1.0
    elif v == aut.false:
        return 0.0
    if v.var in ["x_0", "x_1"]:
        thisSum = 0.0
        # Compute probabilities
        for i, p in enumerate(pvars):
            # Find successor
            # Assuming that x_1 always comes after x_0 in the variable order
            thisV = v
            negate = thisV.negated
            if thisV.var == 'x_0':
                if i & 1:
                    thisV = thisV.high
                else:
                    thisV = thisV.low
                negate = negate ^ thisV.negated
            if thisV.var == 'x_1':
                if i & 2:
                    thisV = thisV.high
                else:
                    thisV = thisV.low
            if negate:
                thisSum += p*prntr(~thisV, pvars)
            else:
                thisSum += p*prntr(thisV, pvars)
        return thisSum
    # TODO: What is the semantics for variables outside of the current block?
    return 0.5*prntr(v.high, pvars) + 0.5*prntr(v.low, pvars)
pvars = np.array([0.1, 0.2, 0.3, 0.4])
aut.declare_variables(x=(0, 3))
u = aut.add_expr('x = 0')
print(prntr(u,pvars))
u2 = aut.add_expr('x = 3') | aut.add_expr('x = 2')
print(prntr(u2,pvars))
I want to implement the epsilon constraint method, where I need to have multiple models with similar variables and almost identical constraints. I was wondering how I can define a variable (or constraint) that I can use in all models. For example, please assume that I want to add a binary variable "x" and "con1" to two models ("mdl1" and "mdl2"). I have coded this problem as below, but it is not working. Would you please help me?
from docplex.mp.model import Model
# Model names
mdl1 = Model("OBJ1")
mdl2 = Model("OBJ2")
# set_idx1 is defined here.
# Variables
x = mdl1.binary_var_dict(set_idx1, name="x")
x = mdl2.binary_var_dict(set_idx1, name="x")
Moreover, how should I define constraints to prevent duplicate efforts? Thanks!
In Optimization easy with Python, let me share an example that clones a linear model:
from docplex.mp.model import Model
mdl = Model(name='buses')
nbbus40 = mdl.integer_var(name='nbBus40')
nbbus30 = mdl.integer_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')
mdl.minimize(nbbus40*500 + nbbus30*400)
mdl.solve(log_output=False,)
mdlclone=mdl.clone()
mdlequal = mdl
# set upper bound for nbbus40 to 0
nbbus40.ub=0
mdlclone.solve(log_output=False,)
mdlequal.solve(log_output=False,)
print("clone")
for v in mdlclone.iter_integer_vars():
    print(v, " = ", v.solution_value)
print("= operator")
for v in mdlequal.iter_integer_vars():
    print(v, " = ", v.solution_value)
which gives
clone
nbBus40 = 6.0
nbBus30 = 2.0
= operator
nbBus40 = 0
nbBus30 = 10.0
The clone is an independent copy: setting nbbus40.ub = 0 after cloning changes only the original model, and mdlequal is just another reference to that original, which is why its solution has nbBus40 forced to 0.
About using a modeling artefact in different models: this is not possible, as each artefact belongs to exactly one parent model. Why would you want to do it?
About the second question: in order to avoid code duplication when writing a complex constraint (and you are right about this), I suggest writing a function that takes (at least) the model as an argument, plus other input arguments.
A very simple example:
def new_ct_sum1(mdl, x, y):
    # returns a constraint stating x + y == 1
    # assumes x, y are in model mdl; otherwise an error is raised
    return mdl.add(x + y == 1)
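A usage sketch with hypothetical variables, showing the same helper writing the identical constraint into both models:
from docplex.mp.model import Model
def new_ct_sum1(mdl, x, y):
    return mdl.add(x + y == 1)
mdl1 = Model("OBJ1")
mdl2 = Model("OBJ2")
# each model owns its own variables, even if the names coincide
x1 = mdl1.binary_var(name="x")
y1 = mdl1.binary_var(name="y")
x2 = mdl2.binary_var(name="x")
y2 = mdl2.binary_var(name="y")
new_ct_sum1(mdl1, x1, y1)
new_ct_sum1(mdl2, x2, y2)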
I'm trying to solve a Constraint Satisfaction Optimisation Problem that assigns agents to tasks. However, unlike the basic Assignment Problem, an agent can be assigned to many tasks if the tasks do not overlap.
Each task has a fixed start_time and end_time.
The agents are assigned to the tasks according to some unary & binary constraints.
Variables = set of tasks
Domain = set of compatible agents (for each variable)
Constraints = unary & binary
Optimisation function = some linear function
An example of the problem: the allocation of parking space (or teams) for trucks for which we know the arrival and departure time.
I'm interested in whether there is a precise name in the literature for this type of problem. I presume it is some kind of assignment problem. Also, if you have ever approached this problem, how did you solve it?
Thank you.
I would interpret this as: rectangular assignment-problem with conflicts,
which is arguably much harder (NP-hard in general) than the polynomially solvable assignment problem.
The demo shown in the other answer might work and ortools' cp-sat is great, but I don't see a good reason to use discrete-time based reasoning here like it's done: interval variables, edge-finding and co. based scheduling constraints (+ conflict analysis / explanations). This stuff is total overkill and the overhead will be noticeable. I don't see any need to reason about time, just about time-induced conflicts.
Edit: One could label those two approaches (linked + proposed) as compact formulation and extended formulation. Extended formulations usually show stronger relaxations and better (solving) results as long as scalability is not an issue. Compact approaches might become more viable again with bigger data (but it's hard to guess here, as scheduling propagators are not that cheap).
What I would propose (a compact sketch of the resulting model follows this list):
(1) Formulate an integer-programming model following the basic assignment-problem formulation + adaptations to make it rectangular -> a worker is allowed to tackle multiple tasks while all tasks are tackled (one sum-equality dropped)
see wiki
(2) Add integrality = mark variables as binary -> because the problem no longer satisfies total unimodularity
(3) Add constraints to forbid conflicts
(4) Add constraints: remaining stuff (e.g. compatibility)
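As a sketch in assumed notation (x_{w,t} = 1 iff worker w takes task t, hypothetical costs c_{w,t}, and \mathcal{C} the set of maximal cliques of the conflict graph from step (3)):
\begin{aligned}
\min\ & \sum_{w}\sum_{t} c_{w,t}\, x_{w,t} \\
\text{s.t. } & \sum_{w} x_{w,t} = 1 \quad \forall t && \text{(every task is covered; a worker may take several)} \\
& \sum_{t \in C} x_{w,t} \le 1 \quad \forall w,\ \forall C \in \mathcal{C} && \text{(no worker takes two conflicting tasks)} \\
& x_{w,t} \in \{0, 1\}
\end{aligned}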
Now this is all straightforward, but I would propose one non-naive improvement with regard to (3):
The conflicts can be interpreted as stable-set polytope
Your conflicts are induced by a-priori defined time-windows and their overlappings (as i interpret it; this is the core assumption behind this whole answer)
This is an interval graph (because of time-windows)
All interval graphs are chordal
Chordal graphs allow enumeration of all max-cliques in poly-time (implying there are only polynomially many)
python: networkx.algorithms.chordal.chordal_graph_cliques
The set (enumeration) of all maximal cliques define the facets of the stable-set polytope
Those (a constraint for each element in the set) we add as constraints!
(The stable-set polytope on the graph in use here would also allow very very powerful semidefinite-relaxations but it's hard to foresee in which cases this would actually help due to SDPs being much more hard to work with: warmstart within tree-search; scalability; ...)
This will lead to a poly-size integer-programming problem which should be very very good when using a good IP-solver (commercials or if open-source needed: Cbc > GLPK).
Small demo about (3)
import itertools
import networkx as nx
# data: inclusive, exclusive
# --------------------------
time_windows = [
(2, 7),
(0, 10),
(6, 12),
(12, 20),
(8, 12),
(16, 20)
]
# helper
# ------
def is_overlapping(a, b):
    return (b[1] > a[0] and b[0] < a[1])
# raw conflicts
# -------------
binary_conflicts = []
for a, b in itertools.combinations(range(len(time_windows)), 2):
    if is_overlapping(time_windows[a], time_windows[b]):
        binary_conflicts.append( (a, b) )
# conflict graph
# --------------
G = nx.Graph()
G.add_edges_from(binary_conflicts)
# maximal cliques
# ---------------
max_cliques = nx.chordal_graph_cliques(G)
print('naive constraints: raw binary conflicts')
for i in binary_conflicts:
    print('sum({}) <= 1'.format(i))
print('improved constraints: clique-constraints')
for i in max_cliques:
    print('sum({}) <= 1'.format(list(i)))
Output:
naive constraints: raw binary conflicts
sum((0, 1)) <= 1
sum((0, 2)) <= 1
sum((1, 2)) <= 1
sum((1, 4)) <= 1
sum((2, 4)) <= 1
sum((3, 5)) <= 1
improved constraints: clique-constraints
sum([1, 2, 4]) <= 1
sum([0, 1, 2]) <= 1
sum([3, 5]) <= 1
Fun facts:
Commercial integer-programming solvers, and maybe even Cbc, might try to do the same reasoning about clique-constraints to some degree, although without the assumption of chordality, where it's an NP-hard problem
ortools' cp-sat solver also has a code-path for this (again: the general NP-hard case)
It should trigger when expressing the conflict-based model (it is much harder to decide on this exploitation for general discrete-time based scheduling models)
Caveats
Implementation / Scalability
There are still open questions like:
duplicating max-clique constraints over each worker vs. merging them somehow
be more efficient/clever in finding conflicts (sorting)
will it scale to the data: how big will the graph be / how many conflicts and constraints from those do we need
But those things usually follow instance-statistics (aka "don't decide blindly").
I don't know a name for the specific variant you're describing - maybe others would. However, this indeed seems a good fit for a CP/MIP solver; I would go with the OR-Tools CP-SAT solver, which is free, flexible and usually works well.
Here's a reference implementation with Python, assuming each vehicle requires a team assigned to it with no overlaps, and that the goal is to minimize the number of teams in use.
The framework makes it possible to directly model allowed / forbidden assignments (check out the docs).
from ortools.sat.python import cp_model
model = cp_model.CpModel()
## Data
num_vehicles = 20
max_teams = 10
# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]
### variables
# t, v is true iff vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}
#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v],
                                                          team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v))
                     for t in range(max_teams) for v in range(num_vehicles)}
team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]
## constraints
# non overlap for each team
for t in range(max_teams):
    model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
    model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)
# what teams are in use?
for t in range(max_teams):
    model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])
#symmetry breaking - use teams in-order
for t in range(max_teams-1):
    model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())
# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))
solver = cp_model.CpSolver()
# optional
# solver.parameters.log_search_progress = True
# solver.parameters.num_search_workers = 8
# solver.parameters.max_time_in_seconds = 5
result_status = solver.Solve(model)
if (result_status == cp_model.INFEASIBLE):
    print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
    print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):
    print('Feasible (non optimal) result found')
else:
    print('No feasible solution found under constraints within time')
# Output:
#
# Optimal result found, required teams=7
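Unary compatibility constraints like those in the question can be layered on top of this model; a hypothetical example that forbids one specific pairing:
# hypothetical compatibility rule: vehicle 0 cannot be served by team 3
model.Add(team_assignments[3, 0] == 0)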
EDIT:
@sascha suggested a beautiful approach for analyzing the (known in advance) time window overlaps, which would make this solvable as an assignment problem.
So while the formulation above might not be the optimal one for this (although it could be, depending on how the solver works), I've tried to replace the no-overlap conditions with the max-clique approach suggested - full code below.
I did some experiments with moderately large problems (100 and 300 vehicles). Empirically, on the smaller problems (~100) this does improve things somewhat - about 15% on average on the time to the optimal solution; but I could not find a significant improvement on the larger (~300) problems. This might be because my formulation is not optimal; because the CP-SAT solver (which is also a good IP solver) is smart enough; or because there's something I've missed :)
Code:
(this is basically the same code from above, with the logic supporting the max-clique approach instead of the no-overlap one copied from @sascha's answer):
import itertools
from timeit import default_timer as timer
import networkx as nx
from ortools.sat.python import cp_model
model = cp_model.CpModel()
run_start_time = timer()
## Data
num_vehicles = 300
max_teams = 300
USE_MAX_CLIQUES = True
# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]
if (USE_MAX_CLIQUES):
    ## Max-cliques analysis
    # for the max-clique approach
    time_windows = [(interval_starts[i], interval_ends[i]) for i in range(num_vehicles)]

    def is_overlapping(a, b):
        return (b[1] > a[0] and b[0] < a[1])

    # raw conflicts
    # -------------
    binary_conflicts = []
    for a, b in itertools.combinations(range(len(time_windows)), 2):
        if is_overlapping(time_windows[a], time_windows[b]):
            binary_conflicts.append( (a, b) )

    # conflict graph
    # --------------
    G = nx.Graph()
    G.add_edges_from(binary_conflicts)

    # maximal cliques
    # ---------------
    max_cliques = nx.chordal_graph_cliques(G)
##
### variables
# t, v is true iff vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}
#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v],
                                                          team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v))
                     for t in range(max_teams) for v in range(num_vehicles)}
team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]
## constraints
# non overlap for each team
if (USE_MAX_CLIQUES):
    overlap_constraints = [list(l) for l in max_cliques]
    for t in range(max_teams):
        for l in overlap_constraints:
            model.Add(sum(team_assignments[t, v] for v in l) <= 1)
else:
    for t in range(max_teams):
        model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
    model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)
# what teams are in use?
for t in range(max_teams):
    model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])
#symmetry breaking - use teams in-order
for t in range(max_teams-1):
    model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())
# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))
solver = cp_model.CpSolver()
# optional
solver.parameters.log_search_progress = True
solver.parameters.num_search_workers = 8
solver.parameters.max_time_in_seconds = 120
result_status = solver.Solve(model)
if (result_status == cp_model.INFEASIBLE):
    print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
    print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):
    print('Feasible (non optimal) result found, required teams=%i' % (solver.ObjectiveValue()))
else:
    print('No feasible solution found under constraints within time')
print('run time: %.2f sec ' % (timer() - run_start_time))
I want to make use of Theano's logistic regression classifier, but I would like to make an apples-to-apples comparison with previous studies I've done to see how deep learning stacks up. I recognize this is probably a fairly simple task if I were more proficient in Theano, but this is what I have so far. From the tutorials on the website, I have the following code:
def errors(self, y):
    # check if y has same dimension of y_pred
    if y.ndim != self.y_pred.ndim:
        raise TypeError(
            'y should have the same shape as self.y_pred',
            ('y', y.type, 'y_pred', self.y_pred.type)
        )
    # check if y is of the correct datatype
    if y.dtype.startswith('int'):
        # the T.neq operator returns a vector of 0s and 1s, where 1
        # represents a mistake in prediction
        return T.mean(T.neq(self.y_pred, y))
I'm pretty sure this is where I need to add the functionality, but I'm not certain how to go about it. What I need is either access to y_pred and y for each and every run (to update my confusion matrix in python) or to have the C++ code handle the confusion matrix and return it at some point along the way. I don't think I can do the former, and I'm unsure how to do the latter. I've done some messing around with an update function along the lines of:
def confuMat(self, y):
    x = T.vector('x')
    classes = T.scalar('n_classes')
    onehot = T.eq(x.dimshuffle(0, 'x'), T.arange(classes).dimshuffle('x', 0))
    oneHot = theano.function([x, classes], onehot)
    yMat = T.matrix('y')
    yPredMat = T.matrix('y_pred')
    confMat = T.dot(yMat.T, yPredMat)
    confusionMatrix = theano.function(inputs=[yMat, yPredMat], outputs=confMat)

    def confusion_matrix(x, y, n_class):
        return confusionMatrix(oneHot(x, n_class), oneHot(y, n_class))

    t = np.asarray(confusion_matrix(y, self.y_pred, self.n_out))
    print(t)
But I'm not completely clear on how to get this to interface with the function in question and give me a numpy array I can work with.
I'm quite new to Theano, so hopefully this is an easy fix for one of you. I'd like to use this classifier as my output layer in a number of configurations, so I could use the confusion matrix with other architectures.
I suggest a brute-force sort of approach. You need an output for a prediction first. Create a function for it.
prediction = theano.function(
    inputs = [index],
    outputs = MLPlayers.predicts,
    givens={
        x: test_set_x[index * batch_size: (index + 1) * batch_size]})
In your test loop, gather the predictions...
labels = labels + test_set_y.eval().tolist()
for mini_batch in xrange(n_test_batches):
    wrong = wrong + int(test_model(mini_batch))
    predictions = predictions + prediction(mini_batch).tolist()
Now create the confusion matrix this way:
correct = 0
confusion = numpy.zeros((outs, outs), dtype=int)
for index in xrange(len(predictions)):
    if labels[index] == predictions[index]:
        correct = correct + 1
    confusion[int(predictions[index]), int(labels[index])] += 1
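For reference, a compact vectorized equivalent of the counting loop, assuming labels and predictions are plain integer lists:
labels_arr = numpy.asarray(labels, dtype=int)
preds_arr = numpy.asarray(predictions, dtype=int)
correct = int((labels_arr == preds_arr).sum())
confusion = numpy.zeros((outs, outs), dtype=int)
# add one count per (prediction, label) pair, handling repeated pairs correctly
numpy.add.at(confusion, (preds_arr, labels_arr), 1)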
You can find this kind of an implementation in this repository.