Using multivalue DDs to solve multistate reliability quantification - cudd

#DCTLib, do you recall this discussion below? You suggested a recursive equation, which was the right approach.
Cudd_PrintMinterm, accessing the individual minterms in the sum of products
Now, I am considering multistate reliability, where we can have either not fail or fail to n-1 different states, with n >= 2. Tulip-dd implements MDDs as described in:
https://github.com/tulip-control/dd/blob/master/doc.md#multi-valued-decision-diagrams-mdd
https://github.com/tulip-control/dd/issues/71
https://github.com/tulip-control/dd/issues/66
In the diagrams in the drawings below, we have defined an MDD declared by:
aut.declare_variable(x=(0,3))
u = aut.add_expr(‘x=i’)
Each value/state of the multi-value variable (MSV) x, x=0, x=1, x=2, or x=3 leads to a specific BDD as shown in the diagrams at the bottom, taking a four-state variable x as example here. The notation is that state 0 represents the normal state and x can fail to different states 1, 2, and 3. The failure probabilities are assigned in table below. In the BDDs below, we (and tulip as well) use the binary coding with two bits x_1 and x_0 to represent each state/value of the MSV. The least significant bit (LSB), i.e., x_0, is always the ancestor. Each of the BDD diagrams below is a representation of a specific value, or state.
To quantify the BDD of a specific state, i.e., the top node, we must know probabilities of binary variables x_0 and x_1 taking different branches (then or else) in the BDD. These branch probabilities are not given directly but need to be calculated according to the BDD structure.
The key here is that the child node probabilities and the branch probabilities of the parent node must be known prior to the calculation of the parent node probability.
In the previous BDD quantification, we knew the probabilities of branches from node x_1 to leaf nodes when calculating node x_1 probability. We did not need to know how node x_1 was connected to node x_0.
Now, for this four-state variable x, we need to know how node x_1 is connected to node x_0, the binary variable representing the least significant bit, to determine the probabilities of branches from node x_1 to leaf nodes. The question is how to implement this?
Here’s one attempt that falls short:
import numpy as np
from omega.symbolic import temporal as trl
aut = trl.Automaton()
# Example of function that returns branch probabilities
def prntr(v, pvars):
assert sum(pvars)==1.0,"Probs must add to 1!"
if (v.high).var == 'x_1':
if (v.low) == aut.true:
return pvars[1] + pvars[3], pvars[1]
else:
return pvars[1] + pvars[3], pvars[3]
if (v.low).var == 'x_1':
if (v.low).negated:
return pvars[1] + pvars[3], pvars[0]
else:
return pvars[1] + pvars[3], pvars[2]
aut.declare_variables(x=(0, 3))
u=aut.add_expr('x = 3')
pvars = np.array([0.1, 0.2, 0.3, 0.4])
prntr(u,pvars)

The key here is that the child node probabilities and the branch probabilities of the parent node must be known prior to the calculation of the parent node probability.
Yes, exactly. In this case, a fully recursive bottom-up computation, like normally done with BDDs, will not work for the reason that you wrote.
However, the approach will start to work again when you treat the variables that together form a state to be a block. So in your recursive function for the probability calculation, whenever you encounter a variable for a block, you treat the node and the successor nodes for the same state component as a block and only recurse when you encounter a node not belonging to the block.
Note that this approach requires that the variables for the state appear continuously in the variable ordering. For the CUDD library, you can constrain the automatic variable reordering to guarantee this.
The following code is a modification of yours implementing this idea:
#!/usr/bin/env python3
import numpy as np
from omega.symbolic import temporal as trl
aut = trl.Automaton()
# Example of function that returns branch probabilities
# Does not yet use result caching and does not yet support assigning probabilities
# to more than one state variable set
def prntr(v, pvars):
assert abs(sum(pvars)-1.0)<=0.0001,"Probs must add to 1!"
if v == aut.true:
return 1.0
elif v == aut.false:
return 0.0
if v.var in ["x_0","x_1"]:
thisSum = 0.0
# Compute probabilities
for i,p in enumerate(pvars):
# Find successor
# Assuming that x_2 always comes after x_1
thisV = v
negate = thisV.negated
if thisV.var == 'x_0':
if i & 1:
thisV = thisV.high
else:
thisV = thisV.low
negate = negate ^ thisV.negated
if thisV.var == 'x_1':
if i & 2:
thisV = thisV.high
else:
thisV = thisV.low
if negate:
thisSum += p*prntr(~thisV, pvars)
else:
thisSum += p*prntr(thisV, pvars)
return thisSum
# TODO: What is the semantics for variables outside of the current block?
return 0.5*prntr(v.high, pvars) + 0.5*prntr(v.low, pvars)
pvars = np.array([0.1, 0.2, 0.3, 0.4])
aut.declare_variables(x=(0, 3))
u= aut.add_expr('x = 0')
print(prntr(u,pvars))
u2 = aut.add_expr('x = 3') | aut.add_expr('x = 2')
print(prntr(u2,pvars))

Related

What is the CP/MILP problem name for the assignment of agents to tasks with fixed start time and end time?

I'm trying to solve a Constraint Satisfaction Optimisation Problem that assigns agents to tasks. However, different then the basic Assignment Problem, a agent can be assigned to many tasks if the tasks do not overlap.
Each task has a fixed start_time and end_time.
The agents are assigned to the tasks according to some unary&binary constraints.
Variables = set of tasks
Domain = set of compatible agents (for each variable)
Constraints = unary&binary
Optimisation fct = some liniar function
An example of the problem: the allocation of parking space (or teams) for trucks for which we know the arrival and departure time.
I'm interested if there is in the literature a precise name for these type of problems. I presume it is some kind of assignment problem. Also, if you ever approach the problem, how do you solve it?
Thank you.
I would interpret this as: rectangular assignment-problem with conflicts
which is arguably much more hard (NP-hard in general) than the polynomially-solvable assignment-problem.
The demo shown in the other answer might work and ortools' cp-sat is great, but i don't see a good reason to use discrete-time based reasoning here like it's done: interval-variables, edge-finding and co. based scheduling constraints (+ conflict-analysis / explanations). This stuff is total overkill and the overhead will be noticable. I don't see any need to reason about time, but just about time-induced conflicts.
Edit: One could label those two approaches (linked + proposed) as compact formulation and extended formulation. Extended formulations usually show stronger relaxations and better (solving) results as long as scalability is not an issue. Compact approaches might become more viable again with bigger data (bit it's hard to guess here as scheduling-propagators are not that cheap).
What i would propose:
(1) Formulate an integer-programming model following the basic assignment-problem formulation + adaptions to make it rectangular -> a worker is allowed to tackle multiple tasks while all tasks are tackled (one sum-equality dropped)
see wiki
(2) Add integrality = mark variables as binary -> because the problem is not satisfying total unimodularity anymore
(3) Add constraints to forbid conflicts
(4) Add constraints: remaining stuff (e.g. compatibility)
Now this is all straightforward, but i would propose one non-naive improvement in regards to (3):
The conflicts can be interpreted as stable-set polytope
Your conflicts are induced by a-priori defined time-windows and their overlappings (as i interpret it; this is the core assumption behind this whole answer)
This is an interval graph (because of time-windows)
All interval graphs are chordal
Chordal graphs allow enumeration of all max-cliques in poly-time (implying there are only polynomial many)
python: networkx.algorithms.chordal.chordal_graph_cliques
The set (enumeration) of all maximal cliques define the facets of the stable-set polytope
Those (a constraint for each element in the set) we add as constraints!
(The stable-set polytope on the graph in use here would also allow very very powerful semidefinite-relaxations but it's hard to foresee in which cases this would actually help due to SDPs being much more hard to work with: warmstart within tree-search; scalability; ...)
This will lead to a poly-size integer-programming problem which should be very very good when using a good IP-solver (commercials or if open-source needed: Cbc > GLPK).
Small demo about (3)
import itertools
import networkx as nx
# data: inclusive, exclusive
# --------------------------
time_windows = [
(2, 7),
(0, 10),
(6, 12),
(12, 20),
(8, 12),
(16, 20)
]
# helper
# ------
def is_overlapping(a, b):
return (b[1] > a[0] and b[0] < a[1])
# raw conflicts
# -------------
binary_conflicts = []
for a, b in itertools.combinations(range(len(time_windows)), 2):
if is_overlapping(time_windows[a], time_windows[b]):
binary_conflicts.append( (a, b) )
# conflict graph
# --------------
G = nx.Graph()
G.add_edges_from(binary_conflicts)
# maximal cliques
# ---------------
max_cliques = nx.chordal_graph_cliques(G)
print('naive constraints: raw binary conflicts')
for i in binary_conflicts:
print('sum({}) <= 1'.format(i))
print('improved constraints: clique-constraints')
for i in max_cliques:
print('sum({}) <= 1'.format(list(i)))
Output:
naive constraints: raw binary conflicts
sum((0, 1)) <= 1
sum((0, 2)) <= 1
sum((1, 2)) <= 1
sum((1, 4)) <= 1
sum((2, 4)) <= 1
sum((3, 5)) <= 1
improved constraints: clique-constraints
sum([1, 2, 4]) <= 1
sum([0, 1, 2]) <= 1
sum([3, 5]) <= 1
Fun facts:
Commercial integer-programming solvers and maybe even Cbc might even try to do the same reasoning about clique-constraints to some degree although without the assumption of chordality where it's an NP-hard problem
ortools' cp-sat solver has also a code-path for this (again: general NP-hard case)
Should trigger when expressing the conflict-based model (much harder to decide on this exploitation on general discrete-time based scheduling models)
Caveats
Implementation / Scalability
There are still open questions like:
duplicating max-clique constraints over each worker vs. merging them somehow
be more efficient/clever in finding conflicts (sorting)
will it scale to the data: how big will the graph be / how many conflicts and constraints from those do we need
But those things usually follow instance-statistics (aka "don't decide blindly").
I don't know a name for the specific variant you're describing - maybe others would. However, this indeed seems a good fit for a CP/MIP solver; I would go with the OR-Tools CP-SAT solver, which is free, flexible and usually works well.
Here's a reference implementation with Python, assuming each vehicle requires a team assigned to it with no overlaps, and that the goal is to minimize the number of teams in use.
The framework allows to directly model allowed / forbidden assignments (check out the docs)
from ortools.sat.python import cp_model
model = cp_model.CpModel()
## Data
num_vehicles = 20
max_teams = 10
# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]
### variables
# t, v is true iff vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}
#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v], team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v))
for t in range(max_teams) for v in range(num_vehicles)}
team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]
## constraints
# non overlap for each team
for t in range(max_teams):
model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)
# what teams are in use?
for t in range(max_teams):
model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])
#symmetry breaking - use teams in-order
for t in range(max_teams-1):
model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())
# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))
solver = cp_model.CpSolver()
# optional
# solver.parameters.log_search_progress = True
# solver.parameters.num_search_workers = 8
# solver.parameters.max_time_in_seconds = 5
result_status = solver.Solve(model)
if (result_status == cp_model.INFEASIBLE):
print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):
print('Feasible (non optimal) result found')
else:
print('No feasible solution found under constraints within time')
# Output:
#
# Optimal result found, required teams=7
EDIT:
#sascha suggested a beautiful approach for analyzing the (known in advance) time window overlaps, which would make this solvable as an assignment problem.
So while the formulation above might not be the optimal one for this (although it could be, depending on how the solver works), I've tried to replace the no-overlap conditions with the max-clique approach suggested - full code below.
I did some experiments with moderately large problems (100 and 300 vehicles), and it seems empirically that on smaller problems (~100) this does improve by some - about 15% on average on the time to optimal solution; but I could not find a significant improvement on the larger (~300) problems. This might be either because my formulation is not optimal; because the CP-SAT solver (which is also a good IP solver) is smart enough; or because there's something I've missed :)
Code:
(this is basically the same code from above, with the logic to support using the network approach instead of the no-overlap one copied from #sascha's answer):
from timeit import default_timer as timer
from ortools.sat.python import cp_model
model = cp_model.CpModel()
run_start_time = timer()
## Data
num_vehicles = 300
max_teams = 300
USE_MAX_CLIQUES = True
# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]
if (USE_MAX_CLIQUES):
## Max-cliques analysis
# for the max-clique approach
time_windows = [(interval_starts[i], interval_ends[i]) for i in range(num_vehicles)]
def is_overlapping(a, b):
return (b[1] > a[0] and b[0] < a[1])
# raw conflicts
# -------------
binary_conflicts = []
for a, b in itertools.combinations(range(len(time_windows)), 2):
if is_overlapping(time_windows[a], time_windows[b]):
binary_conflicts.append( (a, b) )
# conflict graph
# --------------
G = nx.Graph()
G.add_edges_from(binary_conflicts)
# maximal cliques
# ---------------
max_cliques = nx.chordal_graph_cliques(G)
##
### variables
# t, v is true iff point vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}
#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v], team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v))
for t in range(max_teams) for v in range(num_vehicles)}
team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]
## constraints
# non overlap for each team
if (USE_MAX_CLIQUES):
overlap_constraints = [list(l) for l in max_cliques]
for t in range(max_teams):
for l in overlap_constraints:
model.Add(sum(team_assignments[t, v] for v in l) <= 1)
else:
for t in range(max_teams):
model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)
# what teams are in use?
for t in range(max_teams):
model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])
#symmetry breaking - use teams in-order
for t in range(max_teams-1):
model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())
# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))
solver = cp_model.CpSolver()
# optional
solver.parameters.log_search_progress = True
solver.parameters.num_search_workers = 8
solver.parameters.max_time_in_seconds = 120
result_status = solver.Solve(model)
if (result_status == cp_model.INFEASIBLE):
print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):
print('Feasible (non optimal) result found, required teams=%i' % (solver.ObjectiveValue()))
else:
print('No feasible solution found under constraints within time')
print('run time: %.2f sec ' % (timer() - run_start_time))

Solving vector second order differential equation while indexing into an array

I'm attempting to solve the differential equation:
m(t) = M(x)x'' + C(x, x') + B x'
where x and x' are vectors with 2 entries representing the angles and angular velocity in a dynamical system. M(x) is a 2x2 matrix that is a function of the components of theta, C is a 2x1 vector that is a function of theta and theta' and B is a 2x2 matrix of constants. m(t) is a 2*1001 array containing the torques applied to each of the two joints at the 1001 time steps and I would like to calculate the evolution of the angles as a function of those 1001 time steps.
I've transformed it to standard form such that :
x'' = M(x)^-1 (m(t) - C(x, x') - B x')
Then substituting y_1 = x and y_2 = x' gives the first order linear system of equations:
y_2 = y_1'
y_2' = M(y_1)^-1 (m(t) - C(y_1, y_2) - B y_2)
(I've used theta and phi in my code for x and y)
def joint_angles(theta_array, t, torques, B):
phi_1 = np.array([theta_array[0], theta_array[1]])
phi_2 = np.array([theta_array[2], theta_array[3]])
def M_func(phi):
M = np.array([[a_1+2.*a_2*np.cos(phi[1]), a_3+a_2*np.cos(phi[1])],[a_3+a_2*np.cos(phi[1]), a_3]])
return np.linalg.inv(M)
def C_func(phi, phi_dot):
return a_2 * np.sin(phi[1]) * np.array([-phi_dot[1] * (2. * phi_dot[0] + phi_dot[1]), phi_dot[0]**2])
dphi_2dt = M_func(phi_1) # (torques[:, t] - C_func(phi_1, phi_2) - B # phi_2)
return dphi_2dt, phi_2
t = np.linspace(0,1,1001)
initial = theta_init[0], theta_init[1], dtheta_init[0], dtheta_init[1]
x = odeint(joint_angles, initial, t, args = (torque_array, B))
I get the error that I cannot index into torques using the t array, which makes perfect sense, however I am not sure how to have it use the current value of the torques at each time step.
I also tried putting odeint command in a for loop and only evaluating it at one time step at a time, using the solution of the function as the initial conditions for the next loop, however the function simply returned the initial conditions, meaning every loop was identical. This leads me to suspect I've made a mistake in my implementation of the standard form but I can't work out what it is. It would be preferable however to not have to call the odeint solver in a for loop every time, and rather do it all as one.
If helpful, my initial conditions and constant values are:
theta_init = np.array([10*np.pi/180, 143.54*np.pi/180])
dtheta_init = np.array([0, 0])
L_1 = 0.3
L_2 = 0.33
I_1 = 0.025
I_2 = 0.045
M_1 = 1.4
M_2 = 1.0
D_2 = 0.16
a_1 = I_1+I_2+M_2*(L_1**2)
a_2 = M_2*L_1*D_2
a_3 = I_2
Thanks for helping!
The solver uses an internal stepping that is problem adapted. The given time list is a list of points where the internal solution gets interpolated for output samples. The internal and external time lists are in no way related, the internal list only depends on the given tolerances.
There is no actual natural relation between array indices and sample times.
The translation of a given time into an index and construction of a sample value from the surrounding table entries is called interpolation (by a piecewise polynomial function).
Torque as a physical phenomenon is at least continuous, a piecewise linear interpolation is the easiest way to transform the given function value table into an actual continuous function. Of course one also needs the time array.
So use numpy.interp1d or the more advanced routines of scipy.interpolate to define the torque function that can be evaluated at arbitrary times as demanded by the solver and its integration method.

mutual visibility of nodes on a grid

I have a simple grid, and I need check two nodes for mutual visibility. All walls and nodes coordinations is known. I need check two nodes for mutual visibility.
I have tried use vectors, but I didn't get acceptable result. This algorithm works, but it bad fit in my program, because of this i must do transformations of data to get acceptable result.
I used this code for check nodes for mutual visibility:
def finding_vector_grid(start, goal):
distance = [start[0]-goal[0], start[1]-goal[1]]
norm = math.sqrt(distance[0] ** 2 + distance[1] ** 2)
if norm == 0: return [1, 1]
direction = [(distance[0]/norm), (distance[1]/norm)]
return direction
def finding_vector_path(start, goal):
path = [start]
direction = finding_vector_grid((start[0]*cell_width, start[1]*cell_height),
(goal[0]*cell_width, goal[1]*cell_height))
x, y = start[0]*cell_width, start[1]*cell_height
point = start
while True:
if point not in path and in_map(point):
path.append(point)
elif not in_map(point):
break
x -= direction[0]
y -= direction[1]
point = (x//cell_width, y//cell_height)
return path
def vector_obstacles_clean(path, obstacles):
result = []
for node in path:
if node in obstacles:
result.append(node)
break
result.append(node)
return result
for example:
path = finding_vector_path((0, 0), (0, 5))
path = vector_obstacles_clean(path, [(0, 3)])
in_map - check if point not abroad map frontiers;
start, goal - tuples width x and y coords;
cell_width, cell_height - int variables with node width and height in pixels (I use pygame for visualization graph).
I have not any problems with this method, but it works not with graphs, it works "by itself", it not quite the that I need to. I am not good at English, please forgive me :)
The code you posted seems perfectly nice,
and your question doesn't clarify what needs improving.
Rather than doing FP arithmetic on vectors,
you might prefer to increment an integer X or Y pointer
one pixel at a time.
Consider using Bresenham's line algorithm,
which enumerates pixels in the line of sight
between start and goal.
The key observation is that for a given slope
it notices whether X or Y will increment faster,
and loops on that index.

Euler beam, solving differential equation in python

I must solve the Euler Bernoulli differential beam equation which is:
w’’’’(x) = q(x)
and boundary conditions:
w(0) = w(l) = 0
and
w′′(0) = w′′(l) = 0
The beam is as shown on the picture below:
beam
The continious force q is 2N/mm.
I have to use shooting method and scipy.integrate.odeint() func.
I can't even manage to start as i do not understand how to write the differential equation as a system of equation
Can someone who understands solving of differential equations with boundary conditions in python please help!
Thanks :)
The shooting method
To solve the fourth order ODE BVP with scipy.integrate.odeint() using the shooting method you need to:
1.) Separate the 4th order ODE into 4 first order ODEs by substituting:
u = w
u1 = u' = w' # 1
u2 = u1' = w'' # 2
u3 = u2' = w''' # 3
u4 = u3' = w'''' = q # 4
2.) Create a function to carry out the derivation logic and connect that function to the integrate.odeint() like this:
function calc(u, x , q)
{
return [u[1], u[2], u[3] , q]
}
w = integrate.odeint(calc, [w(0), guess, w''(0), guess], xList, args=(q,))
Explanation:
We are sending the boundary value conditions to odeint() for x=0 ([w(0), w'(0) ,w''(0), w'''(0)]) which calls the function calc which returns the derivatives to be added to the current state of w. Note that we are guessing the initial boundary conditions for w'(0) and w'''(0) while entering the known w(0)=0 and w''(0)=0.
Addition of derivatives to the current state of w occurs like this:
# the current w(x) value is the previous value plus the current change of w in dx.
w(x) = w(x-dx) + dw/dx
# others are calculated the same
dw(x)/dx = dw(x-dx)/dx + d^2w(x)/dx^2
# etc.
This is why we are returning values [u[1], u[2], u[3] , q] instead of [u[0], u[1], u[2] , u[3]] from the calc function, because u[1] is the first derivative so we add it to w, etc.
3.) Now we are able to set up our shooting method. We will be sending different initial boundary values for w'(0) and w'''(0) to odeint() and then check the end result of the returned w(x) profile to determine how close w(L) and w''(L) got to 0 (the known boundary conditions).
The program for the shooting method:
# a function to return the derivatives of w
def returnDerivatives(u, x, q):
return [u[1], u[2], u[3], q]
# a shooting funtion which takes in two variables and returns a w(x) profile for x=[0,L]
def shoot(u2, u4):
# the number of x points to calculate integration -> determines the size of dx
# bigger number means more x's -> better precision -> longer execution time
xSteps = 1001
# length of the beam
L= 1.0 # 1m
xSpace = np.linspace(0, L, xSteps)
q = 0.02 # constant [N/m]
# integrate and return the profile of w(x) and it's derivatives, from x=0 to x=L
return odeint(returnDerivatives, [ 0, u2, 0, u4] , xSpace, args=(q,))
# the tolerance for our results.
tolerance = 0.01
# how many numbers to consider for u2 and u4 (the guess boundary conditions)
u2_u4_maxNumbers = 1327 # bigger number, better precision, slower program
# you can also divide into separate variables like u2_maxNum and u4_maxNum
# these are already tested numbers (the best results are somewhere in here)
u2Numbers = np.linspace(-0.1, 0.1, u2_u4_maxNumbers)
# the same as above
u4Numbers = np.linspace(-0.5, 0.5, u2_u4_maxNumbers)
# result list for extracted values of each w(x) profile => [u2Best, u4Best, w(L), w''(L)]
# which will help us determine if the w(x) profile is inside tolerance
resultList = []
# result list for each U (or w(x) profile) => [w(x), w'(x), w''(x), w'''(x)]
resultW = []
# start generating numbers for u2 and u4 and send them to odeint()
for u2 in u2Numbers:
for u4 in u4Numbers:
U = []
U = shoot(u2,u4)
# get only the last row of the profile to determine if it passes tolerance check
result = U[len(U)-1]
# only check w(L) == 0 and w''(L) == 0, as those are the known boundary cond.
if (abs(result[0]) < tolerance) and (abs(result[2]) < tolerance):
# if the result passed the tolerance check, extract some values from the
# last row of the w(x) profile which we will need later for comaprisons
resultList.append([u2, u4, result[0], result[2]])
# add the w(x) profile to the list of profiles that passed the tolerance
# Note: the order of resultList is the same as the order of resultW
resultW.append(U)
# go through the resultList (list of extracted values from last row of each w(x) profile)
for i in range(len(resultList)):
x = resultList[i]
# both boundary conditions are 0 for both w(L) and w''(L) so we will simply add
# the two absolute values to determine how much the sum differs from 0
y = abs(x[2]) + abs(x[3])
# if we've just started set the least difference to the current
if i == 0:
minNum = y # remember the smallest difference to 0
index = 0 # remember index of best profile
elif y < minNum:
# current sum of absolute values is smaller
minNum = y
index = i
# print out the integral for w(x) over the beam
sum = 0
for i in resultW[index]:
sum = sum + i[0]
print("The integral of w(x) over the beam is:")
print(sum/1001) # sum/xSteps
This outputs:
The integral of w(x) over the beam is:
0.000135085272117
To print out the best profile for w(x) that we found:
print(resultW[index])
which outputs something like:
# w(x) w'(x) w''(x) w'''(x)
[[ 0.00000000e+00 7.54147813e-04 0.00000000e+00 -9.80392157e-03]
[ 7.54144825e-07 7.54142917e-04 -9.79392157e-06 -9.78392157e-03]
[ 1.50828005e-06 7.54128237e-04 -1.95678431e-05 -9.76392157e-03]
...,
[ -4.48774290e-05 -8.14851572e-04 1.75726275e-04 1.01560784e-02]
[ -4.56921910e-05 -8.14670764e-04 1.85892353e-04 1.01760784e-02]
[ -4.65067671e-05 -8.14479780e-04 1.96078431e-04 1.01960784e-02]]
To double check the results from above we will also solve the ODE using the numerical method.
The numerical method
To solve the problem using the numerical method we first need to solve the differential equations. We will get four constants which we need to find with the help of the boundary conditions. The boundary conditions will be used to form a system of equations to help find the necessary constants.
For example:
w’’’’(x) = q(x);
means that we have this:
d^4(w(x))/dx^4 = q(x)
Since q(x) is constant after integrating we have:
d^3(w(x))/dx^3 = q(x)*x + C
After integrating again:
d^2(w(x))/dx^2 = q(x)*0.5*x^2 + C*x + D
After another integration:
dw(x)/dx = q(x)/6*x^3 + C*0.5*x^2 + D*x + E
And finally the last integration yields:
w(x) = q(x)/24*x^4 + C/6*x^3 + D*0.5*x^2 + E*x + F
Then we take a look at the boundary conditions (now we have expressions from above for w''(x) and w(x)) with which we make a system of equations to solve the constants.
w''(0) => 0 = q(x)*0.5*0^2 + C*0 + D
w''(L) => 0 = q(x)*0.5*L^2 + C*L + D
This gives us the constants:
D = 0 # from the first equation
C = - 0.01 * L # from the second (after inserting D=0)
After repeating the same for w(0)=0 and w(L)=0 we obtain:
F = 0 # from first
E = 0.01/12.0 * L^3 # from second
Now, after we have solved the equation and found all of the integration constants we can make the program for the numerical method.
The program for the numerical method
We will make a FOR loop to go through the entire beam for every dx at a time and sum up (integrate) w(x).
L = 1.0 # in meters
step = 1001.0 # how many steps to take (dx)
q = 0.02 # constant [N/m]
integralOfW = 0.0; # instead of w(0) enter the boundary condition value for w(0)
result = []
for i in range(int(L*step)):
x= i/step
w = (q/24.0*pow(x,4) - 0.02/12.0*pow(x,3) + 0.01/12*pow(L,3)*x)/step # current w fragment
# add up fragments of w for integral calculation
integralOfW += w
# add current value of w(x) to result list for plotting
result.append(w*step);
print("The integral of w(x) over the beam is:")
print(integralOfW)
which outputs:
The integral of w(x) over the beam is:
0.00016666652805511192
Now to compare the two methods
Result comparison between the shooting method and the numerical method
The integral of w(x) over the beam:
Shooting method -> 0.000135085272117
Numerical method -> 0.00016666652805511192
That's a pretty good match, now lets see check the plots:
From the plots it's even more obvious that we have a good match and that the results of the shooting method are correct.
To get even better results for the shooting method increase xSteps and u2_u4_maxNumbers to bigger numbers and you can also narrow down the u2Numbers and u4Numbers to the same set size but a smaller interval (around the best results from previous program runs). Keep in mind that setting xSteps and u2_u4_maxNumbers too high will cause your program to run for a very long time.
You need to transform the ODE into a first order system, setting u0=w one possible and usually used system is
u0'=u1,
u1'=u2,
u2'=u3,
u3'=q(x)
This can be implemented as
def ODEfunc(u,x): return [ u[1], u[2], u[3], q(x) ]
Then make a function that shoots with experimental initial conditions and returns the components of the second boundary condition
def shoot(u01, u03): return odeint(ODEfunc, [0, u01, 0, u03], [0, l])[-1,[0,2]]
Now you have a function of two variables with two components and you need to solve this 2x2 system with the usual methods. As the system is linear, the shooting function is linear as well and you only need to find the coefficients and solve the resulting linear system.

What does `sample_weight` do to the way a `DecisionTreeClassifier` works in sklearn?

I've read from the relevant documentation that :
Class balancing can be done by sampling an equal number of samples from each class, or preferably by normalizing the sum of the sample weights (sample_weight) for each class to the same value.
But, it is still unclear to me how this works. If I set sample_weight with an array of only two possible values, 1's and 2's, does this mean that the samples with 2's will get sampled twice as often as the samples with 1's when doing the bagging? I cannot think of a practical example for this.
Some quick preliminaries:
Let's say we have a classification problem with K classes. In a region of feature space represented by the node of a decision tree, recall that the "impurity" of the region is measured by quantifying the inhomogeneity, using the probability of the class in that region. Normally, we estimate:
Pr(Class=k) = #(examples of class k in region) / #(total examples in region)
The impurity measure takes as input, the array of class probabilities:
[Pr(Class=1), Pr(Class=2), ..., Pr(Class=K)]
and spits out a number, which tells you how "impure" or how inhomogeneous-by-class the region of feature space is. For example, the gini measure for a two class problem is 2*p*(1-p), where p = Pr(Class=1) and 1-p=Pr(Class=2).
Now, basically the short answer to your question is:
sample_weight augments the probability estimates in the probability array ... which augments the impurity measure ... which augments how nodes are split ... which augments how the tree is built ... which augments how feature space is diced up for classification.
I believe this is best illustrated through example.
First consider the following 2-class problem where the inputs are 1 dimensional:
from sklearn.tree import DecisionTreeClassifier as DTC
X = [[0],[1],[2]] # 3 simple training examples
Y = [ 1, 2, 1 ] # class labels
dtc = DTC(max_depth=1)
So, we'll look trees with just a root node and two children. Note that the default impurity measure the gini measure.
Case 1: no sample_weight
dtc.fit(X,Y)
print dtc.tree_.threshold
# [0.5, -2, -2]
print dtc.tree_.impurity
# [0.44444444, 0, 0.5]
The first value in the threshold array tells us that the 1st training example is sent to the left child node, and the 2nd and 3rd training examples are sent to the right child node. The last two values in threshold are placeholders and are to be ignored. The impurity array tells us the computed impurity values in the parent, left, and right nodes respectively.
In the parent node, p = Pr(Class=1) = 2. / 3., so that gini = 2*(2.0/3.0)*(1.0/3.0) = 0.444..... You can confirm the child node impurities as well.
Case 2: with sample_weight
Now, let's try:
dtc.fit(X,Y,sample_weight=[1,2,3])
print dtc.tree_.threshold
# [1.5, -2, -2]
print dtc.tree_.impurity
# [0.44444444, 0.44444444, 0.]
You can see the feature threshold is different. sample_weight also affects the impurity measure in each node. Specifically, in the probability estimates, the first training example is counted the same, the second is counted double, and the third is counted triple, due to the sample weights we've provided.
The impurity in the parent node region is the same. This is just a coincidence. We can compute it directly:
p = Pr(Class=1) = (1+3) / (1+2+3) = 2.0/3.0
The gini measure of 4/9 follows.
Now, you can see from the chosen threshold that the first and second training examples are sent to the left child node, while the third is sent to the right. We see that impurity is calculated to be 4/9 also in the left child node because:
p = Pr(Class=1) = 1 / (1+2) = 1/3.
The impurity of zero in the right child is due to only one training example lying in that region.
You can extend this with non-integer sample-wights similarly. I recommend trying something like sample_weight = [1,2,2.5], and confirming the computed impurities.

Resources