Bayesian network conditional independency - statistics

If we observe that it is cloudy and raining. What is the probability that the grass is wet? The answer would be:
P(W=T|C=T,R =T) = P(W=T|R=T,S=T)*P(S=T|C=T)+P(W=T|R=T,S=F)*P(S=F|C=T)
But if we observe that the sprinkler is on and the grass is wet, then what would be the probability that it is raining? I'm not sure what would be the solution query to this problem?

The question is a bit off-topic and better for math, because formulas aren't supported here...
1) First, apply the definition of conditional probability:
p(R|S,W) = p(R,S,W) / p(S,W)
2) The numerator can be computed by the Law of total probability:
p(R,S,W) = p(R,S,W|C)p(C) + p(R,S,W|!C)p(!C)
and Bayesian network condition:
p(R,S,W|C) = p(W|S,R) p(S|C) p(R|C)
3) The denominator is computed likewise, but conditioning on both R and C:
p(S,W) = p(S,W|R,C)p(R|C)p(C) + p(S,W|R,!C)p(R|!C)p(!C) +
p(S,W|!R,C)p(!R|C)p(C) + p(S,W|!R,!C)p(!R|!C)p(!C)
Finally, each
p(S,W|R,C) = p(S,W,R,C) / p(R,C) =
p(W|S,R) p(S|C) p(R, C) / p(R,C) =
p(W|S,R) p(S|C)
This will give you all four: p(S,W|R,C), p(S,W|R,!C), p(S,W|!R,C) and p(S,W|!R,!C), which in turn give p(S,W).

Related

How to define log-count ratio for multiclass text dataset (fastai)?

I am trying to follow Rachel Thomas path of sentiment classification with Naive Bayes. In the video she uses a binary dataset (pos. and neg. movie reviews). When it comes to apply Naive Bayes, this is what she does:
Defintion: log-count ratio r for each word f:
r = log (ratio of feature f in positive documents) / (ratio of feature f in negative documents)
where ratio of feature $f$ in positive documents is the number of times a positive document has a feature divided by the number of positive documents.
p1 = np.squeeze(np.asarray(x[y.items==positive].sum(0)))
p0 = np.squeeze(np.asarray(x[y.items==negative].sum(0)))
pr1 = (p1+1) / ((y.items==positive).sum() + 1)
pr0 = (p0+1) / ((y.items==negative).sum() + 1)
r = np.log(pr1/pr0)
--> it is very simple to apply the log-count-ratio to a dataset with 2 labels!
Problem:
My dataset is not binary! Lets assume I have 5 labels: label_1,...,label_5
How do I get the log-count ratio r for multilabel dataset?
My approach:
p4 = np.squeeze(np.asarray(x[y.items==label_5].sum(0)))
p3 = np.squeeze(np.asarray(x[y.items==label_4].sum(0)))
p2 = np.squeeze(np.asarray(x[y.items==label_3].sum(0)))
p1 = np.squeeze(np.asarray(x[y.items==label_2].sum(0)))
p0 = np.squeeze(np.asarray(x[y.items==label_1].sum(0)))
log-count-ratio:
pr1 = (p1+1) / ((y.items==label_2).sum() + 1)
pr1_not = (p1+1) / ((y.items!=label_2).sum() + 1)
r_1 = np.log(pr1/pr1_not)
log-count-ratio:
pr2 = (p2+1) / ((y.items==label_3).sum() + 1)
pr2_not = (p2+1) / ((y.items!=label_3).sum() + 1)
r_2 = np.log(pr2/pr2_not)
...
Is this correct? Does it mean I get multiple ratios?
Yes this is correct. The “negative class” is basically all the classes but the one you are considering. So yes, you will get multiple ratios (as many as number of classes you have).
From https://marvinlsj.github.io/2018/11/23/NBSVM%20for%20sentiment%20and%20topic%20classification/
, the log-count-ratio is derived from Posterior prob ratio which is good for comparing 2 classes to get insight into which is the most probable. I guess you're trying to do one-vs-one method for multi-class problem. This will end up with 5x4/2=10 pairs of ratios for classification. If you'd like to do classification only, we normally compute Posterior prob for each class and select the best. So in your case, you just select the best from sum(log(p1)), sum(log(p2)), ..., sum(log(p5)).

Sympy Solveset Multi Variable Non-Linear Solutions

I am having a bit of trouble with Sympy's solveset. I am trying to use Sympy to find a solution to an basic circuit analysis question involving three unknown resistors and two equations. I realize that I will have to guess at the value of one of the resistors and then calculate the value of the other two resistors. I am using superposition to solve the circuit.
This is the circuit and superposition I am trying to solve
import sympy
V_outa, V_outb, R_1, R_2, R_3, V_1, V_2 = symbols('V_outa V_outb R_1 R_2 R_3 V_1 V_2')
### Here are a bunch of variable definitions.
R_eq12 = R_2*R_3/(R_2+R_3)
R_eq123 = R_eq12 + R_1
V_outa = V_1 *R_eq12/R_eq123
R_eq13 = R_1*R_3/(R_1+R_3)
R_eq123b = R_2 + R_eq13
V_outb = V_2 * R_eq13/R_eq123b
### Here is my governing equation.
V_out = 0.5* V_outb + (1/6) *V_outa
### Now I can start setting up the equations to solve. This sets the
### coefficient of the V1 term equal to the 1/2 in the governing equation
### and the coefficient of the V2 term equal to 1/6. I have also guessed
### that R_3 is equal to 10 ohms.
eq1 = Eq(1.0/2.0, V_out.coeff(V_2).subs(R_3, 10))
eq2 = Eq(1.0/6.0, V_out.coeff(V_2).subs(R_3, 10))
#### Now when I try to solve eq1 and eq2 with solveset, I get an error.
solveset([eq1, eq2], (R_2, R_3))
And here is the error I get:
ValueError: [Eq(0.500000000000000, 10*R_1/((R_1 + 10)*(10*R_1/(R_1 +
10) + R_2))), Eq(0.166666666666667, 10*R_1/((R_1 + 10)*(10*R_1/(R_1 +
10) + R_2)))] is not a valid SymPy expression
The other thing I don't understand is the set type I get when I try to solve it this way. Could someone also explain what set type this is, and how to make use of it?
expr3 = -1.0/2.0 + V_out.coeff(V_2).subs(R_3, 10)
solveset(expr3, R_2)
A screen shot of the weird set type
Any help would be much appreciated. I know it is a solvable set because Wolfram Alpha had no problems with it.
Thanks!
David

Would I reverse this calculation?

I have a spreadsheet which works out margin price etc.
I'm wondering if I can reverse my Margin Calculator?
Presently it works out the Margin %
=((I1-F1)-H1-K1-N1)/I1
(I=Net Price, F=Cost, H=Carriage, K=Fees, N=Promotions)
I want to be able to type a Margin % and it calculates the price and if possible add the round up feature? so everything ends in .99.
Using
=ROUNDUP(SUM(I864*M864)+(I864),0.1)-0.01
(M being Margin field) it calculate using the margin, but it differs from the original price
e.g. Price 69.99=19% Margin
Reverse Calculation 19%=66.99
Difference of 3, i'm really confused and cannot get my head around this, can someone please shed some light?
If M1 = ((I1-F1)-H1-K1-N1)/I1 then reverse I1 = (F1+H1+K1+N1)/(1-M1).
This is done using math to change the subject of the formula:
M1 = ((I1-F1)-H1-K1-N1)/I1 / parenthesis not necessary
M1 = (I1-F1-H1-K1-N1)/I1 / multiply out brackets (*1/I1)
M1 = I1/I1 -F1/I1-H1/I1-K1/I1-N1/I1
M1 = 1 -F1/I1-H1/I1-K1/I1-N1/I1 / -1
M1-1 = -F1/I1-H1/I1-K1/I1-N1/I1 / *I1
(M1-1) * I1 = -F1-H1-K1-N1 / *-1
-1*(M1-1) * I1 = -1*(-F1-H1-K1-N1)
(1-M1) * I1 = F1+H1+K1+N1 / /(1-M1)
I1 = (F1+H1+K1+N1)/(1-M1)

Line segment intersection

I found this code snippet on raywenderlich.com, however the link to the explanation wasn't valid anymore. I "translated" the answer into Swift, I hope you can understand, it's actually quite easy even without knowing the language. Could anyone explain what exactly is going on here? Thanks for any help.
class func linesCross(#line1: Line, line2: Line) -> Bool {
let denominator = (line1.end.y - line1.start.y) * (line2.end.x - line2.start.x) -
(line1.end.x - line1.start.x) * (line2.end.y - line2.start.y)
if denominator == 0 { return false } //lines are parallel
let ua = ((line1.end.x - line1.start.x) * (line2.start.y - line1.start.y) -
(line1.end.y - line1.start.y) * (line2.start.x - line1.start.x)) / denominator
let ub = ((line2.end.x - line2.start.x) * (line2.start.y - line1.start.y) -
(line2.end.y - line2.start.y) * (line2.start.x - line1.start.x)) / denominator
//lines may touch each other - no test for equality here
return ua > 0 && ua < 1 && ub > 0 && ub < 1
}
You can find a detailed segment-intersection algorithm
in the book Computational Geometry in C, Sec. 7.7.
The SegSegInt code described there is available here.
I recommend avoiding slope calculations.
There are several "degenerate" cases that require care: collinear segments
overlapping or not, one segment endpoint in the interior of the other segments,
etc. I wrote the code to return an indication of these special cases.
This is what the code is doing.
Every point P in the segment AB can be described as:
P = A + u(B - A)
for some constant 0 <= u <= 1. In fact, when u=0 you get P=A, and you getP=B when u=1. Intermediate values of u will give you intermediate values of P in the segment. For instance, when u = 0.5 you will get the point in the middle. In general, you can think of the parameter u as the ratio between the lengths of AP and AB.
Now, if you have another segment CD you can describe the points Q on it in the same way, but with a different u, which I will call v:
Q = C + v(D - C)
Again, keep in mind that Q lies between C and D if, and only if, 0 <= v <= 1 (same as above for P).
To find the intersection between the two segments you have to equate P=Q. In other words, you need to find u and v, both between 0 and 1 such that:
A + u(B - A) = C + v(D - C)
So, you have this equation and you have to see if it is solvable within the given constraints on u and v.
Given that A, B, C and D are points with two coordinates x,y each, you can open the equation above into two equations:
ax + u(bx - ax) = cx + v(dx - cx)
ay + u(by - ay) = cy + v(dy - cy)
where ax = A.x, ay = A.y, etc., are the coordinates of the points.
Now we are left with a 2x2 linear system. In matrix form:
|bx-ax cx-dx| |u| = |cx-ax|
|by-ay cy-dy| |v| |cy-ay|
The determinant of the matrix is
det = (bx-ax)(cy-dy) - (by-ay)(cx-dx)
This quantity corresponds to the denominator of the code snippet (please check).
Now, multiplying both sides by the cofactor matrix:
|cy-dy dx-cx|
|ay-by bx-ax|
we get
det*u = (cy-dy)(cx-ax) + (dx-cx)(cy-ay)
det*v = (ay-by)(cx-ax) + (bx-ax)(cy-ay)
which correspond to the variables ua and ub defined in the code (check this too!)
Finally, once you have u and v you can check whether they are both between 0 and 1 and in that case return that there is intersection. Otherwise, there isn't.
For a given line the slope is
m=(y_end-y_start)/(x_end-x_start)
if two slopes are equal, the lines are parallel
m1=m1
(y1_end-y_start)/(x1_end-x1_start)=(y2_end-y2_start)/(x2_end-x2_start)
And this is equivalent to checking that the denominator is not zero,
Regarding the rest of the code, find the explanation on wikipedia under "Given two points on each line"

Best fit square to quadrilateral

I've got a shape consisting of four points, A, B, C and D, of which the only their position is known. The goal is to transform these points to have specific angles and offsets relative to each other.
For example: A(-1,-1) B(2,-1) C(1,1) D(-2,1), which should be transformed to a perfect square (all angles 90) with offsets between AB, BC, CD and AD all being 2. The result should be a square slightly rotated counter-clockwise.
What would be the most efficient way to do this?
I'm using this for a simple block simulation program.
As Mark alluded, we can use constrained optimization to find the side 2 square that minimizes the square of the distance to the corners of the original.
We need to minimize f = (a-A)^2 + (b-B)^2 + (c-C)^2 + (d-D)^2 (where the square is actually a dot product of the vector argument with itself) subject to some constraints.
Following the method of Lagrange multipliers, I chose the following distance constraints:
g1 = (a-b)^2 - 4
g2 = (c-b)^2 - 4
g3 = (d-c)^2 - 4
and the following angle constraints:
g4 = (b-a).(c-b)
g5 = (c-b).(d-c)
A quick napkin sketch should convince you that these constraints are sufficient.
We then want to minimize f subject to the g's all being zero.
The Lagrange function is:
L = f + Sum(i = 1 to 5, li gi)
where the lis are the Lagrange multipliers.
The gradient is non-linear, so we have to take a hessian and use multivariate Newton's method to iterate to a solution.
Here's the solution I got (red) for the data given (black):
This took 5 iterations, after which the L2 norm of the step was 6.5106e-9.
While Codie CodeMonkey's solution is a perfectly valid one (and a great use case for the Lagrangian Multipliers at that), I believe that it's worth mentioning that if the side length is not given this particular problem actually has a closed form solution.
We would like to minimise the distance between the corners of our fitted square and the ones of the given quadrilateral. This is equivalent to minimising the cost function:
f(x1,...,y4) = (x1-ax)^2+(y1-ay)^2 + (x2-bx)^2+(y2-by)^2 +
(x3-cx)^2+(y3-cy)^2 + (x4-dx)^2+(y4-dy)^2
Where Pi = (xi,yi) are the corners of the fitted square and A = (ax,ay) through D = (dx,dy) represent the given corners of the quadrilateral in clockwise order. Since we are fitting a square we have certain contraints regarding the positions of the four corners. Actually, if two opposite corners are given, they are enough to describe a unique square (save for the mirror image on the diagonal).
Parametrization of the points
This means that two opposite corners are enough to represent our target square. We can parametrise the two remaining corners using the components of the first two. In the above example we express P2 and P4 in terms of P1 = (x1,y1) and P3 = (x3,y3). If you need a visualisation of the geometrical intuition behind the parametrisation of a square you can play with the interactive version.
P2 = (x2,y2) = ( (x1+x3-y3+y1)/2 , (y1+y3-x1+x3)/2 )
P4 = (x4,y4) = ( (x1+x3+y3-y1)/2 , (y1+y3+x1-x3)/2 )
Substituting for x2,x4,y2,y4 means that f(x1,...,y4) can be rewritten to:
f(x1,x3,y1,y3) = (x1-ax)^2+(y1-ay)^2 + ((x1+x3-y3+y1)/2-bx)^2+((y1+y3-x1+x3)/2-by)^2 +
(x3-cx)^2+(y3-cy)^2 + ((x1+x3+y3-y1)/2-dx)^2+((y1+y3+x1-x3)/2-dy)^2
a function which only depends on x1,x3,y1,y3. To find the minimum of the resulting function we then set the partial derivatives of f(x1,x3,y1,y3) equal to zero. They are the following:
df/dx1 = 4x1-dy-dx+by-bx-2ax = 0 --> x1 = ( dy+dx-by+bx+2ax)/4
df/dx3 = 4x3+dy-dx-by-bx-2cx = 0 --> x3 = (-dy+dx+by+bx+2cx)/4
df/dy1 = 4y1-dy+dx-by-bx-2ay = 0 --> y1 = ( dy-dx+by+bx+2ay)/4
df/dy3 = 4y3-dy-dx-2cy-by+bx = 0 --> y3 = ( dy+dx+by-bx+2cy)/4
You may see where this is going, as simple rearrangment of the terms leads to the final solution.
Final solution

Resources