Hidden Markov Model Coin & Dice Example with Prolog - statistics

I got this specific problem from here and wanted to implement it in cplint, as I'm currently learning the principles of ProbLog.
So from the above model we get
• A red die, having six sides, labeled 1 through 6.
• A green die, having twelve sides, five of which are labeled 2 through 6, while the
remaining seven sides are labeled 1.
• A weighted red coin, for which the probability of heads is 0.9 and the probability
of tails is 0.1.
• A weighted green coin, for which the probability of heads is 0.95 and the
probability of tails is 0.05.
As a solution, I want to create a sequence of numbers from the set {1, 2, 3, 4, 5, 6} with the
following rules:
• Begin by rolling the red die and writing down the number that comes up, which is
the emission/observation.
• Toss the red coin and do one of the following:
➢ If the result is heads, roll the red die and write down the result.
➢ If the result is tails, roll the green die and write down the result.
• At each subsequent step, you flip the coin that has the same color as the die you
rolled in the previous step. If the coin comes up heads, roll the same die as in the
previous step. If the coin comes up tails, switch to the other die.
My state diagram for this model has two states, red and green, as shown in the
figure. In addition, this figure shows: 1) the state-transition probability matrix A, 2) the
discrete emission/observation probabilities matrix B, and 3) the initial (prior)
probabilities matrix π. The model is not hidden because you know the sequence of states
from the colors of the coins and dice. Suppose, however, that someone else is generating
the emissions/observations without showing you the dice or the coins. All you see is the
sequence of emissions/observations. If you start seeing more 1s than other numbers, you
might suspect that the model is in the green state, but you cannot be sure because you
cannot see the color of the die being rolled.
Consider the Hidden Markov Model
(HMM) M=(A, B, π). Given the observation sequence O=<1,1,2,2,3,6,1,1,1,3>, what
is the probability that the hidden sequence is
H =<RC,GC, GC,RC, RC,GC, GC,GC,GC,GC>
where RC and GC stand for Red Coin and Green Coin respectively? Use
cplint or ProbLog to calculate the probability that the model M generated the sequence
O. That is, calculate the probability
P(H|O) = P(<RC,GC, GC,RC, RC,GC, GC,GC,
GC,GC>| <1,1,2,2,3,6,1,1,1,3>)
What I did so far are two approaches.
1)
:- use_module(library(pita)).
:- if(current_predicate(use_rendering/1)).
:- use_rendering(c3).
:- use_rendering(graphviz).
:- endif.
:- pita.
:- begin_lpad.
hmm(O):-hmm1(_,O).
hmm1(S,O):-hmm(q1,[],S,O).
hmm(end,S,S,[]).
hmm(Q,S0,S,[L|O]):-
Q\= end,
next_state(Q,Q1,S0),
letter(Q,L,S0),
hmm(Q1,[Q|S0],S,O).
next_state(q1,q1,S):0.9;
next_state(q1,q2,S):0.1.
next_state(q2,q1,S):0.05;
next_state(q2,q2,S):0.95.
letter(q1,rd1,S):1/6;
letter(q1,rd2,S):1/6;
letter(q1,rd3,S):1/6;
letter(q1,rd4,S):1/6;
letter(q1,rd5,S):1/6;
letter(q1,rd6,S):1/6.
letter(q2,gd1,S):7/12;
letter(q2,gd2,S):1/12;
letter(q2,gd3,S):1/12;
letter(q2,gd4,S):1/12;
letter(q2,gd5,S):1/12;
letter(q2,gd6,S):1/12.
:- end_lpad.
state_diagram(digraph(G)):-
findall(edge(A -> B,[label=P]),
(clause(next_state(A,B,_,_,_),
(get_var_n(_,_,_,_,Probs,_),equalityc(_,_,N,_))),
nth0(N,Probs,P)),
G).
with which I'm creating the diagram,
2) and the second one is this, in which I'm just creating the two coins and dice. I don't know how to continue from here. The first one is adapted from an example from cplint. I cannot find any other forum dedicated to this kind of task. It seems like ProbLog is "dead".
:- use_module(library(pita)).
:- if(current_predicate(use_rendering/1)).
:- use_rendering(c3).
:- endif.
:- pita.
:- begin_lpad.
heads(RC): 0.9; tails(RC) : 0.1:- toss(RC).
heads(GC): 0.95; tails(GC) : 0.05:- toss(GC).
toss(rc);
RD(0,1):1/6;RD(0,2):1/6;RD(0,3):1/6;RD(0,4):1/6;RD(0,5):1/6;RD(0,6):1/6.
RD(0,1):1/6;RD(0,2):1/6;RD(0,3):1/6;RD(0,4):1/6;RD(0,5):1/6;RD(0,6):1/6:-
X1 is X-1,X1>=0,
RD(X1,_),
\+ RD(X1,6)
GD(0,1):1/12;GD(0,2):1/12;GD(0,3):1/12;GD(0,4):1/12;GD(0,5):1/12;GD(0,6):7/12.
GD(0,1):1/12;GD(0,2):1/12;GD(0,3):1/12;GD(0,4):1/12;GD(0,5):1/12;GD(0,6):7/12:-
X1 is X1-1,X1>=0,
GD(X1,_),
\+ GD(X1,12).
toss(RC).
toss(GC).
:- end_lpad.

Not sure if this is still useful but in ProbLog you could have tried something like this:
%% Probabilities
1/6::red_die(1,T) ; 1/6::red_die(2,T) ; 1/6::red_die(3,T) ;
1/6::red_die(4,T) ; 1/6::red_die(5,T) ; 1/6::red_die(6,T).
7/12::green_die(1,T) ; 1/12::green_die(2,T) ; 1/12::green_die(3,T) ;
1/12::green_die(4,T) ; 1/12::green_die(5,T) ; 1/12::green_die(6,T).
0.9::red_coin_head(T).
0.95::green_coin_head(T).
%% Rules
% Start with tossing red
toss_red(1).
% Toss red if previous toss was red, head.
toss_red(T) :- T > 1, Tprev is T - 1, toss_red(Tprev), red_coin_head(Tprev).
% Toss red if previous toss was green but tails.
toss_red(T) :- T > 1, Tprev is T - 1, toss_green(Tprev), \+green_coin_head(Tprev).
% Toss green if previous toss was green, head.
toss_green(T) :- T > 1, Tprev is T - 1, toss_green(Tprev), green_coin_head(Tprev).
% Toss green if previous toss was red but tails.
toss_green(T) :- T > 1, Tprev is T - 1, toss_red(Tprev), \+red_coin_head(Tprev).
% Writing results from red_die if next toss is red.
results([X],1) :- red_die(X,1), toss_red(1).
results([X|Y],T) :- T > 1, Tprev is T - 1, red_die(X,T), toss_red(T), results(Y,Tprev).
% Writing results from green_die if next toss is green.
results([X|Y],T) :- T > 1, Tprev is T - 1, green_die(X,T), toss_green(T), results(Y,Tprev).
results(X) :- length(X, Length), results(X,Length).
% Query: hidden sequence H = <RC,GC,GC,RC,RC,GC,GC,GC,GC,GC>
query_state :-
    toss_red(1),
    toss_green(2),
    toss_green(3),
    toss_red(4),
    toss_red(5),
    toss_green(6),
    toss_green(7),
    toss_green(8),
    toss_green(9),
    toss_green(10).
% results/2 puts the roll at time T at the head of the list,
% so the observations O are listed latest roll first.
evidence(results([3,1,1,1,6,3,2,2,1,1])).
query(query_state).
Which according to this has a probability of 0.00011567338
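As a cross-check outside ProbLog, P(H|O) can also be computed directly as P(H,O)/P(O), with P(O) obtained from the forward algorithm. The following is only a minimal sketch under the assumptions stated in the question (the first roll always uses the red die, and each state is named after the colour of the die rolled at that step); all function and variable names are illustrative and do not come from cplint or ProbLog.

```python
from itertools import product

STATES = ("red", "green")
# Assumption from the problem statement: the first roll always uses the red die.
INITIAL = {"red": 1.0, "green": 0.0}
# Heads keeps the current die, tails switches to the other one.
TRANSITION = {
    "red": {"red": 0.9, "green": 0.1},
    "green": {"green": 0.95, "red": 0.05},
}


def emission(state, face):
    """Probability that the die of the given colour shows `face`."""
    if state == "red":
        return 1 / 6
    return 7 / 12 if face == 1 else 1 / 12


def joint_probability(hidden, observed):
    """P(H, O): follow one fixed state path and multiply along it."""
    p = INITIAL[hidden[0]] * emission(hidden[0], observed[0])
    for prev, cur, face in zip(hidden, hidden[1:], observed[1:]):
        p *= TRANSITION[prev][cur] * emission(cur, face)
    return p


def observation_probability(observed):
    """P(O) via the forward algorithm."""
    alpha = {s: INITIAL[s] * emission(s, observed[0]) for s in STATES}
    for face in observed[1:]:
        alpha = {
            s: emission(s, face) * sum(alpha[r] * TRANSITION[r][s] for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def brute_force_observation_probability(observed):
    """P(O) by summing P(H, O) over every state path (only for short sequences)."""
    return sum(
        joint_probability(path, observed)
        for path in product(STATES, repeat=len(observed))
    )


def posterior(hidden, observed):
    """P(H | O) = P(H, O) / P(O)."""
    return joint_probability(hidden, observed) / observation_probability(observed)


O = [1, 1, 2, 2, 3, 6, 1, 1, 1, 3]
H = ["red", "green", "green", "red", "red",
     "green", "green", "green", "green", "green"]
print(posterior(H, O))
```

Because the sequence is short, the brute-force sum over all 2^10 state paths is cheap and gives an independent check of the forward pass.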

hmmpos.pl from here
seems to be useful enough to continue with.

Related

Is there any "better way" to train a quadruped to walk using reinforcement Learning

I have been chasing this problem of using RL to train a quadruped to walk, but have had no noteworthy success. The following are the details of the Gym environment I am using.
Sim: pybullet
env.action_space = box(shape=(12,), upper = 1, lower = -1)
converting selected actions and multiplying them by max_actions specified for each joint.
action space are the 3 motor positions(hip_joint_y, hip_joint_x, knee_joint) x 4 legs of the robot
env.observation_space = box(shape=(12,), upper = np.inf, lower = -np.inf)
observation_space include
roll, pitch of the body [r, p]
angular vel [x, y, z]
linear acc [x, y, z]
Binary contact forces for each leg [1 if in contact else 0]. [1, 1, 1, 1]
reward = (
+ distance_reward
- body_rotation_reward
- energy_usage_reward
- body_drift_from x-axis reward
- body_shake_reward)
I have tried the following approaches.
Using PPO from stable-baselines3 for 20 million timesteps [No Distinct improvement]
Using DDPG, TD3, SAC, A2C, and PPO with 5 million timesteps on each algo increasing policy network up to 4 layers of 1024 neurons each [1024, 1024, 1024, 1024] for qf and vf, or actor and critic.
Using the Discrete Delta concept to scale action limits so changing action_space from box to MultiDiscrete with each action limiting from 0 to 6. discrete_delta_vals = [-0.3, -0.1, -0.03, 0, 0.03, 0.1, 0.3]. Each joint value is decided from choosing one value from the discrete_delta_vals list and adding that value to the previous actions.
Keeping hip_joint_y of all legs as zeros and changing action space from box(shape=(12,)) to box(shape=(8,)). Trained this agent for another 6M timesteps, there seems to be a small improvement at first and then the eps_length and mean_reward settles and no significant improvements afterwards.
I have generated Half Ellipsoid Trajectories with IK and That works but that is explicitly Robotics Approach to solve this problem. I am currently looking into DeepMimic to use those trajectories to guide RL to build a stable walking gait. No Significant breakthrough.
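For reference, the discrete-delta scheme from the third approach could be sketched roughly as below. This is only an illustration, not the code from the repo: the function name is hypothetical, and clipping the result to [-1, 1] is an assumption based on the original box limits.

```python
# Delta values as listed in the description of the approach.
DISCRETE_DELTA_VALS = [-0.3, -0.1, -0.03, 0.0, 0.03, 0.1, 0.3]


def apply_discrete_deltas(previous_action, choices, low=-1.0, high=1.0):
    """Add the chosen delta to each joint's previous action.

    `choices` holds one index (0..6) per joint, as a MultiDiscrete
    action would. Clipping to [low, high] is an assumption.
    """
    if len(previous_action) != len(choices):
        raise ValueError("need exactly one choice per joint")
    updated = []
    for prev, index in zip(previous_action, choices):
        value = prev + DISCRETE_DELTA_VALS[index]
        updated.append(min(high, max(low, value)))
    return updated


# Index 3 selects a delta of 0, so the joints keep their previous targets.
print(apply_discrete_deltas([0.0] * 12, [3] * 12))
```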
Here is the Repo Link
Check the scripts folder and go through the start_training_v(x).py scripts. Thanks in Advance. If you feel like discussing the entire topic to sort this please drop your email in the comment and I'll reach out to you.
Hi, try using Nvidia IsaacGym. This uses PyTorch end to end on the GPU with PPO. I was able to train a custom URDF to walk in about 10 minutes of training.

Octave fplot abs looks very strange

f = @(x)(abs(x))
fplot(f, [-1, 1])
Freshly installed octave, with no configuration edited. It results in the following image, where it looks as if it is constant for a while around 0, looking more like a \_/ than a \/:
Why does it look so different from a usual plot of the absolute value near 0? How can this be fixed?
Since fplot is written in Octave it is relatively easy to read. Its location can be found using the which command. On my system this gives:
octave:1> which fplot
'fplot' is a function from the file /usr/share/octave/5.2.0/m/plot/draw/fplot.m
Examining fplot.m reveals that the function to be plotted, f(x), is evaluated at n equally spaced points between the given limits. The algorithm for determining n starts at line 192 and can be summarised as follows:
n is initially chosen to be 8 (unless specified differently by the user)
Construct a vector of arguments using a coarser grid of n/2 + 1 points:
x0 = linspace (limits(1), limits(2), n/2 + 1)'
(The linspace function will accept a non-integer value for the number of points, which it rounds down)
Calculate the corresponding values:
y0 = f(x0)
Construct a vector of arguments using a grid of n points:
x = linspace (limits(1), limits(2), n)'
Calculate the corresponding values:
y = f(x)
Construct a vector of values corresponding to the members of x but calculated from x0 and y0 by linear interpolation using the function interp1():
yi = interp1 (x0, y0, x, "linear")
Calculate an error metric using the following formula:
err = 0.5 * max (abs ((yi - y) ./ (yi + y + eps))(:))
That is, err is proportional to the maximum difference between the calculated and linearly interpolated values.
If err is greater than tol (2e-3 unless specified by the user) then put n = 2*(n-1) and repeat. Otherwise plot(x,y).
Because abs(x) is essentially a pair of straight lines, if x0 contains zero then the linearly interpolated values will always exactly match their corresponding calculated values and err will be exactly zero, so the above algorithm will terminate at the end of the first iteration. If x doesn't contain zero then plot(x,y) will be called on a set of points that doesn't include the 'cusp' of the function and the strange behaviour will occur.
This will happen if the limits are equally spaced either side of zero and floor(n/2 + 1) is odd, which is the case for the default values (limits = [-5, 5], n = 8).
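The effect can be reproduced outside Octave. The sketch below is a plain-Python approximation of the steps summarised above, not a port of fplot.m: the linspace and interpolation helpers are hand-written stand-ins, and the limits [-1, 1] with n = 8 are taken from the question.

```python
def linspace(start, stop, count):
    """Evenly spaced points from start to stop, both included."""
    step = (stop - start) / (count - 1)
    points = [start + i * step for i in range(count)]
    points[-1] = stop  # avoid floating-point overshoot at the end
    return points


def interp_linear(xs, ys, x):
    """Piecewise-linear interpolation of (xs, ys) at x."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside the interpolation range")


EPS = 2.0 ** -52  # double-precision machine epsilon
limits = (-1.0, 1.0)
n = 8

x0 = linspace(limits[0], limits[1], n // 2 + 1)  # coarse grid, 5 points
y0 = [abs(v) for v in x0]
x = linspace(limits[0], limits[1], n)            # plotted grid, 8 points
y = [abs(v) for v in x]
yi = [interp_linear(x0, y0, v) for v in x]

err = 0.5 * max(abs((a - b) / (a + b + EPS)) for a, b in zip(yi, y))

print("coarse grid contains 0:", 0.0 in x0)
print("plotted grid contains 0:", 0.0 in x)
print("err:", err)
print("lowest plotted value:", min(y))
```

The coarse grid hits the cusp, so the interpolation error is negligible and the refinement stops immediately, while the plotted grid straddles zero and its lowest points sit at about 1/7 on either side, which gives the flat bottom.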
The behaviour can be avoided by choosing a combination of n and limits so that either of the following is the case:
a) the set of m = floor(n/2 + 1) equally spaced points doesn't include zero or
b) the set of n equally spaced points does include zero.
For example, limits equally spaced either side of zero and odd n will plot correctly. This will not work for n=5, though, because, strangely, if the user inputs n=5, fplot.m substitutes 8 for it (I'm not sure why it does this, I think it may be a mistake). So fplot(@abs, [-1, 1], 3) and fplot(@abs, [-1, 1], 7) will plot correctly but fplot(@abs, [-1, 1], 5) won't.
(n/2 + 1) is odd, and therefore x0 contains zero for symmetrical limits, only for every 2nd even n. This is why it plots correctly with n=6 because for that value n/2 + 1 = 4, so x0 doesn't contain zero. This is also the case for n=10, 14, 18 and so on.
Choosing slightly asymmetrical limits will also do the trick, try: fplot(@abs, [-1.1, 1.2])
The documentation says: "fplot works best with continuous functions. Functions with discontinuities are unlikely to plot well. This restriction may be removed in the future." so it is probably a bug/feature of the function itself that can't be fixed except by the developers. The ordinary plot() function works fine:
x = [-1 0 1];
y = abs(x);
plot(x, y);
The weird shape comes from the sampling rate, i.e. at how many points the function is evaluated. This is controlled by the parameter N of fplot. The default call seems to accidentally skip x=0, and with fplot(@abs, [-1, 1], N=5) I get the same funny shape as you:
However, trying out different values of N can yield the correct shape, try e.g. fplot(@abs, [-1, 1], N=6):
Although in general I would suggest to use way higher numbers, like N=100.

How to calculate two orthogonal points of a line?

I need two new points, which are on a new orthogonal line through point 1 and in distance of meters s and minus s for the other direction. The new orthogonal line is orthogonal to a line given by two points shown in "coords".
I have tried to reuse results from here and here, but both examples are somehow different. These examples state that I should work with the vector from the line, and that the slope of the orthogonal to a line of slope m is -1/m, so a new point lies on y = (-1/m)x + b
import math as m
coords=([5,5], [5,6])
print (coords)
x1,y1=coords[0]
x2,y2=coords[1]
s= 5
veclen= m.sqrt(m.pow(x2-x1,2)+m.pow(y2-y1,2))
u=(x2-x1)/veclen
v=(y2-y1)/veclen
print ("u,v:", u,v)
dir1 = (v, -u)
dir2 = (-v, u)
newpoint1=(x1+ s*dir1[0], y1+ s*dir1[1])
newpoint2=(x1+ s*dir2[0], y1+ s*dir2[1])
print (newpoint1, newpoint2)
xn,yn=newpoint1
dist = m.hypot(xn-x1, yn-y1)
print (dist)
This is maybe the right direction, but somehow I do not understand the derived vector 1 (v) and the orthogonal vector (v2) and how to add from point x1,y1 the distance s. Should the vector 1 not be (1,1), as in +1 in x-direction and +1 in y-direction? And likewise the orthogonal vector 2 (1, -1) as in +1 in x and -1 in y?
And is the calculation of both newpoints correct?
I will assume that this is the problem, in one sentence:
Code a routine that is given a tuple coords containing two 2-dimensional points and also given a positive number s, and the routine returns two other distinct points such that the line segment between each output point and coords[0] is orthogonal (perpendicular) to the line segment between coords[0] and coords[1] and the distance from each output point to coords[0] is s.
Now for your questions.
The 2-tuple v represents the vector of length one (the "unit vector") that is parallel to the vector from point coords[0] to point coords[1]. It is found by first subtracting the coordinates of the two points in coords, but that vector will probably have the wrong length. So your code beforehand found the length of that vector in variable l (a terrible name for a variable) and divides the vector by l. Mathematics tells us that the resulting vector is parallel to the original vector and has length one.
Your code then tries to find a perpendicular unit vector. It fails in two ways. First, it does not use the unit vector; it uses the original vector instead. Second, the new vector is not necessarily perpendicular. Your code says a vector perpendicular to (u, v) is (-u, v), but actually the perpendicular vector is either (v, -u) or (-v, u)--note the swapped coordinates. This new vector is both perpendicular to the previous vector and has the same length.
Therefore the calculation of the two new points is not correct.
I have answered your given questions--let me know if you need code that actually does what you want. Note that you should improve your code by using longer, descriptive variable names and comments and by wrapping up the code into a function. The function should return the points, while the calling routine could print the results.
Here is my code that satisfies your problem. I reduced the amount of printing as checks--you can print more checks, if you like. I also combined some lines, since too many separate computation lines can worsen the accuracy of floating-point calculations. I never compute a unit vector--I go straight to a vector of the desired length.
import math
def orthogonal_points(coords, s):
    """Given a tuple coords containing two 2-dimensional points and also
    given a positive number s, return two other distinct points such
    that the line segment between each output point and coords[0] is
    orthogonal (perpendicular) to the line segment between coords[0] and
    coords[1] and the distance from each output point to coords[0] is s.
    """
    (point1x, point1y), (point2x, point2y) = coords
    points_vectorx, points_vectory = point2x - point1x, point2y - point1y
    points_vector_length = math.hypot(points_vectorx, points_vectory)
    normalized_x, normalized_y = (points_vectorx * s / points_vector_length,
                                  points_vectory * s / points_vector_length)
    newpoint1x, newpoint1y = point1x + normalized_y, point1y - normalized_x
    newpoint2x, newpoint2y = point1x - normalized_y, point1y + normalized_x
    return ([newpoint1x, newpoint1y], [newpoint2x, newpoint2y])
coords=([5,5], [5,6])
s= 5
print (coords, s)
print (orthogonal_points(coords, s))
The output from that is correct:
([5, 5], [5, 6]) 5
([10.0, 5.0], [0.0, 5.0])
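One way to convince yourself that such a result is right is to check the two defining properties numerically: the dot product with the original direction should be zero, and the distance to the first point should be s. The sketch below uses a compact, hypothetical helper named perpendicular_offsets that follows the same swap-and-negate idea; the names and the test line are illustrative only.

```python
import math


def perpendicular_offsets(coords, s):
    """Return the two points at distance s from coords[0] on the line
    through coords[0] that is perpendicular to coords[0] -> coords[1]."""
    (x1, y1), (x2, y2) = coords
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("the two points must be distinct")
    # (dy, -dx) is perpendicular to (dx, dy); scale it to length s.
    px, py = dy * s / length, -dx * s / length
    return (x1 + px, y1 + py), (x1 - px, y1 - py)


def check(coords, s):
    """Return (dot product, distance) for each of the two new points."""
    (x1, y1), (x2, y2) = coords
    results = []
    for nx, ny in perpendicular_offsets(coords, s):
        dot = (nx - x1) * (x2 - x1) + (ny - y1) * (y2 - y1)
        results.append((dot, math.hypot(nx - x1, ny - y1)))
    return results


# A 3-4-5 direction keeps the arithmetic exact.
print(perpendicular_offsets(([1, 2], [4, 6]), 5))
print(check(([1, 2], [4, 6]), 5))
```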

How to generate a list of available steps on a grid?

I have a 5x5 grid which is described by map_size(5, 5). I need to generate a list of all cells from that description using DCG.
Here's the code I have so far:
:- use_module(library(clpfd)).
map_size(5, 5).
natnum(0).
natnum(X) :-
X #= X0 + 1,
natnum(X0).
list_all_cells(Visited) -->
{ length(Visited, 25) },
[].
list_all_cells(Visited) -->
[X-Y],
{ map_size(X_max, Y_max),
natnum(X), natnum(Y),
X #< X_max, Y #< Y_max,
maplist(dif(X-Y), Visited) },
list_all_cells([X-Y|Visited]).
However, it doesn't generate a list and outputs only 4 pairs.
A possible query to the DCG looks like list_all_cells([]) which is supposed to list all cells on the grid. For example, it's gonna be [0-0, 1-0, 1-1, 0-1] for a 2x2 grid (order doesn't matter).
In fact, I need this predicate to build another one called available_steps/2 that would generate a list of all possible moves for a given position. Having available_steps(CurrentPos, Visited), I will be able to brute-force Hunt the Wumpus game and find all possible routes to gold.
list_all_cells(Cells) :-
bagof(C,cell(C),Cells).
cell(X-Y) :-
between(0,4,X),
between(0,4,Y).
Example run:
?- list_all_cells(Cells); true.
Cells= [0-0, 0-1, 0-2, 0-3, 0-4, 1-0, 1-1, 1-2, ... - ...|...] [write] % The letter w was pressed.
Cells= [0-0, 0-1, 0-2, 0-3, 0-4, 1-0, 1-1, 1-2, 1-3, 1-4, 2-0, 2-1, 2-2, 2-3, 2-4, 3-0, 3-1, 3-2, 3-3, 3-4, 4-0, 4-1, 4-2, 4-3, 4-4] ;
true.
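For comparison, here is the same enumeration together with a possible shape for the available_steps idea mentioned in the question, written as a rough Python sketch. It assumes that moves are only allowed to the four orthogonal neighbours, which the question does not state, and the function names simply mirror the predicates discussed above.

```python
MAP_SIZE = (5, 5)  # same grid as map_size(5, 5)


def all_cells(map_size=MAP_SIZE):
    """List every (x, y) cell of the grid, with 0-based coordinates."""
    width, height = map_size
    return [(x, y) for x in range(width) for y in range(height)]


def available_steps(current, visited, map_size=MAP_SIZE):
    """Neighbouring cells that are on the grid and not yet visited.

    Assumption: only up/down/left/right moves are allowed.
    """
    width, height = map_size
    x, y = current
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [
        (cx, cy)
        for cx, cy in candidates
        if 0 <= cx < width and 0 <= cy < height and (cx, cy) not in visited
    ]


print(len(all_cells()))
print(available_steps((0, 0), visited=[(1, 0)]))
```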

Positioning random points on a 2D plane

So here's a little bit of geometry for you. I've been stuck on this for a while now:
I need to write a script (in C#, but feel free to answer in whatever language you'd like) that generates random points. A point has two values, x and y.
I must generate N points total (where N > 1 and is also random, up to 100).
point 1 must be x = 0, y = 0. point 2 must be of distance 1 from point 1. So that sqrt(x^2 + y^2) = 1.
point 3 must be of distance 1 from point 2 and so on and so forth.
Now here's the tricky part - point N must be of distance 1 from point 1. So if you were to connect all points into a single shape, you'd get a closed shape with each edge being the same length.
(vertices may cross and you may even have two points at exactly the same location. As long as it's random).
Any idea how you'd do that?
I would do it by simulating a chain. There are 2 basic ways. One is to start from a regular polygon, randomize one point a bit (rotate it a bit), and then iterate over the rest to maintain the segment size=1.
The second one is start with full random open chain (like in MBo answer) and then iteratively change the angles until the last point is on desired distance from first point. I think the second approach is a bit simpler to code...
If you want something more complicated then you can generate M random points and handle them as closed Bezier curve cubic patches loop control points. Then just find N equidistant points on it (this is hard task) and rescale the whole thing to match segment line size = 1
If you want to try first approach then
Regular polygon start (closed loop)
Start with regular polygon (equidistant points on circle). So divide circle to N angular segments. Select radius r so line length match l=1
so r=0.5/sin(pi/N) ... from half angle triangle
Make function to rotate i-th point by some single small step
So just rotate the i-th point around (i-1)th point with radius 1 and then iteratively change the {i+1,...N} points to match segments sizes
you can exploit symmetry to avoid bullet #2
but this will lead not to very random result for small N. Just inverse rotation of 2 touching segments for random point p(i) and loop this many times.
to make it more random you can apply symmetry on whole parts (between 2 random points) instead of on 2 lines only
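The regular-polygon start of the first approach can be sketched as follows. This only builds the initial closed loop and checks its edge lengths, it does not implement the randomization steps; the function names are illustrative, and shifting the polygon so that the first point lands on the origin is an addition to match the question's requirement.

```python
import math


def regular_polygon(n, side=1.0):
    """Closed loop of n points with all edges of length `side`,
    shifted so that the first point is at the origin."""
    if n < 3:
        raise ValueError("a polygon needs at least 3 points")
    # Half of one edge and the radius form a right triangle with angle pi/n.
    radius = side / (2.0 * math.sin(math.pi / n))
    points = [
        (radius * math.cos(2.0 * math.pi * i / n),
         radius * math.sin(2.0 * math.pi * i / n))
        for i in range(n)
    ]
    x0, y0 = points[0]
    return [(x - x0, y - y0) for x, y in points]


def edge_lengths(points):
    """Lengths of all edges, including the closing edge back to the start."""
    return [math.dist(p, q) for p, q in zip(points, points[1:] + points[:1])]


polygon = regular_polygon(7)
print(polygon[0])
print(edge_lengths(polygon))
```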
The second approach is like this:
create randomized open chain (like in MBo's answer)
so all segments are already with size=1.0. Remember also the angle not just position
i-th point iteration
for simplicity let the points be called p1,p2,...pn
compute d0=||pn-p1|-1.0|
rotate point pi left by some small da angle step
compute dl=||pn-p1|-1.0|
rotate point pi right by 2.0*da
compute dr=||pn-p1|-1.0|
rotate point pi to original position ... left by da
now chose direction closer to the solution (min dl,dr,d0) so:
if d0 is minimal do not change this point at all and stop
if dl is minimal then rotate left by da while dl is lowering
if dr is minimal then rotate right by da while dr is lowering
solution
loop bullet #2 while the d=||pn-p1|-1.0| is lowering, then change da to da*=0.1 and loop again. Stop if the da step is too small or there is no change in d after a loop iteration.
[notes]
Both solutions are not precise: your distances will be very close to 1.0 but can be off by some error that depends on the last da step size. If you rotate point pi, then just add/subtract the angle to all of the points pi,pi+1,pi+2,..pn
Edit: This is not an answer, the requirement that the chain be closed has not been taken into account.
It is known that Cos(Fi)^2 + Sin(Fi)^2 = 1 for any angle Fi
So you may use the next approach:
P[0].X = 0
P[0].Y = 0
for i = 1 .. N - 1:
RandomAngle = 2 * Pi * Random(0..1)
P[i].X = P[i-1].X + Cos(RandomAngle)
P[i].Y = P[i-1].Y + Sin(RandomAngle)
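The same idea as a runnable sketch, for anyone who wants to try it. As the edit above says, it only produces an open chain: every consecutive pair of points is at distance 1, but nothing forces the last point to be at distance 1 from the first. The function name and the optional rng parameter are illustrative.

```python
import math
import random


def random_open_chain(n, rng=random):
    """n points starting at the origin, each at distance 1 from the previous one."""
    points = [(0.0, 0.0)]
    for _ in range(n - 1):
        angle = 2.0 * math.pi * rng.random()
        x, y = points[-1]
        points.append((x + math.cos(angle), y + math.sin(angle)))
    return points


chain = random_open_chain(10, random.Random(0))
steps = [math.dist(p, q) for p, q in zip(chain, chain[1:])]
print(steps)
print("closing gap:", math.dist(chain[-1], chain[0]))
```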
