How to combine different equations automatically - python-3.x

I have 9 different equations that contain only 7 unknowns. I would like to write code that builds all the possible systems of 7 equations and returns the cases in which all the variables have a positive result. The 9 equations are:
eq1 = 14*x + 7*z - 21.71
eq2 = 15*x + 11*z + w - 38.55
eq3 = 12*x + 8*y + 12*z + w - 52.92
eq4 = 12*x + 8*y + 14*z + t - 61.7
eq5 = 13*x + 8*y + 15*z + t - 69.37
eq6 = 4*x + 17*y + 14*r + s - 98.32
eq7 = 4*x + 18*y + 12*w + s - 130.91
eq8 = 4*x + 18*y + 15*w + 2*t - 165.45
eq9 = 4*x + 18*y + 12*w + 2*s - 168.16

Adapted from this answer
What you want to do is iterate through all the combinations that contain exactly 7 of the 9 equations. You could do this with nested loops, but it would get very ugly, as you would need 7 levels of nesting.
Python's standard-library itertools module has a built-in function for exactly this kind of looping, so if you put the equations in a list, you can iterate through all unique systems as follows:
import itertools
eq = [eq1, eq2, eq3, eq4, eq5, eq6, eq7, eq8, eq9]
for system in itertools.combinations(eq, 7):
    # do stuff to solve the equations however you want
system will be a tuple containing the 7 equations of the current combination. From there, use whatever solving technique you want to solve them and determine whether all outputs are positive (e.g. NumPy and matrix equations).
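As a concrete illustration, here is a minimal sketch of that loop with NumPy (my addition, assuming the unknowns are ordered [x, y, z, w, t, r, s]; the coefficient rows are my transcription of the nine equations above, so double-check them):
import itertools
import numpy as np

# coefficient rows for the unknowns [x, y, z, w, t, r, s] and the constants,
# read off from the nine equations above
A = np.array([
    [14,  0,  7,  0, 0,  0, 0],  # eq1
    [15,  0, 11,  1, 0,  0, 0],  # eq2
    [12,  8, 12,  1, 0,  0, 0],  # eq3
    [12,  8, 14,  0, 1,  0, 0],  # eq4
    [13,  8, 15,  0, 1,  0, 0],  # eq5
    [ 4, 17,  0,  0, 0, 14, 1],  # eq6
    [ 4, 18,  0, 12, 0,  0, 1],  # eq7
    [ 4, 18,  0, 15, 2,  0, 0],  # eq8
    [ 4, 18,  0, 12, 0,  0, 2],  # eq9
], dtype=float)
b = np.array([21.71, 38.55, 52.92, 61.7, 69.37, 98.32, 130.91, 165.45, 168.16])

for rows in itertools.combinations(range(9), 7):
    try:
        sol = np.linalg.solve(A[list(rows)], b[list(rows)])
    except np.linalg.LinAlgError:
        continue  # this 7-equation subset is singular; skip it
    if np.all(sol > 0):
        print("equations", [r + 1 for r in rows], "->", sol)
np.linalg.solve raises LinAlgError when a 7-equation subset is singular; the try/except simply skips those, and any subset whose solution is entirely positive is printed.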

Related

Numerical differences in NumPy conjugate and angle

Suppose we have two complex numbers in Python with NumPy (1.20.1):
a = 5 + 1j*3
a0 = 4 + 1j*2
And I want to calculate the phase shift, i.e. the difference between the two angles. I am getting two slightly different results:
>>> np.angle(a*np.conjugate(a0))
0.07677189126977804
>>> np.angle(a) - np.angle(a0)
0.07677189126977807
I guess the most correct way should be the first.
In some cases the difference is bigger, in others there is none.
Does anyone know the origin of this difference?
Cheers.
EDIT
I've found a more relevant example:
>>> a = 41.887609743111966+3.868827773225067j
>>> a0 = -65.06495257694792-0.19335140606773393j
>>> np.angle(a) - np.angle(a0)
3.2307217955357035
>>> np.angle(a*np.conjugate(a0))
-3.0524635116438827
The first example is just due to the numerical imprecision inherent in floating point calculations; performing the operations in a different order leads to different round-off, so the two results are represented by (very slightly) different floating point values. The difference is negligible for most applications.
However, as your second example shows, the two expressions are not equivalent. np.angle returns a value in the range -pi to pi, which matters once the difference in angle is larger than that. When you take a difference between two angles, you can get a value outside this range, which is what happens with np.angle(a) - np.angle(a0) in your edit. The result of np.angle(a*np.conjugate(a0)), coming directly from np.angle, has to lie in the range -pi to pi. The difference between your two results is exactly 2*pi.
So if you want the absolute (unwrapped) angle between the two points, use the raw difference np.angle(a) - np.angle(a0). If you just want the relative phase wrapped into -pi to pi, use np.angle(a*np.conjugate(a0)).
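To make the relationship concrete, here is a small sketch (my addition) showing that wrapping the raw difference back into the -pi to pi range recovers the np.angle(a*np.conjugate(a0)) result, up to round-off:
import numpy as np

a = 41.887609743111966 + 3.868827773225067j
a0 = -65.06495257694792 - 0.19335140606773393j

raw = np.angle(a) - np.angle(a0)          # may fall outside (-pi, pi]
wrapped = np.angle(a * np.conjugate(a0))  # always in (-pi, pi]

# wrap the raw difference back into the (-pi, pi] range
rewrapped = (raw + np.pi) % (2 * np.pi) - np.pi

print(raw)        # 3.2307...
print(wrapped)    # -3.0524...
print(rewrapped)  # -3.0524..., i.e. raw - 2*pi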

D separation python implementation

I am completely new to the field of Bayesian networks. For my project, I need to check all the possible d-separation conditions that exist in a 7-node DAG, and for that I am looking for some good Python code.
My knowledge of programming is limited (a bit of numerical analysis and data structures, but I understand d-separation, e-separation and other DAG concepts quite well).
It would be really helpful if someone could point out where to look for such specific code. Please note that I want Python code that checks for all the conditional independences following from d-separation in a 7-node DAG.
I would be happier with an algorithm that checks whether each path is blocked or not, rather than one built on semi-graphoid axioms.
I don't know exactly where I should look or whom I should ask, so any help would be greatly appreciated.
I guess you understand that what you are asking for is a very large list, even if we only consider d-separation between two single variables (conditioned on a set of nodes).
Anyway, you can do that quite easily with pyAgrum (https://agrum.org)
import itertools
import pyAgrum as gum
# create a BN
bn = gum.fastBN("A->B<-C->D->E->F;B->E<-G")

# how to iterate over every subset of an iterable
def powerset(iterable):
    """
    powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)
    """
    xs = list(iterable)
    # note: we return an iterator rather than a list
    return itertools.chain.from_iterable(
        itertools.combinations(xs, n) for n in range(len(xs) + 1)
    )

# print the independence model by testing every d-separation
for i in bn.names():
    for j in bn.names() - {i}:
        for k in powerset(bn.names() - {i, j}):
            if bn.isIndependent(i, j, k):
                print(f"{i} indep {j} given {k}")
And the result (shown in a notebook in the original answer) is the full list of independence statements found.

Determining the Distance between two matrices using numpy

I am developing my own architecture search algorithm using Python's NumPy. Currently I am trying to develop a cost function that measures the distance between two matrices, X and Y.
I'd like to reduce the difference between the two to a meaningful scalar value, ideally between 0 and 1, so that a 0 is returned when the matrices match element for element, numerically and positionally.
In the example below, I have the output of my algorithm, X. Both X and Y have the same shape. I tried to sum the difference between the two matrices, but I'm not sure summation will work in all conditions, since positive and negative differences can cancel out. I also tried returning the mean. I don't think either approach will work. Aside from looping through both matrices and comparing elements directly, is there a way to capture the degree of difference in a scalar?
Y = np.arange(25).reshape(5, 5)
for i in range(1000):
    X = algorithm(Y)
    # try to reduce the difference between the two matrices to a scalar value
    cost = np.sum(X - Y)
There are many ways to calculate a scalar "difference" between two matrices. Here are just two examples.
The root mean square error:
((m1 - m2) ** 2).mean() ** 0.5
The max absolute error:
np.abs(m1 - m2).max()
The choice of the metric depends on your problem.
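For illustration, here is a small self-contained sketch of both metrics (the normalization at the end is my own suggestion for getting a value near the [0, 1] range the question asks about, not part of the answer above):
import numpy as np

m1 = np.arange(25).reshape(5, 5).astype(float)
m2 = m1 + np.random.normal(scale=0.5, size=m1.shape)  # a perturbed copy, for illustration

rmse = np.sqrt(((m1 - m2) ** 2).mean())  # root mean square error
max_abs = np.abs(m1 - m2).max()          # max absolute error

# one way (an assumption, not from the answer) to push the error toward
# [0, 1] is to divide by the value range of the reference matrix
normalized = rmse / (m1.max() - m1.min())
print(rmse, max_abs, normalized)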

(in excel) randomly generating a power law distribution

I am trying to simulate a number of different distribution types for a project using Excel. Right now, I have generated a normal distribution with a mean of 35 and a standard deviation of 3.33. So far so good.
I would like to also generate some other distribution types.
One I have tried is a lognormal. To get that, I am using the following code:
=LOGNORM.INV(RAND(),LN(45^2/SQRT(45^2+3.33^2)),SQRT(LN((45^2+3.33^2)/4.5^2)))
It produces some output, but I would welcome anyone's input on the syntax.
What I really want to try to do is a power law distribution. From what I can tell, Excel does not have a built-in function to randomly generate this data. Does anyone know of a way to do it, besides switching software packages?
Thanks for any help you can provide.
For the (type I) Pareto distribution, if the parameters are a min value xm and an exponent alpha then the cdf is given by
p = 1 - (xm/x)^alpha
This gives the probability, p, that the random variable takes on a value which is <= x. This is easy to invert, so you can use inverse sampling to generate random variables which follow that distribution:
x = xm/(1-p)^(1/alpha) = xm*(1-p)^(-1/alpha)
If p is uniform over [0,1] then so is 1-p, so in the above you can just use RAND() to simulate 1-p. Thus, in Excel, if you wanted to e.g. simulate a type-I Pareto distribution with xm = 2 and alpha = 3, you would use the formula:
= 2 * RAND()^(-1/3)
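If you ever want to sanity-check the same inverse-transform idea outside Excel, a minimal Python/NumPy sketch (my addition, not part of the Excel answer) looks like this:
import numpy as np

xm, alpha = 2.0, 3.0                 # scale (minimum) and shape parameters
u = 1.0 - np.random.random(100_000)  # uniform on (0, 1], avoids division by zero
x = xm * u ** (-1.0 / alpha)         # inverse-transform sampling, as in the Excel formula

# the type-I Pareto mean is xm*alpha/(alpha - 1) = 3.0 for these parameters,
# so the sample mean should land close to 3.0
print(x.mean())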
If you are going to be doing this sort of thing a lot with different distributions, you might want to consider using R, which can be called directly from Excel via the RExcel add-in. R has a very large number of built-in distributions it can sample from directly (and it also uses a better underlying random number generator than Excel does).

How to find a regression line for a closed set of data with 4 parameters in matlab or excel?

I have a set of data I have acquired from simulations. There are 3 parameters that go into my simulations and I get one result out.
I can graph the data from the small subset I have and see the trends for each input, but I need to be able to extrapolate this and get some form of regression equation, seeing as the simulation takes a long time.
In MATLAB or Excel, is it possible to list the inputs and outputs to obtain a 4-parameter regression equation for a given set of information?
Before this gets flagged as a duplicate: I understand polyfit will give me an equation of best fit and will be as accurate as I want it, but I need the equation to correspond to the inputs, not just a regression line.
In other words, if I have 20 simulations of inputs a, b, c and output y, is there a way to obtain a "best fit":
y=B0+B1*a+B2*b+B3*c
using the data?
My usual recommendation for higher-dimensional curve fitting is to pose the problem as a minimization problem (that may be unneeded here with the nice linear model you've proposed, but I'm a hammer-nail guy sometimes).
It starts by creating a correlation function (the functional form you think maps your inputs to the output) given a vector of fit parameters p and input data xData:
correl = @(p,xData) p(1) + p(2)*xData(:,1) + p(3)*xData(:,2) + p(4)*xData(:,3);
Then you need to define a function to minimize given the parameter vector, which I call the objective; this is typically your correlation minus your output data.
The details of this function depend on the solver you'll use (see below).
All of the methods need a starting vector pGuess, which depends on the trends you see.
For a nonlinear correlation function, finding a good pGuess can be a trial, but it is necessary for a good solution.
fminsearch
To use fminsearch, the data must be collapsed to a scalar value using some norm (2 here):
x = [a,b,c]; % your input data as columns of x
objective = @(p) norm(correl(p,x) - y,2);
p = fminsearch(objective,pGuess); % you need to define a good pGuess
lsqnonlin
To use lsqnonlin (which solves the same problem as above in different ways), the norm-ing of the objective is not needed:
objective = @(p) correl(p,x) - y;
p = lsqnonlin(objective,pGuess); % you need to define a good pGuess
(You can also specify lower and upper bounds on the parameter solution, which is nice.)
lsqcurvefit
To use lsqcurvefit (which is simply a wrapper for lsqnonlin), only the correlation function is needed along with the data:
p = lsqcurvefit(correl,pGuess,x,y); % you need to define a good pGuess
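As an aside (my addition, not part of the answer above): because the proposed model is linear in its parameters, ordinary least squares solves it directly with no pGuess at all. A minimal Python/NumPy sketch with placeholder data standing in for the 20 simulations:
import numpy as np

# placeholder inputs a, b, c and a hypothetical "true" model for the demo
rng = np.random.default_rng(0)
a, b, c = rng.random(20), rng.random(20), rng.random(20)
y = 1.0 + 2.0*a + 3.0*b + 4.0*c

# design matrix [1, a, b, c]; lstsq returns the coefficients [B0, B1, B2, B3]
A = np.column_stack([np.ones_like(a), a, b, c])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)  # approximately [1, 2, 3, 4]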
