Matlab code for general multinomial formula - statistics

I would like to know if there is Matlab code that can solve the multinomial formula. I can write a code for finite number of terms in the multinomial formula, for example, (x_{1}+x_{2}+x_{3})^4.
But for general case, I found it is not easy, i.e. (x_{1}+x_{2}+ .... +x_{m})^n

I interpret your question as: Using the symbolic toolbox, how do I expand the multinomial
(x_{1} + x_{2} + .... + x_{m})^n
to be able to see all its coefficients?
And my answer for you is:
m = 2; n = 3;
X = sym('x', [1,m]);
expand(sum(X)^n)
Which will yield:
>> x1^3 + 3*x1^2*x2 + 3*x1*x2^2 + x2^3

Related

Is there any method/solver in python to solve embedded derivatives in a ODE equation?

I've got this equation from mathematical model to know the thermal behavior of a battery.
dTsdt = Ts * a+ Ta * b + dTadt * c + d
However, i can't get to solve it due to the nested derivatives.
I need to solve the equation for Ts and Ta.
I tried to define it as follows, but python does not like it and several eŕrors show up.
Im using scipy.integrate and the solver ODEint
Since the model takes data from vectors, it has to be solved for every time step and record the output accordingly.
I also tried assinging the derivatives to a variable v1,v2, and then put everything in an equation without derivatives like the second approach shown as follows.
def Tmodel(z,t,a,b,c,d):
    Ts,Ta= z
    dTsdt = Ts*a+ Ta*b + dTadt*c+ d
    dzdt=[dTsdt]
    return dzdt
z0=[0,0]
# solve ODE
for i in range(0,n-1):
   
    tspan = [t[i],t[i+1]]
    # solve for next step
    z = odeint(Tmodel,z0,tspan,arg=(a[i],b[i],c[i],d[i],))
    # store solution for plotting
    Ts[i] = z[1][0]
    Ta[i] = z[1][1]
    # next initial condition
    z0 = z[1]
def Tmodel(z,t,a,b,c,d):
    Ts,v1,Ta,v2= z
# v1= dTsdt
# v2= dTadt
    v1 = Ts*a+ Ta*b + v2*c+ d
    dzdt=[v1,v2]
    return dzdt
That did not work either.I believe there might be a solver capable of solving that equation or the equation must be decouple in a way and solve accordingly.
Any advice on how to solve such eqtn with python would be appreciate it.
Best regards,
MM
Your difficulty seems to be that you are given Ta in a form with no easy derivative, so you do not know where to take it from. One solution is to avoid this derivative completely and solve the system for y=Ts-c*Ta. Substitute Ts=y+c*Ta in the right side to get
dy/dt = y*a + Ta*(b+c*a) + d
Of course, this requires then a post-processing step Ts=y+c*Ta to get to the requested variable.
If Ta is given as function table, use an interpolation function to get values at any odd time t that is demanded by the ODE solver.
Ta_func = interp1d(Ta_times,Ta_values)
def Tmodel(y,t,a,b,c,d):
Ta= Ta_func(t)
dydt = y*a+ Ta*(b+c*a) + d
return dydt
y[0] = Ts0-c*Ta_func(t[0])
for i in range(len(t)-1):
y[i+1] = odeint(Tmodel,y[i],t[i:i+2],arg=(a[i],b[i],c[i],d[i],))[-1,0]
Ts = y + c*Ta_func(t)

Solve for x using python

I came across a problem. One string is taken as input say
input_string = "12345 + x = x * 5 + (1+x)* x + (1+18/100)"
And get output of x using python. I am not able to figure out logic for this.
Here is a complete SymPy example for your input:
from sympy import Symbol, solve, Eq
from sympy.parsing.sympy_parser import parse_expr
input_string = "12345 + x = x * 5 + (1+x)* x + (1+18/100)"
x = Symbol('x', real=True)
lhs = parse_expr(input_string.split('=')[0], local_dict={'x':x})
rhs = parse_expr(input_string.split('=')[1], local_dict={'x':x})
print(lhs, "=", rhs)
sol = solve(Eq(lhs, rhs), x)
print(sol)
print([s.evalf() for s in sol])
This outputs:
x + 12345 = x*(x + 1) + 5*x + 59/50
[-5/2 + 9*sqrt(15247)/10, -9*sqrt(15247)/10 - 5/2]
[108.630868798908, -113.630868798908]
Note that solve() gives a list of solutions. And that SymPy normally does not evaluate fractions and square roots, as it prefers solutions without loss of precision. evalf() evaluates a float value for these expressions.
Well, that example shows a quadratic equation which may have no solutions, one solution, or two solutions. You would have to rearrange the terms symbolically to come to
input_string = "x**2 + 5*x - 12345 + (118/100)"
But that means you need to implement rules for multiplication, addition, subtraction and potentially division. At least for Python there is a library called SymPy which can parse such strings and provide an expression that you can evaluate and even solve.

How to solve this optimization problem with cvxopt

I have a non-linear optimization problem which, in Mathematica, could be solved as:
FindMaximum[{(81 x + 19)^0.4 + (80 (1 - x) + 20)^0.6, 0 <= x <= 1}, x‬‬]
However, now I am on a computer without Mathematica and I would like to solve a similar problem in Python, using the CVXOPT module. I looked at the examples and found linear programs, quadratic programs, and other kinds of programs, but could not find this simple program.
Can I solve such a program with CVXOPT?
I did not find a solution with cvxopt, but I found a much better alternative - cvxpy:
import cvxpy as cp
x = cp.Variable()
prob = cp.Problem(
cp.Maximize((81*x + 19)**0.6 + (80*(1-x)+20)**0.6),
[0 <= x, x <= 1])
prob.solve() # Returns the optimal value.
print("status:", prob.status)
print("optimal value", prob.value)
print("optimal var", x.value)
Prints:
status: optimal
optimal value 23.27298502822502
optimal var 0.5145387371825181

Rounding Error: Harmonic mean with exponent of small numbers

Let us say I have log_a1=-1000, log_a2=-1001, and log_a3=-1002.
n=3
I want to get the harmonic mean (HM) of a1, a2 and a3 (not log_a1, log_a2 and log_a3) such that HM = n/[1/exp(log_a1) + 1/exp(log_a2) + 1/exp(log_a3)].
However, due to rounding error exp(log_a1)=exp(-1000)=0 and accordingly 1/exp(log_a)=inf and HM=0.
Is there any mathematical trick to do? It is okay to get either HM or log(HM).
The best approach is probably to keep things in log scale. Many scientific languages have a log-add-exp function (e.g. numpy.logaddexp in python) that does what you want to high precision, with both the input and the result in log form.
The idea is that you want to compute e^-1000 + e^-1001 + e^-1002, so you factor it to e^-1000 (1 + + e^-1 + e^-2) and take the log. The result is -1000 + log(1 + e^-1 + e^-2), which can be computed without loss of precision.
log(HM)=log(n)-log(1)+log_a_max - log(sum(1./exp(log_ai - log_a_max)))
For a=[-1000, -1001, -1002];
log(HM)=-1001.309

Algorithm to solve Local Alignment

Local alignment between X and Y, with at least one column aligning a C
to a W.
Given two sequences X of length n and Y of length m, we
are looking for a highest-scoring local alignment (i.e., an alignment
between a substring X' of X and a substring Y' of Y) that has at least
one column in which a C from X' is aligned to a W from Y' (if such an
alignment exists). As scoring model, we use a substitution matrix s
and linear gap penalties with parameter d.
Write a code in order to solve the problem efficiently. If you use dynamic
programming, it suffices to give the equations for computing the
entries in the dynamic programming matrices, and to specify where
traceback starts and ends.
My Solution:
I've taken 2 sequences namely, "HCEA" and "HWEA" and tried to solve the question.
Here is my code. Have I fulfilled what is asked in the question? If am wrong kindly tell me where I've gone wrong so that I will modify my code.
Also is there any other way to solve the question? If its available can anyone post a pseudo code or algorithm, so that I'll be able to code for it.
public class Q1 {
public static void main(String[] args) {
// Input Protein Sequences
String seq1 = "HCEA";
String seq2 = "HWEA";
// Array to store the score
int[][] T = new int[seq1.length() + 1][seq2.length() + 1];
// initialize seq1
for (int i = 0; i <= seq1.length(); i++) {
T[i][0] = i;
}
// Initialize seq2
for (int i = 0; i <= seq2.length(); i++) {
T[0][i] = i;
}
// Compute the matrix score
for (int i = 1; i <= seq1.length(); i++) {
for (int j = 1; j <= seq2.length(); j++) {
if ((seq1.charAt(i - 1) == seq2.charAt(j - 1))
|| (seq1.charAt(i - 1) == 'C') && (seq2.charAt(j - 1) == 'W')) {
T[i][j] = T[i - 1][j - 1];
} else {
T[i][j] = Math.min(T[i - 1][j], T[i][j - 1]) + 1;
}
}
}
// Strings to store the aligned sequences
StringBuilder alignedSeq1 = new StringBuilder();
StringBuilder alignedSeq2 = new StringBuilder();
// Build for sequences 1 & 2 from the matrix score
for (int i = seq1.length(), j = seq2.length(); i > 0 || j > 0;) {
if (i > 0 && T[i][j] == T[i - 1][j] + 1) {
alignedSeq1.append(seq1.charAt(--i));
alignedSeq2.append("-");
} else if (j > 0 && T[i][j] == T[i][j - 1] + 1) {
alignedSeq2.append(seq2.charAt(--j));
alignedSeq1.append("-");
} else if (i > 0 && j > 0 && T[i][j] == T[i - 1][j - 1]) {
alignedSeq1.append(seq1.charAt(--i));
alignedSeq2.append(seq2.charAt(--j));
}
}
// Display the aligned sequence
System.out.println(alignedSeq1.reverse().toString());
System.out.println(alignedSeq2.reverse().toString());
}
}
#Shole
The following are the two question and answers provided in my solved worksheet.
Aligning a suffix of X to a prefix of Y
Given two sequences X and Y, we are looking for a highest-scoring alignment between any suffix of X and any prefix of Y. As a scoring model, we use a substitution matrix s and linear gap penalties with parameter d.
Give an efficient algorithm to solve this problem optimally in time O(nm), where n is the length of X and m is the length of Y. If you use a dynamic programming approach, it suffices to give the equations that are needed to compute the dynamic programming matrix, to explain what information is stored for the traceback, and to state where the traceback starts and ends.
Solution:
Let X_i be the prefix of X of length i, and let Y_j denote the prefix of Y of length j. We compute a matrix F such that F[i][j] is the best score of an alignment of any suffix of X_i and the string Y_j. We also compute a traceback matrix P. The computation of F and P can be done in O(nm) time using the following equations:
F[0][0]=0
for i = 1..n: F[i][0]=0
for j = 1..m: F[0][j]=-j*d, P[0][j]=L
for i = 1..n, j = 1..m:
F[i][j] = max{ F[i-1][j-1]+s(X[i-1],Y[j-1]), F[i-1][j]-d, F[i][j-1]-d }
P[i][j] = D, T or L according to which of the three expressions above is the maximum
Once we have computed F and P, we find the largest value in the bottom row of the matrix F. Let F[n][j0] be that largest value. We start traceback at F[n][j0] and continue traceback until we hit the first column of the matrix. The alignment constructed in this way is the solution.
Aligning Y to a substring of X, without gaps in Y
Given a string X of length n and a string Y of length m, we want to compute a highest-scoring alignment of Y to any substring of X, with the extra constraint that we are not allowed to insert any gaps into Y. In other words, the output is an alignment of a substring X' of X with the string Y, such that the score of the alignment is the largest possible (among all choices of X') and such that the alignment does not introduce any gaps into Y (but may introduce gaps into X'). As a scoring model, we use again a substitution matrix s and linear gap penalties with parameter d.
Give an efficient dynamic programming algorithm that solves this problem optimally in polynomial time. It suffices to give the equations that are needed to compute the dynamic programming matrix, to explain what information is stored for the traceback, and to state where the traceback starts and ends. What is the running-time of your algorithm?
Solution:
Let X_i be the prefix of X of length i, and let Y_j denote the prefix of Y of length j. We compute a matrix F such that F[i][j] is the best score of an alignment of any suffix of X_i and the string Y_j, such that the alignment does not insert gaps in Y. We also compute a traceback matrix P. The computation of F and P can be done in O(nm) time using the following equations:
F[0][0]=0
for i = 1..n: F[i][0]=0
for j = 1..m: F[0][j]=-j*d, P[0][j]=L
for i = 1..n, j = 1..m:
F[i][j] = max{ F[i-1][j-1]+s(X[i-1],Y[j-1]), F[i][j-1]-d }
P[i][j] = D or L according to which of the two expressions above is the maximum
Once we have computed F and P, we find the largest value in the rightmost column of the matrix F. Let F[i0][m] be that largest value. We start traceback at F[i0][m] and continue traceback until we hit the first column of the matrix. The alignment constructed in this way is the solution.
Hope you get some idea about wot i really need.
I think it's quite easy to find resources or even the answer by google...as the first result of the searching is already a thorough DP solution.
However, I appreciate that you would like to think over the solution by yourself and are requesting some hints.
Before I give out some of the hints, I would like to say something about designing a DP solution
(I assume you know this can be solved by a DP solution)
A dp solution basically consisting of four parts:
1. DP state, you have to self define the physical meaning of one state, eg:
a[i] := the money the i-th person have;
a[i][j] := the number of TV programmes between time i and time j; etc
2. Transition equations
3. Initial state / base case
4. how to query the answer, eg: is the answer a[n]? or is the answer max(a[i])?
Just some 2 cents on a DP solution, let's go back to the question :)
Here's are some hints I am able to think of:
What is the dp state? How many dimensions are enough to define such a state?
Thinking of you are solving problems much alike to common substring problem (on 2 strings),
1-dimension seems too little and 3-dimensions seems too many right?
As mentioned in point 1, this problem is very similar to common substring problem, maybe you should have a look on these problems to get yourself some idea?
LCS, LIS, Edit Distance, etc.
Supplement part: not directly related to the OP
DP is easy to learn, but hard to master. I know a very little about it, really cannot share much. I think "Introduction to algorithm" is a quite standard book to start with, you can find many resources, especially some ppt/ pdf tutorials of some colleges / universities to learn some basic examples of DP.(Learn these examples is useful and I'll explain below)
A problem can be solved by many different DP solutions, some of them are much better (less time / space complexity) due to a well-defined DP state.
So how to design a better DP state or even get the sense that one problem can be solved by DP? I would say it's a matter of experiences and knowledge. There are a set of "well-known" DP problems which I would say many other DP problems can be solved by modifying a bit of them. Here is a post I just got accepted about another DP problem, as stated in that post, that problem is very similar to a "well-known" problem named "matrix chain multiplication". So, you cannot do much about the "experience" part as it has no express way, yet you can work on the "knowledge" part by studying these standard DP problems first maybe?
Lastly, let's go back to your original question to illustrate my point of view:
As I knew LCS problem before, I have a sense that for similar problem, I may be able to solve it by designing similar DP state and transition equation? The state s(i,j):= The optimal cost for A(1..i) and B(1..j), given two strings A & B
What is "optimal" depends on the question, and how to achieve this "optimal" value in each state is done by the transition equation.
With this state defined, it's easy to see the final answer I would like to query is simply s(len(A), len(B)).
Base case? s(0,0) = 0 ! We can't really do much on two empty string right?
So with the knowledge I got, I have a rough thought on the 4 main components of designing a DP solution. I know it's a bit long but I hope it helps, cheers.

Resources