Python3 Sine Function to fit Data - python-3.x

I have data which I broke into 28day slices. Here is a plot of the means of each of these slices.
I have to calculate some values from equation that fits the data reasonably well, and then find the correlation coefficient between those values and the means for each 28-day slice. Since I have one mean for each slice I have to find one value from the equation per slice.
I believe the equation is:
y = A sin(Bx - C) + D
But I am struggling to implement the equation for my data and find these values.

Related

Finding the optimal selections of x number per column and y numbers per row of an NxM array

Given an NxM array of positive integers, how would one go about selecting integers so that the maximum sum of values is achieved where there is a maximum of x selections in each row and y selections in each column. This is an abstraction of a problem I am trying to face in making NCAA swimming lineups. Each swimmer has a time in every event that can be converted to an integer using the USA Swimming Power Points Calculator the higher the better. Once you convert those times, I want to assign no more than 3 swimmers per event, and no more than 3 races per swimmer such that the total sum of power scores is maximized. I think this is similar to the Weapon-targeting assignment problem but that problem allows a weapon type to attack the same target more than once (in my case allowing a single swimmer to race the same event twice) and that does not work for my use case. Does anybody know what this variation on the wta problem is called, and if so do you know of any solutions or resources I could look to?
Here is a mathematical model:
Data
Let a[i,j] be the data matrix
and
x: max number of selected cells in each row
y: max number of selected cells in each column
(Note: this is a bit unusual: we normally reserve the names x and y for variables. These conventions can help with readability).
Variables
δ[i,j] ∈ {0,1} are binary variables indicating if cell (i,j) is selected.
Optimization Model
max sum((i,j), a[i,j]*δ[i,j])
sum(j,δ[i,j]) ≤ x ∀i
sum(i,δ[i,j]) ≤ y ∀j
δ[i,j] ∈ {0,1}
This can be fed into any MIP solver.

Inverse CDF of Poisson dist in Excel

I want to know is there a function to calculate the inverse cdf of poisson distribution? So that I can use inverse CDF of poisson to generate a set of poisson distributed random number.
A) Inverse CDF of Poisson distribution
The inverse CDF at q is also referred to as the q quantile of a distribution. For a discrete distribution distribution . the inverse CDF at q is the smallest integer x such that CDF[dist,x]≥q.. The Poisson distribution is a discrete distribution that models the number of events based on a constant rate of occurrence. The Poisson distribution can be used as an approximation to the binomial when the number of independent trials is large and the probability of success is small. A common application of the Poisson distribution is predicting the number of events over a specific time, such as the number of cars arriving at a toll plaza in 1 minute.
Formula
The probability mass function (PMF) is:
mean = λ
variance = λ
Notation
Term Description
e base of the natural logarithm
Reference: Methods and Formulas for Inverse Cumulative Distribution Functions
B) Excel Function: Excel provides the following function for the Poisson distribution:
POISSON(x, μ, cum)
where μ = the mean of the distribution and cum takes the values TRUE and FALSE
POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.
POISSON(x, μ, TRUE)= cumulative probability distribution function F(x) at the value x for the Poisson distribution with mean μ.
Excel 2010/2013/2016 provide the additional function POISSON.DIST which is equivalent to POISSON.
Reference: Office Support POISSON.DIST Function
C) Excel doesn’t provide a worksheet function for the inverse of the Poisson distribution.
Instead you can use the following function provided by the Real Statistics Resource Pack. It’s a free download for Excel various versions.
POISSON_INV(p, μ) = smallest integer x such that POISSON(x, μ, TRUE) ≥ p
Note that the maximum value of x is 1,024,000,000. A value higher than this indicates an error.
Reference: Real Statistics Using Excel
D)
Reference to MREXCEL.COM web site a query related to your question quoted below seems to be related to your question.
Not sure if anyone can help with this. Basically I'm trying to find out how to apply the reverse of the Poisson function in excel. So as of now I have poisson(x value, mean, true-cumulative) and that lets me get the probability for that occurence. Basically I want to know how I can get the minimum/maximum x value based on a given probability.
So if I have a list of data (700 rows) and I want to find out what the minimum starting value should be given a desired average and the fact that I want the lowest value to be at the 0.05% probability. So 0.05% = (x, 35, True) solve for x. I know I can prob do this with solver, but I am trying to figure out a way to do this formulaicly without having to use the solver (as I may have to use this many times).
The code referred to here covers the inverse of the poisson formula when using True in the excel formula. It does not cover the inverse of the poisson formula when using False in the excel formula.
Re: Reverse Poisson?
Originally Posted by shg
A further mod to accommodate large means:
Code:
Function PoissonInv(Prob As Double, Mean As Double) As Variant
' shg 2011, 2012, 2014, 2015-0415
' For a Poisson process with mean Mean, returns a three-element array:
' o The smallest integer N such that POISSON(N, Mean, True) >= Prob
' o The CDF for N-1 (which is < Prob)
' o The CDF for N (which is >= Prob)
-------Reference :> https://www.mrexcel.com/forum/excel-questions/507508-reverse-poisson-2.html>
E) Why doesn't Excel have a POISSON.INV function?
Discussion on Referred web page have references to some formulas for calculating related information desired by OP.
You could use the following.
With the Poisson mean named lambda, enter the following in an newly inserted worksheet.
A1: =IF(ROWS(A$1:A1)<=4*lambda,POISSON(ROWS(A$1:A1)-1,lambda,1))
Fill A1 down into A2:A1000 (4 times as many rows as your most typical lambda value). Name the A1:A1000 range POISSON.CDF. Then use the formula
=MATCH(n,POISSON.CDF)-1
to give the results a POISSON.INV(n,lambda) function would.
If you want this for varying lambda, use the array formula
=MATCH(n,POISSON(ROW($A$1:INDEX($A:$A,4*lambda+1),lambda,1))-1
Reference Shared Link
Hope That Helps.
=MATCH(RAND(),MMULT((ROW(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+3,1)))=COLUMN(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(1,MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+2))))+0,MMULT((ROW(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+2,1)))=(COLUMN(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(1,MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+1)))+1))+0,POISSON(ROW($A$1:INDEX($A:$A,MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+1))-1,lambda,1)))+(ROW(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+3,1)))=(COLUMN(INDIRECT(ADDRESS(1,1)&":"&ADDRESS(1,1)))+FLOOR(MAX(lambda,5+lambda* 45/50)+6* SQRT(lambda)+2,1)))+0)-1
It is quite slow for lambda >1000.
This expands on the array formula
=MATCH(C4,POISSON(ROW($A$1:INDEX($A:$A,4*lambda+1)),lambda,1))-1
shared above by skkakkar, by prepending the array with 0 and appending with 1, following Is there a way to concatenate two arrays in Excel without VBA? .
The rest is mostly making the array shorter by replacing 4* lambda with 6* SQRT(lambda).

Deriving a matrix values from another 2 matrices

Suppose you have Matrix A.
Suppose also that we have Matrix C
If we have A = B x C and we want to find out the B matrix values which I believe should be 3x3 (Correct me if I am wrong)
Do we need to use matrix inversion here? I did not use algebra since many years.
I do not have a code yet but if someone can provide a snippet that will be great.
This is a problem that I have in image processing where the A , C hold RGB values.
The submitted matrices are just for illustration.
I am trying to solve this problem using Python numpy
I hope that someone can help with it.
Your matrix should be 5x5. As we are dealing with non-square matrices, you could use the generalized inverse of C to obtain B:
import numpy as np
np.random.seed(10)
A = np.random.randint(0,9,(5,3))
C = np.random.randint(0,9,(5,3))
B = np.matmul(A,np.linalg.pinv(C))
print B
Building on percusse's comment, you can do this with numpy.linalg.lstsq. However, this assumes that we are performing matrix left division but the situation is your question is for right division.
Using the fact that you are solving for B with B = A / C, lstsq solves problems of the type A \ C. To convert this into a form for lstsq, we can convert it into the latter problem by:
B = A / C = (C' \ A')'
The ' operator is the transpose. The above is found by linear algebra rules. Specifically, perform two transposes: ((A / C)')' where transposing a matrix twice is simply the result of itself. Also, knowing that (AC)' is equal to C'A' and for a matrix, the inverse of the transpose is equal to the transpose of the inverse you should get the above relationship.
Therefore:
B = numpy.linalg.lstsq(C.T, A.T)[0].T
The output of lstsq is a tuple where the first element is the actual solution.
Take note that for your particular example, C is a rank-deficient matrix so you won't be able to reconstruct A properly from B and C.

how to obtain estimation from regression in excel?

I use datas in excel to produce a graphic.
Then I make a regression, and have an equation. I'd like to know what value would be obtained from the regression (for example, x = 7,6 is the value for which I wanna know an estimation of y).
It is an approximation with a 6 degree polynome.
One wimple method would be this : I have the equation, so I could use it
However, I wondered if there is a fast method to do it? Like I enter 7,6 somewhere to have the result quickly?
if you are looking at a linear regression line (straight line) you could try the forecast formula
=forecast(X, Known Ys, Known Xs)
you could also build your own equation automatically from
=linest(...)
I found the following on a site describing the capabilities of the linest function in excel:
In addition to using LOGEST to calculate statistics for other
regression types, you can use LINEST to calculate a range of other
regression types by entering functions of the x and y variables as the
x and y series for LINEST. For example, the following formula:
=LINEST(yvalues, xvalues^COLUMN($A:$C))
works when you have a single column of y-values and a single column of
x-values to calculate the cubic (polynomial of order 3) approximation
of the form:
y = m1*x + m2*x^2 + m3*x^3 + b
You can adjust this formula to calculate other types of regression,
but in some cases it requires the adjustment of the output values and
other statistics.
or look at:
=trend

How to curve fit data in Excel to a multi variable polynomial?

I have a simple set of data, 10 values that increase.
I want to fit them to a polynomial of the form:
Z = A1 + A2*X + A3*Y + A4*X^2 + A5*X*Y+ A6*Y^2
Where Z the output is the set of data above, A1 - A6 are the coefficients I am looking for,
X is the range of inputs (10 of course), and Y for the moment is a constant value.
How can I curve fit to this polynomial and not the standard 2nd order one that is created using 'trendline'?
Construct a Vandermonde matrix on your data points, find it's inverse with MINVERSE, then apply this to the vector of Z values with MMULT. This would work for polynomial degree n with n data points.
Otherwise you could try polynomial regression, which will again use the Vandermonde matrix.
More math than Excel really.

Resources