Deriving a matrix's values from two other matrices - python-3.x

Suppose you have Matrix A.
Suppose also that we have Matrix C.
If we have A = B x C and we want to find the values of matrix B, which I believe should be 3x3 (correct me if I am wrong):
Do we need to use matrix inversion here? I have not used linear algebra in many years.
I do not have any code yet, but if someone can provide a snippet that would be great.
This is a problem I have in image processing, where A and C hold RGB values.
The submitted matrices are just for illustration.
I am trying to solve this problem using Python and numpy.
I hope that someone can help with it.

Your B matrix should be 5x5. Since we are dealing with non-square matrices, you can use the generalized (Moore-Penrose) inverse of C to obtain B:
import numpy as np

np.random.seed(10)
A = np.random.randint(0, 9, (5, 3))
C = np.random.randint(0, 9, (5, 3))
B = np.matmul(A, np.linalg.pinv(C))  # A = B x C  =>  B = A x pinv(C)
print(B)

Building on percusse's comment, you can do this with numpy.linalg.lstsq. However, lstsq performs matrix left division, while the situation in your question calls for right division.
You are solving for B with B = A / C, whereas lstsq solves left-division problems of the form C \ A (finding the X that minimizes ||C X - A||). We can convert the former into the latter:
B = A / C = (C' \ A')'
Here ' denotes the transpose. The identity follows from standard linear algebra rules. Transposing a matrix twice gives back the matrix itself, so A / C = ((A / C)')'. Since (AC)' = C'A' and the inverse of a transpose equals the transpose of the inverse, (A / C)' = (A C^-1)' = (C^-1)' A' = (C')^-1 A' = C' \ A', which gives the relationship above.
Therefore:
B = numpy.linalg.lstsq(C.T, A.T)[0].T
The output of lstsq is a tuple where the first element is the actual solution.
Take note that for your particular example, C is a rank-deficient matrix so you won't be able to reconstruct A properly from B and C.
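For completeness, here is a minimal end-to-end sketch of the lstsq approach. The matrices below are made-up stand-ins (in your case A and C would come from your image data):
import numpy as np

np.random.seed(0)
C = np.random.rand(5, 3)        # stand-in for your C
B_true = np.random.rand(5, 5)   # the matrix we pretend generated A
A = B_true @ C                  # A = B x C

# right division via lstsq on the transposed problem: B = (C' \ A')'
B = np.linalg.lstsq(C.T, A.T, rcond=None)[0].T

print(np.allclose(B @ C, A))    # True: B @ C reconstructs A (B itself is not unique, since C' is underdetermined)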

Related

Scikit Learn PolynomialFeatures - what is the use of the include_bias option?

In scikit-learn's PolynomialFeatures preprocessor, there is an include_bias option. This essentially just adds a column of ones to the dataframe. I was wondering what the point of having this is. Of course, you can set it to False. But, theoretically, how does having or not having a column of ones alongside the generated polynomial features affect the regression?
This is the explanation in the documentation, but I can't seem to get anything useful out of it in relation to why it should be used or not.
include_bias : boolean
If True (default), then include a bias column, the feature in which
all polynomial powers are zero (i.e. a column of ones - acts as an
intercept term in a linear model).
Suppose you want to perform the following regression:
y ~ a + b x + c x^2
where x is a generic sample. The best coefficients a, b, c are computed via simple matrix calculus. First, let us denote by X = [1 | x | x^2] a matrix with N rows, where N is the number of samples: the first column is a column of 1s, the second column holds the values x_i for all samples i, and the third column holds the values x_i^2 for all samples i. Let us denote by B the column vector B = [a b c]^T. If Y is the column vector of the N target values, we can write the regression as
Y ~ X B
The i-th row of this equation is y_i ~ [1 x_i x_i^2] [a b c]^T = a + b x_i + c x_i^2.
The goal of training the regression is to find B = [a b c]^T such that X B is as close as possible to Y.
If you don't add a column of 1, you are assuming a-priori that a=0, which might not be correct.
In practice, when you write Python code, and you use PolynomialFeatures together with sklearn.linear_model.LinearRegression, the latter takes care by default of adding a column of 1s (since in LinearRegression the fit_intercept parameter is True by default), so you don't need to add it as well in PolynomialFeatures. Therefore, in PolynomialFeatures one usually keeps include_bias=False.
The situation is different if you use statsmodels.OLS instead of LinearRegression, since OLS does not add an intercept column for you.
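A minimal sketch of the usual scikit-learn combination described above (the data here is made up just to illustrate):
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# made-up data: y = 1 + 2x + 3x^2 plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50).reshape(-1, 1)
y = 1 + 2 * x.ravel() + 3 * x.ravel() ** 2 + rng.normal(0, 0.05, 50)

# include_bias=False: no column of 1s, because LinearRegression
# (fit_intercept=True by default) supplies the intercept itself
X = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)   # roughly 1 and [2, 3]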

Python3 Sine Function to fit Data

I have data which I broke into 28-day slices. Here is a plot of the means of each of these slices.
I have to calculate some values from an equation that fits the data reasonably well, and then find the correlation coefficient between those values and the means of each 28-day slice. Since I have one mean per slice, I have to find one value from the equation per slice.
I believe the equation is:
y = A sin(Bx - C) + D
But I am struggling to implement the equation for my data and find these values.
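One common way to fit that form is scipy.optimize.curve_fit. A minimal sketch, assuming x is the slice index and y holds the 28-day means (the stand-in data and starting guesses below are made up and should be adapted to your series):
import numpy as np
from scipy.optimize import curve_fit

def model(x, A, B, C, D):
    # y = A sin(Bx - C) + D
    return A * np.sin(B * x - C) + D

# stand-in data: replace y with your 28-day slice means
x = np.arange(26, dtype=float)
y = 10 * np.sin(2 * np.pi * x / 13 - 1.0) + 50 + np.random.normal(0, 0.5, x.size)

# rough starting guesses: amplitude and offset from the data, B from a guessed period of ~13 slices
p0 = [(y.max() - y.min()) / 2, 2 * np.pi / 13, 0.0, y.mean()]
params, _ = curve_fit(model, x, y, p0=p0)
fitted = model(x, *params)

# correlation between the fitted values and the observed means
print(np.corrcoef(y, fitted)[0, 1])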

element-wise multiplication with BlockMatrix in Spark

I am wondering whether there is a good way to implement the following problem with Spark's BlockMatrix:
Suppose I have an n * m matrix A, where n is huge and m is not that large, represented by a BlockMatrix. I have another matrix B with only one column, of size n * 1, also a BlockMatrix. Now I want to multiply these two matrices, A times B, in a way very close to element-wise multiplication: the first row of matrix A is multiplied by the first element of matrix B, and so on...
So the result will still be an n * m matrix. The operation is as if B were a weight applied to A.
I think the key is how to map the corresponding blocks together.
I know I can turn B into an n * n diagonal matrix, and then B * A would give the result. Multiplication is already supported in Spark, but this will not be a good idea since it costs much more in communication.
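One way to keep the operation block-local is to join the two block RDDs on their row-block index and scale each block of A with plain numpy. A rough PySpark sketch, assuming A and B were built with the same rowsPerBlock so their row-block indices line up (the function and variable names are mine, not a Spark API):
import numpy as np
from pyspark.mllib.linalg import Matrices
from pyspark.mllib.linalg.distributed import BlockMatrix

def rowwise_scale(A, b):
    """Multiply every row of BlockMatrix A by the matching entry of the n x 1 BlockMatrix b."""
    # key the blocks of A and b by their row-block index
    a_by_row = A.blocks.map(lambda kv: (kv[0][0], (kv[0][1], kv[1])))
    b_by_row = b.blocks.map(lambda kv: (kv[0][0], kv[1]))

    def scale(row_idx, pair):
        (col_idx, a_block), b_block = pair
        a = a_block.toArray()                 # local dense copy of this block of A
        w = b_block.toArray().reshape(-1, 1)  # weights for the rows covered by this block
        scaled = a * w                        # numpy broadcasts the weight column across the block
        return ((row_idx, col_idx),
                Matrices.dense(scaled.shape[0], scaled.shape[1], scaled.ravel(order='F')))

    # the join pairs up blocks that cover the same rows, so full rows never need to be shuffled
    scaled_blocks = a_by_row.join(b_by_row).map(lambda kv: scale(kv[0], kv[1]))
    return BlockMatrix(scaled_blocks, A.rowsPerBlock, A.colsPerBlock)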

About affine transforms (translation)

For a 4*4 affine transformation, I have seen two representations in different texts.
One is
L T
0 1
Another is
L 0
T 1
L is the linear part and T is the translation part; I am wondering which one is correct.
Both forms are correct. The first is used in the left-multiply convention with a column vector, ResultVector = Matrix*V (for example, in OpenGL), and the second in the right-multiply convention with a row vector, V*Matrix (for example, in DirectX).
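A small numpy check (a 2D example with 3*3 matrices for brevity, values made up) illustrating that the two layouts describe the same transform and are transposes of each other:
import numpy as np

L = np.array([[0.0, -1.0],      # linear part: 90-degree rotation
              [1.0,  0.0]])
T = np.array([2.0, 3.0])        # translation part

# column-vector convention: translation in the last column, point multiplied on the right
M_col = np.eye(3)
M_col[:2, :2] = L
M_col[:2, 2] = T

# row-vector convention for the same transform: the transpose of M_col,
# so the translation ends up in the last row (and the linear block is L transposed)
M_row = M_col.T

p = np.array([1.0, 0.0, 1.0])   # homogeneous point (x, y, 1)
print(M_col @ p)                # [2. 4. 1.]
print(p @ M_row)                # [2. 4. 1.] -- same point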

is there any paper or an explanation on how to implement a two dimensional KMP?

I tried to solve the problem of two-dimensional search using a combination of Aho-Corasick and one-dimensional KMP; however, I still need something faster.
To elaborate, I have a matrix A of characters of size n1*n2 and I wish to find all occurrences of a smaller matrix B of size m1*m2, ideally in O(n1*n2 + m1*m2).
For example:
A = a b c b c b
b c a c a c
d a b a b a
q a s d q a
and
B = b c b
c a c
a b a
The algorithm should return the indexes of, say, the upper-left corner of each match, which in this case would be (0,1) and (0,3). Notice that the occurrences may overlap.
There is an algorithm called the Baker-Bird algorithm that I just recently encountered that appears to be a partial generalization of KMP to two dimensions. It uses two algorithms as subroutines - the Aho-Corasick algorithm (which itself is a generalization of KMP), and the KMP algorithm - to efficiently search a two-dimensional grid for a pattern.
I'm not sure if this is what you're looking for, but hopefully it helps!
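For reference, a rough Python sketch of the Baker-Bird idea. To keep it short it matches the pattern rows naively with str.find instead of Aho-Corasick, so the row-matching step is not linear-time, but the overall structure is the same: label each text cell with the pattern row that ends there, then run ordinary 1D KMP down each column of labels.
def kmp_failure(seq):
    # standard KMP failure function over an arbitrary sequence of labels
    fail = [0] * len(seq)
    k = 0
    for i in range(1, len(seq)):
        while k and seq[i] != seq[k]:
            k = fail[k - 1]
        if seq[i] == seq[k]:
            k += 1
        fail[i] = k
    return fail

def baker_bird(text, pattern):
    # text and pattern are lists of equal-length strings
    n1, n2 = len(text), len(text[0])
    m1, m2 = len(pattern), len(pattern[0])

    # give each distinct pattern row a label; the pattern becomes a vertical label sequence
    row_label = {}
    for row in pattern:
        row_label.setdefault(row, len(row_label))
    col_pattern = [row_label[row] for row in pattern]
    fail = kmp_failure(col_pattern)

    # labels[i][j] = label of the pattern row ending at text[i][j - m2 + 1 .. j], or -1
    labels = [[-1] * n2 for _ in range(n1)]
    for i, trow in enumerate(text):
        for prow, lab in row_label.items():
            start = trow.find(prow)
            while start != -1:
                labels[i][start + m2 - 1] = lab
                start = trow.find(prow, start + 1)

    # run 1D KMP down every column of labels, looking for col_pattern
    matches = []
    for j in range(m2 - 1, n2):
        k = 0
        for i in range(n1):
            while k and labels[i][j] != col_pattern[k]:
                k = fail[k - 1]
            if labels[i][j] == col_pattern[k]:
                k += 1
            if k == m1:
                matches.append((i - m1 + 1, j - m2 + 1))  # upper-left corner of a match
                k = fail[k - 1]
    return matches

A = ["abcbcb", "bcacac", "dababa", "qasdqa"]
B = ["bcb", "cac", "aba"]
print(baker_bird(A, B))   # [(0, 1), (0, 3)]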
