I am trying to implement polynomial regression using the least squares method. There is a problem with the 3rd graph: it is not displayed.
I think the problem is in my implementation of the formula y=ax+b.
First, I generated experimental data and fitted it using the built-in functions polyfit and polyval.
x=0:0.1:5;
y=3*x+2;
y1=y+randn(size(y));
k=1;#Polynom
X1=0:0.01:10
B=polyfit(x,y1,k);
Y1=polyval(B,X1);
After that, I use a linear model to solve the polynomial regression with the method of least squares.
Y2=Y1'*x+B'; % <-- this is the problematic formula
subplot(3,2,3);
plot(x,Y1,'-b',X1,y1,'LineWidth');
title('y1=ax+b');
xlabel('x');
ylabel('y');
grid on;
As a result, no graph is drawn.
Check the sizes of the vectors: x and Y1 do not have the same length, and neither do X1 and y1. Also, 'LineWidth' must be followed by a value.
You probably want to plot as:
plot(x,y1,'-b',X1,Y1,'LineWidth', 1);
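The same pairing rule can be checked outside MATLAB/Octave. Below is a minimal Python sketch, assuming noise-free data so the result is reproducible; the helper names `fit_line` and `eval_line` are made up for illustration. It fits y = a*x + b with the closed-form least-squares formulas and evaluates the line on the denser grid, so that each plotted pair has matching lengths.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates of a and b in y = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b


def eval_line(a, b, xs):
    """Evaluate the fitted line on an arbitrary grid."""
    return [a * x + b for x in xs]


x = [0.1 * i for i in range(51)]       # mirrors x = 0:0.1:5
y1 = [3 * xi + 2 for xi in x]          # noise left out here (assumption)
X1 = [0.01 * i for i in range(1001)]   # mirrors X1 = 0:0.01:10

a, b = fit_line(x, y1)
Y1 = eval_line(a, b, X1)

# Valid pairs for plotting: (x, y1) and (X1, Y1).
print(len(x), len(y1), len(X1), len(Y1))
```

Pairing x with Y1 would mix a 51-element vector with a 1001-element one, which is the mismatch described above.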
I have implemented multivariate linear regression, where the parameters theta0 (intercept), theta1 and theta2 are optimized by minimizing the MSE loss, with the step size chosen by line search in gradient descent. How do I visually illustrate the mathematical property that the directions of steepest descent (negative gradients) of successive steps are orthogonal? I'm trying to generate a contour map similar to this image: Plot, but with respect to 2 parameters instead of 1 (if that's not possible, 2 separate plots would also be great).
Also, I originally wanted to perform multivariate linear regression with 4 features, but ultimately decided to use only the 2 most strongly correlated ones (after comparing their PCC) in order to be able to plot a graph. Although I'm not aware of any way to plot 4-dimensional data, does anyone know if this is possible and how?
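As a starting point for such a plot, the orthogonality itself can be checked numerically. The sketch below is only an illustration under assumptions: a made-up data set with an intercept column and one feature, and an exact line search, which has a closed form because the MSE loss is quadratic in the parameters. The recorded `path` is what one would overlay on a contour map of the loss.

```python
def mse_gradient(theta, X, y):
    """Gradient of L(theta) = (1/(2n)) * sum((X theta - y)^2)."""
    n = len(y)
    residuals = [sum(t * v for t, v in zip(theta, row)) - yi
                 for row, yi in zip(X, y)]
    return [sum(r * row[j] for r, row in zip(residuals, X)) / n
            for j in range(len(theta))]


def exact_step(g, X):
    """Step size minimizing the quadratic loss along -g: g.g / (g.H.g), H = X^T X / n."""
    n = len(X)
    Xg = [sum(gj * v for gj, v in zip(g, row)) for row in X]
    return sum(gj * gj for gj in g) / (sum(v * v for v in Xg) / n)


# Hypothetical data: first column is the intercept term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 2.0, 5.0]

theta = [0.0, 0.0]
path, grads = [theta], []
for _ in range(5):
    g = mse_gradient(theta, X, y)
    alpha = exact_step(g, X)
    theta = [t - alpha * gj for t, gj in zip(theta, g)]
    path.append(theta)
    grads.append(g)

# Dot products of successive gradients; each should be zero up to rounding.
dots = [sum(a * b for a, b in zip(g0, g1)) for g0, g1 in zip(grads, grads[1:])]
print(dots)
```

With an inexact line search (for example backtracking) the successive directions are in general not exactly orthogonal, so the zig-zag in the plot would look less regular.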
I was working with a dataset and found the curve to be sigmoidal. I have fitted the curve and got the equation A2+((A1-A2)/1+exp((x-x0)/dx)) where:
x0 : Mid point of the curve
dx : slope of the curve
I need to find the slope and the midpoint in order to give a generalized equation. Any suggestions?
You should be able to simplify the modeling of the sigmoid with a function of the following form:
The source includes code in R showing how to fit your data to the sigmoid curve, which you can adapt to whatever language you're writing in. The source also notes the following form:
You can adapt the linked R code to solve for this form as well. The nice thing about these general functions is that you can compute the derivative from them. Also, note that the midpoint of the sigmoid is where the second derivative d^2y/dx^2 is 0 (where it changes sign from negative to positive or vice versa), which is also where the slope dy/dx is steepest.
Assuming your equation is a misprint of
A2+(A1-A2)/(1+exp((x-x0)/dx))
then your graph does not reflect zero residual, since in your graph the upper shoulder is sharper than the lower shoulder.
Likely the problem is your starting values. Try using the native R function SSfpl, as in
nls(y ~ SSfpl(x,A2,A1,x0,dx))
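Once the four parameters are fitted, the midpoint and the slope at the midpoint follow directly from the corrected formula. The sketch below assumes that form, A2+(A1-A2)/(1+exp((x-x0)/dx)), and uses made-up parameter values; the function names are hypothetical.

```python
import math


def four_param_logistic(x, A1, A2, x0, dx):
    """y = A2 + (A1 - A2) / (1 + exp((x - x0) / dx))."""
    return A2 + (A1 - A2) / (1.0 + math.exp((x - x0) / dx))


def logistic_slope(x, A1, A2, x0, dx):
    """Analytic derivative dy/dx of the curve above."""
    e = math.exp((x - x0) / dx)
    return -(A1 - A2) * e / (dx * (1.0 + e) ** 2)


# Hypothetical fitted values, for illustration only.
A1, A2, x0, dx = 0.0, 10.0, 5.0, 0.8

mid_y = four_param_logistic(x0, A1, A2, x0, dx)   # equals (A1 + A2) / 2
mid_slope = logistic_slope(x0, A1, A2, x0, dx)    # equals (A2 - A1) / (4 * dx)
print(mid_y, mid_slope)
```

So x0 is the x value at which y is halfway between A1 and A2, and dx sets the steepness: the slope at the midpoint is (A2 - A1) / (4 * dx).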
Yesterday, I posted a question about the general concept of implementing the SVM primal form:
Support Vector Machine Primal Form Implementation
and "lejlot" helped me understand that what I am solving is a QP problem.
But I still don't understand how my objective function can be expressed as a QP problem
(http://en.wikipedia.org/wiki/Support_vector_machine#Primal_form)
I also don't understand how QP and the Quasi-Newton method are related.
All I know is that the Quasi-Newton method will solve my QP problem, which is supposedly formulated from
my objective function (I don't see the connection).
Can anyone walk me through this, please?
For SVMs, the goal is to find a classifier. This problem can be expressed in terms of a function that you are trying to minimize.
Let's first consider the Newton iteration. Newton iteration is a numerical method for finding a solution to a problem of the form F(x) = 0.
Instead of solving it analytically, we can solve it numerically with the following iteration:
x^(k+1) = x^k - DF(x^k)^-1 * F(x^k)
Here x^(k+1) is the (k+1)th iterate, DF(x^k)^-1 is the inverse of the Jacobian of F at x^k, and x^k is the kth iterate.
The update is repeated until the step size (delta x) becomes small enough or the function value is close enough to 0. The termination criteria can be chosen accordingly.
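In one dimension the Jacobian is just the derivative, so the iteration above can be sketched in a few lines. The function name `newton_root`, the tolerances and the example equation x^2 - 2 = 0 are illustrative choices, not part of the original answer.

```python
def newton_root(f, df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x <- x - f(x) / df(x) for a scalar equation f(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        # Stop when the step is tiny or the function value is close to 0.
        if abs(step) < tol or abs(f(x)) < tol:
            break
    return x


# Example: solve x^2 - 2 = 0 starting from x = 1.
root = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```

Both termination criteria mentioned above appear in the `if` condition; in practice either one, or both, may be used.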
Now consider solving the problem F'(x) = 0, where F is now the scalar function we want to minimize. If we formulate the Newton iteration for that, we get
x^(k+1) = x^k - HF(x^k)^-1 * DF(x^k)
where HF(x^k)^-1 is the inverse of the Hessian matrix and DF(x^k) is the gradient of F at x^k. Note that we are working in n dimensions and cannot simply take a quotient; we have to invert a matrix.
This causes some problems: in each step we have to calculate the Hessian matrix at the updated x, which is very inefficient. We also have to solve a system of linear equations, namely y = HF(x)^-1 * DF(x), or equivalently HF(x)*y = DF(x).
So instead of computing the Hessian in every iteration, a quasi-Newton method starts with an initial guess of the Hessian (maybe the identity matrix) and performs low-rank updates (rank one or rank two, depending on the method) after each iterate. For the exact formulas have a look here.
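To make the idea concrete, here is a small sketch of one such scheme. It uses the BFGS update of an approximation to the inverse Hessian, which is one common choice and not necessarily the formula the linked page gives. The backtracking line search, the tolerances and the test function are all assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Quasi-Newton minimization: H approximates the inverse Hessian."""
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity guess
    g = grad(x)
    for _ in range(max_iter):
        if max(abs(gi) for gi in g) < tol:
            break
        p = [-dot(H[i], g) for i in range(n)]            # search direction -H g
        t, fx, slope = 1.0, f(x), dot(g, p)
        # Backtracking (Armijo) line search.
        while t > 1e-12 and f([xi + t * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x_new = [xi + t * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]            # change in x
        y = [a - b for a, b in zip(g_new, g)]            # change in gradient
        sy = dot(s, y)
        if sy > 1e-10 * (dot(s, s) * dot(y, y)) ** 0.5:  # curvature condition
            rho = 1.0 / sy
            Hy = [dot(H[i], y) for i in range(n)]
            c = rho * rho * dot(y, Hy) + rho
            # H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T, expanded.
            H = [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + c * s[i] * s[j]
                  for j in range(n)] for i in range(n)]
        x, g = x_new, g_new
    return x


# Illustrative convex test function with its minimum at (1, -2).
f = lambda v: (v[0] - 1.0) ** 2 + 10.0 * (v[1] + 2.0) ** 2
grad = lambda v: [2.0 * (v[0] - 1.0), 20.0 * (v[1] + 2.0)]
x_min = bfgs_minimize(f, grad, [0.0, 0.0])
print(x_min)
```

Note that no Hessian is ever computed and no linear system is solved: each iteration only needs gradients and a few vector products.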
So how does this link to SVMs?
When you look at the function you are trying to minimize, you can formulate a primal problem, which you can then reformulate as a dual Lagrangian problem that is convex and can be solved numerically. It is all well documented in the article, so I will not try to reproduce the formulas here in lower quality.
But the idea is the following: if you have a dual problem, you can solve it numerically. There are multiple solvers available. In the link you posted, they recommend coordinate descent, which solves the optimization problem for one coordinate at a time. Or you can use subgradient descent. Another method is to use L-BFGS. It is really well explained in this paper.
Another popular algorithm for solving problems like this is ADMM (alternating direction method of multipliers). In order to use ADMM you would have to reformulate the given problem into an equivalent problem that has the same solution but the correct format for ADMM. For that I suggest reading Boyd's text on ADMM.
In general: first, understand the function you are trying to minimize, and then choose the numerical method that is best suited. In this case, subgradient descent and coordinate descent are the best suited, as stated in the Wikipedia link.
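As an illustration of the subgradient option, the sketch below minimizes a soft-margin (hinge loss) form of the primal objective, (lam/2)*||w||^2 + (1/n)*sum(max(0, 1 - y_i*(w.x_i + b))). This particular form, the sign convention for b, the fixed step size, the function name and the toy data are all assumptions made for the example; they are not taken from the linked article.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the regularized hinge loss."""
    dim = len(points[0])
    n = len(points)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]      # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:                 # hinge term is active for this point
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical, linearly separable toy data.
points = [(2.0, 2.0), (3.0, 1.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -1.0), (-2.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
               for x in points]
print(w, b, predictions)
```

A subgradient is needed because the hinge term max(0, ...) is not differentiable at the kink; a fixed step size keeps the sketch short, whereas a decreasing step size is the usual choice when convergence matters.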
I have a cubic Bézier curve, but I have a problem when I need only one point. I only have the value on the X-axis and want to find the corresponding value on the Y-axis. Alternatively, I want to find the parameter t, from which I can easily calculate the Y value.
Any clue how to do it? Or is there a formula for this?
Any solution will have to deal with the fact that there may be multiple solutions if the curve is not X monotone. Consider the cubic bezier (0,0),(2,0),(-1,1),(1,1):
As you can see, there are 3 parameter values (and Y coordinates) at which X==1/2.
This means that if you use subdivision (which is probably your simplest solution), then you need to be careful that your initial bounding t values only surround the point you want.
You can also guess what this implies about the order of an algebraic solution.
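A minimal sketch of the bracketing idea, using plain bisection on t rather than true curve subdivision. It assumes the caller supplies an interval [t_lo, t_hi] on which x(t) is monotone and which contains the target; the function names and the chosen interval are illustrative. The control points are the ones from the example above.

```python
def bezier(t, p0, p1, p2, p3):
    """One coordinate of a cubic Bezier curve at parameter t."""
    u = 1.0 - t
    return u ** 3 * p0 + 3 * u ** 2 * t * p1 + 3 * u * t ** 2 * p2 + t ** 3 * p3


def solve_t_for_x(x_target, xs, t_lo, t_hi, tol=1e-12):
    """Bisection on [t_lo, t_hi]; assumes x(t) is monotone there and brackets x_target."""
    f_lo = bezier(t_lo, *xs) - x_target
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        f_mid = bezier(t_mid, *xs) - x_target
        if (f_mid > 0) == (f_lo > 0):
            t_lo, f_lo = t_mid, f_mid   # root lies in the upper half
        else:
            t_hi = t_mid                # root lies in the lower half
    return 0.5 * (t_lo + t_hi)


xs = (0.0, 2.0, -1.0, 1.0)   # X coordinates of the control points
ys = (0.0, 0.0, 1.0, 1.0)    # Y coordinates of the control points

# x(t) is increasing on [0, 0.2], so this bracket isolates the first solution.
t = solve_t_for_x(0.5, xs, 0.0, 0.2)
y = bezier(t, *ys)
print(t, y)
```

Choosing a different bracket, for example one around t = 0.5, would pick out one of the other solutions for the same X.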
A parametric curve extends to any dimension by adding coefficients for those dimensions. Are you sure you've got things straight? It seems like you are using the x-axis as the curve parameter t. The t parameter controls the computations of X- and Y-coordinates by having two cubic equations. Take a look at Wikipedia which provides some pretty neat explanations for the 2D case.
Edit:
Solve x(t) = x as a general third-degree polynomial in t. Beware that it might have up to 3 solutions, though.
I want to use a linear regression model, but I want to use ordinary least squares, which I think is a type of linear regression. The software I use is SPSS. It only has linear regression, partial least squares and 2-stage least squares. I have no idea which one is ordinary least squares (OLS).
Yes. Although 'linear regression' refers to any approach for modeling the relationship between a dependent variable and one or more independent variables, OLS is the method used to find the simple linear regression of a set of data.
Linear regression is a broad term that just says we are finding a relationship between the dependent and independent variable(s), no matter what technique we are using.
OLS is just one of the techniques for doing linear regression.
Let's say,
error(e) = (observed value - predicted value)
Observed values - blue dots in picture
Predicted values - points on the line (vertically below the observed values)
The vertical lines below represent 'e'. We square them, add them up, and get the total error. We then try to reduce this total error.
For OLS, as the name says (ordinary least squares), we minimize the sum of all e^2, i.e. we try to make the total squared error as small as possible.
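The definition above can be checked on a few numbers. This is a minimal sketch with made-up data and hypothetical function names: it computes the sum of e^2 for a given line and compares the OLS line against slightly shifted lines.

```python
def sum_squared_errors(a, b, xs, ys):
    """Sum of e^2, where e = observed value - predicted value for y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def ols_line(xs, ys):
    """Slope and intercept that minimize the sum of squared errors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Made-up observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = ols_line(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
shifted = [sum_squared_errors(a + 0.1, b, xs, ys),
           sum_squared_errors(a, b - 0.1, xs, ys),
           sum_squared_errors(a - 0.1, b + 0.2, xs, ys)]
print(best, shifted)
```

Any other line gives a larger total squared error than the OLS line, which is what "least squares" refers to.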