I need to do a linear fit of the form:
Y = a*X + b
I need to find the values of a and b that fit the experimental data.
The first thing that occurred to me was to use the polyfit function,
but the problem is that in my data X is a vector with 3 entries.
This is my code:
p_0=np.array([10,10,10])
p_1=np.array([100,10,10])
p_2=np.array([10,100,10])
p_3=np.array([10,10,100])
# Experimental data:
x=np.array([p_0,p_1,p_2,p_3])
y=np.array([35,60,75,65])
a=np.polyfit(x, y,1)
print(a)
I was expecting it to print a list of lists containing the matrix a and the matrix b, but I got TypeError("expected 1D vector for x").
Is there any way to do this with numpy or some other library?
sklearn can be used for this:
import numpy as np
from sklearn.linear_model import LinearRegression
model = LinearRegression()
p_0=np.array([10,10,10])
p_1=np.array([100,10,10])
p_2=np.array([10,100,10])
p_3=np.array([10,10,100])
# Experimental data:
x=np.array([p_0,p_1,p_2,p_3])
y=np.array([35,60,75,65])
model.fit(X=x, y=y)
print("coeff: ", *model.coef_)
print("intercept: ", model.intercept_)
output:
coeff: 0.27777777777777785 0.44444444444444464 0.33333333333333337
intercept: 24.444444444444436
A few other nice features of the sklearn package:
model.score(x,y) # 1.0
model.rank_ # 3
model.predict([[1,2,3]]) # array([26.61111111])
One way to go about this is using numpy.linalg.lstsq:
# Experimental data:
x=np.array([p_0,p_1,p_2,p_3])
y=np.array([35,60,75,65])
A = np.column_stack([x, np.ones(len(x))])
coefs = np.linalg.lstsq(A, y, rcond=None)[0]
print(coefs)
# 0.27777778 0.44444444 0.33333333 24.44444444
Another option is to use LinearRegression from sklearn:
from sklearn.linear_model import LinearRegression
reg = LinearRegression().fit(x, y)
print(reg.coef_, reg.intercept_)
# array([0.27777778, 0.44444444, 0.33333333]), 24.444444444444443
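The lstsq approach above can be wrapped in a small helper that returns the coefficient vector and the intercept separately. This is only a sketch: `fit_linear` is a hypothetical name, not a NumPy function, and `rcond=None` is passed simply to opt into NumPy's current default cutoff.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y ~ X @ a + b; returns (a, b). Hypothetical helper."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the last solved parameter is the intercept b.
    A = np.column_stack([X, np.ones(len(X))])
    params, *_ = np.linalg.lstsq(A, y, rcond=None)
    return params[:-1], params[-1]

# Experimental data from the question.
x = np.array([[10, 10, 10], [100, 10, 10], [10, 100, 10], [10, 10, 100]])
y = np.array([35, 60, 75, 65])

a, b = fit_linear(x, y)
print("a:", a)
print("b:", b)

# Four points and four unknowns, so the fit passes through every point.
residuals = x @ a + b - y
print("exact fit:", np.allclose(residuals, 0))
```

With only four samples and four free parameters the residuals are zero, which is also why the R^2 score reported by sklearn is 1.0 for this data.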
Hello, I am new to sklearn in Python. I am trying to learn it and use it to predict some numbers based on two features. Here is the error I am getting:
ValueError: only 2 non-keyword arguments accepted
And here is my code:
from sklearn.linear_model import LinearRegression
import numpy as np
trainingData = np.array([[861, 16012018], [860, 12012018], [859, 9012018], [858, 5012018], [857, 2012018], [856, 29122017], [855, 26122017], [854, 22122017], [853, 19122017]])
trainingScores = np.array([11,18,23,33,34,6],[10,19,21,33,34,1], [14,18,22,23,31,6],[16,22,29,31,33,10],[21,24,27,30,31,6],[1,14,15,20,27,7],[1,9,10,11,15,8],[2,9,27,31,35,1],[7,13,18,22,33,2])
clf = LinearRegression(fit_intercept=True)
clf.fit(trainingScores,trainingData)
predictionData = np.array([862, 19012018 ])
x=clf.predict(predictionData)
print(x)
I am not sure what you are trying to do here, but change this line:
trainingScores = np.array([11,18,23,33,34,6],[10,19,21,33,34,1], [14,18,22,23,31,6],[16,22,29,31,33,10],[21,24,27,30,31,6],[1,14,15,20,27,7],[1,9,10,11,15,8],[2,9,27,31,35,1],[7,13,18,22,33,2])
to this (notice the extra square brackets around your data, since np.array expects the whole dataset as a single nested list):
trainingScores = np.array([[11,18,23,33,34,6],[10,19,21,33,34,1], [14,18,22,23,31,6],[16,22,29,31,33,10],[21,24,27,30,31,6],[1,14,15,20,27,7],[1,9,10,11,15,8],[2,9,27,31,35,1],[7,13,18,22,33,2]])
Then swap the order of the arguments in fit(), which expects the features first and the targets second:
clf.fit(trainingData,trainingScores)
And finally change prediction data like this (again look at the extra square brackets):
predictionData = np.array([[862, 19012018]])
After that your code will run.
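Putting the three changes together, the corrected script might look like the sketch below. The data is the asker's own; the only additions are the shape printouts, which are there to show that sklearn treats this as a multi-output regression (2 features in, 6 targets out).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Features: one row per sample, shape (9, 2).
trainingData = np.array([[861, 16012018], [860, 12012018], [859, 9012018],
                         [858, 5012018], [857, 2012018], [856, 29122017],
                         [855, 26122017], [854, 22122017], [853, 19122017]])

# Targets: one row per sample, shape (9, 6). Note the outer brackets.
trainingScores = np.array([[11, 18, 23, 33, 34, 6], [10, 19, 21, 33, 34, 1],
                           [14, 18, 22, 23, 31, 6], [16, 22, 29, 31, 33, 10],
                           [21, 24, 27, 30, 31, 6], [1, 14, 15, 20, 27, 7],
                           [1, 9, 10, 11, 15, 8], [2, 9, 27, 31, 35, 1],
                           [7, 13, 18, 22, 33, 2]])

clf = LinearRegression(fit_intercept=True)
clf.fit(trainingData, trainingScores)  # features first, targets second

# predict() expects a 2-D array, even for a single sample.
predictionData = np.array([[862, 19012018]])
predicted = clf.predict(predictionData)

print(predicted.shape)   # (1, 6): one sample, six predicted numbers
print(clf.coef_.shape)   # (6, 2): one row of coefficients per target
```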
You are fitting a linear regression, so the targets have to be passed to np.array as a single nested list. Try changing this line to:
trainingScores = np.array([
    [11,18,23,33,34,6],
    [10,19,21,33,34,1],
    [14,18,22,23,31,6],
    [16,22,29,31,33,10],
    [21,24,27,30,31,6],
    [1,14,15,20,27,7],
    [1,9,10,11,15,8],
    [2,9,27,31,35,1],
    [7,13,18,22,33,2]
])
I'm attempting to run a simple linear regression on a data set and retrieve the coefficients. The data, which comes from a .csv file, looks like:
"","time","LakeHuron"
"1",1875,580.38
"2",1876,581.86
"3",1877,580.97
"4",1878,580.8
...
import pandas as pd
import numpy as np
from sklearn import datasets, linear_model

def Main():
    location = r"~/Documents/Time Series/LakeHuron.csv"
    ts = pd.read_csv(location, sep=",", parse_dates=[0], header=None)
    ts.drop(ts.columns[[0]], axis=1, inplace=True)
    length = len(ts)
    x = ts[1].values
    y = ts[2].values
    x = x.reshape(length, 1)
    y = y.reshape(length, 1)
    regr = linear_model.LinearRegression()
    regr.fit(x, y)
    print(regr.coef_)

if __name__ == "__main__":
    Main()
Since this is a simple linear model, $Y_t = a_0 + a_1 t$, which in this case should be $Y_t = 580.202 - 0.0242t$. However, all that prints when running the above code is [[-0.02420111]]. Is there any way to get the second coefficient, 580.202?
I've had a look at the documentation at http://scikit-learn.org/stable/modules/linear_model.html, and the example there outputs two values in the array.
It looks like you have only one X and one Y, so the output is correct. The intercept is stored separately, in intercept_.
Try this:
#coef_ : array, shape (n_features, ) or (n_targets, n_features)
print(regr.coef_)
#intercept_ : array Independent term in the linear model.
print(regr.intercept_)
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression
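As a small illustration of where the two numbers end up, here is a sketch on made-up data (not the Lake Huron series): the values are generated from a known line with intercept 2 and slope 3, so the fitted attributes can be checked by eye. As in the question, y is reshaped to a column, which is why coef_ comes back as a 2-D array and intercept_ as a 1-D array.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data on an exact line: y = 2 + 3 * t.
t = np.arange(10, dtype=float).reshape(-1, 1)  # shape (10, 1)
y = 2.0 + 3.0 * t                              # shape (10, 1), as in the question

regr = LinearRegression()
regr.fit(t, y)

# Slope a_1: shape (n_targets, n_features) because y is 2-D.
print(regr.coef_)
# Intercept a_0: shape (n_targets,).
print(regr.intercept_)

# Pull both out as plain floats.
slope = float(regr.coef_[0, 0])
intercept = float(regr.intercept_[0])
print(f"Y_t = {intercept:.3f} + {slope:.3f} * t")
```

If y is left as a 1-D array instead, coef_ has shape (n_features,) and intercept_ is a plain float, so the indexing above would need to change accordingly.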