Polynomial Regression Curves in Python
I'm trying to fit a degree-2 polynomial regression curve to my data. When I plot it, I get a zigzag line instead of a smooth curve,
but I want to model my data as an actual curve, which would look like a connected version of the scatter plot.
Any advice or better ways of doing this?
degree = 2
p = np.poly1d(np.polyfit(data['input'],y, degree))
plt.plot(data['input'], p(data['input']), c='r',linestyle='-')
plt.scatter(data['input'], p(data['input']), c='b')
Here, data['input'] is a column vector with the same dimensions as y.
Edit: I have also tried it like this:
X, y = np.array(data['input']).reshape(-1,1), np.array(data['output'])
lin_reg=LinearRegression(fit_intercept=False)
lin_reg.fit(X,y)
poly_reg=PolynomialFeatures(degree=2)
X_poly=poly_reg.fit_transform(X)
poly_reg.fit(X_poly,y)
lin_reg2=LinearRegression(fit_intercept=False)
lin_reg2.fit(X_poly,y)
X_grid=np.arange(min(X),max(X),0.1)
X_grid=X_grid.reshape((len(X_grid),1))
plt.scatter(X,y,color='red')
plt.plot(X,lin_reg2.predict(poly_reg.fit_transform(X)),color='blue')
plt.show()
Which gives me this graph here.
The scatter is my data and the blue zigzag is what is SUPPOSED to be a quadratic curve modelling the data. Help?
Solution 1:[1]
In your plot, you are just connecting one point to the next with straight lines (where each y value is the approximated y from your polyfit function).
I would skip the polyfit function (because you already have all the y values you are interested in) and instead interpolate data['input'] and y with the B-spline function make_interp_spline from SciPy, then plot the new y values over the range of x you are interested in.
import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as interp
# plots just from point to point (zigzag)
x = np.array([1, 2, 3, 4])
y = np.array([75, 0, 25, 100])
plt.plot(x, y)
# interpolates the points
x_new = np.linspace(1, 4, 300)
a_BSpline = interp.make_interp_spline(x, y)
y_new = a_BSpline(x_new)
plt.plot(x_new, y_new)
Try this and then adjust with your data! :)
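If the goal is still a fitted quadratic rather than a spline through every point, the zigzag can also be avoided by evaluating the fitted polynomial on a sorted, evenly spaced grid instead of on the original inputs. The sketch below assumes the zigzag comes from data['input'] not being sorted, so that plt.plot connects the points in data order. The helper names fit_poly_curve and plot_fit and the sample data are hypothetical, made up for illustration.

```python
import numpy as np


def fit_poly_curve(x, y, degree=2, n_points=200):
    """Fit a polynomial and evaluate it on a sorted, evenly spaced grid.

    Hypothetical helper for illustration; x and y are 1-D sequences.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    p = np.poly1d(np.polyfit(x, y, degree))
    # linspace returns increasing values, so the line is drawn left to right
    x_grid = np.linspace(x.min(), x.max(), n_points)
    return x_grid, p(x_grid), p


def plot_fit(x, y, x_grid, y_grid):
    """Scatter the raw data and draw the fitted curve on top."""
    import matplotlib.pyplot as plt  # imported here so the fit itself needs only NumPy
    plt.scatter(x, y, c='b')
    plt.plot(x_grid, y_grid, c='r', linestyle='-')
    plt.show()


# Made-up, deliberately unsorted sample data lying exactly on y = x**2 + 1
x_demo = np.array([3.0, 0.0, 4.0, 1.0, 2.0])
y_demo = x_demo ** 2 + 1
x_grid, y_grid, p = fit_poly_curve(x_demo, y_demo)
# plot_fit(x_demo, y_demo, x_grid, y_grid)  # uncomment to draw the figure
```

To keep plotting at the original inputs instead, sorting them first should have the same effect: compute order = np.argsort(x), then plot x[order] against p(x)[order].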
Solution 2:[2]
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
# raise the polynomial degree to 3 for a closer fit
p_reg = PolynomialFeatures(degree = 3)
X_poly = p_reg.fit_transform(X)
# create a new linear regression object and fit it on the polynomial features
reg2 = LinearRegression()
reg2.fit(X_poly,y)
plt.scatter(X, y, color = 'b')
plt.xlabel('Level')
plt.ylabel('Salary')
plt.title("Truth or Bluff")
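# predicted values, drawn in order of increasing X so the line does not zigzag
# (assumes X is a NumPy array of shape (n_samples, 1))
order = np.argsort(X[:, 0])
plt.plot(X[order], reg2.predict(X_poly)[order], color='r')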
plt.show()
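The same idea carries over to the scikit-learn version: a curve drawn at the original X will zigzag whenever X is not sorted. Here is a minimal sketch, assuming X holds a single feature, that fits on the data and then predicts on a sorted grid. The function name predict_on_grid and the sample data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def predict_on_grid(X, y, degree=2, n_points=200):
    """Fit a polynomial regression and predict on a sorted grid.

    Hypothetical helper for illustration; X holds a single feature.
    """
    X = np.asarray(X, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).ravel()
    # The pipeline expands the features and fits the linear model in one step
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X, y)
    # Evenly spaced, increasing inputs, reshaped to (n_points, 1) for predict
    X_grid = np.linspace(X.min(), X.max(), n_points).reshape(-1, 1)
    return X_grid, model.predict(X_grid), model


# Made-up, deliberately unsorted sample data lying exactly on y = 2x**2 - x + 3
X_demo = np.array([3.0, 0.0, 4.0, 1.0, 2.0]).reshape(-1, 1)
y_demo = 2 * X_demo[:, 0] ** 2 - X_demo[:, 0] + 3
X_grid, y_grid, model = predict_on_grid(X_demo, y_demo)

# To draw it (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.scatter(X_demo, y_demo, color='b')
# plt.plot(X_grid, y_grid, color='r')
# plt.show()
```

Because the pipeline applies the polynomial transform itself, the grid can be passed to predict directly, with no separate fit_transform call at plotting time.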
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | miGa77 |
| Solution 2 | Ravi kumar |
