Polynomial Regression Curves in Python

I'm trying to fit a degree-2 regression curve to my data. When I plot it, I get a funny zigzag thing, but I want to model my data as an actual curve, which would look like a connected version of the scatter plot.
Any advice or better ways of doing this?

degree = 2
p = np.poly1d(np.polyfit(data['input'],y, degree))
plt.plot(data['input'], p(data['input']), c='r',linestyle='-')
plt.scatter(data['input'], p(data['input']), c='b')

Here, data['input'] is a column vector with the same dimensions as y.

Edit: I have also tried it like this:

X, y = np.array(data['input']).reshape(-1,1), np.array(data['output'])
lin_reg=LinearRegression(fit_intercept=False)
lin_reg.fit(X,y)

poly_reg=PolynomialFeatures(degree=2)
X_poly=poly_reg.fit_transform(X)
poly_reg.fit(X_poly,y)
lin_reg2=LinearRegression(fit_intercept=False)
lin_reg2.fit(X_poly,y)

X_grid=np.arange(min(X),max(X),0.1)
X_grid=X_grid.reshape((len(X_grid),1))
plt.scatter(X,y,color='red')
plt.plot(X,lin_reg2.predict(poly_reg.fit_transform(X)),color='blue')
plt.show()

This gives me a graph where the scatter is my data and the blue zigzag is what is SUPPOSED to be a quadratic curve modelling the data. Help?



Solution 1:[1]

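Your plot connects the points with straight lines, in the order they appear in data['input'] (each y value is the approximated y from your polyfit function). If the x values are not sorted, the line jumps back and forth, which gives the zigzag.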
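The point about straight lines between consecutive points suggests one possible fix that keeps the polyfit model: evaluate the fitted polynomial on a sorted, evenly spaced grid and plot that instead of the raw inputs. The following is only a minimal sketch under assumptions: x_raw and y_raw are made-up data standing in for data['input'] and y, and plot_fit is a hypothetical helper name.

```python
import numpy as np

# Hypothetical, unsorted sample data standing in for data['input'] and y.
rng = np.random.default_rng(0)
x_raw = rng.uniform(0, 10, size=50)
y_raw = 0.5 * x_raw**2 - 2.0 * x_raw + 3.0 + rng.normal(0, 1.0, size=50)

# Same fit as in the question: a degree-2 polynomial via np.polyfit.
degree = 2
p = np.poly1d(np.polyfit(x_raw, y_raw, degree))

# Evaluate the fitted polynomial on a sorted, dense grid instead of on
# the raw (unsorted) inputs, so consecutive points run left to right.
x_grid = np.linspace(x_raw.min(), x_raw.max(), 200)
y_grid = p(x_grid)


def plot_fit():
    """Draw the data and the fitted curve (requires matplotlib)."""
    import matplotlib.pyplot as plt

    plt.scatter(x_raw, y_raw, c='b')
    plt.plot(x_grid, y_grid, c='r', linestyle='-')
    plt.show()
```

Calling plot_fit() should draw a smooth parabola through the scatter. An alternative with the same effect is to sort the original inputs first, for example with order = np.argsort(x_raw), and plot x_raw[order] against p(x_raw[order]).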

I would skip the polyfit function (you already have all the y values you are interested in) and instead interpolate data['input'] and y with the B-spline function make_interp_spline from scipy.interpolate, then plot the new y values over the range of x you are interested in.

import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as interp

# plots just from point to point (zigzag)

x = np.array([1, 2, 3, 4])
y = np.array([75, 0, 25, 100])
plt.plot(x, y)

# interpolates the points

x_new = np.linspace(1, 4, 300)
a_BSpline = interp.make_interp_spline(x, y)
y_new = a_BSpline(x_new)
plt.plot(x_new, y_new)
plt.show()

Try this and then adjust with your data! :)
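When adjusting this to real data, one thing to watch is that make_interp_spline expects the x values in increasing order, while a data column is often unsorted. A minimal sketch of that adjustment, assuming made-up arrays x_raw and y_raw with distinct x values (both names are hypothetical):

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Hypothetical unsorted measurements; the x values are distinct.
x_raw = np.array([3.0, 1.0, 4.0, 2.0, 5.0])
y_raw = np.array([25.0, 75.0, 100.0, 0.0, 60.0])

# Sort x into increasing order and reorder y the same way.
order = np.argsort(x_raw)
x_sorted, y_sorted = x_raw[order], y_raw[order]

# Build the interpolating B-spline (cubic, k=3, by default) and
# evaluate it on a dense grid for plotting.
spline = make_interp_spline(x_sorted, y_sorted)
x_new = np.linspace(x_sorted[0], x_sorted[-1], 300)
y_new = spline(x_new)
```

Note that an interpolating spline passes through every data point, so on noisy data it follows the noise rather than an overall trend. That differs from a regression fit such as polyfit, which approximates the trend without hitting each point.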

Solution 2:[2]

import numpy as np
import matplotlib.pyplot as plt

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# X (shape (n, 1)) and y are assumed to be the arrays defined in the question

# raise the degree to 3
p_reg = PolynomialFeatures(degree = 3)
X_poly = p_reg.fit_transform(X)

# create a new linear regression object and fit it on the polynomial features
reg2 = LinearRegression()
reg2.fit(X_poly,y)
plt.scatter(X, y, color = 'b')
plt.xlabel('Level')
plt.ylabel('Salary')
plt.title("Truth or Bluff")

# predicted values on a sorted grid, so the line is drawn left to right
X_grid = np.linspace(X.min(), X.max(), 200).reshape(-1, 1)
plt.plot(X_grid, reg2.predict(p_reg.transform(X_grid)), color='r')
plt.show()

[Figure: fit with degree 3]

[Figure: fit with degree 4]
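The degree 3 and degree 4 fits can also be compared numerically rather than only by eye. The sketch below is an illustration under assumptions, not the answerer's code: it uses np.polyfit instead of scikit-learn, the data are made up, and the names curves and rss are hypothetical.

```python
import numpy as np

# Hypothetical noisy data following a quadratic trend.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(1, 10, size=40))
y = 2.0 * x**2 - 5.0 * x + rng.normal(0, 3.0, size=40)

# Dense, sorted grid for drawing each fitted curve smoothly.
x_grid = np.linspace(x.min(), x.max(), 200)

curves = {}  # degree -> fitted values on x_grid, ready for plt.plot
rss = {}     # degree -> residual sum of squares on the data
for degree in (2, 3, 4):
    p = np.poly1d(np.polyfit(x, y, degree))
    curves[degree] = p(x_grid)
    rss[degree] = float(np.sum((y - p(x)) ** 2))
```

Each entry of curves can be drawn with plt.plot(x_grid, curves[degree]). The residual sum of squares on the fitted data cannot increase as the degree goes up, so a lower value alone does not show that a higher degree is the better model.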

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: miGa77
[2] Solution 2: Ravi kumar