Traces on Polynomial Regression

Hello, I'm having trouble trying to predict the Weekly Sales based on the fuel price using polynomial regression. I saw someone else ask the same question and tried the only answer, but I still can't get a good graph. Here's what I've done:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

df = pd.read_csv (r'Walmart.csv')
df = df.sort_values(by=['Weekly_Sales'])

y = df.loc[:, "Fuel_Price"].sample(n = 50, random_state= 6)
x = df.loc[:, "Weekly_Sales"].sample(n = 50, random_state= 6)

poly = PolynomialFeatures(degree=2)

X_poly = poly.fit_transform(x.values.reshape(-1,1))
poly.fit(X_poly,y)
linreg = LinearRegression()
linreg.fit(X_poly,y)
y_pred = linreg.predict(X_poly)
plt.scatter(x, y, color='red')
plt.plot(x,y_pred, color = 'blue')
plt.show()

Result: Graph



Solution 1:[1]

Your main problem is that the x values are no longer in order after you randomly sample them from df, so plt.plot connects the points in sample order and the line zigzags back and forth. Replace the x and y sampling lines with

...
xy = df.sample(n=50, random_state=6).sort_values(by=['Weekly_Sales'])
y = xy["Fuel_Price"]
x = xy["Weekly_Sales"]
...

and it should work. E.g., for some made-up data:

[image: example plot for made-up data]
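For anyone without the CSV at hand, here is a minimal self-contained sketch of the same fix. The DataFrame is synthetic (the values and the relationship between the two columns are made up purely for illustration, not taken from Walmart.csv); only the column names match the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for Walmart.csv (made-up values, illustration only)
rng = np.random.default_rng(0)
sales = rng.uniform(2e5, 2e6, size=200)
fuel = 2.5 + 1e-6 * sales + rng.normal(0, 0.2, size=200)
df = pd.DataFrame({"Weekly_Sales": sales, "Fuel_Price": fuel})

# Sample once, then sort the sample by the column that goes on the x-axis
xy = df.sample(n=50, random_state=6).sort_values(by="Weekly_Sales")
x = xy["Weekly_Sales"]
y = xy["Fuel_Price"]

# Same degree-2 fit as in the question
X_poly = PolynomialFeatures(degree=2).fit_transform(x.to_numpy().reshape(-1, 1))
y_pred = LinearRegression().fit(X_poly, y).predict(X_poly)

# x is increasing now, so a line plot of (x, y_pred) runs left to right
print(x.is_monotonic_increasing)
```

Because x and y are taken from the same sorted sample, the rows stay paired, and the plt.scatter / plt.plot calls from the question can be used unchanged.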

Alternatively, you can plot the blue line as a scatter, in which case it does not matter whether the x values are in order:

...
plt.plot(x, y_pred, '.', color='blue')
...

and it would look like this:

[image: predictions plotted as blue dots]
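A third option, not part of the answer above, is to leave the sample unsorted and evaluate the fitted model on an evenly spaced grid of x values. The sketch below uses made-up data and hypothetical names (x_grid, y_grid); the grid is sorted by construction, so a line plot through it is smooth.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Made-up, deliberately unsorted data (illustration only)
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 0.5 * x**2 - 3 * x + 2 + rng.normal(0, 1, size=50)

# Fit a degree-2 polynomial on the unsorted sample
poly = PolynomialFeatures(degree=2)
linreg = LinearRegression().fit(poly.fit_transform(x.reshape(-1, 1)), y)

# Evaluate the fitted curve on a sorted, evenly spaced grid spanning the data
x_grid = np.linspace(x.min(), x.max(), 200)
y_grid = linreg.predict(poly.transform(x_grid.reshape(-1, 1)))

# To draw it:
# plt.scatter(x, y, color='red')
# plt.plot(x_grid, y_grid, color='blue')
```

Note that poly.transform (not fit_transform) is used on the grid, so the grid goes through the same feature expansion that was fitted on the data.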

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 piterbarg