Traces on Polynomial Regression
Hello, I'm having trouble trying to predict the Weekly Sales based on the fuel price using polynomial regression. I saw someone else ask the same question and tried the only answer, but I still can't get a good graph. Here's what I've done:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
df = pd.read_csv (r'Walmart.csv')
df = df.sort_values(by=['Weekly_Sales'])
y = df.loc[:, "Fuel_Price"].sample(n = 50, random_state= 6)
x = df.loc[:, "Weekly_Sales"].sample(n = 50, random_state= 6)
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(x.values.reshape(-1,1))
poly.fit(X_poly,y)
linreg = LinearRegression()
linreg.fit(X_poly,y)
y_pred = linreg.predict(X_poly)
plt.scatter(x, y, color='red')
plt.plot(x,y_pred, color = 'blue')
plt.show()
Result: Graph
Solution 1:[1]
Your main problem is that the x values are no longer in order after you randomly sample them from df, so plt.plot connects the points in sample order and the line zigzags back and forth. Replace the x and y sampling lines with
...
xy = df.sample(n = 50, random_state= 6).sort_values(by=['Weekly_Sales'])
y = xy["Fuel_Price"]
x = xy["Weekly_Sales"]
...
and it should work, since the line is then drawn through the points in increasing x order.
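To try the fix without the original file, here is a minimal sketch on made-up data. The synthetic `sales` and `fuel` values, their ranges, and the quadratic relationship between them are all assumptions for illustration; only the column names and the sample-then-sort step come from the question and answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Made-up data standing in for Walmart.csv (all values are synthetic).
rng = np.random.default_rng(0)
sales = rng.uniform(0.2, 2.0, size=200)  # hypothetical Weekly_Sales values
fuel = 3.0 + 0.5 * (sales - 1.0) ** 2 + rng.normal(0.0, 0.05, size=200)
df = pd.DataFrame({"Weekly_Sales": sales, "Fuel_Price": fuel})

# Sample whole rows once so x and y stay paired, then sort the sample
# by the column that goes on the x axis.
xy = df.sample(n=50, random_state=6).sort_values(by="Weekly_Sales")
x = xy["Weekly_Sales"]
y = xy["Fuel_Price"]

# Degree-2 polynomial features followed by an ordinary linear fit,
# as in the question.
X_poly = PolynomialFeatures(degree=2).fit_transform(x.to_numpy().reshape(-1, 1))
y_pred = LinearRegression().fit(X_poly, y).predict(X_poly)

if __name__ == "__main__":
    plt.scatter(x, y, color="red")
    plt.plot(x, y_pred, color="blue")  # x is increasing, so the line is a single curve
    plt.show()
```

Sampling the DataFrame rather than each column separately also guarantees that every x stays paired with its own y after sorting.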
Alternatively, you can plot the fitted values as a scatter instead of a line, in which case it does not matter whether the x values are in order:
...
plt.plot(x,y_pred ,'.', color = 'blue')
...
The fitted values then show up as separate blue dots instead of a connected line.
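If x and y have already been sampled and you would rather not change those lines, the two plotting options can be compared side by side. This is a sketch on synthetic data, not the Walmart file; the use of `np.argsort` to reorder at plot time is a variation on the answer's sorting fix rather than something the answer itself proposes.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, deliberately unsorted x values (not the Walmart data).
rng = np.random.default_rng(1)
x = rng.uniform(0.2, 2.0, size=50)
y = 3.0 + 0.5 * (x - 1.0) ** 2 + rng.normal(0.0, 0.05, size=50)

# The fit itself does not depend on the order of the rows.
X_poly = PolynomialFeatures(degree=2).fit_transform(x.reshape(-1, 1))
y_pred = LinearRegression().fit(X_poly, y).predict(X_poly)

# Indices that would sort x; applying them to both arrays keeps pairs aligned.
order = np.argsort(x)

if __name__ == "__main__":
    plt.scatter(x, y, color="red")
    # Option 1: markers only, so point order is irrelevant.
    plt.plot(x, y_pred, ".", color="blue")
    # Option 2: a connected line, which needs x in increasing order.
    plt.plot(x[order], y_pred[order], color="blue")
    plt.show()
```

Only the connected line depends on point order; the regression and the marker plot give the same result however the rows are arranged.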
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | piterbarg |


