Predicting values using a dataframe and model.predict()
I have this simple dataframe:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
d = {'x': [7,8, 10,15], 'y': [15,17,20,24], 'z': [15,np.nan,20,np.nan]}
df = pd.DataFrame(data=d)
df
with which I set up this simple model:
df_m=df.dropna()
X = df_m.loc[:, df_m.columns != 'z']
y=df_m['z']
X_train, X_test, y_train, y_test = train_test_split(X, y)
LR=LinearRegression()
LR.fit(X_train,y_train)
LR.predict(X_test)
Now I want to write a function that goes through the dataframe and replaces each NaN in column z with the value predicted by the model:
def fill_z(df,LR):
    for i, row in df.iterrows():
        if pd.isnull(row['z']):
            print(row['x'],row['y'])
            df.at(i,'z') = LR.predict(row['x'],row['y'])
I get an error message:
File "<ipython-input-243-7de7d76520a1>", line 24
df.at(i,'z')=LR.predict(row['x'],row['y'])
^
SyntaxError: can't assign to function call
Solution 1:[1]
There is no need to iterate over the rows to find each NaN and predict one value per iteration.
Instead, select the rows where z is missing, pass their features to the model in a single predict() call, and assign the predictions back to those rows.
Then update the original dataframe with the filled rows.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
d = {'x': [7,8, 10,15], 'y': [15,17,20,24], 'z': [15,np.nan,20,np.nan]}
df = pd.DataFrame(data=d)
print(df)
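# Fit the model on the rows where z is known
df_m = df.dropna()
X = df_m.loc[:, df_m.columns != 'z']
y = df_m['z']
X_train, X_test, y_train, y_test = train_test_split(X, y)
LR = LinearRegression()
LR.fit(X_train, y_train)
# Select the rows where z is missing; copy to avoid a SettingWithCopyWarning
to_predict = df[df['z'].isna()].copy()
print('Predict:')
print(to_predict)
print('\nGet Predictions:')
# Predict all missing rows in one call, using the same feature columns as in training
predictions = LR.predict(to_predict[['x', 'y']])
print(predictions)
print('\nMerge it back:')
to_predict['z'] = predictions
print(to_predict)
print('\nUpdate df:')
# update() aligns on the index and overwrites the matching cells of df in place
df.update(to_predict)
print(df)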
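Output (predictions depend on the random split: with only two complete rows, the model is fit on a single row and predicts that row's z for every input, 15 in this run):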
x y z
0 7 15 15.0
1 8 17 NaN
2 10 20 20.0
3 15 24 NaN
Predict:
x y z
1 8 17 NaN
3 15 24 NaN
Get Predictions:
[15. 15.]
Merge it back:
x y z
1 8 17 15.0
3 15 24 15.0
Update df:
x y z
0 7.0 15.0 15.0
1 8.0 17.0 15.0
2 10.0 20.0 20.0
3 15.0 24.0 15.0
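As for the SyntaxError in the question: df.at is an indexer, so it takes square brackets (df.at[i, 'z']), and predict() expects a 2D input with one column per feature rather than two separate scalars. The following is a minimal sketch of both the repaired loop and a one-call version, not a definitive implementation. The name fill_z_loop and the choice to fit on every row where z is known (instead of using train_test_split) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def fill_z_loop(df, model):
    """Row-by-row version of the question's function with both fixes applied."""
    out = df.copy()
    for i, row in df.iterrows():
        if pd.isnull(row['z']):
            # predict() expects a 2D input: one row, one column per feature
            features = pd.DataFrame({'x': [row['x']], 'y': [row['y']]})
            # .at is an indexer, so it uses square brackets, not parentheses
            out.at[i, 'z'] = model.predict(features)[0]
    return out


def fill_z(df, model):
    """Vectorized version: predict every missing row in a single call."""
    out = df.copy()
    missing = out['z'].isna()
    if missing.any():
        out.loc[missing, 'z'] = model.predict(out.loc[missing, ['x', 'y']])
    return out


d = {'x': [7, 8, 10, 15], 'y': [15, 17, 20, 24], 'z': [15, np.nan, 20, np.nan]}
df = pd.DataFrame(data=d)

# Assumption: train on all rows where z is known rather than on a random split
known = df.dropna()
LR = LinearRegression().fit(known[['x', 'y']], known['z'])

filled = fill_z(df, LR)
print(filled)
```

Both functions return a filled copy and leave the input dataframe unchanged. Because the assignment through .loc only touches column z, x and y keep their integer dtype, unlike the update() output above.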
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
