Using isnull() in a pandas data frame to check whether a particular value is null
I am editing my previous question as it was flawed. I have a data frame named df. Its columns contain some negative values, zeros, and NaN. I want to replace these values and store a corresponding flag in another data frame at the same index.
df = pd.read_excel('Check.xlsx')
df_ph_temp = df.iloc[:,2:5]
df_flags = pd.DataFrame(index=df.index, columns=df.columns)
flag_ph_temp = df_flags.iloc[:,2:5]
for rowIndex, row in df_ph_temp.iterrows() :
    for colIndex, value in row.items() :
        if value == 0 :
            df_ph_temp.loc[rowIndex, colIndex] = df_ph_temp.loc[rowIndex - 1, colIndex]
            flag_ph_temp.loc[rowIndex, colIndex] = 1
        elif value < 0 :
            df_ph_temp.loc[rowIndex, colIndex] = 0
            flag_ph_temp.loc[rowIndex, colIndex] = 1
        elif value > 200 :
            df_ph_temp.loc[rowIndex, colIndex] = 130
            flag_ph_temp.loc[rowIndex, colIndex] = 2
        elif value == np.nan : # Not working... Why?
            df_ph_temp.loc[rowIndex, colIndex] = df_ph_temp.loc[rowIndex - 1, colIndex]
            flag_ph_temp.loc[rowIndex, colIndex] = 1
        else :
            continue
I am not getting any errors, but I am also not getting the desired output. The part of the program that replaces NaN values and stores the respective flag values in the flags data frame is not working. I think this is because the data contains more than 2 consecutive rows with NaN values. Is there a way to fix this? I tried
df_ph_temp[colIndex].fillna(method ='ffill', inplace = True)
before the if condition, but I still could not get the desired results.
I am unable to figure it out. Kindly help.
Solution 1:[1]
With pandas, you should avoid loops. Use boolean masks and slicing to fill your flag column. The comparison value == np.nan never succeeds because NaN is not equal to anything, including itself, so that branch is never reached. To detect null values, call .isnull() directly on a pandas dataframe or series (for example, a selected column) rather than comparing individual values. Then use .fillna() if you want to replace null values with something else.
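To make the NaN comparison issue concrete, here is a minimal sketch. The sample Series is hypothetical and only serves to illustrate the behaviour; it is not taken from the asker's data.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is defined to be unequal to everything, including itself,
# so an equality test can never detect it.
print(value == np.nan)   # False
print(value != value)    # True

# pd.isna (alias: pd.isnull) works on scalars as well as on Series/DataFrames.
print(pd.isna(value))    # True

# Hypothetical sample column with one missing value.
s = pd.Series([1.0, np.nan, -3.0])
nan_mask = s.isnull()
print(nan_mask.tolist())  # [False, True, False]
```

If the loop from the question were kept, replacing the failing test with pd.isna(value) would be the smallest change, although the mask-based approach avoids the loop entirely.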
Based on your code, the solution may look as follows. I am not sure it will work as-is; it would help if you shared some input data and the expected output.
First create an empty column as you did:
data['Flags'] = None
Then fill this column based on conditions on the "Temperature phase" column. Using fillna(0) to treat all null values as 0 lets you test only whether values are <= 0; this replacement is temporary and is not applied to the final dataframe:
data.loc[data['Temperature phase'].fillna(0) <= 0, "Flags"] = 1
data.loc[data['Temperature phase'] > 200, "Flags"] = 2
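As a rough illustration of the two flag assignments above, the following sketch runs them on a small made-up frame. The values are hypothetical; only the column names "Temperature phase" and "Flags" come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real values come from the asker's file.
data = pd.DataFrame({"Temperature phase": [25.0, 0.0, -4.0, np.nan, 250.0]})
data["Flags"] = None

col = data["Temperature phase"]

# NaN compares False against any number, so it is filled with 0
# for this test only; the stored column is left unchanged.
data.loc[col.fillna(0) <= 0, "Flags"] = 1
data.loc[col > 200, "Flags"] = 2

# Rows 1 to 3 (zero, negative, NaN) receive flag 1, row 4 receives flag 2,
# and row 0 keeps its empty flag.
print(data)
```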
Now replace the Temperature phase values.
For values equal to 0 or null, you seem to have chosen to replace them with the previous value in the dataframe. You may be able to achieve this part as follows.
data.loc[data['Temperature phase'].isnull(), 'Temperature phase'] = data['Temperature phase'].shift().loc[data.loc[data['Temperature phase'].isnull()].index]
This command first uses .shift() to move every value in the Temperature phase column down by one row, then selects the rows where Temperature phase is null and replaces each with the shifted value at the same index. As written, the filter only matches null values, not zeros.
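The question mentions several NaN rows in a row, which is where a one-row shift falls short. The sketch below compares it with ffill() on a hypothetical column; ffill() is offered here as a possible alternative, not as part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two consecutive missing values.
s = pd.Series([20.0, np.nan, np.nan, 23.0])
mask = s.isnull()

# shift() looks exactly one row back, so the second NaN
# receives the first NaN and stays missing.
shifted = s.copy()
shifted.loc[mask] = s.shift().loc[mask]
print(shifted.tolist())  # [20.0, 20.0, nan, 23.0]

# ffill() carries the last valid value across the whole gap.
filled = s.ffill()
print(filled.tolist())   # [20.0, 20.0, 20.0, 23.0]
```

A leading NaN has no earlier value to copy, so it stays missing with either approach.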
Finally, replace the other Temperature phase values:
data.loc[data['Temperature phase'] < 0, "Temperature phase"] = 0
data.loc[data['Temperature phase'] > 200, "Temperature phase"] = 130
You don't need a separate flag data frame or index, since the flags are written directly into the final dataframe.
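If the flags do need to live in a separate frame covering several columns, as in the question, the same idea can be applied to the whole frame at once. This is a sketch under assumptions: the column names ph1 and ph2 and all values are made up, the flag value 0 for untouched cells is a choice made here, and the replacement rules are the ones read from the question's loop.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df.iloc[:, 2:5] from the question.
df_ph_temp = pd.DataFrame({
    "ph1": [25.0, 0.0, np.nan, np.nan, 30.0],
    "ph2": [10.0, -5.0, 250.0, 40.0, np.nan],
})

# Build every mask from the original values before changing anything.
missing_or_zero = df_ph_temp.isnull() | (df_ph_temp == 0)
negative = df_ph_temp < 0
too_high = df_ph_temp > 200

# Flags: 0 means "untouched" (an assumption of this sketch).
flags = pd.DataFrame(0, index=df_ph_temp.index, columns=df_ph_temp.columns)
flags = flags.mask(missing_or_zero | negative, 1).mask(too_high, 2)

# Clamp out-of-range values first, then blank out zeros/NaN and
# carry the last valid (already cleaned) value forward.
cleaned = df_ph_temp.mask(negative, 0).mask(too_high, 130)
cleaned = cleaned.mask(missing_or_zero).ffill()

print(cleaned)
print(flags)
```

Because the masks are computed up front, a negative value clamped to 0 is not mistaken for an original zero. A zero or NaN in the very first row would remain NaN, since there is no earlier value to copy.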
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Léo Beaucourt |
