'Strange `ErrorType` error referencing columns

Given the following pandas Dataframe df -

Reporting Group Entity/Grouping Entity ID Adjusted Value (Today, No Div, USD) Adjusted TWR (Current Quarter, No Div, USD) Adjusted TWR (YTD, No Div, USD) Annualized Adjusted TWR (Since Inception, No Div, USD) Adjusted Value (No Div, USD)
Barrack Family William and Rupert Trust 9957007 -1.44 -1.44
Barrack Family Cash - -1.44 -1.44
Barrack Family Gratia Holdings No. 2 LLC 8413655 55491732.66 -0.971018847 -0.971018847 11.52490309 55491732.66
Barrack Family Investment Grade Fixed Income - 18469768.6 18469768.6
Barrack Family High Yield Fixed Income - 3668982.44 -0.205356545 -0.205356545 4.441190127 3668982.44

I am trying to only keep rows that where Entity/Grouping column values == Cash AND where any of the following column values are NaN -

  • Adjusted TWR (Current Quarter, No Div, USD)
  • Adjusted TWR (YTD, No Div, USD)
  • Annualized Adjusted TWR (Since Inception, No Div, USD)

Current code: the following function takes df and tries to perform the above said and return, ready for use in another function -

def twr_exceptions_logic():
    df = databases_creation()

    df  = df.loc[(df['Entity/Grouping']!= 'Cash')]
    df  = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
    return df

Issue/Error: currently, I am getting a non-descript TypeError, which references the columns I am referencing:

KeyError: ('Adjusted TWR (Current Quarter, No Div, USD)', 'Adjusted TWR (YTD, No Div, USD)', 'Annualized Adjusted TWR (Since Inception, No Div, USD)')

Help: Am I missing something obvious? Do I have an obvious syntax error? Any hints/tips would be warmly received.



Solution 1:[1]

  1. You need one more square bracket.
  2. You can use any method to make a proper condition.

Code:

import pandas as pd

def databases_creation():
    import numpy as np
    return pd.DataFrame({'Reporting Group': {0: 'Barrack Family', 1: 'Barrack Family', 2: 'Barrack Family', 3: 'Barrack Family', 4: 'Barrack Family'}, 'Entity/Grouping': {0: 'William and Rupert Trust', 1: 'Cash', 2: 'Gratia Holdings No. 2 LLC', 3: 'Investment Grade Fixed Income', 4: 'High Yield Fixed Income'}, 'Entity ID': {0: '9957007', 1: '-', 2: '8413655', 3: '-', 4: '-'}, 'Adjusted Value (Today, No Div, USD)': {0: -1.44, 1: -1.44, 2: 55491732.66, 3: 18469768.6, 4: 3668982.44}, 'Adjusted TWR (Current Quarter, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: -0.971018847, 3: np.NaN, 4: -0.205356545}, 'Adjusted TWR (YTD, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: -0.971018847, 3: np.NaN, 4: -0.205356545}, 'Annualized Adjusted TWR (Since Inception, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: 11.52490309, 3: np.NaN, 4: 4.441190127}, 'Adjusted Value (No Div, USD)': {0: -1.44, 1: -1.44, 2: 55491732.66, 3: 18469768.6, 4: 3668982.44}})

def twr_exceptions_logic():
    df = databases_creation()

    # df  = df.loc[(df['Entity/Grouping']!= 'Cash')]
    # df  = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)',
    #                           'Adjusted TWR (YTD, No Div, USD)',
    #                           'Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]

    mask = (df['Entity/Grouping']!= 'Cash') & (df[['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)']].isnull().any(axis=1))
    df  = df[mask]

    return df

df = twr_exceptions_logic()

print(df)

Output:

Reporting Group Entity/Grouping Entity ID Adjusted Value (Today, No Div, USD) Adjusted TWR (Current Quarter, No Div, USD) Adjusted TWR (YTD, No Div, USD) Annualized Adjusted TWR (Since Inception, No Div, USD) Adjusted Value (No Div, USD)
Barrack Family William and Rupert Trust 9957007 -1.44 nan nan nan -1.44
Barrack Family Investment Grade Fixed Income - 1.84698e+07 nan nan nan 1.84698e+07

Solution 2:[2]

You need a double square bracket when you are selecting a list:

df[['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)']].isnull()

This solves the KeyError. However, using this would give you a new error, as this would return you a 2D matrix to pass to df.loc as a parameter, which is an invalid parameter type. Instead, you could filter them one by one.

def twr_exceptions_logic():
    df = databases_creation()

    df  = df.loc[(df['Entity/Grouping']!= 'Cash')]
    df  = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)'].isnull())]
    df  = df.loc[(df['Adjusted TWR (YTD, No Div, USD)'].isnull())]
    df  = df.loc[(df['Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
    return df

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 quasi-human
Solution 2