How do I create a pandas column using a different function depending on what month it is in?

I have been given a JSON file that has information about flight delays from seven different airports. I have saved this to a pandas DataFrame called flights. The data doesn't accurately show how many flights were delayed by weather, so I have been assigned to recalculate that information. For the months April through August it is calculated differently than for the rest of the months. I first tried a lambda with if flights["month"] in delay_40. Second, I tried np.where without using an in statement, then np.select using dot notation instead of bracket notation. Each implementation has given me the same error message: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Just a heads up, the indentation here is to improve readability. I understand that you can't have indentation in a lambda statement, and I don't know if it affects np.select or np.where.

delay_40 = ["April", "May", "June", "July", "August"]

weather_delay_total = flights

weather_delay_total = flights.assign(
    improved_delays_weather = lambda row: 
    (round(row["num_of_delays_weather"] + (.3 * row["num_of_delays_late_aircraft"]) + (.4 * row["num_of_delays_nas"]))) if (row["month"] in delay_40) 
    else (round(row["num_of_delays_weather"] + (.3 * row["num_of_delays_late_aircraft"]) + (.65 * row["num_of_delays_nas"])))
)

weather_delay_total["improved_delays_weather"] = np.where(
    flights.month == "April" or flights.month == "May" or flights.month == "June" or flights.month == "July" or flights.month == "August", 
    round(flights["num_of_delays_weather"] + (.3 * flights["num_of_delays_late_aircraft"]) + (.4 * flights["num_of_delays_nas"])), 
    round(flights["num_of_delays_weather"] + (.3 * flights["num_of_delays_late_aircraft"]) + (.65 * flights["num_of_delays_nas"])))


weather_delay_total = flights.assign(
    improved_delays_weather = np.select(
        flights.month == "April" or flights.month == "May" or flights.month == "June" or flights.month == "July" or flights.month == "August",
        round(flights.num_of_delays_weather + (.3 * flights.num_of_delays_late_aircraft) + (.4 * flights.num_of_delays_nas)), 
        round(flights.num_of_delays_weather + (.3 * flights.num_of_delays_late_aircraft) + (.65 * flights.num_of_delays_nas)
    )
)


Solution 1:[1]

This question was answered by Parfait and mozway in the comments. I passed flights["month"].isin(delay_40) as the condition to np.where and it worked perfectly. Thank you Parfait and mozway!
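
For anyone hitting the same error, here is a minimal sketch of what that fix might look like. The error comes from Python's `or` and `in`, which need a single True/False value but are handed a whole Series. `isin` instead returns one boolean per row, which is what np.where expects as its condition. The sample rows below are made up for illustration and are not from the real flights data. The names in_delay_40, weather, late and nas are also just helper names for this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for the real flights DataFrame.
flights = pd.DataFrame({
    "month": ["January", "April", "August", "December"],
    "num_of_delays_weather": [10, 20, 30, 40],
    "num_of_delays_late_aircraft": [100, 100, 100, 100],
    "num_of_delays_nas": [100, 100, 100, 100],
})

delay_40 = ["April", "May", "June", "July", "August"]

# One True/False per row, instead of a single ambiguous truth value.
in_delay_40 = flights["month"].isin(delay_40)

weather = flights["num_of_delays_weather"]
late = flights["num_of_delays_late_aircraft"]
nas = flights["num_of_delays_nas"]

# np.where picks the first formula where the condition is True,
# and the second formula everywhere else.
improved = np.where(
    in_delay_40,
    weather + 0.3 * late + 0.4 * nas,
    weather + 0.3 * late + 0.65 * nas,
)

# np.round works element-wise on the resulting array.
weather_delay_total = flights.assign(improved_delays_weather=np.round(improved))
print(weather_delay_total[["month", "improved_delays_weather"]])
```

If you would rather spell out the months one by one, the comparisons have to be combined with `|` and each wrapped in parentheses, for example (flights.month == "April") | (flights.month == "May"), because `or` does not work element-wise on a Series. Also note that the lambda passed to assign receives the whole DataFrame, not a single row, which is why the if/else version fails in the same way.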

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Riley S