Working with Json Logic on Pandas Dataframe
How can I apply hand-written logic for feature aggregation, for example by using Json Logic (open to other solutions as well), to large dataframes?
For example if I have this dataframe (in reality it's a large DF):
pie_df
   temp pie_filling
0   100     "apple"
1   400     "apple"
2    70    "cherry"
and this logic (for example inside a json file; in reality the logic file will have multiple aggregations at different nesting levels):
rules = { "and" : [
{"<" : [ { "var" : "temp" }, 110 ]},
{"==" : [ { "var" : "pie_filling" }, "apple" ] }
] }
I want the answer to be:
pie_ready
0 true
1 false
2 false
The logic file should be generic and readable. I can convert the dataframe to json but I am worried this won't be computationally efficient.
I did find this package: https://github.com/nadirizr/json-logic-py but they didn't mention implementing the logic on dataframes
This line doesn't work:
jsonLogic(rules, pie_df.to_json())
I get this error:
{TypeError}'dict_keys' object is not subscriptable
Solution 1:[1]
json-logic-py is not maintained anymore, use this fork instead: pip install json-logic-qubit
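Then you can evaluate the rules row by row, passing each row to jsonLogic as a plain dict:
import pandas as pd
from json_logic import jsonLogic

# Sample data from the question
pie_df = pd.DataFrame(
    {"temp": [100, 400, 70], "pie_filling": ["apple", "apple", "cherry"]}
)
rules = {
    "and": [{"<": [{"var": "temp"}, 110]}, {"==": [{"var": "pie_filling"}, "apple"]}]
}
# to_dict(orient="records") yields one dict per row, which is what jsonLogic expects
print(
    pd.DataFrame(
        {"pie_ready": [jsonLogic(rules, row) for row in pie_df.to_dict(orient="records")]}
    )
)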
# Output
pie_ready
0 True
1 False
2 False
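Since the question raises efficiency on large dataframes, here is a rough sketch of an alternative that walks the rule tree once and applies each operator to whole columns instead of calling jsonLogic per row. It is not part of json-logic-qubit: `evaluate_rule` and `COMPARISONS` are hypothetical names made up for this example, only a small subset of operators is handled, and the results are always boolean Series, so the semantics are stricter than real JSON Logic (for instance, `==` there is a loose comparison and `and` returns one of its operands).

```python
import operator
from functools import reduce

import pandas as pd

# Hypothetical operator table: only a subset of JSON Logic is covered here.
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def evaluate_rule(rule, df):
    """Recursively evaluate a JSON Logic style rule against whole columns."""
    if not isinstance(rule, dict):
        return rule  # a literal such as 110 or "apple"
    # Each rule node is a dict with exactly one key: the operator name.
    (op, args), = rule.items()
    if op == "var":
        return df[args]  # assumes the plain string form, e.g. {"var": "temp"}
    values = [evaluate_rule(arg, df) for arg in args]
    if op == "and":
        return reduce(operator.and_, values)  # element-wise AND of boolean Series
    if op == "or":
        return reduce(operator.or_, values)  # element-wise OR of boolean Series
    if op in COMPARISONS:
        left, right = values
        return COMPARISONS[op](left, right)
    raise NotImplementedError(f"operator not supported in this sketch: {op}")


pie_df = pd.DataFrame(
    {"temp": [100, 400, 70], "pie_filling": ["apple", "apple", "cherry"]}
)
rules = {
    "and": [{"<": [{"var": "temp"}, 110]}, {"==": [{"var": "pie_filling"}, "apple"]}]
}
pie_ready = pd.DataFrame({"pie_ready": evaluate_rule(rules, pie_df)})
print(pie_ready)
```

The rule file stays the same generic JSON, so the row-by-row version can still be used as a reference to check this one against on a small sample.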
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Laurent |
