How to remove all empty fields in a nested dict?
I have a dict whose values may themselves be dicts or lists. How can I remove all empty fields from it?
An "empty field" is a field whose value is an empty list ([]), None, or an empty dict (one whose sub-fields are all empty).
Example:
Input:
{
    "fruit": [
        {"apple": 1},
        {"banana": None}
    ],
    "veg": [],
    "result": {
        "apple": 1,
        "banana": None
    }
}
Output:
{
    "fruit": [
        {"apple": 1}
    ],
    "result": {
        "apple": 1
    }
}
Solution 1:[1]
Use a recursive function that returns a new dictionary:
def clean_empty(d):
    if isinstance(d, dict):
        return {
            k: v
            for k, v in ((k, clean_empty(v)) for k, v in d.items())
            if v
        }
    if isinstance(d, list):
        return [v for v in map(clean_empty, d) if v]
    return d
The {..} construct is a dictionary comprehension; it only includes a key from the original dictionary if the cleaned value v is truthy, i.e. not empty. Similarly, the [..] construct builds a list of the truthy cleaned items.
The nested (.. for ..) construct is a generator expression that allows the code to compactly filter empty objects after recursing.
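For readers less used to comprehensions, the same recurse-then-filter logic can be written with explicit loops. This is only an illustrative sketch; the name clean_empty_loops is made up here and is not part of the original answer.

```python
def clean_empty_loops(d):
    """Loop-based equivalent of clean_empty (hypothetical name)."""
    if isinstance(d, dict):
        result = {}
        for key, value in d.items():
            cleaned = clean_empty_loops(value)  # recurse first
            if cleaned:                         # then keep only truthy results
                result[key] = cleaned
        return result
    if isinstance(d, list):
        result = []
        for item in d:
            cleaned = clean_empty_loops(item)
            if cleaned:
                result.append(cleaned)
        return result
    # Anything that is not a dict or list is returned unchanged.
    return d


sample = {
    "fruit": [{"apple": 1}, {"banana": None}],
    "veg": [],
    "result": {"apple": 1, "banana": None},
}
print(clean_empty_loops(sample))
```

The important detail is the order: each value is cleaned before it is tested, so a dict that only contained empty fields is itself dropped.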
Another way of constructing such a function is to use the @singledispatch decorator; you then write multiple functions, one per object type:
from functools import singledispatch

@singledispatch
def clean_empty(obj):
    return obj

@clean_empty.register
def _dicts(d: dict):
    items = ((k, clean_empty(v)) for k, v in d.items())
    return {k: v for k, v in items if v}

@clean_empty.register
def _lists(l: list):
    items = map(clean_empty, l)
    return [v for v in items if v]
The above @singledispatch version does exactly the same thing as the first function but the isinstance() tests are now taken care of by the decorator implementation, based on the type annotations of the registered functions. I also put the nested iterators (the generator expression and map() function) into a separate variable to improve readability further.
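One practical benefit of the @singledispatch layout is that support for another container type can be added without touching the existing functions. As a sketch, assuming you also want tuples cleaned (the question does not ask for this, and the _tuples handler is hypothetical):

```python
from functools import singledispatch


@singledispatch
def clean_empty(obj):
    # Fallback for any type without a registered handler.
    return obj


@clean_empty.register
def _dicts(d: dict):
    items = ((k, clean_empty(v)) for k, v in d.items())
    return {k: v for k, v in items if v}


@clean_empty.register
def _lists(l: list):
    items = map(clean_empty, l)
    return [v for v in items if v]


# Hypothetical extension: tuples are cleaned and rebuilt as tuples.
@clean_empty.register
def _tuples(t: tuple):
    return tuple(v for v in map(clean_empty, t) if v)


print(clean_empty({"a": (None, 1, []), "b": ()}))
```

Registering handlers through type annotations needs Python 3.7 or later; on older versions the type has to be passed explicitly, as in @clean_empty.register(tuple).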
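Note that because the filter is a plain truthiness test, any value set to numeric 0 (integer 0, float 0.0) will also be removed, as will other falsy values such as False and empty strings. You can retain numeric 0 values by changing the test to if v or v == 0.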
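A minimal sketch of that change, under a hypothetical name so it does not clash with the functions above. Because False == 0 is true in Python, this test keeps False as well; empty strings are still dropped.

```python
def clean_empty_keep_zero(d):
    """Variant of clean_empty that keeps values equal to 0 (hypothetical name)."""
    if isinstance(d, dict):
        cleaned = ((k, clean_empty_keep_zero(v)) for k, v in d.items())
        return {k: v for k, v in cleaned if v or v == 0}
    if isinstance(d, list):
        return [v for v in map(clean_empty_keep_zero, d) if v or v == 0]
    return d


sample = {
    "count": 0,
    "ratio": 0.0,
    "flag": False,   # kept, because False == 0
    "name": "",      # dropped, because "" is falsy and "" != 0
    "tags": [],
    "nested": {"x": None, "y": 0},
}
print(clean_empty_keep_zero(sample))
# {'count': 0, 'ratio': 0.0, 'flag': False, 'nested': {'y': 0}}
```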
Demo of the first function:
>>> sample = {
...     "fruit": [
...         {"apple": 1},
...         {"banana": None}
...     ],
...     "veg": [],
...     "result": {
...         "apple": 1,
...         "banana": None
...     }
... }
>>> def clean_empty(d):
...     if isinstance(d, dict):
...         return {
...             k: v
...             for k, v in ((k, clean_empty(v)) for k, v in d.items())
...             if v
...         }
...     if isinstance(d, list):
...         return [v for v in map(clean_empty, d) if v]
...     return d
...
>>> clean_empty(sample)
{'fruit': [{'apple': 1}], 'result': {'apple': 1}}
Solution 2:[2]
If you want a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles and other kinds of containers, I recommend looking at the remap utility from the boltons utility package.
After pip install boltons or copying iterutils.py into your project, just do:
from boltons.iterutils import remap
data = {'veg': [], 'fruit': [{'apple': 1}, {'banana': None}], 'result': {'apple': 1, 'banana': None}}
drop_falsey = lambda path, key, value: bool(value)
clean = remap(data, visit=drop_falsey)
print(clean)
# Output:
{'fruit': [{'apple': 1}], 'result': {'apple': 1}}
This page has many more examples, including ones working with much larger objects from GitHub's API.
It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.
Solution 3:[3]
@mojoken - How about this to overcome the boolean problem (a plain truthiness test also drops False)? It keeps any bool value explicitly:
def clean_empty(d):
    if not isinstance(d, (dict, list)):
        return d
    if isinstance(d, list):
        return [v for v in (clean_empty(v) for v in d) if isinstance(v, bool) or v]
    return {k: v for k, v in ((k, clean_empty(v)) for k, v in d.items()) if isinstance(v, bool) or v}
Solution 4:[4]
def not_empty(o):
    # Define here what counts as empty: None and zero-length dicts/lists.
    if o is None:
        return False
    if not isinstance(o, (dict, list)):
        return True
    return len(o) > 0

def remove_empty(o):
    # Choose here which container types are recursed into and cleaned.
    if not isinstance(o, (dict, list)):
        return o
    if isinstance(o, dict):
        # Clean each value first, then filter, so that containers
        # which become empty after cleaning are removed as well.
        cleaned = ((k, remove_empty(v)) for k, v in o.items())
        return {k: v for k, v in cleaned if not_empty(v)}
    return [v for v in map(remove_empty, o) if not_empty(v)]
Solution 5:[5]
def remove_empty_fields(data_):
    """
    Recursively remove all empty fields from a nested
    dict structure, modifying it in place. Note that a
    non-empty field can become empty once its children are deleted.
    :param data_: A dict or list.
    :return: Data after cleaning.
    """
    if isinstance(data_, dict):
        # Iterate over a snapshot of the items: deleting keys while
        # iterating over the dict itself raises RuntimeError in Python 3.
        for key, value in list(data_.items()):
            # Dive into a deeper level.
            if isinstance(value, (dict, list)):
                value = remove_empty_fields(value)
            # Delete the field if it's empty.
            if value in ["", None, [], {}]:
                del data_[key]
    elif isinstance(data_, list):
        # Walk backwards so that popping does not shift pending indexes.
        for index in reversed(range(len(data_))):
            value = data_[index]
            # Dive into a deeper level.
            if isinstance(value, (dict, list)):
                value = remove_empty_fields(value)
            # Delete the field if it's empty.
            if value in ["", None, [], {}]:
                data_.pop(index)
    return data_
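Unlike the earlier solutions, this approach modifies the structure it is given. If the original data must be preserved, one option is to clean a deep copy instead. The sketch below uses a compact in-place cleaner written for this illustration (prune_in_place and EMPTY are hypothetical names) to show the difference:

```python
import copy

# Values treated as empty, mirroring the list used in the solution above.
EMPTY = ("", None, [], {})


def prune_in_place(data):
    """Compact in-place cleaner, for illustration only (hypothetical name)."""
    if isinstance(data, dict):
        for key in list(data):  # snapshot of keys so deletion is safe
            if prune_in_place(data[key]) in EMPTY:
                del data[key]
    elif isinstance(data, list):
        # Slice assignment replaces the contents of the same list object.
        data[:] = [v for v in data if prune_in_place(v) not in EMPTY]
    return data


original = {"fruit": [{"apple": 1}, {"banana": None}], "veg": []}

# Cleaning a deep copy leaves the original untouched.
cleaned = prune_in_place(copy.deepcopy(original))
print(cleaned)    # {'fruit': [{'apple': 1}]}
print(original)   # still contains "veg" and the "banana" entry

# Cleaning the original directly mutates it and returns the same object.
assert prune_in_place(original) is original
```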
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Mahmoud Hashemi |
| Solution 3 | rocky rambo |
| Solution 4 | lovxin |
| Solution 5 | Tian Chu |
