Improve algorithm for 'Get element in a list or dictionary in an elegant way using list or str as key'
I have this function:
from functools import reduce
from operator import getitem
from typing import Any, Mapping, Sequence, Union


def getdeep(
    data: Union[Mapping, Sequence],
    map_list: Union[list, str],
    default: Any = None,
) -> Any:
    """Walk a nested dictionary or list and return a default value if a key is not found.

    This function accepts a list of keys as well as a string of keys separated by dots.

    Example:
        >>> data = {'a': {'b': {'c': 1}}}
        >>> getdeep(data, ['a', 'b', 'c'])
        1
        >>> getdeep(data, 'a.b.c')
        1
        >>> getdeep(data, ['a', 'b', 'D'], default=0)
        0
        >>> getdeep(data, 'a.b.D', default=0)
        0
        >>> getdeep({"data": ["a", "b", "c"]}, "data.1")
        'b'
        >>> getdeep({"data": ["a", "b", "c"]}, ["data", 1])
        'b'
        >>> getdeep({"data": [{"j": "e"}, "b", "c"]}, "data.0.j")
        'e'

    :param data: dictionary or list to iterate
    :type data: Mapping or Sequence
    :param map_list: list of keys or string of keys separated by dots
    :type map_list: list or str
    :param default: default value to return if a key is not found
    :type default: Any
    :return: value of key or default value
    :rtype: Any
    """
    try:
        if isinstance(map_list, str):
            map_list = map_list.split(".")
        # Transform string integer keys to int
        map_list = [
            int(key) if isinstance(key, str) and key.isdigit() else key
            for key in map_list
        ]
        return reduce(getitem, map_list, data)
    except (KeyError, IndexError, TypeError):
        return default
I am trying to reduce the execution time of this function, but I can't reach my goal.
The plain data[..][..][..] notation is much faster in terms of execution time, and the difference is large (I run my tests on very large datasets).
An example:
# Import data for example
import requests, time
data = requests.get('https://api.github.com/repos/torvalds/linux/commits').json()
# Compute with the getdeep function
start = time.time()
getdeep(data, "10.parents.0.sha")
method1 = time.time() - start
# Compute with the traditional method
start = time.time()
data[10]['parents'][0]['sha']
method2 = time.time() - start
Here, method2 is always lower. That is expected, since direct indexing does no extra work, but the difference is large.
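A single time.time() measurement around one call is dominated by timer resolution and noise, so the gap is easier to quantify by repeating each call many times. The following is a minimal sketch using the standard timeit module. It assumes a small synthetic structure (sample, a hypothetical stand-in for the GitHub response) so that it runs without network access, and it repeats getdeep from above only to stay self-contained. The absolute numbers will differ from machine to machine.

```python
import timeit
from functools import reduce
from operator import getitem


def getdeep(data, map_list, default=None):
    # Same logic as the function in the question, repeated so the sketch runs on its own.
    try:
        if isinstance(map_list, str):
            map_list = map_list.split(".")
        map_list = [
            int(key) if isinstance(key, str) and key.isdigit() else key
            for key in map_list
        ]
        return reduce(getitem, map_list, data)
    except (KeyError, IndexError, TypeError):
        return default


# Hypothetical stand-in for the API response: a list of 11 commit-like dicts.
sample = [{"parents": [{"sha": f"sha-{i}"}]} for i in range(11)]

runs = 10_000  # number of repetitions per measurement (arbitrary choice)
t_str = timeit.timeit(lambda: getdeep(sample, "10.parents.0.sha"), number=runs)
t_list = timeit.timeit(lambda: getdeep(sample, [10, "parents", 0, "sha"]), number=runs)
t_direct = timeit.timeit(lambda: sample[10]["parents"][0]["sha"], number=runs)

# Report the average cost per call in microseconds.
for label, total in (("str path", t_str), ("list path", t_list), ("direct", t_direct)):
    print(f"{label:>9}: {total / runs * 1e6:.3f} us per call")
```

Comparing the "str path" and "list path" rows isolates the cost of splitting the string, while the "direct" row gives the floor that any helper function has to pay on top of.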
I also noticed that passing a list instead of a string skips the if branch of the getdeep function, and this shows up in the execution time.
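Building on that observation, one possible direction is to parse each dotted string only once and reuse the result. The sketch below is an assumption-laden variant, not a measured fix: parse_path and getdeep_cached are hypothetical names, the cache relies on functools.lru_cache, and list or tuple paths are used as given (unlike getdeep, digit strings inside a list are not converted to int). Whether it helps depends on how often the same path strings recur in the workload.

```python
from functools import lru_cache
from typing import Any, Tuple, Union


@lru_cache(maxsize=None)
def parse_path(path: str) -> Tuple[Union[int, str], ...]:
    # Hypothetical helper: split a dotted path once and cache the result,
    # so repeated lookups with the same string skip split() and isdigit().
    return tuple(int(part) if part.isdigit() else part for part in path.split("."))


def getdeep_cached(data: Any, path: Union[str, list, tuple], default: Any = None) -> Any:
    # Hypothetical variant of getdeep: strings go through the cached parser,
    # lists and tuples are assumed to already contain the final keys.
    keys = parse_path(path) if isinstance(path, str) else path
    try:
        for key in keys:
            data = data[key]
        return data
    except (KeyError, IndexError, TypeError):
        return default


nested = {"a": {"b": [10, 20, 30]}}
print(getdeep_cached(nested, "a.b.1"))            # 20
print(getdeep_cached(nested, ["a", "b", 2]))      # 30
print(getdeep_cached(nested, "a.x", default=0))   # 0
```

The try/except and the function call itself still cost more than bare data[..][..][..] indexing, so this can narrow the gap but not close it.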
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
