'How to access parent of nested json object
I have an arbitrary nested JSON object (parsed using json.load), consisting of dicts, lists, primitives etc. I traverse it depth first using recursion and tracking the path to the node in a linux fs format (for lists, I append /element to the path, so there can be multiple objects at a specific path), as following:
def traverse(node, path =''):
if isinstance(node, dict):
for key in node:
traverse(node[key], path+'/'+key)
elif isinstance(node, list):
for elem in node:
traverse(elem,path+'/element')
Each node may include a string which needs to be filled in using values of objects that can exist wherever in the tree, referenced in a relative path from the current node, eg: "{../../key} {./child1/child2/key}".
Problem: I can access the values of nodes that are children of the current node but I can't directly access the parent of the current node.
Solutions I thought: One solution I thought is to have a list of tuples (child,parent) and store the current node along with the child I'm about to recurse in, and then search that list in reverse when I need to go up. This is a bit dangerous because if the child is a primitive value then is will be equal to any other children of the same value and type and hence I may retrieve a wrong parent, but I think going through the list in reverse should take care of that right?
A different solution I thought is to have a dictionary with key being the child's path and value the parent node. I think this should work better because the only time paths clash is with list elements but they all have the same parent so it should be ok I think.
Are there any other suggestions? Or any comments on these two solutions? Thank you
Solution 1:[1]
The only way to get back to the parent of a dict (or any recursive structure that doesn't have back-pointers) is to remember it as you traverse.
But notice that you're already doing this: your path string is a path from the top to the current node.
Let's write a function that uses the path:
def follow(head, path):
if not path:
return head
first, _, rest = path.partition('/')
return follow(head[first], rest)
Of course it would probably be nicer to build up path as a tuple of keys instead of a string, so we don't have to split them off (and so we don't have to worry about escaping or quoting if any of the keys might contain /, etc.); you can always join them at the end.
And it might be even better to build up path as a tuple of nodes (or key-node pairs) instead of just the keys, so we can access the parent in constant time as just path[-1] instead of in logarithmic time as follow(head, path). But it really depends on what you're actually trying to do; your real code presumably doesn't just traverse the tree, building up paths to each node, and then doing nothing with them.
However, a really nice way to solve this would be to turn the traversal inside-out: make traverse an iterator:
def traverse(node, path =''):
if isinstance(node, dict):
for key in node:
yield from traverse(node[key], path+'/'+key)
elif isinstance(node, list):
for elem in node:
yield from traverse(elem,path+'/element')
yield (node, path)
Now we can just loop over traverse to do anything we want as a post-order depth-first traversal:
for node, path in traverse(root):
# do something
And now you can change it easily to yield node, parent, path (whether parent is the parent node, or the parent key, or whatever you wanted).
Solution 2:[2]
If you want to search for a node by attribute value, and know what the node's parent nodes were, you can use the following:
def find_node_callback(node, stack, callback, path=''):
stack.append(node)
if callback(node,path,stack):
return True
if isinstance(node, dict):
for key in node:
if find_node_callback(node[key],stack,callback,path+'/'+key):
return True
elif isinstance(node, list):
for elem in node:
if find_node_callback(elem,stack, callback,path+'/element'):
return True
stack.pop()
return False
call it like so:
def isTargetNode(node,path,stack):
if node == "value Im searching for":
return True
return False
# uses a callback to determine if the node is what you want
a = json.loads(jsonString)
stack=[]
if find_node_callback(a,stack,isTargetNode):
# this array will contain all the parent items of your
# target node, with the target node value being the last
# thing stored in the array
print(stack)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | abarnert |
| Solution 2 | Brad Parks |
