Create function for indirect reference to JSON path and value

I want to use a JSON configuration to tell a function where to find each value in a source data JSON.

Example source data containing the 3 values I want to add to an array, each mapped to a "value" key:

data = {
    "value1" : "Hello",
    "value2" : "World",
    "object" : {
        "value3" : "Great"
    }
}

For this most basic example, I want to create the following using configuration:

example = {
    "example_array" : [
        {"value" : "Hello"},
        {"value" : "World"},
        {"value" : "Great"}
]}

What I would prefer is to use a config file which has this:

{
    "paths" : [
        {
            "source" : "data",
            "origin_path" : "",
            "origin_value" : "['value1']"
        },
        {
            "source" : "data",
            "origin_path" : "",
            "origin_value" : "['value2']"
        },
        {
            "source" : "data",
            "origin_path" : "['object']",
            "origin_value" : "['value3']"
        }
    ]
}

If I could create a function like set_mapping(destination, source, path, value) then I want to loop through the paths rows with something like this:

for x in config['paths']:
    # set_mapping is the function I would like to write
    origin_data = set_mapping(
                value,
                x['source'],
                x['origin_path'],
                x['origin_value']
                )
    example['example_array'].append(origin_data)

Original Question

I want to know if I can control data paths from a configuration file so I can process a list of transformations in a loop rather than write code for each one. In Excel, I would call this an INDIRECT reference to a value used to make up part of a function; in Python, I can't find out how to do it.

Here's a manual example of what I want to do:

print(data['liability_asset']['acceptance']['contractors_and_subcontractors']['contractors_and_subcontractors_engaged'])

What I want to do instead is have a data path function. I imagine its definition would look like data_path(source, path, value), but that's as far as I got.

for x in config['paths']:
    print(data_path(x['source'], x['origin_path'], x['origin_value']))

Here's an extract from the horribly verbose, nested origin data I need to transform:

data = {
          "liability_asset": {
              "acceptance" : {
                  "contractors_and_subcontractors" : {
                      "contractors_and_subcontractors_engaged" : "YES"
                      }
                  }
              }
       }

Here's the config file I intend to use for starters:

config = {
    "paths" : [
            {
                 "source" : "data",
                 "origin_path" : "['liability_asset']['acceptance']['contractors_and_subcontractors']",
                 "origin_value" : "['contractors_and_subcontractors_engaged']"
            }

        ]
    }

In addition, if there's a way for the script to let the config use . as a path separator rather than [''] each time, that would make managing the config file much easier.

The expected result from both is YES.



Solution 1:[1]

import json

all_data = {}

# mock config file, in reality you'd load it.
config = """{
    "paths" : [
        {
            "source" : "data",
            "origin_path" : [],
            "origin_value" : "value1"
        },
        {
            "source" : "data",
            "origin_path" : [],
            "origin_value" : "value2"
        },
        {
            "source" : "data",
            "origin_path" : ["object"],
            "origin_value" : "value3"
        }
    ]
}
"""
config = json.loads(config)

# Load all of your possible data sources into a dictionary.
# If you already had a 'data' variable, this could just be 
# all_data['data'] = data

all_data['data'] = {
    "value1" : "Hello",
    "value2" : "World",
    "object" : {
        "value3" : "Great"
    }
}


def get_data_via_config(config_file: dict, sources: dict) -> dict:
    """For each configured path, walk origin_path and store the value found
    at origin_value in the output dict, keyed by the origin_value name."""
    output = {}
    for path in config_file['paths']:
        # Start at the named source, then descend one key at a time.
        source = sources[path['source']]
        for o_path in path['origin_path']:
            source = source[o_path]
        # You can modify this next line and make output a list if desired.
        output[path['origin_value']] = source[path['origin_value']]
    return output


example = get_data_via_config(config, all_data)
print(example)

Output:

{'value1': 'Hello', 'value2': 'World', 'value3': 'Great'}
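
If you want the list-of-"value" structure shown in the question rather than a dict keyed by name, the same traversal can append to a list instead. This is only a minimal sketch: build_example_array is a hypothetical name (it is not part of the code above), and it assumes the config stores origin_path as a list of keys and origin_value as a plain key string.

```python
def build_example_array(config_file: dict, sources: dict) -> dict:
    """Collect each configured value into {"example_array": [{"value": ...}, ...]}.

    Hypothetical helper: assumes origin_path is a list of keys and
    origin_value is a single key name.
    """
    example_array = []
    for path in config_file["paths"]:
        # Start at the named source, then descend one key at a time.
        node = sources[path["source"]]
        for key in path["origin_path"]:
            node = node[key]
        example_array.append({"value": node[path["origin_value"]]})
    return {"example_array": example_array}


# Sample data mirroring the basic example, repeated so the sketch runs alone.
sample_sources = {
    "data": {
        "value1": "Hello",
        "value2": "World",
        "object": {"value3": "Great"},
    }
}
sample_config = {
    "paths": [
        {"source": "data", "origin_path": [], "origin_value": "value1"},
        {"source": "data", "origin_path": [], "origin_value": "value2"},
        {"source": "data", "origin_path": ["object"], "origin_value": "value3"},
    ]
}

result = build_example_array(sample_config, sample_sources)
print(result)
# {'example_array': [{'value': 'Hello'}, {'value': 'World'}, {'value': 'Great'}]}
```

Because the config list is processed in order, the output list keeps the same order as the "paths" entries.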

It works just fine for your more complicated example as well:

all_data['data_adv'] = {
          "liability_asset": {
              "acceptance" : {
                  "contractors_and_subcontractors" : {
                      "contractors_and_subcontractors_engaged" : "YES"
                      }
                  }
              }
       }

config2 = """{
    "paths" : [
            {
                 "source" : "data_adv",
                 "origin_path" : ["liability_asset", "acceptance", "contractors_and_subcontractors"],
                 "origin_value" : "contractors_and_subcontractors_engaged"
            }

        ]
    }
"""
config2 = json.loads(config2)

example2 = get_data_via_config(config2, all_data)
print(example2)

Output:

{'contractors_and_subcontractors_engaged': 'YES'}
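
The question also asks about using . as a path separator. One possible approach is to split the string on dots and fold over the keys. The sketch below rests on assumptions: get_by_dotted_path is a hypothetical helper name, keys are assumed never to contain a literal dot, and the path and value are combined into one dotted string instead of separate origin_path and origin_value fields.

```python
from functools import reduce


def get_by_dotted_path(source: dict, dotted_path: str, default=None):
    """Follow a dot-separated path such as "a.b.c" through nested dicts.

    Hypothetical helper: assumes no key contains a literal "." and
    returns `default` when any step of the path is missing.
    """
    # Drop empty segments so "" means "return the source itself".
    keys = [key for key in dotted_path.split(".") if key]
    try:
        return reduce(lambda node, key: node[key], keys, source)
    except (KeyError, TypeError):
        # KeyError: key absent; TypeError: tried to index a non-dict value.
        return default


nested = {
    "liability_asset": {
        "acceptance": {
            "contractors_and_subcontractors": {
                "contractors_and_subcontractors_engaged": "YES"
            }
        }
    }
}

dotted = (
    "liability_asset.acceptance.contractors_and_subcontractors"
    ".contractors_and_subcontractors_engaged"
)
answer = get_by_dotted_path(nested, dotted)
print(answer)  # YES

missing = get_by_dotted_path(nested, "liability_asset.no_such_key", default="N/A")
print(missing)  # N/A
```

Returning a default instead of raising is a design choice for long transformation lists, where one bad path should not stop the whole loop; raise the error instead if a missing path should be treated as a config bug.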

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1