How to convert a dictionary according to a JSON schema in Python 3

I have a JSON schema that specifies the format of a dictionary in Python 3.

INPUT_SCHEME = {
    "type": "object",
    "properties": {
        "a1": {
            "type": "object",
            "properties": {
                "a1_1": {"type": ["string", "null"]},
                "a1_2": {"type": ["number", "null"]},
            },
            "additionalProperties": False,
            "minProperties": 2,
        },
        "a2": {
            "type": "array",
            "items": {"type": ["number", "null"]},
        },
        "a3": {
            "type": ["number", "null"],
        },
        "a4": {
            "type": "object",
            "properties": {
                "a4_1": {"type": ["string", "null"]},
                "a4_2": {
                    "type": "object",
                    "properties": {
                        "a4_2_1": {"type": ["string", "null"]},
                        "a4_2_2": {"type": ["number", "null"]},
                    },
                    "additionalProperties": False,
                    "minProperties": 2,
                },
            },
            "additionalProperties": False,
            "minProperties": 2,
        },
        "a5": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "a5_1": {"type": ["string", "null"]},
                    "a5_2": {"type": ["number", "null"]},
                },
                "additionalProperties": False,
                "minProperties": 2,
            },
        },
    },
    "additionalProperties": False,
    "minProperties": 5,
}

I want to write a function that converts an arbitrary input dictionary to the format defined by INPUT_SCHEME.

The rules are:

  1. If the input dict is missing a field, fill that field in the output dict with None, with an empty list (for an array field), or with a dict whose own fields are filled the same way (for an object field).
  2. If the input dict has a key that is not defined in INPUT_SCHEME, remove it from the output dict.

For example, suppose I have a_input, in which only 'a1' is correct: 'a2', 'a3', and 'a4' are missing, each element of 'a5' is missing one property, and 'a6' is an undefined field. The function should convert a_input to a_output, and the result can be checked with jsonschema.validate.

a_input = {
    'a1': {'a1_1': 'apple', 'a1_2': 20.5},
    'a5': [{'a5_1': 'pear'}, {'a5_2': 18.5}],
    'a6': [1, 2, 3, 4],
}

a_output = {
    'a1': {'a1_1': 'apple', 'a1_2': 20.5},
    'a2': [],
    'a3': None,
    'a4': {
        'a4_1': None,
        'a4_2': {
            'a4_2_1': None,
            'a4_2_2': None,
        }
    },
    'a5': [
        {
            'a5_1': 'pear',
            'a5_2': None,
        },
        {
            'a5_1': None,
            'a5_2': 18.5,
        }
    ]
}

import jsonschema

jsonschema.validate(a_output, schema=INPUT_SCHEME)

I tried to write the function but could not get it to work, mainly because the nested structure needs too many if-else checks and I got lost. Could you please help me?

Thanks.

def my_func(a_from):
    a_to = dict()
    for key_1 in INPUT_SCHEME['properties'].keys():
        if key_1 not in a_from:
            a_to[key_1] = None  # This is incorrect, since the structure of a_to[key_1] depends on INPUT_SCHEME.
            continue

        layer_1 = INPUT_SCHEME['properties'][key_1]
        if 'properties' in layer_1:  # like a1, a4
            for key_2 in layer_1['properties'].keys():
                layer_2 = layer_1['properties'][key_2]
                ...

                # But the nesting can go deeper: a4 has 3 layers, and real data can have more.

        elif 'items' in layer_1:
            if 'properties' in layer_1['items']:  # like a5
                ...
            else:  # like a2
                ...
        else:  # like a3
            ...
    return a_to
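
One way to avoid the growing if-else chain above might be to recurse on the schema instead of on the input, so that each nesting level is handled by the same function. The following is only a minimal sketch under stated assumptions: the schema uses nothing beyond "type", "properties", and "items" (as INPUT_SCHEME does), leaf values are passed through without type coercion, and the names conform and SCHEMA are hypothetical, chosen for illustration. SCHEMA is a small stand-in for INPUT_SCHEME so that the block runs on its own.

```python
def conform(value, schema):
    """Return a copy of `value` reshaped to fit `schema` (hypothetical helper)."""
    types = schema.get("type", [])
    if isinstance(types, str):
        types = [types]

    if "object" in types:
        source = value if isinstance(value, dict) else {}
        # Iterating over the schema's keys drops undefined input keys (rule 2);
        # a missing key is looked up as None and filled by the recursive call (rule 1).
        return {
            key: conform(source.get(key), sub_schema)
            for key, sub_schema in schema.get("properties", {}).items()
        }

    if "array" in types:
        source = value if isinstance(value, list) else []
        item_schema = schema.get("items", {})
        # A missing array becomes an empty list; existing items are reshaped one by one.
        return [conform(item, item_schema) for item in source]

    # Leaf such as ["number", "null"]: keep the value as-is, or None when missing.
    return value


# Small stand-in schema (an assumption for this example, not the full INPUT_SCHEME).
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": ["string", "null"]},
        "tags": {"type": "array", "items": {"type": ["number", "null"]}},
        "rows": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "x": {"type": ["number", "null"]},
                    "y": {"type": ["number", "null"]},
                },
            },
        },
    },
}

result = conform({"rows": [{"x": 1}], "extra": [1, 2]}, SCHEMA)
print(result)
```

Because the function only ever looks at the schema node it was handed, the depth of nesting does not matter. Calling it as conform(a_input, INPUT_SCHEME) should follow the same two rules, and the result can then be passed to jsonschema.validate. Keywords that this sketch ignores (for example "required" or "$ref") would need extra handling.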


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow