Can we use dataclasses as DTOs in AWS Python Lambdas for nested JSON?

I was looking for a DTO-like structure to hold data parsed from JSON in an AWS Python Lambda. I came across dataclasses in Python and used them to create a simple DTO for the JSON data. My project contains a lot of Lambdas and heavy JSON parsing, and until now I have been using plain dicts to handle the parsed JSON. Please guide me: is the standalone dataclasses module from the standard library a good fit for DTO-style functionality in AWS Python Lambdas? I have put these DTOs in a Lambda layer for reuse across my other Lambdas, but I am worried that maintaining these dataclasses for nested data will be difficult (a sketch of the boilerplate this requires follows the snippets below).

For reference, here is lambda_handler.py:

import json
from dataclasses import asdict

from custom_functions.epicervix_dto import EpicervixDto
from custom_functions.dto_class import Dejlog


def lambda_handler(event, context):
    a = Dejlog(**event)

    return {
        'statusCode': 200,
        'body': json.dumps(asdict(a))
    }

And dto_class.py from the Lambda layer:

from dataclasses import dataclass, field
from typing import Any

@dataclass
class Dejlog:
    PK: str
    SK: str
    eventtype: str
    result: Any
    type: str = field(init=False, repr=False, default=None)
    status: str
    event: str = field(init=False, repr=False, default=None)
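
To illustrate the maintenance concern: Dejlog(**event) leaves any nested JSON object as a plain dict, so every nested level needs hand-written conversion code. A minimal stdlib-only sketch of that boilerplate (the nested Event dataclass and the from_dict helper here are hypothetical, not part of my actual code):

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Event:
    eventtype: str


@dataclass
class Dejlog:
    PK: str
    SK: str
    eventtype: str
    result: Any
    status: str
    event: Optional[Event] = None

    @classmethod
    def from_dict(cls, data: dict) -> "Dejlog":
        # Dejlog(**data) would leave data["event"] as a plain dict,
        # so each nested dataclass needs its own conversion step.
        nested = data.get("event")
        return cls(
            PK=data["PK"],
            SK=data["SK"],
            eventtype=data["eventtype"],
            result=data["result"],
            status=data["status"],
            event=Event(**nested) if nested else None,
        )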


Solution 1 [1]:

One option is to use the dataclass-wizard library for this task, which supports automatic key-casing transforms as well as de/serialization of nested dataclass models.

Here's a (mostly) complete example of how that could work:

from dataclasses import dataclass, field
from typing import Any

from dataclass_wizard import JSONWizard, json_field


@dataclass
class Dejlog(JSONWizard):

    class _(JSONWizard.Meta):
        """Change key transform for dump (serialization) process; default transform is to `camelCase`."""
        key_transform_with_dump = 'SNAKE'

    PK: str
    SK: str
    result: Any
    type: str = field(init=False, repr=False, default=None)
    status: str
    event: 'Event'

    def response(self) -> dict:
        return {
            'statusCode': 200,
            'body': self.to_json()
        }


@dataclass
class Event:
    event_type: str
    # pass `dump=False` to exclude field from the dump (serialization) process
    event: str = json_field('', init=False, dump=False, repr=False, default=None)


my_data: dict = {
    'pk': '123',
    'SK': 'test',
    'result': {'key': 'value'},
    'status': '200',
    'event': {'eventType': 'something here'}
}

instance = Dejlog.from_dict(my_data)

print(f'{instance!r}')
print(instance.response())

Result:

Dejlog(PK='123', SK='test', result={'key': 'value'}, status='200', event=Event(event_type='something here'))
{'statusCode': 200, 'body': '{"pk": "123", "sk": "test", "result": {"key": "value"}, "type": null, "status": "200", "event": {"event_type": "something here"}}'}
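
To tie this back to the Lambda handler from the question, a minimal sketch of the wiring (this part is mine, assuming Dejlog is exposed from the same layer module as in the question):

from custom_functions.dto_class import Dejlog  # from the Lambda layer


def lambda_handler(event, context):
    # from_dict() parses the raw event, including the nested Event
    # object and any key-casing differences (e.g. 'pk' -> PK).
    dto = Dejlog.from_dict(event)
    # response() serializes back out to snake_case JSON via to_json().
    return dto.response()

Since the DTO classes live in a Lambda layer, only the layer needs updating when the JSON shape changes.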

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: rv.kvetch