DynamoDB costs arising from triggers

I have a workflow where I put files into an S3 bucket, which triggers a Lambda function. The Lambda function extracts some info about the file and inserts a row into a DynamoDB table for each file:

def put_filename_in_db(dynamodb, filekey, filename):
    # dynamodb_table_name, masterclient, filetype, source_bucket_name and
    # unixtimestamp come from module scope (not shown in this snippet).
    table = dynamodb.Table(dynamodb_table_name)
    try:
        response = table.put_item(
            Item={
                'masterclient': masterclient,
                'filekey': filekey,
                'filename': filename,
                'filetype': filetype,
                'source_bucket_name': source_bucket_name,
                'unixtimestamp': unixtimestamp,
                'processed_on': None,
                'archive_path': None,
                'archived_on': None,
            }
        )
    except Exception as e:
        raise Exception(f"Error putting {filename} into DynamoDB: {e}")
    return response

def get_files():
    bucket_content = s3_client.list_objects(Bucket=str(source_bucket_name), Prefix=Incoming_prefix)['Contents']
    file_list = []

    for obj in bucket_content:
        if obj['Key'].endswith("zip") and not obj['Key'].startswith(Archive_prefix):
            filekey = obj['Key']
            filename = ...

            file_list.append({"filekey": filekey, "filename": filename})

    logger.info(f'Found {len(file_list)} files to process: {file_list}')
    return file_list

def lambda_handler(event, context):

    for current_item in get_files():
        filekey = current_item['filekey']
        filename = current_item['filename']      
        put_filename_in_db(dynamodb, filekey, filename)

    return {
        'statusCode': 200
    }
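
For reference, here is a minimal sketch of what the same writes would look like grouped through boto3's batch_writer instead of calling put_item once per file. This is not my deployed code, and it assumes the same module-level variables as above:

def put_filenames_in_db_batched(dynamodb, file_list):
    # Sketch only: same Item layout as put_filename_in_db, but the writes are
    # grouped by boto3's batch_writer into BatchWriteItem calls (up to 25 items each).
    table = dynamodb.Table(dynamodb_table_name)
    with table.batch_writer() as batch:
        for current_item in file_list:
            batch.put_item(
                Item={
                    'masterclient': masterclient,
                    'filekey': current_item['filekey'],
                    'filename': current_item['filename'],
                    'filetype': filetype,
                    'source_bucket_name': source_bucket_name,
                    'unixtimestamp': unixtimestamp,
                    'processed_on': None,
                    'archive_path': None,
                    'archived_on': None,
                }
            )

As far as I understand, each item still consumes the same write request units either way, so this mainly saves API round trips rather than write cost.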


This is how my DynamoDB table is defined in Terraform:

resource "aws_dynamodb_table" "filenames" {
  name           = local.dynamodb_table_filenames
  billing_mode   = "PAY_PER_REQUEST"
  #read_capacity  = 10
  #write_capacity = 10
  hash_key       = "filename"
  stream_enabled = true
  stream_view_type = "NEW_IMAGE"
  attribute {
    name = "filename"
    type = "S"
  }
}

resource "aws_lambda_event_source_mapping" "allow_dynamodb_table_to_trigger_lambda" {
  event_source_arn  = aws_dynamodb_table.filenames.stream_arn
  function_name     = aws_lambda_function.trigger_stepfunction_lambda.arn
  starting_position = "LATEST"
}
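
To confirm what is actually deployed, the table's billing mode and stream settings can be read back with boto3. This is just a sketch; the table name is assumed to be the value of local.dynamodb_table_filenames:

import boto3

def describe_filenames_table(table_name):
    # Sketch: print the billing mode and stream settings of the deployed table.
    client = boto3.client('dynamodb')
    table = client.describe_table(TableName=table_name)['Table']
    billing = table.get('BillingModeSummary', {}).get('BillingMode', 'PROVISIONED')
    stream = table.get('StreamSpecification', {})
    print(f"Billing mode: {billing}")
    print(f"Stream enabled: {stream.get('StreamEnabled')}, view type: {stream.get('StreamViewType')}")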

New entries in the DynamoDB table trigger another Lambda function which contains this:

def parse_file_info_from_trigger(event):
    record = event['Records'][0]['dynamodb']

    filename = record['Keys']['filename']['S']
    filetype = record['NewImage']['filetype']['S']
    unixtimestamp = record['NewImage']['unixtimestamp']['S']
    masterclient = record['NewImage']['masterclient']['S']
    source_bucket_name = record['NewImage']['source_bucket_name']['S']
    filekey = record['NewImage']['filekey']['S']

    return filename, filetype, unixtimestamp, masterclient, source_bucket_name, filekey


def start_step_function(event, state_machine_zip_files_arn):
    if event['Records'][0]['eventName'] == 'INSERT':
        filename, filetype, unixtimestamp, masterclient, source_bucket_name, filekey = parse_file_info_from_trigger(event)
        ......
    else:
        logger.info('This is not an Insert event')
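
A side note on the parsing above: as far as I know, a single stream event can carry several records, so reading only event['Records'][0] could skip items when the stream batches them. Here is a sketch of iterating over the whole batch (a hypothetical handle_stream_records helper, not the code that is running):

def handle_stream_records(event, state_machine_zip_files_arn):
    # Sketch: iterate over every record in the stream batch instead of
    # only looking at event['Records'][0].
    for record in event['Records']:
        if record['eventName'] != 'INSERT':
            logger.info('Skipping non-INSERT event')
            continue
        keys = record['dynamodb']['Keys']
        new_image = record['dynamodb']['NewImage']
        filename = keys['filename']['S']
        filetype = new_image['filetype']['S']
        # ... read the remaining attributes the same way, then start the step function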

However, the costs for this process are extremely high. When I tested with a single file loaded into S3, the overall DynamoDB cost for that day was $0.785. Extrapolated to around 50 files a day, that would put my total cost at roughly $40 per day, which seems too high if we want to run the workflow daily.
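
For clarity, the $40 figure is just a linear extrapolation of the one-file test day, assuming the cost scales with the number of files (which is part of what I am unsure about):

observed_cost_one_file_day = 0.785   # DynamoDB cost for the day with a single test file
files_per_day = 50
print(observed_cost_one_file_day * files_per_day)   # ~39.25, i.e. roughly $40 per day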

Am I doing something wrong? Or is DynamoDB just generally expensive? If it's the latter, what part exactly is costing so much? Or is it because put_filename_in_db is running in a loop?




Source: Stack Overflow, licensed under CC BY-SA 3.0.