Delete DynamoDB records using Hive

I have a pipeline that runs periodically and dumps some data into S3. Now I want to completely replace the existing data in DynamoDB with the new data dumped in S3.

If a DynamoDB table key is present in the S3 data, the DynamoDB record should be updated.

If a key is absent from the S3 data but present in DynamoDB, the record in DynamoDB needs to be deleted. Any new data needs to be inserted as a new record in the DynamoDB table.

Is it possible to accomplish this with a Hive query (an external Hive table connected to DynamoDB)? I am aware that we can insert data into DynamoDB using a Hive query. Can we also delete items from DynamoDB using a Hive query?



Solution 1:[1]

Run:

INSERT OVERWRITE TABLE <Dynamodbtablename> SELECT <columns> FROM <s3_table>;

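This overwrites existing items whose key column values match a row in the S3 data, and inserts rows with new keys as new items. As far as I know, it does not delete items whose keys are missing from the S3 data, so those would have to be removed separately.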
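
For context, here is a minimal sketch of the full setup on EMR, assuming the S3 dump is comma-delimited text with two columns. The table names (s3_dump, ddb_target), the bucket path, the DynamoDB table name, and the column/attribute names are all hypothetical placeholders. The last query is an optional extra that only lists keys present in DynamoDB but absent from the S3 data, in case those need to be cleaned up by some other means.

```sql
-- All names below are placeholders for illustration, not taken from the question.

-- External Hive table over the files dumped to S3 (assumed CSV layout).
CREATE EXTERNAL TABLE s3_dump (id string, payload string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://my-bucket/dump/';

-- External Hive table mapped onto the DynamoDB table via the EMR storage handler.
CREATE EXTERNAL TABLE ddb_target (id string, payload string)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  "dynamodb.table.name" = "MyDynamoTable",
  "dynamodb.column.mapping" = "id:Id,payload:Payload"
);

-- Write every S3 row to DynamoDB: matching keys are overwritten, new keys are inserted.
INSERT OVERWRITE TABLE ddb_target
SELECT id, payload FROM s3_dump;

-- Optional: list keys that exist in DynamoDB but not in the S3 data.
-- This is read-only (it scans the DynamoDB table) and deletes nothing.
SELECT d.id
FROM ddb_target d
LEFT OUTER JOIN s3_dump s ON d.id = s.id
WHERE s.id IS NULL;
```

The column mapping has to include the DynamoDB table's key attribute(s), since that is what decides whether a row overwrites an existing item or creates a new one.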

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ouflak