DynamoDB creation of a unique partition key
Let's say I'm creating a DynamoDB table called Products that contains any number of items that a user could purchase. An admin should be able to open a front-end page, enter product details, and send them to a Lambda, which creates a new Product in the Products table.
I understand that a partition key should be highly distributed to avoid hot partitions, so I was looking to use a productId (which would be a number) as the partition key. My question is: if DynamoDB has no concept of auto-increment fields, how can I create a unique key so as not to overwrite any item already in the table? I would not expect an admin to have to input a unique number when creating an item. I am planning on using a sort key.
Solution 1:[1]
There are many tools to generate unique ID values. Personally, I recommend you look at KSUID, a UID generator with the nice extra characteristic that its values sort naturally by timestamp. For a partition key (as in your case today) it doesn't matter, and any UID will work. It pays off later in situations where you use an ID in the sort key: with a KSUID the values will be in timestamp order, so you can pull out, for example, an item by ID or the 10 most recent items, both off the same index.
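To make the "sorted by timestamp" property concrete, here is a minimal sketch using only Node's built-in crypto module. It is not the KSUID format or any KSUID library; `timeSortableId` is a hypothetical helper that just illustrates the idea of a fixed-width timestamp prefix followed by a random suffix.

```javascript
const { randomBytes } = require('node:crypto');

// Hypothetical helper: NOT the KSUID format, only an illustration of the
// "timestamp prefix + random suffix" idea that makes IDs sort by creation time.
function timeSortableId(nowMs = Date.now()) {
  // 12 hex chars = 48 bits of millisecond timestamp, zero-padded so that
  // string order matches numeric (and therefore chronological) order.
  const timePart = nowMs.toString(16).padStart(12, '0');
  // 16 random bytes keep IDs created in the same millisecond distinct
  // with overwhelming probability.
  const randomPart = randomBytes(16).toString('hex');
  return timePart + randomPart;
}

const older = timeSortableId(1700000000000);
const newer = timeSortableId(1700000000001);
console.log(older < newer); // true
```

Because the timestamp part has a fixed width, plain string comparison orders the IDs by creation time, which is the behaviour a sort key query relies on. For production use, an established library is probably the safer choice.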
Solution 2:[2]
What you are looking for is an anti-pattern in DynamoDB. The whole purpose of going NoSQL is to speed up database reads by eliminating the need for locks, which auto-increment features require.
Have you considered using uuid?
npm install uuid
import { v4 as uuidv4 } from 'uuid';
uuidv4(); // ⇨ '9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d'
Otherwise, I would recommend a hybrid model in which an RDS database stores your product list and generates the unique IDs with its auto-increment feature. You can then store the other lock-intensive data (e.g. Transactions, Transaction Items) in DynamoDB.
Solution 3:[3]
Several responses already mentioned how you can create a unique id - time based or random - to ensure that each new product gets a unique id.
One problem with this approach is that the product-adding operation is not idempotent, i.e., if you do it twice (e.g., because of some network problem), you'll add the same product twice. One way to fix this is to calculate the item's key as a hash function of the initial content of the product - this way, if you try to add the exact same product twice, you'll set both under the same key and only get one item, not two.
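As a rough sketch of the hash-of-content idea, assuming the product is a flat object of JSON-serializable fields and that SHA-256 is an acceptable hash (both are my assumptions, and `productIdFromContent` is a hypothetical helper name):

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: derive a deterministic ID from the product's initial content.
function productIdFromContent(product) {
  // Serialize as sorted [key, value] pairs so that property order does not
  // change the hash. Assumes a flat object of JSON-serializable values.
  const canonical = JSON.stringify(
    Object.keys(product).sort().map((key) => [key, product[key]])
  );
  return createHash('sha256').update(canonical).digest('hex');
}

const a = productIdFromContent({ name: 'Desk lamp', price: 1999 });
const b = productIdFromContent({ price: 1999, name: 'Desk lamp' });
console.log(a === b); // true: same content, same key
```

Retrying the same request then writes to the same key instead of creating a second item. The trade-off is that two products with identical initial content are treated as one, so this only fits when that is acceptable.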
Finally, in some situations you have some mostly-unique way of defining a new key, which isn't always unique - perhaps some counter you can't fully trust to be concurrency-safe, the current time, and so on. You may want to use this key - but only if you can verify that some other concurrent process didn't beat you to using it. You can easily do this with a conditional update that sets the item only if an item with this key doesn't yet exist. This is safe with regard to concurrency. You can do it with a ConditionExpression like attribute_not_exists(p) or p <> :p (both conditions will fail if the item already exists, with p being the partition (and only) key in this example).
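Here is a minimal sketch of what such a conditional write request might look like. The table name `Products`, the attributes `productId` and `name`, the string-typed key, and the helper `buildCreateProductInput` are all assumptions for illustration. The snippet only builds and prints the request; the actual SDK call is left in comments because it needs credentials and a real table.

```javascript
// Hypothetical helper: builds the input for a DynamoDB PutItem call that only
// succeeds when no item with this productId exists yet.
// Table and attribute names (Products, productId, name) are assumptions.
function buildCreateProductInput(productId, name) {
  return {
    TableName: 'Products',
    Item: {
      productId: { S: productId },
      name: { S: name },
    },
    // Fails the write if an item with the same partition key is already stored.
    ConditionExpression: 'attribute_not_exists(#pk)',
    ExpressionAttributeNames: { '#pk': 'productId' },
  };
}

const input = buildCreateProductInput('example-id-123', 'Desk lamp');
console.log(JSON.stringify(input, null, 2));

// With the AWS SDK for JavaScript v3 (not imported here, since it needs
// credentials and a real table) the request would be sent roughly like this:
//   const { DynamoDBClient, PutItemCommand } = require('@aws-sdk/client-dynamodb');
//   await new DynamoDBClient({}).send(new PutItemCommand(input));
// A rejected condition surfaces as a ConditionalCheckFailedException.
```

If the condition fails, the caller can generate a new key and retry, or treat the failure as "already created" when the key was derived from the content.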
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | hunterhacker |
| Solution 2 | Allan Chua |
| Solution 3 | Nadav Har'El |
