Speed up Django bulk_create with unique fields [duplicate]
I have a database with thousands of entries. Each entry has a unique
field, which I called unique_id, that is calculated from some other fields:
class ModelA(models.Model):
    name = models.CharField(max_length=256, help_text="Type a name")
    created_at = models.DateTimeField(auto_now_add=True)
    unique_id = models.CharField(max_length=256, unique=True, editable=False, default=calculate_my_unique_id)
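The default callable itself is not shown in the question; the sketch below is only a hypothetical placeholder (the hashing scheme and the uuid source are assumptions) illustrating that Django invokes a field default with no arguments, so it cannot read the instance's other fields at that point:
import hashlib
import uuid

def calculate_my_unique_id():
    # Hypothetical placeholder: Django calls field defaults with no
    # arguments, so the value computed here cannot depend on the
    # instance's other fields. The real derivation is not shown.
    return hashlib.sha256(uuid.uuid4().bytes).hexdigest()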
This is the algorithm I'm using to prevent duplicates (omitting a bit of code):
objectsToAdd = []
items_dict = ModelA.objects.in_bulk(field_name='unique_id')
unique_id = generate_unique_id()
if unique_id not in items_dict:
    objectsToAdd.append(
        ModelA(
            name=item_name, unique_id=unique_id
        )
    )
if objectsToAdd:
    ModelA.objects.bulk_create(objs=objectsToAdd)
The problem is that as the table grows, items_dict grows with it, and so
does the time spent on the duplicate check.
Is there a more efficient way to simply skip duplicates during the bulk insert?
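For context, a minimal sketch of one way to keep the lookup proportional to the batch rather than to the whole table: query only the unique_ids the batch could collide with. Here item_names and the up-front candidate list are assumptions standing in for the loop omitted from the question.
# Hypothetical batch of (name, unique_id) pairs computed up front.
candidate_items = [(item_name, generate_unique_id()) for item_name in item_names]

# Fetch only the ids this batch could collide with, not the whole table.
existing_ids = set(
    ModelA.objects.filter(
        unique_id__in=[uid for _, uid in candidate_items]
    ).values_list('unique_id', flat=True)
)

objectsToAdd = [
    ModelA(name=name, unique_id=uid)
    for name, uid in candidate_items
    if uid not in existing_ids
]
if objectsToAdd:
    ModelA.objects.bulk_create(objectsToAdd)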
EDIT: If I use ignore_conflicts=True, bulk_create returns all the objects
I tried to insert rather than only the ones that were actually created.
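A minimal sketch of one way to work around that return value, assuming the batch's unique_ids are known up front: record which ids already exist before the insert, then diff afterwards.
batch_ids = [obj.unique_id for obj in objectsToAdd]

# ids already present before the insert
already_there = set(
    ModelA.objects.filter(unique_id__in=batch_ids)
    .values_list('unique_id', flat=True)
)

ModelA.objects.bulk_create(objectsToAdd, ignore_conflicts=True)

# ids actually inserted by this call
created_ids = set(batch_ids) - already_there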
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow