How to get better performance when updating a database from AWS EC2 using SQLAlchemy (Flask)
I want to update the state column for every row of the Org table based on another column, location, by passing location as a string into the Python module usaddress.tag(). However, due to the volume of rows in the Org table, the following approach crashed the AWS EC2 server:
```python
# .all() loads every matching Org row into memory at once
orgs = db.session.query(Org).filter(Org.location != None).all()
for org in orgs:
    address_dict = usaddress.tag(org.location)  # tag each row's location string
    org.state_abbr = address_dict[0].get("StateName", None)
db.session.commit()
```
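For context, usaddress.tag() parses a free-form address string into labeled components; a small standalone example (the address itself is made up):

```python
import usaddress

# usaddress.tag() returns a (components, address_type) tuple, where the first
# element is an OrderedDict of labeled parts. That is why the loop above
# indexes [0] before looking up "StateName".
tagged, addr_type = usaddress.tag("123 Main St, Springfield, IL 62701")
print(tagged.get("StateName"))  # "IL"
print(addr_type)                # typically "Street Address" for input like this
```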
I have also tried the update() method directly, but it tells me that usaddress.tag() expects a string or binary format:
```python
orgs = db.session.query(Org).filter(Org.location != None).update(
    {"state_abbr": usaddress.tag(Org.location)[0].get("StateName", None)},
    synchronize_session=False,
)
```
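My understanding (please correct me if this is wrong) is that the error comes from Org.location being a SQLAlchemy column expression rather than a per-row string, so usaddress.tag() is evaluated once in Python while the query is being built instead of once per row:

```python
# Org.location at the class level is a column expression, not a string,
# so usaddress.tag(Org.location) receives a non-string argument and fails
# before the UPDATE statement is ever built.
print(type(Org.location))  # <class 'sqlalchemy.orm.attributes.InstrumentedAttribute'>
```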
I cannot do this with a plain SQL UPDATE, because extracting the state from location is only practical with the usaddress module.
Is there any way to update the database without overwhelming the server?
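For concreteness, this is roughly the kind of batched approach I have been considering (the chunk size and the Org.id primary key name are assumptions, and errors from ambiguous addresses are not handled), but I am not sure it is the right way to go:

```python
import usaddress

BATCH_SIZE = 1000  # made-up chunk size

# Load only primary keys up front so full rows never sit in memory all at once.
# (Assumes the primary key column is Org.id.)
org_ids = [row.id for row in db.session.query(Org.id).filter(Org.location != None)]

for start in range(0, len(org_ids), BATCH_SIZE):
    batch = org_ids[start:start + BATCH_SIZE]
    for org in db.session.query(Org).filter(Org.id.in_(batch)):
        tagged, _ = usaddress.tag(org.location)
        org.state_abbr = tagged.get("StateName", None)
    db.session.commit()       # flush and commit one batch at a time
    db.session.expunge_all()  # drop the batch from the identity map to free memory
```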
Any help would be much appreciated. Thank you all very much in advance.