Update MySQL with API data via SQLAlchemy while avoiding repetition
I download forecasting time series data from an API multiple times a day, since the prediction (for the same future dates and times) is updated based on new information.
What would be the most efficient way to upload these data to MySQL: a batch CSV import or SQLAlchemy?
I am trying the latter, i.e. updating the database from a pandas DataFrame through SQLAlchemy. The problem is that `df.to_sql` only lets me 'append', which creates duplicate records for the same date and time.
Is there a way around this?
Here is my code so far, where df represents the DataFrame with data that should be updated or inserted in MySQL. There is no id column, since MySQL auto-increments the primary key.
Old data downloaded 4 hours ago (dfold):
| Datetime | City | Temperature_C |
|----------------------|------------------|----------------|
| 2021-03-29 15:00:00 | Oslo | 0.0 |
| 2021-03-29 16:00:00 | Oslo | 1.5 |
New data downloaded now (df):
| Datetime | City | Temperature_C |
|----------------------|------------------|----------------|
| 2021-03-29 15:00:00 | Oslo | -4.0 |
| 2021-03-29 16:00:00 | Oslo | 1.5 |
| 2021-03-29 17:00:00 | Oslo | 2.5 |
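For context, the de-duplication I want on the database side looks like this in pandas (a client-side sketch only; `dfold` and `df` are rebuilt here from the tables above): stack the two downloads and keep only the latest value per (Datetime, City) key.

```python
import pandas as pd

# The two downloads above, rebuilt as DataFrames.
dfold = pd.DataFrame({
    'Datetime': ['2021-03-29 15:00:00', '2021-03-29 16:00:00'],
    'City': ['Oslo', 'Oslo'],
    'Temperature_C': [0.0, 1.5],
})
df = pd.DataFrame({
    'Datetime': ['2021-03-29 15:00:00', '2021-03-29 16:00:00',
                 '2021-03-29 17:00:00'],
    'City': ['Oslo', 'Oslo', 'Oslo'],
    'Temperature_C': [-4.0, 1.5, 2.5],
})

# Stack old and new, then keep only the newest row per (Datetime, City),
# so 15:00 ends up with -4.0 rather than two conflicting rows.
merged = (pd.concat([dfold, df])
            .drop_duplicates(subset=['Datetime', 'City'], keep='last')
            .reset_index(drop=True))
```

A plain append, by contrast, would store both the 0.0 and the -4.0 rows for 15:00.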
Python code:
```python
import sqlalchemy

connect_url = sqlalchemy.engine.URL.create(
    'mysql+pymysql',
    username='user',
    password='test',
    host='somehost',
    database='dbname',
)
engine = sqlalchemy.create_engine(connect_url)

try:
    engine.connect()
except Exception as error:
    print(error)

# dataframe with some data, no index info
df.to_sql('tablename', engine, if_exists='append', method='multi', index=False)
```
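One way to avoid the duplicate rows is to upsert instead of append. This is a sketch, not a tested solution: it assumes the MySQL table declares a UNIQUE key on (Datetime, City) (without one, MySQL cannot detect the duplicates), and it uses the `method=` callable hook of `DataFrame.to_sql` together with MySQL's `INSERT ... ON DUPLICATE KEY UPDATE` via SQLAlchemy's MySQL dialect. The `upsert` name is my own helper, not a pandas API.

```python
from sqlalchemy.dialects.mysql import insert

def upsert(table, conn, keys, data_iter):
    """to_sql `method=` hook emitting INSERT ... ON DUPLICATE KEY UPDATE.

    Assumes the MySQL table has a UNIQUE key on (Datetime, City).
    """
    # `table` is pandas' SQLTable wrapper; `.table` is the SQLAlchemy Table.
    rows = [dict(zip(keys, row)) for row in data_iter]
    stmt = insert(table.table).values(rows)
    # When the unique key already exists, overwrite the stored columns
    # with the freshly downloaded values.
    stmt = stmt.on_duplicate_key_update(
        {name: stmt.inserted[name] for name in keys}
    )
    conn.execute(stmt)

# Plug it in instead of the plain append (method='multi' is then unneeded):
# df.to_sql('tablename', engine, if_exists='append', index=False, method=upsert)
```

With this, re-downloading the forecast simply updates Temperature_C for existing (Datetime, City) rows and inserts rows for new timestamps.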
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
