Luigi/SQLite: How to update database after initial load?
I'm loading data into an SQLite database via Luigi with the following code:
    import sqlite3

    import luigi

    class LoadData(luigi.Task):
        def requires(self):
            return TransformData()

        def run(self):
            # The connection context manager commits on success.
            with sqlite3.connect('database.db') as db:
                cursor = db.cursor()
                cursor.execute("INSERT INTO prod SELECT * FROM staging;")

        def output(self):
            return luigi.LocalTarget('database.db')
This works, but when I want to update or insert new data, the task doesn't execute because Luigi considers it complete (database.db already exists).
Maybe I have misunderstood the proper use of LocalTarget. What is the right way to approach this?
EDIT: My question applies to the example given on this page (code for le_create_db.py). How do you handle updates and inserts in that example?
EDIT: This question about appending to a file is similar, but the marker-file solution does not work here because sqla expects an SQLAlchemyTarget as output. Are there any other answers, specifically about appending to a database?
Solution 1:[1]
Consider using a mock file: http://gouthamanbalaraman.com/blog/building-luigi-task-pipeline.html
Each execution then creates a new file, so the output never already exists and the task runs again.
Another option is to keep a marker table inside the database, as PostgresTarget does: https://luigi.readthedocs.io/en/stable/api/luigi.contrib.postgres.html#luigi.contrib.postgres.PostgresTarget
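A minimal sketch of the marker-table idea for SQLite, using only the standard library. The class name SQLiteMarkerTarget, the table name table_updates and the update_id values are assumptions for illustration, not part of Luigi's API. As far as I know, Luigi's default complete() only calls exists() on whatever output() returns, so an object like this could be returned from output().

```python
import sqlite3


class SQLiteMarkerTarget:
    """Hypothetical target: it 'exists' once update_id is recorded in a marker table."""

    def __init__(self, db_path, update_id, marker_table="table_updates"):
        self.db_path = db_path
        self.update_id = update_id
        # marker_table is interpolated into SQL, so it must be a trusted name.
        self.marker_table = marker_table

    def _ensure_table(self, conn):
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {self.marker_table} "
            "(update_id TEXT PRIMARY KEY, inserted TEXT DEFAULT CURRENT_TIMESTAMP)"
        )

    def exists(self):
        """Return True if this update_id has already been loaded."""
        conn = sqlite3.connect(self.db_path)
        try:
            self._ensure_table(conn)
            row = conn.execute(
                f"SELECT 1 FROM {self.marker_table} WHERE update_id = ?",
                (self.update_id,),
            ).fetchone()
            return row is not None
        finally:
            conn.close()

    def touch(self, conn):
        """Record the marker on the caller's connection; the caller commits."""
        self._ensure_table(conn)
        conn.execute(
            f"INSERT OR REPLACE INTO {self.marker_table} (update_id) VALUES (?)",
            (self.update_id,),
        )
```

In run(), calling touch(db) on the same connection that performs the INSERT lets the data and the marker commit together. Giving the task a new update_id (for example the load date) makes it runnable again.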
Solution 2:[2]
I had the same issue and was able to solve it by overriding the complete method to simply return False:
    def complete(self):
        return False
Now the task is re-run every time, even if the database file is present.
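Since complete() can hold any logic, a less blunt variant is to ask the database whether there is still something to load. A rough sketch, assuming that both tables already exist and share an id column that identifies a row. The id column and the helper name staging_has_unloaded_rows are hypothetical and not from the question.

```python
import sqlite3


def staging_has_unloaded_rows(db_path):
    """Return True if staging holds any row whose id is not yet in prod.

    Assumes the prod and staging tables exist and both have an id column.
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            """
            SELECT EXISTS (
                SELECT 1 FROM staging s
                WHERE NOT EXISTS (SELECT 1 FROM prod p WHERE p.id = s.id)
            )
            """
        ).fetchone()
        # SQLite returns 0 or 1 for EXISTS.
        return bool(row[0])
    finally:
        conn.close()


# Inside the task, complete() could then be written as (sketch, not run here):
#     def complete(self):
#         return not staging_has_unloaded_rows('database.db')
```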
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jesus Sono |
| Solution 2 | kchomski |
