'Django - uploading data to a model with pandas ".to_sql()" causes problems for the foreign key field
I identified a problem that ".select_related()" doesn't work if the table with a foreign key was uploaded using pandas. Uploading data the alternative way using a loop is not an option because the table is too big.
Here's an example:
Set up
#Models
class Magazine(models.Model):
name = models.CharField(max_length=20, unique=True, primary_key=True)
founded = models.CharField(max_length=20, unique=True, primary_key=True)
def __str__(self):
return self.name
class Issue(models.Model):
name = models.ForeignKey(Magazine, on_delete=models.CASCADE, db_column="name")
week = models.CharField(max_length=10, unique=False)
def __str__(self):
return self.week
#Admin
from mainsite.models import Magazine, Issue
admin.site.register(Magazine)
admin.site.register(Issue)
#Migrated
python ../manage.py makemigrations
python ../manage.py migrate
Fill first table
import os
import sys
import pandas as pd
import django
sys.path.append(os.path.dirname(os.getcwd()))
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings') #configures the settings for the project. Need to do this before manipulating models
django.setup()
from mainsite.models import Magazine, Issue
magazineNames = ['journal science', 'naughty mag', 'funny cartoons', 'comic fans', 'lawncare', 'NEVERISSUED']
magazineFounded = ['1901', '1995', '2005', '2011', '1993', '1900']
Magazine.objects.bulk_create([Magazine(name=i, founded=j) for i,j in zip(magazineNames, magazineFounded)])
Fill second table
1.) THIS WORKS for Issue.objects.select_related("name"), but uses a loop. It works because I can specify the names being a Magazine() instance.
issueNames = ['journal science', 'journal science', 'naughty mag', 'funny cartoons', 'comic fans', 'lawncare', 'lawncare']
issueWeeks = ['1', '2', '3', '4', '5', '6', '7']
Issue.objects.bulk_create([Issue(name=Magazine(name=i), week=j) for i,j in zip(issueNames, issueWeeks)])
2.) THIS DOESNT WORK with Issue.objects.select_related("name") because the pandas to_sql is blind to the magazine instance.
from django.conf import settings
from sqlalchemy import create_engine
import psycopg2
issueData = pd.read_csv('../static/defaults/issues.csv')
engine = create_engine('sqlite:///C:\\Users\\amoli\\Desktop\\SEC_scraper_tool\\django2\\finagg\\mainsite-1\\db.sqlite3' , echo=False)
issueData.to_sql('mainsite_issue', con=engine, index_label='id',if_exists='replace', index=False)
The Error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\me\anaconda3\lib\site-packages\django\db\models\query.py", line 296, in __repr__
data = list(self[: REPR_OUTPUT_SIZE + 1])
File "C:\Users\me\anaconda3\lib\site-packages\django\db\models\query.py", line 302, in __len__
self._fetch_all()
File "C:\Users\me\anaconda3\lib\site-packages\django\db\models\query.py", line 1507, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "C:\Users\me\anaconda3\lib\site-packages\django\db\models\query.py", line 57, in __iter__
results = compiler.execute_sql(
File "C:\Users\me\anaconda3\lib\site-packages\django\db\models\sql\compiler.py", line 1361, in execute_sql
cursor.execute(sql, params)
File "C:\Users\me\anaconda3\lib\site-packages\django\db\backends\utils.py", line 103, in execute
return super().execute(sql, params)
File "C:\Users\me\anaconda3\lib\site-packages\django\db\backends\utils.py", line 67, in execute
return self._execute_with_wrappers(
File "C:\Users\me\anaconda3\lib\site-packages\django\db\backends\utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
File "C:\Users\me\anaconda3\lib\site-packages\django\db\backends\utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
File "C:\Users\me\anaconda3\lib\site-packages\django\db\utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "C:\Users\me\anaconda3\lib\site-packages\django\db\backends\utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
File "C:\Users\me\anaconda3\lib\site-packages\django\db\backends\sqlite3\base.py", line 477, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such column: mainsite_issue.id
Any solutions? Hopefully I'm unaware of a feature available by pandas, or there's a way to change the Issue's name field to a Magazine instance after the to_sql. Last option would be to use a loop + multithreadhing to create an object in a reasonable time.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
