Convert Django QuerySet to Pandas Dataframe and Maintain Column Order
Given a Django queryset like the following:
qs = A.objects.all().values_list('A', 'B', 'C', 'D', 'E', 'F')
I can convert my qs to a pandas dataframe easily:
df = pd.DataFrame.from_records(qs.values('A', 'B', 'C', 'D', 'E', 'F'))
However, the column order is not maintained. Immediately after conversion I need to specify the new order of columns and I'm not clear why:
df = df[['B', 'F', 'C', 'E', 'D', 'A']]
Why is this happening and what can I do differently to avoid having to set the dataframe columns explicitly?
Solution 1:[1]
Try:
df = pd.DataFrame.from_records("DATA_GOES_HERE", columns=['A', 'B', 'C', ...])
This uses the columns= parameter of DataFrame.from_records, which sets the column names and their order.
I believe you could also build the DataFrame with plain pd.DataFrame, passing in your lists together with the corresponding column names. That is more manual work up front, but it could work well for an automated job. The ordering issue may come up again here, but it is easily solved by rearranging the columns.
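As a rough sketch of the columns= approach: the snippet below uses plain tuples as a stand-in for the rows that qs.values_list('A', 'B', 'C', 'D', 'E', 'F') would yield, so it runs without Django. The sample values are made up for illustration.

```python
import pandas as pd

# Desired column order, written once and reused for the query and the frame.
columns = ['A', 'B', 'C', 'D', 'E', 'F']

# Stand-in for list(qs.values_list(*columns)): one tuple per row, in field order.
rows = [(1, 2, 3, 4, 5, 6), (7, 8, 9, 10, 11, 12)]

# columns= names the tuple positions, so the frame keeps exactly this order.
df = pd.DataFrame.from_records(rows, columns=columns)
print(list(df.columns))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Because tuples carry no key order of their own, the result should not depend on how dictionaries are ordered, which is the usual suspect when values() is passed straight to pandas.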
Solution 2:[2]
The answers above require listing the columns manually, but this can be avoided. I wrote a simpler version that does not require column names:
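def django_recordset_to_data_frame(django_recordset):
    # values_list() with no arguments yields one tuple per row, covering every model field
    mydf = pd.DataFrame.from_records(django_recordset.values_list())
    # Take the field names from the first instance's __dict__, skipping the leading '_state' entry
    mydf.columns = [col for col in django_recordset[0].__dict__.keys()][1:]
    return mydf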
You can use it like this, for instance with a News table:
django_recordset = News.objects.all()
panda_data_frame = django_recordset_to_data_frame(django_recordset)
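One caveat with the function above is that indexing django_recordset[0] fails on an empty queryset. A possible variant, sketched here under assumptions, reads the field names from the model's metadata instead. The helper name queryset_to_data_frame is hypothetical, and _meta.concrete_fields is a semi-internal Django attribute, so treat this as a sketch rather than a guaranteed API.

```python
import pandas as pd


def queryset_to_data_frame(queryset):
    """Build a DataFrame whose columns follow the model's field order."""
    # Assumption: `queryset` is a Django QuerySet, and _meta.concrete_fields
    # lists the model's database-backed fields in declaration order.
    names = [field.attname for field in queryset.model._meta.concrete_fields]
    # Request the same names explicitly so rows and labels are sure to line up.
    rows = list(queryset.values_list(*names))
    # An empty queryset simply yields an empty frame with the right columns.
    return pd.DataFrame.from_records(rows, columns=names)
```

Usage would mirror the example above, e.g. queryset_to_data_frame(News.objects.all()).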
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MattR |
| Solution 2 | Dharman |
