'How to cleanup BigQuery export to CSV using pandas
I exported my BigQuery data to CSV but can't figure out how to clean up the data as the headers are all appended on the backend in the same row.
Here's my code:
from google.cloud import bigquery
import pandas as pd
project = 'project1'
client = bigquery.Client(project=project)
defineQuery = """
SELECT *
FROM table
LIMIT 5;
"""
df = pd.DataFrame(client.query(defineQuery))
df.to_csv(r'C:/file-name.csv')
The BigQuery CSV file export came out with just 2 columns like the following:-
| | 0 |
|-----|-------|
| 1|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 2|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 3|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 4|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 5|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
How do I clean it up so it's like the following?
| | Field 1 | Field 2 | Field 3 |
|-----|---------|---------|---------|
| 1| 123 | data | data2 |
| 2| 123 | data | data2 |
| 3| 123 | data | data2 |
| 4| 123 | data | data2 |
| 5| 123 | data | data2 |
Thanks in advance!
Solution 1:[1]
Try this:
df = client.query(defineQuery).to_dataframe()
instead of
df = pd.DataFrame(client.query(defineQuery))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jaytiger |
