'How to divide up a BigQuery table to smaller ones using groupby

Below is a demonstration of what I am aiming for in Pandas.

import pandas as pd
import numpy as np

Create a mock dataframe with 4 columns:

mockdf = pd.DataFrame({'col0':['x','y','y','y','x','x'],
                       'col1':['a','a','b','b','b','a'],
                       'col2':np.random.RandomState(42).normal(size=6),
                       'col3':np.random.RandomState(43).normal(size=6)})

mockdf is:

  col0 col1      col2      col3
0    x    a  0.496714  0.257400
1    y    a -0.138264 -0.908481
2    y    b  0.647689 -0.378503
3    y    b  1.523030 -0.534916
4    x    b -0.234153  0.858073
5    x    a -0.234137 -0.413010

I perform a groupby using col0 and col1. and create a dict where the key is the given group keys, the value is the dataframe:

gs = {'_'.join(key): value for key, value in mockdf.groupby(['col0','col1'])}

then, for example, gs['x_a'] is:

  col0 col1      col2     col3
0    x    a  0.496714  0.25740
5    x    a -0.234137 -0.41301

Uploading mockdf to BigQuery via:

mockdf.to_gbq(f"ourtable.mockdf", project_id="ourproject")

Dataframe uploads successfully.

How can I split the BigQuery table into smaller ones using groupby? Ideally, the names of these new tables should reflect which group they belong to.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source

'How to divide up a BigQuery table to smaller ones using groupby

Sources

Related Questions