How do I list my scheduled queries via the Python Google client API?

I have set up my service account and I can run queries on BigQuery using client.query().

I could just rewrite all my scheduled queries in this new client.query() format, but I already have many scheduled queries, so I was wondering whether there is a way to get/list the scheduled queries and then use that information to run them from a script.

In short: I can query BigQuery, but I want to list my scheduled queries.



Solution 1:[1]

Yes, you can use the APIs. When you don't know which one to use, here is a tip: use the command proposed by @Yev

bq ls --transfer_config --transfer_location=US --format=prettyjson

But log the API calls. For that, use the --apilog <logfile name> parameter, like this:

bq --apilog ./log ls --transfer_config --transfer_location=US --format=prettyjson

And, magically, you can find the API called by the command: https://bigquerydatatransfer.googleapis.com/v1/projects/<PROJECT-ID>/locations/US/transferConfigs?alt=json

Then, a simple Google search leads you to the correct documentation.


In Python, add this dependency to your requirements.txt: google-cloud-bigquery-datatransfer, and use this code:

from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = client.common_project_path("<PROJECT-ID>")

# list_transfer_configs returns a pager; iterate it to see each config
for config in client.list_transfer_configs(parent=parent):
    print(config)
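Once the configs are listed, the SQL of a scheduled query sits in each config's params under the query key, and only configs whose dataSourceId is scheduled_query are scheduled queries. A minimal sketch of pulling the SQL out so you can re-run it (the config dicts below are made-up examples with the same shape as the API/bq ls output; the commented client.query() usage is an assumption that needs real credentials):

```python
def scheduled_query_sql(configs):
    """Return {display_name: sql} for configs that are scheduled queries."""
    return {
        c["displayName"]: c["params"]["query"]
        for c in configs
        if c.get("dataSourceId") == "scheduled_query"
    }

# Example data mimicking the shape of the transfer-config output:
configs = [
    {"displayName": "daily_rollup",
     "dataSourceId": "scheduled_query",
     "params": {"query": "SELECT 1"}},
    {"displayName": "gcs_import",
     "dataSourceId": "google_cloud_storage",
     "params": {}},
]
print(scheduled_query_sql(configs))  # {'daily_rollup': 'SELECT 1'}

# With real configs you would then run each query, e.g.:
#   from google.cloud import bigquery
#   bq_client = bigquery.Client()
#   for name, sql in scheduled_query_sql(configs).items():
#       bq_client.query(sql).result()
```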

Solution 2:[2]

Using the Cloud SDK, there is a command that gets you what you want and more: bq ls --transfer_config --transfer_location=US --format=prettyjson; more about that here - List Scheduled Queries in BigQuery

Executing this within your command prompt (given the Google Cloud SDK is installed) gives you the following (with the red part being the scheduled query SQL):

(screenshot of the bq ls output, with the scheduled query SQL highlighted)


Following that, you can run this as a shell subprocess within Python and parse the output:

import json
from subprocess import PIPE, run

import pandas as pd

response = run('bq ls --transfer_config --transfer_location=US --format=prettyjson',
               stdout=PIPE,
               stderr=PIPE,
               universal_newlines=True,
               shell=True)

response

Here are the first few lines resulting from the above:

CompletedProcess(args='bq ls --transfer_config --transfer_location=US --format=prettyjson', returncode=0, stdout='[\n  {\n    "dataSourceId": "scheduled_query",\...

Then, to get to the SQL, you can access the output via response.stdout, parse it as JSON, and dictionary your way into the desired results, or get it into a pandas DataFrame and go from there, like below:

data = json.loads(response.stdout)
df = pd.json_normalize(data)

df.columns then contains:

dataSourceId
datasetRegion
destinationDatasetId
disabled
displayName
name
schedule
state
updateTime
userId
emailPreferences.enableFailureEmail
params.destination_table_name_template

### sql located in this one
params.query

params.write_disposition
scheduleOptions.startTime
params.overwrite_destination_table
params.source_dataset_id
params.source_project_id
scheduleOptions.endTime
nextRunTime
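Putting those columns to use: filter the DataFrame down to rows whose dataSourceId is scheduled_query and map each display name to its params.query. A sketch with an inline sample standing in for response.stdout (field names match the bq ls output above; the display names and query text are placeholders):

```python
import json

import pandas as pd

# Stand-in for response.stdout from the bq ls subprocess call above.
stdout = json.dumps([
    {"displayName": "daily_rollup",
     "dataSourceId": "scheduled_query",
     "params": {"query": "SELECT 1",
                "write_disposition": "WRITE_TRUNCATE"}},
    {"displayName": "gcs_load",
     "dataSourceId": "google_cloud_storage",
     "params": {}},
])

df = pd.json_normalize(json.loads(stdout))

# Keep only the scheduled queries and pull out their SQL:
sql_by_name = (df[df["dataSourceId"] == "scheduled_query"]
               .set_index("displayName")["params.query"]
               .to_dict())
print(sql_by_name)  # {'daily_rollup': 'SELECT 1'}
```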

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: guillaume blaquiere
Solution 2: