'Big Query Slot estimator

Currently, GCP Projects use BigQuery with "On Demand" pricing and I search to know how many slots I should purchased if I switch to "flat rate" pricing. I have top folder with projects inside and I search query which can provide slot usage for this folder, to purchase right number of slot.

I try something like this but i don't understand how I can targeted specific folder and if this query will works fine

SELECT
SUM(total_slot_ms) / (1000*60*60*24*7) AS avg_slots
FROM `region-eu`.INFORMATION_SCHEMA.JOBS_BY_FOLDER
WHERE
 -- Filter by the partition column first to limit the amount of data scanned. Eight days
 -- allows for jobs created before the 7 day end_time filter.
 creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 8 DAY) AND CURRENT_TIMESTAMP()
 AND job_type = "QUERY"
 AND end_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND CURRENT_TIMESTAMP()

I can retrieve folder number to target it but it is an array request with select *



Solution 1:[1]

You can calculate the slots with this formula:

Number of slot = total_slot_ms / TIMESTAMP_DIFF(end_time,start_time, MILLISECOND)

You can use this example query:

   select job_id ,total_slot_ms / TIMESTAMP_DIFF(end_time,start_time,MILLISECOND) as num_slot from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT

You can filter by project_id, project_number, job_id or referenced_tables.

or manually using BQ UI execution details if you don't have access to the table above:

Number of slot = Slot time consumed (convert in MILLISECOND)/Elapses time (convert in MILLISECOND)

Another option is to use BigQuery monitoring with the System Tables Reports Dashboard to check how many slots your account is using.

You can see more documentation about monitoring and BigQuery.

To extract the folder_name from the array, you need to use the SAFE_OFFSET function. You can see this example with OFFSET.

WITH items AS
 (SELECT ["apples", "bananas", "pears", "grapes"] as list union all
 SELECT ["watermelon", "peach", "orange", "avocado"]
 )
 SELECT
 list[SAFE_OFFSET(0)] as safe_offset_0,
 list[SAFE_OFFSET(1)] as safe_offset_1,
 list[SAFE_OFFSET(2)] as safe_offset_2,
 list[SAFE_OFFSET(3)] as safe_offset_3
FROM items;

This is the output.

enter image description here

In your case could be something like this:

SELECT folder_numbers[SAFE_OFFSET(0)] as id,folder_numbers[SAFE_OFFSET(2)] as folder_number,
SUM(total_slot_ms) / (1000*60*60*24*7) AS avg_slots
FROM `region-eu`.INFORMATION_SCHEMA.JOBS_BY_FOLDER
WHERE...........
GROUP BY  total_slot_ms,folder_numbers[SAFE_OFFSET(1)] 

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1