'Issue with splitting string in bigquery
I cant seem to split this string, I would like to get the last 8 numbers as 'Date' in the format of YYYY-MM-DD. I would like to use the _filename to feed through and generate the date.
select
split(split("gs://mmmm_ssss_count_detail/mmm_ssss_count_detail_20220125.csv",'/')[offset(2)],'_')[offset(3)]
This just gives me 'detail' which is not what I need, it should at least give me '20220125.csv' where I can then remove the .csv part and parsedate it to 'Date' in my main select query.
Help please.
Solution 1:[1]
Instead of a split, can you use a regex to find the date value? If so try the following:
select
parse_date("%Y%m%d",regexp_extract(path, r'_(\d+)\.csv'))
from sample_data
The regex string above is looking for a pattern where a set of digits (\d+) exists between an _ and .csv
With the sample string you provided it yields:
Solution 2:[2]
Consider below approach / option
select _filename,
parse_date('%Y%m%d', array_reverse(split(replace(_filename, '.', '_'), '_'))[offset(1)]) file_date
from your_table
if applied to sample data in your question - output is
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Daniel Zagales |
| Solution 2 | Mikhail Berlyant |


