'Google Apps Script for BigQuery - different results of REGEXP_CONTAINS
I have troubles with my own SQL and REGEXP_CONTAINS, so I decided to test official BigQuery example from here. I used SQL from example, which shows how the REGEXP_CONTAINS works. See the SQL here:
SELECT
email,
REGEXP_CONTAINS(email, r'^([\w.+-]+@foo\.com|[\w.+-]+@bar\.org)$')
AS valid_email_address,
REGEXP_CONTAINS(email, r'^[\w.+-]+@foo\.com|[\w.+-]+@bar\.org$')
AS without_parentheses
FROM
(SELECT
['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']
AS addresses),
UNNEST(addresses) AS email;
+----------------+---------------------+---------------------+
| email | valid_email_address | without_parentheses |
+----------------+---------------------+---------------------+
| [email protected] | true | true |
| [email protected] | false | true |
| [email protected] | true | true |
| [email protected] | false | true |
| [email protected] | false | false |
+----------------+---------------------+---------------------+
This SQL works as expected when I compose new query directly in BigQuery editor. But If I want to use SQL with the REGEXP_CONTAINS in Google Apps Script, then it doesn't work. The script I use is:
/**
* Runs a BigQuery query and logs the results in a spreadsheet.
*/
function testSQL() {
// Replace this value with the project ID listed in the Google
// Cloud Platform project.
var projectId = 'bigquery-xxxx';
var request = {
query: 'SELECT '+
'email, '+
'REGEXP_CONTAINS(email, r"^([\w.+-]+@foo\.com|[\w.+-]+@bar\.org)$")'+
' AS valid_email_address, '+
'REGEXP_CONTAINS(email, r"^[\w.+-]+@foo\.com|[\w.+-]+@bar\.org$")'+
' AS without_parentheses '+
'FROM'+
' (SELECT'+
' ["[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]"]'+
' AS addresses),'+
' UNNEST(addresses) AS email;',
useLegacySql: false
};
var queryResults = BigQuery.Jobs.query(request, projectId);
var jobId = queryResults.jobReference.jobId;
// Check on status of the Query Job.
var sleepTimeMs = 500;
while (!queryResults.jobComplete) {
Utilities.sleep(sleepTimeMs);
sleepTimeMs *= 2;
queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId);
}
// Get all the rows of results.
var rows = queryResults.rows;
while (queryResults.pageToken) {
queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId, {
pageToken: queryResults.pageToken
});
rows = rows.concat(queryResults.rows);
}
if (rows) {
var spreadsheet = SpreadsheetApp.create('BiqQuery Results');
var sheet = spreadsheet.getActiveSheet();
// Append the headers.
var headers = queryResults.schema.fields.map(function(field) {
return field.name;
});
sheet.appendRow(headers);
// Append the results.
var data = new Array(rows.length);
for (var i = 0; i < rows.length; i++) {
var cols = rows[i].f;
data[i] = new Array(cols.length);
for (var j = 0; j < cols.length; j++) {
data[i][j] = cols[j].v;
}
}
sheet.getRange(2, 1, rows.length, headers.length).setValues(data);
Logger.log('Results spreadsheet created: %s',
spreadsheet.getUrl());
} else {
Logger.log('No rows returned.');
}
}
The script above creates the sheet and the results are:
What is wrong with REGEXP_CONTAINS?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|

