'How to compute number of friends of friends in a BigQuery table with repeated records?
I have a BigQuery table with the following format:
| person | friends.name | friends.year |
|---|---|---|
| John | Mary | 1977 |
| Mike | 1984 | |
| Mary | John | 1980 |
| Mike | John | 1977 |
| Jane | 1971 |
I want to compute, for each person, the maximum year in a separate column, and also for each friends record I would like to get the number of friends that each of the friends has (which would be achieved either with a self join, or with a window function).
I am not sure how to write this query, my approach so far has been:
SELECT person,
ARRAY(SELECT AS STRUCT f.name, f.year FROM UNNEST (Friends) f),
ARRAY_LENGTH(friends) AS number_friends
FROM table
However, this does not compute the number of friends for each array struct value. This is the output I am expecting:
| person | friends.name | friends.year | friends.num_friends | max_year |
|---|---|---|---|---|
| John | Mary | 1977 | 1 | 1984 |
| Mike | 1984 | 2 | ||
| Mary | John | 1980 | 2 | 1980 |
| Mike | John | 1977 | 2 | 1977 |
| Jane | 1971 | 0 |
How can I write this query in an optimised way?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
