Question:
In BigQuery, standard SQL, if I run
SELECT *
FROM mytable
CROSS JOIN UNNEST(mytable.array)
Can I be certain that the resulting row order is the same as the array order?
Example:
Let's say I have the following table mytable:
Row | id | prediction
1 | abcd | [0.2, 0.5, 0.3]
If I run SELECT * FROM mytable CROSS JOIN UNNEST(mytable.prediction), can I be certain that the row order is the same as the array order? I.e. will the resulting table always be:
Row | id | unnested_prediction
1 | abcd | 0.2
2 | abcd | 0.5
3 | abcd | 0.3
More background on use case (argmax):
I'm trying to find the array index with the largest value for the array in each row (argmax), i.e. the second element (0.5) in the array above. My target output is thus something like this:
Row | id | argmax
1 | abcd | 2
Using CROSS JOIN, a DENSE_RANK window function ordered by the prediction value and a ROW_NUMBER window function to find the argmax, I am able to make this work with some test data. You can verify with this query:
WITH predictions AS (
SELECT 'abcd' AS id, [0.2, 0.5, 0.3] AS prediction
UNION ALL
SELECT 'efgh' AS id, [0.7, 0.2, 0.1] AS prediction
),
ranked_predictions AS (
SELECT
id,
ROW_NUMBER() OVER (PARTITION BY id) AS rownum, -- This is the ordering I'm curious about
DENSE_RANK() OVER (PARTITION BY id ORDER BY flattened_prediction DESC) AS array_rank
FROM
predictions P
CROSS JOIN
UNNEST(P.prediction) AS flattened_prediction
)
SELECT
id,
rownum AS argmax
FROM
ranked_predictions
WHERE array_rank = 1
It could just be a coincidence that ROW_NUMBER behaves well in my tests (i.e. that it is ordered according to the unnested array), so it would be nice to be certain.