Cannot use custom SQL function with arguments inside transform scope [Spark SQL] (Error in SQL statement: AnalysisException: Resolved attribute(s)...)
I am using a Spark SQL context in Azure Databricks.
My query uses the transform function to handle an array, like so:

```sql
SELECT
  colA,
  colB,
  transform(colC,
    x -> named_struct(
      "innerColA", functionA(x.innerColA), -- does not work
      "innerColB", [...x.innerColB...],    -- works (same logic as functionA)
      "test1", test1(),                    -- works
      "test2", test2(x.innerColA)          -- does not work
    )
  )
FROM
  tableA
```
I get the following error regarding the use of functionA:
```
Error in SQL statement: AnalysisException: Resolved attribute(s) x#2723416 missing from in operator !Project [cast(lambda x#2723416 as string) AS arg1#2723417].
```
functionA is simple enough that, if I inline its logic directly into the query, it works (as shown for "innerColB" in my code example).
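For reference, a hypothetical stand-in for functionA (its real body isn't shown above; upper() here is just a placeholder for equally simple logic):

```sql
-- Hypothetical stand-in; the real functionA wraps equivalent simple logic.
CREATE OR REPLACE FUNCTION functionA(arg1 STRING) RETURNS STRING RETURN upper(arg1);
```

Calling functionA(x.innerColA) inside transform fails, while inlining the same expression, e.g. upper(x.innerColA), works.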
I have tested simple functions that take no arguments, and they can be used without any issue:

```sql
CREATE OR REPLACE FUNCTION test1() RETURNS STRING RETURN "test"
```
But as soon as a function takes an argument, it throws that error:

```sql
CREATE OR REPLACE FUNCTION test2(arg1 STRING) RETURNS STRING RETURN "test"
```
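If it helps, here is a minimal repro that needs no table (assuming an array literal is enough to trigger the issue; the function definitions are the ones above):

```sql
-- Works: the zero-argument function never references the lambda variable
SELECT transform(array('a', 'b'), x -> test1());

-- Fails with: AnalysisException: Resolved attribute(s) x#... missing ...
SELECT transform(array('a', 'b'), x -> test2(x));
```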
Is this a limitation of Spark SQL? Are there any workarounds?
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0.