Cannot use custom SQL function with arguments inside transform scope [Spark SQL] (Error in SQL statement: AnalysisException: Resolved attribute(s)...)

I am using a Spark SQL context in Azure Databricks. My query uses the transform higher-order function to map over an array column, like so:

SELECT
  colA,
  colB,
  transform(colC,
    x -> named_struct(
        "innerColA", functionA(x.innerColA), -- does not work
        "innerColB", [...x.innerColB...],    -- works (same logic as functionA)

        "test1", test1(),                    -- works
        "test2", test2(x.innerColA)          -- does not work
    )
  )
FROM
  tableA

I get the following error regarding the use of functionA:

Error in SQL statement: AnalysisException: Resolved attribute(s) x#2723416 missing from in operator !Project [cast(lambda x#2723416 as string) AS arg1#2723417].

functionA is simple enough that, if I inline its logic directly into the query, it works (as shown with "innerColB" in my code example).
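
To illustrate the pattern, here is a sketch with a hypothetical functionA body (my real function is different, but it is a plain scalar expression of the same shape):

-- Hypothetical body, for illustration only
CREATE OR REPLACE FUNCTION functionA(arg1 STRING) RETURNS STRING
RETURN concat(upper(arg1), '_x');

-- Fails inside the lambda:
SELECT transform(colC, x -> functionA(x.innerColA)) FROM tableA;

-- Works when the same expression is inlined:
SELECT transform(colC, x -> concat(upper(x.innerColA), '_x')) FROM tableA;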

I have tested with simple functions that don't take any arguments and they can be used without any issues:

CREATE OR REPLACE FUNCTION test1() RETURNS STRING RETURN "test"
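
For example, calling it inside the lambda resolves fine:

-- Zero-argument SQL UDF inside a lambda: no error
SELECT transform(array(1, 2, 3), x -> test1());
-- returns ["test", "test", "test"]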

But as soon as the function takes an argument, the same error is thrown:

CREATE OR REPLACE FUNCTION test2(arg1 STRING) RETURNS STRING RETURN "test"
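
Even the most trivial call that passes the lambda variable through reproduces it:

-- Passing the lambda variable into the SQL UDF: fails to resolve
SELECT transform(array('a', 'b'), x -> test2(x));
-- Error in SQL statement: AnalysisException: Resolved attribute(s) x#... missing from ...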

Is this a limitation of Spark SQL? Are there any workarounds?
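
The only workaround I can think of is to explode the array, apply the function on the flattened rows, and re-aggregate. A sketch, assuming colA and colB together identify a row (note that collect_list does not guarantee the original element order, and explode drops rows whose array is empty or NULL; explode_outer keeps them):

-- Explode, apply the UDF on flat rows, then re-aggregate
SELECT
  colA,
  colB,
  collect_list(
    named_struct(
      'innerColA', functionA(x.innerColA),
      'innerColB', x.innerColB
    )
  ) AS colC
FROM (
  SELECT colA, colB, explode(colC) AS x
  FROM tableA
)
GROUP BY colA, colB;

This feels heavy for what should be a simple map over an array, so I am hoping there is something cleaner.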


