Spark 3.1.2 NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction.toAggregateExpression$default$2()Lscala/Option;
I'm building a jar and running it on an EMR cluster.
I'm using the spark-alchemy version below, and calling the hll_init_agg function inside .agg throws the error above.
Code where it's called:
Dataset<Row> groupByDf = df
.groupBy(functions.col("A"), functions.col("DAY"), functions.col("C"), functions.col("D"))
.agg(com.swoop.alchemy.spark.expressions.hll.functions.hll_init_agg(functions.col("ID"), 0.005, "AGKN").alias("NEWID"));
<dependency>
    <groupId>com.swoop</groupId>
    <artifactId>spark-alchemy_2.12</artifactId>
    <version>1.1.0</version>
</dependency>
Stack trace:
22/03/23 07:06:22 ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction.toAggregateExpression$default$2()Lscala/Option;
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction.toAggregateExpression$default$2()Lscala/Option;
    at com.swoop.alchemy.spark.expressions.WithHelper.withAggregateFunction(WithHelper.scala:13)
    at com.swoop.alchemy.spark.expressions.WithHelper.withAggregateFunction$(WithHelper.scala:10)
    at com.swoop.alchemy.spark.expressions.hll.functions$.withAggregateFunction(HLLFunctions.scala:653)
    at com.swoop.alchemy.spark.expressions.hll.HLLFunctions.hll_init_agg(HLLFunctions.scala:695)
    at com.swoop.alchemy.spark.expressions.hll.HLLFunctions.hll_init_agg$(HLLFunctions.scala:695)
    at com.swoop.alchemy.spark.expressions.hll.functions$.hll_init_agg(HLLFunctions.scala:653)
    at com.swoop.alchemy.spark.expressions.hll.functions.hll_init_agg(HLLFunctions.scala)
    at com.xxx.xxx.xxx.xxxx(xxxx.java:103)
    at com.xxx.xxx.xxxxx(MainClass.java:315)
    at com.xxxx.xxxx.main(MainClass.java:104)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:728)
Exception in thread "main" org.apache.spark.SparkException: Application application_1647925696500_0096 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1196)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1587)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:936)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1015)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1024)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/03/23 10:06:14 INFO ShutdownHookManager: Shutdown hook called
22/03/23 10:06:14 INFO ShutdownHookManager: Deleting directory /tmp/spark-a055b36c-0f9a-46f0-9575-893d300705f8
22/03/23 10:06:14 INFO ShutdownHookManager: Deleting directory /tmp/spark-9d482629-a936-4c16-8df4-ac426dcc12ff
Is there any guidance on how to resolve this issue? Any suggestion is much appreciated.
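A NoSuchMethodError on a Spark-internal class like AggregateFunction usually means the application jar was compiled against a different spark-catalyst version than the one on the cluster's runtime classpath. As a quick check (a minimal diagnostic sketch, not part of the original question; the class name CatalystJarCheck is illustrative), you can print which spark-catalyst jar the JVM actually loads, either by running this via spark-submit on the cluster or by pasting the two statements into the driver before the failing call:

```java
import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction;

import java.security.CodeSource;

public class CatalystJarCheck {
    public static void main(String[] args) {
        // Print the jar that AggregateFunction was loaded from, e.g.
        // .../spark-catalyst_2.12-3.0.1-amzn-0.jar vs .../spark-catalyst_2.12-3.1.2.jar.
        // getCodeSource() can return null for classes loaded by the bootstrap classloader.
        CodeSource src = AggregateFunction.class.getProtectionDomain().getCodeSource();
        System.out.println("spark-catalyst loaded from: "
                + (src != null ? src.getLocation() : "unknown (bootstrap classloader)"));
    }
}
```

If the jar reported on the cluster differs from the spark-catalyst version the build compiled against, the two need to be aligned, which is exactly what the accepted solution below did.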
Solution 1:[1]
Resolved the issue. There was a jar version mismatch: Spark on the EMR cluster had spark-catalyst_2.12-3.0.1-amzn-0.jar, while our Maven dependencies had pulled in spark-catalyst_2.12-3.1.2-amzn-0.jar.
After aligning the jar version on 3.1.2, I was able to run the job.
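More generally, this class of mismatch is avoided by compiling against the same Spark line the cluster runs and letting the cluster provide the Spark jars at runtime. Below is a minimal pom.xml sketch under those assumptions (Maven build, Scala 2.12 artifacts, cluster on Spark 3.1.2); everything except the spark-alchemy coordinates from the question is illustrative:

```xml
<properties>
    <!-- Match the Spark version installed on the EMR cluster. -->
    <spark.version>3.1.2</spark.version>
</properties>

<dependencies>
    <!-- Compile against the cluster's Spark line; 'provided' keeps the
         EMR-supplied spark-sql/spark-catalyst jars authoritative at runtime. -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <!-- spark-alchemy itself is bundled into the application jar. -->
    <dependency>
        <groupId>com.swoop</groupId>
        <artifactId>spark-alchemy_2.12</artifactId>
        <version>1.1.0</version>
    </dependency>
</dependencies>
```

With the Spark artifacts marked provided, the application jar no longer ships its own spark-catalyst, so the version installed on the cluster is the only one in play at runtime.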
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Deepak |
