Why does writing to AWS Keyspaces with spark-cassandra-connector return "Unsupported partitioner"?
I'm trying to write some data to AWS Keyspaces with Spark, but the following error message appears:
Exception in thread "main" java.lang.IllegalArgumentException: \
Unsupported partitioner: com.amazonaws.cassandra.DefaultPartitioner
So I tried writing the same data with the plain Java client and it succeeded. I checked which dependencies Spark pulls in, and to my surprise they are the same as the plain Java client (the Java driver).
Why does the Java client write successfully while Spark does not? Could it be something about the connection, or authentication?
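For context, here is a minimal sketch of the kind of Spark write that triggers this exception. The endpoint, port, credentials, keyspace, and table names are placeholders rather than values from the question; adjust them to your own Keyspaces setup.

```scala
import org.apache.spark.sql.SparkSession

object KeyspacesWriteSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder connection settings for Amazon Keyspaces; the real values
    // come from your regional endpoint and service-specific credentials.
    val spark = SparkSession.builder()
      .appName("keyspaces-write")
      .config("spark.cassandra.connection.host", "cassandra.us-east-1.amazonaws.com")
      .config("spark.cassandra.connection.port", "9142")
      .config("spark.cassandra.connection.ssl.enabled", "true")
      .config("spark.cassandra.auth.username", "<service-user>")
      .config("spark.cassandra.auth.password", "<service-password>")
      .getOrCreate()

    import spark.implicits._
    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // The Spark Cassandra Connector validates the cluster's partitioner
    // before writing; against Keyspaces' DefaultPartitioner this is where
    // the "Unsupported partitioner" exception is thrown.
    df.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
      .mode("append")
      .save()

    spark.stop()
  }
}
```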
Solution 1:[1]
AWS Keyspaces uses a proprietary partitioner class, com.amazonaws.cassandra.DefaultPartitioner,
which isn't available in open-source Apache Cassandra. The write works with the Java driver because the driver can be used with custom partitioner classes.
However, the Spark-Cassandra connector only supports two partitioners:
Murmur3Partitioner
RandomPartitioner
You won't be able to use the Spark connector on AWS Keyspaces since their DefaultPartitioner
is not supported.
The Spark connector does not support Cassandra forks or CQL API variants, so we don't run tests against them. Cheers!
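To see what the connector is objecting to, you can ask the cluster for its partitioner with the plain Java driver. The snippet below is a rough sketch; it assumes the Keyspaces connection details (contact point, SSL, credentials) are supplied through the driver's application.conf and are not shown here.

```scala
import com.datastax.oss.driver.api.core.CqlSession

object PartitionerCheck {
  def main(args: Array[String]): Unit = {
    // Assumes driver configuration for Keyspaces (contact point, SSL,
    // credentials) is provided via application.conf.
    val session = CqlSession.builder().build()
    try {
      val row = session.execute("SELECT partitioner FROM system.local").one()
      // Against Keyspaces this reports com.amazonaws.cassandra.DefaultPartitioner,
      // which the Spark connector's validation rejects even though the plain
      // driver itself writes without complaint.
      println(s"partitioner = ${row.getString("partitioner")}")
    } finally {
      session.close()
    }
  }
}
```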
Solution 2:[2]
Keyspaces now supports the RandomPartitioner, which enables reading and writing data between Keyspaces and Apache Spark by using the open-source Spark Cassandra Connector. You just have to update the partitioner for your account.
Docs: https://docs.aws.amazon.com/keyspaces/latest/devguide/spark-integrating.html
Launch announcement: https://aws.amazon.com/about-aws/whats-new/2022/04/amazon-keyspaces-read-write-data-apache-spark/
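Per the AWS documentation linked above, the account-level partitioner is changed with a CQL UPDATE against system.local. The sketch below shows the idea using the Java driver; connection configuration is omitted, and the exact procedure should be confirmed against the docs.

```scala
import com.datastax.oss.driver.api.core.CqlSession

object SwitchToRandomPartitioner {
  def main(args: Array[String]): Unit = {
    // Assumes a session already configured for Keyspaces (SSL, credentials).
    val session = CqlSession.builder().build()
    try {
      // Statement per the linked AWS Keyspaces documentation: switch the
      // account's partitioner to RandomPartitioner so the Spark Cassandra
      // Connector accepts it.
      session.execute(
        "UPDATE system.local SET partitioner = " +
          "'org.apache.cassandra.dht.RandomPartitioner' WHERE key = 'local'")
    } finally {
      session.close()
    }
  }
}
```

Once the change has taken effect, the Spark write from the question should no longer fail with the unsupported-partitioner error.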
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Erick Ramirez |
| Solution 2 | Arturo Hinojosa |