Category "spark-cassandra-connector"

Reading from cassandra in Spark does not return all the data when using JoinWithCassandraTable

I'm querying data from Cassandra in Spark using SCC 2.5 and Spark 2.4.7 (pyspark). The table I'm reading from has a composite partition key (partition_key_1, pa

When I try fetch data from Amazon Keyspaces with Pyspark, I get Unsupported partitioner: com.amazonaws.cassandra.DefaultPartitioner Error

I'm not experienced in Java or Hadoop ecosystem. I configured my Spark cluster to connect to Amazon Keyspaces by using spark-cassandra-connector from Datastax.

Why does writing to AWS Keyspaces with spark-cassandra-connector return "Unsupported partitioner"?

I'm trying to write some data in aws keyspace with spark, but the follow message error shows: Exception in thread "main" java.lang.IllegalArgumentException: \

HDFS Date partition directory loop

I have a HDFS Directory as below. /user/staging/app_name/2022_05_06 Under such a directory I have around 1000 part files. I want to loop each of the part file

Error to write dataframe in Cassandra table on Amazon Keyspaces

I'm trying to write a dataframe on AWS (Keyspace), but I'm getting the following messages below: Stack: dfExploded.write.cassandraFormat(table = "table", keyspa

How did spark RDD map to Cassandra table?

I am new to Spark, and recently I saw a code is saving data in RDD format to Cassandra table. But I am not able to figure it out how it is doing the column mapp

Problem with cassandra-connector at "load()"

I downloaded succesfully this connector: com.datastax.spark:spark-cassandra-connector_2.11:2.5.1 And when I try to load the information with this line: data = s

Cannot connect to Cassandra in spark-shell

I am trying to connect to a remote cassandra cluster in my spark shell using the Spark-cassandra connector. But its throwing some unusual errors. I do the usual

How to fix org.apache.spark.SparkException: Job aborted due to stage failure Task & com.datastax.spark.connector.rdd.partitioner.CassandraPartition

In my project i am using spark-Cassandra-connector to read the from Cassandra table and process it further into JavaRDD but i am facing issue while processing C