I have a large number of JSON files that Spark can read in 36 seconds, but Spark 3.0 takes almost 33 minutes to read the same files. On closer analysis, it looks like Spark
I need to store to Cassandra and publish to Kafka multiple events, and call some final handler() only after all events are stored and published. I came across U
I have a map of String to IO, like this: Map[String, IO[String]]. I want to transform it into IO[Map[String, String]]. How can I do it?
activator new results in: Fetching the latest list of templates... Browse the list of templates: http://lightbend.com/activator/templates Choose from these f
I am trying to list all objects in AWS S3 buckets, given a bucket name and filter prefix, using the following code. import scala.collection.JavaConverters._ import
I have two identical Spark DataFrames. They have the same columns. I am trying to create an if-else statement in one line but couldn't find a better way to do it.
I am very new to programming in Scala. I am writing a test program to get the maximum value from JSON data. I have the following code: import scala.io.Source import sc
I'm learning the concept of F[_] as a constructor for other types, but how do you pronounce this to another human or say it in your head (for us internal monolo
I am trying to connect to a remote Cassandra cluster in my Spark shell using the Spark Cassandra connector, but it's throwing some unusual errors. I do the usual
I have followed this post pyspark error reading bigquery: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class and followed the resolution
I'm looking for a reliable way in Spark (v2+) to programmatically adjust the number of executors in a session. I know about dynamic allocation and the ability
I am trying to get data from Kafka into Flink. I use FlinkKafkaConsumer, but IntelliJ shows me that it is deprecated, and the SSH console in Google Cloud also shows me this e
These are the contents of my build.sbt file: name := "WordCounter" version := "0.1" scalaVersion := "2.13.1" libraryDependencies ++= Seq( "org.apache.spar
I have a CSV file with the data below. Id Subject Marks 1 M,P,C 10,8,6 2 M,P,C 5,7,9 3 M,P,C 6,7,4 I need to find the max value in the Marks column for each Id an
I have a Spark job that needs to store the last time it ran to a text file. This has to work on HDFS and also on the local filesystem (for testing). However it seems
I have a question about Behaviors.unhandled. I know that Akka sends the unhandled message to the Dead Letter and with the following configuration it also logs i
I am working on an implementation of a state machine in Scala. The original version is written in Python, so I have a lot of if/else clauses in the co
I have some Scala code that needs to be able to serialize/deserialize some Java classes using Json4s. I am using "org.json4s" %% "json4s-ext" % "4.0.5" and "org
Whenever I try to run my main program directly in IntelliJ I get this error: Error:(5, 12) object apache is not a member of package org import org.apache.common
I am writing unit tests for my Spark/Scala application. I am also using ScalaMock to mock objects, specifically Session / Session Factory. In one of my test