Category "scala"

How to efficiently remove duplicate rows in a Spark DataFrame, keeping the row with the highest timestamp

I have a large data set which I am reading from Postgres. It has an ID column, a timestamp column and several other columns which may have been updated. For eac
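
A minimal sketch of the usual approach: a window partitioned by the ID and ordered by the timestamp descending, keeping only the top-ranked row (column names id and updated_at are assumptions):

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{col, row_number}

    // Rank rows within each id by timestamp, newest first, and keep rank 1.
    val w = Window.partitionBy("id").orderBy(col("updated_at").desc)

    val deduped = df
      .withColumn("rn", row_number().over(w))
      .filter(col("rn") === 1)
      .drop("rn")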

How to extract values from key value map?

I have a column of type map, where the keys and values change. I am trying to extract the values and create a new column. Input: ----------------+ |symbols
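
A sketch of one way to do this with map_values, assuming the map column is named symbols and holds a single entry per row:

    import org.apache.spark.sql.functions.{col, map_values}

    // map_values turns the map into an array of its values;
    // (0) picks the first (and here only) element.
    val extracted = df.withColumn("value", map_values(col("symbols"))(0))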

sbt-assembly: deduplication found error

I am not sure whether a MergeStrategy or excluding jars is the best option here. Any help on how to proceed with this error would be great! [sameert@pzx
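
A common starting point in build.sbt is a blunt merge strategy that discards META-INF conflicts and otherwise keeps the first file found (a sketch; real builds often need finer-grained cases):

    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case _                             => MergeStrategy.first
    }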

How to use Spark with large decimal numbers?

My database has numeric values, up to 256-bit unsigned integers. However, Spark's DecimalType has a limit of Decimal(38,18). When I try to do calculatio
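
One workaround sketch: keep the oversized values as strings in the DataFrame and do the arithmetic with BigInt inside a UDF (column names a and b are assumptions):

    import org.apache.spark.sql.functions.{col, udf}

    // DecimalType tops out at 38 digits of precision, so compute with BigInt
    // on string-encoded values instead.
    val addBig = udf((a: String, b: String) => (BigInt(a) + BigInt(b)).toString)

    val result = df.withColumn("sum", addBig(col("a"), col("b")))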

Circe list deserialization with best-attempt and error reporting

I'm using Circe to deserialize JSON containing a list. Sometimes a few items in the JSON list are corrupted, and that causes the entire deserialization to fail.
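
A best-effort sketch: decode the array into raw Json values first, then decode each element individually, so one bad item only produces an error instead of failing the whole list (the Item class and its decoder are assumptions; partitionMap needs Scala 2.13):

    import io.circe.{Decoder, Json}
    import io.circe.parser.parse

    case class Item(name: String)
    implicit val itemDecoder: Decoder[Item] = Decoder.forProduct1("name")(Item.apply)

    // Returns the failures alongside the successfully decoded items.
    def decodeBestEffort(raw: String): (List[io.circe.Error], List[Item]) = {
      val elements = parse(raw).flatMap(_.as[List[Json]]).getOrElse(Nil)
      elements.map(_.as[Item]).partitionMap(identity)
    }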

Consume TCP stream and redirect it to another Sink (with Akka Streams)

I'm trying to redirect/forward a TCP stream to another Sink with Akka 2.4.3. The program should open a server socket, listen for incoming connections, and then consum
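
A minimal sketch against the Akka 2.4 API: bind a server socket and hand each connection a flow whose sink consumes the bytes and whose source never emits, so nothing is written back:

    import akka.actor.ActorSystem
    import akka.stream.ActorMaterializer
    import akka.stream.scaladsl.{Flow, Sink, Source, Tcp}
    import akka.util.ByteString

    implicit val system = ActorSystem("tcp-forward")
    implicit val mat = ActorMaterializer()

    Tcp().bind("127.0.0.1", 8888).runForeach { conn =>
      val consume = Sink.foreach[ByteString](bytes => println(bytes.utf8String))
      conn.handleWith(Flow.fromSinkAndSource(consume, Source.maybe[ByteString]))
    }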

getStrLn in ZIO working with flatMap but not inside for comprehension

val askNameFlatMap: ZIO[Console, IOException, Unit] = putStrLn("What is your Name? ") *> getStrLn.flatMap(name => putStrLn(s"Hello $name")) val askNa
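
For reference, the equivalent for comprehension (assuming ZIO 1.x's zio.console._), which the compiler desugars into the same flatMap chain:

    import java.io.IOException
    import zio.ZIO
    import zio.console._

    val askNameFor: ZIO[Console, IOException, Unit] =
      for {
        _    <- putStrLn("What is your Name? ")
        name <- getStrLn
        _    <- putStrLn(s"Hello $name")
      } yield ()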

Apache Flink + CEP - Detect same events

I'd like to detect events that share the same property. Suppose I have a simple case class: case class Record(name: String, value: Int) Suppose there is the
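
One sketch, using the Record class from the question: key the stream by the shared property before applying the pattern, so CEP only ever compares events with the same name (the pattern names "first"/"second" are arbitrary):

    import org.apache.flink.cep.scala.CEP
    import org.apache.flink.cep.scala.pattern.Pattern
    import org.apache.flink.streaming.api.scala._

    // Within a keyed stream, two consecutive events necessarily share the name.
    val pattern = Pattern.begin[Record]("first").next("second")

    def detect(records: DataStream[Record]) =
      CEP.pattern(records.keyBy(_.name), pattern)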

Sum column values based on a condition using Spark Scala

I have a dataframe like this: JoiKey period Age Amount Jk1 2022-02 2 200 Jk1 2022-02 3 450 Jk2 2022-03 5 500 Jk3 2022-03 0 200 Jk2 2022-02 8 300 Jk3 2022-03 9
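
A sketch of a conditional aggregation, with the condition (Age >= 2 here) purely as a placeholder:

    import org.apache.spark.sql.functions.{col, sum, when}

    // Only rows satisfying the condition contribute to the sum.
    val result = df
      .groupBy("JoiKey", "period")
      .agg(sum(when(col("Age") >= 2, col("Amount")).otherwise(0)).as("total"))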

DAG of Spark Sort application spanning two jobs

I've written a very simple Sort scala program with Spark. object Sort { def main(args: Array[String]): Unit = { if (args.length < 2) {
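
The two jobs are expected: sortByKey builds a RangePartitioner, which eagerly runs a sampling job to pick partition boundaries before the actual shuffle-and-sort job. A minimal sketch (sc and the paths are assumed from the surrounding program; keying by the first token is an assumption):

    val sorted = sc.textFile(inputPath)
      .map(line => (line.split("\\s+")(0), line))
      .sortByKey()                       // job 1: sampling for the RangePartitioner
    sorted.saveAsTextFile(outputPath)    // job 2: the shuffle and sort itself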

How to convert list of integers to a map with frequency per bin?

Let's say I have a list of numbers: val numbers = List(15, 30, 110, 140, 170, 210) How can I count the number of integers per bin of 100 in order to get: M
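
A sketch using groupBy with integer division to compute the bin floor:

    val numbers = List(15, 30, 110, 140, 170, 210)

    // (n / 100) * 100 floors each number to its 100-wide bin.
    val binned: Map[Int, Int] =
      numbers.groupBy(n => (n / 100) * 100).map { case (bin, xs) => bin -> xs.size }

    // binned == Map(0 -> 2, 100 -> 3, 200 -> 1)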

How to NOT send an assured message via solace JMS

I have a very simple producer-type program that tries to send a BytesMessage to a topic. My program is receiving the error com.solacesystems.jms.ConfigurationEx
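
A sketch of the usual fix: mark the producer NON_PERSISTENT, which Solace maps to direct (non-assured) delivery, so no guaranteed-messaging support is required (the session/topic wiring is assumed to exist elsewhere):

    import javax.jms.{BytesMessage, DeliveryMode, Session, Topic}

    def sendDirect(session: Session, topic: Topic, payload: Array[Byte]): Unit = {
      val producer = session.createProducer(topic)
      // NON_PERSISTENT => direct transport, no assured delivery attempted.
      producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT)
      val msg: BytesMessage = session.createBytesMessage()
      msg.writeBytes(payload)
      producer.send(msg)
    }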

Extracting Structure Failed when importing an sbt project

I'm trying to set up Scala in the IntelliJ IDE, and when I create a new project it seems fine. When I import another project, it errors: Extracting Structure Failed. T
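
One common culprit (an assumption, since the error hides the cause) is a missing or unsupported sbt version in the imported project:

    # project/build.properties -- pin an sbt version IntelliJ's importer supports
    sbt.version=1.2.8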

How to make $filter operator inside a projection of an aggregation using the MongoDB Scala Driver

I need to filter array of embedded documents inside a projection of an aggregation operation. I know there is this new $filter operator, but I do not know how t
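
A sketch passing the $filter stage as a raw Document (the items array and the qty condition are assumptions about the schema):

    import org.mongodb.scala.{Document, MongoCollection}
    import org.mongodb.scala.model.Aggregates

    // Keep only the embedded documents whose qty is at least 100.
    val projectFilter = Aggregates.project(Document(
      """{ items: { $filter: {
           input: "$items",
           as:    "item",
           cond:  { $gte: [ "$$item.qty", 100 ] }
         } } }"""
    ))

    def run(coll: MongoCollection[Document]) = coll.aggregate(Seq(projectFilter))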

Static methods in interface require -target:jvm-1.8

I'm building a Scala project using Gradle 4.5 and Scala 2.11.11/2.12.4 with JDK 1.8.0_162, and it was working fine until I upgraded to Scala 2.11.12. With 2.11.12 I ke
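
The fix is to pass -target:jvm-1.8 to scalac; in Gradle that goes into scalaCompileOptions.additionalParameters on the ScalaCompile task. The sbt equivalent, for comparison:

    // build.sbt
    scalacOptions += "-target:jvm-1.8"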

Scala Single Abstract Method & IntelliJ

I'm just getting used to Scala's Single Abstract Methods, which I did not know well until I went through several blogs on type classes. I am able, after a few minutes, to
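
For reference, a minimal SAM example (Scala 2.12+ converts a lambda to any trait with a single abstract method):

    trait Greeter {
      def greet(name: String): String
    }

    // The lambda is converted to an anonymous Greeter instance.
    val greeter: Greeter = name => s"Hello, $name"
    println(greeter.greet("world"))  // Hello, world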

java.io.FileNotFoundException: YAML file does not exist

When I submit the Spark job from the terminal, I get the below error that the file does not exist, although I have already placed the config file on my local machine. spa
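
A sketch of the usual pattern: ship the file with --files and resolve it through SparkFiles (the file name config.yaml is an assumption):

    import org.apache.spark.SparkFiles

    // Submitted with: spark-submit --files /local/path/config.yaml ...
    val configPath = SparkFiles.get("config.yaml")
    val contents = scala.io.Source.fromFile(configPath).mkString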

Apache Spark - Is it possible to use a Dependency Injection Mechanism

Is there any possibility of using a framework for enabling Dependency Injection in a Spark application? Is it possible to use Guice, for instance? If so,
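
A minimal Guice sketch for the driver side; note that anything captured by executor-side closures must be Serializable or re-created inside the task (the Formatter binding is purely illustrative):

    import com.google.inject.{AbstractModule, Guice}

    trait Formatter { def format(s: String): String }
    class UpperFormatter extends Formatter { def format(s: String) = s.toUpperCase }

    class AppModule extends AbstractModule {
      override def configure(): Unit =
        bind(classOf[Formatter]).to(classOf[UpperFormatter])
    }

    val injector  = Guice.createInjector(new AppModule)
    val formatter = injector.getInstance(classOf[Formatter])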

Write single CSV file using spark-csv

I am using https://github.com/databricks/spark-csv and trying to write a single CSV, but I am not able to; it creates a folder instead. I need a Scala function which wil
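
A sketch: coalesce to one partition so Spark writes a single part file; the output path is still a folder, and the part file inside must then be renamed (for example via Hadoop's FileSystem API):

    df.coalesce(1)
      .write
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .save("/tmp/single-csv")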

Converting JSON string to a JSON object in Scala

I want to convert a simple JSON string such as {"Name":"abc", "age":10} to the corresponding JSON object (not a custom Scala object such as "Person"). Does Scal
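
A sketch with circe (the choice of library is an assumption; Play JSON and others work similarly):

    import io.circe.parser.parse

    val json = parse("""{"Name":"abc", "age":10}""")     // Right(JSON object)
    val age  = json.flatMap(_.hcursor.get[Int]("age"))   // Right(10)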