Spark DataFrame cache Java implementation
I am building a data pipeline using Apache Spark in Java and have a question about caching a DataFrame.
An API call returns data as a DataFrame, and the idea is to cache it in the Spark session so it can be reused.
The code snippet for it is:
```java
updatesDF = updatesDF.join(
        fxRate,
        JavaConverters.asScalaIteratorConverter(
                List.of("account_id", "date").iterator()).asScala().toSeq());
```
I could not find anything on how to cache this DataFrame for reuse within a Spark session in a Java application.
Thanks
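In case it helps to sketch what caching could look like here: `Dataset` exposes `cache()` and `persist(StorageLevel)` for exactly this purpose. Below is a minimal sketch, assuming `updatesDF` and `fxRate` are the `Dataset<Row>` instances from the question; the class and method names are hypothetical. Note that caching is lazy: the data is only materialized when the first action (e.g. `count()`) runs on the DataFrame.

```java
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.storage.StorageLevel;

import scala.collection.JavaConverters;

// Hypothetical helper illustrating how the joined DataFrame could be cached.
public class CachedJoinExample {

    public static Dataset<Row> joinAndCache(Dataset<Row> updatesDF, Dataset<Row> fxRate) {
        // Same join as in the question, on the two key columns.
        Dataset<Row> joined = updatesDF.join(
                fxRate,
                JavaConverters.asScalaIteratorConverter(
                        List.of("account_id", "date").iterator()).asScala().toSeq());

        // Mark the result for caching. For Datasets, cache() is equivalent to
        // persist(StorageLevel.MEMORY_AND_DISK()); persist() lets you choose
        // the storage level explicitly.
        joined = joined.persist(StorageLevel.MEMORY_AND_DISK());

        // Optional: trigger an action so the cache is populated eagerly
        // instead of on first use.
        joined.count();

        return joined;
    }
}
```

When the DataFrame is no longer needed, `joined.unpersist()` frees the cached blocks; otherwise the cache lives as long as the SparkSession (subject to eviction under memory pressure).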
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
