Get the last 5 lines of a 1000-line CSV, RDD Spark Java
I have a .csv file with 1000 lines of data in it, and I'm trying to write code that shows only the last 5 lines.
```java
private SparkSession spark;
private JavaSparkContext sc;
private JavaRDD<String> lines;
private JavaRDD<PurchaseOrder> orders;

public OrderProcessingRDDSparkApp(String... args) throws IOException {
    spark = SparkSession.builder()
            .appName("OrderProcessingSparkApp")
            .config("spark.master", "local[1]")
            .getOrCreate();
    sc = new JavaSparkContext(spark.sparkContext());
    sc.setLogLevel("ERROR");
    lines = sc.textFile(args[0]);
    orders = lines.map(line -> new PurchaseOrder(line));
}
```
What can I try to resolve this?
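One approach (a sketch, not from the original post) is to pair each line with its position using `JavaRDD.zipWithIndex()` and keep only the indices in the tail. This assumes the RDD preserves the file's line order, which `textFile` does for a single local file read with `local[1]`; the class name `LastFiveLines` is hypothetical:

```java
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public class LastFiveLines {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LastFiveLines")
                .config("spark.master", "local[1]")
                .getOrCreate();
        JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
        sc.setLogLevel("ERROR");

        // Same starting point as the question: one line per CSV row.
        JavaRDD<String> lines = sc.textFile(args[0]);

        long total = lines.count();            // 1000 in the question
        List<String> lastFive = lines
                .zipWithIndex()                // JavaPairRDD<String, Long>: (line, index)
                .filter(t -> t._2() >= total - 5)
                .keys()                        // drop the indices, back to JavaRDD<String>
                .collect();

        lastFive.forEach(System.out::println);
        spark.stop();
    }
}
```

Note that `zipWithIndex` triggers an extra Spark job to compute per-partition offsets, and `count()` is a second full pass, so this is fine for a 1000-line file but not free on large data. The same filter works on `orders` if you want the last 5 `PurchaseOrder` objects instead of raw lines.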
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
