Custom sort 2 csv files with 2 columns and print differences in Java

I have 2 csv files as below:

csv1:

101,101-1,400.01,500.01
101,101-2,400.02,500.01
102,102-1,600.01,700.01
102,102-2,600.02,700.02

csv2:

101,101-1,400.02,500.01
101,101-2,400.02,500.01
102,102-1,600.01,700.02
102,102-2,600.02,700.07

I want to store the data in a Java collection in such a way that I can compare columns c and d (the third and fourth columns) across both CSV files and print the differences for each id.

desired output:

difference of column c of id 101 and sub id 101-1 is : 0.01
difference of column d of id 101 and sub id 101-1 is : 0.00

difference of column c of id 102 and sub id 102-1 is : 0.00
difference of column d of id 102 and sub id 102-1 is : 0.01

and so on.

I have tried using Map<Integer,List<Map<String,List>>>, but it is getting too complex and time-consuming. Please suggest a better way to get the result using Java.



Solution 1:[1]

OOP can help us here.

We can define a class for each row (the annotations come from Lombok):

@AllArgsConstructor
@Getter
class Row {
    private Id id;
    private double columnC;
    private double columnD;
}

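And a key class that identifies a row and defines the sort order (by id, then by the number after the '-' in the sub id):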

@AllArgsConstructor
@Getter
@EqualsAndHashCode
class Id implements Comparable<Id> {
    private int id;
    private String subId;

    @Override
    public int compareTo(Id o) {
        int compare = Integer.compare(this.id, o.id);
        return compare != 0 ? compare : Integer.compare(Integer.parseInt(this.subId.substring(this.subId.indexOf('-')+1)),
                Integer.parseInt(o.subId.substring(o.subId.indexOf('-')+1)));
    }
}
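
If Lombok is not available, the same key can be written by hand. The following is only a sketch: the class name PlainId and the helper subIdNumber are made up for illustration and are not part of the answer above. It keeps the same ordering (numeric id first, then the number after the '-' in the sub id) and supplies equals and hashCode so the class can be used as a map key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

// Hypothetical Lombok-free counterpart of the Id class above.
final class PlainId implements Comparable<PlainId> {
    // Order by id, then by the numeric suffix of the sub id.
    private static final Comparator<PlainId> ORDER =
            Comparator.comparingInt(PlainId::getId).thenComparingInt(PlainId::subIdNumber);

    private final int id;
    private final String subId;

    PlainId(int id, String subId) {
        this.id = id;
        this.subId = Objects.requireNonNull(subId);
    }

    public int getId() { return id; }

    public String getSubId() { return subId; }

    // Number after the first '-', e.g. "101-2" gives 2 (assumes that format).
    public int subIdNumber() {
        return Integer.parseInt(subId.substring(subId.indexOf('-') + 1));
    }

    @Override
    public int compareTo(PlainId o) {
        return ORDER.compare(this, o);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainId)) return false;
        PlainId other = (PlainId) o;
        return id == other.id && subId.equals(other.subId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, subId);
    }

    public static void main(String[] args) {
        List<PlainId> ids = new ArrayList<>();
        ids.add(new PlainId(102, "102-1"));
        ids.add(new PlainId(101, "101-10"));
        ids.add(new PlainId(101, "101-2"));
        Collections.sort(ids);
        // Prints 101-2, 101-10, 102-1: the suffix is compared as a number, not as text.
        for (PlainId p : ids) {
            System.out.println(p.getSubId());
        }
    }
}
```

Comparing the suffix as a number matters once a sub id reaches two digits: as plain strings, "101-10" would sort before "101-2".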

Then using Apache Commons CSV we can do:

import java.io.FileReader;
import java.io.IOException;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

// getRows throws IOException, so main has to declare it as well
public static void main(String[] args) throws IOException {
    Map<Id, Row> rows1 = getRows("C:\\Users\\Saar\\IdeaProjects\\so2\\src\\main\\resources\\books.csv");
    Map<Id, Row> rows2 = getRows("C:\\Users\\Saar\\IdeaProjects\\so2\\src\\main\\resources\\books2.csv");

    rows1.entrySet().stream()
            // skip rows that have no counterpart in the second file (avoids a NullPointerException)
            .filter(e1 -> rows2.containsKey(e1.getKey()))
            .sorted(Map.Entry.comparingByKey())
            .forEach(e1 -> {
                Id key = e1.getKey();
                Row other = rows2.get(key);
                double c = Math.abs(e1.getValue().getColumnC() - other.getColumnC());
                double d = Math.abs(e1.getValue().getColumnD() - other.getColumnD());
                System.out.println("difference of column c of id " + key.getId() + " and sub id " + key.getSubId() + " is : " + c);
                System.out.println("difference of column d of id " + key.getId() + " and sub id " + key.getSubId() + " is : " + d);
            });
}

private static Map<Id, Row> getRows(String fileName) throws IOException {
    // try-with-resources closes the parser once all records have been collected
    try (CSVParser parser = CSVFormat.DEFAULT.parse(new FileReader(fileName))) {
        return StreamSupport.stream(parser.spliterator(), false)
                .map(r -> {
                    Id id = new Id(Integer.parseInt(r.get(0)), r.get(1));
                    return new Row(id, Double.parseDouble(r.get(2)), Double.parseDouble(r.get(3)));
                })
                .collect(Collectors.toMap(Row::getId, row -> row));
    }
}

Output (note that the values are parsed as double, so binary floating-point arithmetic produces slightly inexact differences):

difference of column c of id 101 and sub id 101-1 is : 0.009999999999990905
difference of column d of id 101 and sub id 101-1 is : 0.0
difference of column c of id 101 and sub id 101-2 is : 0.0
difference of column d of id 101 and sub id 101-2 is : 0.0
difference of column c of id 102 and sub id 102-1 is : 0.0
difference of column d of id 102 and sub id 102-1 is : 0.009999999999990905
difference of column c of id 102 and sub id 102-2 is : 0.0
difference of column d of id 102 and sub id 102-2 is : 0.05000000000006821
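
One way to get exact decimal differences such as 0.01 and 0.00 is to parse the values as BigDecimal instead of double. The sketch below is a simplified illustration, not the code from the answer above: the class name DecimalDiff is made up, the CSV lines are passed in as strings and split on commas rather than read with Apache Commons CSV, and the rows are keyed by a plain "id,subId" string, which sorts correctly for the sample data only.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: exact differences of columns c and d using BigDecimal.
class DecimalDiff {
    // Parses "id,subId,c,d" lines into a sorted map keyed by "id,subId".
    // String ordering is an assumption that holds for the sample data only.
    static Map<String, BigDecimal[]> parse(List<String> lines) {
        Map<String, BigDecimal[]> rows = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.trim().split(",");
            rows.put(f[0] + "," + f[1],
                    new BigDecimal[] { new BigDecimal(f[2]), new BigDecimal(f[3]) });
        }
        return rows;
    }

    // Builds one message per column for every row present in both inputs.
    static List<String> diff(List<String> csv1, List<String> csv2) {
        Map<String, BigDecimal[]> rows1 = parse(csv1);
        Map<String, BigDecimal[]> rows2 = parse(csv2);
        String[] names = { "c", "d" };
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, BigDecimal[]> e : rows1.entrySet()) {
            BigDecimal[] other = rows2.get(e.getKey());
            if (other == null) {
                continue; // no matching row in the second input
            }
            String[] key = e.getKey().split(",");
            for (int i = 0; i < names.length; i++) {
                // Decimal subtraction is exact, so 400.01 - 400.02 gives -0.01.
                BigDecimal d = e.getValue()[i].subtract(other[i]).abs();
                out.add("difference of column " + names[i] + " of id " + key[0]
                        + " and sub id " + key[1] + " is : " + d.toPlainString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> csv1 = Arrays.asList(
                "101,101-1,400.01,500.01", "101,101-2,400.02,500.01",
                "102,102-1,600.01,700.01", "102,102-2,600.02,700.02");
        List<String> csv2 = Arrays.asList(
                "101,101-1,400.02,500.01", "101,101-2,400.02,500.01",
                "102,102-1,600.01,700.02", "102,102-2,600.02,700.07");
        for (String line : diff(csv1, csv2)) {
            System.out.println(line);
        }
    }
}
```

With the sample data from the question, the first two lines printed end in 0.01 and 0.00, and the last one ends in 0.05. The same idea carries over to the Row class by changing its two double fields to BigDecimal.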

Solution 2:[2]

To complete this task, we need to sort each CSV file by the same two columns in the same order, calculate the difference between corresponding values in the other columns, and output the absolute value. This takes a fair amount of code in plain Java. I suggest using SPL instead. It is an open-source Java package, and the whole task needs only a few lines of code, as shown below:

A
1 =file("1.csv").import@wc().sort(~(1),~(2))
2 =file("2.csv").import@wc().sort(~(1),~(2))
3 =A1.("difference of column c of id"/~(1)/"and sub id"/~(2)/"is :"/abs(~(3)-A2(#)(3))/"\ndifference of column d of id"/~(1)/"and sub id"/~(2)/"is :"/abs(~(4)-A2(#)(4))).export()

SPL provides a JDBC driver, so the script can be invoked from Java. Save the SPL script above as diff.splx and call it from Java the same way you would call a stored procedure:

…
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
st=con.prepareCall("call diff()");
st.execute();
…

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 JinJune