'How can I skip the first line of a csv in Java?

I want to skip the first line and use the second as header.

I am using classes from apache commons csv to process a CSV file.

The header of the CSV file is in the second row, not the first (which contains coordinates).

My code looks like this:

static void processFile(final File file) {
    FileReader filereader = new FileReader(file);
    final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
    CSVParser parser = new CSVParser(filereader, format);
    final List<CSVRecord> records = parser.getRecords();
    //stuff
}

I naively thought,

CSVFormat format = CSVFormat.DEFAULT.withFirstRecordAsHeader().withDelimiter(;)

would solve the problem, as it's different from withFirstRowAsHeader and I thought it would detect that the first row doesn't contain any semicolons and is not a record. It doesn't. I tried to skip the first line (that CSVFormat seems to think is the header) with

CSVFormat format = CSVFormat.DEFAULT.withSkipHeaderRecord().withFirstRecordAsHeader().withDelimiter(;);

but that also doesn't work. What can I do? What's the difference between withFirstRowAsHeader and withFirstRecordAsHeader?



Solution 1:[1]

You may want to read the first line, before passing the reader to the CSVParser :

static void processFile(final File file) {
    FileReader filereader = new FileReader(file);
    BufferedReader bufferedReader = new BufferedReader(filereader);
    bufferedReader.readLine();// try-catch omitted
    final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
    CSVParser parser = new CSVParser(bufferedReader, format);
    final List<CSVRecord> records = parser.getRecords();
    //stuff
}

Solution 2:[2]

The correct way to skip the first line if it is a header is by using a different CSVFormat

CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';').withFirstRecordAsHeader();

Solution 3:[3]

In version 1.9.0 of org.apache.commons:commons-csv use:

val format = CSVFormat.Builder.create(CSVFormat.DEFAULT)
        .setHeader()
        .setSkipHeaderRecord(true)
        .build()

val parser = CSVParser.parse(reader, format)

Solution 4:[4]

You can skip the first record using stream:

List<CSVRecord> noHeadersLine = records.stream.skip(1).collect(toList());

Solution 5:[5]

You can filter it using Java Streams:

parser.getRecords().stream()
     .filter(record -> record.getRecordNumber() != 1) 
     .collect(Collectors.toList());

Solution 6:[6]

I am assuming your file format looks something like:

<garbage line here>
<header data>
<record data starts here>

For version 1.9.0, use, as given above, but with one addition:

Reader in = new FileReader(fileName);
BufferedReader bufferedReader = new BufferedReader(in);
System.out.println(bufferedReader.readLine());
CSVFormat format = CSVFormat.Builder.create(CSVFormat.DEFAULT)
            .setHeader()
            .setSkipHeaderRecord(true)
            .build();
CSVParser parser = CSVParser.parse(bufferedReader, format);
for (CSVRecord record : parser.getRecords()) {
    <do something>
}

If you don't skip that first line somehow, you will throw an IllegalArgumentException.

Solution 7:[7]

You could consume the first line and then pass it to the CSVParser. Other than that there is a method #withIgnoreEmptyLines which might solve the issue.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Arnaud
Solution 2 Sully
Solution 3 Markus Lenger
Solution 4 Frank Why
Solution 5
Solution 6 Matt Minton
Solution 7