Ignore Specific Columns When Parsing a CSV File with Jackson CSV
My problem is that I need to parse CSV files with arbitrary columns in arbitrary order into a known domain POJO (say, Person). I can identify which columns I need to process and want to ignore the rest.
The option CsvParser.Feature.IGNORE_TRAILING_UNMAPPABLE seemed to be exactly what I need, but the columns that I need to process are not necessarily grouped at the start of the CSV file, and I cannot force users to reorder the columns of their uploaded CSV files. Also, sometimes I do not get a header row; in that case the UI forces the user to identify the columns and passes this information along.
For example, I have the following CSV file:
First Name,Last Name,Nickname,DOB,Occupation,Postal Code
Freddy,Benson,Ruprecht,08/14/45,Con Artist,76701
Lawrence,Jamieson,Prince,03/14/33,Con Artist,5201
Janet,Colgate,Jackal,03/13/55,Con Artist,90401
I only need 4 of the 6 columns (First Name, Last Name, DOB, Postal Code), as my Person POJO only includes those fields:
public class Person {
private String firstName;
private String lastName;
private LocalDate dob;
private String postalCode;
}
I have defined a CsvSchema typed for Person and specified the columns in file order (First Name, Last Name, IGNORE, DOB, IGNORE2, Postal Code), since I would like to skip two columns (Nickname, Occupation). However, the "IGNORE" columns are dropped during mapping in the deserializer, so I end up getting "Nickname" values for "DOB", resulting in invalid values for the DOB field.
Solution 1:[1]
My mistake was defining the schema as follows, which apparently strongly couples the schema to the domain POJO:
CsvSchema schema = mapper
.typedSchemaFor(Person.class)
.withSkipFirstDataRow(hasHeader)
.sortedBy(columnOrder.toArray(new String[columnOrder.size()]));
Resolved by defining the schema and its columns as follows, which seems to couple the schema to the domain POJO only loosely:
CsvSchema schema = CsvSchema.builder()
.addColumn("firstName")
.addColumn("lastName")
.addColumn("ignore1")
.addColumn("dob")
.addColumn("ignore2")
.addColumn("postalCode")
.build();
CsvMapper mapper = new CsvMapper();
MappingIterator<Person> personIter = mapper
.readerFor(Person.class)
.with(schema)
.readValues(csvFile);
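When there is no header row and the UI supplies the column selection, the list of schema column names has to be derived rather than hard-coded. The following is a minimal sketch of that step using only the standard library. The class ColumnPlan, its method, and the "ignoreN" placeholder naming are hypothetical and are not part of Jackson; the sketch assumes the UI reports a zero-based column index for each POJO property.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Jackson: turns the UI's column selection
// into one schema column name per CSV position.
final class ColumnPlan {
    private ColumnPlan() {}

    /**
     * @param totalColumns number of columns in each CSV row
     * @param wanted       zero-based column index -> POJO property name
     * @return one name per column; unmapped positions get "ignore1", "ignore2", ...
     */
    static List<String> schemaColumnNames(int totalColumns, Map<Integer, String> wanted) {
        List<String> names = new ArrayList<>(totalColumns);
        int ignored = 0;
        for (int i = 0; i < totalColumns; i++) {
            String property = wanted.get(i);
            // Keep the position occupied so later columns do not shift left.
            names.add(property != null ? property : "ignore" + (++ignored));
        }
        return names;
    }
}
```

For the sample file, selecting indexes 0, 1, 3 and 5 out of 6 columns yields firstName, lastName, ignore1, dob, ignore2, postalCode, and each name can then be passed to addColumn on the CsvSchema builder. Depending on the mapper configuration, the placeholder names may need to be tolerated as unknown properties, for example with @JsonIgnoreProperties(ignoreUnknown = true) on Person.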
Solution 2:[2]
Ignoring unknown columns can be achieved as shown below (tested with Jackson 2.13):
- Annotate POJO
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import java.time.LocalDate;
@JsonIgnoreProperties(ignoreUnknown = true) // the trick: ignore every column not declared below
public class Person {
@JsonProperty("First Name") // map each needed header name to its field
private String firstName;
@JsonProperty("Last Name")
private String lastName;
@JsonProperty("DOB")
private LocalDate dob;
@JsonProperty("Postal Code")
private String postalCode;
}
- Configure csvSchema
CsvMapper csvMapper = new CsvMapper();
CsvSchema csvSchema = csvMapper
.schemaFor(Person.class)
.withHeader() // the header row defines the column order
.withColumnReordering(true); // allows columns in any order, as long as there is a header
- Finally, use it to read the CSV file
MappingIterator<Person> people = csvMapper.readerFor(Person.class).with(csvSchema).readValues(csvFile);
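Both solutions map the DOB column to a LocalDate, and the sample values use two-digit years (08/14/45). As a side note, here is a minimal standard-library sketch of parsing such values. It assumes the MM/dd/yy layout seen in the sample file and that every birth year falls in 1900-1999; neither assumption is stated in the original. The class name DobParser is hypothetical. A plain "MM/dd/yy" pattern would resolve 45 to 2045, so the sketch uses appendValueReduced with a base value of 1900 instead.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical helper, not part of Jackson or the original answers.
final class DobParser {
    // Assumption: dates look like 08/14/45 and all years fall in 1900-1999.
    private static final DateTimeFormatter DOB_FORMAT = new DateTimeFormatterBuilder()
            .appendPattern("MM/dd/")
            .appendValueReduced(ChronoField.YEAR, 2, 2, 1900) // "45" -> 1945
            .toFormatter();

    private DobParser() {}

    // Parses a two-digit-year date string into a LocalDate in the 1900s.
    static LocalDate parse(String text) {
        return LocalDate.parse(text.trim(), DOB_FORMAT);
    }
}
```

Wiring such a parser into the mapper (for instance through a custom deserializer for the dob field) is left out here, since the right approach depends on the modules and annotations already in use.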
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | tkofford |
| Solution 2 | Sateesh |
