How to write an Arrow table to a Parquet file

I read a Parquet file into an Arrow table, so I can parse the file and get at the data. But when I write the table back out to a local Parquet file, the new file is much larger than the original. In theory the sizes should be about the same. When I compared the two binary files, I found the new one differs from the original: it is larger because it contains many duplicated column names. How can I compress it?

#include <arrow/io/api.h>
#include <parquet/arrow/reader.h>
#include <parquet/arrow/writer.h>
#include <parquet/exception.h>

// Read a Parquet file into an Arrow table, then write it back out.
void read_write(std::string file_read, std::string file_write)
{
    std::shared_ptr<arrow::io::ReadableFile> infile;
    PARQUET_ASSIGN_OR_THROW(infile,
                            arrow::io::ReadableFile::Open(file_read,
                                                          arrow::default_memory_pool()));

    std::unique_ptr<parquet::arrow::FileReader> reader;
    PARQUET_THROW_NOT_OK(
        parquet::arrow::OpenFile(infile, arrow::default_memory_pool(), &reader));

    std::shared_ptr<arrow::Table> table;
    PARQUET_THROW_NOT_OK(reader->ReadTable(&table));

    std::shared_ptr<arrow::io::FileOutputStream> outfile;
    PARQUET_ASSIGN_OR_THROW(
        outfile, arrow::io::FileOutputStream::Open(file_write));

    // chunk_size = 3: a new row group is started every 3 rows.
    PARQUET_THROW_NOT_OK(
        parquet::arrow::WriteTable(*table, arrow::default_memory_pool(), outfile, 3));
}
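The size blow-up comes from the fourth argument to `parquet::arrow::WriteTable`: it is the chunk size, i.e. the maximum number of rows per row group. A value of 3 produces one row group for every 3 rows, and each row group carries its own column metadata in the file footer, so the column names are repeated thousands of times. Writing one large row group and enabling page compression via `parquet::WriterProperties` shrinks the file. The sketch below is a variant of the function above with those two changes; `read_write_compressed` is an illustrative name, and Snappy is just one of the codecs the builder accepts:

```cpp
#include <arrow/io/api.h>
#include <parquet/arrow/reader.h>
#include <parquet/arrow/writer.h>
#include <parquet/exception.h>

void read_write_compressed(std::string file_read, std::string file_write)
{
    std::shared_ptr<arrow::io::ReadableFile> infile;
    PARQUET_ASSIGN_OR_THROW(infile,
                            arrow::io::ReadableFile::Open(file_read,
                                                          arrow::default_memory_pool()));

    std::unique_ptr<parquet::arrow::FileReader> reader;
    PARQUET_THROW_NOT_OK(
        parquet::arrow::OpenFile(infile, arrow::default_memory_pool(), &reader));

    std::shared_ptr<arrow::Table> table;
    PARQUET_THROW_NOT_OK(reader->ReadTable(&table));

    std::shared_ptr<arrow::io::FileOutputStream> outfile;
    PARQUET_ASSIGN_OR_THROW(
        outfile, arrow::io::FileOutputStream::Open(file_write));

    // Enable Snappy compression for the data pages.
    std::shared_ptr<parquet::WriterProperties> props =
        parquet::WriterProperties::Builder()
            .compression(parquet::Compression::SNAPPY)
            ->build();

    // Use the whole table as a single row group so the column
    // metadata in the footer is written only once.
    PARQUET_THROW_NOT_OK(
        parquet::arrow::WriteTable(*table, arrow::default_memory_pool(), outfile,
                                   table->num_rows(), props));
}
```

In practice you may still want to cap the row-group size (e.g. around 64K-1M rows) rather than use `table->num_rows()` for very large tables, since readers load a row group's column chunks as a unit.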


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
