C or macOS Parquet-to-ASCII streamer? [closed]

I am looking for a space-efficient (and ideally fast) way to stream terabytes of Parquet data, record by record, into my C programs on a Mac. I do not want to read full tables into memory: I do not have terabytes of RAM. (Besides, Mac Studios are not only hard to purchase, they also don't have that much RAM.)

My first stop was, of course, brew:

$ brew install parquet-tools
...installs ok...
$ parquet-tools cat *parquet | head
org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] no native library is found for os.name=Mac and os.arch=aarch64

I do know the length of my output records (peeked with R). They are very simple, short lines (almost like a Linux syslog) and mostly ASCII text (not ints etc.).

All I want to do in C is

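// parquet_open() and parquet_read() are wished-for functions, not an existing API
for (int fi=0; fi<FI; ++fi) {
    FILE *f= parquet_open( filename[fi] );
    char buffer[256];
    // loop on the read result; feof() only turns true after a read has failed
    while (parquet_read( buffer, 256, f ) > 0) {
        // do stuff with buffer
        // fields could be \0 or otherwise separated.
    }
    fclose(f);
}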

Could someone please recommend a good solution here: presumably either a fast CLI output streamer that I can pipe into C, or a lightweight (non-deep) native C library I could use?
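For the CLI route, the C side of the pipe could look roughly like the sketch below. It assumes that some converter (not named here, since which tool works on an arm64 Mac is the open question) writes one record per line to its standard output. The names `stream_records`, `record_fn` and `sum_lengths` are made up for illustration; only the standard C I/O calls are real.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RECORD_MAX 256  /* assumed upper bound on one record, newline included */

/* Hypothetical callback type: called once per record, newline stripped. */
typedef void (*record_fn)(const char *record, size_t len, void *ctx);

/*
 * Read newline-terminated text records from `in` one at a time and hand
 * each one to `fn`. Only a single record is held in memory at any moment.
 * Returns the number of records seen, or -1 on a read error.
 * Assumption: each record fits in RECORD_MAX - 1 bytes; a longer line
 * would be delivered in several pieces.
 */
long stream_records(FILE *in, record_fn fn, void *ctx)
{
    char buffer[RECORD_MAX];
    long count = 0;

    assert(in != NULL);

    /* Loop on the result of fgets() rather than on feof(): feof() only
       becomes true after a read has already failed. */
    while (fgets(buffer, sizeof buffer, in) != NULL) {
        size_t len = strcspn(buffer, "\n");  /* length without the newline */
        buffer[len] = '\0';
        if (fn != NULL)
            fn(buffer, len, ctx);
        ++count;
    }
    return ferror(in) ? -1 : count;
}

/* Example callback (hypothetical): add up the record lengths in *ctx. */
void sum_lengths(const char *record, size_t len, void *ctx)
{
    (void)record;
    *(size_t *)ctx += len;
}
```

Calling `stream_records(stdin, sum_lengths, &total)` from `main` would let the program sit at the end of a shell pipe. Because the function takes any `FILE *`, the same loop would also work on a stream returned by POSIX `popen()`, which is close to the one-`FILE *`-per-input-file shape I have in mind.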



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
