Compression without dictionary

I have been testing various compression algorithms with Parquet files and have settled on Zstd.

Now, as far as I understand, Zstd uses an adaptive dictionary unless one is explicitly specified, so it starts with an empty one. However, with the dictionary enabled, both the compressed size and the execution time are quite unsatisfactory.

The files written without a dictionary are considerably smaller than those written with the adaptive one (the number at the end of each name is the compression level):

Name: C:\ParquetFiles\Zstd1 Execution time: 279 ms Size: 13738134
Name: C:\ParquetFiles\Zstd2 Execution time: 140 ms Size: 13207017
Name: C:\ParquetFiles\Zstd9 Execution time: 511 ms Size: 12701030

And for comparison, the log from using the adaptive dictionary:

Name: C:\ParquetFiles\ZstdDictZstd1 Execution time: 487 ms Size: 19462825
Name: C:\ParquetFiles\ZstdDictZstd2 Execution time: 402 ms Size: 19292513
Name: C:\ParquetFiles\ZstdDictZstd9 Execution time: 614 ms Size: 19072779
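To make the gap between the two logs easier to read, here is a minimal sketch in Python that parses the log lines above and computes, per compression level, how much larger and slower the dictionary-enabled runs are. The helper names (`parse_log`, `compare`) are my own for illustration and are not part of any Parquet or Zstd library; the only data used is what appears in the two logs.

```python
import re

# The two logs, copied verbatim from above (raw strings keep the backslashes).
NO_DICT_LOG = r"""
Name: C:\ParquetFiles\Zstd1 Execution time: 279 ms Size: 13738134
Name: C:\ParquetFiles\Zstd2 Execution time: 140 ms Size: 13207017
Name: C:\ParquetFiles\Zstd9 Execution time: 511 ms Size: 12701030
"""

DICT_LOG = r"""
Name: C:\ParquetFiles\ZstdDictZstd1 Execution time: 487 ms Size: 19462825
Name: C:\ParquetFiles\ZstdDictZstd2 Execution time: 402 ms Size: 19292513
Name: C:\ParquetFiles\ZstdDictZstd9 Execution time: 614 ms Size: 19072779
"""

LINE = re.compile(
    r"Name: (?P<name>\S+) Execution time: (?P<ms>\d+) ms Size: (?P<size>\d+)"
)


def parse_log(text):
    """Return {compression_level: (milliseconds, size_in_bytes)}."""
    runs = {}
    for m in LINE.finditer(text):
        # The compression level is the trailing number of the file name.
        level = int(re.search(r"(\d+)$", m["name"]).group(1))
        runs[level] = (int(m["ms"]), int(m["size"]))
    return runs


def compare(no_dict, with_dict):
    """Return {level: (size_ratio, time_ratio)}, dictionary run over plain run."""
    rows = {}
    for level in sorted(no_dict):
        ms_plain, size_plain = no_dict[level]
        ms_dict, size_dict = with_dict[level]
        rows[level] = (size_dict / size_plain, ms_dict / ms_plain)
    return rows


ratios = compare(parse_log(NO_DICT_LOG), parse_log(DICT_LOG))
for level, (size_ratio, time_ratio) in ratios.items():
    print(f"level {level}: size x{size_ratio:.2f}, time x{time_ratio:.2f}")
```

On these numbers the dictionary-enabled files come out roughly 1.4 to 1.5 times larger at every level, and every run is slower as well. Note that each configuration was timed only once, so the time ratios are much less reliable than the size ratios.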

Can you help me understand the significance of this? Shouldn't the output with an empty dictionary perform at least as well as Zstd compression with the dictionary disabled?

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
