How to set Git compression level?

Git is able to compress objects and pack files. How can I set this compression level?

Normally, the answer is to set core.compression or pack.compression.

However, I tried setting these to 1 or 9 and then running git gc --aggressive and, in a separate test, git repack -a -d. Neither changed the size of the .git folder in any meaningful way. I tried this on 14 GB of various open source repositories taken from GitHub. Source code is highly compressible, so there should be a difference. I ran these tests on Windows using the official Git 2.25.

I interpret these findings to mean that I did not manage to change the compression level. How can I actually change the Git compression level?

git


Solution 1:[1]

Important

When doing the repack, be sure to use -F so as to:

Pass the --no-reuse-object option to git-pack-objects

as the git repack documentation notes, and you discovered. Otherwise your new compression level won't apply to any existing objects.
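
As a rough sketch of that workflow in Python (the helper names `repack_command`, `pack_bytes` and `repack_and_measure` are made up for illustration, and it assumes `git` is on the PATH), the level can be supplied per invocation with `git -c` and the pack directory measured afterwards:

```python
import subprocess
from pathlib import Path


def repack_command(level: int) -> list[str]:
    """Build a full repack that recompresses existing objects at `level`.

    -a packs everything into one pack, -d deletes the now-redundant packs,
    and -F passes --no-reuse-object so old compressed data is not copied as-is.
    """
    if not -1 <= level <= 9:
        raise ValueError("zlib levels run from -1 (default) to 9")
    return ["git", "-c", f"pack.compression={level}", "repack", "-a", "-d", "-F"]


def pack_bytes(repo: str) -> int:
    """Total size in bytes of the files under <repo>/.git/objects/pack."""
    pack_dir = Path(repo) / ".git" / "objects" / "pack"
    return sum(p.stat().st_size for p in pack_dir.glob("*") if p.is_file())


def repack_and_measure(repo: str, level: int) -> int:
    """Repack `repo` at `level` and return the resulting pack size."""
    subprocess.run(repack_command(level), cwd=repo, check=True)
    return pack_bytes(repo)
```

Calling `repack_and_measure(path, 1)` and then `repack_and_measure(path, 9)` on the same repository gives two numbers to compare.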

Background

There are three knobs:

  • core.compression sets the default for both core.loosecompression and pack.compression. If this is not explicitly set, it leaves the other two at their settings or default settings.
  • core.loosecompression sets the compression level for loose objects (those not in a pack file). If neither it nor core.compression is set, it defaults to zlib's own "best speed" value (level 1).
  • pack.compression sets the pack compression default. If not set, it defaults to zlib's own "default compression" level (which may depend on your zlib but I think is generally 6; see https://www.euccas.me/zlib/).
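
To see what those zlib levels mean in isolation, here is a small Python sketch using the standard `zlib` module. The sample text is synthetic and invented for this example, so the exact sizes say nothing about real repositories:

```python
import zlib

# Synthetic, source-code-like sample; an assumption made for illustration only.
sample = "".join(
    f"def func_{i}(x):\n    return x * {i} + {i % 7}\n\n" for i in range(2000)
).encode()

# zlib's named constants correspond to the values Git's settings accept.
levels = {
    "best speed (loose default)": zlib.Z_BEST_SPEED,            # 1
    "zlib default (pack default)": zlib.Z_DEFAULT_COMPRESSION,  # -1
    "best compression": zlib.Z_BEST_COMPRESSION,                # 9
}

sizes = {name: len(zlib.compress(sample, lvl)) for name, lvl in levels.items()}

for name, size in sizes.items():
    print(f"{name:28s} {size:7d} bytes ({size / len(sample):.1%} of original)")
```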

But in pack files, the compression level may be much less relevant to the final pack file size. The reason is that a pack file (see Git's pack-format technical documentation for the details) is typically dominated by delta chains rather than by file content.

A loose object consists of the zlib-deflated Git header plus the raw file content. Here, the compression (and level) will generally have the same effect it would have if you did your own zlib compression, as the header is tiny compared to a typical file and its few bytes should not disturb substring-finding. The entire object is compressed without regard to any other objects.
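
A minimal Python sketch of how a loose blob is put together, assuming the usual SHA-1 object format (the function name `loose_blob` is hypothetical, and the default level of 1 mirrors the "best speed" default for loose objects):

```python
import hashlib
import zlib


def loose_blob(content: bytes, level: int = 1) -> tuple[str, bytes]:
    """Return (object id, bytes stored on disk) for a blob."""
    header = b"blob %d\x00" % len(content)   # type, size, NUL terminator
    store = header + content
    object_id = hashlib.sha1(store).hexdigest()
    # Header and content are deflated together as a single zlib stream.
    return object_id, zlib.compress(store, level)


oid, on_disk = loose_blob(b"hello world\n")
print(oid, len(on_disk))
```

The object ID should match what `git hash-object` reports for the same bytes, since it is computed over the uncompressed header plus content and does not depend on the compression level.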

A packed object, however, can either be a base object or a deltified object. If the packed object is a base object, its compression could be similar to that of a loose object. But if the packed object is deltified, it will consist of binary instructions, rather than text. These are unlikely to compress very well.

Suppose your average delta chain is 20 objects in length. This means that for every one base object, there are 19 deltified objects. Suppose compression works very well (say, to 35% of the original size) for the base object, and terribly (say, to 97% of the original size) for the deltified objects. Suppose further that the average size of the base object is 64K and the average size of the deltified object, including instructions, is 6.4K. Then improving these figures to, say, 32% and 94% respectively (which might be realistic, but I have not done any actual measuring) would take us from:

  • Original: 35%(65536) + 19 * (97%(6554)) = 22938 + 19 * 6357 = 143721
  • level-9: 32%(65536) + 19 * (94%(6554)) = 20972 + 19 * 6161 = 138031

That's not as big a gain as we might have expected: the loose object would shrink about 8.5% but the pack file shrank only about 4%.
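
The same back-of-the-envelope estimate can be recomputed in Python from the figures assumed in the thought experiment. None of them are measured, and `packed_size` is just an illustrative helper:

```python
def packed_size(base_ratio: float, delta_ratio: float,
                base_bytes: int = 65536, delta_bytes: int = 6554,
                deltas_per_base: int = 19) -> int:
    """Compressed size of one delta chain: one base plus its deltas."""
    base = round(base_ratio * base_bytes)
    delta = round(delta_ratio * delta_bytes)
    return base + deltas_per_base * delta


before = packed_size(0.35, 0.97)
after = packed_size(0.32, 0.94)

loose_gain = 1 - 0.32 / 0.35          # a lone base object
pack_gain = 1 - after / before        # the whole chain

print(f"chain: {before} -> {after} bytes ({pack_gain:.1%} smaller)")
print(f"loose: {loose_gain:.1%} smaller")
```

Changing `deltas_per_base` or the two ratios shows how quickly the poorly compressing deltas come to dominate the total.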

(The results of doing various packing experiments on real Git data, rather than these thought experiments, would be interesting. Even more interesting might be trying some of the other compression algorithms mentioned in the first link above.)

Solution 2:[2]

How to set Git compression level?

https://git-scm.com/docs/git-config

git config core.compression -1
# default compression level
# -1 is default.
# 0 means no compression,
# and 1..9 are various speed/size tradeoffs,
# 9 being slowest.

git config core.looseCompression -1
# compression level for objects
# that are not in a pack file.

git config pack.compression -1
# compression level for objects in a pack file.
#
# Note that changing the compression level
# will not automatically recompress all existing objects.
# You can force recompression
# by passing the -F option to git-repack
# example: git repack -a -d -F

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Mila Nautikus