Alpine APK Package Repositories, how are the checksums calculated?

I'm trying to work out how the pull checksum for packages is calculated within Alpine APK package repositories. The documentation regarding the format is lacking in any detail.

Running apk index -o APKINDEX.unsigned.tar.gz *.apk generates the repository index. The text file extracted from inside the archive contains the following:

C:Q17KXT6xFVWz4EZDIbkcvXQ/uz9ys=
P:redis-server
V:3.2.3-0
A:noarch
S:2784844
I:102400
T:An advanced key-value store
U:http://redis.io/
L:
D:linux-headers

I'm interested in how the very first line is generated. I've tried to read the actual source that's used to generate this, but I'm not a C programmer, so it's hard for me to comprehend as it jumps all over the place.

The two files mentioned in the documentation are database.c and package.c.

In case it helps, the original APK file has these hashes:

CRC32 = ac17ea88
MD5 = a035ecf940a67a6572ff40afad4f396a
SHA1 = eca5d3eb11555b3e0464321b91cbd743fbb3f72b
SHA256 = 24bc1f03409b0856d84758d6d44b2f04737bbc260815c525581258a5b4bf6df4


Solution 1:[1]

So...

/* Internal cointainer for MD5 or SHA1 */
struct apk_checksum {
    unsigned char data[20];
    unsigned char type;
};

Basically, take the C: value, strip the leading Q1 prefix (Q marks the base64 encoding and 1 marks the type, which defaults to SHA1), then base64-decode the rest. What remains is the 20-byte SHA1 held in data. It appears to be computed over the CONTENTS of the package, but confirming that would take a closer look.
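
The decoding step can be sketched in Python. This is a minimal sketch, assuming the C: value carries a two-character prefix (Q for the base64 encoding, 1 for SHA1, as Solution 3 describes) followed by the base64 of a 20-byte digest; the function name decode_apk_checksum is illustrative and not part of apk-tools.

```python
import base64


def decode_apk_checksum(field: str) -> str:
    """Return the hex SHA1 digest encoded in an APKINDEX 'C:' value."""
    # Assumption: 'Q' marks base64 encoding and '1' marks SHA1.
    if not field.startswith("Q1"):
        raise ValueError("expected a 'Q1' (base64 SHA1) prefix")
    digest = base64.b64decode(field[2:])
    if len(digest) != 20:
        raise ValueError("a SHA1 digest must be 20 bytes long")
    return digest.hex()


# The C: value from the question.
print(decode_apk_checksum("Q17KXT6xFVWz4EZDIbkcvXQ/uz9ys="))
```

For the value in the question this prints eca5d3eb11555b3e0464321b91cbd743fbb3f72b, which is the same SHA1 the question lists for the original APK file.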

Solution 2:[2]

The pull checksum is the sha1sum of the second tar.gz file in the apk file, containing the .PKGINFO file.

An Alpine APK package is actually three tar.gz files concatenated into one.

We can split the package below into 3 .gz files using gunzip-split, then rename them to .tar.gz:

./gunzip-split -s -o ./out/ strace-5.14-r0.apk
mv ./out/file_1.gz ./out/file_1.tar.gz
mv ./out/file_2.gz ./out/file_2.tar.gz
mv ./out/file_3.gz ./out/file_3.tar.gz

sha1sum ./out/file_2.tar.gz
7a266425df7bfd7ce9a42c71a015ea2ae5715838  out/file_2.tar.gz

tar tvf out/file_2.tar.gz 
-rw-r--r-- root/root       702 2021-09-03 01:34 .PKGINFO

For the strace package, we can check this against the index: generate it, take the C: value from APKINDEX, drop its Q1 prefix, and base64-decode the rest:

apk index strace-5.14-r0.apk -o APKINDEX.tar.gz
tar xvf APKINDEX.tar.gz
cat APKINDEX

echo eiZkJd97/XzppCxxoBXqKuVxWDg=|base64 -d|xxd
00000000: 7a26 6425 df7b fd7c e9a4 2c71 a015 ea2a  z&d%.{.|..,q...*
00000010: e571 5838                                .qX8

When comparing them we see that they match.
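
The split-and-hash steps above can also be sketched in Python without gunzip-split. This is a minimal sketch, assuming the package is a plain concatenation of gzip members and that the segment holding .PKGINFO is the second member, as in the strace example; the function names are illustrative and not part of apk-tools.

```python
import base64
import hashlib
import zlib


def split_gzip_members(data: bytes) -> list:
    """Split concatenated gzip streams into the raw bytes of each member."""
    members = []
    while data:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        d = zlib.decompressobj(31)
        d.decompress(data)
        if not d.eof:
            raise ValueError("truncated gzip member")
        # Whatever zlib did not consume belongs to the following members.
        consumed = len(data) - len(d.unused_data)
        members.append(data[:consumed])
        data = data[consumed:]
    return members


def control_checksum(apk_bytes: bytes, index: int = 1) -> str:
    """Return 'Q1' + base64(SHA1) of one gzip member of the package."""
    # Assumption: member 1 (the second one) is the segment with .PKGINFO.
    member = split_gzip_members(apk_bytes)[index]
    digest = hashlib.sha1(member).digest()
    return "Q1" + base64.b64encode(digest).decode("ascii")
```

If those assumptions hold, passing the bytes of strace-5.14-r0.apk to control_checksum should give the same string as the C: field in APKINDEX. A package with a different number of segments would need a different index.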

References

https://github.com/martencassel/apk-tools/blob/master/README.md

https://gitlab.com/cg909/gunzip-split/-/releases

https://lists.alpinelinux.org/~alpine/devel/%3C257B6969-21FD-4D51-A8EC-95CB95CEF365%40ferrisellis.com%3E

Solution 3:[3]

You need to look here: https://git.alpinelinux.org/cgit/apk-tools/tree/src/blob.c#n492

The function is apk_blob_pull_csum.

The first character, 'Q', stands for the encoding. The next one, '1', stands for SHA1.

It looks like this checksum is produced in database.c, in apk_db_unpack_pkg:

apk_sign_ctx_init(&ctx.sctx, APK_SIGN_VERIFY_IDENTITY, &pkg->csum, db->keys_fd);
tar = apk_bstream_gunzip_mpart(bs, apk_sign_ctx_mpart_cb, &ctx.sctx);
r = apk_tar_parse(tar, apk_db_install_archive_entry, &ctx, TRUE, &db->id_cache);

but I'm not sure, because I couldn't trace this code any further.

It is really not easy to understand what they are doing.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Lee
Solution 2
Solution 3 Max Lapshin