How can I create a tarball with UID/GID of root?

I'm trying to amend this code so that the UID and GID of the files inserted into the tarball belong to root.

import tarfile

sources = [ 'test-directory', 'another-directory/file1' ]

with tarfile.open("/tmp/test.tar","w") as tarball:
    for source in sources:
        tarball.add(source)

sources is a mixed list of directory names and filenames. With the above code, all the files are there, but with my user's UID and GID. If I were doing this on the command line, I'd prefix the call to tar with fakeroot.

In Python (3), if I try just looking at one directory:

import tarfile
import glob

with tarfile.open("/tmp/test.tar","w") as tarball:
    for filename in glob.iglob('test-directory/**', recursive=True):
        info = tarball.gettarinfo(filename)
        info.uid = 0
        info.gid = 0
        info.uname = 'root'
        info.gname = 'root'
        tarball.addfile(info)

That gets me the proper ownership, but files in the test-directory tree are missing because I can't get the glob to work satisfactorily.

How can I do this?



Solution 1:[1]

After reading the source (tarfile.py), I added this function, based on the add() method there.

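import os

def add_tarinfo(tarball, tarinfo, name, arcname, fakeroot):
    # Optionally rewrite ownership so the member appears to belong to root.
    if fakeroot:
        tarinfo.uid = 0
        tarinfo.gid = 0
        tarinfo.uname = 'root'
        tarinfo.gname = 'root'
    if tarinfo.isreg():
        # Regular files need their contents written after the header.
        with open(name, "rb") as f:
            tarball.addfile(tarinfo, f)
    elif tarinfo.isdir():
        # Add the directory entry itself, then recurse into its children.
        tarball.addfile(tarinfo)
        for f in os.listdir(name):
            nname = os.path.join(name, f)
            narcname = os.path.join(arcname, f)
            ntarinfo = tarball.gettarinfo(nname, narcname)
            add_tarinfo(tarball, ntarinfo, nname, narcname, fakeroot)
    else:
        # Other member types (symlinks, FIFOs, devices) are header-only.
        tarball.addfile(tarinfo)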

So the original code becomes:

with tarfile.open("/tmp/test.tar","w") as tarball:
    for arcname in self.sources:
        name = os.path.join(self.source_path, arcname)
        tarinfo = tarball.gettarinfo(name=name, arcname=arcname)
        add_tarinfo(tarball, tarinfo, name, arcname, True)
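To confirm that the recursion rewrote ownership on every member, the archive can be read back and inspected. The following is only a minimal sketch: member_owners and all_owned_by_root are hypothetical helper names introduced here for illustration, not part of tarfile or of the answer above.

```python
import tarfile


def member_owners(tar_path):
    """Return {member name: (uid, gid, uname, gname)} for an archive."""
    # Mode "r" lets tarfile detect compression transparently.
    with tarfile.open(tar_path, "r") as tarball:
        return {
            member.name: (member.uid, member.gid, member.uname, member.gname)
            for member in tarball.getmembers()
        }


def all_owned_by_root(tar_path):
    """Return True if every member is recorded as root:root (uid 0, gid 0)."""
    return all(
        owner == (0, 0, "root", "root")
        for owner in member_owners(tar_path).values()
    )
```

For example, all_owned_by_root("/tmp/test.tar") should be True once the loop above has run with fakeroot set to True. On the command line, tar -tvf /tmp/test.tar shows the same ownership columns.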

Solution 2:[2]

I achieved this using the filter parameter (available since Python 3.2) of TarFile.add() method (docs.python.org):

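import tarfile

def fakeroot_filter(tarinfo):
    # Called by add() for each member; the returned TarInfo is what gets stored.
    tarinfo.gid = 0
    tarinfo.uid = 0
    tarinfo.gname = 'root'
    tarinfo.uname = 'root'
    return tarinfo

# path is the directory that contains data/ (defined elsewhere).
with tarfile.open('data.tgz', 'w:gz', format=tarfile.GNU_FORMAT) as arc:
    arc.add(f'{path}/data', arcname='data', filter=fakeroot_filter)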

Some additional info about this feature can be found in the related issue on bugs.python.org.
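Applied to the question's loop over a mixed list of directories and files, the filter approach might look like the sketch below. make_root_tarball is a hypothetical wrapper name introduced here for illustration; the rest uses only tarfile.open() and TarFile.add(). Because add() recurses into directories by default and calls the filter for every member it visits, no glob is needed.

```python
import tarfile


def fakeroot_filter(tarinfo):
    """Rewrite ownership metadata so the member appears to belong to root."""
    tarinfo.uid = 0
    tarinfo.gid = 0
    tarinfo.uname = "root"
    tarinfo.gname = "root"
    return tarinfo


def make_root_tarball(tar_path, sources):
    """Write sources (files and/or directories) to tar_path as root:root."""
    with tarfile.open(tar_path, "w") as tarball:
        for source in sources:
            # add() recurses into directories; the filter runs on each member.
            tarball.add(source, filter=fakeroot_filter)
```

With the question's inputs this would be called as make_root_tarball("/tmp/test.tar", ["test-directory", "another-directory/file1"]). Only the metadata stored in the archive changes; the files on disk keep their real owner.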

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Jamie
Solution 2   eltio