Why is vec![0; <super large number>] so memory-efficient when the default value is 0?

I am dealing with huge amounts of data in Rust and while evaluating the memory usage of data structures, I stumbled upon some surprising results.

First, I tried filling a vector manually by just pushing zeroes to it:

let mut arr = vec![];

for _ in 0..4_294_967_294u32 {
    arr.push(0);
}

After a while, quite expectedly I would say, my computer ran out of available memory and the process was killed by the OS.

However, if I initialise the vector using the vec! macro, the behaviour changes:

let mut arr = vec![0; 2_147_483_647_000];

// rng is any rand::RngCore implementation, e.g. rand::thread_rng()
for i in 1..1_000_000_000 {
    arr[i - 1] = rng.next_u64();

    let sample = rng.next_u32();
    let res = arr[sample as usize];
    if i % 10000000 == 0 {
        print!("After {} ", i);
        print!("Btw, arr[{}] == {} ", sample, res);
        print_allocated_memory();
    }
}

Even though I filled a billion entries with actual u64 values and read random values back out of the array (mostly zeroes; I only did this to stop the compiler from optimising the whole array away), my computer did not run out of memory.

Memory usage per jemalloc was this (please note that my computer only has 16 GB of RAM installed):

allocated 16777216.05 MiB resident 16777223.02 MiB

... whereas my OS reported a maximum of roughly 8000M (measured in htop) right at the end of the code.

Curiously, if I use any default value other than 0 (be it 1 or 100), the macro runs out of memory before finishing vector creation, so it surely has something to do with the initial value being 0.

I wonder what the macro does to keep the resulting data structure so memory efficient? Are the elements in the array not really created? And if not, then how can it just work with me reading out random indices from the vector?

I have already checked the documentation, but it only says that the element type must implement Clone, which does not tell me much for primitive types.



Solution 1:[1]

When allocating memory for a vector, there are a few allocation built-ins that can be used. When the element type is numeric and the initial value given is zero, __rust_alloc_zeroed is used.

On Unix-compatible systems, the default implementation for this allocator function can use either calloc() or posix_memalign().

calloc() guarantees that the allocation is zeroed; posix_memalign() does not. If the latter is used, the Rust allocator will zero the memory itself.

Given the behavior you've observed, the only plausible explanation is that calloc() is used. Since the request cannot be satisfied by the library from previously-freed memory (the allocation is far too large), it is passed on to the kernel, which creates entries in the process's page table for the requested allocation.

However, the OS doesn't have to actually assign a region of physical memory to each page in the allocation. It can defer this until the page is first touched (demand paging), and it can promise more virtual memory than physically exists, a technique called overcommitment.

Reading from or writing to an address in the allocation region will trigger a page fault if it's not yet backed by physical memory. When this fault happens, the kernel resolves it by assigning a region of memory to the accessed page.

The end result of all of this is that if you create a vector of a numeric type where the initial value is zero, very little system memory is actually used by that allocation initially. Nearly all of the allocation is within pages that don't have backing system memory yet, analogous to a hole in a sparse file. As you write to the vector, the system will start assigning physical memory to the allocation and your used memory (and/or used swap) will start to increase.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
