'Generating a shuffled range in python
I need to generate (iterator) a random sequence of 32 bit integers with absolutely no repeats. The naive solutions to this problem don't work for me because of the massive sizes involved, for instance I can't use random.shuffle(range(...))
because that requires loading the very large range into memory, and I can't check every new value for a previous appearance and reroll if found because that would require keeping all previous items in memory.
Is there an algorithm for generating non repeating random numbers in a range?
Solution 1:[1]
If you need a relatively small number of unique numbers, random.sample
would be perfect here. The documentation gives a relevant example for your use-case:
To choose a sample from a range of integers, use a
range()
object as an argument. This is especially fast and space efficient for sampling from a large population:sample(range(10000000), k=60)
.
So if you want, let's say, 1000 unique 32 bit numbers, you could do:
random.sample(range(1 << 32), k=1000)
Solution 2:[2]
Encryption is a one-to-one process, so if the inputs are all unique then the outputs must also be unique. Hence if you encrypt the numbers 0, 1, 2, ... you will get a sequence of unique outputs in random order. For 32 bits you will need a 32-bit block cipher. You will need to keep the same key for as long as you want no repetitions. All you will need to store is the key and where you have got to in the range 0..2^32.
Unfortunately, 32 bits is not a common block size since it is too small to be cryptographically secure. Hasty Pudding is one cipher that allows a 32 bit block size. Alternatively is is possible to write your own 32 bit Feistel cipher reasonably easily. For randomness, four or six rounds will be sufficient.
Solution 3:[3]
You can throw your own pseudorandom number generator, for instance a xorshift one. Wikipedia lists an implementation of a 32-bit three xorshift algorithm, which I translate to Python here:
import random
def gen_random():
# The state word must be initialized to non-zero
# We use random to generate this "seed"
state = random.randrange(1, 1 << 32)
while True:
# Algorithm "xor" from p. 4 of Marsaglia, "Xorshift RNGs"
x = state
x ^= (x << 13) & 0xffffffff
x ^= x >> 17
x ^= (x << 5) & 0xffffffff
state = x
yield state
# demo. print 5 random numbers
it = gen_random()
for i in range(5):
print(next(it))
According to the same article, this generator has a period of 232?1, which means there will be no repetitions when you would pull that many numbers from it. Given that state=0
would be a "sink", in that this would only keep yielding 0, this means that 0 will never be generated, but all other values (in the range of 32-bit unsigned numbers) eventually will.
If you want to have 0 as a possible output too, and a period of 232, you can easily insert that 0 at a random index in the series, like so:
def gen_random():
# The state word must be initialized to non-zero
# We use random to generate this "seed"
state = random.randrange(1, 1 << 32)
# Determine where to insert a zero in the series
zeroindex = random.randrange(1 << 32)
i = 0
while True:
if i == zeroindex:
yield 0
else:
# Algorithm "xor" from p. 4 of Marsaglia, "Xorshift RNGs"
x = state
x ^= (x << 13) & 0xffffffff
x ^= x >> 17
x ^= (x << 5) & 0xffffffff
state = x
yield state
i = (i + 1) & 0xffffffff
Solution 4:[4]
You can start with a randomly chosen number and then during each subsequent step you can flip a different combination of bits of that original number. A single bit flip can be achieved by using XOR (^
) with a number that has a 1
bit at the desired position. There are 2**32
possible bit flip combinations since this is the number of distinct 32-bit integers. So the different bit flip masks are the integers from 0
to 2**32-1
themselves and you can simply iterate over the corresponding range:
import random
def generate(n=32):
number = random.randrange(2**n)
for i in range(2**n):
yield number ^ i
Using the same principle but improving the random stream by adding recursion for the masks and a variable increment larger than 1:
import itertools as it
import random
PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997, 1009, 1013, 1019, 1021, 1031, 1033, 1039, 1049, 1051, 1061, 1063, 1069, 1087, 1091, 1093, 1097, 1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171, 1181, 1187, 1193, 1201, 1213, 1217, 1223]
def generate(n, *, depth=25, d=0):
number = random.randrange(n)
increment = random.choice(PRIMES) # choose a number coprime to `n`
if d < depth:
masks = generate(n, depth=depth, d=d+1)
else:
masks = it.accumulate(it.repeat(increment, n-1))
yield number
for i in masks:
yield number ^ (i % n)
for p in range(4, 10):
n = 2**p
assert set(generate(n)) == set(range(n))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Jasmijn |
Solution 2 | rossum |
Solution 3 | |
Solution 4 |