'Nested dictionary (with defaults) surpisingly slow updating/inserting a large number of values
I have some swift that looks like this:
import Foundation
// simulated data...
var sequence = [Int]()
for _ in 0..<10_000 {
let random = Int.random(in: 1...5)
sequence.append(random)
}
// code to optimize...
var dict: [Int: [Int: Int]] = [:]
for (a, b) in zip(sequence, sequence[1...]) {
dict[a, default: [:]][b, default: 0] += 1
}
Basically, it keeps track of all the possible options from a and the counts for each a -> b pair occurence.
In a Playground it takes about ~10 seconds. And in a proper test suite with self.measure { ... } it's not that much faster.
Re-written in Python:
from collections import defaultdict
import random
sequence = []
for _ in range(10_000):
rand = random.choice([1, 2, 3, 4, 5])
sequence.append(rand)
my_dict = defaultdict(lambda: defaultdict(lambda: 0))
for a, b in zip(sequence, sequence[1:]):
my_dict[a][b] += 1
... it basically runs instantly.
So two questions:
- What's happening here?! I thought swift dictionaries were O(1)?
- Most importantly, how can I make this nested dictionary faster / more performant?
Solution 1:[1]
A bit faster with Shadowrun's help:
import Foundation
typealias NestedDict = [Int: [Int: Int]]
var sequence = (0..<10_000).map { _ in Int.random(in: 1...5) }
var dict = NestedDict()
dict = zip(sequence, sequence[1...]).reduce(into: NestedDict()) { partialResult, pair in
partialResult[pair.0, default: [:]][pair.1, default: 0] += 1
}
Solution 2:[2]
The performance is because of all the copy-on-write behaviour your triggering with those mutations. See https://developer.apple.com/documentation/swift/dictionary/3127177-reduce for the reduce into method and the performance discussion there
var sequence = [Int]()
for _ in 0..<10_000 {
let random = Int.random(in: 1...5)
sequence.append(random)
}
// 16-18 seconds
(0..<10_000).reduce(into: []) { partialResult, i in
partialResult.append(Int.random(in: 1...5))
}
// 0.09 seconds
(0..<10_000).map { _ in Int.random(in: 1...5) }
// 0.07 seconds
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | emehex |
| Solution 2 |
