'Why are finding elements most efficient in arrays in c++?
I need a fast STL container for finding if an element exists in it, so I tested arrays, vectors, sets, and unordered sets. I thought that sets were optimized for finding elements, because of unique and ordered values, but the fastest for 10 million iterations are: arrays (0.3 secs) vectors (1.7 secs) unordered sets (1.9 secs) sets (3 secs)
Here is the code:
#include <algorithm>
#include <iostream>
#include <set>
#include <unordered_set>
#include <vector>
int main() {
using std::cout, std::endl, std::set, std::unordered_set, std::vector, std::find;
int i;
const long ITERATIONS = 10000000;
int a[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
for (int i = 0; i < ITERATIONS; i++) {
if (find(a, a + 16, rand() % 64) == a + 16) {}
else {}
}
vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
for (i = 0; i < ITERATIONS; i++) {
if (find(v.begin(), v.end(), rand() % 64) == v.end()) {}
else {}
}
set<int> s({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
for (i = 0; i < ITERATIONS; i++) {
if (find(s.begin(), s.end(), rand() % 64) == s.end()) {}
else {}
}
unordered_set<int> us({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
for (i = 0; i < ITERATIONS; i++) {
if (find(us.begin(), us.end(), rand() % 64) == us.end()) {}
else {}
}
}
Solution 1:[1]
You are not measuring efficiency, you are measuring performance. And doing so badly.
The effect of address space randomization or just different usernames or other variable in env has up to about 40% effect on speed. That's more than the difference between -O0 and -O2. You are only measuring one single system with one single address space layout 10000000 times. That makes the value about meaningless.
And yet you still manage to figure out that for 16 ints any attempt to be better than just looking at all of them will perform worse. A simple linear search in a single cache line (or two if the layout is bad, more likely) is just simply the best way.
Now try again with 10000000 ints and run the binary 1000 times. Even better if you use a layout randomizer to truly exclude accidents of the layout from the timing.
Note: Iirc the limit for sort where an array with bubble sort is best is somewhere between 32 and 64 ints and find is even simpler.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Goswin von Brederlow |
