Algorithm for optimal pair-wise processing and caching
I'm looking for an algorithm that probably already exists but I cannot find its name.
Given a large data set with N elements, I want to compute the pair-wise distance among all N elements (an NxN matrix). However, each element is large, so only a subset of them fits in memory; say only M elements can be held in memory at any given time.
What I'm looking for is an algorithm that orders the reads and distance computations so as to minimize the number of element reads from disk. Computing a distance is fast, but reading an element from disk is slow.
Ideally, I would need two flavours: one where distances are not symmetrical (so I need to compute the full NxN matrix except the main diagonal), and another where they are symmetrical (so I only need the upper triangle of the NxN matrix). In addition, it would be great if the algorithm could be easily parallelized.
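To make the memory constraint concrete, here is a minimal sketch of the blocked (tiled) traversal this kind of problem usually calls for: split the N elements into blocks of at most M/2, hold one "row" block and one "column" block in memory at a time, and skip the lower-triangular block pairs in the symmetric case. The `load` and `distance` callables are hypothetical placeholders standing in for the slow disk read and the fast distance computation; this is an illustration, not a claim about a specific named algorithm.

```python
import numpy as np

def pairwise_blocked(n, m, load, distance, symmetric=False):
    """Compute pairwise distances holding at most m elements in memory.

    load(i)        -- hypothetical slow disk read of element i
    distance(a, b) -- hypothetical fast distance between two elements
    symmetric      -- if True, fill only the upper triangle
    """
    b = max(m // 2, 1)  # each of the two resident blocks holds up to m/2 elements
    result = np.zeros((n, n))
    blocks = [range(s, min(s + b, n)) for s in range(0, n, b)]
    for bi, rows in enumerate(blocks):
        row_elems = {i: load(i) for i in rows}  # one disk read per row element
        # Diagonal block: pairs within the resident row block, no extra reads.
        for i in rows:
            for j in rows:
                if i != j and (not symmetric or i < j):
                    result[i, j] = distance(row_elems[i], row_elems[j])
        # Off-diagonal blocks: load each column block once per row block.
        for bj, cols in enumerate(blocks):
            if bj == bi or (symmetric and bj < bi):
                continue  # symmetric case: skip lower-triangular block pairs
            col_elems = {j: load(j) for j in cols}
            for i in rows:
                for j in cols:
                    result[i, j] = distance(row_elems[i], col_elems[j])
    return result
```

Each (row block, column block) pair is independent of the others, so the outer two loops can be distributed across workers for parallelization; the total number of disk reads scales with N²/M rather than N².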
I'm pretty sure this algorithm already exists; if it has an established name, a pointer to it would be enough.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
