Use thread-specific pre-allocated data to run tasks in parallel
I want to run some tasks in parallel. Each task uses heap-allocated data. To speed things up, I would like each thread to reuse the same data instead of deallocating it and immediately reallocating it. Is this feasible?
Here is a basic example of what I want to do:
use rayon::prelude::*;
use std::collections::HashMap;

fn main() {
    // Data for the tasks to run in parallel.
    let tasks: Vec<_> = (0..1000).collect();
    let task_results: Vec<_> = tasks
        .par_iter()
        .map(|task| {
            // Allocate heap-allocated data.
            let data: HashMap<usize, usize> = HashMap::with_capacity(1024);
            // Do something with the heap-allocated data and the task.
            // Drop the heap-allocated data and return the result from the task.
            task * 2
        })
        .collect();
}
Each task uses a HashMap for its computation, and the HashMap is dropped when the task is done. How can I make each thread use a single HashMap that is cleared before each new task runs?
Solution 1:[1]
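You can use map_with. Rayon clones the initial HashMap only as needed, roughly once per chunk of work it hands to a thread rather than once per task, and passes a mutable reference to that clone to your closure along with each item: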
use rayon::prelude::*;
use std::collections::HashMap;

fn main() {
    // Data for the tasks to run in parallel.
    let tasks: Vec<_> = (0..10).collect();
    let task_results: Vec<_> = tasks
        .par_iter()
        .map_with(
            HashMap::<usize, usize>::with_capacity(1024),
            |data, task| {
                // Clear the data for this run; clear() keeps the allocated memory.
                data.clear();
                // Do something with the heap-allocated data and the task.
                task * 2
            },
        )
        .collect();
    println!("{:?}", task_results);
}
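If you need exactly one map per thread, one alternative is a thread_local! scratch map. This is not the approach from the answer above, just a minimal sketch using only the standard library. The names SCRATCH, run_task and run_all are hypothetical, and the chunked std::thread::scope driver merely stands in for whatever thread pool you actually use.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::thread;

thread_local! {
    // One scratch map per thread, allocated lazily on that thread's first use.
    static SCRATCH: RefCell<HashMap<usize, usize>> =
        RefCell::new(HashMap::with_capacity(1024));
}

// Hypothetical task: reuses the current thread's scratch map.
fn run_task(task: usize) -> usize {
    SCRATCH.with(|cell| {
        let mut data = cell.borrow_mut();
        // clear() removes the entries but keeps the allocated memory.
        data.clear();
        // Placeholder work on the scratch map.
        data.insert(task, task * 2);
        data[&task]
    })
}

// Hypothetical driver: splits the tasks into chunks, one scoped thread per chunk.
fn run_all(tasks: &[usize], num_threads: usize) -> Vec<usize> {
    let n = num_threads.max(1);
    // chunks() panics on a chunk length of 0, so keep it at least 1.
    let chunk_len = ((tasks.len() + n - 1) / n).max(1);
    thread::scope(|scope| {
        // Spawn every thread first, then join them in order.
        let handles: Vec<_> = tasks
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|&t| run_task(t)).collect::<Vec<_>>())
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|handle| handle.join().unwrap())
            .collect()
    })
}

fn main() {
    let tasks: Vec<usize> = (0..1000).collect();
    let results = run_all(&tasks, 4);
    println!("{:?}", &results[..5]);
}
```

Because the scratch map belongs to the thread rather than to the iterator, a function like run_task could also be called from inside an ordinary Rayon map closure. Each worker thread would then keep its own map for as long as the pool lives.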
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jmb |
