Multiple buffers on the same file

The procedure is as follows.

  1. Filter a huge file, File.txt (a FASTQ file, if you are interested), line by line using file streaming in C.

  2. After each filtering process, the output is a filtered_i.txt file.

  3. Repeat steps 1-2 with 1000 different filters.

  4. The expected results are 1000 filtered_i.txt files, i from 1 to 1000.

The question is:

Can I run these filtering processes in parallel?

My concern is that running them in parallel would open multiple buffers on File.txt. Is that safe to do? Are there any potential drawbacks?



Solution 1:[1]

I would advise against opening the file multiple times in parallel. This puts a lot of strain on the OS, and if all of your threads are streaming at once, performance will drop significantly because of thrashing as the readers compete for the disk. You would be much better off streaming the file serially, even if it is large. If you do want a parallel solution, I'd suggest making one thread the "streamer": it reads a certain number of chunks from the file and then hands those chunks off to the other threads.
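
A rough sketch of the "streamer" idea with POSIX threads, in case it helps. Everything named here is an assumption for illustration and not taken from the question: the `Filter` struct, the substring match standing in for a real filter, the `filter_stream` function, and the `CHUNK_LINES`, `MAX_LINE` and `MAX_WORKERS` limits. The calling thread is the only one that reads the input; for each chunk it starts workers that each apply their own slice of the filters, then joins them before reading the next chunk.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_LINES 1024  /* lines handed to the workers per chunk (assumed) */
#define MAX_LINE    4096  /* assumed upper bound on line length, incl. '\n' */
#define MAX_WORKERS 16    /* arbitrary cap on worker threads */

/* Hypothetical filter: keep every line that contains `pattern`. */
typedef struct {
    const char *pattern;
    FILE *out;            /* e.g. filtered_i.txt, opened by the caller */
} Filter;

typedef struct {
    Filter *filters;          /* this worker's slice of the filters */
    int n_filters;
    char (*lines)[MAX_LINE];  /* shared chunk, only read by workers */
    int n_lines;
} Job;

static void *run_job(void *arg)
{
    Job *job = arg;
    for (int f = 0; f < job->n_filters; f++)
        for (int i = 0; i < job->n_lines; i++)
            if (strstr(job->lines[i], job->filters[f].pattern))
                fputs(job->lines[i], job->filters[f].out);
    return NULL;
}

/* Reads in_path exactly once. Returns 0 on success, -1 on error. */
int filter_stream(const char *in_path, Filter *filters, int n_filters,
                  int n_workers)
{
    assert(n_filters > 0 && n_workers > 0);
    if (n_workers > MAX_WORKERS) n_workers = MAX_WORKERS;
    if (n_workers > n_filters) n_workers = n_filters;

    FILE *in = fopen(in_path, "r");  /* the only handle on the input */
    if (!in) return -1;
    char (*chunk)[MAX_LINE] = malloc(CHUNK_LINES * sizeof *chunk);
    if (!chunk) { fclose(in); return -1; }

    int rc = 0;
    int per = (n_filters + n_workers - 1) / n_workers;  /* filters per worker */
    for (;;) {
        /* Streamer step: fill one chunk of lines. */
        int n = 0;
        while (n < CHUNK_LINES && fgets(chunk[n], MAX_LINE, in)) n++;
        if (n == 0) break;

        /* Fan the chunk out; each output file belongs to one worker only. */
        pthread_t tid[MAX_WORKERS];
        Job job[MAX_WORKERS];
        int started = 0;
        for (int w = 0; w < n_workers; w++) {
            int first = w * per;
            if (first >= n_filters) break;
            int count = n_filters - first < per ? n_filters - first : per;
            job[w] = (Job){ filters + first, count, chunk, n };
            if (pthread_create(&tid[w], NULL, run_job, &job[w]) != 0) {
                rc = -1;
                break;
            }
            started++;
        }
        /* Wait for all workers before the chunk buffer is overwritten. */
        for (int w = 0; w < started; w++) pthread_join(tid[w], NULL);
        if (rc != 0) break;
    }
    if (ferror(in)) rc = -1;
    free(chunk);
    fclose(in);
    return rc;
}
```

The caller would open each filtered_i.txt, fill in an array of `Filter`, call `filter_stream("File.txt", filters, 1000, 8)`, and close the outputs afterwards; build with `-pthread`. Two caveats: starting and joining threads per chunk is the simplest correct form, not the fastest (a persistent worker pool with a queue would avoid that overhead), and keeping 1000 output files open at once can run into the per-process open-file limit on many systems, so check `ulimit -n` first.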

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 fireshadow52