How to create multiple threads that write to the same file in C#

I want to write 10^5 lines to a file, each containing 10^5 randomly generated numbers, and I would like to know the fastest way to do this. My idea was to create 10^5 threads that run concurrently, each writing one line, so that the whole file is written in roughly the time it takes to write a single line.

public static void GenerateNumbers(string path)
{
    using(StreamWriter sw = new StreamWriter(path))
    {
        for (int i = 0; i < 100000; i++)
        {
            for (int j = 0; j < 100000; j++)
            {
                Random rnd = new Random();
                int number = rnd.Next(1, 101);
                sw.Write(number + " ");
            }
            sw.Write('\n');
        }
    }
}

This is how I am currently doing it. Is there a faster way?



Solution 1:[1]

Now that there's a code snippet, some optimization can be applied.

using System;
using System.Diagnostics;
using System.IO;
using System.Text;

static void Main(string[] args)
{
    var sw = new Stopwatch();
    const int pow = 5;
    sw.Start();
    GenerateNumbers("test.txt", pow);
    sw.Stop();
    Console.WriteLine($"Wrote 10^{pow} lines of 10^{pow} numbers in {sw.Elapsed}");
}

public static void GenerateNumbers(string path, int pow)
{
    // One Random instance, reused for every number.
    var rnd = new Random();
    using var sw = new StreamWriter(path, false);
    // Kept as a double so the progress ratio below is a floating-point division.
    var max = Math.Pow(10, pow);
    var sb = new StringBuilder();
    for (long i = 0; i < max; i++)
    {
        // Build the whole line in memory, then write it in a single call.
        for (long j = 0; j < max; j++)
        {
            sb.Append(rnd.Next(1, 101));
            sb.Append(' ');
        }
        sw.WriteLine(sb.ToString());
        sb.Clear();
        // Report progress every 100 lines.
        if (i % 100 == 0)
            Console.WriteLine((i / max).ToString("P"));
    }
}

The above code writes to disk at a fairly decent pace (remember that the limit is I/O speed, not CPU or number generation). Note also that I'm running the code inside a VM, so I'm probably not getting the best I/O results.

[Image: Resource Monitor]

  • As mentioned by Neil Moss in the comments, you don't need to instantiate the Random class on every iteration; a single instance is enough.
  • I build each line in memory using a StringBuilder, then write it to disk in one call.
  • Since this takes a while, I've added a progress tracker (this adds a minuscule amount of overhead).
  • A file of 10^4 lines of 10^4 numbers is already 285 MB in size and was generated in 4.6767592 seconds.
  • The 10^5 case shown above yields a 25.5 GB file and takes 5:54.2580683 to generate.
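
On the question's idea of having many threads write at once: since the limit here is I/O speed, one common pattern is to let a background thread generate lines while a single thread does all the writing. The following is a minimal sketch of that pattern and has not been benchmarked against the code above. The class name PipelinedNumberWriter, the lines/perLine parameters and the queue capacity of 16 are assumptions made for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class PipelinedNumberWriter
{
    // Writes `lines` lines of `perLine` random numbers in [1, 100] to `path`.
    public static void Generate(string path, int lines, int perLine)
    {
        // Bounded queue (capacity is an arbitrary choice) so the generator
        // cannot run far ahead of the disk and fill up memory.
        using var queue = new BlockingCollection<string>(boundedCapacity: 16);

        // Producer: builds each line in memory on a background thread.
        var producer = Task.Run(() =>
        {
            try
            {
                var rnd = new Random();
                var sb = new StringBuilder();
                for (int i = 0; i < lines; i++)
                {
                    sb.Clear();
                    for (int j = 0; j < perLine; j++)
                    {
                        sb.Append(rnd.Next(1, 101));
                        sb.Append(' ');
                    }
                    queue.Add(sb.ToString());
                }
            }
            finally
            {
                // Always signal completion so the consumer loop can end.
                queue.CompleteAdding();
            }
        });

        // Consumer: the only code that touches the file, so no locking
        // around the StreamWriter is needed.
        using (var sw = new StreamWriter(path, false))
        {
            foreach (var line in queue.GetConsumingEnumerable())
                sw.WriteLine(line);
        }

        // Rethrows any exception raised on the producer thread.
        producer.Wait();
    }
}
```

It would be called as PipelinedNumberWriter.Generate("test.txt", 100000, 100000). If the disk really is the bottleneck, the gain over the single-threaded version may be small, because the writer still spends most of its time waiting on I/O.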

I haven't tried this, but I wonder whether you could save time by writing the data to a ZIP file, assuming you care more about getting the data onto the disk than about the format itself. A compressed text file of numbers should be a fair bit smaller, and so could be faster to write.
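
As a rough illustration of the ZIP idea, here is an untested-for-speed sketch that streams the lines straight into a single entry of a ZIP archive using System.IO.Compression. The class name ZippedNumberWriter, the entry name "numbers.txt" and the choice of CompressionLevel.Fastest are assumptions; whether the smaller output outweighs the extra CPU cost of compressing would have to be measured.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZippedNumberWriter
{
    // Writes `lines` lines of `perLine` random numbers in [1, 100] into a
    // single text entry inside a new ZIP archive at `zipPath`.
    // `entryName` is a hypothetical default chosen for this sketch.
    public static void Generate(string zipPath, int lines, int perLine,
                                string entryName = "numbers.txt")
    {
        var rnd = new Random();

        // Declared in this order so disposal runs in reverse: the writer is
        // flushed first, then the archive is finalized, then the file closes.
        using var file = new FileStream(zipPath, FileMode.Create);
        using var archive = new ZipArchive(file, ZipArchiveMode.Create);
        var entry = archive.CreateEntry(entryName, CompressionLevel.Fastest);
        using var sw = new StreamWriter(entry.Open(), new UTF8Encoding(false));

        for (int i = 0; i < lines; i++)
        {
            for (int j = 0; j < perLine; j++)
            {
                sw.Write(rnd.Next(1, 101));
                sw.Write(' ');
            }
            sw.WriteLine();
        }
    }
}
```

A call such as ZippedNumberWriter.Generate("test.zip", 10000, 10000) would be a reasonable first comparison against the uncompressed 10^4 timing above.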

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1