How to write millions of double values into a txt file
I've made a neural network and now I need to save the results of the training process to a local file. In total, there are 7,155,264 values. I tried a loop like this:
string weightsString = "";
string biasesString = "";
for (int l = 1; l < layers.Length; l++)
{
    for (int j = 0; j < layers[l].Length; j++)
    {
        for (int k = 0; k < layers[l - 1].Length; k++)
        {
            weightsString += weights[l][j, k] + "\n";
        }
        biasesString += biases[l][j] + "\n";
    }
}
File.WriteAllText(@"path", weightsString + "\n" + biasesString);
But it literally takes forever to go through all of the values. Is there no way to write the contents directly without having to write them in a string first?
(weights is a double[][,] and biases is a double[][].)
Solution 1:[1]
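StringBuilder weightsSB = new StringBuilder();
StringBuilder biasesSB = new StringBuilder();
for (int l = 1; l < layers.Length; l++)
{
    for (int j = 0; j < layers[l].Length; j++)
    {
        for (int k = 0; k < layers[l - 1].Length; k++)
        {
            // Append value and newline separately to avoid an extra string concatenation.
            weightsSB.Append(weights[l][j, k]).Append('\n');
        }
        biasesSB.Append(biases[l][j]).Append('\n');
    }
}
// Write everything in one call, using the same layout as the code in the question.
File.WriteAllText(@"path", weightsSB + "\n" + biasesSB);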
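As suggested in the comments, I used a StringBuilder instead. Works like a charm. Each += on a string copies everything accumulated so far into a new string, so the original loop did quadratic work, while a StringBuilder appends into a buffer that grows as needed.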
Solution 2:[2]
First off, writing 7 million values will take some time. I'd suggest you split weights and biases into two files and write them on the fly; there is no need to keep them all in memory until you are done.
// append: false overwrites the files left by a previous run.
using StreamWriter weightStream = new("weights.txt", append: false);
using StreamWriter biasStream = new("biases.txt", append: false);
for (int l = 1; l < layers.Length; l++)
{
    for (int j = 0; j < layers[l].Length; j++)
    {
        for (int k = 0; k < layers[l - 1].Length; k++)
        {
            // WriteLineAsync has no overload for double, so convert to a string first.
            await weightStream.WriteLineAsync(weights[l][j, k].ToString());
        }
        await biasStream.WriteLineAsync(biases[l][j].ToString());
    }
}
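The streaming idea can also be written synchronously, which avoids awaiting once per value. The following is a minimal sketch rather than the answer author's code: the class name TextWeightStore and the method SaveWeights are hypothetical, and it assumes the question's double[][,] layout in which index 0 is unused. It formats with the invariant culture so the file does not depend on the machine's decimal separator.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper, named for illustration only.
public static class TextWeightStore
{
    // Writes every weight on its own line, layer by layer,
    // without building one large string first.
    public static void SaveWeights(string path, double[][,] weights)
    {
        // StreamWriter buffers internally, so each WriteLine is cheap.
        using var writer = new StreamWriter(path);

        // Layer 0 has no incoming weights in the question's layout, so start at 1.
        for (int l = 1; l < weights.Length; l++)
        {
            // foreach over a 2D array visits elements row by row,
            // the same order as the nested j/k loops.
            foreach (double w in weights[l])
            {
                // "R" asks for a representation that parses back to the same double.
                writer.WriteLine(w.ToString("R", CultureInfo.InvariantCulture));
            }
        }
    }
}
```

A call would look like TextWeightStore.SaveWeights("weights.txt", weights); the biases can be written the same way to a second file.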
Solution 3:[3]
Quoting the question: "But it literally takes forever to go through all of the values. Is there no way to write the contents directly without having to write them in a string first?"
One option would be to save it as binary data. This makes it much harder for humans to read, but for large amounts of data it is usually preferable, since it saves a lot of time when both reading and writing. For example, you can use BinaryWriter together with unsafe code:
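// Store the dimensions first so a reader knows how many values follow.
myBinaryWriter.Write(myArray.GetLength(0));
myBinaryWriter.Write(myArray.GetLength(1));
// Pin the array and view its memory as raw bytes (requires an unsafe context).
fixed (double* ptr = myArray)
{
    var span = new ReadOnlySpan<byte>(ptr, myArray.GetLength(0) * myArray.GetLength(1) * sizeof(double));
    myBinaryWriter.Write(span);
}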
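If unsafe code is not an option, a similar layout (two dimensions followed by the values) can be written one value at a time. This is a minimal sketch under that assumption; BinaryMatrixStore, Save and Load are hypothetical names, and the Load method is added only to show that the data survives a round trip.

```csharp
using System;
using System.IO;

// Hypothetical helper, named for illustration only.
public static class BinaryMatrixStore
{
    public static void Save(string path, double[,] matrix)
    {
        // Disposing the BinaryWriter also closes the underlying file stream.
        using var writer = new BinaryWriter(File.Create(path));
        writer.Write(matrix.GetLength(0));
        writer.Write(matrix.GetLength(1));

        // foreach visits a 2D array row by row.
        foreach (double value in matrix)
        {
            writer.Write(value); // 8 bytes per value, no text formatting
        }
    }

    public static double[,] Load(string path)
    {
        using var reader = new BinaryReader(File.OpenRead(path));
        int rows = reader.ReadInt32();
        int columns = reader.ReadInt32();

        var matrix = new double[rows, columns];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < columns; c++)
            {
                matrix[r, c] = reader.ReadDouble();
            }
        }
        return matrix;
    }
}
```

This is slower than writing the whole block at once, but it needs no unsafe context and is still much cheaper than formatting every value as text.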
You might also consider using a binary serialization library like protobuf-net that can take an object and serialize it to a stream. Note that some libraries may need attributes to be added to classes and properties. Some libraries may also have issues with multidimensional and/or jagged arrays. Because of this, it can sometimes be useful to define your own 2D array that uses a 1D array as the backing storage; this can make things like serialization or passing data to other components much simpler.
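A 2D array backed by a 1D array could look roughly like the sketch below. The type name Matrix2D and its members are hypothetical, and the file layout (two int dimensions followed by the raw doubles in machine byte order) is an assumption chosen for illustration, not a standard format.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical 2D matrix that keeps its values in one flat array.
public sealed class Matrix2D
{
    private readonly double[] data;

    public int Rows { get; }
    public int Columns { get; }

    public Matrix2D(int rows, int columns)
    {
        Rows = rows;
        Columns = columns;
        data = new double[rows * columns];
    }

    // Row-major indexing into the flat backing array.
    public double this[int row, int column]
    {
        get => data[row * Columns + column];
        set => data[row * Columns + column] = value;
    }

    public void Save(Stream stream)
    {
        using var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true);
        writer.Write(Rows);
        writer.Write(Columns);

        // Reinterpret the double[] as bytes and write it in a single call.
        ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(new ReadOnlySpan<double>(data));
        writer.Write(bytes);
    }

    public static Matrix2D Load(Stream stream)
    {
        using var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);
        int rows = reader.ReadInt32();
        int columns = reader.ReadInt32();
        var matrix = new Matrix2D(rows, columns);

        int byteCount = rows * columns * sizeof(double);
        byte[] raw = reader.ReadBytes(byteCount);
        if (raw.Length != byteCount)
        {
            throw new EndOfStreamException("Matrix data is truncated.");
        }

        // Copy the raw bytes straight into the double[] backing store.
        Buffer.BlockCopy(raw, 0, matrix.data, 0, byteCount);
        return matrix;
    }
}
```

Because the backing store is a plain double[], no unsafe block or pinning is needed, and the same array can be handed to other components as is.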
Another somewhat common practice is to store metadata, like height and width, in a simple human-readable text file using something like JSON or XML, while keeping the actual data in a separate raw binary file.
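The metadata-plus-raw-data split might be sketched as follows, using System.Text.Json for the readable part. MatrixMetadata and SidecarStore are hypothetical names, the metadata fields are an assumption about what a reader would need, and the sketch assumes the data file sits in the same folder as the metadata file.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical metadata shape; extend it with whatever a reader needs.
public sealed class MatrixMetadata
{
    public int Rows { get; set; }
    public int Columns { get; set; }
    public string DataFile { get; set; } = "";
}

public static class SidecarStore
{
    public static void Save(string metadataPath, string dataPath, double[,] matrix)
    {
        var metadata = new MatrixMetadata
        {
            Rows = matrix.GetLength(0),
            Columns = matrix.GetLength(1),
            DataFile = Path.GetFileName(dataPath),
        };

        // Human-readable part: a small JSON file describing the data.
        var options = new JsonSerializerOptions { WriteIndented = true };
        File.WriteAllText(metadataPath, JsonSerializer.Serialize(metadata, options));

        // Raw part: just the values, 8 bytes each, row by row.
        using var writer = new BinaryWriter(File.Create(dataPath));
        foreach (double value in matrix)
        {
            writer.Write(value);
        }
    }

    public static double[,] Load(string metadataPath)
    {
        MatrixMetadata metadata =
            JsonSerializer.Deserialize<MatrixMetadata>(File.ReadAllText(metadataPath))
            ?? throw new InvalidDataException("Metadata file is empty.");

        // Assumption: the data file is stored next to the metadata file.
        string folder = Path.GetDirectoryName(Path.GetFullPath(metadataPath)) ?? ".";
        using var reader = new BinaryReader(File.OpenRead(Path.Combine(folder, metadata.DataFile)));

        var matrix = new double[metadata.Rows, metadata.Columns];
        for (int r = 0; r < metadata.Rows; r++)
        {
            for (int c = 0; c < metadata.Columns; c++)
            {
                matrix[r, c] = reader.ReadDouble();
            }
        }
        return matrix;
    }
}
```

Keeping the dimensions out of the binary file means the raw data can also be inspected or loaded by other tools that only need to be told the shape.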
Solution 4:[4]
Bad variant - use JSON serialization
So-so variant - write to the file immediately, using File.AppendText
IMHO the best variant - use a database
IMHO a good variant - use BinaryFormatter (you will not be able to read the file yourself, but the application will; note that BinaryFormatter is marked obsolete in current .NET versions because deserializing untrusted data with it is unsafe)
Working variant - use StringBuilder
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Elia Giaccardi |
| Solution 2 | Christian O. |
| Solution 3 | JonasH |
| Solution 4 | AnatolyBelanov |