'Why is it faster to create a list of classes than a list of list of integers in C#?
Backgorund
Hi,
I am creating a game where the AI is collecting possible moves from positions which is done millions of times per turn. I am trying to figure out the fastest way to store these moves containing e.g. pieceFromSquare, pieceToSquare and other move related things.
Currently I have a Move class which contains public variables for the move as per below:
public class Move
{
public int FromSquare;
public int ToSquare;
public int MoveType;
public static Move GetMove(int fromSquare, int toSquare, int moveType)
{
Move move = new Move();
move.FromSquare = fromSquare;
move.ToSquare = toSquare;
move.MoveType = moveType;
return move;
}
}
When the AI finds a move it stores the move in a List. I was thinking if it would be faster to store the move as a list of integers instead so I ran a test:
Test
public void RunTimerTest()
{
// Function A
startTime = DateTime.UtcNow;
moveListA = new List<List<int>>();
for (int i = 0; i < numberOfIterations; i++)
{
FunctionA();
}
PrintResult((float)Math.Round((DateTime.UtcNow - startTime).TotalSeconds, 2), "Function A");
// Function B
startTime = DateTime.UtcNow;
moveListB = new List<Move>();
for (int i = 0; i < numberOfIterations; i++)
{
FunctionB();
}
PrintResult((float)Math.Round((DateTime.UtcNow - startTime).TotalSeconds, 2), "Function B");
}
private void FunctionA()
{
moveListA.Add(new List<int>() { 1, 123, 10});
}
private void FunctionB()
{
moveListB.Add(Move.GetMove(1, 123, 10));
}
The test results in the following result when running over 10 million times:
- Function A ran for: 4,58 s.
- Function B ran for: 1,47 s.
So it is more than 3 times faster to create the class and populate its variables than to create a list of integers.
Questions
- Why is it so much faster to create a class than a list of integers?
- Is there an even faster way to store this type of data?
Solution 1:[1]
I examined four options, and allocating arrays for each record is by for the slowest. I did not check allocating class objects, as I was going for the faster options.
struct Move {}stores three integer values into readonly fields of a strcuture.struct FMove {}stores an fixed int array of size 3(int,int,int)stores a tuple of three int valuesint[]allocates an array of three values.
With the following results. I am tracking time and size allocated in bytes.
| Storage | Iterations | Time | Size |
|---|---|---|---|
| Move struct | 10000000 allocations | 0.2633584 seconds | 120000064 bytes |
| FMove struct | 10000000 allocations | 0.3572664 seconds | 120000064 bytes |
| Tuple | 10000000 allocations | 0.702174 seconds | 160000064 bytes |
| Array | 10000000 allocations | 1.2226393 seconds | 480000064 bytes |
public readonly struct Move
{
public readonly int FromSquare;
public readonly int ToSquare;
public readonly int MoveType;
public Move(int fromSquare, int toSquare, int moveType) : this()
{
FromSquare = fromSquare;
ToSquare = toSquare;
MoveType = moveType;
}
}
public unsafe struct FMove
{
fixed int Data[3];
public FMove(int fromSquare, int toSquare, int moveType) : this()
{
Data[0] = fromSquare;
Data[1] = toSquare;
Data[2] = moveType;
}
public int FromSquare { get => Data[0]; }
public int ToSquare { get => Data[1]; }
public int MoveType { get => Data[2]; }
}
static class Program
{
static void Main(string[] args)
{
// Always compile with Release to time
const int count = 10000000;
Console.WriteLine("Burn-in start");
// Burn-in. Do some calc to spool up
// the CPU
AllocateArray(count/10);
Console.WriteLine("Burn-in end");
double[] timing = new double[4];
// store timing results for four different
// allocation types
var sw = new Stopwatch();
Console.WriteLine("Timming start");
long startMemory;
startMemory = GC.GetTotalMemory(true);
sw.Restart();
var r4 = AllocateArray(count);
sw.Stop();
var s4 = GC.GetTotalMemory(true) - startMemory;
timing[3] = sw.Elapsed.TotalSeconds;
startMemory = GC.GetTotalMemory(true);
sw.Restart();
var r1 = AllocateMove(count);
sw.Stop();
var s1 = GC.GetTotalMemory(true) - startMemory;
timing[0] = sw.Elapsed.TotalSeconds;
startMemory = GC.GetTotalMemory(true);
sw.Restart();
var r2 = AllocateFMove(count);
sw.Stop();
var s2 = GC.GetTotalMemory(true) - startMemory;
timing[1] = sw.Elapsed.TotalSeconds;
startMemory = GC.GetTotalMemory(true);
sw.Restart();
var r3 = AllocateTuple(count);
sw.Stop();
var s3 = GC.GetTotalMemory(true) - startMemory;
timing[2] = sw.Elapsed.TotalSeconds;
Console.WriteLine($"| Storage | Iterations | Time | Size |");
Console.WriteLine($"|---|---|---|---|");
Console.WriteLine($"| Move struct| {r1.Count} allocations | {timing[0]} seconds | {s1} bytes |");
Console.WriteLine($"| FMove struct| {r2.Count} allocations | {timing[1]} seconds | {s2} bytes |");
Console.WriteLine($"| Tuple| {r3.Count} allocations | {timing[2]} seconds | {s3} bytes |");
Console.WriteLine($"| Array| {r4.Count} allocations | {timing[3]} seconds | {s4} bytes |");
}
static List<Move> AllocateMove(int count)
{
var result = new List<Move>(count);
for (int i = 0; i < count; i++)
{
result.Add(new Move(1, 123, 10));
}
return result;
}
static List<FMove> AllocateFMove(int count)
{
var result = new List<FMove>(count);
for (int i = 0; i < count; i++)
{
result.Add(new FMove(1, 123, 10));
}
return result;
}
static List<(int from, int to, int type)> AllocateTuple(int count)
{
var result = new List<(int from, int to, int type)>(count);
for (int i = 0; i < count; i++)
{
result.Add((1, 123, 10));
}
return result;
}
static List<int[]> AllocateArray(int count)
{
var result = new List<int[]>(count);
for (int i = 0; i < count; i++)
{
result.Add(new int[] { 1, 123, 10});
}
return result;
}
}
Based on the comments, I decided to use BenchmarkDotNet for the above comparison also and the results are quite similar
| Method | Count | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|---|
| Move | 10000000 | 115.9 ms | 2.27 ms | 2.23 ms | 0.10 |
| FMove | 10000000 | 149.7 ms | 2.04 ms | 1.91 ms | 0.12 |
| Tuple | 10000000 | 154.8 ms | 2.99 ms | 2.80 ms | 0.13 |
| Array | 10000000 | 1,217.5 ms | 23.84 ms | 25.51 ms | 1.00 |
I decided to add a class allocation (called CMove) with the following definition
public class CMove
{
public readonly int FromSquare;
public readonly int ToSquare;
public readonly int MoveType;
public CMove(int fromSquare, int toSquare, int moveType)
{
FromSquare = fromSquare;
ToSquare = toSquare;
MoveType = moveType;
}
}
And used the above as a baseline for benchmarking. I also tried different allocation sizes. And here is a summary of the results.

Anything below 1.0 means it is faster than CMove. As you can see array allocations is always bad. For a few allocations it does not matter much, but for a lot of allocations there are clear winners.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
