'Why is it faster to create a list of classes than a list of list of integers in C#?

Backgorund

Hi,

I am creating a game where the AI is collecting possible moves from positions which is done millions of times per turn. I am trying to figure out the fastest way to store these moves containing e.g. pieceFromSquare, pieceToSquare and other move related things.

Currently I have a Move class which contains public variables for the move as per below:

public class Move
{
    public int FromSquare;
    public int ToSquare;
    public int MoveType;

    public static Move GetMove(int fromSquare, int toSquare, int moveType)
    {
        Move move = new Move();

        move.FromSquare = fromSquare;
        move.ToSquare = toSquare;
        move.MoveType = moveType;

        return move;
    }
}

When the AI finds a move it stores the move in a List. I was thinking if it would be faster to store the move as a list of integers instead so I ran a test:

Test

public void RunTimerTest()
    {
        // Function A
        startTime = DateTime.UtcNow;
        moveListA = new List<List<int>>();
        for (int i = 0; i < numberOfIterations; i++)
        {
            FunctionA();
        }
        PrintResult((float)Math.Round((DateTime.UtcNow - startTime).TotalSeconds, 2), "Function A");

        // Function B
        startTime = DateTime.UtcNow;
        moveListB = new List<Move>();
        for (int i = 0; i < numberOfIterations; i++)
        {
            FunctionB();
        }
        PrintResult((float)Math.Round((DateTime.UtcNow - startTime).TotalSeconds, 2), "Function B");
    }

    private void FunctionA()
    {
        moveListA.Add(new List<int>() { 1, 123, 10});
    }

    private void FunctionB()
    {
        moveListB.Add(Move.GetMove(1, 123, 10));
    }

The test results in the following result when running over 10 million times:

  • Function A ran for: 4,58 s.
  • Function B ran for: 1,47 s.

So it is more than 3 times faster to create the class and populate its variables than to create a list of integers.

Questions

  1. Why is it so much faster to create a class than a list of integers?
  2. Is there an even faster way to store this type of data?


Solution 1:[1]

I examined four options, and allocating arrays for each record is by for the slowest. I did not check allocating class objects, as I was going for the faster options.

  • struct Move {} stores three integer values into readonly fields of a strcuture.
  • struct FMove {} stores an fixed int array of size 3
  • (int,int,int) stores a tuple of three int values
  • int[] allocates an array of three values.

With the following results. I am tracking time and size allocated in bytes.

Storage Iterations Time Size
Move struct 10000000 allocations 0.2633584 seconds 120000064 bytes
FMove struct 10000000 allocations 0.3572664 seconds 120000064 bytes
Tuple 10000000 allocations 0.702174 seconds 160000064 bytes
Array 10000000 allocations 1.2226393 seconds 480000064 bytes
public readonly struct Move
{
    public readonly int FromSquare;
    public readonly int ToSquare;
    public readonly int MoveType;

    public Move(int fromSquare, int toSquare, int moveType) : this()
    {
        FromSquare = fromSquare;
        ToSquare = toSquare;
        MoveType = moveType;
    }
}

public unsafe struct FMove
{
    fixed int Data[3];

    public FMove(int fromSquare, int toSquare, int moveType) : this()
    {
        Data[0] = fromSquare;
        Data[1] = toSquare;
        Data[2] = moveType;
    }

    public int FromSquare { get => Data[0]; }
    public int ToSquare { get => Data[1]; }
    public int MoveType { get => Data[2]; }
}

static class Program
{
    static void Main(string[] args)
    {
        // Always compile with Release to time
        const int count = 10000000;
        Console.WriteLine("Burn-in start");
        // Burn-in. Do some calc to spool up
        // the CPU
        AllocateArray(count/10);
        Console.WriteLine("Burn-in end");
        double[] timing = new double[4];
        // store timing results for four different 
        // allocation types



        var sw = new Stopwatch();
        Console.WriteLine("Timming start");

        long startMemory;

        startMemory = GC.GetTotalMemory(true);
        sw.Restart();
        var r4 = AllocateArray(count);
        sw.Stop();
        var s4 = GC.GetTotalMemory(true) - startMemory;
        timing[3] = sw.Elapsed.TotalSeconds;

        startMemory = GC.GetTotalMemory(true);
        sw.Restart();
        var r1 = AllocateMove(count);
        sw.Stop();
        var s1 = GC.GetTotalMemory(true) - startMemory;
        timing[0] = sw.Elapsed.TotalSeconds;

        startMemory = GC.GetTotalMemory(true);
        sw.Restart();
        var r2 = AllocateFMove(count);
        sw.Stop();
        var s2 = GC.GetTotalMemory(true) - startMemory;
        timing[1] = sw.Elapsed.TotalSeconds;

        startMemory = GC.GetTotalMemory(true);
        sw.Restart();
        var r3 = AllocateTuple(count);
        sw.Stop();
        var s3 = GC.GetTotalMemory(true) - startMemory;
        timing[2] = sw.Elapsed.TotalSeconds;

        Console.WriteLine($"| Storage | Iterations | Time | Size |");
        Console.WriteLine($"|---|---|---|---|");
        Console.WriteLine($"|  Move struct| {r1.Count} allocations | {timing[0]} seconds | {s1} bytes |");
        Console.WriteLine($"| FMove struct| {r2.Count} allocations | {timing[1]} seconds | {s2} bytes |");
        Console.WriteLine($"|        Tuple| {r3.Count} allocations | {timing[2]} seconds | {s3} bytes |");
        Console.WriteLine($"|        Array| {r4.Count} allocations | {timing[3]} seconds | {s4} bytes |");
    }

    static List<Move> AllocateMove(int count)
    {
        var result = new List<Move>(count);
        for (int i = 0; i < count; i++)
        {
            result.Add(new Move(1, 123, 10));
        }
        return result;
    }
    static List<FMove> AllocateFMove(int count)
    {
        var result = new List<FMove>(count);
        for (int i = 0; i < count; i++)
        {
            result.Add(new FMove(1, 123, 10));
        }
        return result;
    }

    static List<(int from, int to, int type)> AllocateTuple(int count)
    {
        var result = new List<(int from, int to, int type)>(count);
        for (int i = 0; i < count; i++)
        {
            result.Add((1, 123, 10));
        }
        return result;
    }

    static List<int[]> AllocateArray(int count)
    {
        var result = new List<int[]>(count);
        for (int i = 0; i < count; i++)
        {
            result.Add(new int[] { 1, 123, 10});
        }
        return result;
    }

}

Based on the comments, I decided to use BenchmarkDotNet for the above comparison also and the results are quite similar

Method Count Mean Error StdDev Ratio
Move 10000000 115.9 ms 2.27 ms 2.23 ms 0.10
FMove 10000000 149.7 ms 2.04 ms 1.91 ms 0.12
Tuple 10000000 154.8 ms 2.99 ms 2.80 ms 0.13
Array 10000000 1,217.5 ms 23.84 ms 25.51 ms 1.00

I decided to add a class allocation (called CMove) with the following definition

public class CMove
{
    public readonly int FromSquare;
    public readonly int ToSquare;
    public readonly int MoveType;

    public CMove(int fromSquare, int toSquare, int moveType) 
    {
        FromSquare = fromSquare;
        ToSquare = toSquare;
        MoveType = moveType;
    }
}

And used the above as a baseline for benchmarking. I also tried different allocation sizes. And here is a summary of the results.

chart

Anything below 1.0 means it is faster than CMove. As you can see array allocations is always bad. For a few allocations it does not matter much, but for a lot of allocations there are clear winners.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1