'How to iterate at elements from a sub list and then remove the sub list from the list? With great performance

This is an example: originalList is a list of objects

var subList = (originalList.Where(x => x.number < 0)).ToList();
originalList.RemoveAll(x => x.number < 0);

I will use the subList later. In this example the originaList is iterated two times. This function is called billions of times and originalList is a large List

Is there a easy way to improve the performance?


One important thing: The value of number of the object can change between two calls of this function.



Solution 1:[1]

An efficiency improvement (though still ultimately O(n)) is to batch the removals together. My testing shows that depending on the frequency of removal, this can be the same speed or over 4 times faster. Here is the function as an extension method:

public static List<T> RemoveAllAndReturn<T>(this List<T> input, Func<T, bool> condition) {
    List<T> result = new List<T>();
    var removeCt = 0;
    for (int i = input.Count - 1; i >= 0; --i) {
        if (condition(input[i])) {
            result.Add(input[i]);
            ++removeCt;
        }
        else if (removeCt > 0) {
            input.RemoveRange(i + 1, removeCt);
            removeCt = 0;
        }
    }
    if (removeCt > 0)
        input.RemoveRange(0, removeCt);
    return result;
}

Solution 2:[2]

This method removes all elements fulfilling a condition and returns a list of the removed elements. It only iterates once.

public static List<T> RemoveAll<T>(List<T> input, Func<T,bool> condition)
{
    List<T> removedEntries = new List<T>();
    int offset = 0;
    for(int i = 0; i < input.Count - offset; i++)
    {
      while(i < input.Count - offset && condition.Invoke(input[i + offset]))
      {
         removedEntries.Add(input[i + offset]);
         offset++; 
         Console.WriteLine("i="+i+", offset="+offset);
      }
    
      if(i < input.Count - offset)
      {
         input[i] = input[i+offset];
      }
    }
    input.RemoveRange(input.Count - offset, offset);
    return removedEntries;
}

We loop over the list and check if an element matches the condition. If the condition is matched, the element after that element is copied into the position. So all elements that don't fulfill the condition will be at the start of the list, and all elements that fulfill the condition are at the end of the list. In the last step, the elements at the end of the list are removed.

It might be wise to give an initial capacity to the removedEntries list. Per default, a list has a capacity of 4, which is doubled every time it is exceeded. If you have 100 elements to be removed, the capacity has to be extended 5 times. This is an O(n) operation each time. If you can estimate that you will delete about 10% of the elements, you might write

List<T> removedEntries = new List<T>(input.Count / 10);

This might save you some time, but on the other hand, if you don't need the full initial capacity of the list, you waste some memory.

Online demo: https://dotnetfiddle.net/dlthkH

Solution 3:[3]

You could consider doing this hack:

var subList = new List<SomeType>();
originalList.RemoveAll(x =>
{
    bool shouldBeRemoved = x.Number < 0;
    if (shouldBeRemoved) subList.Add(x);
    return shouldBeRemoved;
});

The Predicate<T> passed to the RemoveAll is not pure: it has the side-effect of inserting matched elements in the subList. Based on the implementation of the RemoveAll method, this hack should work as expected. The documentation though does not make the explicit guarantee that the predicate will be invoked only once per element:

The elements of the current List<T> are individually passed to the Predicate<T> delegate, and the elements that match the conditions are removed from the List<T>.

So make your own judgment whether it's safe to use this hack, or not.


Edit: You could also make it an extension method:

public static int RemoveAll<T>(this List<T> source, Predicate<T> match,
    out List<T> removed)
{
    var removedLocal = new List<T>();
    removed = removedLocal;
    int removedCount = source.RemoveAll(x =>
    {
        bool shouldBeRemoved = match(x);
        if (shouldBeRemoved) removedLocal.Add(x);
        return shouldBeRemoved;
    });
    Debug.Assert(removedCount == removed.Count);
    return removedCount;
}

Usage example:

originalList.RemoveAll(x => x.number < 0, out var subList);

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3