'Fastest way to make list out of IEnumerable in .NET

I try to show a lot of values in a WPF datachart, after user was selecting target parameters. To realize this, I am using the livecharts for WPF (like this: https://lvcharts.net/App/examples/v1/Wpf/Scrollable) and it's working well. To change the values in the chart, I have to call this function:

' Change values of xAxis
Private Sub ChangeXAxis(axis As Object, title As String, values As Object)
    axis.Labels = values  ' has to be array or list of values (strings for X, double for Y)
    axis.Title = title
End Sub

The selected values have to be filtered by timestamp or parameter before showing. For this purpose, I am using the folowing function:

    Public Function FilterListForChart(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer)
    ...

    Try
        If axis = "X" Then
            'Return filtered values for axis
            Dim query As IEnumerable = (From rows In values
                                        Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                        Select (Math.Round(CDbl(rows(1)), 3))).ToList()
            
            Return query

Unfortunately the lvcharts needs values as list or array (string or double) to show the data correctly. The Problem is converting the IEnumerable to a list or array will take a long time if I want to show a lot of values (e.g. >300.000 values need something like 10 or more seconds)

Because of this, I was trying a lot of different things like discussed here:

I was trying the following options without success:

  • Use ToArray()
  • Use ToList()
  • Use allocated list with defined capacity
  • Use for each loop instead of queries
  • Filter without rounding (use reduced data processing)

During my tests, I got the following processing times:

  • Filter 180.000 values out of 180.000 total and show in chart: X = 1088ms, Y = 1085ms, Total: 2.173ms
  • Filter 180.000 values out of 1.800.000 values and show in chart: X = 9919ms, Y = 9983ms, Total: 19.902ms

To query and filter the data just take a few millisencods. The majority of the computing time goes for creating of the list/array. During my tests, I was able to reduce the computing time only by a few milliseconds.

My current solution is to use values <200.000 to show data and to load the remaining data with backgroundworker and update the gui later. But that's not a good solution. The user has to see all values in the chart to assess the data and the values have to be re-loaded if the user wants to add some paramter to chart. To filter the data during the previous SQL query is not a good alternative, because of similar durations.

UPDATE:

I did another tests with five scenarios:

  Dim yAxis1 = TestSpeed_ForEach(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis2 = TestSpeed_ForEachFixedList(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis3 = TestSpeed_QueryToArray(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis4 = TestSpeed_QueryToList(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis5 = TestSpeed_QueryToListParallel(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)

I testet the function with stopwatch for different values from 2.300 --> 1.800.000. As described before, I was not able to speed up the calculating time very much. The function with fixed list was the fastest, but the saving was just between 50 - 400ms. Here are the results for the query of 27.500 values out of 1.800.000 total:

  • TIME: YAxis 1 - 10868ms
  • TIME: YAxis 2 - 10844ms
  • TIME: YAxis 3 - 11311ms
  • TIME: YAxis 4 - 11265ms
  • TIME: YAxis 5 - 11313ms

In the second scenario I tested the constant list with 30.000 or 500.000 entries. But this only has a very small effect.

Here are the used functions:

'################ TESTING ################
Public Function TestSpeed_ForEach(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim yList As New List(Of Double)

        For Each row In values
            If CInt(row(3)) > counterStart And CInt(row(3)) < counterEnd Then
                yList.Add(Math.Round(CDbl(row.ItemArray(4)), decimalCut))
            End If
        Next
        Return yList

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_ForEachFixedList(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, xIsTimeList As Boolean)
    Try
        Const capacity As Integer = 30000
        Dim yList As New List(Of Double)(capacity)

        For Each row In values
            If CInt(row(3)) > counterStart And CInt(row(3)) < counterEnd Then
                yList.Add(Math.Round(CDbl(row.ItemArray(4)), decimalCut))
            End If
        Next
        Return yList

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_QueryToList(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim query As IEnumerable = (From rows In values
                                    Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                    Select (Math.Round(CDbl(rows(4)), decimalCut))).ToList()
        Return query

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_QueryToArray(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim query As IEnumerable = (From rows In values
                                    Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                    Select (Math.Round(CDbl(rows(4)), decimalCut))).ToArray()
        Return query

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_QueryToListParallel(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim query As IEnumerable = (From rows In values
                                    Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                    Select (Math.Round(CDbl(rows(4)), decimalCut))).AsParallel.ToList()
        Return query

    Catch ex As Exception
        Return Nothing
    End Try
End Function

What else could I do to speed up?



Solution 1:[1]

I agree with the comments. This should be faster ASSUMING your function returns list of double...

    Const capacity As Integer = 1024 * 512

    Dim query As New List(Of Double)(capacity)

    For Each rows In values
        If CInt(rows(3)) > counterStart AndAlso CInt(rows(3)) < counterEnd Then
            query.Add(Math.Round(CDbl(rows(1)), 3))
        End If
    Next

    Return query

Solution 2:[2]

You'll have to edit this and make corrections where there are ??? The point is to test the IEnumerable being passed to the method.

Public Function TestSpeed_ForEachFixedList(values As IEnumerable(Of ???),
                                            axis As String,
                                            counterStart As Integer,
                                            counterEnd As Integer,
                                            decimalCut As Integer,
                                            xIsTimeList As Boolean) As List(Of Double)

    Dim Nvals As List(Of ???) = values.ToList
    Try
        Const capacity As Integer = 300000
        Dim yList As New List(Of Double)(capacity)

        For Each row As ??? In Nvals
            If CInt(row(3)) > counterStart And CInt(row(3)) < counterEnd Then
                yList.Add(Math.Round(CDbl(row.ItemArray(4)), decimalCut))
            End If
        Next
        Return yList

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 dbasnett
Solution 2 dbasnett