Enum.filter not scalable?
I decode a CSV file (using https://hexdocs.pm/csv/) into a stream, and I filter this stream with Enum.filter. My problem is that the processing time does not grow linearly with the size of the CSV file:
```
% wc -l long.csv
10000 long.csv
% time mix run testcvs.exs long.csv
mix run testcvs.exs long.csv  3.08s user 0.50s system 242% cpu 1.479 total
% wc -l verylong.csv
100000 verylong.csv
% time mix run testcvs.exs verylong.csv
mix run testcvs.exs verylong.csv  98.08s user 3.24s system 117% cpu 1:25.93 total
```
With ten times the input it should take roughly ten times longer, but it actually takes about 57 times longer. Definitely not scalable. Does this mean that Enum.filter does not process the stream lazily, but instead loads everything into memory? Is there a more scalable way to filter a stream?
The code:
```elixir
Enum.at(System.argv(), 0)
|> File.stream!([:read], :line)
|> CSV.decode(separator: ?;)
|> Enum.filter(fn {:ok, line} -> Enum.at(line, 11) == "" end)
```
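For reference: `Enum.filter/2` is eager — it traverses the whole enumerable and returns a list, so every decoded row is materialized in memory, while `Stream.filter/2` builds a lazy composition that only does work when a consuming step runs. A minimal sketch of the lazy variant of the pipeline above, keeping the question's filenames, the `csv` hex package, and column index 11 (the final `Enum` step still forces the stream, but rows then flow through one at a time):

```elixir
# Lazy variant: Stream.filter returns a composed stream; nothing is
# read or decoded until the terminal Enum.to_list/1 consumes it.
System.argv()
|> Enum.at(0)
|> File.stream!([:read], :line)
|> CSV.decode(separator: ?;)
|> Stream.filter(fn {:ok, row} -> Enum.at(row, 11) == "" end)
|> Enum.to_list()
```

Note that this sketch, like the original, pattern-matches only on `{:ok, row}` and will raise a `FunctionClauseError` if the decoder emits an `{:error, reason}` tuple.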
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow