Difference between using a list or a PCollection
I'm building a pipeline in Apache Beam and got curious about this: what is the difference between applying a PTransform to a list and to a PCollection? Is performance affected, or is it just that the PCollection is immutable? And is this a bad way to approach a pipeline with Apache Beam?
Solution 1:[1]
By definition, a PCollection is an immutable collection that may be bounded or unbounded.
The main difference from a list is that a PCollection can be unbounded, which is especially powerful when you are streaming data (from a large file, or from an unbounded source such as Pub/Sub).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | guillaume blaquiere |
