Why does lazy.compactMap.first map the first element twice?

I'm testing compactMap on a lazy sequence to find the first element that can be converted and map it, all in a few lines of code.

"abc5def".lazy
  .compactMap {
    print($0)
    return Int(String($0))
}.first as Int?

Prints

a
b
c
5
5

Why is the last element mapped twice? How can I avoid this behaviour?



Solution 1:[1]

TL;DR: The compactMap call returns a chain of lazy sequences, LazyMapSequence<LazyFilterSequence<LazyMapSequence<.... Because first needs to compute both the start index and the element at that start index, the transform closure ends up being called twice for the first matching element:

  1. when startIndex is computed
  2. when retrieving the element at the start index

This is the current implementation of compactMap over LazySequenceProtocol (a protocol that all lazy sequences conform to):

public func compactMap<ElementOfResult>(
    _ transform: @escaping (Elements.Element) -> ElementOfResult?
  ) -> LazyMapSequence<
    LazyFilterSequence<
      LazyMapSequence<Elements, ElementOfResult?>>,
    ElementOfResult
  > {
    return self.map(transform).filter { $0 != nil }.map { $0! }
}

This gives your "abc5def".lazy.compactMap { ... } expression the type LazyMapSequence<LazyFilterSequence<LazyMapSequence<String, Optional<Int>>>, Int>.
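To make the chain concrete, here is a minimal sketch that writes the map/filter/map expansion by hand and records every transform call. The function name expandedCompactMapCalls and the calls array are hypothetical helpers introduced only for this illustration; the chain itself mirrors the implementation quoted above.

```swift
// Hypothetical helper: expands compactMap by hand (map -> filter -> map)
// and records each invocation of the transform closure.
func expandedCompactMapCalls(for input: String) -> (first: Int?, calls: [Character]) {
    var calls: [Character] = []
    let expanded = input.lazy
        .map { (c: Character) -> Int? in
            calls.append(c)            // one entry per transform invocation
            return Int(String(c))
        }
        .filter { $0 != nil }          // drops the characters that are not digits
        .map { $0! }                   // unwraps the surviving optionals
    let first: Int? = expanded.first   // Collection.first: startIndex, then subscript
    return (first, calls)
}

let expandedResult = expandedCompactMapCalls(for: "abc5def")
print(expandedResult.first as Any, expandedResult.calls)
```

With the input from the question, the recorded calls should be a, b, c, 5, 5, matching the output of the original snippet.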

Secondly, you're asking for the first element of the lazy sequence. This resolves to the default implementation of first on the Collection protocol (lazy sequences conform to Collection when their base sequence is also a collection):

public var first: Element? {
    let start = startIndex
    if start != endIndex { return self[start] }
    else { return nil }
}

This means that first has to retrieve two pieces of information:

  1. the start index
  2. the value at the start index (the subscript part)
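The two steps can be separated in a small experiment. This is only a sketch: callsPerStep and callCount are hypothetical names, and the body repeats by hand what the first property quoted above does, counting transform calls after each step.

```swift
// Hypothetical helper: performs the two steps of `first` manually and
// reports how many transform calls had happened after each one.
func callsPerStep(for input: String) -> (afterStartIndex: Int, afterSubscript: Int) {
    var callCount = 0
    let digits = input.lazy.compactMap { (c: Character) -> Int? in
        callCount += 1                 // count every transform invocation
        return Int(String(c))
    }
    let start = digits.startIndex      // step 1: scan for the first match
    let afterStartIndex = callCount
    if start != digits.endIndex {
        _ = digits[start]              // step 2: read the element at that index
    }
    return (afterStartIndex, callCount)
}

let steps = callsPerStep(for: "abc5def")
print(steps.afterStartIndex, steps.afterSubscript)
```

For "abc5def" the count should be 4 after the first step and 5 after the second, which is where the extra call comes from.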

Now, it's the startIndex computation that causes the duplicate evaluation. LazyFilterSequence has to evaluate its predicate on _base[index], and _base here is the inner LazyMapSequence, so every probe runs the transform:

public var startIndex: Index {
    var index = _base.startIndex
    while index != _base.endIndex && !_predicate(_base[index]) {
      _base.formIndex(after: &index)
    }
    return index
}
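Since nothing in the startIndex implementation shown here stores the result, each access repeats the scan. The sketch below illustrates that; transformCallsForRepeatedStartIndex is a hypothetical helper that counts transform calls over a given number of startIndex accesses.

```swift
// Hypothetical helper: counts transform calls caused by reading startIndex
// the given number of times on a lazy compactMap result.
func transformCallsForRepeatedStartIndex(in input: String, accesses: Int) -> Int {
    var callCount = 0
    let digits = input.lazy.compactMap { (c: Character) -> Int? in
        callCount += 1                 // count every transform invocation
        return Int(String(c))
    }
    for _ in 0..<accesses {
        _ = digits.startIndex          // each access rescans from the beginning
    }
    return callCount
}

print(transformCallsForRepeatedStartIndex(in: "abc5def", accesses: 2))
```

For "abc5def", one access should cost four transform calls and two accesses eight, so anything that depends on startIndex pays for the scan each time.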

The subscript implementation over LazyMapSequence is a standard one:

public subscript(position: Base.Index) -> Element {
    return _transform(_base[position])
}

However, as you can see, the subscript calls the transform again on the already-visited element, which results in the second print you see.
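As for avoiding the duplicate call, one possible approach (not part of the explanation above, so treat it as a suggestion) is to skip the index-based first and pull a single element through the sequence's iterator, which hands each transformed value along instead of looking it up again by index. The name firstDigitViaIterator is hypothetical.

```swift
// Hypothetical helper: takes the first element through the iterator
// instead of Collection.first, and records each transform invocation.
func firstDigitViaIterator(in input: String) -> (value: Int?, calls: [Character]) {
    var calls: [Character] = []
    let digits = input.lazy.compactMap { (c: Character) -> Int? in
        calls.append(c)                // one entry per transform invocation
        return Int(String(c))
    }
    var iterator = digits.makeIterator()
    let value = iterator.next()        // stops at the first non-nil result
    return (value, calls)
}

let viaIterator = firstDigitViaIterator(in: "abc5def")
print(viaIterator.value as Any, viaIterator.calls)
```

With this variant the transform should run once per visited character (a, b, c, 5), and the characters after the first match are still never touched.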

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1