Error handling within std::transform iteration

This question is about customising error handling within the unary operation passed to std::transform.

Parameters
first1, last1   -   the first range of elements to transform
first2  -   the beginning of the second range of elements to transform
d_first -   the beginning of the destination range, may be equal to first1 or first2
policy  -   the execution policy to use. See execution policy for details.
unary_op    -   unary operation function object that will be applied.

The standard API lets you customise the transformation logic applied at each iteration. However, it does not document how to customise the way the result is written, apart from the requirement that d_first must be an output iterator. As a result, std::transform performs a one-to-one transformation by default: the output range has the same size as the input range.

However, I want to customize the behavior to ignore the output when an error has occurred. That would result in an output range of size n_original - n_errors.

Here is a code example that parses a Visual Studio solution file string and extracts a list of projects using a regex. The file may well be corrupted to some extent, but failing outright while extracting the projects' info is not acceptable; logging an error would suffice.

class VSParser
{
public:
    static auto projects(std::string_view slnFile)
    {
        std::regex pattern{
            R"(Project\(\"\{(?:[a-zA-Z0-9]|\-){36}\}\"\)\s*=\s*\"(.+?)\",\s*\"(.+?)\",)"
        };

        struct ProjInfo
        {
            std::string name;
            std::filesystem::path path;
        };

        using regex_iter_type = std::regex_iterator<std::decay_t<decltype(slnFile)>::iterator>;

        std::vector<ProjInfo> projects;
        std::transform(regex_iter_type(slnFile.cbegin(), slnFile.cend(), pattern),
                       regex_iter_type(),
                       std::back_inserter(projects),
                       [](const auto &match) -> ProjInfo
                       {
                           // TODO: handle parsing errors
                           return {std::string(match[1]),
                           std::string(match[2])};
                       });

        return projects;
    }

private:
};

The problem here is that the return type of the unary operation must be assignable to the result of dereferencing the output iterator. So I can't see how to get this to compile with std::optional as the unary operation's return type:

        std::vector<ProjInfo> projects;
        std::transform(regex_iter_type(slnFile.cbegin(), slnFile.cend(), pattern),
                       regex_iter_type(),
                       [&projects](const auto&)
                       {
                         // insert if not nullopt
                       }, // example only: will not compile, since this is a callable, not an output iterator
                       [](const auto &match) -> std::optional<ProjInfo> 
                       {
                           // TODO: handle parsing errors
                           return ProjInfo{std::string(match[1]),
                           std::string(match[2])};
                       });

        return projects;

I know that I can build a vector of optionals and then strip the invalid elements from it, but since std::optional<ProjInfo> and ProjInfo are different types, that doubles the allocation and copy overhead, which I would like to avoid if possible.
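
For reference, the two-pass workaround described above could look roughly like this. It is only a sketch: keep_odd is a hypothetical stand-in for any fallible conversion, and two_pass is an illustrative name, neither of which comes from the code above.

```cpp
#include <algorithm>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical stand-in for a fallible conversion:
// odd values "succeed", even values "fail".
std::optional<int> keep_odd(int i)
{
    return (i % 2 != 0) ? std::optional<int>(i) : std::nullopt;
}

// Illustrative two-pass approach: stage optionals, then strip the empty ones.
std::vector<int> two_pass(const std::vector<int> &input)
{
    // Pass 1: one optional per input element (first allocation).
    std::vector<std::optional<int>> staged;
    staged.reserve(input.size());
    std::transform(input.cbegin(), input.cend(), std::back_inserter(staged), keep_odd);

    // Pass 2: copy the engaged values into a second vector (second allocation).
    std::vector<int> result;
    for (const auto &item : staged) {
        if (item)
            result.push_back(*item);
    }
    return result;
}
```

The intermediate vector of optionals is the extra allocation and copy that the question is trying to avoid.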



Solution 1:[1]

Rather than using std::transform, use a similar algorithm that writes to the output only when unary_op returns an engaged value:

template< class InputIt,
          class OutputIt,
          class UnaryOperation >
OutputIt transform_if( InputIt first1,
                       InputIt last1,
                       OutputIt d_first, 
                       UnaryOperation unary_op )
{
    while (first1 != last1) {
        if(auto v = unary_op(*first1++)) {
            *d_first++ = *std::move(v);
        }
    }
    return d_first;
}
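
A possible way to call transform_if, sketched under the assumption that the fallible step returns a std::optional. The helpers parse_int and parse_all are hypothetical names, and std::from_chars merely stands in for the regex extraction from the question; failures are logged to std::cerr and skipped.

```cpp
#include <charconv>
#include <iostream>
#include <iterator>
#include <optional>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

// transform_if as given in this answer: writes only engaged results.
template <class InputIt, class OutputIt, class UnaryOperation>
OutputIt transform_if(InputIt first1, InputIt last1, OutputIt d_first, UnaryOperation unary_op)
{
    while (first1 != last1) {
        if (auto v = unary_op(*first1++)) {
            *d_first++ = *std::move(v);
        }
    }
    return d_first;
}

// Hypothetical helper: parse the whole string as an int.
// Logs and returns std::nullopt when the text is not a complete integer.
std::optional<int> parse_int(const std::string &text)
{
    int value = 0;
    const char *begin = text.data();
    const char *end = begin + text.size();
    const auto [ptr, ec] = std::from_chars(begin, end, value);
    if (ec != std::errc{} || ptr != end) {
        std::cerr << "skipping malformed entry: '" << text << "'\n";
        return std::nullopt;
    }
    return value;
}

// Hypothetical helper: the output holds one element per successful parse.
std::vector<int> parse_all(const std::vector<std::string> &inputs)
{
    std::vector<int> parsed;
    transform_if(inputs.cbegin(), inputs.cend(), std::back_inserter(parsed), parse_int);
    return parsed;
}
```

With this shape the output vector ends up with n_original - n_errors elements and no intermediate container of optionals is needed.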

Solution 2:[2]

The following wrapper around std::back_insert_iterator compiles. Example:

#include <algorithm>
#include <vector>
#include <iterator>
#include <optional>
#include <iostream>
#include <utility>

namespace detail
{
    template <typename ContainerT>
    class optional_back_inserter
    {
    public:
        using container_type = ContainerT;
        using value_type = typename container_type::value_type;

        constexpr explicit optional_back_inserter(container_type &container)
            : backInserter_(container)
        {}

        optional_back_inserter& operator=(std::optional<value_type> value)
        {
            if (value)
                backInserter_ = std::move(*value);
            return *this;
        }

        /**
         * no-op
         * @return
         */
        constexpr optional_back_inserter& operator*()
        {
            return *this;
        }

        /**
         * no-op
         * @return
         */
        constexpr optional_back_inserter& operator++()
        {
            return *this;
        }


        /**
         * no-op
         * @return
         */
        constexpr optional_back_inserter operator++(int)
        {
            return *this;
        }

    private:
        std::back_insert_iterator<container_type> backInserter_;
    };
}

template <typename ContainerT>
constexpr auto optional_back_inserter(ContainerT &container)
{
    return detail::optional_back_inserter<ContainerT>(container);
}


int main()
{
    std::vector<int> vec{4, 8, 15, 16, 23, 42};
    std::vector<int> output{};

    std::transform(vec.cbegin(), 
    vec.cend(),
    optional_back_inserter(output),
    [](int i) -> std::optional<int>
    {
        if (i % 2)
            return {i};

        return std::nullopt;
    });

    std::cout << vec.size() << std::endl;
    std::cout << output.size() << std::endl;

    return 0;
}

This is more verbose, but it separates responsibilities. By customising only the output iterator's behaviour, I avoid touching the transformation logic itself. It is SOLID-friendly: I don't need to modify the general algorithm, and I can supply the customisation object from any other part of the code.
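
One detail the wrapper leaves out is the set of member types that std::iterator_traits looks for. The sketch below shows the same idea with those added, mirroring what std::back_insert_iterator declares. The names optional_back_insert_iterator and odd_values are hypothetical, and whether the extra typedefs are needed depends on the algorithm and standard library in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical variant of the wrapper with the usual output-iterator member types.
template <typename ContainerT>
class optional_back_insert_iterator
{
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;
    using container_type = ContainerT;

    explicit optional_back_insert_iterator(ContainerT &container)
        : container_(&container)
    {}

    // Append only when the optional holds a value; empty optionals are dropped.
    optional_back_insert_iterator &operator=(std::optional<typename ContainerT::value_type> value)
    {
        if (value)
            container_->push_back(std::move(*value));
        return *this;
    }

    // Dereference and increment are no-ops, as in std::back_insert_iterator.
    optional_back_insert_iterator &operator*() { return *this; }
    optional_back_insert_iterator &operator++() { return *this; }
    optional_back_insert_iterator operator++(int) { return *this; }

private:
    ContainerT *container_;
};

// Hypothetical usage: keep odd values, drop even ones.
std::vector<int> odd_values(const std::vector<int> &input)
{
    std::vector<int> output;
    std::transform(input.cbegin(), input.cend(),
                   optional_back_insert_iterator<std::vector<int>>(output),
                   [](int i) -> std::optional<int> {
                       if (i % 2 != 0)
                           return i;
                       return std::nullopt;
                   });
    return output;
}
```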

Solution 3:[3]

I hesitate to post an answer because I am not sure exactly what you are asking, but here is my attempt at this question: how do you transform between a type and its optional type using std::transform? If I have that wrong, let me know and I will delete the answer.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <optional>
#include <vector>

int main() {
  std::vector<int> iv{1, 2, 3, 4, 5, 6};
  std::vector<std::optional<int>> oiv;
  std::transform(iv.begin(), iv.end(), std::back_inserter(oiv),
                 [](int i) -> std::optional<int> { return i % 2 == 0 ? std::optional<int>(i) : std::nullopt; });


  for (auto oi: oiv) {
    std::cout << ' ' << oi.has_value();
  }


  std::cout << '\n';
}

Another possibility is to use C++20 ranges, composed either with pipes or with ordinary function-call syntax.

#include <iostream>
#include <optional>
#include <ranges>

int main() {
  auto const ints = {0, 1, 2, 3, 4, 5};
  auto even = [](int i) { return 0 == i % 2; };
  auto to_option = [](int i) { return std::optional(i); };

  // "pipe" syntax of composing the views:
  for (std::optional<int> i: ints | std::views::filter(even) | std::views::transform(to_option)) {
    std::cout << i.value() << ' ';
  }

  std::cout << '\n';

  // a traditional "functional" composing syntax:
  for (std::optional<int> i: std::views::transform(std::views::filter(ints, even), to_option)) {
    std::cout << i.value() << ' ';
  }
}

SOLID principles are not really in play here, since the STL is largely functional. That said, one of the SOLID principles is the single-responsibility principle, and coupling two responsibilities into one function clearly breaks it.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Caleth
Solution 2
Solution 3