It really seems like this implementation should be shared between here and algorithm_impl.h as a __parallel_includes function.
The only difference I see is the projections: the iterator version could pass std::identity for both of them and reuse the same code.
Originally posted by @danhoeflinger in #2320 (comment)