Applying std algorithms over structs?


Problem Description


We have Boost.PFR and we have the tuple iterator. If we combine the two, we might have a way of applying std algorithms over structs. Does a solution already exist? What I'm looking for is:

S a, b;
auto const ra(to_range(a)), rb(to_range(b));
std::transform(ra.begin(), ra.end(), rb.begin(), [](auto&& a)noexcept{return a;});

This would allow us to use the newer <execution> features to process structs out of sequence or in parallel.

Solution

So to illustrate the points I tried to make in the comments to the other answer, let's write something like your transform:

A Direct Implementation

I skipped the notion of iterators and the standard library, since that approach is encumbered with the whole "iterator value type must be fixed" requirement and other burdens.

Instead, let's do it "functionally".

#include <boost/pfr.hpp>
namespace pfr = boost::pfr;

template <typename Op, typename... T>
void transform(Op f, T&&... operands) {
    auto apply = [&]<int N>() {
        f(pfr::get<N>(std::forward<T>(operands))...);
        return 1;
    };

    constexpr auto size = std::min({pfr::tuple_size<std::decay_t<T>>::value...});
    // optionally assert that sizes match:
    //static_assert(size == std::max({pfr::tuple_size<std::decay_t<T>>::value...}));

    [=]<auto... N>(std::index_sequence<N...>) {
        return (apply.template operator()<N>() + ...);
    }
    (std::make_index_sequence<size>{});
}

I already generalized a bit by not making the arity fixed. It's more like an n-ary zip or visitor now. To get the transform you wanted, you'd pass it an operation like

auto binary = [](auto const& a, auto& b) {
    b = a;
};

Let's demo this, highlighting mixed-type members, asymmetric types, as well as mixed-length structs:

struct S1 { int a; double b; long c; float d; };
struct S2 { double a; double b; double c; double d; };
struct S3 { double a; double b; };

Test cases:

int main() {

    auto n_ary = [](auto&... fields) {
        puts(__PRETTY_FUNCTION__);
        return (... = fields);
    };

    S1 a;
    S2 b;
    S3 c;

    // all directions
    transform(binary, a, b);
    transform(binary, b, a);

    // mixed sizes
    transform(binary, b, c);
    transform(binary, c, a);

    // why settle for binary?
    transform(n_ary, a, b);
    transform(n_ary, a, b, c);
    transform(n_ary, c, b, a);
}

See it Live On Compiler Explorer

Already, the disassembly suggests everything is getting inlined and optimized away. Literally. Only the puts calls remain:

main:
    sub     rsp, 8
    mov     edi, OFFSET FLAT:.LC0
    call    puts
    mov     edi, OFFSET FLAT:.LC1
    call    puts
    mov     edi, OFFSET FLAT:.LC2
    call    puts
    ...
    ...
    xor     eax, eax
    add     rsp, 8
    ret

giving the output

main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = int; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = long int; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = float; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = int]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = long int]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = float]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = int]
main()::<lambda(const auto:12&, auto:13&)> [with auto:12 = double; auto:13 = double]
main()::<lambda(auto:14& ...)> [with auto:14 = {int, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {long int, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {float, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {int, double, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double, double}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double, int}]
main()::<lambda(auto:14& ...)> [with auto:14 = {double, double, double}]

Proof Of Work

Let's do some "useful" calculations that we can check. Also, the transform function is renamed to nway_visit to reflect its more generic nature:

auto binary = [](auto& a, auto& b) { return a *= b; };
auto n_ary  = [](auto&... fields) { return (... *= fields); };

So both operations do a right-fold multiply-assignment. Given some distinctly chosen initializers

S1 a {1,2,3,4};
S2 b {2,3,4,5};
S3 c {3,4};

we want to be able to see the data flow through the data structures. So, let's optionally do some debug tracing of that:

#define DEMO(expr)                                                             \
    void(expr);                                                                \
    if constexpr (output_enabled) {                                            \
        std::cout << "After " << std::left << std::setw(26) << #expr;          \
        std::cout << " a:" << pfr::io(a) << "\tb:" << pfr::io(b)               \
                  << "\tc:" << pfr::io(c) << "\n";                             \
    }

    DEMO("initialization");

    // all directions
    DEMO(nway_visit(binary, a, b));
    DEMO(nway_visit(binary, b, a));

    // mixed sizes
    DEMO(nway_visit(binary, b, c));
    DEMO(nway_visit(binary, c, a));

    // why settle for binary?
    DEMO(nway_visit(n_ary, a, b));
    DEMO(nway_visit(n_ary, a, b, c));
    DEMO(nway_visit(n_ary, c, b, a));

    return long(c.a + c.b) % 37; // prevent whole program optimization...

As the cherry on top, let's be absolutely sure that (with output disabled) the compiler cannot optimize the whole program away just because there are no observable effects:

return long(c.a + c.b) % 37; // prevent whole program optimization...

The demo is Live On Compiler Explorer with output enabled, and once more with output disabled, showing the disassembly:

main:
        mov     eax, 13
        ret

WOW

Holy smokes. That's optimization. The whole program is statically evaluated and just returns the exit code 13. Let's see whether that's the correct exit code:

Output Enabled:

After "initialization"           a:{1, 2, 3, 4} b:{2, 3, 4, 5}  c:{3, 4}
After nway_visit(binary, a, b)   a:{2, 6, 12, 20}   b:{2, 3, 4, 5}  c:{3, 4}
After nway_visit(binary, b, a)   a:{2, 6, 12, 20}   b:{4, 18, 48, 100}  c:{3, 4}
After nway_visit(binary, b, c)   a:{2, 6, 12, 20}   b:{12, 72, 48, 100} c:{3, 4}
After nway_visit(binary, c, a)   a:{2, 6, 12, 20}   b:{12, 72, 48, 100} c:{6, 24}
After nway_visit(n_ary, a, b)    a:{24, 432, 576, 2000} b:{12, 72, 48, 100} c:{6, 24}
After nway_visit(n_ary, a, b, c) a:{1728, 746496, 576, 2000}    b:{12, 72, 48, 100} c:{6, 24}
After nway_visit(n_ary, c, b, a) a:{1728, 746496, 576, 2000}    b:{12, 72, 48, 100} c:{124416, 1289945088}

So, the return value should be (124416 + 1289945088) modulo 37, which a pocket calculator confirms is 13.

From Here: Parallel tasks etc.

Your original motivation included getting the parallel execution options from the standard library for free. As you know, I'm skeptical of the usefulness of that.

However, little stops you from getting this behaviour from this algorithm:

boost::asio::thread_pool ctx; // or, e.g. system_executor

auto run_task = [&](auto&... fields) {
    boost::asio::post(ctx, [=] { long_running_task(fields...); });
};

Hope this is good inspiration. And thanks for making me look at PFR. It's pretty sweet.
