Making a Python generator via C++20 coroutines

Question

Let's say I have this Python code:

def double_inputs():
    while True:
        x = yield
        yield x * 2
gen = double_inputs()
next(gen)
print(gen.send(1))

It prints "2", as expected. I can write a generator in C++20 like this:

#include <coroutine>
#include <exception> // std::terminate
#include <iostream>
#include <string>

template <class T>
struct generator {
    struct promise_type;
    using coro_handle = std::coroutine_handle<promise_type>;

    struct promise_type {
        T current_value;
        auto get_return_object() { return generator{coro_handle::from_promise(*this)}; }
        auto initial_suspend() { return std::suspend_always{}; }
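        auto final_suspend() noexcept { return std::suspend_always{}; } // C++20 requires final_suspend to be noexcept
        void unhandled_exception() { std::terminate(); }
        void return_void() {} // hello() falls off its end; without return_void that is undefined behavior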
        auto yield_value(T value) {
            current_value = value;
            return std::suspend_always{};
        }
    };

    bool next() { return coro ? (coro.resume(), !coro.done()) : false; }
    T value() { return coro.promise().current_value; }

    generator(generator const & rhs) = delete;
    generator(generator &&rhs)
        :coro(rhs.coro)
    {
        rhs.coro = nullptr;
    }
    ~generator() {
        if (coro)
            coro.destroy();
    }
private:
    generator(coro_handle h) : coro(h) {}
    coro_handle coro;
};

generator<char> hello(){
    //TODO:send string here via co_await, but HOW???
    std::string word = "hello world";
    for(auto &ch:word){
        co_yield ch;
    }
}

int main(int, char**) {
    for (auto i = hello(); i.next(); ) {
        std::cout << i.value() << ' ';
    }
}

This generator just produces a string letter by letter, but the string is hardcoded in it. In Python, it is possible not only to yield something FROM the generator but also to send something TO it. I believe this could be done via co_await in C++.

I need it to work like this:

generator<char> hello(){
    std::string word = co_await producer; // Wait string from producer somehow 
    for(auto &ch:word){
        co_yield ch;
    }
}

int main(int, char**) {
    auto gen = hello(); //make consumer
    producer("hello world"); //produce string
    for (; gen.next(); ) {
        std::cout << gen.value() << ' '; //consume string letter by letter
    }
}

How can I achieve that? How can I make this "producer" using C++20 coroutines?

Recommended answer

You have essentially two problems to overcome if you want to do this.

The first is that C++ is a statically typed language. This means that the types of everything involved need to be known at compile time. This is why your generator type needs to be a template, so that the user can specify what type it shepherds from the coroutine to the caller.

So if you want to have this bi-directional interface, then something on your hello function must specify both the output type and the input type.

The simplest way to go about this is to just create an object and pass a non-const reference to that object to the generator. Each time it does a co_yield, the caller can modify the referenced object and then ask for a new value. The coroutine can read from the reference and see the given data.

However, if you insist on using the future type for the coroutine as both output and input, then you need to solve both the first problem (by making your generator template take OutputType and InputType) and a second problem.

See, your goal is to get a value to the coroutine. The problem is that the source of that value (the function calling your coroutine) has a future object. But the coroutine cannot access the future object. Nor can it access the promise object that the future references.

Or at least, it can't do so easily.

There are two ways to go about this, with different use cases. The first manipulates the coroutine machinery to backdoor a way into the promise. The second manipulates a property of co_yield to do basically the same thing.

The promise object for a coroutine is usually hidden and inaccessible from the coroutine. It is accessible to the future object, which the promise creates and which acts as an interface to the promised data. But it is also accessible during certain parts of the co_await machinery.

Specifically, when you perform a co_await on any expression in a coroutine, the machinery looks at your promise type to see if it has a function called await_transform. If so, it will call that promise object's await_transform on every expression you co_await on (at least, in a co_await that you directly write, not implicit awaits, such as the one created by co_yield).

As such, we need to do two things: create an overload of await_transform on the promise type, and create a type whose sole purpose is to allow us to call that await_transform function.

So it looks something like this:

struct generator_input {};

...

//Within the promise type:
auto await_transform(generator_input);

One quick note. The downside of using await_transform like this is that, by declaring even one overload of this function in our promise, we affect every co_await in any coroutine that uses this type: once the promise has an await_transform member, every explicit co_await operand must match one of its overloads, and a co_await on anything else fails to compile. For a generator coroutine, that's not very important, since there's not much reason to co_await unless you're doing a hack like this. But if you were creating a more general mechanism that could also await on arbitrary awaitables as part of its generation, you'd have a problem.

OK, so we have this await_transform function; what does this function need to do? It needs to return an awaitable object, since co_await is going to await on it. But the purpose of this awaitable object is to deliver a reference to the input type. Fortunately, the mechanism co_await uses to convert the awaitable into a value is provided by the awaitable's await_resume method. So ours can just return an InputType&:

//Within the `generator<OutputType, InputType>`:
    struct passthru_value
    {
        InputType &ret_;

        bool await_ready() {return true;}
        void await_suspend(coro_handle) {}
        InputType &await_resume() { return ret_; }
    };


//Within the promise type:
auto await_transform(generator_input)
{
    return passthru_value{input_value}; //Where `input_value` is the `InputType` object stored by the promise.
}

This gives the coroutine access to the value, by invoking co_await generator_input{};. Note that this returns a reference to the object.

The generator type can easily be modified to allow the ability to modify an InputType object stored in the promise. Simply add a pair of send functions for overwriting the input value:

void send(const InputType &input)
{
    coro.promise().input_value = input;
} 

void send(InputType &&input)
{
    coro.promise().input_value = std::move(input);
} 

This represents an asymmetric transport mechanism. The coroutine retrieves a value at a place and time of its own choosing. As such, it is under no real obligation to respond instantly to any changes. This is good in some respects, as it allows a coroutine to insulate itself from deleterious changes. If you're using a range-based for loop over a container, that container cannot be directly modified (in most ways) by the outside world or else your program will exhibit UB. So if the coroutine is fragile in that way, it can copy the data from the user and thus prevent the user from modifying it.

All in all, the needed code isn't that large. Here's a runnable example of your code with these modifications:

#include <coroutine>
#include <exception>
#include <string>
#include <iostream>
#include <utility> // std::move

struct generator_input {};


template <typename OutputType, typename InputType>
struct generator {
    struct promise_type;
    using coro_handle = std::coroutine_handle<promise_type>;

    struct passthru_value
    {
        InputType &ret_;

        bool await_ready() {return true;}
        void await_suspend(coro_handle) {}
        InputType &await_resume() { return ret_; }
    };

    struct promise_type {
        OutputType current_value;
        InputType input_value;


        auto get_return_object() { return generator{coro_handle::from_promise(*this)}; }
        auto initial_suspend() { return std::suspend_always{}; }
        auto final_suspend() noexcept { return std::suspend_always{}; } // C++20 requires final_suspend to be noexcept
        void unhandled_exception() { std::terminate(); }
        auto yield_value(OutputType value) {
            current_value = value;
            return std::suspend_always{};
        }

        void return_void() {}

        auto await_transform(generator_input)
        {
            return passthru_value{input_value};
        }
    };

    bool next() { return coro ? (coro.resume(), !coro.done()) : false; }
    OutputType value() { return coro.promise().current_value; }

    void send(const InputType &input)
    {
        coro.promise().input_value = input;
    } 

    void send(InputType &&input)
    {
        coro.promise().input_value = std::move(input);
    } 

    generator(generator const & rhs) = delete;
    generator(generator &&rhs)
        :coro(rhs.coro)
    {
        rhs.coro = nullptr;
    }
    ~generator() {
        if (coro)
            coro.destroy();
    }
private:
    generator(coro_handle h) : coro(h) {}
    coro_handle coro;
};

generator<char, std::string> hello(){
    auto word = co_await generator_input{}; // `auto` copies the string; `auto &` would refer to the promise's stored input

    for(auto &ch: word){
        co_yield ch;
    }
}

int main(int, char**)
{
    auto test = hello();
    test.send("hello world");

    while(test.next())
    {
        std::cout << test.value() << ' ';
    }
}

Be more yielding

An alternative to using an explicit co_await is to exploit a property of co_yield. Namely, co_yield is an expression and therefore it has a value. Specifically, it is (mostly) equivalent to co_await p.yield_value(e), where p is the promise object (ohh!) and e is what we're yielding.

Fortunately, we already have a yield_value function; it returns std::suspend_always. But it could instead return an object that always suspends and that co_await can also unpack into an InputType&:

struct yield_thru
{
    InputType &ret_;

    bool await_ready() {return false;}
    void await_suspend(coro_handle) {}
    InputType &await_resume() { return ret_; }
};

...

//in the promise
auto yield_value(OutputType value) {
    current_value = value;
    return yield_thru{input_value};
}

This is a symmetric transport mechanism; for every value you yield, you receive a value (which may be the same one as before). Unlike the explicit co_await method, you can't receive a value before you start to generate them. This could be useful for certain interfaces.

And of course, you could combine them as you see fit.
