C++ Coroutines

Your systems. Working as one. Sumant Tambe, PhD, Principal Research Engineer and Microsoft VC++ MVP, Real-Time Innovations, Inc. (@sutambe). Presented at SFBay Association of C/C++ Users, Jan 13, 2016, and Polyglot Programming DC, Mar 16, 2016.

Transcript of C++ Coroutines

Page 1: C++ Coroutines

Your systems. Working as one.

Sumant Tambe, PhD
Principal Research Engineer and Microsoft VC++ MVP
Real-Time Innovations, Inc.
@sutambe

SFBay Association of C/C++ Users
Jan 13, 2016

Polyglot Programming DC
Mar 16, 2016

Page 2: C++ Coroutines

7/9/2017 © 2015 RTI 2

Alternative Title

Page 3: C++ Coroutines

Why should you care?

• Most software written today relies on networking and I/O
• Simplify writing I/O-oriented software
  – Correct (bug-free)
  – Performs well
    • Non-blocking
    • Latency-aware
    • Multi-core scalable
  – Expressive
    • Simple and direct expression of intent
    • Easy to write and read (maintainable)
  – Productivity
    • Modular
    • Reusable
    • Extensible
  – and have fun while doing that!

Page 4: C++ Coroutines

Agenda

• Remote Procedure call (RPC) over DDS (DDS-RPC) standard

• Deep dive into asynchronous programming

– future<T> examples

– C++ await examples

– Abusing C++ await to avoid if-then-else boilerplate

– C++ generator examples

• Composing abstractions


Page 5: C++ Coroutines


Data Connectivity Standard for the Industrial IoT

Page 6: C++ Coroutines

DDS: Data Connectivity Standard for the Industrial IoT

© 2009 Real-Time Innovations, Inc.

[Figure: DDS connects streaming data, sensors, and events with real-time applications, enterprise applications, and actuators.]

Page 7: C++ Coroutines

The DDS Standard Family

[Figure: the DDS standard family. Labels shown: Application; DDS-C++, DDS-JAVA*, DDS-IDL-C, DDS-IDL-C#; DDS v1.4; DDS-SECURITY; DDS-XTYPES; DDS-WEB; DDS-RPC*; IDL 4.0; HTTP(s); RTPS v2.2; UDP, TCP**, DTLS**, TLS**, SHARED-MEMORY**; IP.]

Page 8: C++ Coroutines

DDS-RPC

• Remote Procedure Call over DDS Pub-Sub Middleware

• Adopted OMG specification

• C++ and Java

• Two language bindings

– Request/Reply

– Function-call

• Reference Implementation

– dds-rpc-cxx RTI github


Page 9: C++ Coroutines

RobotControl IDL Interface

    module robot {

      exception TooFast {};

      enum Command { START_COMMAND, STOP_COMMAND };

      struct Status {
        string msg;
      };

      @DDSService
      interface RobotControl
      {
        void  command(Command com);
        float setSpeed(float speed) raises (TooFast);
        float getSpeed();
        void  getStatus(out Status status);
      };

    }; // module robot

Page 10: C++ Coroutines

RobotControl Abstract Class

    class RobotControl
    {
    public:
      virtual void command(const robot::Command & command) = 0;

      // returns old speed when successful
      virtual float setSpeed(float speed) = 0;

      virtual float getSpeed() = 0;

      virtual robot::RobotControl_getStatus_Out getStatus() = 0;

      virtual ~RobotControl() { }
    };

Page 11: C++ Coroutines

Synchronous calls

    robot::RobotControlSupport::Client
      robot_client(rpc::ClientParams().domain_participant(...)
                                      .service_name("RobotControl"));

    float speed = 0;

    try
    {
      speed = robot_client.getSpeed();
      speed += 10;
      robot_client.setSpeed(speed);
    }
    catch (robot::TooFast &)
    {
      printf("Going too fast!\n");
    }

Page 12: C++ Coroutines

How well do you know latency?

Assume a typical instruction takes 1 second…

Action                                     Latency
Execute a typical instruction              1 second
Fetch from L1 cache memory                 0.5 second
Branch misprediction                       5 seconds
Fetch from L2 cache memory                 7 seconds
Mutex lock/unlock                          30 seconds
Fetch from main memory                     1.5 minutes
Send 2K bytes over 1Gbps network           5.5 hours
Read 1 MB sequentially from memory         3 days
Fetch from new disk location (seek)        13 weeks
Read 1MB sequentially from disk            6.5 months
Send packet from US to Europe and back     5 years

Credit: http://www.coursera.org/course/reactive (week 3-2); Latency Numbers Every Programmer Should Know (jboner), https://gist.github.com/jboner/2841832
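
The scaling behind the latency table can be sketched in a few lines: read each nanosecond of machine latency as one second and print the result in the largest fitting unit. This is only an illustration. `human_scale` is a hypothetical helper name, and the nanosecond inputs mentioned afterwards (20,000 ns for the 2K network send, 8,000,000 ns for a disk seek) are assumptions taken from commonly cited latency lists, not from the slide.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: reads a latency given in nanoseconds as if each
// nanosecond lasted one second, and formats it in the largest fitting unit.
std::string human_scale(double nanoseconds) {
    struct Unit { const char* name; double seconds; };
    static const Unit units[] = {
        {"years",   365.0 * 24 * 3600},
        {"weeks",   7.0 * 24 * 3600},
        {"days",    24.0 * 3600},
        {"hours",   3600.0},
        {"minutes", 60.0},
    };
    const double seconds = nanoseconds;  // 1 ns of machine time -> 1 s of human time
    char buf[64];
    for (const Unit& u : units) {
        if (seconds >= u.seconds) {
            std::snprintf(buf, sizeof buf, "%.1f %s", seconds / u.seconds, u.name);
            return buf;
        }
    }
    std::snprintf(buf, sizeof buf, "%.1f seconds", seconds);
    return buf;
}
```

Under those assumed inputs, `human_scale(20000)` gives "5.6 hours" and `human_scale(8000000)` gives "13.2 weeks", in line with the network-send and disk-seek rows.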

Page 13: C++ Coroutines

Making Latency Explicit … as an Effect

    class RobotControlAsync
    {
    public:
      virtual rpc::future<void> command_async(const robot::Command & command) = 0;

      // returns old speed when successful
      virtual rpc::future<float> setSpeed_async(float speed) = 0;

      virtual rpc::future<float> getSpeed_async() = 0;

      virtual rpc::future<robot::RobotControl_getStatus_Out> getStatus_async() = 0;

      virtual ~RobotControlAsync() { }
    };

Page 14: C++ Coroutines

When rpc::future is C++11 std::future

    try {
      dds::rpc::future<float> speed_fut =
        robot_client.getSpeed_async();

      // Do some other stuff
      while(speed_fut.wait_for(std::chrono::seconds(1)) ==
            std::future_status::timeout);

      speed = speed_fut.get();
      speed += 10;

      dds::rpc::future<float> set_speed_fut =
        robot_client.setSpeed_async(speed);

      // Do even more stuff
      while(set_speed_fut.wait_for(std::chrono::seconds(1)) ==
            std::future_status::timeout);

      set_speed_fut.get();
    }
    catch (robot::TooFast &) {
      printf("Going too fast!\n");
    }

Page 15: C++ Coroutines

Limitations of C++11 std::future<T>

• Must block (in most cases) to retrieve the result
• If the main program isn’t blocked, it’s likely that the continuation is blocked
  – I.e., the async result is available but no one has noticed
• The programmer must correlate requests with responses
  – The order in which async results become ready is not guaranteed by DDS-RPC (when multiple requests are outstanding)
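
The first two limitations can be seen in a small, self-contained sketch that uses only std::future. Here `get_speed_async` is a hypothetical stand-in for robot_client.getSpeed_async(), simulated with std::async and a short sleep; the reply value and delays are made up for illustration.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for robot_client.getSpeed_async(): replies after a delay.
std::future<float> get_speed_async() {
    return std::async(std::launch::async, [] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        return 42.0f;
    });
}

// Polls the future and counts how often the caller had to come back and check.
float poll_for_speed(int& polls) {
    std::future<float> speed_fut = get_speed_async();
    polls = 0;
    while (speed_fut.wait_for(std::chrono::milliseconds(10)) ==
           std::future_status::timeout) {
        ++polls;  // other work could go here, but the caller must keep returning
    }
    // The loop exits only once the value is ready, so get() returns at once.
    return speed_fut.get();
}
```

Between two polls the reply may already be available without anyone noticing, which is the delay the second bullet describes.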


Page 16: C++ Coroutines

When rpc::future is C++11 std::future


Page 17: C++ Coroutines

Composable Futures to Rescue

• Concurrency TS (targeted C++17 at the time of this talk; not adopted into C++17)
• Serial composition
  – future.then()
• Parallel composition
  – when_all, when_any
• Many implementations
  – Boost.future, Microsoft PPL, HPX, Facebook’s Folly
  – dds::rpc::future<T> wraps PPL (code)

Page 18: C++ Coroutines

Using future.then()

    robot_client
      .getSpeed_async()
      .then([robot_client](future<float> && speed_fut) {
        float speed = speed_fut.get();
        printf("getSpeed = %f\n", speed);
        speed += 10;
        return robot_client.setSpeed_async(speed);
      })
      .then([](future<float> && speed_fut) {
        try {
          float speed = speed_fut.get();
          printf("speed set successfully.\n");
        }
        catch (robot::TooFast &) {
          printf("Going too fast!\n");
        }
      });

Page 19: C++ Coroutines

Improvements over C++11 future

• Main thread does not have to block
• Callback (continuation) does not have to block
  – The thread setting the future value invokes the callback right away
• Request/Reply correlation isn’t explicit because the callback lambda captures the necessary state
  – No incidental data structures necessary (state machines, std::map) for request/reply correlation (see Sean Parent’s CppCon’15 talk about no incidental data structures)
• Same pattern in JavaScript promises and other places
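
How a continuation can run "right away" without blocking anyone can be sketched with a toy, single-threaded future. `ToyFuture` is a hypothetical type written for this illustration; it is not the dds::rpc, PPL, or Concurrency TS API. Unlike the real .then(), it passes the plain value to the callback and neither unwraps nested futures nor supports void results.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Toy future (hypothetical): one shared state, one optional continuation.
template <typename T>
class ToyFuture {
    struct State {
        bool ready = false;
        T value{};
        std::function<void(T)> continuation;
    };
    std::shared_ptr<State> state_ = std::make_shared<State>();

public:
    // Called by whoever produces the result, e.g. the thread receiving the reply.
    void set_value(T v) {
        state_->value = std::move(v);
        state_->ready = true;
        if (state_->continuation) state_->continuation(state_->value);  // run callback now
    }

    // Registers a callback and returns a future for the callback's result.
    template <typename F>
    auto then(F f) -> ToyFuture<decltype(f(std::declval<T>()))> {
        using R = decltype(f(std::declval<T>()));
        ToyFuture<R> next;
        auto run = [f, next](T v) mutable { next.set_value(f(std::move(v))); };
        if (state_->ready) run(state_->value);   // value already there: run immediately
        else state_->continuation = run;         // otherwise wait for set_value
        return next;
    }

    bool ready() const { return state_->ready; }
    T get() const { return state_->value; }
};
```

A chain such as `reply.then([offset](float s) { return s + offset; }).then([](float s) { return s * 2; })` stays idle until `reply.set_value(...)` is called; the captured `offset` is the state that would otherwise need a separate request/reply table.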


Page 20: C++ Coroutines

Speed up the robot to MAX_SPEED in increments of 10 and without blocking

Page 21: C++ Coroutines

    dds::rpc::future<float> speedup_until_maxspeed(
      robot::RobotControlSupport::Client & robot_client)
    {
      static const int increment = 10;

      return
        robot_client
          .getSpeed_async()
          .then([robot_client](future<float> && speed_fut) {
            float speed = speed_fut.get();
            speed += increment;
            if(speed <= MAX_SPEED) {
              printf("speedup_until_maxspeed: new speed = %f\n", speed);
              return robot_client.setSpeed_async(speed);
            }
            else
              return dds::rpc::details::make_ready_future(speed);
          })
          .then([robot_client](future<float> && speed_fut) {
            float speed = speed_fut.get();
            if(speed + increment <= MAX_SPEED)
              return speedup_until_maxspeed(robot_client);
            else
              return dds::rpc::details::make_ready_future(speed);
          });
    }

Return ready future? Why not speed?

Is that recursive? Does the stack grow? … No!

What are these lambdas doing here? Is that a CPS transform? … Yes!
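
On the first question: a function or lambda with a deduced return type must return the same type from every branch, so the branch that has nothing left to do wraps its value in an already-completed future. A minimal sketch with std::future follows; `make_ready`, `set_speed_async`, and `next_step` are hypothetical names for this illustration, not the dds::rpc API.

```cpp
#include <future>
#include <utility>

// Hypothetical helper in the spirit of make_ready_future: wraps a value that
// is already known in a future whose get() returns immediately.
template <typename T>
std::future<T> make_ready(T value) {
    std::promise<T> p;
    p.set_value(std::move(value));
    return p.get_future();
}

// Hypothetical stand-in for an asynchronous setSpeed call.
std::future<float> set_speed_async(float speed) {
    return std::async(std::launch::deferred, [speed] { return speed; });
}

// Both branches yield std::future<float>. Writing `return speed;` in the
// second branch would make the deduced return types disagree and fail to compile.
auto next_step(float speed, float max_speed) {
    if (speed <= max_speed)
        return set_speed_async(speed);
    return make_ready(speed);
}
```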

Page 22: C++ Coroutines

.then() in Action

[Figure: timeline of chained .then calls across successive speedup_until_maxspeed invocations. Steps shown: set up getSpeed callback; receive getSpeed reply and invoke setSpeed; set up setSpeed callback; receive setSpeed reply and invoke speedup_until_maxspeed.]

Page 23: C++ Coroutines

What’s Wrong with .then()

• Control-flow is awkward

– The last example shows that async looping is hard

• Debugging is hard

– No stack-trace (at least not very useful)

– See screenshots

– .then is stitching together lambdas (program fragments). Not seamless.

– Awkward physical and temporal continuity


Page 24: C++ Coroutines

Welcome C++ Coroutines


Page 25: C++ Coroutines

Using C++ await

    dds::rpc::future<void> test_iterative_await(
      robot::RobotControlSupport::Client & robot_client)
    {
      static const int inc = 10;
      float speed = 0;

      while ((speed = await robot_client.getSpeed_async())+inc <= MAX_SPEED)
      {
        await robot_client.setSpeed_async(speed + inc);
        printf("current speed = %f\n", speed + inc);
      }
    }

Synchronous-looking but completely non-blocking code

Page 26: C++ Coroutines

C++ await Examples in Visual Studio 2015

• test_await

• test_iterative_await

• test_three_getspeed_await

• test_three_setspeed_await

• test_lambda_await

• test_foreach_await

• test_accumulate_await

• test_lift_accumulate_await

code on github (robot_func.cxx)


Page 27: C++ Coroutines

Awaitable Types

• Many, many possibilities. You define the semantics.
• A pair of types
  – Promise and Future
• Awaitable implements
  – await_ready
  – await_suspend
  – await_resume
  – type::promise_type
• Promise implements
  – get_return_object
  – initial_suspend
  – final_suspend
  – set_exception
  – return_value


Page 28: C++ Coroutines

C++ Generators


• Synchronous generator (yield) is syntax sugar for creating lazy containers

• std::experimental::generator<T>

– movable

• Many more possibilities

Page 29: C++ Coroutines

C++ Generators


• Synchronous generator (yield) is syntax sugar for creating lazy containers

    std::experimental::generator<char> hello()
    {
      yield 'H'; yield 'e'; yield 'l'; yield 'l'; yield 'o'; yield ',';
      yield ' '; yield 'w'; yield 'o'; yield 'r'; yield 'l'; yield 'd';
    }

    void test_hello()
    {
      for (auto ch: hello())
      {
        std::cout << ch;
      }
    }

Page 30: C++ Coroutines

Generator Iterator Category

• What is it?


std::input_iterator_tag

Page 31: C++ Coroutines

Deterministic Resource Cleanup

    std::experimental::generator<std::string> read_file(const std::string & filename)
    {
      std::fstream in(filename);
      std::string str;
      while (in >> str)
        yield str;
    }

    void test_read_file(const std::string & filename)
    {
      {
        // Hold the generator by value: in standard C++ a non-const
        // reference cannot bind to the temporary returned by read_file.
        auto generator = read_file(filename);
        for (auto & str : generator)
        {
          std::cout << str << " ";
          break;
        }
        std::cout << "\n";
      }
      // file closed before reaching here
    }

Page 32: C++ Coroutines

C++ Generators

• Generators preserve the local variables and arguments

    std::experimental::generator<int> range(int start, int count)
    {
      if (count > 0) {
        for (int i = 0; i < count; i++)
          yield start + i;
      }
    }

    void test_range()
    {
      for (auto i: range(1, 10))
      {
        std::cout << i;
      }
    }

Page 33: C++ Coroutines

Recursive?

    std::experimental::generator<int> range(int start, int count)
    {
      if (count > 0) {
        yield start;
        for (auto i : range(start + 1, count - 1))
          yield i;
      }
    }

• Crashes at count ~3000
  – Quadratic run-time complexity
  – i-th value yields i times
• “Regular” generators can’t suspend nested stack frames
  – Use recursive_generator<T>
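
The quadratic cost follows from counting yields. With count = n, the i-th value is re-yielded once by each of the i nested generators it passes through, so the total number of yields is:

```latex
\sum_{i=1}^{n} i \;=\; \frac{n(n+1)}{2} \;=\; O(n^2),
\qquad
n = 3000 \;\Rightarrow\; \frac{3000 \cdot 3001}{2} = 4{,}501{,}500 \text{ yields}
```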

Page 34: C++ Coroutines

Composing Abstractions

• A cool way to compose generators (lazy containers) is… range-v3
• But range-v3 did not compile on VS2015
  – So I’ll use my own generators library
  – See my Silicon Valley Code Camp 2015 talk: Composable Generators and Property-based Testing (video, slides)
  – You can use boost.range if you like

Page 35: C++ Coroutines

Composing Generators

    #include "generator.h" // See my github

    std::experimental::generator<char> hello()
    {
      for (auto ch : "HELLO WORLD")
        yield ch;
    }

    void test_coroutine_gen()
    {
      // ' ' + 32 == '@', hence the '@' in the expected message
      std::string msg = "hello@world";

      auto gen = gen::make_coroutine_gen(hello)
                   .map([](char ch) { return char(ch + 32); })
                   .take(msg.size());

      int i = 0;
      for (auto ch: gen)
      {
        std::cout << ch;
        assert(ch == msg[i++]);
      }
    }

Page 36: C++ Coroutines

Generator and Synchrony are Orthogonal

• Yellow Boxes
  – Libraries already exist (e.g., range-v3, future::then, RxCpp)
• Not really easy
  – The C++ Resumable Functions proposal adds language-level support
• Like many other languages: JavaScript, Dart, Hack, C#, etc.

Sync/Async, One/Many

         Single       Multiple
Sync     T            generator<T>::iterator
Async    future<T>    async_generator<T>

See Spicing Up Dart with Side Effects, ACM Queue, by Erik Meijer (Applied Duality), Kevin Millikin (Google), and Gilad Bracha (Google)

Page 37: C++ Coroutines

Further Reading

• Resumable Functions in C++: http://blogs.msdn.com/b/vcblog/archive/2014/11/12/resumable-functions-in-c.aspx

• Resumable Functions (revision 4), N4402, by Gor Nishanov and Jim Radigan (Microsoft)

• Spicing Up Dart with Side Effects, ACM Queue, by Erik Meijer (Applied Duality), Kevin Millikin (Google), and Gilad Bracha (Google)