Duff's Device, Coroutines and Continuations
I've been doing a lot of programming recently after a bit of a hiatus. It's
not that I haven't been programming, it's more the type. Instead of writing
deep-down, think about it for 10 minutes and then press a key, kind of
software I'm writing more bread and butter code.
The change in style has made me aware of several frustrations. When writing
bread and butter code, I find that a vast majority of it is boilerplate. It
doesn't change from class to class, and I even find myself performing
cut-and-paste programming. I'm finding this frustrating because I feel that
my productivity has dropped. I find that I have a certain number of lines of
bug-free code per day in me. If I can do more by writing fewer lines, then I'm more
productive. Having to write boilerplate saps my precious daily lines of code!
So, I'm on the lookout for new ideas to help me write more code faster. I
look for ways to architect the code to avoid having to actually write code.
This is the other side of Paul Graham's "Good Hackers Don't Use Java"
argument. While I have to use C++, I'm looking for line saving ideas from
other programming languages.
One of the more code sucking types of code is the state machine. State
machines are everywhere. They are frequently implemented as tables of
functions, with each state having a lot of boilerplate at the start and the
end.
My explorations for a better state machine implementation have taken me to
Duff's Device, Coroutines and Continuations. When I started, I knew what
Duff's Device and Coroutines were, but I only had a vague concept (and an
incorrect one at that) of what a Continuation was.
First, an introduction
Everyone out there has probably heard of loop unrolling. This is when you
take a long running loop in your program and attempt to perform more transactions
per iteration. The starting point looks something like this:
int j=100;
for (int i=0;i<j;i++) {
printf("%d\n", i);
}
To speed up this type of loop, you attempt to perform more printf's per
iteration. Simplistically, you will try something like this:
for (int i=0;i<j/5;i+=5) {
printf("%d\n", i);
printf("%d\n", i+1);
printf("%d\n", i+2);
printf("%d\n", i+3);
printf("%d\n", i+4);
}
However, what happens if j = 3? The code will (now incorrectly) still output
0-4. There is a way to fix it...
Enter Duff's Device.
Essentially, Duff's Device makes use of some interesting behaviour of switch
and case statements. The switch/case statements act like named goto's instead
of if statements. This gives them the ability to jump into scopes!
So, let's change the code to use Duff's Device:
int i=0;
int j=3;
switch (j%5) {
case 0:
for (;i<j;i+=5) {
printf("%d\n", i);
case 4:
printf("%d\n", i+1);
case 3:
printf("%d\n", i+2);
case 2:
printf("%d\n", i+3);
case 1:
printf("%d\n", i+4);
}
This is how loops are unrolled. In this way, we get many more operations
for each iteration through the loop.
At this point, the audience yells out "But this doesn't save me any lines of
code!", and they would be correct. It's Duff's Device with the next concept
that can save us code. Enter Coroutines.
Coroutines
Coroutines are co-operative multitasking. At some point the task will
indicate that it yields the processor, allowing another thread of
control to be executed. If the task doesn't give up control, it is
never interrupted. Most systems have moved on to using full blown
independent threads, but there are still some classes of systems where
threads aren't appropriate. Good examples are systems with several
hundred thousand independent operations active at any time. It isn't
possible to have a real thread for each operation, and managing a
thread pool is just as hard as managing coroutines. Add to that the
benefits of avoiding locking and resource sharing issues and the
argument for coroutines becomes pretty compelling. On those systems,
code can still be simpflified by using coroutines. State machine
code can be dramatically simplified, turning it into a linear top to
bottom function rather than a set of states managed by an array or
switch.
For example, consider the following (note, the code only works with GCC):
class coroutine {
public:
coroutine() : state(0) {}
virtual void run();
private:
int state;
}
#define coBEGIN() switch (state) {case 0:
#define coRETURN() state = __LINE__;return;case __LINE__:
#define coEND() }
class coBasic : public coroutine {
public:
virtual void run();
}
void coBasic::run() {
coBEGIN();
fprintf(stderr, "Here we are in state 1\n");
coRETURN();
fprintf(stderr, "Here we are in state 2\n");
coRETURN();
fprintf(stderr, "Here we are in state 3\n");
coEND();
}
int main(void) {
coBasic co;
co.run();
co.run();
co.run();
}
Where does this save us code? We save code in several locations.
First, we don't have to write a function header for each state. Code
will flow from one state to the next. Next up, we avoid all of the
setup code. Frequently code will have to do at least a little bit of
marshalling as it prepares to run. Using this, that code is shared
among all of the states, instead of being copied, or shared as a
separate function.
There are some limitations to this implementation. First, it is very
difficult to specify the next state to invoke, linear is the name of
the game. It is possible to change the macros to use specific state
names instead of a simple return, which would allow more complex state
machines to be created. If we restrict ourselves to GCC, whose
preprocessor allows macro overloading, we can even combine the two
with both named and unnamed states.
Also, there are some limitations imposed by the use of Duff's Device.
C++ does not allow code to jump over constructors, so code like the
following will resultin a compilation error:
void coBasic::run() {
coBEGIN();
fprintf(stderr, "Here we are in state 1\n");
std::map<int, int> foobar; // <-- ERROR
coRETURN();
fprintf(stderr, "Here we are in state 2\n");
coEND();
}
To get around this, we can do two things. We can either move to
C-style declarations and do all of our declarations at the start of
the function, or we can modify the macros slightly. The modified
macros would look like this:
#define coBEGIN() switch (state) {case 0:{
#define coRETURN() state = __LINE__;return;}case __LINE__:{
#define coEND() }}
As you can see, we're adding an additional scope around each state.
In that way, we avoid jumping over a constructor for an object we
could conceivably use, but we do end up having some strange compiler
problems. Consider the following code:
void coBasic::run() {
coBEGIN();
fprintf(stderr, "Here we are in state 1\n");
std::map<int, int> foobar;
coRETURN();
fprintf(stderr, "Here we are in state 2 map size = %d\n",
foobar.size()); // <-- ERROR
coEND();
}
This will generate an undefined variable error for foobar, because
foobar is only defined in the previous state. So, we get slightly
more confusing code because scopes aren't necessarily clearly defined.
However, this does allow state-local variables to be declared and
used.
Continuations
When I originally started to write this article, I misunderstood what
a continuation was. I thought it was simply a coroutine by another
name. I was wrong. Continuations are about continuing execution at
stored location in the procedure, but they add a new wrinkle. Where
the coroutine above returns, a continuation passes the execution
location down the stack. Perhaps a picture will help:
Coroutine:
A -> B (no state)
<- B (point 1)
A -> B (starts at point 1)
<- B returns
...
Continuation
A -> B (no state)
-> C (B at point 1)
-> D (B still at point 1)
-> B (starts at point 1)
<--------------- B returns all the way back to A
A
It seems that continuations are a relatively new programming style,
with few people understanding their benefits, or even being able to
explain how they work. The easiest explanation I found was on c2.com
(http://c2.com/cgi/wiki?ContinuationExplanation), especially the C
code version (http://c2.com/cgi/wiki?ContinuationsInCee). The form of
continuation presented in that code is called a "Single Use
Continuation", meaning it can only be executed if it is still on the
stack, and that it can only be invoked once. Some languages
(Lisp/Scheme/Dylan?) fully support continuations allowing them to be
executed multiple times at any point.
There is one more thing about continuations. They save the state of
the stack at the point the continuation is created, allowing code to
return to that location using the existing local variables. This is
very useful for unraveling network and user interaction code. For
this we really need a language that supports continuations, such as
Scheme, Lisp and Dylan. In C/C++, we can mostly fake it with some
limitations.
We can fake them using various methods. In C, we can use
setjmp/longjmp or on x86, modify the stack base pointer as in the C2
example. In C++, we can do much of the same using exceptions. The
use of exceptions allows us to properly clean up the parent stack
frames, while returning to the previous stack frame. In C, we can use
the speedier code and simply "Go There".
Using the previous example, the C/C++ continuations I'm presenting here
would also invalidate any continuations in stack frames C or D when B is
executed the second time.
For a basic implementation, consider the following:
class continuation<class X> {
public:
typedef X myContinueSignal;
continuation() : state(0) {}
virtual void run();
void continue();
private:
int state;
}
void continuation::continue() {
throw X();
}
struct myContinuationContinueSignal {}
class myContinuation: public continuation<class myContinuationContinueSignal> {
public:
void run();
}
#define coBEGIN() switch (state) {case 0:{
#define coRETURN() state = __LINE__;return;}case __LINE__:{
#define coCALL(x) state = __LINE__;try {x;}catch (myContinuation::myContinueSignal&){}case __LINE:{
#define coEND() }}
I hope I've got the code correct, I've never actually plugged any of this into a compiler, so you can be reasonably sure it won't work.
This allows us to take the original state machine and add some niftyness.
void get_some_more_information(continuation *continuation) {
// Do some stuff, like send a message and then get a response
// Do some more stuff, like put the response into the class.
continuation->continue();
}
void myContinuation::run() {
coBEGIN();
fprintf(stderr, "Here we are in state 1\n");
coRETURN();
fprintf(stderr, "Here we are in state 2\n");
coRETURN();
coCALL(get_some_more_information(this));
fprintf(stderr, "Here we are in state 3\n");
coEND();
}
Here, if get_some_more_information decides it already has the requested
information, it can continue immediately. If decides that it will
have to block, it can use it's own continuation to throw back through
myContinuation::run while still preserving the state for later
execution.
Why is this useful? We could pass the coroutine down and execute it
again later. If we did that, when get_some_more_information
eventually returned the code would again execute. We could try to
execute the continuation at a possibly new state, but that may not be
what we want. It results in hard to manage execution paths, with the
code possibly getting extra, undesired state transitions.
These continuations are still limited. Due to the try/catch blocks, we
can't really interleave continuations of the same type. The destruction
of the higher stack frames makes it difficult to have multiple continuations
active at any one point in time. Finally, we can't return to the same
continuation point more than once, meaning we can't store a state and then
return to it at a future point, to, for example, re-attempt a transaction.
Coroutines and Continuations. Both are great at decreasing the amount of
bug-ridden, hard to read, unmaintainable C++ state machine code you have to
write. I know it's helped me get more done in a day.