Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features
Download
Marketing
Archives
FAQ
Blog
|
|
/* Actors, based on the Win16/32/64 WndProc mechanism.
This is a surprisingly usable and high-performance object system in 11
lines of C89, although its usability benefits greatly from GCC's
nested function extension.
The basic architecture is that you write a function for each "class"
of "objects", which is called every time a "method" is "called" on
your "object", and create a struct (whose contents are never used) to
identify each "method name". The linker assigns each struct a
separate address in memory, and that address is used to identify which
"method" is being "called" (with a conditional in your function).
In the terminology of the actor model (as I have mangled it), an
"object" is called an "actor", the code it runs (the "class") is
called the actor's "script", and instead of "calling methods", we
"send cues".
This example program displays a grid on the screen where you can move
around with your cursor keys and interact with different kinds of
objects in the grid: editable text, uneditable text, numbers, bits.
The "actor" is represented by a struct whose first item is a pointer
to its script, and whose other three items are available for the
creator to supply an initial state to the actor. In cases where you
need more than three fields, or if you want a little more type-safety,
you can define a struct for your internal state.
Overall structure of example program
------------------------------------
In this example program, we have a grid actor, which contains cell
actors; the cell actors use several different scripts, giving them
different behavior. One of the cell actors is a user interface for a
byte-buffer actor. The output from the grid is passed to a
buffered-output actor, which uses the same kind of text-buffer actor
to buffer the output before handing it off to the OS. (There's
commented-out code for an older buffered-output actor that used
stdio.) The main program doesn't interact directly with the grid
actor; instead, it interacts with an arrow-key decorator actor that
acts as a proxy for the grid, rewriting arrow-key escape sequences
from the input stream into single control characters.
Interfaces and privacy
----------------------
Because cue types (or methods) are named with memory addresses,
rather than strings or vtable offsets, they are subject to all the
visibility rules of C: you can say `static cue_t x = { "x" }`
so that `x` can be called only from within the same module; you can
pass a `cue_t` as an argument to a function that wouldn't
normally be able to send that cue type; you can conditionally
declare them in header files; and so on.
Given that, there are no explicit declarations of interfaces. If
someone tries to send an `q_fire` cue to an event detector, but
instead they send it to a nuclear-weapons launcher, you'd better hope
the nuclear-weapons launcher script didn't use the same `q_fire`
cue type. You can totally have more than one of them, as long as
they're static.
If you depend on conditionally declaring cue types in header
files, be aware that people can work around that by adding
their own declaration.
Limited return values
---------------------
The response to sending a cue can be a single `void*`, so single-word
getters work okay with return values, but for more general "return
values", we use, and rely on, the traditional stack of function calls,
taking advantage of GCC's nested functions to avoid the need for
return values. For example, instead of writing:
int len = send(b, &q_get_len, 0);
if (len) send(b, &q_truncate_to, (void*)len-1);
we manually rewrite the code in explicit continuation-passing style:
int len;
void get_len(char *s, int thelen) { len = thelen; }
send(b, &q_read, (void*)get_len);
if (len) send(b, &q_truncate_to, (void*)len-1);
For objects larger than single registers,
such as strings, it dramatically simplifies lifecycle
management. When an integer cell is asked for its string value, it
can construct that value in a stack-allocated buffer, then deallocate
it once the callback returns; `implement_get_text_as_number` is an
example:
void *
implement_get_text_as_number(int number, const cue_t *cue, void *arg)
{
if (cue == &q_get_text) {
char numbuf[32];
my_itoa(numbuf, number);
(*(get_text_callback)arg)(numbuf, strlen(numbuf));
}
return 0;
}
If you were using this scheme with a compiler that doesn't provide
nested functions, you'd almost certainly want to increase the number
of arguments to the cue from 1 to 2, the traditional Windows wParam
and lParam, so that you could pass both a function pointer and a
user-data pointer to be passed to it.
Allocation, initialization, and pointers
----------------------------------------
The only heap allocation in this program is to handle string buffers,
which resize. Although you could certainly allocate actors on the
heap, all of the actors in this example program are stack-allocated, a
circumstance which should be familiar to C++ users and bizarre to
anyone from any other OO language. (The stack allocation means the
program as a whole needs about 580 bytes of memory, not counting the
never-accessed space for the cue structs.) Two of the actor types
(byte buffers and buffered output) take a "destroy" cue to deallocate
their memory.
This implies that the creator of each actor knows its size, and
actually also its initial contents. This is somewhat suboptimal, but
I claim it's not as bad as it sounds.
Why not? Well, the primary flexibility provided by OO is that when
you send a message, it can be handled by any of several different
concrete classes, with possibly different implementations; someone can
pass you a dummy file output object, or a compressing file output
object, and your code will continue to work as before, but with added
functionality. This obviously doesn't work if you instantiate the
object yourself with some concrete class. If you use dependency
injection or factory objects, then the apparent limitation goes away,
along with the other inflexibilities that those patterns remove.
Type-checking of method parameters and actor state
--------------------------------------------------
Nope, sorry. FWIW, although I had a heck of a lot of bugs writing
this code, only one of them came from those type errors.
Actually, it would be good if there was a way to blow up when you call
a method that's not defined, but this code isn't even doing that much
type-checking.
Performance
-----------
I haven't tried doing anything hard with it. It looks like it should
be comparably efficient to C++ vtable dispatch. Crudely, I get about
100 million serial cue sends per second on simple classes on a
1.6GHz Atom, so about 16 clock cycles per cue send; presumably you
spend about another clock cycle on average per type of cue you
handle.
You suffer function prologue and epilogue overhead around the "method
dispatch" script, but then you don't need to suffer them again; the
method body can be inlined into the dispatch function. I specify
regparm(3) for the scripts themselves to cut the prologue and epilogue
overhead a bit.
The nested-functions-for-getter trick is pretty costly, particularly
since GCC constructs trampolines for them for you upon function entry
whether you're using the trampoline or not.
It seems to be fairly space-efficient. The first script in the object
file is `textstring_script`, which is a very simple class: it displays
an ASCIZ string in a cell of the grid. The entire implementation of
this class is a single function that compiles to 9 i386 instructions,
25 bytes. (That happens to be the class I used to test the cue
sends.)
Possible improvements
---------------------
4. Use functions instead of structs to identify methods? That way the
debugger knows to print their names.
Provenance
----------
Written by Kragen Javier Sitaker, with many helpful improvements
contributed by Darius Bacon.
*/
#include
#include
#include
#include
#include
#include
#include
/******************** Object system. ********************/
typedef struct { char *name; } cue_t;
#define SCRIPT(x) __attribute__((regparm(3))) void * \
x(actor *me, const cue_t *cue, void *arg)
typedef struct actor actor;
typedef SCRIPT((*script));
struct actor { script scr; void *state, *state2, *state3; };
static inline void *
send(actor *a, const cue_t *cue, void *arg)
{
return a->scr(a, cue, arg);
}
/******************** misc. ********************/
#ifdef CTRL
#undef CTRL
#endif
#define CTRL(c) ((c) & 0x1f)
void
panic(char *what)
{
char *err = strerror(errno);
write(2, what, strlen(what));
write(2, ": ", 2);
write(2, err, strlen(err));
write(2, "\n", 1);
kill(getpid(), SIGABRT);
_exit(-1);
}
/******************** Output stream protocol. ********************/
// Cues for output streams:
cue_t q_print = { "print" };
typedef struct { char *bytes; int len; } bytes_with_len;
cue_t q_flush = { "flush" };
// Convenience functions for q_print
static inline void
print(actor *actor, char *bytes, int len)
{
bytes_with_len s = { bytes, len };
send(actor, &q_print, &s);
}
static inline void
print_asciz(actor *actor, char *bytes)
{
print(actor, bytes, strlen(bytes));
}
/******************** grid. ********************/
// The grid is a rectangular array of cell actors; it routes
// keystrokes to the actor in the current cell, and redisplays the
// grid upon request.
// Cues for cells:
cue_t q_get_text = { "get-text" };
cue_t q_keystroke = { "keystroke" };
// Cue for the grid (which also handles q_keystroke):
cue_t q_draw_grid = { "draw-grid" };
// Convenience function which helps a little with type-checking.
static inline void
send_key(actor *actor, char key)
{
send(actor, &q_keystroke, (void*)(int)key);
}
typedef struct { actor *cells; int width; int height; int x, y; } grid;
static inline actor *
cell_at(grid *g, int x, int y)
{
return &g->cells[g->width * y + x];
}
static inline actor *current_cell(grid *g) { return cell_at(g, g->x, g->y);
}
#define IF break; case
#define ELSE break; default
void
grid_handle_key(grid *g, char k)
{
switch(k) {
IF CTRL('f'): g->x = (g->x + 1) % g->width;
IF CTRL('b'): g->x = (g->x + g->width - 1) % g->width;
IF CTRL('p'): g->y = (g->y + g->height - 1) % g->height;
IF CTRL('n'): g->y = (g->y + 1) % g->height;
ELSE: send_key(current_cell(g), k);
}
}
void
grid_draw(grid *g, actor *output)
{
int i, j, column_widths[g->width];
for (i = 0; i < g->width; i++) {
int maxlen = 0;
for (j = 0; j < g->height; j++) {
void get_length(char *string, int len) {
if (len > maxlen) maxlen = len;
}
send(cell_at(g, i, j), &q_get_text, get_length);
}
column_widths[i] = maxlen;
}
print_asciz(output, "\e[H\e[J"); // clear screen
for (i = 0; i < g->height; i++) {
for (j = 0; j < g->width; j++) {
char *after = "";
if (j == g->x && i == g->y) {
print_asciz(output, "\e[47;35m"); // highlight magenta on white
after = "\e[0m";
}
void emit_output(char *string, int len) {
print(output, string, len);
for (; len < column_widths[j] + 1; len++) {
print_asciz(output, " ");
}
}
send(cell_at(g, j, i), &q_get_text, emit_output);
print_asciz(output, after);
}
print_asciz(output, "\n");
}
send(output, &q_flush, NULL);
}
SCRIPT(grid_script)
{
grid *g = me->state;
if (cue == &q_draw_grid) {
grid_draw(g, arg);
} else if (cue == &q_keystroke) {
grid_handle_key(g, (char)(int)arg);
}
return 0;
}
/*
SCRIPT(stdio_script)
{
FILE *f = me->state;
bytes_with_len *b = arg;
if (cue == &q_print) { fwrite(arg->bytes, arg->len, 1, f); }
else if (cue == &q_flush) { fflush(f); }
}
*/
/******************** Buffers. ********************/
// This is an autogrowing buffer of bytes, like stralloc or
// std::string. It supports five methods.
// Previously, it was a separate struct, but now that the actor
// struct has three user-data fields, we use them as follows:
// state: char* to the beginning of the buffer
// state2: current size of buffer contents
// state3: allocated size of buffer
cue_t q_truncate_to = { "truncate-to" };
cue_t q_read = { "read" };
cue_t q_append = { "append" };
cue_t q_append_asciz = { "append-asciz" };
cue_t q_get_maxlen = { "get-maxlen" };
cue_t q_destroy = { "destroy" };
void
ensure_buf_can_hold(actor *me, int newlen)
{
int newmaxlen = newlen * 2 + 16;
char *ns;
if (newlen <= (int)me->state3) return;
ns = realloc(me->state, newmaxlen);
if (!ns) panic("realloc");
me->state = ns;
me->state3 = (void*)newmaxlen;
}
SCRIPT(buf_script)
{
char *s = me->state;
int len = (int)me->state2, maxlen = (int)me->state3;
if (cue == &q_read) {
(*(void(*)())arg)(s, len);
} else if (cue == &q_truncate_to) {
int newlen = (int)arg;
if (newlen < len) me->state2 = (void*)newlen;
} else if (cue == &q_append) {
bytes_with_len *b = arg;
ensure_buf_can_hold(me, b->len + len);
memcpy(me->state + len, b->bytes, b->len);
me->state2 = (void*)(b->len + len);
} else if (cue == &q_append_asciz) {
bytes_with_len b = { arg, strlen(arg) };
return send(me, &q_append, &b);
} else if (cue == &q_get_maxlen) {
return (void*)maxlen;
} else if (cue == &q_destroy) {
free(s);
me->state = me->state2 = me->state3 = NULL;
}
return 0;
}
/******************** cell types ********************/
// Currently there are textstring_script, strlen_cell_script,
// buf_cell_script, buf_len_cell_script, buf_maxlen_cell_script,
// bit_cell_script, and integer_script.
typedef void (*get_text_callback)(char *string, int len);
SCRIPT(textstring_script)
{
if (cue == &q_get_text) (*(get_text_callback)arg)(me->state,
strlen(me->state));
return 0;
}
static inline char *
_my_itoa(char *numbuf, int number)
{
int more_digits = number / 10;
if (more_digits) numbuf = _my_itoa(numbuf, more_digits);
*numbuf = '0' + number % 10;
return numbuf + 1;
}
void
my_itoa(char *numbuf, int number)
{
*_my_itoa(numbuf, number) = '\0';
}
void *
implement_get_text_as_number(int number, const cue_t *cue, void *arg)
{
if (cue == &q_get_text) {
char numbuf[32];
my_itoa(numbuf, number);
(*(get_text_callback)arg)(numbuf, strlen(numbuf));
}
return 0;
}
SCRIPT(strlen_cell_script)
{
return implement_get_text_as_number(strlen(me->state), cue, arg);
}
SCRIPT(bit_cell_script)
{
int *my_number_p = me->state;
int bit_offset = (int)me->state2;
if (cue == &q_get_text) {
char *to_display = " ";
if (*my_number_p & 1 << bit_offset) to_display = "x";
(*(get_text_callback)arg)(to_display, 1);
} else if (cue == &q_keystroke) {
*my_number_p ^= 1 << bit_offset;
}
return 0;
}
SCRIPT(integer_script)
{
int *num = me->state;
if (cue == &q_keystroke) {
switch((int)arg % 4) {
IF 0: --*num;
IF 1: ++*num;
IF 2: *num *= 2;
ELSE: *num /= 2;
}
}
return implement_get_text_as_number(*num, cue, arg);
}
SCRIPT(buf_cell_script)
{
actor *b = me->state;
if (cue == &q_get_text) {
send(b, &q_read, arg);
} else if (cue == &q_keystroke) {
char c = (char)(int)arg;
if (c == CTRL('h') || c == '\x7f') { // backspace and delete
int len;
void get_len(char *s, int thelen) { len = thelen; }
send(b, &q_read, get_len);
if (len) send(b, &q_truncate_to, (void*)len-1);
} else if (c < ' ') { // other control characters
// do nothing
} else {
bytes_with_len buf = { &c, 1 };
send(b, &q_append, &buf);
}
}
return 0;
}
SCRIPT(buf_len_cell_script)
{
int length;
void get_length(char *s, int len) { length = len; }
send(me->state, &q_read, get_length);
return implement_get_text_as_number(length, cue, arg);
}
SCRIPT(buf_maxlen_cell_script)
{
int length = (int)send(me->state, &q_get_maxlen, 0);
return implement_get_text_as_number(length, cue, arg);
}
/******************** buffered output ********************/
// I added this in part because I thought stdio was too inefficient,
// in part to demonstrate the idea, and in part to avoid linking in
// the whole stdio implementation.
SCRIPT(buffered_output_script)
{
actor *buf = me->state;
int fd = (int)me->state2;
int length;
void get_length(char *s, int len) { length = len; }
if (cue == &q_print) {
send(buf, &q_append, arg);
}
send(buf, &q_read, get_length);
if (cue == &q_flush || cue == &q_destroy || length > 4096) {
void output(char *s, int len) { write(fd, s, len); }
send(buf, &q_read, output);
send(buf, &q_truncate_to, 0);
}
if (cue == &q_destroy) {
send(buf, &q_destroy, NULL);
}
return 0;
}
/******************** arrow key decorator ********************/
// This is an explicit state machine that rewrites ANSI arrow key
// escape sequences (^[[A, ^[[B, ^[[C, and ^[[D,
// or the equivalents with O instead of [) into the Emacs
// control keys understood by the grid as movement commands. It's a
// proxy decorating another actor (which happens to be the grid) and
// passes through any cues it doesn't understand unchanged.
//
// Note that the state-machine state is after the next-actor field,
// and its initial value is zero, so that the creator doesn't have to
// initialize it explicitly.
enum ak_state { ak_normal = 0, ak_esc_seen, ak_lsq_seen };
SCRIPT(arrow_key_script)
{
actor *next = me->state;
enum ak_state state2 = (enum ak_state)me->state2;
inline void go_to(enum ak_state next_state) { me->state2 =
(void*)next_state; }
if (cue == &q_keystroke) {
char c = (char)(int)arg;
switch (state2) {
IF ak_normal:
if (c == '\e') {
go_to(ak_esc_seen);
} else {
send(next, cue, arg);
go_to(ak_normal);
}
IF ak_esc_seen:
if (c == '[' || c == 'O') {
me->state3 = (void*)(int)c;
go_to(ak_lsq_seen);
} else {
send(next, cue, (void*)'\e');
send(next, cue, arg);
go_to(ak_normal);
}
IF ak_lsq_seen:
switch (c) {
IF 'A': send(next, cue, (void*)CTRL('p'));
IF 'B': send(next, cue, (void*)CTRL('n'));
IF 'C': send(next, cue, (void*)CTRL('f'));
IF 'D': send(next, cue, (void*)CTRL('b'));
ELSE:
send(next, cue, (void*)'\e');
send(next, cue, me->state3);
send(next, cue, arg);
}
go_to(ak_normal);
}
} else {
send(next, cue, arg);
}
return 0;
}
/******************** Example keystroke-driven program
********************/
/* because curses makes me curse */
struct termios
my_cbreak(int fd)
{
struct termios new_termios, old_termios;
if (tcgetattr(fd, &old_termios) < 0) panic("tcgetattr");
new_termios = old_termios;
new_termios.c_lflag &= ~(ECHO | ICANON);
new_termios.c_cc[VMIN] = 0;
new_termios.c_cc[VTIME] = 0;
if (tcsetattr(fd, TCSANOW, &new_termios) < 0) panic("tcsetattr");
return old_termios;
}
enum { stdin_fd = 0, stdout_fd = 1 };
int
mymain() {
// actor stdout_actor = { stdio_script, stdout };
actor output_buf = {buf_script};
actor output_actor = {buffered_output_script, &output_buf,
(void*)stdout_fd};
actor buf1buf = {buf_script};
int my_number = 42;
actor cells[] = {
{buf_len_cell_script, &buf1buf},
{buf_maxlen_cell_script, &buf1buf},
{buf_cell_script, &buf1buf},
{textstring_script, ""},
{bit_cell_script, &my_number, (void*)7},
{bit_cell_script, &my_number, (void*)6},
{bit_cell_script, &my_number, (void*)5},
{bit_cell_script, &my_number, (void*)4},
{bit_cell_script, &my_number, (void*)3},
{bit_cell_script, &my_number, (void*)2},
{bit_cell_script, &my_number, (void*)1},
{bit_cell_script, &my_number, (void*)0},
{integer_script, &my_number},
{textstring_script, "In the end,"},
{textstring_script, "only kindness matters."},
{textstring_script, "I'm sensitive and I'd like to stay that way"},
};
grid mygrid = { .width = 4, .height = 4, .x = 2, .y = 0, .cells = cells
};
actor real_grid_actor = { grid_script, &mygrid };
actor grid_actor = { arrow_key_script, &real_grid_actor };
struct termios old_termios = my_cbreak(stdin_fd);
send(&buf1buf, &q_append_asciz, "this is a test");
/* cue sending speed test: * /
int i;
for (i = 0; i < 100*1000*1000; i++) {
send_key(&cells[3], 'x');
}
//*/
for (;;) {
char c;
send(&grid_actor, &q_draw_grid, &output_actor);
while (!read(stdin_fd, &c, 1)) usleep(20*1000);
if (c == 'q') break;
send_key(&grid_actor, c);
}
send(&buf1buf, &q_destroy, NULL);
send(&output_actor, &q_destroy, NULL);
tcsetattr(stdin_fd, TCSANOW, &old_termios);
return 0;
}
// This is sort of the equivalent of Python's "if __name__ == '__main__':"
int main() __attribute__((weak, alias("mymain")));
--
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-hacks
|
|