Tom Schouten | 21 Nov 19:22

update

hi all,

A not so small update about PF dev, mostly driven by Metabiosis and the
summer goto10 forth workshops. Some changes are quite deep and require
me to rant a bit about PF guts. There are some new features, but most
of it is cleanup and bugfix.

READ+WRITE

PF has a lisp-style parser, which is i guess the first thing one would
notice that sets it apart from a classic, low-level forth. The
reason for this is of course that PF has a small built-in type/object
system. The parser ('read' = ASCII -> internal data structure) can be
combined with the serializer ('write' = data structure -> ASCII),
which eliminates the need for ad-hoc protocols when
communicating with PF.

This is what Metabiosis uses for communication between different PF
instances and could be called 'the PF protocol'. The list + string +
number + symbol subset, which excludes raw data packets, is i'd say 99%
lisp compatible. This has a lot of benefits: almost any programming
language can read lisp syntax. This allows you to write 'engines' in
PF, and do all the non real-time stuff in lisp, or any language that
can read this simple syntax. The typical example would be PF's emacs
support.
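
To make the read/write idea concrete, here is a minimal sketch in
Python of a reader and a serializer for the list + string + number +
symbol subset. This is not PF's parser: the names (Symbol, read,
write) and the escaping rules are assumptions made up for
illustration. It only shows that a round trip through ASCII needs very
little code in another language.

```python
# Hypothetical sketch, not PF code: read = ASCII -> data, write = data -> ASCII.
import re

class Symbol(str):
    """A bare word such as dup or blur, kept distinct from quoted strings."""

# One token: an open paren, a close paren, a quoted string, or a bare word.
TOKEN = re.compile(r'\s*(\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+)')

def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError("bad input at offset %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def atom(tok):
    if tok.startswith('"'):
        # Undo the backslash escapes that write() produces.
        return re.sub(r'\\(.)', r'\1', tok[1:-1])
    for conv in (int, float):
        try:
            return conv(tok)
        except ValueError:
            pass
    return Symbol(tok)

def read(text):
    """Parse ASCII into nested lists; returns the list of top-level items."""
    stack = [[]]
    for tok in tokenize(text):
        if tok == '(':
            stack.append([])
        elif tok == ')':
            if len(stack) == 1:
                raise ValueError("unbalanced )")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(atom(tok))
    if len(stack) != 1:
        raise ValueError("unbalanced (")
    return stack[0]

def write(item):
    """Serialize nested lists, strings, numbers and symbols back to ASCII."""
    if isinstance(item, list):
        return '(' + ' '.join(write(x) for x in item) + ')'
    if isinstance(item, Symbol):
        return str(item)
    if isinstance(item, str):
        return '"' + item.replace('\\', '\\\\').replace('"', '\\"') + '"'
    return repr(item)
```

With this, read('(blur 0.5 "hello world" (1 2 3))') gives one nested
list, and write() on that list gives the same text back. Raw data
packets are deliberately left out, as in the lisp-compatible subset.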

MEMORY+THREADS

PF has a linear memory model, which means that the entire data structure
PF is aware of is a tree. That makes it possible to use simple
reference count garbage collection, which is fast and deterministic.
What this means is that 'dup' and 'drop' are fast, and the language
behaves as a functional programming language (no side effects) for
some operations.
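
As an illustration of why reference counting on a tree is
deterministic, here is a small Python sketch. The Packet class and the
dup/drop/release functions are hypothetical stand-ins, not PF's C
structures. The point is that 'dup' is one counter increment, and
'drop' frees memory immediately and in a bounded number of steps,
because a tree has no cycles.

```python
# Hypothetical sketch, not PF code: explicit reference counts on a tree.
class Packet:
    """A heap object with an explicit reference count."""
    live = 0  # number of packets currently allocated, for illustration

    def __init__(self, payload, children=()):
        self.payload = payload
        self.children = list(children)
        self.refs = 1
        Packet.live += 1

def dup(stack):
    """Copy the top of the stack by sharing it: one counter increment."""
    top = stack[-1]
    top.refs += 1
    stack.append(top)

def release(packet):
    """Give up one reference; free the packet when none are left."""
    packet.refs -= 1
    if packet.refs == 0:
        # A tree has no cycles, so this recursion always terminates.
        for child in packet.children:
            release(child)
        packet.children.clear()
        Packet.live -= 1

def drop(stack):
    """Remove the top of the stack and release it right away."""
    release(stack.pop())
```

Pushing a packet with two children, then doing dup, drop, drop leaves
Packet.live at zero right after the second drop. There is no collector
pass that could kick in at an inconvenient moment.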

This is fairly easy to implement for single-threaded applications, but
becomes a lot more complicated for multi-threaded ones due to locking
issues. I found out a bit too late that having posix thread support
requires you to think before you code.

To make a long story short: PF's core data structures, which can't be
changed without a full rewrite, are not written to take into account
pre-emptive task switching. Trying to do this anyway requires locking
everywhere, and is very error prone since it's not centralized.

The alternative is to take out all locks, and write your own event
scheduling code. That way atomicity is guaranteed, real-time response
is guaranteed due to the deterministic memory management, and the code
becomes a lot easier to get right. This is something i learned to
appreciate toying with more lowlevel forths.
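
A minimal sketch of what "write your own event scheduling code" can
look like, using Python generators as cooperative tasks. The run and
counter names are made up for this example and say nothing about how
PF's scheduler is written. The thing to notice is that a task can only
be switched out at an explicit yield, so everything between two yields
is atomic without any lock.

```python
# Hypothetical sketch, not PF code: a round-robin cooperative scheduler.
from collections import deque

def run(tasks):
    """Resume each task in turn until it yields or finishes."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)        # runs uninterrupted up to the next yield
        except StopIteration:
            continue          # task is done, do not requeue it
        queue.append(task)

def counter(name, n, log):
    """Example task: appends to a shared list without any locking."""
    for i in range(n):
        log.append((name, i))  # nothing can preempt this statement
        yield                  # the only place a task switch can happen
```

Running run([counter('a', 2, log), counter('b', 3, log)]) interleaves
the two tasks in a fixed, repeatable order, which is a lot easier to
reason about than pre-emptive threads sharing the same list.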

I/O

Up till now i've only used PF in fairly linear configurations: reading
from one file/process, writing to one file. The standard UNIX stuff.
The problem which Metabiosis exposed was that in order to support more
distributed systems, a lot more care needs to be taken about how to
schedule different tasks. PF had a deadlock problem when different
instances were reading/writing to each other in a cyclic way.

This is solved now. In interfacing with the operating system, PF uses
non-blocking I/O only. It still presents blocking I/O to the forth
level in the same way. Currently, the only tasks that run while the
forth level is blocked are output tasks that write raw data to streams.

This can be straightforwardly extended to arbitrary multiple blocking
forth tasks later, but for now it lives at the C level only, due to
time constraints.
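
The idea of "non-blocking underneath, blocking on top" can be sketched
as follows in Python. This is an illustration under assumptions, not
PF's C code: OutputTask and blocking_read are invented names, and the
sockets are expected to be in non-blocking mode already. The caller
sees an ordinary blocking read, but while no input is available the
loop keeps draining pending output, which is what breaks the cyclic
deadlock between instances.

```python
# Hypothetical sketch, not PF code: a read that services output while waiting.
import select
import socket

class OutputTask:
    """Pending raw data for one non-blocking stream."""
    def __init__(self, sock, data):
        self.sock, self.data = sock, data

    def step(self):
        sent = self.sock.send(self.data)  # may accept only part of the data
        self.data = self.data[sent:]
        return not self.data              # True once everything is written

def blocking_read(sock, nbytes, outputs):
    """Return up to nbytes from sock, running output tasks while waiting."""
    while True:
        writers = [task.sock for task in outputs]
        # Sleep until input arrives or some pending output can make progress.
        readable, writable, _ = select.select([sock], writers, [])
        for task in [t for t in outputs if t.sock in writable]:
            if task.step():
                outputs.remove(task)
        if readable:
            return sock.recv(nbytes)
```

A quick way to try it is socket.socketpair() with setblocking(False)
on both ends: queue an OutputTask on one end, call blocking_read on
the other, and the read returns the queued bytes even though nobody
else was around to do the writing.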

RESULTS

* i/o buffer fixes + removed pthread in the language core
* cleaned up memory management code 

    so the core has one less library dependency and got a lot simpler
    too. plus the road is open for multiple blocking forth tasks.

* parser fixes: rewrote as coroutine + added raw packet support

    parser is now a separate task, which was necessary to support
    multiple blocking consoles without pthread support. and it's
    possible to read/write raw packets like matrices and
    images. however, this is not standard and might change in the
    future, but is fairly reliable when the PF instances are exactly
    the same version and run on the same kind of machines (for
    endianness)

* added new 'daemon' mode -> multiple consoles using a unix socket
* readline console is a separate program 'pf-console'

    the readline console which previously ran in a thread is now a
    separate application. it communicates with pf using the 'console
    protocol'. this is a line based pure ASCII protocol: console sends
    one line with PF commands, and receives a line of
    feedback. arbitrary data can easily be tunneled through the ASCII
    by using strings and the 'serialize' word.

* fixed stream code
* OSC now parses from strings, and uses generic UDP network support
* video record and playback objects

    pf can now read/write to/from: files+fifos, subprocesses, TCP/UNIX
    streams and UDP datagrams. this is all raw data. it supports 2
    structured data protocols: the standard PF serialized atom stream,
    and a line based I/O mode.

    there is support for using ffmpeg and mencoder for video
    in/out. this works fairly well, and relieves the burden of having
    to implement interfaces to libraries on the C level, which often
    break. as a consequence, i think all open source codecs are
    supported for writing, and a huge amount of both open and closed
    codecs are supported for reading. still video only though.

* PF in Pure Data

    not directly related to all the above, but i fixed a lot of bugs
    here too. atm, the single 'tick' task that's used in most
    animation scripts can be supported inside Pd. i think it's time
    somebody started implementing 3DP again :)
