Ryan Dahl | 21 Oct 04:36 2010

2010 Q4 Roadmap

= Versions

The master branch has diverged from v0.2 to the extent that most
patches no longer cherry-pick cleanly into v0.2.  The 0.2.x
branch will continue to be supported through November. After v0.2.3,
releases will only be made for serious bugs. Users are encouraged
to suggest patches for back-porting to the v0.2 branch (or even to
submit patches against v0.2). Expect v0.2.4 by October 23 with a small
number of touch-ups. Weekly releases of the master branch will be
made under v0.3.X, starting with v0.3.0 by October 23rd.

= Major Areas of Development

== Write Bunching and String Dumping

Currently each call to socket.write() executes the write(2) syscall.
This creates a number of situations where it is difficult to pack data
into a single packet - e.g. writing the header of an HTTP message and
then a small body.  The alternative is to buffer all writes until the
end of the event loop iteration and, just before going to select() (or
kqueue, ...), send all the data with a single large writev(2) call.
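
The buffering idea can be sketched in userland JavaScript. This is
only an illustration, not the planned implementation inside Node: the
WriteBuncher class, its sink callback and its injectable schedule
function are hypothetical names, and Buffer.concat() merely stands in
for the gathering that writev(2) would do.

```javascript
// Hypothetical sketch: collect small writes and hand them to the sink
// as one chunk, once per scheduled flush.
class WriteBuncher {
  // sink: function that receives one combined Buffer per flush.
  // schedule: function that defers a callback. setImmediate is an
  // assumption of this sketch; it runs the flush after the current
  // callback has returned.
  constructor(sink, schedule = setImmediate) {
    this.sink = sink;
    this.schedule = schedule;
    this.pending = [];
    this.scheduled = false;
  }

  write(chunk) {
    // Normalise strings to Buffers so they can be concatenated.
    this.pending.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
    if (!this.scheduled) {
      // Only the first write in a batch schedules a flush.
      this.scheduled = true;
      this.schedule(() => this.flush());
    }
  }

  flush() {
    this.scheduled = false;
    if (this.pending.length === 0) return;
    const chunks = this.pending;
    this.pending = [];
    // One call to the sink for everything written since the last flush.
    this.sink(Buffer.concat(chunks));
  }
}
```

With a sink such as (buf) => socket.write(buf), writing an HTTP header
and then a small body in the same callback would reach the socket as a
single write rather than two.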

Another problem is dumping large strings to sockets. It currently
requires a copy from V8's heap to a Buffer, and then to the kernel.
The benchmarks for this are rather bad. While refactoring the system
to support writev(2), we will attempt to pull string data directly
from V8. A prototype patch for this has been built:
http://gist.github.com/614169 but it appears that to dump multiple
strings at a time, we will need additional tooling in the form of a GC
compact lock. It is not yet clear how difficult this is.

== Long Stack Traces

A more academic problem: in Node's single-threaded event loop you're
always trashing the call stack. As a result, exceptions are often
thrown in the middle of nowhere, without any reference to how
execution arrived there. These slides describe the problem in more
detail: http://nodejs.org/illuminati0.pdf

A prototype implementation has been done at
http://github.com/ry/node/tree/eventsource but it is currently very
far behind master. The feature needs the ability to be instrumented
dynamically, because keeping references (let alone stack traces) at
each Event significantly impacts performance.
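
To make the idea concrete, here is a minimal userland sketch of the
technique. It is not the code from the eventsource branch: the
withLongTrace helper and its enabled flag are hypothetical, and the
flag only hints at the "instrumented dynamically" requirement.

```javascript
// Hypothetical sketch: remember where a callback was registered and
// attach that stack to any error the callback later throws.
function withLongTrace(callback, enabled = true) {
  // When tracing is off, return the callback untouched so that no
  // Error object (and no stack capture) is paid for.
  if (!enabled) return callback;

  // Captured now, while the registering code is still on the stack.
  const origin = new Error('callback registered');

  return function (...args) {
    try {
      return callback.apply(this, args);
    } catch (err) {
      if (err instanceof Error) {
        // Append the registration-time stack to the failure stack.
        err.stack += '\n--- scheduled from ---\n' + origin.stack;
      }
      throw err;
    }
  };
}
```

A caller would wrap a handler at registration time, for example
emitter.on('data', withLongTrace(handler)), where emitter and handler
are placeholder names.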

== Website and Documentation

Rob Righter and his team at Medium have developed a great new website
for Node, which I intend to merge soon. Before that can be done, the
documentation should be rethought. The single man page structure is
becoming unwieldy. Micheil Smith has done some work to separate the
docs into multiple files, and those changes should probably be rebased
and landed.

== TLS and Crypto

I have long neglected support for the TLS system. This needs to change
in the coming weeks. The work can easily be divided into two parts:
TLS and Crypto bindings. The less important part is cleaning up code
for Cipher, Hmac, Hash, Sign, etc. in src/node_crypto.cc. Much of this
was written before Buffer, and has only recently been made to use
Buffers. The code is not DRY at all, even for C++. Those bindings have
many complicated allocations and it is not unlikely that they contain
memory leaks.

The more important part is making SecureStream work properly. There is
currently an object called SecureStream which is included as a
property in the net.Stream objects (this.secureStream). When a socket
is set to use TLS, it is with great effort and lots of looping that
data is churned through this property before going to the socket or
hitting the user. Ideally this concept could be separated from
net.Stream entirely (indeed - it may be necessary with the "write
bunching" changes) and become a proper subclass of stream.Stream (that
is, implement the Stream interface as described by Mikeal Rogers:
http://gist.github.com/597812).
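
As a rough illustration of what "a proper subclass of stream.Stream"
could look like, the sketch below builds a standalone stream that can
be piped in front of a socket. It rests on several assumptions: it
uses the stream.Transform base class from later Node releases, it
covers one direction only, and the XOR step is a placeholder with no
security value where real TLS record processing would go.
ToySecureStream and xorBytes are hypothetical names.

```javascript
const { Transform } = require('stream');

// Placeholder for real record processing: XOR every byte with one
// key byte. This is NOT encryption in any meaningful sense.
function xorBytes(chunk, key) {
  const out = Buffer.alloc(chunk.length);
  for (let i = 0; i < chunk.length; i++) {
    out[i] = chunk[i] ^ key;
  }
  return out;
}

// A stream in its own right, with no reference to any socket object.
class ToySecureStream extends Transform {
  constructor(key) {
    super();
    this.key = key;
  }

  // Called once per written chunk; passes the processed bytes on to
  // the readable side.
  _transform(chunk, encoding, callback) {
    callback(null, xorBytes(chunk, this.key));
  }
}
```

Because the stream knows nothing about the socket, it could be
composed as source.pipe(new ToySecureStream(0x2a)).pipe(socket), with
source and socket standing for any readable and writable streams.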

Decoupling from OpenSSL BIOs will not happen this year; however, I
look forward to seeing Paul Querna's Selene grow.

== Performance Testing

It is imperative that performance be measured regularly and with
little noise so that we can back out any regressions. Scripts using
R/ggplot2 are my preferred method of graphing. Getting a system that
pushes up a histogram of response times for certain HTTP requests
would be quite useful.
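
The histogram step could be sketched as follows. This assumes the
response times have already been collected as an array of
milliseconds; latencyHistogram, toCsv and the CSV column names are
hypothetical, and the CSV output is only one possible way to hand the
data to an R/ggplot2 script.

```javascript
// Hypothetical sketch: bucket response times (in ms) into fixed-width
// bins and return the bins sorted by their lower bound.
function latencyHistogram(samplesMs, bucketWidthMs) {
  const counts = new Map();
  for (const ms of samplesMs) {
    // Lower bound of the bucket this sample falls into.
    const bucketStartMs = Math.floor(ms / bucketWidthMs) * bucketWidthMs;
    counts.set(bucketStartMs, (counts.get(bucketStartMs) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([bucketStartMs, count]) => ({ bucketStartMs, count }));
}

// Serialise the bins as CSV so a plotting script can read them.
function toCsv(rows) {
  const lines = rows.map((r) => `${r.bucketStartMs},${r.count}`);
  return ['bucket_start_ms,count', ...lines].join('\n');
}
```

For example, latencyHistogram([1, 2, 11, 25, 29], 10) yields three
bins starting at 0, 10 and 20 ms with counts 2, 1 and 2.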

= The Second Stable Release

By December, new development should taper off and the system should
solidify for the next stable release, which will happen before the end
of the year - hopefully including all of the work outlined above.

