Mark Miller | 11 Mar 2009 01:02

Re: [serverjs] Re: (Relational) Database interface

[+e-lang, +google-caja-discuss]

As I'm in a hurry, all my code below is untested. Apologies in advance
for any bugs.

On Tue, Mar 10, 2009 at 6:57 AM, Kris Zyp <kriszyp@...> wrote:
>> What do you see as the benefits?
>
> Being able to pause execution allows for encapsulation. The fundamental
> problem we face is that interfaces that are built for synchronous
> expectations can not utilize asynchronous actions without bending their
> interface. For example, suppose we had written:

> function inStock(product){
>    return product.numInStock > 0;
> }
> function allInStock(products){
>    products.every(function(product){
>       return inStock(product);
>    });
> }

Presumably you meant

     return products.every(function(product){

?

> And perhaps inStock is used in other places as well. But now, suppose we
> want to change the implementation of inStock to do a DB query if the
> numInStock is close to zero, to verify the number. If the DB API is
> asynchronous, we must rewrite the function to something like:
> function inStock(product, onComplete){
>    if (product.numInStock > 5){
>       onComplete(true);
>       return;
>    }
>    queryDB("select numInStock from product where id=" + product.id,
>       function(results){
>          onComplete(results.numInStock > 0);
>       });
> }

Using promises, and assuming queryDB("select numInStock from product
where id=" + product.id) returns a promise for a product, I'd write

  function inStock(product) {
    if (product.numInStock > 5) {
      return true;
    }
    var productVow = queryDB("select numInStock from product where id=" +
                             product.id);
    return Q.when(productVow, function(prod) { return prod.numInStock > 0; });
  }

I would then assume that inStock returns a promise for a boolean. (In
most ways, a real boolean can be treated as a resolved promise for a
boolean.) Since we have a set of promises for booleans, we need an
eventual equivalent of Array.prototype.every. This is exactly the
asyncAnd example from Concurrency Among Strangers. Rewriting asyncAnd
from E to Cajita-on-ES3.1 + ref_send.js:

  function asyncAnd(answerVows) {
    var result = Q.defer();
    var numLeft = answerVows.length;
    if (numLeft <= 0) { return true; }
    answerVows.forEach(function(answerVow) {
      Q.when(answerVow, function(answer) {
        if (answer) {
          --numLeft;
          if (numLeft <= 0) { result.resolve(true); }
        } else {
          result.resolve(false);
        }
      }, function(reason) { result.resolve(Q.reject(reason)); });
    });
    return result.promise;
  }

I would then write an eventual version of your allInStock as

  function allInStock(products) {
    return asyncAnd(products.map(function(prod) { return inStock(prod); }));
  }

Among the many advantages this has over the blocking style are that it
can resolve its answer to false or broken as soon as any query reports
back false or broken. It can also send out all the queries "at once",
i.e., without waiting for responses from previous queries. If the
databases are remote, this is a huge latency savings.
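
The eventual inStock, asyncAnd and allInStock above can be exercised end
to end with a small sketch. It is not ref_send: it assumes a modern
JavaScript engine and uses the built-in Promise as a stand-in for Q.defer
and Q.when. The queryDB here is a hypothetical in-memory fake that takes a
product id and answers in a later turn, and the log array is only there to
record the order in which queries are sent and answered.

```javascript
// Sketch only: the built-in Promise stands in for ref_send's Q (an assumption).
const log = [];

// Hypothetical fake database: product id -> stock count.
const stock = { 1: 0, 2: 3 };

// Hypothetical queryDB: answers in a later turn, never synchronously.
function queryDB(id) {
  log.push("sent " + id);
  return new Promise(function (resolve) {
    setTimeout(function () {
      log.push("answered " + id);
      resolve({ numInStock: stock[id] });
    }, id * 10);
  });
}

// Returns either a plain boolean or a promise for one.
function inStock(product) {
  if (product.numInStock > 5) { return true; }
  return queryDB(product.id).then(function (prod) {
    return prod.numInStock > 0;
  });
}

// Eventual "every": false or broken as soon as any answer is false or broken.
function asyncAnd(answerVows) {
  return new Promise(function (resolve, reject) {
    let numLeft = answerVows.length;
    if (numLeft <= 0) { resolve(true); return; }
    answerVows.forEach(function (answerVow) {
      // Promise.resolve lets plain booleans and promises be treated alike.
      Promise.resolve(answerVow).then(function (answer) {
        if (!answer) { resolve(false); return; }
        numLeft -= 1;
        if (numLeft <= 0) { resolve(true); }
      }, reject);
    });
  });
}

function allInStock(products) {
  return asyncAnd(products.map(function (prod) { return inStock(prod); }));
}

const products = [
  { id: 1, numInStock: 1 },   // cached count is low, so the DB is asked
  { id: 2, numInStock: 2 },   // likewise
  { id: 3, numInStock: 50 }   // cached count is high, no query needed
];
const verdict = allInStock(products);
```

In this sketch both queries are sent before either is answered, and
verdict settles to false as soon as product 1 reports zero stock, without
waiting for product 2's answer.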

> And now we are stuck rewriting all the functions that utilize "inStock".
> If some of these functions can't be rewritten, then we are simply out of
> luck.

If your inStock would allow other turns to proceed in its event loop
while blocked, then it is a bug, not a feature, that your callers do
not need to be modified. Their stateful assumptions will no longer be
valid.
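
A minimal sketch of the hazard, assuming a modern JavaScript engine: await
is used here as a stand-in for a wait that lets other turns run. The names
cart, checkout and addItem are hypothetical. Note that await at least marks
the pause in the caller's source; a blocking wait that serves other events
would pause at the same point with no marker at all.

```javascript
// Hypothetical shared state touched by more than one turn.
const cart = { items: ["widget"], total: 10 };

// Stand-in for an asynchronous stock check; resolves in a later turn.
function inStock(product) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(true); }, 0);
  });
}

// Written as if nothing else could run between reading and using the total.
async function checkout() {
  const totalBefore = cart.total;   // stateful assumption starts here
  await inStock(cart.items[0]);     // other turns may run during this pause
  return { charged: totalBefore, actual: cart.total };
}

// Another event that arrives while checkout is paused.
function addItem() {
  cart.items.push("gadget");
  cart.total += 5;
}

const receipt = checkout();
addItem();
```

Here receipt settles to { charged: 10, actual: 15 }: the total that
checkout read before pausing no longer matches the cart when it resumes.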

> This is also illustrated by the situation with ServerJS's
> (apparently from what I can tell) favored web interface API being
> synchronous and possibly having a database API that is async. Without
> any way to use the async API in the sync interface, the async API is
> useless.
>
> Of course, one can retort that we should make all APIs async then. But
> this is absurd, making programs ridiculously complex. Iterations and
> branch logic become terribly messy, and in reality probably less than 5%
> of functions really ever need any async handling. But the trick is
> predicting which ones those will be. The issues with other events being
> executed while a function is paused pale in comparison to this mess.
> This is a constant issue we face with building real web applications,
> and I have no interest in passing this on to the server.
>
>>
>>> But getting back to the original point, are you suggesting that there is
>>> a way that an async database API could somehow be used in a
>>> handle-request-by-function-call (where the response is sent when the
>>> function returns) web request dispatcher API? Or are you arguing that a
>>> call-to-send-response API with async callbacks throughout code isn't
>>> that bad?
>>>
>>
>> The latter. Communicating event loops with promises and "when" is much
>> more pleasant to use than communicating event loops by themselves.
>> When event loops are only asynchronously coupled, they can also be
>> pleasantly distributed without changing the computational model.
>>
>> However, sometimes, when you know the IO is local and prompt, it is
>> sufficiently more convenient to express the local IO as synchronous
>> that the temptation cannot be resisted. E itself provides for synchronous
>> file IO for this reason. If you do provide such synchronous DB access,
>> I urge you to block the caller's event loop as a whole (as Erlang does
>> during a receive operation) rather than allow it to serve other events
>> while the caller is blocked. This still avoids creating recursive
>> reentry surprises, but is less safe against deadlock.
>>
>> In short, my preference order, from least to most error prone, is
>>
>> * Communicating Event Loops with promises and "when" (E, Waterken, JS
>> with ref_send).
>> * Communicating Event Loops with manual callback registration (current
>> JS practice, Tcl, libasync)
>> * Share-nothing concurrency with blocking receive (Erlang style)
>> * Cooperative coroutining (Narrative JS, Smalltalk (almost))
>> * Preemptive shared memory threading (Java)
>>
>> By my criteria, current JS practice is almost as good as it gets. I
>> hope at least we don't make things worse.
>>
>>
> I certainly agree with most of these points. Utilizing promises is far
> superior to callback style APIs. We use promises extensively in Dojo, to
> the point where callback style APIs look like ugly warts in need of
> deprecation.

Cool. Where should I start to read about promises in Dojo?

> Preemptive shared memory threading should not even be
> considered. The main thing I want is a "wait" method on the promises
> that will block execution until the promise is fulfilled. Of course this
> necessitates executing other events in the event queue, so that there
> will be some code that can fulfill the promise.

I am skeptical of this. How would you determine which events are in
which category? Another possibility is to only wait on a remote
promise -- a promise whose resolver is held by another event loop.
Then you could block your event loop as a whole while waiting for that
promise to resolve. This would give you Erlang style concurrency,
safely rescuing your sequential code from being rewritten. You may
even occasionally still avoid deadlocking ;).

> I would certainly be
> willing to agree that there are a class of events that should not be
> dispatched from the event queue while a promise is waiting. In
> Persevere, we have an event loop mechanism. All requests from a given
> client are executed in sequence; there is basically only one type of
> event that enters the event queue, an HTTP request. If Persevere had
> promises with a "wait" method, I would absolutely block any further HTTP
> requests (for the client) while a prior request handler was paused. But
> once again, DB and other IO event handlers would need to be able to
> execute to fulfill promises and resume the "main" HTTP request handling
> execution. (It should also be noted that Persevere uses extensive
> concurrency for handling requests from all the different clients with
> other threads. However, these threads do not share any mutable transient
> data (at the JavaScript level), so it is basically a shared nothing
> approach. But concurrent handling of requests is crucial for scalable
> web servers).

If we think of these IO handlers as conceptually being isolated in
distinct event loops, then perhaps we are agreeing.

> For the ServerJS group, I think we can agree on these recommendations:
> * ServerJS will not define any API for creating preemptive shared
> transient memory threads.

+1

> * A promise API should be created with at least methods for registering
> fulfillment callback, failure callback, and fulfilling the promise.

+1. I propose the Waterken ref_send API for this, but let's also have
a look at Dojo's.
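
As a rough illustration of the three operations in that recommendation
(register a fulfillment callback, register a failure callback, fulfill the
promise), here is a sketch on top of the built-in Promise of a modern
JavaScript engine. The defer helper is hypothetical; it is neither the
ref_send API nor Dojo's.

```javascript
// Sketch of a deferred: a promise plus the means to settle it.
// defer() is a hypothetical helper built on the built-in Promise.
function defer() {
  let resolve, reject;
  const promise = new Promise(function (res, rej) {
    resolve = res;
    reject = rej;
  });
  return { promise: promise, resolve: resolve, reject: reject };
}

const seen = [];
const d = defer();

// Register a fulfillment callback and a failure callback.
d.promise.then(
  function (value) { seen.push("fulfilled: " + value); },
  function (reason) { seen.push("failed: " + reason); }
);

// Fulfill the promise; the callbacks run in a later turn, not synchronously.
d.resolve(42);
seen.push("resolve returned");
```

Because callbacks are always deferred to a later turn, "resolve returned"
is recorded before "fulfilled: 42", so the code that resolves a promise is
never reentered by the code that listens to it.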

> * Asynchronous libraries that have a single fulfillment to an action
> should utilize the promise API rather than having a callback parameter.

+1
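
A sketch of what that recommendation could look like for a library author,
again assuming the built-in Promise of a modern JavaScript engine rather
than any particular promise library. Both queryDBWithCallback and its
behaviour are hypothetical stand-ins for a callback-style DB driver.

```javascript
// Hypothetical callback-style library call, standing in for a DB driver.
function queryDBWithCallback(sql, onComplete, onError) {
  setTimeout(function () {
    if (sql.indexOf("select") !== 0) {
      onError(new Error("unsupported statement"));
    } else {
      onComplete({ numInStock: 3 });
    }
  }, 0);
}

// The same operation exposed through a promise instead of callback parameters.
function queryDB(sql) {
  return new Promise(function (resolve, reject) {
    queryDBWithCallback(sql, resolve, reject);
  });
}

const good = queryDB("select numInStock from product where id=1");
const bad = queryDB("drop table product").then(
  function () { return "unexpected success"; },
  function (err) { return "failed: " + err.message; }
);
```

The caller now receives a single value it can pass around, combine with
other promises, or hand to something like asyncAnd, and failure travels
along the same path as success.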

> And then my desire is that the promise API also have a "wait" method
> that will wait until the promise is fulfilled, and then return the value
> that fulfilled the promise (or throws the error that caused the promise
> to fail). Presumably, if nothing else, I would be allowed to add wait as
> an extension point to the standard API. But, IMO, it is simply untenable
> to have async libraries that can not be utilized within sync functions.

If this preserves Erlang-style safety, as we may be agreeing on above,
I have no objection. Otherwise, the introduction of your wait
operation has all the problems of
<http://www.felocity.org/blog/article/javascripts_strange_threaded_nature>
-- disrupting the stateful assumptions of every other piece of code
that might call something that .. that might call your wait().

> Another approach to this problem (rather than adding a "wait" method)
> would be to have a new language-level async call operator that could
> handle returned promises by pausing the current function and returning a
> promise to its caller.

Since Q.when returns a promise for what its body will return, you can
accomplish essentially the same thing by putting the rest of the
current function into the body of a "return Q.when(...)".

> This is of course an aside since it can't be
> addressed by ServerJS, but more musings of possible discussions on other
> mailing lists. But I believe an async call operator would appease your
> concerns about functions being paused for event loop processing that do
> not expect any events to be processed while they are being executed,
> since functions that do not use the async call operator would never be
> paused. For example, if we wrote:
> function(){
>    // an async call to foo; foo may be a normal sync function,
>    // or it may return a promise
>    var a = foo->();
>    return a + 6;
> }
> This would be sugar for:
> function(){
>    var a = foo();
>    if (a instanceof Promise){
>       var $1 = new Promise();
>       a.when(function($2){
>          $1.resolve($2 + 6);
>       });
>       return $1;
>    }
>    return a + 6;
> }

Since your unsugared form calls foo() directly, I'll assume below that
is correct, and that foo() may return a promise:

  function() {
    return Q.when(foo(), function(a) { return a+6; });
  }

If that assumption is not correct and you intend to call foo
asynchronously, replace "foo()" with "Q.send(foo, 'run', [])". In
either case, it is not clear that we need any sugar.
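
To make the "no sugar needed" point concrete, here is a sketch that runs
on a modern JavaScript engine. The when helper is hypothetical, written in
the spirit of Q.when on top of the built-in Promise; fooSync, fooEventual
and addSix are likewise made up for illustration.

```javascript
// when(value, callback): hypothetical helper in the spirit of Q.when,
// built on the built-in Promise rather than ref_send.
function when(value, callback) {
  return Promise.resolve(value).then(callback);
}

// A foo that answers immediately.
function fooSync() { return 1; }

// A foo that answers in a later turn.
function fooEventual() {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(1); }, 0);
  });
}

// The same body works whether foo returns a value or a promise for one.
function addSix(foo) {
  return when(foo(), function (a) { return a + 6; });
}

const fromSync = addSix(fooSync);
const fromEventual = addSix(fooEventual);
```

Both fromSync and fromEventual are promises that settle to 7. One
difference from the unsugared form quoted above: this sketch always
returns a promise, even when foo is synchronous, so callers see a single
uniform return type.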

> I don't know if this would still be classified as sugar since the
> capabilities required to handle looping and branches are more in the
> realm of generator's semantics. This is semantically similar to Neil
> Mix's "JS 1.7 threading library" (that Wes referenced), but with
> sufficient capabilities to actually be useful. Of course this still
> violates the principle of encapsulation that I argue is important, but
> it makes it so easy to write code that can easily defend against changes
> to asynchronous behavior, that I would gladly accept it.

Very cool. I think we're rapidly converging.

-- 
Text by me above is hereby placed in the public domain

    Cheers,
    --MarkM
