Jonas Bonér | 20 Jun 14:09 2009

Akka Actor Kernel - Re: Lift and Goat Rodeo

Hi Ray.

After seeing your excellent email I figured it would be a good time to
write up the vision I have had for the last 6-9 months and have been
hacking on for 3-4 months now. It is very much aligned with your
ideas. Here is the vision I have; some of it is already implemented,
some not. Still trying to find the time :-)

Happy for any feedback I can get.

-----
Vision: Akka Actor Kernel

* Erlang OTP-style embrace/expect failure: supervisor hierarchies where
components (active objects/actors) are restarted automatically according
to preconfigured restart strategies (see the toy sketch after this list).
* Asynchronous and non-blocking message-passing. SEDA in-a-box.
Configurable thread pools, queue semantics, etc.
* Distributed components; you can both send messages to and
link/supervise remote components.
* Distributed STM on top of the message-passing model to do
compositional message-passing flows. Possible JTA integration.
* Generic transactional Map, Vector and Ref. In-memory version and
persistent version backed by Cassandra. Other backends will be
Terracotta and Tokyo Cabinet. Will provide a persistence SPI.
* Java & Scala API. Components can plain Java or Scala objects, turned
into active objects using AspectWerkz bytecode weaving proxies. They
can also be Scala actors.
* All components or supervisor families are OSGi bundles. Possible to
run Akka as a stand-alone Apache Karaf-based server or to deploy it in
another OSGi kernel.
* JMX management and monitoring of components, queue depths, thread
pools, etc.
* Hooks transparently into Spring and Guice.
* Hooks into Apache Camel, e.g. allowing components to function as Camel
endpoints and wiring up component interactions using Camel.
* REST layer. Exposes the components as REST services.
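
To make the supervision item a bit more concrete, here is a toy Scala
sketch of the restart-on-failure shape. It is illustrative only, not the
actual API; all names are made up:

// Toy sketch only: illustrates "components restarted automatically
// according to a preconfigured restart strategy".
sealed trait RestartStrategy
case class UpToNRestarts(max: Int) extends RestartStrategy

trait Component {
  def init(): Unit            // (re)build the component's state
  def handle(msg: Any): Unit  // process one message
}

class Supervisor(component: Component, strategy: RestartStrategy) {
  component.init()

  def dispatch(msg: Any): Unit = strategy match {
    case UpToNRestarts(max) => dispatchWithRestarts(msg, max)
  }

  private def dispatchWithRestarts(msg: Any, restartsLeft: Int): Unit =
    try component.handle(msg)
    catch {
      case _: Exception if restartsLeft > 0 =>
        component.init()                    // restart: discard broken state
        dispatchWithRestarts(msg, restartsLeft - 1)
    }
}

The real thing is of course asynchronous and actor-based, with supervisors
linked into hierarchies and richer restart strategies; the sketch only
shows the "restart the component instead of letting the failure escape"
idea.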

/Jonas

2009/6/19 Ray Racine <ray.racine@gmail.com>:
> Great Manifesto.
>
> Skewing the discussion towards the enterprise business-needs perspective
> (slightly different from the social-network-application perspective).
>
> A System Of Record is a system designed to maintain data. Think RDB, form
> frameworks, tables, data integrity, ACID, O/R frameworks, etc. J2EE and
> similar frameworks are geared toward System Of Record applications. Set up
> a SKU, coupon, contract, price list, campaign. Data maintenance.
>
> It's not too far of a stretch to say that commercial vendors and most
> frameworks are very System Of Record oriented. Need to build a new app to
> maintain X? Install an RDB and an app server, select an O/R mapping and a
> GUI form framework, place warm bodies in front of drag-and-drop IDEs and
> go for it.
>
> A System Of Service must service thousands of requests per second in
> milliseconds, 24/7/365, with 99.99% reliability, e.g. a pricing service.
> The business logic has to run in microseconds. Your favorite O/R mapping
> framework hasn't even initiated a JDBC call, heck, hasn't even allocated a
> connection from the pool, and it has already exhausted its 1 ms of
> allotted time.
>
> An item may undergo a few tens of price changes a year on the System Of
> Record, yet that item's price may be served 100,000 times for each change
> on the System Of Service.
>
> Systems Of Record deal with lots of meta-data associated with maintaining
> the core data. I might need only 30-50 data elements to determine a price;
> however, the System Of Record has hundreds of data elements in dozens of
> tables for data associated with SOX, security, versioning, authentication,
> approvals, etc.
>
> Data on a System Of Record should be in Boyce-Codd normal form. Data on a
> System Of Service should be structured in whatever way is necessary to
> achieve the workload; think denormalized data structured primarily for the
> update workload and secondarily for the read-only workload.
>
> There are lots of options for building a System Of Record.  Only the
> Amazons, Facebooks, LinkedIns, and Googles have solutions for building
> Systems Of Service.  Your average Joe-Sixpack enterprise has few options out
> there to build Systems Of Service.  Enterprises need to bring their mashable
> corporate API out onto the internet.  To offer that API they need a System
> Of Service to implement it. But there are no off-the-shelf solutions.
>
> Systems are rarely (though it would be nice) both the System Of Record and
> the System Of Service. Let's define what a System Of Service looks like.
>
> 1. A System Of Service shall run on a commodity box cluster.
> 2. A System Of Service shall support "hot" code changes (business logic).
> 3. A System Of Service shall be "consistent" in its answer.
> 4. A System Of Service shall be capable of incremental, near perfect
> horizontal scale out.
> 5. A System Of Service shall support failure.
> 6. A System Of Service shall support maximum performance via local resident
> data.
> 7. A System Of Service is session stateless.
> 8. A System Of Service is fed the data necessary to perform its function
> from a System Of Record.
> 9. Any server member of a System Of Service shall handle an update request
> from a System Of Record.
> 10. A Client of, or a System Of Record for, a System Of Service shall not
> observe a distinguished member of the service.  Any server shall be able to
> handle any request.
>
> Clusters
>
> Item #1 is just a given in today's world. Big mid-range boxes just don't
> make sense. The amount of pure horsepower available on commodity 64-bit
> Intel servers boggles the mind.
>
> No Down Time, Micro Deployment And Provisioning
>
> Current J2EE application servers are HUGE, one-size-fits-all, monolithic
> entities. What is needed is a small framework capable of adding
> functionality incrementally, as needed by the application. The next
> generation application server will be a small, robust, OSGi framework
> server, configured to meet the needs of the application. Need JPA, ESB,
> BPEL, Messaging, Batch processing, Transactions, Paxos, Key Storage,
> Servlet, COMET, HTTP, RestLet, SOAP, XMLRPC, EDI, etc.? Just select the
> needed services for installation in the OSGi framework and create a
> customized application server for the specific requirements of the
> application.
>
> If done correctly, one can micro hot-deploy new versions or releases of
> the various modules, including your own OSGi-modularized business logic.
>
> Consistent, Scalable, Robust
>
> Items #3, #4, #5, #6 and #7 are really the key issues.
>
> Computation is relatively easy to scale out.  More boxes, more instances of
> executing code, even stateful applications aren't too bad with simple server
> affinity capabilities.  Data scale out is the problem, specifically data
> mutation.  By definition a System Of Service is primarily a service that
> operates upon mostly read-only data.  The service may serve 10,000 prices
> for every one price change, but price changes must be supported, and there
> has to be consensus within the cluster on state changes (data mutations).
>
> Item #5 means data must reside in multiple locations. This is satisfied by
> a Dynamo/Cassandra/Voldemort KV storage system, but item #6 is stronger: it
> requires all data to be co-located on all servers. Items #3 and #7 state
> that a client may be serviced by any arbitrary member of the cluster and
> receives a consistent answer. But item #8 says state (data) is being
> mutated by an external agent.
>
> One way to achieve the above set of constraints is to treat the entire
> cluster as a state machine. The cluster is in some state S and transitions
> to a new state S' when state is updated. If EACH member of the cluster
> applies the same globally ordered transformations, then each server will
> provide the same consistent answer, modulo latency.
>
> This is the consensus problem. One solution to the consensus problem is
> the Paxos algorithm. Zookeeper uses a version of Paxos to achieve
> consistent binding of hierarchical Key-Values across a cluster. See "Paxos
> For System Builders"; I am pretty darn sure it is the original paper used
> by the Yahoo team that originally implemented Zookeeper. Zookeeper is Paxos
> without the ability to define Listeners.
>
> If the cluster reaches consensus on which "command" to execute next and then
> each server in the cluster executes said command, the cluster acts as a
> single monolithic state machine.
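>
> To sketch what I mean (toy Scala; the global ordering is assumed to come
> from the consensus layer and is elided here):
>
> // Toy sketch of the cluster-as-state-machine idea. Every node holds the
> // same in-memory table and mutates it only by applying the same globally
> // ordered commands; reads are answered from local memory.
> sealed trait Command
> case class SetPrice(sku: String, price: BigDecimal) extends Command
> case class RemoveSku(sku: String) extends Command
>
> class PriceStateMachine {
>   private var prices = Map.empty[String, BigDecimal]  // state S
>   private var lastApplied = 0L                         // last command index
>
>   // Deterministic transition S -> S' for one globally ordered command.
>   def applyCommand(index: Long, cmd: Command): Unit = {
>     require(index == lastApplied + 1, "commands must be applied in order")
>     cmd match {
>       case SetPrice(sku, price) => prices = prices + (sku -> price)
>       case RemoveSku(sku)       => prices = prices - sku
>     }
>     lastApplied = index
>   }
>
>   // Any node that has applied commands 1..N gives the same answer,
>   // with no cross-network fetch on the read path.
>   def priceOf(sku: String): Option[BigDecimal] = prices.get(sku)
> }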
>
> It's safe, in the sense that if consensus cannot be reached the system
> "fails" in its current state; i.e., it will continue to serve prices but
> will not process price changes. In the face of failure, a cluster will make
> progress if a majority of nodes can reach consensus. A failed node will
> reconcile and resync its state machine with the cluster upon rejoining.
>
> 2PC, 3PC and e3PC transactional systems are degenerate (somewhat
> simplified) versions of Paxos.
>
> Dynamo-like KV storage systems are substantial improvements over RDBs for
> Systems Of Service. However, one still has to "fetch" the data for each
> request (it may have just changed). Depending on the performance needs of a
> System Of Service, _any_ cross-network data fetch is too slow. Therefore
> data must be cached, staleness must be dealt with, and complexity explodes.
>
> At this point one just says: let's colocate (cache) all the data necessary
> to execute the service on each server and be done with it. And why not? A
> 16 or even 32 gig server is nothing out of the ordinary these days and is
> quite capable of holding the equivalent of hundreds of millions of rows of
> relational data in memory. This raises the consistency question, answerable
> via distributed cluster node consensus. Paxos.
>
> Under the System Of Service model, Dynamo-like KV storage systems serve as
> a reliable drop-off zone for data from Systems Of Record and for state
> checkpoints. These data quanta can be as simple as JSON/REST-oriented data
> updates. (See
> http://project-voldemort.com/blog/2009/06/building-a-1-tb-data-cycle-at-linkedin-with-hadoop-and-project-voldemort/
> for a similar approach.) A failed node, or a node joining a System Of
> Service, must roll forward from the last checkpoint, executing each
> globally ordered command.
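>
> Roughly, the recovery path for such a node looks like this (toy Scala; the
> checkpoint and the command log are passed in as plain values, standing in
> for the KV store and the consensus log):
>
> // Toy sketch: a failed or newly joining node restores the last checkpoint
> // pulled from the KV store and then rolls forward by replaying every
> // globally ordered command recorded after that checkpoint.
> case class PriceUpdate(sku: String, price: BigDecimal)
> case class Checkpoint(lastAppliedIndex: Long,
>                       prices: Map[String, BigDecimal])
>
> object Recovery {
>   def rollForward(
>       checkpoint: Checkpoint,                        // from the KV store
>       commandsAfter: Seq[(Long, PriceUpdate)]        // from the global log
>   ): (Long, Map[String, BigDecimal]) = {
>     var prices = checkpoint.prices
>     var lastApplied = checkpoint.lastAppliedIndex
>     for ((index, update) <- commandsAfter) {
>       require(index == lastApplied + 1, "replay must be gapless and ordered")
>       prices = prices + (update.sku -> update.price) // same transition as live nodes
>       lastApplied = index
>     }
>     (lastApplied, prices)                            // node is now caught up
>   }
> }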
>
> Ah, yes, time to get to the point of all this... It's the overlap with
> your manifesto. Zookeeper <-> Paxos for transactions. JOSH (Jason) Needs,
> Dynamo/Voldemort/Cassandra KV storage. Lots of overlap on some core
> technologies.
>
> I'll be creating a Git repo shortly with the start of a Scala-based
> implementation of "Paxos for System Builders".
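>
> Just to give a feel for the shape of it, the single-decree message
> vocabulary might be modeled something like this (a sketch only, not the
> contents of that repo):
>
> // Rough sketch of single-decree Paxos message types. Ballots must be
> // totally ordered and unique per proposer.
> case class Ballot(round: Long, proposerId: Int)
>
> sealed trait PaxosMessage
> // Phase 1: a proposer asks acceptors to promise to ignore lower ballots.
> case class Prepare(ballot: Ballot) extends PaxosMessage
> case class Promise(ballot: Ballot,
>                    accepted: Option[(Ballot, Any)]) extends PaxosMessage
> // Phase 2: the proposer asks acceptors to accept a value under the ballot.
> case class Accept(ballot: Ballot, value: Any) extends PaxosMessage
> case class Accepted(ballot: Ballot, value: Any) extends PaxosMessage
> // Once a quorum of Accepted messages is observed, learners learn the value.
> case class Learn(value: Any) extends PaxosMessage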
>
> Dave, I think you are on the right path, if for no other reason than that
> I've observed similar trends and reached similar conclusions. :)
>
> A System Of Service app server is the next JBoss, the analogue of what J2EE
> is to Systems Of Record applications today.
>
> Ray
>
>
> On Thu, Jun 18, 2009 at 3:19 AM, David Pollak
> <feeder.of.the.bears@gmail.com> wrote:
>>
>> Folks,
>>
>> At the end of the Scala Lift Off, after I finished my third beer, Martin
>> Odersky came over to me and asked, "so, what's the future of Lift?"
>>
>> I gave a hand-waving answer about the features for 1.1.  But Martin is not
>> a hand-waving kind of guy and I think I owe him and the other folks in the
>> Scala and Lift communities more.
>>
>> There's a lot more that's necessary for web app development than Lift, an
>> abstraction over the HTTP request/response cycle, can provide.
>>
>> Over the last couple of years, I've been noticing trends in web
>> development, in the needs of my various consulting gigs, and in some other
>> projects.  It's clear to me that it's time for a unified data and data
>> management model that goes beyond OR mapping and that is scalably
>> transactional.  I've put together a model that looks to the developer like
>> STM but is backed with ZooKeeper and Cassandra.  I've blogged about it at
>> http://blog.lostlake.org/index.php?/archives/94-Lift,-Goat-Rodeo-and-Such.html
>>
>> Just as my web framework manifesto was the genesis of what has become
>> Lift, I hope that my notions and ramblings in this blog post will become
>> concrete, usable code over the next few months and a solid platform for
>> building the next generation of web systems over the next few years... all
>> built with Scala at their core.
>>
>> Thanks,
>>
>> David
>>
>> --
>> Lift, the simply functional web framework http://liftweb.net
>> Beginning Scala http://www.apress.com/book/view/1430219890
>> Follow me: http://twitter.com/dpp
>> Git some: http://github.com/dpp
>
>

-- 
Jonas Bonér

twitter: @jboner
blog:    http://jonasboner.com
work:   http://crisp.se
work:   http://scalablesolutions.se
code:   http://github.com/jboner

