Subject: Parallel Haskell Digest 5 Newsgroups: gmane.comp.lang.haskell.cafe Date: Wednesday 31st August 2011 07:39:42 UTC (over 6 years ago) Parallel Haskell Digest ======================= Edition 5 2011-08-31 Hello Haskellers! Eric here, reprising my role as Parallel Haskell Digester. Many thanks to Nick for good stewardship of the digest and nice new directions for the future. This month we have Erlang PhDs (and a postdoc), new partners, two Monad Reader articles in progress and some strategies. As usual, this digest is made possible by the [Parallel GHC Project][n0]. News ---------------------------------------------------------------------- [Scalable Erlang PostDoc and 2 PhD Studentships][n1] (8 Aug) What, Erlang?! Yes, you are reading the right digest. If you're generally into parallelism and concurrency in functional programming languages, you may be especially interested to know about this announcement from Phil Trinder. > RELEASE project - A High-level Paradigm for Reliable Large-scale > Server Software - is funded by the EU Framework 7 programme for 36 > months from October 2011. Its aim is to scale the radical > concurrency-oriented programming paradigm to build reliable > general-purpose software, such as server-based systems, on massively > parallel machines (100 000 cores). Word of the Month ---------------------------------------------------------------------- Last month, we had `par` and `pseq` as our twin words of the month. Let's pursue this little train of parallel thought; our next word the month is *strategy*. Strategies have been around since the 1993 paper [Algorithm + Strategy = Parallelism][asp]; that's before even we started using monads in Haskell! They've recently been revamped in Simon Marlow's [Seq no More][seq-nm] paper, and it's this version of strategies that we'll be exploring here. Strategies are built on top of the `par` and `pseq` primitives we saw in the [last digest][phd4]. They provide a nice way to express the often complicated logic we need to make the best use of parallelism in our code. Use of strategies can also help to make parallel code easier to read and maintain because they allow us to more cleanly separate the core logic from our code that which pertains to our use of parallelism. Before delving into strategies, let's take a small notational detour by introducing the `Eval` monad. Suppose we wanted a parallel version of the `map` function, something that would apply a function to each item of a list. Using the `par` and `pseq` from the last digest, we might express this function parMap :: (a -> b) -> [a] -> [b] parMap f [] = [] parMap f (a:as) = b `par` bs `pseq` (b:bs) where b = f a bs = parMap f as If we look carefully at the code we can observe that there is something inherently sequential in the way we have expressed this parallel computation: first spark off `f a` then recurse to the tail of the list, and finally cons. The `Eval` monad builds off the insight that expressing parallelism is fundamentally (perhaps counterintuitively) about ordering things. Monads are well-suited for expressing ordering relationships, and so they have been been pressed to work for expressing parallel computation as well. data Eval a instance Monad Eval runEval :: Eval a -> a rpar :: a -> Eval a rseq :: a -> Eval a `Eval` is just a strict identity monad, with `rpar` and `rseq` as counterparts to `par` and `pseq`. We use `Eval` to compose sequences of parallel computation, which we extract the results of by using the `runEval` function. If you're not familiar with monads, you can get away with just treating `Eval` as new notation, or rather, borrowed notation, the same that we use IO, parser combinator libraries, QuickCheck and a plethora of other useful monads. It's worth noting also that despite appearances, we are still in purely functional territory -- no IO here! -- with the notion of sequencing being limited to controlling parallelism and evaluation depth. To make use of `Eval` for our `parMap` function, we could write a version like the below. It introduces a change of type, from returning `[b]` to `Eval [b]`. In the general case, we could just use the `runEval` function to get our result back, but we are not baking it into `parMap` because we would typically want to use then function within a greater `Eval` context anyway. parMap :: (a -> b) -> [a] -> Eval [b] parMap f [] = return [] parMap f (a:as) = do b <- rpar (f a) bs <- parMap f as return (b:bs) As before, this function captures the basic idea of its sequential counterpart `map`: apply function, recurse to tail, cons new head to new tail. This is a passable parallel map, but there are still two things which are unfortunate about it. First, we have repeated the implementation of map, not a big deal for such a small function but a potential maintenance problem for more complex code. Second, we have only captured one sort of parallel evaluation firing off sparks for all the cons cells, but in practice getting parallelism right requires some often careful tuning to get the right level of granularity and account for dependencies. For instance, what if instead of doing each cell in parallel, we wanted to bunch them up into chunks so as to minimise the coordination overhead? What we need is a way to express ideas about running code in parallel (such as “break into chunks and run each chunk simultaneously"), preferably avoid duplicating existing code or otherwise making it more complicated than it needs to be. This is where strategies come in. A strategy is just a function that turns some data into an `Eval` sequence. We've already encountered two strategies above, `rpar` which sparks off a parallel evaluation, and `rseq` which evaluates its argument to weak head normal form. Many more strategies possible particularly when they are custom designed for specific data types. Also, if we have a strategy, we can imagine wanting to use it. This we capture with `using`, which applies the strategy function and runs the resulting sequence. Parallel evaluation aside, you can almost pretty much think of |
|||