Tony Morris | 12 Sep 13:53 2011

Re: Dependency Injection

On 09/12/2011 09:38 PM, Billy wrote:
> A debate started on the scala-language group that I am moving to this
> group as it is more fitting here. The matter of DI was discussed, and
> I am actively porting the DI portions of Spring to scala to answer a
> question of practicality. In this effort, the resulting solution
> (which I have dubbed "Recoil") may end up not looking anything like
> the original framework. The requirements that spawned this are a
> matter of practicality around updating large solutions that have
> already been put into production use. My goal is to see if an object-
> functional framework can be devised that allows for zero updates to
> existing solutions so as to maintain a single codebase within a global
> scale enterprise. It is hoped that such a solution will allow for
> domain specific rules to be injected in order to meet the needs of
> multiple different domains without apriori knowledge of said rules.
> Billy

[moving to scala-debate per agreed request]

I don't recall your constraints so cannot answer your question.

Someone (I forget who) recently wrote (I forget where): "DI is just
socially acceptable global variables." This is mostly true -- I say
mostly because I think the adjective "global" is redundant and
misleading. That is to say, there is no such thing as a global variable.
All variables are scoped to some context and it is the extent of this
context that is a measure of detriment. This is why you hear people
talking about "keeping their side-effects local." This wishful thinking
almost never eventuates because side-effects are pervasive. I am
side-tracking here, so let me go back to the original topic.

In the absence of my awareness of your constraints, I can point out what
it is that most people want when they think they want DI, and in fact,
do not, ever (it is one of many forms of masochism in programming --
bear with me).

First, let us consider a general Scala program and generalise it. This
is just an arbitrary program -- I am trying to make it as convoluted as
possible so that you can go back to a real program and apply the same
reasoning. Importantly, this program is side-effect free at this point.

val a = e1
val b = e2(a)
val c = e3(a, b)
val d = e2(b)

OK, now I am going to generalise it by running the same program, but in
a for-comprehension. We do this by following these rules:
1) Remove the 'val' keyword
2) The = symbol becomes <-
3) We wrap the program in for and yield

I am going to create a data type that simply wraps a value and provides
flatMap and map methods so I can do this:

case class Id[A](i: A) {
  def map[B](f: A => B): Id[B] = Id(f(i))
  def flatMap[B](f: A => Id[B]): Id[B] = f(i)
}

...and since I don't want to explicitly wrap/unwrap my values with Id, I
am going to provide an implicit for in and out:

object Id {
  // Bring these into scope with `import Id._` where the for-comprehension is written.
  implicit def IdIn[A](a: A): Id[A] = Id(a)
  implicit def IdOut[A](a: Id[A]): A = a.i
}

OK, so now let's translate our program:

for {
  a <- e1
  b <- e2(a)
  c <- e3(a, b)
  d <- e2(b)
} yield d
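
To make the translation concrete, here is a minimal sketch with made-up stand-ins for e1, e2 and e3 (the integer arithmetic is purely an assumption for illustration). To keep the sketch unambiguous, the stand-ins return Id explicitly rather than relying on the implicit conversions:

```scala
object IdExample {
  // Same shape as the Id wrapper described above.
  case class Id[A](i: A) {
    def map[B](f: A => B): Id[B] = Id(f(i))
    def flatMap[B](f: A => Id[B]): Id[B] = f(i)
  }

  // Hypothetical expressions; any side-effect free definitions would do.
  def e1: Id[Int] = Id(3)
  def e2(n: Int): Id[Int] = Id(n * 2)
  def e3(x: Int, y: Int): Id[Int] = Id(x + y)

  // The program, written in the general for/yield form.
  val program: Id[Int] =
    for {
      a <- e1
      b <- e2(a)
      c <- e3(a, b)
      d <- e2(b)
    } yield d

  // Unwrap the final value: a = 3, b = 6, c = 9, d = 12.
  val result: Int = program.i

  def main(args: Array[String]): Unit = println(result) // prints 12
}
```

The for-comprehension is just sugar for nested flatMap calls ending in a map, which is why a data type with only those two methods is enough.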

Now that you accept that any program can be written this way, let us
step away for a moment and address the idea of DI. There are usually two
variations on DI:
1) The "configuration" (or context) is done and the application must
start by first initialising this context, then the application may run.
The application then reads from the configuration during run-time but
does not modify it. If this order is altered, you end up with a broken
program. A "DI" container attempts to promise you that no such thing
will occur -- this is essentially what the selling point is.

This dependency on explicit execution order is directly antithetical to
the functional programming thesis. This is a consequence of there being
a widely-scoped variable that kind-of pretends otherwise.

If you turn your head just a little, you can see this is a somewhat
degenerate notion of what is called "uniqueness typing." I digress.

2) Same as above, however, not only is the application permitted to read
the configuration, but it is also permitted to *write* to it. This means
that the application depends on *more* explicit execution order and the
possibility of bugs increases even more.

Imagine if I said, "you know what, turn all that DI stuff off, we are
going to initialise our values up front and pass them all the way
through the application." You would surely protest, "but that is so
clumsy!" and you'd be right, but only at first glance.
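
For contrast, this is roughly what the "clumsy" version looks like: a sketch in which a hypothetical Context (its fields are invented here) is initialised up front and handed to every expression by hand:

```scala
object ExplicitPassing {
  // Hypothetical Context; the fields are assumptions for illustration only.
  case class Context(base: Int, factor: Int)

  // Every expression takes the Context as an extra argument.
  def e1(ctx: Context): Int = ctx.base
  def e2(ctx: Context, n: Int): Int = n * ctx.factor
  def e3(ctx: Context, x: Int, y: Int): Int = x + y // takes ctx but ignores it

  // The Context has to be repeated on every line of the program.
  def program(ctx: Context): Int = {
    val a = e1(ctx)
    val b = e2(ctx, a)
    val c = e3(ctx, a, b)
    val d = e2(ctx, b)
    d
  }

  val result: Int = program(Context(base = 3, factor = 2))

  def main(args: Array[String]): Unit = println(result) // prints 12
}
```

It is correct and has no variables, but the repeated ctx argument is the noise that people reach for a DI container to avoid.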

You see, there is a way to pass these values through quite neatly and
no, this is not using Scala's implicit keyword (which is insufficient),
this is something else. OK, so let's first start by thinking about case
1) above where the application only has read access to some context. I
will name this context "Context". It is a data type, defined
somewhere-or-other, that we would like to pass through our application --
with no writes to it. I'm sure you can imagine what Context would really
be -- feel free to make it up for the use-case.

So, our values that were once mere values, are now computed as if they
have access to a Context. We can denote this with a data type:

case class ComputedWithContext[A](cx: Context => A)

So we now have "first-class" values computed with a Context, rather than
being mere values. We can now create these by accessing a Context "as if
it were passed" -- that is to say, although we don't yet have a Context,
we may create values that access that Context (when it is eventually
passed) by wrapping a function (or a trait if you prefer).
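
As a small sketch of what "wrapping a function" means here, assume a made-up Context with two fields (appName and maxRetries are hypothetical). Values are described first and only computed once a Context is supplied:

```scala
object ContextValues {
  // Hypothetical Context; the real contents depend on the use-case.
  case class Context(appName: String, maxRetries: Int)

  case class ComputedWithContext[A](cx: Context => A)

  // Values written "as if" a Context had already been passed.
  val retries: ComputedWithContext[Int] =
    ComputedWithContext(c => c.maxRetries)
  val banner: ComputedWithContext[String] =
    ComputedWithContext(c => "Starting " + c.appName)

  // Nothing is computed until a Context is actually supplied.
  val ctx: Context = Context("demo", 3)
  val retriesValue: Int = retries.cx(ctx)
  val bannerValue: String = banner.cx(ctx)

  def main(args: Array[String]): Unit = {
    println(retriesValue) // prints 3
    println(bannerValue)  // prints Starting demo
  }
}
```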

This is simple and straight-forward enough. But watch this:

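case class ComputedWithContext[A](cx: Context => A) {
  // Transform the eventual result; the Context is still supplied later.
  def map[B](f: A => B): ComputedWithContext[B] =
    ComputedWithContext(f compose cx)
  // Run this computation, then run the next one against the same Context.
  def flatMap[B](f: A => ComputedWithContext[B]): ComputedWithContext[B] =
    ComputedWithContext(c => f(cx(c)).cx(c))
}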

We see here that ComputedWithContext happens to have pretty handy map
and flatMap methods. What can we do with them?

OK, so suppose our program above is a little different to the original
in that actually, our expressions (e1, e2 and e3) require a Context, so
each of them becomes ComputedWithContext[T] where previously they
were just the type T (they may be all different values for T or same --
no matter).

For example, e1 may have been an Int where now it is a
ComputedWithContext[Int] and e2 may have returned a String where now it
returns a ComputedWithContext[String]. You get the point.

Here is how our program looks:

for {
  a <- e1
  b <- e2(a)
  c <- e3(a, b)
  d <- e2(b)
} yield d

This is precisely the same program syntax. The type of this expression
is ComputedWithContext[T] where the type T depends on the value d. In
other words, we may pass a Context in to this value and it gets
"threaded" through our program and our program *doesn't change* if we
write it in this general form. We may "stack these layers" on top of
what started as Id and our program remains unaltered. The "theory" of
doing this is quite involved, mostly because it is kick-arse interesting
and we could talk about it some time, but that's another story!
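
Here is a runnable sketch of the read-only case. The Context fields and the bodies of e1, e2 and e3 are invented for illustration; only the shape of ComputedWithContext and of the for-comprehension comes from the discussion above:

```scala
object ReaderExample {
  // Hypothetical read-only Context.
  case class Context(base: Int, factor: Int)

  case class ComputedWithContext[A](cx: Context => A) {
    def map[B](f: A => B): ComputedWithContext[B] =
      ComputedWithContext(f compose cx)
    def flatMap[B](f: A => ComputedWithContext[B]): ComputedWithContext[B] =
      ComputedWithContext(c => f(cx(c)).cx(c))
  }

  // Hypothetical expressions that read from the Context when it arrives.
  def e1: ComputedWithContext[Int] =
    ComputedWithContext(c => c.base)
  def e2(n: Int): ComputedWithContext[Int] =
    ComputedWithContext(c => n * c.factor)
  def e3(x: Int, y: Int): ComputedWithContext[Int] =
    ComputedWithContext(_ => x + y) // ignores the Context

  // No Context is mentioned anywhere in the program itself.
  val program: ComputedWithContext[Int] =
    for {
      a <- e1
      b <- e2(a)
      c <- e3(a, b)
      d <- e2(b)
    } yield d

  // The Context is supplied once, at the edge: a = 3, b = 6, c = 9, d = 12.
  val result: Int = program.cx(Context(base = 3, factor = 2))

  def main(args: Array[String]): Unit = println(result) // prints 12
}
```

The same program value can be run against a different Context without touching the for-comprehension, which is the part DI containers are usually asked to do.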

Importantly, there are no variables here. Not one and not a pretend
value that is actually a variable at application time (which I'm sure
you've been reminded of more than once when using DI).

So, this is how we deal with passing read-only context through our
program:
* without being clumsy by explicitly passing it
* being quite efficient and readable in fact!
* without using variables, which lead to program bugs and difficulty
reading and debugging code

How do we deal with read and write values (case 2)? Well, we need a
different data type for that:

case class WriteWithContext[A](cx: Context => (A, Context))

Notice how this is the same data type as before except the function can
now produce a *new* Context as well as the computed value (paired). This
is to say, we may "modify" the Context as it is threaded through. But
what about map and flatMap, can we write those? Of course:

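case class WriteWithContext[A](cx: Context => (A, Context)) {
  // Transform the computed value; the resulting Context passes through as-is.
  def map[B](f: A => B): WriteWithContext[B] =
    WriteWithContext(c => { val (a, cc) = cx(c); (f(a), cc) })
  // Run this computation, then feed its resulting Context to the next one.
  def flatMap[B](f: A => WriteWithContext[B]): WriteWithContext[B] =
    WriteWithContext(c => { val (a, cc) = cx(c); f(a).cx(cc) })
}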

Don't get too carried away with reading those methods, but just note
that flatMap "threads the Context through whatever the function is,
which may be modifying it."

OK, so now if we suppose that our expressions (e1, e2, e3) actually had
access to the Context, but were also able to "modify" it by returning a
new Context (or just leaving it alone, for which there is library
support of course), then our program would look like this:

for {
  a <- e1
  b <- e2(a)
  c <- e3(a, b)
  d <- e2(b)
} yield d

Yep, exactly the same as before. So now we have a value that we can pass
in a Context and it is threaded through the program, potentially
"modifying" the Context as it is threaded through and we get a value and
the resulting Context at the end. We may wish to drop either of these --
in practice, the Context often gets dropped, since it was only needed to
compute the value -- and of course, there is library support for that.
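
A runnable sketch of the read-and-write case follows. The counter-based Context, the bodies of e1, e2 and e3, and the eval helper are all assumptions made up for this example; eval stands in for the kind of library support that drops the final Context:

```scala
object StateExample {
  // Hypothetical Context that the program is allowed to "modify".
  case class Context(counter: Int)

  case class WriteWithContext[A](cx: Context => (A, Context)) {
    def map[B](f: A => B): WriteWithContext[B] =
      WriteWithContext(c => { val (a, cc) = cx(c); (f(a), cc) })
    def flatMap[B](f: A => WriteWithContext[B]): WriteWithContext[B] =
      WriteWithContext(c => { val (a, cc) = cx(c); f(a).cx(cc) })
    // Hypothetical helper: run with a Context and keep only the value.
    def eval(c: Context): A = cx(c)._1
  }

  // Reads the counter and returns a new Context with it incremented.
  def e1: WriteWithContext[Int] =
    WriteWithContext(c => (c.counter, Context(c.counter + 1)))
  // Uses the counter and increments it again.
  def e2(n: Int): WriteWithContext[Int] =
    WriteWithContext(c => (n + c.counter, Context(c.counter + 1)))
  // Leaves the Context alone.
  def e3(x: Int, y: Int): WriteWithContext[Int] =
    WriteWithContext(c => (x + y, c))

  val program: WriteWithContext[Int] =
    for {
      a <- e1
      b <- e2(a)
      c <- e3(a, b)
      d <- e2(b)
    } yield d

  // Starting from counter 0: a = 0, b = 1, c = 1, d = 3, final counter 3.
  val (value, finalContext) = program.cx(Context(0))

  def main(args: Array[String]): Unit = {
    println(value)        // prints 3
    println(finalContext) // prints Context(3)
  }
}
```

Each "modification" is just a new Context value handed to the next step by flatMap, so there is still no variable anywhere in the program.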

So hopefully now you see that DI can be replaced by a superior
programming model, at least for this example, and I promise, for any
example. We just have to come to terms with a few data types and
abstractions and we can kick that baby to the gutter where it belongs.

Hope that helps!


Tony Morris