Abstracting Async.Concurrently

Sep 17, 2016

Applicatives have an interesting relationship to concurrency. fraxl is my attempt to abuse this relationship with free monads to get something more abstract than haxl. By using a special Applicative instance, fraxl allows computations to be implicitly concurrent.

The async package has something similar in Control.Concurrent.Async.Concurrently. The Concurrently applicative is a newtype around IO. But its Applicative instance parallelizes the IO actions.

{-# LANGUAGE DeriveFunctor #-}

newtype Concurrently a = Concurrently { runConcurrently :: IO a }
  deriving Functor

instance Applicative Concurrently where
  pure = Concurrently . return
  Concurrently fs <*> Concurrently as =
    Concurrently $ (\(f, a) -> f a) <$> concurrently fs as

Because of the monad law (<*>) = ap, Concurrently cannot be a monad. Any monadic binding will not be executed concurrently, forcing ap to behave differently than (<*>). Despite this, you can convert between Concurrently and IO painlessly by packing and unpacking the Concurrently newtype.

But both fraxl and haxl fall prey to the same issue. They have to knowingly break this monad law to parallelize computations. However, being free monads, they hide this behind interpreters that make the law hard to observe. So morally, the law is not being broken. The free monad accumulates the computation, and the interpreter runs it while promising not to break any laws.

This solution leaves me uneasy. Cherry picking when the law doesn’t matter feels half-baked. This is why I like Concurrently so much. It allows seamless conversion between concurrent and sequential code, without being lenient on the laws. Abstracting Concurrently’s design to other monads would be great. In essence, Concurrently is an applicative tied to the monad IO, with conversions between the two to allow you to use either. As a class, this would look something like this.

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}

import qualified Control.Concurrent.Async as Async

class (Applicative f, Monad m) => Concurrently f m | f -> m, m -> f where
  runConcurrently :: f a -> m a
  runSequentially :: m a -> f a

instance Concurrently Async.Concurrently IO where
  runConcurrently = Async.runConcurrently
  runSequentially = Async.Concurrently

One of the key features of Async.Concurrently is that switching between it and IO preserves concurrency. That is, runSequentially (runConcurrently x) <*> runSequentially (runConcurrently y) is just as concurrent as x <*> y. This suggests the laws:

runSequentially . runConcurrently = id
runConcurrently . runSequentially = id

Async.Concurrently abides by this quite trivially, since it’s just a newtype around IO with a different Applicative instance. Luckily, something like fraxl can use a similar trick. fraxl uses a special free monad that abuses an underlying Applicative, and consequently “breaks” the monad law. However, using Async.Concurrently’s trick, fraxl could have a newtype for concurrency, and then use an ordinary free monad for the sequential monad, instead of its special one.

Recall what fraxl looks like with the pseudo-law-breakage.

-- Old

data Ap f a = ...
-- Concurrent, law-breaking `Applicative` instance
data Free f a = ...
type Fraxl f = Free (Ap f)

getSite :: String -> Fraxl MyDataSource Site
getSite = ...
printSite :: Site -> Fraxl MyDataSource ()
printSite = ...

example :: Fraxl MyDataSource IO ()
example = do
  --                               Concurrent
  (a,b) <- (,) <$> getSite "site1.com" <*> getSite "site2.com"
  printSite a
  printSite b

In this version, we have a free monad with a weird Applicative instance that “breaks” the monad laws. But we get concurrency in do blocks for no extra charge. Using the newtype trick, we can move the weird Applicative instance to a type with no Monad instance, at the cost of slightly more verbose do code.

-- New

data Ap f a = ...
-- Sequential `Applicative` instance
data Free f a = ...
type Fraxl f = Free (Ap f)

-- Concurrent, law-abiding `Applicative` instance
-- No `Monad` instance
newtype Free' f a = Free' (Free f a)
type Fraxl' f = Concurrently (Ap f)

instance Concurrently (Free' f) (Free f) where
  runConcurrently (Free' a) = a
  runSequentially = Free'

getSite :: String -> Fraxl MyDataSource Site
getSite = ...
printSite :: Site -> Fraxl MyDataSource ()
printSite = ...

getSite' :: String -> Fraxl' MyDataSource Site
getSite' = runSequentially . getSite

example :: Fraxl MyDataSource ()
example = do
  (a,b) <- runConcurrently $ --       Concurrent
             (,) <$> getSite' "site1.com" <*> getSite' "site2.com"
  printSite a
  printSite b

Works wonderfully! We’ve encoded the same structure into example, without having to subvert the monad laws.

There are probably some other interesting instances of Concurrently. I don’t know what they are, but the pattern of having special applicatives isn’t new. Whatever the instances are, there are some cool consequences. Traversal, for instance, can become implicitly concurrent.

traverse' :: (Concurrently f m, Traversable t) => (a -> m b) -> t a -> m (t b)
traverse' f = runConcurrently . traverse (runSequentially . f)

for' :: (Concurrently f m, Traversable t) => t a -> (a -> m b) -> m (t b)
for' = flip traverse'

example = do
  ys <- for' xs $ \x -> do
    z <- f x
    y <- g z
    return y
  ...

You can write monadic code in for', and have each computation implicitly done in parallel. This largely accomplishes one of haxl’s motivating examples.

There are some problems though. The need for runConcurrently would likely be controversial. On one hand, it does make it slightly more verbose, and will mean you often duplicate combinators as I did with traverse'. On the other hand, it forces the concurrency of a computation to be documented at the type level. We know by looking at any expression (or its type) if it’s going to do things concurrently. This could save some headaches in debugging.

The biggest downside is losing implicit concurrency with ApplicativeDo. With ApplicativeDo enabled on the law-breaking version of fraxl, writing example like this yields some nice desugared code.

{-# LANGUAGE ApplicativeDo #-}

example :: Fraxl MyDataSource ()
example = do
  a <- getSite "site1.com"
  b <- getSite "site2.com"
  printSite a
  printSite b

-- Desugars to:

example :: Fraxl MyDataSource ()
example = join $ (\a b -> printSite a *> printSite b)
              <$> getSite "site1.com"
              <*> getSite "site2.com"

With the Concurrently version, the sugared do code might look a little worse, again depending on whether you prefer the concurrency to be explicit.

{-# LANGUAGE ApplicativeDo #-}

example :: Fraxl MyDataSource ()
example = do
  (a, b) <- runConcurrently $ do
    a <- getSite' "site1.com"
    b <- getSite' "site2.com"
    return (a, b)
  printSite a
  printSite b

-- Desugars to:

example :: Fraxl MyDataSource ()
example = runConcurrently
  ((,) <$> getSite' "site1.com" <*> getSite' "site2.com")
  >>= \(a, b) -> printSite a *> printSite b

Functionally, it’s the same result. The do block just had to be much more explicit about its concurrency. It’s unclear to me if this tradeoff is a good or bad one.

Pros:

Law-abiding with no tricks.
Explicit concurrency can make code easier to reason about.
Lets fraxl use an existing free monad instead of a custom one.

Cons

Explicit concurrency is more verbose.
Do notation gets in the way instead of being clearer.

I’m leaning towards calling it good, because I’m often a fan of being explicit about dirty tricks. But I can definitely see why someone wouldn’t like this approach.