
Wednesday, May 13, 2015

a cabal sandbox for scripting

Later edit: this post is now obsolete, thanks to Stack.

As a way to facilitate Haskell scripting, I have created the hs-scripts repo on GitHub.

The idea is to be able to run interpreted Haskell scripts without having to install any packages at the global level.

To test it, clone the repo, create a sandbox, and install the library. The repo should also be added to the PATH variable.

The library is empty at the moment, but is intended to hold common functions and definitions shared by multiple scripts.

You can invoke cabal repl in the repo to open a GHCi session with a lot of useful packages available (a poor man's Haskell Platform, if you will).

There are two batch scripts, hs.bat for Windows and hs.sh for Linux. They take a Haskell script name (without extension) as parameter and execute the script in the context of the sandbox. Scripts are searched in the current directory first and then in the /scripts subfolder of the repository.

Scripts are easy to compile, if necessary.

So far there's only one script, a replica of Python's ever-useful python -m SimpleHTTPServer.
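
Such a script can be very small. Here's a minimal sketch, assuming the warp and wai-app-static packages are available in the sandbox (this is illustrative, not necessarily the repo's actual code):

    import Network.Wai.Handler.Warp (run)
    import Network.Wai.Application.Static (staticApp, defaultFileServerSettings)

    -- Serve the current directory over HTTP, like python -m SimpleHTTPServer.
    main :: IO ()
    main = do
        putStrLn "Serving the current directory on port 8000..."
        run 8000 (staticApp (defaultFileServerSettings "."))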


Sunday, March 29, 2015

Is ExceptT over IO really an anti-pattern?

I was reading this exceptions best practice guide for Haskell. It contains good advice, for example about how masking all exceptions is a terrible idea. However, I sort of disagree about the "ExceptT over IO being an antipattern" part.

One of the reasons given is that "it's non-composable". I actually think that making errors explicit in the types can be more composable, especially if you want to combine or transform the errors in some manner. Wrapping an exception to provide additional context tends to be more cumbersome than mapping over the error type with withExceptT (or first from Bifunctor).
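
For example, here is the kind of composition I have in mind (a sketch; the error and action names are made up for illustration):

    import Control.Monad.Trans.Except (ExceptT, withExceptT)

    data ParseError = ParseError String
    data DbError    = DbError String

    -- A combined error type for the composed computation.
    data AppError = AppParse ParseError | AppDb DbError

    parseConfig :: ExceptT ParseError IO FilePath
    parseConfig = return "/tmp/db"

    queryDb :: FilePath -> ExceptT DbError IO String
    queryDb path = return ("row from " ++ path)

    -- Unifying the two error types is just a matter of mapping
    -- over the error channel.
    program :: ExceptT AppError IO String
    program = do
        path <- withExceptT AppParse parseConfig
        withExceptT AppDb (queryDb path)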

When you compose multiple exception-throwing functions, if you aren't careful your resulting function will end up (maybe even unbeknownst to you!) throwing a zoo of unrelated exceptions with scant context.

(Digression: the recent pipes-cliff library actually does a good job of wrapping every IOException it encounters, tagging it with extra information.)

On the other hand, I admit that making all the errors explicit in the signature complicates it, and it becomes annoying if you consider those errors as "fatal" anyway and you don't intend to handle them, or at least not at the current layer.

Which brings us to the other main criticism: that using "ExceptT over IO" doesn't really ensure the absence of exceptions, even if it seems to imply it, and that it only provides a second, largely redundant, channel for communicating errors. In my opinion, there are cases where we might want two separate error channels: to distinguish between outright fatal errors and recoverable errors that we may want to handle, for example. Or between errors we want to annotate/map over/compose easily and errors we just want to pass to upper layers unchanged.

In my process-streaming and conceit libraries I actually employ this dual approach. The execution functions re-throw any IO exceptions they encounter, but you can also have an explicit error type. If the error type is never used, it remains polymorphic, and we can use functions that have a simpler signature but require the error type to unify with Void.
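
The Void trick, sketched in isolation (this shows the general idea, not the exact signatures of those libraries):

    import Data.Void (Void, absurd)

    -- An action whose error channel is polymorphic and never used
    -- can unify with Void...
    action :: IO (Either e Int)
    action = return (Right 5)

    -- ...and a Void-specialized runner can then discard the channel safely.
    runSimple :: IO (Either Void a) -> IO a
    runSimple = fmap (either absurd id)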





Sunday, February 22, 2015

A dynamically generated FromJSON instance

Suppose you need to invoke a function (one like decode from pipes-aeson) that has a FromJSON constraint, but the exact parsing method required for the JSON is not fully known until runtime. Perhaps because field names in the JSON have some prefix that is only determined through reading a configuration file, or by asking the user.

One solution is to use the tools provided by reflection package to dynamically generate the FromJSON instance at runtime, after having constructed the parse function.

The reflection repository already contains an example of a dynamically generated Monoid instance. I have adapted it for the FromJSON case; the code can be found in this gist.

Some tips for defining these kinds of local instances:

  • You never ever have to declare a Reifies instance; this is done by the internal machinery of the reflection package, through some dark magic I don't understand, even if it's explained here.
  • Instead, you use Reifies as a precondition for the instance you are actually declaring (FromJSON in this case). You are expressing something like "whenever this phantom type s reifies a typeclass dictionary at the type level, I can reflect it back and use the recovered dictionary to implement the methods of my instance".
  • The actual value passed to the reflect function is ignored, it only serves to tell the function what type reifies the value we want to reflect. Assuming you have a Reifies s a constraint, you can always pass something like Proxy :: Proxy s to reflect and obtain an a.
  • At some point, you will need a small auxiliary function to convince the compiler that the phantom type in your datatype is the same as the phantom type in the Proxy supplied by reify. Otherwise the Reifies constraint won't be satisfied, and your instance won't kick in!
  • Notice that, thanks to parametricity, the phantom type in the Proxy supplied by reify can never escape the callback. So be sure to strip from your result any newtype that still carries the phantom type!
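
Putting the tips together, here is a condensed sketch along those lines (the names are mine, and the details may differ from the actual gist):

    {-# LANGUAGE FlexibleContexts, ScopedTypeVariables, UndecidableInstances #-}
    import Data.Aeson (FromJSON (..), Value)
    import Data.Aeson.Types (Parser, parseMaybe)
    import Data.Proxy (Proxy (..))
    import Data.Reflection (Reifies, reflect, reify)

    -- A newtype whose phantom type s reifies the parse function.
    newtype J a s = J { getJ :: a }

    -- Whenever s reifies a parser for a, reflect it back to implement parseJSON.
    instance Reifies s (Value -> Parser a) => FromJSON (J a s) where
        parseJSON = fmap J . reflect (Proxy :: Proxy s)

    -- Run a parser constructed at runtime through the FromJSON machinery.
    parseWith :: forall a. (Value -> Parser a) -> Value -> Maybe a
    parseWith p v = reify p go
      where
        -- The local signature ties the phantom type in J to the one
        -- in the Proxy supplied by reify.
        go :: forall s. Reifies s (Value -> Parser a) => Proxy s -> Maybe a
        go _ = getJ <$> (parseMaybe parseJSON v :: Maybe (J a s))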
See also this answer on Stack Overflow.

Monday, December 22, 2014

Connecting to Denodo Virtual DataPort from Haskell

My employer Denodo has released an "Express" version of its data virtualization platform. The Virtual DataPort component lets you do cool things like joining tables that reside in completely different databases (say, one table in Oracle and another in MySQL). And it comes with a boatload of connectors for a wide range of data sources.

I'm interested in accessing all that data goodness from Haskell. There are a couple of ways of doing it.

One would be to use Virtual DataPort's REST interface. That's a valid option and maybe the subject of another post.

Another is to connect through the ODBC interface on port 9996. As it happens, the wire protocol aims to be compatible with that of PostgreSQL, so maybe I could just use Haskell's popular postgresql-simple library. Let's try it out.

Musicbrainz


First, we need an underlying database instance that we can "wrap" with Virtual DataPort. Let's use the Musicbrainz database, a perennial favorite for testing. We can download it as a virtual machine and run it on Virtual Box. I usually set the guest networking to NAT and only forward PostgreSQL's port, which is 5432.

Virtual DataPort


Then we download and install Denodo Express, start the Virtual DataPort server, and launch the graphical administration tool.

From the administration tool's GUI, we have to import a few tables from the Musicbrainz database. First we create a JDBC data source:


(The Musicbrainz database is called musicbrainz_db, the user/password is musicbrainz/musicbrainz.)

And then import the artist_name and release tables:


Haskell


Ok, now for the Haskell part. postgresql-simple depends on native libraries. On CentOS 7, for example, we'll have to install the postgresql-devel package using yum before invoking cabal install postgresql-simple.

Once we have the required dependencies, we can dabble with the following snippet (notice that we are connecting to Virtual DataPort, not directly to Musicbrainz):
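
    -- (a reconstruction; the original snippet is not preserved in this copy,
    -- and the connection parameters are illustrative)
    {-# LANGUAGE OverloadedStrings #-}
    import Database.PostgreSQL.Simple

    main :: IO ()
    main = do
        conn <- connect defaultConnectInfo
            { connectHost     = "localhost"
            , connectPort     = 9996  -- Virtual DataPort's ODBC port
            , connectUser     = "admin"
            , connectPassword = "admin"
            , connectDatabase = "musicbrainz"
            }
        rows <- query_ conn "SELECT name FROM artist_name LIMIT 10"
        mapM_ (\(Only name) -> putStrLn name) (rows :: [Only String])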

...and it works! A sweet stream of data flows from the Musicbrainz database, passes through Virtual DataPort and arrives at our Haskell client.

And thanks to the virtualization capabilities of Virtual DataPort, that stream could potentially represent the combined flow of a number of tributaries.

Monday, November 10, 2014

Two new "validation" applicatives

There exists an Either-like Applicative (usually called "Validation") that either collects all the errors, or returns a success. This is different from the Monad instance of Either, which stops at the first error encountered.

The type can be found in the validation package. However, that package incurs a heavy lens dependency.

Recently, the type has cropped up without the lens dependency in both transformers (as the Errors type synonym) and either (as Validation).

The version in the either package has a more precise type, as it only requires the error values to form a Semigroup, but not necessarily a Monoid. So you can use a NonEmpty to accumulate errors.
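
The essence of the type, in miniature (both packages provide more polished equivalents):

    import Data.List.NonEmpty (NonEmpty (..))

    data Validation e a = Failure e | Success a deriving Show

    instance Functor (Validation e) where
        fmap _ (Failure e) = Failure e
        fmap f (Success a) = Success (f a)

    -- Semigroup, not Monoid, is all that error accumulation requires.
    instance Semigroup e => Applicative (Validation e) where
        pure = Success
        Failure e1 <*> Failure e2 = Failure (e1 <> e2)  -- both errors are kept
        Failure e1 <*> Success _  = Failure e1
        Success _  <*> Failure e2 = Failure e2
        Success f  <*> Success a  = Success (f a)

    -- Accumulating two errors into a NonEmpty:
    example :: Validation (NonEmpty String) Int
    example = (+) <$> Failure ("bad x" :| []) <*> Failure ("bad y" :| [])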

Thursday, October 23, 2014

conceit 0.2.1.0

I have released a new version of the conceit package with the following changes:

The Applicative instance now doesn't require those ugly (Show e, Applicative e) constraints on the error type.

There's a Monad instance for Conceit. I initially made the >> operator concurrent, like *>, but that was a bad idea. In the end, the Monad instance has sequential operations and the Applicative concurrent ones, as happens in Haxl.

There are also MonadThrow, MonadCatch and MonadError instances. The first two work by throwing and catching exceptions in IO; the last one works more like ExceptT.

A special run function was added for when the error is "impossible": Conceit Void a -> IO a. This makes it easier to use Conceit in place of Concurrently.

The internals of this new version of conceit have been copied from those of Concurrently in Simon Marlow's async package, with modifications to support the new behaviour.

Sunday, October 19, 2014

Colchis, yet another JSON-RPC 2.0 client

There's no shortage of JSON-RPC 2.0 libraries on Hackage, as any cursory search will show. I have just added my own, called Colchis.

It is a pipes-based client that makes use of bidirectional pipes. It doesn't have a lot of features, in particular the only supported transport is raw TCP. I think HTTP transport support would be easy to add, but I don't have the need right now. No notifications or batched requests, either.

My aim is to be able to communicate with jsonrpc4j servers in "streaming" mode.

Check the examples folder in the repo for some examples of usage.

Monday, October 6, 2014

conceit 0.1.0.0

I have uploaded to Hackage a tiny package called conceit. It contains a slight variation on the Concurrently type from the async package.

Concurrently is hugely useful. An instance of Applicative, its <*> executes two IO actions concurrently (duh!) and, if one of the threads throws an exception, the other one is killed before re-throwing the exception. Very handy to ensure proper cleanup.

Conceit is similar to Concurrently, but the IO actions return an Either value. If both actions return with Right, the return values are combined. But if one of the actions returns with Left, the other action is immediately terminated and the Left value is returned. One could say that Conceit behaves like race for errors and like Concurrently for successful return values.
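
Those semantics can be sketched in terms of the async package (this is only an illustration of the behaviour; the package's actual implementation is different):

    import Control.Concurrent.Async (withAsync, waitEither, wait)

    conceit :: IO (Either e a) -> IO (Either e b) -> IO (Either e (a, b))
    conceit ioa iob =
        withAsync ioa $ \aa ->
        withAsync iob $ \ab -> do
            first <- waitEither aa ab
            case first of
                -- a Left wins immediately; the loser is cancelled by withAsync
                Left  (Left e)  -> return (Left e)
                Right (Left e)  -> return (Left e)
                -- a Right waits for the other side before combining
                Left  (Right a) -> fmap (fmap (\b -> (a, b))) (wait ab)
                Right (Right b) -> fmap (fmap (\a -> (a, b))) (wait aa)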

A Bifunctor instance is provided to help "massage" the errors.

Thursday, October 2, 2014

process-streaming 0.6.0.0

I have released (a few days ago already) version 0.6.0.0 of process-streaming, my library of pipes-based helpers built on top of the process package.

The API of version 0.3.0.0 proved to be too convoluted, with bloated and obscure function signatures. Hopefully the API for the new version is more intuitive.

The central abstraction is that of a Siphon. A Siphon represents a computation that either drains a Producer completely, or fails early, aborting the consumption of the Producer. stdout and stderr can only be consumed through Siphons. Siphons can be created out of regular Consumers, pipes folds, or Parsers from pipes-parse.

Why do we need Siphons? When consuming both the stdout and stderr of an external process, it is important that the two are drained to completion, because otherwise the process may block due to an output buffer that is not being read and fills up.
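
Here is the hazard, and the classic workaround, sketched with just the process and async packages (this is not process-streaming's API, which automates this kind of bookkeeping):

    import Control.Concurrent.Async (concurrently)
    import Control.DeepSeq (force)
    import Control.Exception (evaluate)
    import System.IO (hGetContents)
    import System.Process

    -- Read stdout and stderr concurrently, so that neither pipe's buffer
    -- fills up while we are busy reading the other one.
    readBoth :: CreateProcess -> IO (String, String)
    readBoth p = do
        (_, Just hout, Just herr, ph) <-
            createProcess p { std_out = CreatePipe, std_err = CreatePipe }
        (out, err) <- concurrently
            (evaluate . force =<< hGetContents hout)
            (evaluate . force =<< hGetContents herr)
        _ <- waitForProcess ph
        return (out, err)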

Siphons have an Applicative instance that "forks" a producer and feeds it to two computations. This Applicative instance made the support for branching pipelines of processes easy to implement.

The test suite contains several examples of usage.

Monday, June 30, 2014

The lazy Writer monad

I found an interesting example of the lazy State monad that uses "head recursion".

I tried to adapt the example for the lazy Writer monad but it proved difficult. If you naively consume the head of the infinite accumulator, it will hang, even with the lazy version of the monad:
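
    -- (a reconstruction; the original gist is not preserved in this copy)
    import Control.Monad.Writer.Lazy

    loop :: Writer [Int] ()
    loop = loop >> tell [1]

    -- head (execWriter loop) hangs: the accumulator is built as
    -- ((... <> [1]) <> [1]) <> [1], and reaching the first element
    -- requires descending through infinitely many left nestings.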

The trick is to use Dual from Data.Monoid:
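
    -- (reconstruction, continued)
    import Data.Monoid (Dual (..))

    loopD :: Writer (Dual [Int]) ()
    loopD = loopD >> tell (Dual [1])

    -- Dual flips each mappend around, so the accumulator becomes
    -- [1] <> ([1] <> ...), which can be consumed lazily:
    -- take 3 (getDual (execWriter loopD)) == [1,1,1]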

Maybe I suffer from a failure of imagination, but I can't think of any practical problem in which the lazy Writer monad offers an advantage over the strict version.

Monday, June 9, 2014

process-streaming 0.3.0.0

I have released version 0.3.0.0 of process-streaming, my library of pipes-based helpers built on top of the process package.

It contains breaking changes. Some functions have new names, and a few newtypes have been introduced in order to hide unnecessarily complex signatures. Hopefully the changes make the library a bit more intuitive.

You might find this library useful if:

  • You want an easy way to consume the standard streams of an external process using the tools provided by the pipes ecosystem.
  • You want concurrent, streaming access to both the stdout and stderr of an external process, without fear of deadlocks caused by full output buffers.
  • You want to consume stdout and stderr combined in a single stream.
  • You want to be relieved of tedious bookkeeping like: automatically killing the external process in the face of errors and exceptions, ensuring the termination of threads spawned for concurrent reads, and so on.

Monday, May 12, 2014

The IO monad has a non-strict bind

The Maybe monad has a bind function that is strict in its first argument:
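
    -- (the snippets in this post are reconstructions; the originals
    -- are not preserved in this copy)
    ghci> (undefined >>= return :: Maybe ()) `seq` "ok"
    *** Exception: Prelude.undefined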

The IO monad however has a non-strict bind function:
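
    ghci> (undefined >>= return :: IO ()) `seq` "ok"
    "ok"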

And so does the strict State monad:
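
    ghci> import Control.Monad.State.Strict
    ghci> (undefined >>= return :: State Int ()) `seq` "ok"
    "ok"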

It's easy to see in the source code:
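
    -- from Control.Monad.Trans.State.Strict in transformers (approximately)
    m >>= k = StateT $ \ s -> do
        (a, s') <- runStateT m s
        runStateT (k a) s'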

To reduce m >>= k to Weak Head Normal Form, you only need to go as far as the StateT constructor; you don't need to touch any of the parameters. I suppose something similar happens with IO.

What's so strict then about the IO monad and the strict State monad? Their run functions. This:
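
    ghci> import Control.Monad.State.Strict
    ghci> runState (undefined >>= return :: State Int ()) 0 `seq` "ok"
    *** Exception: Prelude.undefined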

As opposed to this:
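
    ghci> import Control.Monad.State.Lazy
    ghci> runState (undefined >>= return :: State Int ()) 0 `seq` "ok"
    "ok"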

Notice how in the strict case putting the expression in WHNF with seq is enough to trigger the exception.

Of course, with IO we don't have an explicit run function (one that isn't evil, I mean). We just put the IO action into main.

Notice however that even the strict State monad is not too strict:
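
    ghci> runState (return undefined :: State Int ()) 0 `seq` "ok"
    "ok"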

This doesn't trigger an exception, either:
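
    ghci> fst (runState (put undefined >> return 3 :: State Int Int) 0)
    3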

But this does:
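
    ghci> evalState (return undefined :: State Int ()) 0 `seq` "ok"
    *** Exception: Prelude.undefined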

Some good links about laziness here, here, here, here, here and here.

Monday, May 5, 2014

Understanding representable functors

Representable functors are put to good use in the linear library, and they can be a useful tool in other contexts. Here's a (hopefully not completely misleading) post on the subject.

The general idea

Consider the functor that carries a type a to the type of infinite lists of elements of type a. Let's call that functor Stream. So Stream Bool is the type of infinite lists of boolean values, Stream Char is the type of infinitely long strings, and so on.

There is a type with which the Stream functor has a special relationship: the type of natural numbers. Think about it: any value of type Stream a can be alternately represented by a function of type ℕ -> a. To recover the stream, we just have to iterate over the naturals, applying the function at each step. Conversely, we can regard a Stream a value as a "memoized" version of a ℕ -> a function. We can recover the function by indexing on the stream.
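
In code, using Integer as a stand-in for ℕ, the two directions of the correspondence look like this (a quick sketch):

    data Stream a = Cons a (Stream a)

    -- memoize a function as a stream
    tabulate :: (Integer -> a) -> Stream a
    tabulate f = go 0
      where go n = Cons (f n) (go (n + 1))

    -- recover the function by indexing into the stream
    index :: Stream a -> Integer -> a
    index (Cons a _)  0 = a
    index (Cons _ as) n = index as (n - 1)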

There are other functors that have a similar "special relationship" with a certain type. That type need not be ℕ. Consider the following very simple functor:
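
    -- (reconstructed from the discussion below; the original snippet is missing)
    data Pair a = Pair a a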

This Pair functor has a special relationship with the Bool type. A value of type Pair a can be alternately represented by a function Bool -> a. Conversely, we can regard a value of type Pair a as a memoized or tabulated version of a Bool -> a function.

There seems to be a pattern here!

One more example: the Identity functor is represented by the trivial unit type (), which has only one value and carries no information. Identity a is just a, so there are no "positions" over which to "range", so to speak.

Composition, products, sums

Functors have the nice property that the composition of two functors is still a functor. We can ask ourselves if the composition of two representable functors is still representable. The answer is yes.

The composition (think of it as "nesting") of two functors is represented by the product of the representations of the original functors. Think of the type Stream (Stream a). It is like an infinite bidimensional array of a values. To index into it, we need two coordinates. So it is represented by (ℕ,ℕ).

The product of two functors is a functor as well. And the product of two representable functors is again representable! It is represented by the sum of the representations of the original functors. Suppose you have a pair (Stream a, Stream a). If we want to index into it, we must first choose a side, and then drill into the stream of that side. So the product is represented by Either ℕ ℕ.
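
Building on the Stream sketch above, both correspondences are visible in the indexing functions:

    -- composition is represented by the product of the representations
    index2 :: Stream (Stream a) -> (Integer, Integer) -> a
    index2 ss (i, j) = index (index ss i) j

    -- product is represented by the sum of the representations
    indexEither :: (Stream a, Stream a) -> Either Integer Integer -> a
    indexEither (l, _) (Left i)  = index l i
    indexEither (_, r) (Right j) = index r j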

A third way of combining functors is summing them. Is a sum of representable functors representable? I don't know about the theory, but intuitively the answer seems to be no. We can't "index" on sum types, because we might attempt to target a branch that isn't there!

The Haskell implementation

The Representable typeclass can be found in module Data.Functor.Rep of the adjunctions package. It has the Distributive class as a prerequisite. Distributive can be understood as the class of functors with a "fixed shape" whose values can be zipped together easily.

If you check the source, you'll see that the type families extension is used to associate each instance of Representable with its representation. It is instructive to explore which representations correspond to which functors. For example, Rep Identity = (), as we have already mentioned.

You might wonder, why is Stream missing from the list of instances, after having been used many times as an example? It's there, actually, but you have to squint a little.

The Stream functor can be defined as Cofree Identity (the definition of Cofree is here). Cofree Identity is represented by sequences of unit () values. And the type of sequences of () is isomorphic to the natural numbers.

Saturday, May 3, 2014

Applicative vs. Monad

In an Applicative, effects build the railroad upon which the locomotive of function application will travel. All the effects take place while building the railroad, not when the locomotive moves. The locomotive cannot change its course in any way.

A Monad is like having a locomotive in a railroad which is still under construction. Passengers can actually yell to the construction crew a few meters ahead to tell them things like "I like the scenery on this side, please lay the tracks a bit more towards the right" or "you can stop laying tracks, I get out right here". Of course, for this strange railway, the company can't provide a timetable or a fixed list of stops that can be checked before embarking on the train.
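
In code, the difference is whether later effects can depend on earlier results. A small illustration:

    -- Monad: the passenger can change the route mid-journey.
    monadic :: IO ()
    monadic = do
        line <- getLine
        if line == "stop"
            then return ()          -- "I get out right here"
            else putStrLn line

    -- Applicative: the route is fixed before departure.
    applicative :: IO ()
    applicative = const <$> putStrLn "first stop" <*> putStrLn "second stop"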

Sunday, February 23, 2014

process + pipes = process-streaming

There are already a number of good libraries for launching external processes on Hackage: basic libraries like process, and libraries that build on top of it, like shelly and process-conduit (there is also the System.IO.Streams.Process module from io-streams).

An official pipes-process package hasn't come out yet. After working recently with Ruby's Open3 package, and becoming inspired by this thread in the Haskell Pipes Google group, I decided to cobble together my own experiment. process-streaming is the result (git repo here).

I wanted to scratch a number of itches:

  • To my knowledge, neither shelly nor process-conduit provide concurrent streaming access to both the stdout and the stderr of a process, which is sometimes necessary. (shelly does provide direct access to the handles, but you have to orchestrate the concurrency by yourself.)
  • shelly and process-conduit use exceptions for signaling errors. This makes for simpler signatures. I wanted to test if a purely Either-based approach to errors could work without making the signatures excessively complicated.
  • Combining stdout with stderr is a frequent use case with some subtleties in the implementation. You have to be careful not to mangle lines, and also ensure that both streams are drained continuously (because otherwise you can get yourself into a blocking scenario).
  • Besides using plain Consumers, it should be easy to consume stdout and stderr using folds from Pipes.Prelude (and pipes-text and pipes-bytestring and pipes-group) and also parsers from the pipes-parse package. And it would be nice if two parsers could be run in parallel over the same Producer.
The haddock documentation for the System.Process.Streaming module is here. A tutorial with some examples can be found here.

If nothing else, this experiment has served me to understand the pipes ecosystem a little better. I really like the pipes-text approach for handling decoding leftovers (putting them in the return values of pipes) and the FreeT-based approach for parsing lines without ever having to keep a whole line in memory.


Monday, November 18, 2013

My solution to the Twitter waterflow problem

The Twitter waterflow problem originated in this post. After becoming aware of the existence of functional solutions in this other post, I decided to roll my own in Haskell before looking at them.

It turned out to be more tricky than I had expected. The first version of the algorithm was hideously ugly and verbose, and I couldn't make it work. I tried a new approach and came up with the following:

I haven't tested it extensively but I think it works.

I "compress" consecutive columns of equal height, so instead of working with a list of heights, I work with a list of (height,width) pairs. The initial list with uncompressed columns is created in the (zip l (repeat 1)) expression.

I traverse the list from left to right, carrying an accumulator parameter for the volume, and also an auxiliary list of the "high" columns visited so far. These columns can potentially enclose water if another sufficiently high column is encountered on the other side of a "valley".

The algorithm ensures that the columns in the auxiliary list (a stack, really) are strictly increasing in height, and have all been "compressed".

Lines 3-4 purge columns that go "uphill" from the left border, since these columns can't possibly contain a valley. Once we find the first "peak", we put it into the auxiliary list.

Line 7 compresses "valley floors".

Line 8 identifies "valleys" whose floors have already been compressed. We find how much water that portion of the valley will hold by multiplying the width of the floor with the height of the lowest wall. We add the volume to the accumulator and then fill that portion of the valley with concrete (so to speak) to avoid double-counting that volume in future iterations.

Friday, May 31, 2013

Haskell equivalents of some Clojure forms

While trying to learn Clojure, I keep confusing doall, dorun, and doseq. What are their differences, and when to use each one?

The functional language I know best is Haskell, so I will find what Haskell functions correspond to the Clojure ones, using Hoogle.

doall. 


The documentation says:
When lazy sequences are produced via functions that have side effects, any effects other than those needed to produce the first element in the seq do not occur until the seq is consumed. doall can be used to force any effects. Walks through the successive nexts of the seq, retains the head and returns it, thus causing the entire seq to reside in memory at one time.  
Ok, we have a list where each element is produced through a side effect. Using doall we perform all these effects "up front" and get the whole list as a result.

In Haskell, values which are obtained through side effects are "tagged" with the IO monad, to distinguish them from pure values. A value of type IO a could be understood as "a description of a program that, when run, produces a value of type a".

We want a function with a type like this:
[IO a] -> IO [a] 
[] is the type constructor for lists. -> means a function from some type to another. For simplicity, in this post we only deal with lists whose elements are all of the same type.

We put the signature into Hoogle, and we find the following:
sequence :: Monad m => [m a] -> m [a] 
Evaluate each action in the sequence from left to right, and collect the results.
The type is more general but, when m is IO, this is the function we are looking for.

dorun. 


The documentation says:
When lazy sequences are produced via functions that have side effects, any effects other than those needed to produce the first element in the seq do not occur until the seq is consumed. dorun can be used to force any effects. Walks through the successive nexts of the seq, does not retain the head and returns nil.
In Haskell, functions which are executed only for side effects return values of type (). There is only one such value, also named (). It's roughly similar to void in C and Java.

Let's then search for the signature
[IO a] -> IO ()
We find
sequence_ :: Monad m => [m a] -> m () 
Evaluate each action in the sequence from left to right, and ignore the results.
The trailing underscore is a notational convention. It means that this is a function that behaves like the one without the underscore, but ignores the return value.

doseq. 


The documentation says:
Repeatedly executes body (presumably for side-effects) with bindings and filtering as provided by "for". Does not retain the head of the sequence. Returns nil.
Ok, it seems that now the values in the input list are "pure", but we want to apply an effectful function to each one of them. And we do not care about the return value.

So we must search Hoogle for something like
[a] -> (a -> IO b) -> IO () 
We find
forM_ :: Monad m => [a] -> (a -> m b) -> m ()
So that's it. doall is sequence, dorun is sequence_, doseq is forM_. I won't confuse them anymore!
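
A quick check of the three in GHCi:

    ghci> sequence [print 1, print 2]
    1
    2
    [(),()]
    ghci> sequence_ [print 1, print 2]
    1
    2
    ghci> import Control.Monad (forM_)
    ghci> forM_ [1, 2] print
    1
    2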

A good introduction to Haskell IO can be found here.

Monday, December 31, 2012

Evaluating probabilistic cellular automata is comonadic, too! (part II)


Part I is here. The complete code is here.

We have defined our CA datatype and a local update rule for it. Since our CA is probabilistic, the update rule has a monadic effect in the Rand monad.

To perform a "global update" of the CA, we pass the current CA state and the local update rule to the extend function of Comonad:

extend localRule ca :: EnvT Probs U (Rand StdGen Bool)

The resulting type is almost, but not quite, what we need. We now have a CA in which each cell is a monadic random effect. What we would like is to have the whole CA inside the Rand monad, so that a single call to runRand or evalRand gives us a value of EnvT Probs U Bool. We need a way to aggregate the monadic effects of each singular cell into a global monadic effect for the whole CA.

This is similar to the problem of transforming a list of IO actions into an IO action that returns a list. List is an instance of Traversable, so we can use the sequence function on it. Likewise, we must make our type U an instance of Traversable.

The easiest way is to have the instance derived for you automatically. Just use the DeriveTraversable extension and add Traversable to the deriving clause when declaring the U datatype.

But it doesn't work. More precisely, it works only for one half of the universe! I wanted to print the CA's new state, 8 cells around the center (the showca function is defined in the gist):

    putStrLn . showca 8 
             . lower 
             . flip evalRand (mkStdGen 7) 
             . sequence 
             $ extend localRule ca

This printed only the left side of the universe. When it tried to print the first cell to the right, the whole computation just hanged. Why?

sequence is defined in terms of traverse. The default definition provided by the compiler first traverses the left side of the universe, then the center cell and the right side last. The StdGen generator is threaded throughout. But each side is infinite; when traversing the left side, we can get the updated generator to use on the right side only after an infinite number of steps. Laziness doesn't help here, because there is a data dependency.

This stumped me for a while, but finally I came up with an alternate version of traverse which doesn't suffer this problem:

    instance Traversable U where
        traverse f (U lstream focus rstream) = 
            let pairs =  liftA unzip 
                       . sequenceA . fmap (traversepair f) 
                       $ zip lstream rstream 
                traversepair f (a,b) = (,) <$> f a <*> f b
                rebuild c (u,v) = U u c v
            in rebuild <$> f focus <*> pairs

The trick is to zip the two streams together before traversing them, and unzip them afterwards inside the monad. This way, accessing the value in the first cell to the right doesn't require an infinite number of steps.

That solved, there is still one problem left. When we use evalRand to obtain the new state of the CA, we don't get an updated generator alongside it. And we can't use runRand to get it because the CA is potentially infinite. As we lazily consume the returned CA state, the generator we supplied is still used "under the hood" for generating new values. If we feed the updated generator returned by runRand to the following iteration, the whole computation would hang, again.

Fortunately, the MonadRandom class provides a getSplit operation which "bifurcates" a generator. At each iteration, we split the generator, throwing one half into the black hole of evalRand, and keeping the other half for the next iteration. We now know enough to define a function that, given a generator and an initial state of the CA, produces a lazy stream of all the successive states of the automaton:

evolve :: EnvT Probs U Bool -> Rand StdGen (EnvT Probs U Bool)
evolve ca = sequence $ extend localRule ca

history :: StdGen -> EnvT Probs U Bool -> Stream (EnvT Probs U Bool)
history seed initialca =     
   let unfoldf (ca,seed) = 
           let (seed',seed'') = runRand getSplit seed 
               nextca = evalRand (evolve ca) seed'
           in  (nextca,(nextca,seed''))
   in unfold unfoldf (initialca,seed)

That's all folks! Happy New Year.

Edit: I wasn't sure of the correctness of my Traversable instance, so I posted a question on Stack Overflow.

Sunday, December 30, 2012

Evaluating probabilistic cellular automata is comonadic, too! (part I)

I'm trying to develop my intuition for comonads and comonad transformers in Haskell. This article in A Neighborhood of Infinity helped me grasp the basic concepts, so I decided to continue in the same vein, and explore how to handle simple probabilistic cellular automata like the ones described in this link.

Let's start by adapting sigfpe's datatype to make it use the standard Comonad typeclass from the comonad package:

    data U x = U (Stream x) x (Stream x) deriving (Functor,Foldable)

    right (U a b (c:>cs)) = U (b:>a) c cs
    left  (U (a:>as) b c) = U as a (b:>c)

    instance Comonad U where
       extract (U _ b _) = b
       duplicate a = U (tail $ iterate left a) a (tail $ iterate right a)

I have used the Stream datatype (from the streams package) instead of lists, to actually enforce that the universe is infinite in both directions.

Now, to allow probabilistic "evolution" of the CA, we need two things:

  • Some way of storing and accessing the probability parameters. Think of them as the "basic constants" of the CA's universe.
  • Some way of generating random values that plays well with the Comonad instance. Remember that randomness is a kind of monadic effect.

The first one is easy, we'll just use the EnvT comonad transformer. This transformer attaches "environment values" to comonadic values. The environment can be later extracted from the transformed comonadic value using the ask function.

Let's add probabilities to our U datatype. We pass a vanilla U value to the EnvT constructor, along with a probabilities tuple:

    type Probs = (Float,Float,Float,Float)

    ca :: EnvT Probs U Bool
    ca = EnvT (0.0,0.6,0.7,0.0) $ U (repeat False) True (repeat False)

    probs :: Probs
    probs = ask ca
 
Easy as pie! Notice the following: with monads, there is a uniform interface to "put things in" (return) but the interfaces to "get things out" are specialized (runReaderT, runWriterT) or don't even exist (like with IO). With comonads, there is a uniform interface to "get things out" (aptly named extract) but the interface to "put things in" depends on the comonad (the EnvT constructor in this case).

Let's now define a probabilistic local update rule which calculates the next state of a cell from the state of its neighbours. We are using the MonadRandom package:

    localRule :: EnvT Probs U Bool -> Rand StdGen Bool
    localRule ca =
        let (tt,tf,ft,ff) = ask ca
            black prob = (<prob) <$> getRandomR (0,1)
        in case lower ca of
            U (True:>_) _ (True:>_) -> black tt
            U (True:>_) _ (False:>_) -> black tf
            U (False:>_) _ (True:>_) -> black ft
            U (False:>_) _ (False:>_) -> black ff

We already know what ask does. But what does lower do? lower is the dual of lift. lift adds new effects to a base monadic value; lower strips one layer of a transformed comonadic value and returns the comonadic value underneath. Here we use it to access the base U value and pattern match against it.

We are almost there. Having defined localRule, we can use it together with extend to perform a "global update" that transforms an EnvT Probs U Bool into an EnvT Probs U (Rand StdGen Bool). But now we've hit a bit of a block. What we really want to get is a Rand StdGen (EnvT Probs U Bool) which we could then "run" using evalRand and obtain the next state of the CA. How to do this is explained in the next post.

A gist with the Haskell code can be found here.

Part II of this post is here.