Twitter, Puddles and foldlr

Blue Water DropsAbout a year ago I happen to devise a solution (spoiler!) to a puzzle latter known as “Twitter’s waterflow problem”. Despite the number of alternative solutions, my original approach still stand out in its own way, and can be used to demonstrate a few peculiar things about recursion and list folding.

The puzzle probably existed before Twitter itself, yet it received great popularity after Michael Kozakov’s post about his interview at Twitter. As Reddit commenter noticed, “luxury of time and hindsight are wonderful things indeed”, so here goes the problem specification.

Specification

We are given a list of numbers, representing walls of different height. The goal is to determine how much rain water can be accumulated between the walls.

Here’s a visual representation of [2,5,1,2,3,4,7,7,6]:

[2,5,1,2,3,4,7,7,6]

In this case, the amount of accumulated water is 10 (note that there are no implicit side walls).

Requirements

Because of the puzzle popularity, many solutions, in different programming languages, were posted:

Approach, conciseness, performance and even correctness of those solutions vary greatly. What kind of solution do we want? It must be:

  • efficient (O(n)time complexity),
  • concise (1-5 lines of code),
  • compatible with linear-access sequences (like list),
  • preferably “functional”.

Now you may take your time and try to solve this problem by yourself before reading any further. In this way, you will not only enjoy a rather cute puzzle, but be able to comprehend with greater clarity what I’m going to show next.

Formalization

Without regard for the whole process of reasoning, the problem can be formalized as follows:

  • water volume is the sum of all column volumes;
  • column volume is a positive difference between wall height and limiting water level;
  • limiting water level is a minimum of left- and right peak wall heights;
  • left/right peak wall height is a maximum wall height before/after column.

Multiple-pass solution

A concise “functional” solution that was presented later (and which I initially considered myself) is:

Haskell (slightly improved version):

volume xs = sum
  (zipWith (-) (zipWith min (scanl1 max xs) (scanr1 max xs)) xs)

A similar approach in other languages:

Scala:

def volume(xs: Seq[Int]) = 
  xs.scanLeft(0)(math.max).zip(xs).zip(xs.scanRight(0)(math.max))
    .foldLeft(0){ case (s, ((l, x), r)) => s + 0.max(l.min(r) - x) }

Clojure:

(defn volume [xs]
  (reduce + 0
    (map #(max 0 (- (min % %3) %2))
         (reductions max 0 xs)
         xs
         (reverse (reductions max 0 (reverse xs))))))

It’s actually a rather beautiful solution, because it almost directly captures the formalization stated above.

By creating the temporary collections we can avoid O(n2) time complexity (the algorithm runs in O(n) time) which arises from calculating limiting level “on the spot”. However, exactly because of that, we have to traverse the given collection multiple times.

Can we do better? It turns out that we can.

Single-pass solution

The main challenge in designing the algorithm is in that, at every step, we need to know both left- and right peak wall heights. At first glance, there’s no way to satisfy this condition neither using folding functions nor recursion.

Recursion is usually looked at either:

In reference to list folding, recursion is used to traverse a list either from left to right or from right to left.

However, such a dichotomy holds true only for the folding functions (namely, foldl and foldr). With recursion it’s possible to fold a list in both directions simultaneously! Here’s how:

Haskell:

volume = fst . f 0
  where f _ [] = (0, 0)
        f l (x:xs) = let (s, r) = f (max l x) xs
                     in (s + max 0 (min l r - x), max x r)

Scala:

val volume = {
  lazy val f: (Int) => Seq[Int] => (Int, Int) = l => {
    case Seq() => (0, 0)
    case Seq(x, xs @ _*) =>
      val (s, r) = f(l max x)(xs)
      (s + (0 max ((l min r) - x)), x max r)
  }
  f(0).andThen(_._1)
}

Clojure:

(def volume 
  (letfn [(f [l xs]
    (if (seq xs)
      (let [x (first xs) [s r] (f (max l x) (rest xs))]
        [(+ s (max 0 (- (min l r) x))) (max x r)])
      [0 0]))]
    (comp first (partial f 0))))          

In such a way, we can traverse a given list only once. Keep in mind though, that non-tail recursion implicitly “unfolds” and then “folds” a call stack to perform the calculation.

The single-pass code snippets are slightly more verbose (especial, the non-Haskell versions) than ones for the multiple-pass solution. That is because we are utilizing recursion directly, without abstracting it via a higher-order function. Let’s improve this.

Defining “foldlr”

Just like with foldl and foldr, it’s possible to encapsulate the details of folding a list from both sides.

Let’s create a helper higher-order function called “foldlr”:

Haskell:

foldlr :: (l -> t -> l) -> (l -> t -> r -> r) -> l -> r -> [t] -> r
foldlr _ _ _ r [] = r
foldlr f1 f2 l r (x : xs) = f2 l x (foldlr f1 f2 (f1 l x) r xs)

Scala:

implicit class SeqExt[T](val xs: Seq[T]) extends AnyVal {
  def foldlr[L, R](l: L, r: R)(f1: (L,T)=>L)(f2: (L,T,R)=>R): R =
    if (xs.isEmpty) r
    else f2(l, xs.head, xs.tail.foldlr(f1(l, xs.head), r)(f1)(f2))
}

Clojure:

(defn foldlr [f1 f2 l r xs]
  (if (seq xs)
    (f2 l (first xs) (foldlr f1 f2 (f1 l (first xs)) r (rest xs)))
    r))

Applying “foldlr”

Now we can rely on the foldlr function to simplify the code:

Haskell:

volume = fst . foldlr max
  (l x (s, r) -> (s + max 0 (min l r - x), max x r)) 0 (0, 0)

Scala:

def volume(xs: Seq[Int]) = xs.foldlr(0, (0, 0))(math.max)
  { case (l, x, (s, r)) => (s + 0.max(l.min(r) - x), x.max(r)) }._1

Clojure:

(defn vol [xs] (first (foldlr max
  (fn [l x [s r]] [(+ s (max 0 (- (min l r) x))), (max x r)])
  0 [0 0] xs)))

The simplification is rather impressive — the code now looks really concise and to the point. Besides, according to benchmarks (in Haskell), this code performs 2x faster than the code from the multiple-pass solution.

Conclusion

Is foldlr abstraction useful in practice? It’s hard to tell for sure, maybe… As for me, I have happened to use this approach in “real-life” projects several times already.

P. S. After publishing the original solution then I (unintendedly) got an invitation from Twitter. Though I don’t think I deserve it, as, under time pressure, I would probably produced no solution at all 🙂

Leave a Reply