[Haskell for CP] Solving an interactive problem using laziness

Haskell is an interesting and beautiful language for many reasons. One is laziness, which lets us write very elegant computation recipes.

I will attempt to solve an interactive CP problem from a recent codeforces contest - Guess the K-th Zero.

§ Problem

There is a hidden array consisting of 0s and 1s, of length at most $2 \cdot 10^5$. In a query, you can ask the sum of any subarray. The goal is to find the $k$-th zero from the left, with at most $20$ queries.

§ Solution

We can binary search on the answer. Say the $k$-th zero is at position $p$. If we query the sum for range $[1, i]$, we have two cases. If $i < p$, then $sum (a_1 .. a_i) < i + k$. Otherwise it will be $\ge i + k$.

We binary search on the index $i$, to find the lowest $i$ satisfying $sum(a_1 .. a_i) \ge i + k$.

§ Query Object

For convenience, I will create a datatype for representing queries. You can either ask the sum, or tell the answer.

data Query = Ask Int Int | Answer Int

instance Show Query where
  show (Ask l r) = "? " ++ show l ++ " " ++ show r
  show (Answer x) = "! " ++ show x

§ Solution

Say we are given the inputs $n$ and $k$ and then the query responses as a list. We should write a function that returns an array of queries.

Logically, we know that we can only construct the query array after knowing $n$ and $k$ and the previous query responses. But here, we will be oblivious to that, and act as if we have all the query responses with us. Or rather, we have the thunk to it - a promise that when we really want it, the value will be made available. Isn’t lazy evaluation cute!

This enables me to write the solve function as follows:

solve :: [Int] -> [Query]
solve (n:k:rs) = askOrAnswer 1 n rs -- rs is the list of query responses
  where
    askOrAnswer :: Int -> Int -> [Int] -> [Query]

askOrAnswer lo hi rs will try to look for the answer in the range $[lo, hi]$, assuming that rs are the responses to all future queries. So while building it, we can only use rs in the tail of the return value. The head of the returned query list cannot force rs, otherwise we will have a deadlock.

askOrAnswer lo hi rs
  | lo == hi = [Answer lo] -- We are done, just answer and stop. At this point, rs should be empty.
  | otherwise = -- let us make a query, and continue the search
      let mid = (lo + hi) `div` 2 in
        Ask 1 mid : -- head of the return value
         (let (lo', hi') = (if head xs + k > mid then (mid + 1, hi) else (lo, mid))
          in askOrAnswer lo' hi' (tail xs))

Notice that in the body of askOrAnswer, we take head rs, which is the response to Ask 1 mid. Now, because it is in the tail, it will only get forced after the query Ask 1 mid gets printed. So we know that rs has a value (is not empty) at that point of evaluation. We then shorten our range and continue the search. Notice that I just build the tail of the list by recursively calling askOrAnswer.

§ Interaction

Now to make this work, we need a fully interactive pipeline. That is, all functions that process both the input and output must be fully lazy.

main = interact compute

compute :: String -> String
compute = lines -- split input into lines. This ensures we can consume one line at a time
           >>> (\(nt:ls) -> head (words nt) : ls) -- First line contains n, t. Drop the unused t.
           >>> map read -- map all the inputs to Int. map is also lazy, only applies when forced.
           >>> solve -- pass the list to solve, and get the Query list
           >>> map show -- again lazy, convert to the format the problem expects.
           >>> unlines -- Concatenate by `\n`.

Let us try to understand this. The interact function takes this recipe - which is a function of type String -> String. main will pass the input from stdin to the function, and then pass the result to stdout. So our source of forcing is stdout.

When stdout asks for the next character, it will force compute. This forces unlines which forces map show on the first element, which forces solve, which calls askOrAnswer 1 n rs, which will generate the first Ask 1 mid, which gets output to stdout.

At this point, stdout will try to force more, but the computation waits because you want head rs to become available. When the user types in the query response, it becomes available, and effectively continues the computation.

By tracking the order of forcing, we can see when what computation happens!

Caveat - Buffered IO streams

This code runs perfectly fine when invoked from a terminal UI. But it failed when I submitted to CF, with the verdict “Idleness Limit Exceeded”. This means that the interactive runner infinitely waited for the first query.

Why is this happening? Well, it turns out our terminal interfaces keep looking for the output even when not explicitly flushed. Unless some CPU intensive process is blocking the processor, our terminal process will try to get the next few bytes always. This is so that the user experience is cleaner, without having to explicitly wait for flushes.

But an arbitrary judging environment need not do that, so we must explicitly disable stdout from buffering. Just flush each character as and when computed. To do this, we invoke a setting from System.IO - hSetBuffering stdout NoBuffering. Coupled with this, the solution works!

main = do
  hSetBuffering stdout NoBuffering
  interact compute

Or for those more functionally inclined:

main = hSetBuffering stdout NoBuffering >> interact compute

§ Bonus

Is there some way to make this code cleaner and more natural? Right now it still feels like an iteration forcibly made recursive. Can we write it using folds somehow? Remember to use only left folds, as right folds are not lazy!

Last updated: 22.06.2021