FBHC 2013 Intro Round in Clojure
posted
Everyone these days uses C++ whenever participating in programming competitions. Even I, a person who prefers to use Java in competitions, see the appeal of using C++ in competitions:
- It is less verbose than Java.
- It has the data structures you need.
- For people with an understanding of how data structures work, it’s easier to create data structures you need.
- Even if you have the “wrong” approach with a worse runtime than what the problem author expects (say you’ve implemented O(n √ n ) instead of O(n log n), you may still get away with it.)
- It’s easy to read input.
However, at times it’s impossibly hard to solve a problem without immutable data structures: Sometimes you have to do recursion and memoization, and doing so with mutable structures seems to be a recipe for disaster. And whenever your code turns more than 200 lines long you tend to lose control and end up debugging some issue, which wouldn’t be there if you did it functionally.
As such, maybe there’s time for a change. Not only because C++ lacks some features, but also because people need to see things from a different perspective: Too many people these days use imperative languages without knowing about functional ones. “Forcing” people to learn how to program functionally will make them able to utilize functional tricks, even if they’re still going to write their applications in imperative languages.
So why not try out Clojure in programming competitions? For non-functional programmers, you’ll feel crippled for some time due to the lack of mutability. For people familiar with both worlds, the main issue is how one does “the mutable stuff” one tends to do with data structures in imperative languages. It really requires thought, but when it’s done often, it gives you the power to design persistent data structures—a valuable skill in our world with ever-increasing concurrent computations.
Enter Facebook Hacker Cup
The qualification round of Hacker Cup has ended, but there’s still Round 1 for those interested in working in new ways. Usually, Round 1 requires you to solve all 3 problems correctly, but without time limit. If you feel confident enough that you’ll solve every single one of them, I encourage you to challenge yourself to use Clojure or some other functional programming language.
If you’re not sure how you should use Clojure to solve these problems, read on: I’ll show how I would have solved the problems in the Qualification round if I were using Clojure. Mind you, this is not an introduction to Clojure, so don’t treat it as such.
First of all, be sure you have Clojure in your environment. Installing it on
Linux-based systems should be as easy as installing the clojure
or
clojure1.4
package with your preferred package manager. And with that, we’re
off.
Beautiful Strings
The easiest problem in the qualification round had to do with strings, and the problem text was as follows1:
When John was a little kid he didn’t have much to do. There was no internet, no Facebook, and no programs to hack on. So he did the only thing he could… he evaluated the beauty of strings in a quest to discover the most beautiful string in the world.
Given a string s, little Johnny defined the beauty of the string as the sum of the beauty of the letters in it.
The beauty of each letter is an integer between 1 and 26, inclusive, and no two letters have the same beauty. Johnny doesn’t care about whether letters are uppercase or lowercase, so that doesn’t affect the beauty of a letter. (Uppercase
F
is exactly as beautiful as lowercasef
, for example.)You’re a student writing a report on the youth of this famous hacker. You found the string that Johnny considered most beautiful. What is the maximum possible beauty of this string?
Input
The input file consists of a single integer m followed by m lines.
Output
Your output should consist of, for each test case, a line containing the string “Case #x: y” where x is the case number (with 1 being the first case in the input file, 2 being the second, etc.) and y is the maximum beauty for that test case.
Constraints
5 ≤ m ≤ 50
2 ≤ length of s ≤ 500
Example input: | Example output: |
---|---|
|
|
The problem and Its Solution
Restating the problem, we can say that we want to give every letter of the alphabet a value from 1 to 26, and that no two letters have the same value. Given a string s, we want to find a way to assign letters a value
Intuitively, we can see that we should give the highest value to the most frequent letter and the lowest value to the least frequent letter. Notice that since we’re asked to find the sum, it doesn’t really matter which letter we give what value. As such, knowing the frequencies is enough to solve this problem.
Since we can sort the frequencies, our problem becomes fairly straightforward: Just find the frequencies, sort them highest to lowest, and give the first value 26, second 25, etc.
Implementation
To start up, we must have some way of running these things. To make this blog
post easier to read, I’ll avoid Leiningen and go with bare Clojure. However, if
you want to utilize libraries and don’t hassle too much with the classpath, make
a project and use lein run -m namespace.name
.
Create a file and make it executable (chmod +x foo.clj
), and put in the
following:
#!/usr/bin/env clojure
(println "Hello, World!")
Then execute it as any other binary (./foo.clj
) and ensure that you get
Hello, World!
in the console. Don’t be afraid if it seems horribly slow: It’s
only the startup time of Clojure, not the code itself which is slow.
Now, our first “problem” would be to read input from stdin. With Clojure, that’s
usually easy: Use (read)
when you need to read in integers or doubles, and
(read-line)
if you need to read a line. For this round, this suffices, but if
you need to read words, you have to wrap the read with a str like so:
(str (read))
2.
With that and the Clojure cheatsheet up, something like this shouldn’t be impossible for Clojurists to generate:
#!/usr/bin/env clojure
(defn best-solution []
(->> (read-line)
.toLowerCase
(filter #(Character/isLetter %))
frequencies
vals
(sort >)
(map * (iterate dec 26))
(reduce +)))
(defn pprint-result [pos res]
(println (format "Case #%d: %d" (inc pos) res)))
(defn main []
(let [T (read)]
(read-line) ;; Finish off the first line.
(dotimes [t T]
(->> (best-solution)
(pprint-result t)))))
(main)
best-solution
should be rather easy to follow: Read the line, make it
lowercase, filter the letters (keep only letters), count the frequencies, take
the values (the numbers), sort them higher to lower, multiply each frequency by
its optimal value, and sum them together. Every comma (,
) represents a line in
the best-solution
function, and it explains better how instead of why. A
common argument functional programmers use, perhaps justified this time.
Balanced Smileys
The second problem is about balancing parentheses, a suitable problem for the Lisp dialect we’re working with:
Your friend John uses a lot of emoticons when you talk to him on Messenger. In addition to being a person who likes to express himself through emoticons, he hates unbalanced parenthesis so much that it makes him go :(
Sometimes he puts emoticons within parentheses, and you find it hard to tell if a parenthesis really is a parenthesis or part of an emoticon.
A message has balanced parentheses if it consists of one of the following:
- An empty string “”
- One or more of the following characters:
a
toz
, ` ` (a space) or:
(a colon)- An open parenthesis
(
, followed by a message with balanced parentheses, followed by a close parenthesis)
.- A message with balanced parentheses followed by another message with balanced parentheses.
- A smiley face
:)
or a frowny face:(
Write a program that determines if there is a way to interpret his message while leaving the parentheses balanced.
Input
The first line of the input contains a number T (1 ≤ T ≤ 50), the number of test cases.
The following T lines each contain a message of length s that you got from John.Output
For each of the test cases numbered in order from 1 to T, output
Case #i:
followed by a string stating whether or not it is possible that the message had balanced parentheses. If it is, the string should beYES
, else it should beNO
(all quotes for clarity only)Constraints
1 ≤ length of s ≤ 100
Example input: | Example output: |
---|---|
|
|
The Problem and Its Solution
Restating the problem, we can say that our goal is to figure out whether we can
interpret the message in a way that balances the parentheses. Handling (
and
)
is easy enough, but :(
can be treated as :(
or as :
followed by a
(
. As such, we need to find a way to represent this.
One solution would be to have a set of possible states instead of a single
state. A state represents how many parentheses we have opened, for example, 3
would mean we have opened three parentheses. Our set of states when we start are
thus #{0}
, as we’ve opened zero parentheses.
Whenever we hit upon a (
, we increment every value in the set by one, and when
we hit upon a )
, we decrement them by one. Whenever we hit upon :(
, we take
the set and increment every value by one. But instead of keeping that specific
set, we merge it with the old set, having both possible states within a single
set. We do the same for :)
, except that we decrement every value in the
set. When all elements are consumed, we check if the set contains the value
zero, and report YES
or NO
based upon that.
As a minor notice: We remove every negative state in the set when we decrement, as it is not a legal position. (It’s impossible to open a negative amount of parentheses!)
Implementation
Reading in the values and spitting out a solution can be done in more or less
the exact same fashion as the previous problem, so let’s keep that part of the
code. We start off by removing irrelevant data, and group the different
combinations ((
, )
, :(
and :)
) in a list through the magic of regular
expressions.
#!/usr/bin/env clojure
(defmulti update-state (fn [state type] type))
(defmethod update-state "(" [state _]
(set (map inc state)))
(defmethod update-state ")" [state _]
(->> (map dec state)
(remove neg?)
set))
(defmethod update-state :default [state smiley]
(into state (update-state state
(subs smiley 1))))
(defn remove-irrelevant [string]
(re-seq #"\(|\)|:\(|:\)" string))
(defn possibly-balanced? []
(let [relevant (remove-irrelevant (read-line))]
(-> (reduce update-state #{0} relevant)
(contains? 0))))
(defn pprint-result [pos res]
(println (format "Case #%d: %s" (inc pos)
(if res "YES" "NO"))))
(defn main []
(let [T (read)]
(read-line) ;; Finish off the first line.
(dotimes [t T]
(->> (possibly-balanced?)
(pprint-result t)))))
(main)
We then perform a reduction, using update-state
as the reducing function,
#{0}
as the initial state, and using the different combinations we’ve filtered
out as the collection to reduce over. We then check that the final state
contains zero or not.
update-state
is a multimethod, which for OO-programmers can roughly be
translated to polymorphism without encapsulation and/or inheritance
involved. Whenever we hit on either (
and )
, we update the state as
specified in the written solution above. Those should be rather straightforward
to find. For :(
and :)
, we dispatch on the :default
method and take the
current state and merge it with the state we get by using either (
or )
.
Find the Min
The last and hardest problem is related to finding the maximal
After sending smileys, John decided to play with arrays. Did you know that hackers enjoy playing with arrays? John has a zero-based index array, m, which contains n non-negative integers. However, only the first k values of the array are known to him, and he wants to figure out the rest.
John knows the following: for each index i, where k ≤ i < n, m[i] is the minimum non-negative integer which is *not* contained in the previous *k* values of m.
For example, if k = 3, n = 4 and the known values of m are [2, 3, 0], he can figure out that m[3] = 1.
John is very busy making the world more open and connected, as such, he doesn’t have time to figure out the rest of the array. It is your task to help him. Given the first k values of m, calculate the nth value of this array. (i.e. m[n - 1]).
Because the values of n and k can be very large, we use a pseudo-random number generator to calculate the first k values of m. Given positive integers a, b, c and r, the known values of m can be calculated as follows:
m[0] = a
m[i] = (b * m[i - 1] + c) % r, 0 < i < kInput
The first line contains an integer T (T ≤ 20), the number of test cases.
This is followed by T test cases, consisting of 2 lines each.
The first line of each test case contains 2 space separated integers,
n, k (1 ≤ k ≤ 105, k < n ≤ 109).
The second line of each test case contains 4 space separated integers
a, b, c, r (0 ≤ a, b, c ≤ 109, 1 ≤ r ≤ 109).Output
For each test case, output a single line containing the case number and the nth element of m.
Example input: | Example output: |
---|---|
|
|
Problems
The first thing to notice is that b * m[i] + c may overflow if using the wrong numeric types. With longs—as is standard for Clojure from 1.4—this result is contained within the restrictions given. As such, this is not a worry in Clojure.
A more troublesome worry—or should I say the troublesome worry—is the size of n. It seems merely impossible to run this in time without doing any work at all. As such, it’s likely that we’re going to do some modulo on the result. We’ll have to do some analysis later on to find out.
To restate the problem, we have an array m where the k first elements in this list have been given, and the rest (n - k) of the values are defined as the lowest non-negative integer which not found in the k elements in front of this element.
The first thing one may notice is that if n is sufficiently large, then the values generated in the sequence may not be larger than k: As we can only peek back at the k elements behind us, this seems reasonable. In addition, as we can only peek at those k elements, it seems reasonable to believe that we at some point end up taking the value of the k+1th element behind us. The question is just when we do so.
To figure out, let’s have a look at an example
+-+-+-+-+ +=+=+=+=+=+-+-+-+-+-+
m -> |0|2|0|1| |3|4|2|0|1|3|4|2|0|1|...->n
+-+-+-+-+ +=+=+=+=+=+-+-+-+-+-+
k=4
So we see that there will be a repeating sequence of k+1 numbers, found immediately after we’ve generated the k first elements. As such, we can skip k+1 * i ≤ n-k elements and still calculate the correct element.
Implementation
Generating the random numbers seems to be fairly straightforward:
(defn lc-generator [a b c r]
(iterate #(mod (+ (* b %) c) r) a))
This creates an infinite sequence of random numbers generated by the linear congruential generator specified in this exercise. Picking the k first elements of this lazy sequence gives us what we want.
The next part is to generate the k+1 next elements in the array. The only “hard” part about this is to find the lowest element in the sequence of values we got. Having a sorted set which we fill up with the first k values, remove the elements in the array and put back the k+1th element after we’ve picked the lowest value should suffice. The only thing we have to be careful with is that there may be multiple similar values in the k first elements, so we keep a hash map with the position of the last element with that value in order to avoid using it too early.
We then just modulo, and pick the nth element in the seq. It’s not as hard as you may think, but watch out for off-by-one errors.
#!/usr/bin/env clojure
(defn lc-generator [a b c r]
(iterate #(mod (+ (* b %) c) r) a))
(defn last-position-table [lc-seq k]
(->> (take k lc-seq)
(map-indexed (fn [a b] [b a]))
(into {})))
(defn candidate-set [lc-seq k]
(let [k+1-set (into (sorted-set) (range (inc k)))]
(reduce disj k+1-set
(take k lc-seq))))
(defn k+1 [lc-seq k]
(let [hm (last-position-table lc-seq k)
ss (candidate-set lc-seq k)]
(letfn [(generate [i [m_i & r] ss]
(if (= i (inc k))
nil
(lazy-seq
(let [ss* (if (<= (hm m_i 0) i)
(conj ss m_i)
ss)
lowest (first ss)]
(cons
lowest
(generate (inc i) r
(disj ss* lowest)))))))]
(generate 0 lc-seq ss))))
(defn find-last []
(let [[n k a b c r] (doall (repeatedly 6 read))
lc-seq (lc-generator a b c r)]
(let [next-k+1 (k+1 lc-seq k)]
(nth next-k+1 (mod (- n k 1) (inc k))))))
(defn pprint-result [pos res]
(println (format "Case #%d: %d" (inc pos) res)))
(defn main []
(let [T (read)]
(read-line) ;; Finish off the first line.
(dotimes [t T]
(->> (find-last)
(pprint-result t)))))
(main)
last-position-table
returns a hash map/lookup table which tells us which index
element e has within the k first elements, or nil
if it’s not
there. candidate-set
returns a sorted set of candidates; a set with the
k+1 first non-negative integers, but without those which is used in the
k first elements.
Perhaps the hardest part to understand within this chunk is the generate
function. It generates the k+1 elements after the k first ones lazily,
using the sorted set to find the min fast and the hash map to avoid putting in
integers which is still contained in the k first elements. However, it’s not
very magical nor unknown, just a bit different for imperative and
object-oriented programmers.
Improving Ancient Skills
And that’s how easy the Facebook Hacker Cup Qualification round is in Clojure. Mind you, I’m not suggesting that experienced C++ algorists should stop using C++ and just use functional languages in programming competitions. Programming competitions are currently designed to be solvable with some mutable state and code running sequentially, at which C++ and Java excel. There are some exceptions: Some of the easier problems are in fact way easier to solve in a functional programming language. I wouldn’t be surprised if there are some in the upcoming round in Hacker Cup, and with 24 hours to do them you have the time to try and fail some. It also gives you a relatively easy introduction to functional programming. As such, I would challenge you to try it out.
However, I can understand why people prefer to use C++ and Java for these competitions. The problems are designed around these languages. I cannot think of any hard programming competition problem I’ve worked on which would be easier to solve with persistent data structures instead of transient ones. To make matters worse, very few competitions accept threading, and in e.g. TopCoder and ICPC you’re only allowed to use C, C++, and Java. While it gives you great knowledge of data structures, algorithms and great programming skills in general, maybe the competitions should be modernized. We’re living in a world where we most of us create real living systems. They often use more than one thread, and sometimes they even run over multiple computers. Creating programs which accept input and spit out some output is not how we write our programs anymore. Maybe the competitions we create should reflect that change, and not hone our abilities in data structures unsuitable for the concurrent world.
-
These problems are licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License. They are available at Facebook Hacker Cup 2013 Qualification Round. ↩
-
Don’t use laziness when reading input from stdin, it may appear out-of-order if you try to do things in parallel. Wrap every
map
,for
andrepeatedly
with adoall
:(doall (repeatedly 10 #(str (read))))
will for instance eagerly read 10 words and return them in a seq. ↩