Friday 29 July 2011

Simple Simple Recursion

The next question asks what the following function will return:

((fn foo [x] (when (> x 0) (conj (foo (dec x)) x))) 5)


I hadn't spotted until now that an anonymous function definition can give the function a name, that is to say a name that is visible inside the definition, so you can easily write recursive lambdas.

Anyway clearly this is going to loop down from 5 to 1, building up a list as it goes - so the answer is

(1 2 3 4 5)


Wrong.  This is what happens:

user=> ((fn foo [x] (when (> x 0) (conj (foo (dec x)) x))) 5)
(5 4 3 2 1)


Ha. Maybe it would be clearer (to me at any rate) if I did this with cons instead of conj:

user=> ((fn foo [x] (when (> x 0) (cons x (foo (dec x))))) 5)
(5 4 3 2 1)


The alternative way with cons instead of conj: the x and the call to foo go round the other way. But for lists, conj and cons do the same thing - they add the new element to the start of the list:

user=> (cons 1 '(2))
(1 2)
user=> (conj '(2) 1)
(1 2)


It's with vectors that conj goes the other way:

user=> (conj [2] 1)
[2 1]


when is like if except there is no else part - when the test returns false the when clause returns nil.
You can conveniently cons onto nil to start a list as you would cons onto the empty list:

user=> (cons :a nil)
(:a)
user=> (cons :a '())
(:a)
user=> (cons :a ())
(:a)


Even though nil and () are not the same.

OK, so in this function what happens is the recursive calls burrow their way down from 5 to zero, get a nil when they are down there, and then pop their way back out consing the numbers 1, 2, 3, 4, 5 onto the return list as they go -

(foo 5)
(cons 5 (foo 4))
(cons 5 (cons 4 (foo 3)))
(cons 5 (cons 4 (cons 3 (foo 2))))
(cons 5 (cons 4 (cons 3 (cons 2 (foo 1)))))
(cons 5 (cons 4 (cons 3 (cons 2 (cons 1 (foo 0))))))
(cons 5 (cons 4 (cons 3 (cons 2 (cons 1 nil)))))
(cons 5 (cons 4 (cons 3 (cons 2 '(1))))
(cons 5 (cons 4 (cons 3 '(2 1)))
(cons 5 (cons 4 '(3 2 1)))
(cons 5 '(4 3 2 1))
'(5 4 3 2 1)


Right.  Got it.

Interleave Two Sequences

Similarly to interleave two sequences, without of course using the interleave function.

So for example [1 2 3] [:a :b :c] => [1 :a 2 :b 3 :c]

(fn [sq1] [sq2]
  (loop [src1 sq1, src2 sq2, dest []]
    (if (and (seq src1) (seq src2))
      (recur
        (rest src1)
        (rest src2)
        (conj dest (first src1) (first src2)))
       dest)))

Wednesday 27 July 2011

Transforming Sequences

So with those points in mind here are two examples from 4Clojure.

To pack a sequence - that is to take all runs of one or more equal elements and put them inside their own sub-sequences. At each turn round the loop get the next element and then use take-while to gather equal ones into a sub-list and use drop-while to skip over the ones waiting to be processed, like this:

(defn pack-sequence [sq]
  (loop [src sq, dest []]
    (if (seq src)
      (let [x (first src)]
        (recur
          (drop-while #(= x %) src)
          (conj dest (take-while #(= x %) src))))
      dest)))


So for example (pack-sequence '(1 1 2 2 2 3) => '((1 1) (2 2 2) (3))

And similarly to replicate a sequence, meaning to duplicate each element a nominated number of times: use repeat to do the duplication and concat to add the elements to the growing destination sequence:

(defn replicate-sequence [sq n]
  (loop [src sq, dest []]
    (if (seq src)
      (recur
        (rest src)
        (concat dest (repeat n (first src))))
    dest)))


So for example (replicate-sequence '(1 2 3) 2) => '(1 1 2 2 3 3)

Doing Sequences

So I have invested in The Joy Of Clojure by Fogus & Houser. I just felt like buying a book. Well, with a few of their tips I can update my scheme for transforming sequences. One observation is that "quoting the empty list isn't idiomatic Clojure" (p.34) - so you don't have to write '() because () will do. Another observation is that the idiomatic way to test whether a sequence is empty is to use (seq s), which returns nil if (empty? s), and nil evaluates to boolean false. On p.87 they point out that you can remove that call to reverse on closing a sequence transformation by starting with a vector and using conj instead of cons. So instead of:

(defn copy-sequence-1 [sq]
  (loop [src sq, dest ()]
    (if (seq src)
      (recur
        (rest src)
        (cons (first src) dest))
      (reverse dest))))


You can do this:

(defn copy-sequence-2 [sq]
  (loop [src sq, dest []]
    (if (seq src)
      (recur
        (rest src)
        (conj dest (first src)))
      dest)))


You get a vector back but that's not a problem. In fact we read "in idiomatic Clojure code lists are used almost exclusively to represent code forms" (p.90) - for data rather than code "lists rarely offer any value over vectors".

Saturday 23 July 2011

Removing Duplicates

OK the next task is to remove successive duplicates from a sequence.  There is a Clojure function distinct that nearly does this, but it removes all duplicates, not just successive ones:-

user=> (distinct "Mississippi")
(\M \i \s \p)


My first thought is that I can build this from other existing functions.  First, partition the sequence into pairs of elements, that is to say elements 1 and 2, then elements 2 and 3, then elements 3 and 4 etc.:

user=> (partition 2 1 "Mississippi")
((\M \i) (\i \s) (\s \s) (\s \i) (\i \s) (\s \s) (\s \i) (\i \p) (\p \p) (\p \i))


Now filter out the ones where the elements are the same - ie (first %) is not equal to (second %) - because these are the ones where there is a duplicated element:

user=> (filter #(not (= (first %) (second %))) *1)
((\M \i) (\i \s) (\s \i) (\i \s) (\s \i) (\i \p) (\p \i))


(In the Clojure REPL, *1 returns the result of the previous evaluation)

And now put the remaining pairs back into a sequence by getting the first element of each pair with a map:

user=> (map first *1)
(\M \i \s \i \s \i \p)


Nope - of course this misses the final element of the original list, which was never the first element of one of the pairs.

Incidentally we can turn this sequence back into its original string form as follows:

user=> (apply str *1)
"Misisip"


Anyway, perhaps this will require a purpose built function.  Here is the function that eliminates successive duplicates:

(defn deduplicate [sq]
  (if (empty? sq)
  '()
  (let [x (first sq)
        xs (rest sq)]
    (cons x (deduplicate (drop-while #(= x %) xs))))))


This is a basic pattern to transfer one sequence to another via a recursive function: the part that does the work is the drop-while which removes any elements from the rest of the list that match the element being processed.

To drop this into an existing line of code you would wrap the definition in a (do....) construction and let the value you've just defined be returned as the value of the expression.

However, the rules on 4Clojure say that you can't define any new functions.  Therefore it has to be an anonymous function.  How to write an anonymous recursive function? Hmm.  Here I'm going to use the loop/recur construction, so the function is iterative rather than recursive.  It contains a GOTO rather than a CALL to do the loop.  I pass two parameters, one the source sequence that the elements are coming from and the other a destination sequence which is where they will be accumulated.  Yes anyway it looks like this:

(fn [sq]
  (loop [source sq destination '()]
    (if (empty? source)
        (reverse destination)
        (let [x (first source]
              xs (rest source)]
          (recur
            (drop-while #(= x %) xs)
            (cons head destination))))))


Using a recur you build the destination list the wrong way round so I return the reverse thereof.

Monday 18 July 2011

Home-made flatten function

OK, the recursive function that flattens:

(defn my-flatten [sq]
  (if (empty? sq)
  '()
  (let [head (first sq)
        tail (rest sq)]
       (if (sequential? head)
          (concat (my-flatten head) (my-flatten tail))
          (cons head (my-flatten tail))))))

So now we have:

user=> (my-flatten '(1 2 3 4 (5 6 ( 7 8 ) 9 10 ) (11 12) 5 6))
(1 2 3 4 5 6 7 8 9 10 11 12 5 6)


Looks about right. So, you transfer a list to another list but if the next element is a list then you flatten this to get the elements out and then concatentate on to the processed version of the rest (you don't cons because this would put the elements back in a list).

Thursday 7 July 2011

Flatten a Sequence - hmm

As for the requirement to create code that flattens a nested sequence into a flat sequence - but without using the function flatten - I couldn't think of a way to do this without bringing in a recursive function. Therefore I cheated and looked at the source code for the function flatten. It turns out that there is a Clojure function tree-seq than negotiates trees for you, thereby doing the hard bit of climbing your tree and fetching it down
in branches. So my solution is a fraud as I looked it up:

#(filter
  (complement sequential?)
  (rest (tree-seq sequential? seq %)))


Of course it also helps to find the predicate sequential?, which is the general purpose tester for the compound data structures list vector etc. Now to atone for cheating I should write the recursive function anyway.

Tuesday 5 July 2011

Fibonacci and Factorial

The next example is the Fibonacci series.  I had a feeling this would appear sooner or later.  What, haven't we done the factorial function yet?  I've been writing software on and off for thirty odd years and, incredibly, seem to have go this far without ever needing a Fibonacci number. Or a factorial function.

Anyway the trick with the Fibonacci series - and I remember this from Halloway's book - is to generate the numbers in pairs, so each element contains enough information go make the next one - so if one element is [a b], the next one is [b (+ a b)].  Your vector pair is a two number wide window into the Finonacci sequence, which flows through your windows from right to left.  Start with [0 1] and iterate this transformation and behold:


user=>(take 20 (iterate (fn [[a b]] [b (+ a b)]) [0 1]))
([0 1] [1 1] [1 2] [2 3] [3 5] [5 8] [8 13] [13 21] [21 34] [34 55] [55 89] [89
144] [144 233] [233 377] [377 610] [610 987] [987 1597] [1597 2584] [2584 4181]
[4181 6765])


We want only the second number from each pair to solve the puzzle, so we  map second:


#(map second
  (take %
    (iterate (fn [[a b]] [b (+ a b)]) [0 1])))
 


Clojure helps us to keep this concise.  It you pass a vector such as [0 1] to a function with a parameter list like [x] then x is set to the value of the vector within the body of the function, and you would have to winkle out the values 0 and 1 explicitly:  however, if you change the parameter list to this [[a b]] then the vector is opened up when the function is called, and within the body of the function a will be 0 and b will be 1.  You can do the same trick with other data structures.

A Palidrome detector will obviously work by comparing an object with the reverse of the object - - but hang on, take the hint - - (reverse) returns a seq - therefore you have to compare the reverse of the object with the seq of the object, like this:


#(= (reverse %) (seq %))


To work out a Factorial looks like a gift: you reduce the list of numbers from 1 to x by multiplying them together - you get the sequence of numbers from 1 to x from the function range - so factorial % comes out as


#(reduce * (range 1 %))


Wrong.  Too confident.  In fact the function (range) does not include its upper limit - so the factorial needs to be:


#(reduce * (range 1 (inc %)))

Monday 4 July 2011

Counting the Elements

As for counting the number of elements in a sequence without using count or writing a loop - the best I could come up with before my lunch break concluded was to use a lambda (fn [x] 1) that turns everything to one, map this onto the sequence to get a load of ones, and then use reduce and plus to add these ones together:-


#(reduce + (map (fn [n] 1) %))


hmm

Adding the elements of a list of numbers is easier than this - you can use reduce and the plus operator.  To get the odd numbers from a list of numbers you user the function filter with the predicate function odd?

The the examples seem to be out of sequence here.  Anyway, the next one is to produce the reverse of a sequence without using the functions reverse or rseq.

This took a bit more thought.  In the end I changed the original sequence to a list of pair vectors where the first element is an increasing number, like this:

Starting with this list:


user=>'(:a :b :c :d)
'(:a :b :c :d)


Use a map to insert numbers in each one:


user=>(map (fn [x y] [x y]) (iterate inc 1) *1)
([1 :a] [2 :b] [3 :c] [4 :d])


Now sort this using a function that compares those numbers we added:


user=>(sort (fn [x y] (> (first x) (first y))) *1)
([4 :d] [3 :c] [2 :b] [1 :a])


Finally extract the elements by mapping second onto the result:


user=>(map second *1)
(:d :c :b :a)


So the full solution looks like this:


#(map second
 (sort
  (fn [x y] (> (first x) (first y)))
  (map (fn [x y] [x y])
  (iterate inc 1)
  %)))


But surely there is a simpler way?

Friday 1 July 2011

Lambdas in Clojure

The form (fn [...] (....)) is the equivalent of (lambda (...) (....)).  It puts the parameters in a vector [...] rather than a list (...).  This comes in handy later because it clarifies the definitions especially when you use the option to use multiple argument lists.

There is also an alternative economical form for anonymous functions - for example a lambda to add 5 to a number is just #(+ % 5): to compare a number to 5 would be #(> % 5) and so on.  These are just the right size to pop into map and filter functions.  The symbol % stands for the argument to the function.

So for example the question to find the penultimate element of a list, by reversing it and then getting the second element:

(#(second (reverse %)) '(1 2 3 4)) => 3


or using function composition this could be:-

(#((comp second reverse) %) '(1 2 3 4)) => 3


which in this instance is no simpler.

When I came to the example of returning the nth element of a sequence I caught myself starting to write a loop.  Hello?  In low-level languages we write jumps: in middle-level languages we write loops: but in high-level languages we operate on compound data structures as a whole.  The first element is given by the function first.  To get later ones we drop so many from the front of the sequence and then get the first of what's left - so to get the nth element of sq we want (first (drop n sq)).

Yes there is a function nth that does this anyway but the point is to see how that works.