Recursion Top Comparing Recursion and LoopingRecursion Patterns Contents

Recursion Patterns

You can download the functions defined or used in this chapter with the following commands:

    wget troll.cs.ua.edu/ACP-C/recursionPatterns.c
    wget troll.cs.ua.edu/ACP-C/recursionPatterns.h

These files will help you run the test code listed in this chapter. You will also need to download the integer list module.

Manipulating arrays and lists with recursion

Recursion, arrays, and lists go hand-in-hand. What follows are a number of recursive patterns involving arrays and lists that you should be able to recognize and implement. Recursion for lists makes heavy use of the head and tail functions from the previous chapter. As all the examples in this chapter use lists of integer values. We will also make use of the following analogs for arrays:

list expression equivalent array expression
head(items) items[0]
tail(items) items+1
items == 0 size == 0

The last expression tests whether the list (or array) is "empty". Note that an array name points to the first slot in the array. Suppose the name of the array is items. Then items + 1 points to the second slot in the array. If the size of the array items is n, the the size of the array pointed to by items + 1 must be n - 1. This is why adding one to the array name is equivalent to taking the tail of a list.

The counting pattern

The counting pattern is used to count the number of items in a collection. If the collection is empty, then its count of items is zero. The following function counts and ultimately returns the number of items in an array:

    int
    countList(Node *items)
        {
        if (items == 0) //base case
            return 0;
        else
            return 1 + countList(tail(items));
        }

The functions works on the observation that if you count the number of items in the tail of a list, then the number of items in the entire array is one plus that number. The extra one added in accounts for the head item that omitted when the tail was counted.

The array version is similar:

    int
    countArray(int *items,int size)
        {
        if (size == 0) //base case
            return 0;
        else
            return 1 + countArray(items+1,size-1);
        }

As with using a loop, the counting pattern for arrays is a bit useless as we need to the the size of the array before we can determine its size. However, the pattern is a basis for all other recursive patterns for arrays:

The accumulate pattern

The accumulate pattern is used to combine all the values in a collection. The following function performs a summation of the list values:

    int
    sumList(Node *items)
        {
        if (items == 0) //base case
            return 0;
        else
            return head(items) + sumList(tail(items));
        }

Recall that the head function calls getNodeValue on the node at the front of a list of items. Note that the only difference between the count function and the sum function is the recursive case adds in the value of the head item, rather than just counting the head item. That the functions countList and sumList look similar is no coincidence. In fact, most recursive functions, especially those working on collections, look very similar to one another.

The equivalent sumArray is left as an exercise.

The filtered-count and filtered-accumulate patterns

A variation on the counting and accumulate patterns involves filtering. When filtering, we use an additional if statement to decide whether or not we should count the item, or in the case of accumulating, whether or not the item ends up in the accumulation.

Suppose we wish to count the number of even items in a list:

    int
    countListEvens(Node *items)
        {
        if (items == 0) //base case
            return 0;
        else if (head(items) % 2 == 0)
            return 1 + countListEvens(tail(items));
        else
            return 0 + countListEvens(tail(items));
        }

The base case states that there are zero even numbers in an empty list. The first recursive case simply counts the head item if it is even and so adds 1 to the count of even items in the remainder of the list. The second recursive case does not count the head item (because it is not even) and so adds in a 0 to the count of the remaining items. Of course, the last return would almost always be written as:

    return countListEvens(tail(items));

The array version is similar:

    int
    countArrayEvens(int *items,int size)
        {
        if (size == 0) //base case
            return 0;
        else if (items[0] % 2 == 0) //head item always at index 0
            return 1 + countArrayEvens(items+1,size-1);
        else
            return countArrayEvens(items+1,size-1);
        }

As another example of filtered counting, we can pass in a value and count how many times that value occurs:

    int
    occurrencesInList(int target,Node *items)
        {
        if (items == 0)
            return 0;
        else if (head(items) == target)
            return 1 + occurrencesInList(target,tail(items));
        else
            return occurrencesInList(target,tail(items));
        }

An example of a filtered-accumulation would be to sum the even-numbered integers in an array:

    int
    sumArrayEvens(int *items,int size)
        {
        if (size == 0)
            return 0;
        else if (isEven(items[0]))
            return items[0] + sumArrayEvens(items+1,size-1);
        else
            return sumArrayEvens(items+1,size-1);
        }

The extreme and extreme index patterns

The extreme and extreme index patterns are a bit more difficult to implement recursively, due to the need to carry more information around while making recursive calls. For example, here is an example of the extreme pattern for arrays. We use a helper function that carries the extra information around:

    int
    getMax(int *a,int size)
        {
        return getMaxHelper(a,size,a[0],1); //start looking at index 1
        }

The third argument of the helper function is the assumed largest value. The fourth argument is the index at which one starts looking for a better max. Here is the helper function:

    int
    getMaxHelper(int *a,int size,int best,int i)
        {
        if (i == s) //no more indices to check
            return best;
        else if (a[i] > best) //found a bigger number!
            return getMaxHelper(a,s,a[i],i+1)
        else //best is still best
            return getMaxHelper(a,s,best,i+1)
        }

The first recursive case finds a better "best" number at the current index i, so it replaces the old best with the better candidate in the recursive call. The second recursive case doesn't replace best. In both recursive cases, the index is incremented so that subsequent elements can be checked.

The extreme index patter for arrays is similar:

    int
    getMaxIndex(int *a,int size)
        {
        return getMaxIndexHelper(a,size,0,1); //start looking at index 1
        }

In the case of getMaxIndex, the index of the best candidate so far is passed around, rather than its value. The implementation of getMaxIndexHelper is left as an exercise.

A similar strategy is used to find extreme values in a linked list.

The filter pattern

A special case of a filtered-accumulation is called filter. Instead of summing the filtered items (for example), we collect the filtered items into a collection. The new collection is said to be a reduction of the original collection.

Suppose we wish to extract the even numbers from a list. The structure of the code looks very much like the sumEvens function in the previous section, but instead of adding in the desired item, we join the item to the reduction of the tail of the array:

    Node *
    extractListEvens(Node *items)
        {
        if (items == 0)
            return 0;
        else if (isEven(head(items)))
            return join(head(items),extractListEvens(tail(items)));
        else
            return extractListEvens(tail(items));
        }

Given a list of integers, extractEvens returns a (possibly empty) list of the even numbers:

    //test (compile with ilist.c and recursionPatterns.c)
    #include "ilist.h"
    #include "recursionPatternsPatterns.h"
    int numbers[] = {4,2,5,2,7,0,8,3,7};
    Node *a;
    a = arrayToList(numbers,sizeof(numbers)/sizeof(int));
    displayList(extractEvens(a));

Running this code yields:

    {4}{2}{2}{0}{8}

Recall that, in the previous chapter, our attempt at filtering with a loop left the collected items in the opposite order. A recursive implementation preserves the order of values found in the original list.

With extractEvens defined, then sumListEvens could be written very simply as:

    int
    sumListEvens(Node *items)
        {
        return sumList(extractListEvens(items));
        }

Unlike many patterns, recursive filtering is best suited for lists; recursive filtering of arrays is very problematic and will not be dealt with here.

The map pattern

Mapping is a task closely coupled with that of reduction, but rather than collecting certain items, as with the filter pattern, we collect all the items. As we collect, however, we transform each item as we collect it. The basic pattern looks like this:

    Node *
    mapList(int (*f)(int),Node *items)
        {
        if (items == 0)
            return 0;
        else
            {
            int v = f(head(items));
            return join(v,mapList(f,tail(items)));
            }
        }

Here, f is used to transform each item in the recursive step. As with mapping using a loop, we use a function pointer as the type of the formal parameter f.

Suppose we wish to subtract one from each element in an array. First we need a transforming function that reduces its argument by one:

    int decrement(int x) { return x - 1; }

Now we can "map" the decrement function over an array of numbers:

    //test (compile with ilist.c and recursionPatterns.c)
    #include "ilist.h"
    #include "recursionPatterns.h"
    int numbers[] = {4,3,7,2,4,3,1};
    Node *a = arrayToList(numbers,sizeof(numbers) / sizeof(int));
    Node *b = mapList(decrement,a);
    displayList(b);

The code yields the following output:

    {3}{2}{6}{1}{3}{2}{0}

We see that each value in the resulting list is one less than the corresponding number in the original numbers array.

Both map and filtering (reduce) are used heavily by Google in programming their search strategies.

The search pattern

The search pattern is a slight variation of filtered-counting. Suppose we wish to see if a value is present in a list. We can use a filtered-counting approach and if the count is greater than zero, we know that the item was indeed in the list:

    int
    findInList(int target,Node *items)
        {
        return occurrencesInList(target,items) > 0;
        }

In this case, occurrencesInList helps find do its job. We call such functions, naturally, helper functions. We can improve the efficiency of find by having it perform the search, but short-circuiting the search once the target item is found. We do this by turning the first recursive case into a second base case:

    int
    findInList2(int target,Node *items)
        {
        if (items == 0)                     //empty list, not found
            return 0;                   
        else if (head(items) == target)     //short-circuit!
            return 1;
        else
            return findInList2(target,tail(items));
        }

When the list is empty, we return false because if the item had been list, we would have hit the second base case (and returned true) before hitting the first. If neither base case hits, we simple search the remainder of the list (the recursive case). If the second base case never hits, the first base case eventually will.

The shuffle pattern

Sometimes, we wish to combine two lists. This is easy to do with the append function:

   append(list1,list2)

This places the first element in the second list after the last element in the first list. However, many times we wish to intersperse the elements from the first list with the elements in the second list. This is known as a shuffle, so named since it is similar to shuffling a deck of cards. When a deck of cards is shuffled, the deck is divided in two halves (one half is akin to the first list and the other half is akin to the second list). Next the two halves are interleaved back into a single deck (akin to the resulting third list). Note that while appending is destructive, in that it changes the first list, shuffling is non-destructive, in the neither of the shuffled lists are modified.

We can use recursion to shuffle two lists. If both lists are exactly the same length, the recursive function is easy to implement using the accumulate pattern:

    Node *
    shuffle(Node *list1,Node *list2)
        {
        if (list1 == 0)
            return 0;
        else
            {
            Node *rest = shuffle(tail(list1),tail(list2));
            return join(head(list1),join(head(list2),rest));
            }
        }

If list1 is empty (which means list2 is empty since they have the same number of elements), the function returns the empty, since shuffling nothing together yields nothing. Otherwise, we shuffle the tails of the lists (the result is stored in the temporary variable rest), then join the the first elements of each list to the shuffled remainders.

If you have ever shuffled a deck of cards, you will know that it is rare for the deck to be split exactly in half prior to the shuffle. Can we amend our shuffle function to deal with this problem? Indeed, we can, by simply placing the extra cards (list items) at the end of the shuffle. We don't know which list (list1 or list2) will go empty first, so we test for each list becoming empty in turn:

    Node *
    shuffle2(Node *list1,Node *list2)
        {
        if (list1 == 0)
            return list2;
        else if (list2 == 0)
            return list1;
        else
            {
            Node *rest = shuffle2(tail(list1),tail(list2));
            return join(head(list1),join(head(list2),rest));
            }
        }

If either list is empty, we return the other. Only if both are not empty do we execute the recursive case.

One can make shuffling even simpler by manipulating the recursive call that shuffles the remainder of the lists. If we flip the order of the lists in the recursive call, we only need to deal with the first list becoming empty:

    Node *
    shuffle3(Node *list1,Node *list2)
        {
        if (list1 == 0)
            return list2;
        else
            return join(head(list1),shuffle3(list2,tail(list1)));
        }

Note that in the recursive call, we take the tail of list1 since we joined the head of list1 to the resulting shuffle. The base case returns list2 because if list1 is empty, the "shuffle" of the remaining elements is just list2. Also, even if list1 and list2 are both empty, the base case test returns the correct value of zero, the null pointer or empty list.

Finally, note how much simpler it is to shuffle two lists using recursion as compared to shuffling two arrays with loops.

The merge pattern

With the shuffle pattern, we always took the head elements from both lists at each step in the shuffling process. Sometimes, we wish to place a constraint of the choice of elements. For example, suppose the two lists to be combined are sorted and we wish the resulting list to be sorted as well. The following example shows that shuffling does not always work:

    //test (compile with ilist.c and recursionPatterns.c)
    #include "ilist.h"
    #include "recursionPatterns.h"
    int n1[6] = {1,4,6,7,8,11};
    int n2[4] = {2,3,5,9};
    Node *c = shuffle3(arrayToList(n1,6),arrayToList(n2,4));
    displayList(c);

The result is shuffled, but not sorted:

    {1}{2}{4}{3}{6}{5}{7}{9}{8}{11}

The merge pattern is used to ensure the resulting list is sorted and is based upon the filtered-accumulate pattern. We only accumulate an item if it is the smallest item in the two lists:

    Node *
    mergeList(Node *list1,Node *list2)
        {
        if (list1 == 0)         //list1 is empty, return list2
            return list2;
        else if (list2 == 0)    //list2 is empty, return list1
            return list1;
        else if (head(list1) < head(list2))
            return join(head(list1),mergeList(tail(list1),list2));
        else //the head of list2 is smaller
            return join(head(list2),mergeList(list1,tail(list2)));
        }

As with shuffle2, we don't know which list will become empty first, so we check both in turn.

In the first recursive case, the first element of the first list is smaller than the first element of the second list. So we accumulate the first element of the first list and recur, sending the tail of the first list because we have used/accumulated the head of that list. The second list we pass unmodified, since we did not use/accumulate an element from the second list.

In the second recursive case, we implement the symmetric version of the first recursive case, focusing on the second list rather than the first.

Testing our function:

    //test (compile with ilist.c and recursionPatterns.c)
    #include "ilist.h"
    #include "recursionPatterns.h"
    int n1[6] = {1,4,6,7,8,11};
    int n2[4] = {2,3,5,9};
    Node *c = mergeList(arrayToList(n1,6),arrayToList(n2,4));
    displayList(c);

yields:

    {1}{2}{3}{4}{5}{6}{7}{8}{9}{11}

The fossilized pattern

If a recursive function mistakenly never makes the problem smaller, the problems is said to be fossilized. Without ever smaller problems, the base case is never reached and the function recurs48 forever. This condition is known as an infinite recursive loop. Here is an example:

     int
     countList(Node *items)
        {
        if (n == 0)
            return 0;
        else
            return 1 + countList(items); //should be tail(items)
        }

Since countList recurs on the same list it was given, items never gets smaller so it never becomes empty. Fossilizing the problem is a common error made by both novice and expert programmers alike.

The bottomless pattern

Related to the fossilized pattern is the bottomless pattern. With the bottomless pattern, the problem gets smaller, but the base case is never reached. Here is a function that attempts to divide a positive number by two, by seeing how many times you can subtract two from the number:49

    int
    div2(int n)
        {
        if (n == 0)
            return 0;
        else
            return 1 + div2(n - 2);
        }

Things work great for a while:

    //test (compile with ilist.c and recursionPatterns.c)
    #include "ilist.h"
    #include "recursionPatterns.h"
    printf("div2(16) is %d\n",div2(16));

yields:

    div2(16) is 8

and:

    printf("div2(134) is %d\n",div2(134));

yields:

    div2(134) is 67

But then, something goes terribly wrong:

    printf("div2(7) is %d\n",div2(7));

results in the program crashing:

    Segmentation fault      (core dumped)

What happened? To see, let's visualize our function, as we did with the gcd function previously, by adding a print statement:

    int
    div2(int n)
        {
        printf("div2(%d)...\n",n);
        if (n == 0)
            return 0;
        else
            return 1 + div2(n - 2);
        }

Now every time the function is called, both originally and recursively, we can see how the value of n is changing:

    div2(7)...
    div2(5)...
    div2(3)...
    div2(1)...
    div2(-1)...
    div2(-3)...
    ...
    Segmentation fault      (core dumped)

Now we can see why things went wrong, the value of n skipped over the value of zero and just kept on going. The solution is to change the base case to catch odd (and even) numbers:

    int
    div2(int n)
        {
        if (n <= 1)
            return 0;
        else
            return 1 + div2(n - 2);
        }

Remember, when you see a recursion depth exceeded error, you likely have implemented either the fossilized or the bottomless pattern.

lusth@cs.ua.edu


Recursion Top Comparing Recursion and LoopingRecursion Patterns Contents