Nodes Top Lists and LoopsLists Contents

Lists

You can download the functions defined or used in this chapter with the following commands:

    wget troll.cs.ua.edu/ACP-C/ilist.c    //integer-valued nodes
    wget troll.cs.ua.edu/ACP-C/ilist.h
    wget troll.cs.ua.edu/ACP-C/rlist.c    //real-valued nodes
    wget troll.cs.ua.edu/ACP-C/rlist.h
    wget troll.cs.ua.edu/ACP-C/slist.c    //string-valued nodes
    wget troll.cs.ua.edu/ACP-C/slist.h
    wget troll.cs.ua.edu/ACP-C/olist.c    //'object'-valued nodes
    wget troll.cs.ua.edu/ACP-C/olist.h
    wget troll.cs.ua.edu/ACP-C/employeeList.c    //employee linked list demo

These files will help you run the test code listed in this chapter. You will also need to download the list module.

The List data structure

We could stop at this point and just use nodes and their operations to make true lists. But just as constructors, accessors, and mutators free us from being concerned how nodes and records are structured, we can design some list (sometimes called linked list) operators that free us from some of the details of the nodes themselves. If we do our design and implementation properly, a reader of our code will have few clues that lists are build from nodes.

The first task is to decide what will represent an empty list. The null pointer will work well for this role.

    Node *
    newEmptyList(void)
        {
        return 0;
        }

Next, we define join, a function that will be used to add a value to a list, returning the new list. We will use our integer-valued nodes from the previous chapter:

    Node *
    join(int value,Node *items)
        {
        return newNode(value,items);
        }

The join function takes a value and a list as arguments and returns a new node that glues the two arguments together. The result is a list one value larger than the list that is passed in. Note that join is non-destructive with regards to the original list; if we want to modify the original list, we must reassign the original list with the return value of join:

    items = join(value,items);

We can see from the definition of an empty list and from join that a list is either 0 (the null pointer) or a node. Two accessor functions are often defined for lists, head and tail. The head function returns the value stored in the first node in the list, while the tail returns the chain of nodes that follows the first node:

    int head(Node *items) { return getNodeValue(items); }
    Node *tail(Node *items) { return getNodeNext(items); }

Two mutator functions are also commonly defined:

    void setHead(Node *items,int v) { setNodeValue(items,v); }
    void setTail(Node *items,Node *n) { setNodeNext(items,n); }

You may be wondering at this point why we bother to distinguish lists and nodes. One reason is we can improve lists by keeping some more information around; we will save that improvement for later. Another reason is it is vitally important to practice the concept of abstraction. Modern software systems exhibit a large number of abstraction layers. With our lists, we can see for layers of abstractions: variables, which abstract locations in memory, structures, which abstract collections of variables, nodes, which abstract structures with two components, and lists, which abstracts a chain of nodes. Each level of abstraction frees us from thinking about the details of the underlying layers. We emphasize abstraction over and over, because it is so very important.

The join operator

Let us now look at the join operation more closely. Here is a rewrite of join that uses a temporary variable, n, to hold the new node that will join the given value and list together:

    Node *
    join(int v,Node *items)
        {
        Node *n = newEmptyNode();     //step 1
        setNodeValue(n,v);            //step 2
        setNodeNext(n,items);         //step 3
        return n;
        }

First, let us create an empty list to which we will join a value:

    a = newEmptyList();

Graphically, the situation looks like this:

Now, we call join to join the value of 13 to the list a.

    a = join(13,a);

At the very start of the join function, the formal parameters v and items have been set to the value 13 and a, respectively. The situation looks like this:

The variables v and items, since they are local to the function join, are shown in blue35. After step 1, n = newEmptyNode(), we have:

As with items and v, the variable n is shown in blue as it is a local variable. After step 2, setNodeValue(v,None), the node n has its value set to v. the variable n. The situation changes to:

After step 3, setNodeNext(n,items), the next pointer of n, which is null, is changed to the value of items, which is also null. So the situation remains the same as before. Finally, n is returned and the variable a is reassigned to have the value of n. The variables v, items, and n go out of scope and disappear, leaving:

If we add a second value to the list a, as in:

    a = join(7,a);

at the start of the join function, we have:

After creating node n (step 1) and setting its value pointer (step 2), we have:

Setting n's next pointer to items yields:

At this point, we return the value of n and assign it to a:

From this last drawing, we can see that list a now includes the value 7 at the front of the list, as intended. As before, at this point the variables local to join, are no longer in scope.

Chaining calls to join

As seen in the previous section, the join function is used to build a list:

    a = 0;
    a = join(13,a);
    a = join(7,a);
    a = join(42,a);

The above code builds the following list:

An alternative way to build the exact same list is to chain together the series of calls to join:

    a = join(42,join(7,join(13,0)));

Both methods are rather tedious for larger arrays, so we will make use of a list constructor that takes an array of integers and returns a list of those integers. This code:

    int numbers[] = {42,7,13};
    Node *a = arrayToList(numbers,sizeof(numbers)/sizeof(int));

produces the same list. A first attempt at defining arrayToList might look like this:

    Node *
    arrayToList(int *array,int size)
        {
        int i;
        Node *items = 0; //start items as the empty list
        for (i = 0; i < size; ++i)
            items = join(array[i],items);
        return items;
        }

Unfortunately, this produces a list whose values are found in the reverse order from that of the array. Why this is so is answered by the question, "What is the last value joined to the growing list and where did it end up in the list?". Obviously, the last value joined was the last value in the array and it ended up as the first value in the list, since join places its first argument at the head of the list. By a similar logic, the next-to-the-last value in the array ended up in the second position in the list, and so on.

We can fix this deficiency by defining a function that creates a new list in the reverse order of a given list. The code mimics our arrayToList function:

    Node *
    reverse(Node *a)
        {
        Node *items = 0;
        while (a != 0)
            {
            items = join(head(a),items);
            a = tail(a);
            }
        return items;
        }

Again, the last value in the incoming list a is placed in the first position of the new list items.

The arrayToList function becomes:

    Node *
    arrayToList(int *array,int size)
        {
        int i;
        Node *items = 0; //start items as the empty list
        for (i = 0; i < size; ++i)
            {
            items = join(array[i],items);
            }
        Node *rev = reverse(items);
        freeList(items);
        return rev;
        }

This revised version of arrayToList is a bit inefficient in that the values to be placed in the list are traversed twice, once by accessing every value in the array (the for loop) and once by accessing every value in reverse's incoming list. The arrayToList functions found in the list modules mentioned in the start of this chapter cleverly make a single traversal. See if you can figure out how they work.

Reading a list from a file

Consider a file of integers that we wish to read into a list. As always, we use the read pattern:

    do the initial read
    while (the read was good)
        {
        process the read
        read again
        }

Concretely, we have:

    Node *
    readIntegers(FILE *fp)
        {
        int v;
        Node *items;

        items = 0;            //items points to an empty list
        fscanf(fp,"%d",&v);
        while (!feof(fp))
            {
            items = join(v,items);  //update items
            fscanf(fp,"%d",&v);
            }
        return items;
        }

Like the arrayToList function in the previous subsection, the readIntegers function also reverses the order of the integers as found in the file. If order must be maintained, one can reverse items before returning or use the same trick the version of arrayToList in the ilist module uses.

Displaying a list

To display a list, we need to walk the list. A walk of a list is comprised of visiting every node in the list from front to back. Typically, a loop is used to implement a walk. Here is a display function for a list of integer-valued nodes:

    void
    displayList(Node *items)
        {
        while (items != 0)
            {
            int v = head(items)
            printf("{%d}",v);
            items = tail(items); //take a step
            }
        printf("\n");
        }

With this function in place, we can see if join is really working. Here is some code that tests both join and display:

    //test (compile with ilist.c)
    #include "ilist.h"
    int numbers[] = {42,7,13};
    Node *a = arrayToList(numbers,sizeof(numbers)/sizeof(int));
    displayList(a);

Running this code yields the following output, as expected:

    {42}{7}{13}

Lists of other values

We can make lists of any kind of value. The modules rlist.c and slist.c contain analogs to the functions in ilist.c, except they work with real numbers (double) and strings (char *), respectively.

Another module, the olist.c module, allows for lists of pointers to arbitrary structures (or any pointers, other than function pointers). Consider the employee record structure from the chapter on structures. One might read a file of employee records, storing them into a linked list, this way:

    #include "employee.h"
    #include "olist.h"
    ...
    Node *
    readEmployeeList(FILE *fp)
        {
        Node *employees = 0; //zero is the empty list
        Employee *e = readEmployeeRecord(fp);
        while (!feof(fp))
            {
            employees = join(e,Employees);
            e = readEmployeeRecord(fp);
            }
        return employees;
        }

A careful inspection of the olist module reveals that the value in a node has type void *. A variable of this type can store any pointer (except a function pointer) safely. However, once stored, the pointer loses any knowledge of the kind of thing to which it points (i.e. the kind of thing found at the address stored in the pointer). Another way to describe this loss of information is to say the pointer becomes generic. Once the pointer is genericized, it is up to the programmer to convert it back to the kind of pointer it once was. For example, this code fails:

    Node *emps = readEmployeeList(fp);
    char *name = head(emps)->name;     //head(emps) is a generic pointer

because the code writer has not converted the generic pointer returned by olist's head function into an Employee pointer. This code, on the other hand, succeeds:

    Node *emps = readEmployeeList(fp);
    Employee *e = head(emps);  //generic pointer converted to Employee pointer
    char *name = e->name;  

The latter two lines can be combined into a single line with a cast:

    Node *emps = readEmployeeList(fp);
    char *name = ((Employee *) head(emps))->name;

One can also use the accessor function for the name field in the Employee structure:

    Node *emps = readEmployeeList(fp);
    char *name = getEmployeeName(head(emps));

This gives the compiler the information that the generic pointer returned by head(emps) is really a pointer to an Employee structures, since getEmployeeName accepts Employee pointers.

The file employeeList.c demonstrates using the Employee structure with the olist module.

lusth@cs.ua.edu


Nodes Top Lists and LoopsLists Contents