Input and Loops Top NodesStructures Contents

Structures

You can download the functions defined or used in this chapter with the following commands:

    wget troll.cs.ua.edu/ACP-C/employee.c
    wget troll.cs.ua.edu/ACP-C/employee.h

These files will help you run the test code listed in this chapter.

Structures

Arrays in C are limited in that they are homogeneous and cannot grow as needed. To tackle the homogeneity problem, we will use something called a structure; a structure will allow us to collect heterogeneous data together. To solve the expandability problem, we will use structures to implement linked lists. We'll discuss linked-lists in subsequent chapters; for now, we will focus on the structures themselves.

Just like we can collect literal values together in an array and can collect actions together in a function, we can collect variables together in something called a structure. Suppose we wished to collect an integer, a string, and a double variable together. To do so, we would define a structure like this:

    struct stuff
        {
        int x;
        char *s;
        double r;
        };

A structure in C is introduced with the keyword struct, followed by the name of the structure (in this case, stuff) followed by a collection of variable definitions enclosed in braces. Our stuff structure is composed of three variables, x, s, and r.

Like arrays, we can allocate a structure both statically and dynamically. A static allocation looks like this:

    struct node n;

To access the individual variables in a statically allocated structure, we use the dot operator:

    //test
    struct stuff { int x; char *s; double r; }; //define the struct
    struct stuff n;                             //allocate the struct
    n.x = 42;
    n.s = "hello, world!";
    n.r = 3.14159;
    printf("n.x is %d\n",n.x);
    printf("n.s is %s\n",n.s);
    printf("n.r is %f\n",n.r);

Running this code yields:

    n.x is 42
    n.s is hello, world!
    n.r is 3.141590

Unlike statically allocated arrays, statically allocated structures can be returned from functions under certain circumstances. Such manipulations of structures can be inefficient, so we will focus on dynamically allocated structures. Here is the dynamically allocated version of the above code fragment:

    //test
    struct stuff { int x; char *s; double r; }; //define the struct
    struct stuff *n;                            //create the pointer
    n = malloc(sizeof(struct stuff));           //malloc failure test omitted
    n->x = 42;
    n->s = "hello, world!";
    n->r = 3.14159;
    printf("n->x is %d\n",n->x);
    printf("n->s is %s\n",n->s);
    printf("n->r is %f\n",n->r);

Note that the "dot" operator is replace by the "arrow" operator; the arrow is composed of a hyphen followed by a greater-than symbol.

Records as structures

Recall, from the previous chapter, the concept of a record as a collection of related data. A structure, because of its ability to bundle heterogeneous data together, makes an ideal repository for the data comprising a record. When we used an array to store a record, we needed to keep each field of the record as a string, even when some fields were clearly numbers. With structures, we are freed from that constraint.

Suppose, as before, our record is composed of name, title, years of service, and salary:

    "Amber Smith"       President   32   97000.05
    "Thad Jones"        Assistant   15   89000.42
    "Ellen Thompson"    Hacker       2  147000.99

A structure to reflect such a record could be defined as:

    struct employee
        {
        char *name;
        char *title;
        int years;
        double salary;
        };

Since typing struct employee is such an onerous task, many Computer Scientists will use the typedef mechanism to make an alias:

    typedef struct employee
        {
        char *name;
        char *title;
        int years;
        double salary;
        } Employee;

Given this typedef, we can now use the terms struct employee and Employee interchangeably:

    struct employee *e = malloc(sizeof(struct employee));
    Employee *e = malloc(sizeof(Employee));

These dynamic allocations are equivalent. Once you have an employee record, it is extremely useful to define a function for displaying said record:

    void
    displayEmployee(Employee *e)
        {
        printf("Employee: %s\n",e->name);
        printf("    Title: %s\n",e->title);
        printf("    Years of Service: %d\n",e->years);
        printf("    Salary: $%.2f\n",e->salary);
        }

Given the Employee structure definition, a (renamed) version of the readRecord function from the previous chapter, becomes:

    Employee *
    readEmployeeRecord(FILE *fp)         // we pass the file pointer in
        {
        char *name;
        
        name = readString(fp);           //name is a string, not a token

        if (feof(fp)) { return 0; }      // no record, return null

        //make an empty record

        Employee *e = malloc(sizeof(Employee)); //malloc failure test omitted

        e->name = name;
        e->title = readToken(fp);
        e->years = readInt(fp);
        e->salary = readReal(fp);

        return e;
        }

To test our readEmployeeRecord function, we will successively read employee records from a file and print out the individual variable values for each record:

    void
    displayEmployeesInFile(FILE *fp)
        {
        Employee *e;

        e = readEmployeeRecord(fp);
        while (!feof(fp))
            {
            displayEmployee(e);
            e = readEmployeeRecord(fp);
            }

        fclose(fp);
        }

We use the standard reading pattern: read the first item before the loop - test if the read was successful - process the item read - read again at the bottom of the loop. In this case, processing the item means displaying the record.

Assuming the employee data can be found in the file employees.dat, we can implement our main:

    //test (compile with employee.c and scanner.c)
    #include "scanner.h"
    #include "employee.h"
    FILE *fp;
    if ((fp = fopen("employees.dat","r")) == 0)
        fprintf(stderr,"employees.dat could not be opened for reading\n");
    else
        displayEmployeesInFile(fp);
    fclose(fp);

Constructors, accessors, and mutators

You may have heard of the term object-oriented when it comes to programming languages. Object-oriented languages were created because organizing code along object lines can be a very good way to write programs. While the C language is not object-oriented, we can write our code in much the same way.

The fundamental unit in an object-oriented language is the class, which is roughly equivalent to structures in C. Given a class, objects are created using something called the new operator, which causes a new object to come into being. We will simulate the new operator by defining a function to dynamically allocate a structure and return it:

    Employee *
    newEmptyEmployee(void)
        {
        Employee *p = malloc(sizeof(Employee)); //malloc failure test omitted
        p->name = "";                           //set to the empty string
        p->title = "";                          //set to the empty string
        p->years = 0;                           //set to zero
        p->salary = 0.0;                        //set to zero
        return p;
        }

A good implementation initializes the variables within the structure. Given p, the pointer to the structure, recall that we access the individual variables in the structure by using the arrow operator, which is a hyphen followed by a greater-than symbol. We initialize all the variables stored in the structure to "empty" values.

A function such as newEmptyEmployee is known as a constructor, as it specifies how a structure should be constructed and initialized. We can make an alternate constructor whereby we pass in all the values stored in the structure:

    Employee *
    newEmployee(char *n,char *t,int y,double s)
        {
        Employee *p = malloc(sizeof(Employee)); //malloc failure test omitted
        p->name = n;
        p->title = t;
        p->years = y;
        p->salary = s;
        return p;
        }

Object-orientation also introduces the terms accessors and mutators. An accessor (sometimes the term getter is used) retrieves a component of an object or, in our case, the value of a variable inside the structure. Since we have four variables in our Employee structure, we will have four accessors:

    char *getEmployeeName(Employee *e) { return e->name; }
    char *getEmployeeTitle(Employee *e) { return e->title; }
    int getEmployeeYears(Employee *e) { return e->years; }
    double getEmployeeSalary(Employee *e) { return e->salary; }

A mutator (sometimes setter) replaces a component in an object:

    void setEmployeeName(Employee *e,char *n) { e->name = n; }
    void setEmployeeTitle(Employee *e,char *t) { e->title = t; }
    void setEmployeeYears(Employee *e,int y) { e->years = y; }
    void setEmployeeSalary(Employee *e,double s) { e->salary = s; }

A convention we will follow is that the structure will always be the first argument to a function that manipulates the structure.

Given these definitions, we can rewrite readEmployeeRecord:

    Employee *
    readEmployeeRecord(FILE *fp)         // we pass the file pointer in
        {
        char *name,*title;
        int years;
        double salary;
        
        name = readString(fp);           //name is a string, not a token

        if (feof(fp)) { return 0; }      // no record, return the null pointer

        name = name;
        title = readToken(fp);
        years = readInt(fp);
        salary = readReal(fp);

        return newEmployee(name,title,years,salary);
        }

We can also rewrite displayEmployee to use accessors:

    void
    displayEmployee(Employee *e)
        {
        printf("Employee: %s\n",getEmployeeName(e));
        printf("    Title: %s\n",getEmployeeTitle(e));
        printf("    Years of Service: %d\n",getEmployeeYears(e));
        printf("    Salary: $%.2f\n",getEmployeeSalary(e));
        }

Note how readEmployeeRecord and displayEmployee now give no clues as to how an employee record is stored. Indeed, with an alternate definition of the typedef Employee, with new versions of constructors and accessors, we could store an employee records as an array of four strings (as in the previous chapter) and both readEmployeeRecord as well as displayEmployee would still work as before, without any changes.

This idea of abstracting the internal details of how data is stored is a powerful tool in writing code that is easy to understand, modify, and maintain.

Storing collections of records

We have omitted from our discussions how we might store a collection of records. One possibility is to store them in an array.

Storing records in a dynamic array

The code for storing records in a dynamic array will be almost identical to the readTable function shown in the previous chapter. Recall, that we will need to return two values from the function that performs this task, the number of records found and the array that stores them:

    Employee **
    readAllEmployeeRecords(FILE *fp,int *finalSize)
        {
        int count;
        int size = 10;                //initial size of destination array
        Employee *record;
        Employee **table;

        //allocate the destination array
        table = allocate(sizeof(Employee *) * size);

        count = 0;
        record = readEmployeeRecord(fp);
        while (!feof(fp))
            {
            if (count == size)              //array is full!
                {
                // grow the array by doubling its size
                size = size * 2;
                table = reallocate(table,sizeof(Employee *) * size);
                //now there is enough room
                }
            table[count] = record;           //DO NOT FREE THE RECORD!
            ++count;
            record = readEmployeeRecord(fp);
            }
        fclose(fp);

        //shrink the array to 'count' number of elements
        table = reallocate(table,sizeof(Employee *) * count);

        //count holds the number of items, store it in *finalSize
        *finalSize = count;
        return table;
        }

A call to this function might look like:

    int size;
    Employee **table = readAllEmployeeRecords(fp,&size);

At this point, table points to an array of Employee record pointers and size holds the final size of that array.

One can see by comparing readTable from the previous chapter and readAllEmployeeRecords in this chapter, we see that the only changes are the return value, the type array of records (the table), the type of the individual record, and the name of the function to read a record. This is why the pattern approach to programming is so powerful. Once you understand the pattern, it is relatively simple to implement it for a variety of types.

Another way to store records

There is another way to store a collection of records: using a linked list. We investigate linked lists in the next chapter.

lusth@cs.ua.edu


Input and Loops Top NodesStructures Contents