
Scope

A scope holds the current set of variables and their values. In C, there is something called the global scope. The global scope holds all the values of the built-in variables and function names (remember, a function name is a pseudopointer, similar to a variable).

When you analyze a C program, either by compiling and running it or by reading it, you start out in the global scope. As you come across definitions of variables and functions, their names and their values are added to the global scope.

Both variables and functions can be defined in the global scope. Processing of the following code adds the names x and main to the global scope:

    #include <stdio.h>

    //this is the global scope

    int x = 3;                  //x is bound to 3

    int
    main(int argc,char **argv)  //main is bound to a function object
        {
        ...
        return 0;
        }

This code adds two names to the global scope:

    //this is the global scope
    int y = 4;

    int
    negate(int z)
        {
        return -z;
        }

What are the two names?

The two names added to the global scope are y and negate.

Indeed, since functions are defined in the global scope, and the name negate is being bound to a function object, it becomes clear that this binding is occurring in the global scope, just like y being bound to 4.

Scopes in C can be identified by block nesting level. The global scope holds all variables defined with a nesting level of zero or, in other words, not within a block. Recall that when functions are defined, the body of the function is contained in a block. Thus, the function body is code that is nested within the global scope and constitutes a new scope, with a nesting level of one. We can identify to which scope a variable belongs by looking at the pattern of nesting. In particular, we label variables as either local or non-local with respect to a particular scope. Moreover, we can label non-local variables as in scope or out of scope.

In Scope or Out

The nesting pattern of a program can tell us where variables are visible (in scope) and where they are not (out of scope). We begin by learning to recognize the scopes in which variables are defined.

The Local Variable Pattern

All variables defined at a particular nesting level or scope are considered local to that nesting level or scope. In C, if one defines a variable, that variable must be local to the current scope. The exception is the formal parameters of a function definition: although they are written before the opening brace of the body, they belong to the scope that is identified with the function body. So within a function body, the local variables are the formal parameters plus any variables defined immediately within the function body.

Let's look at an example. Note, you do not need to completely understand the examples presented in the rest of the chapter in order to identify whether names are local or non-local.

    //this is the global scope
    int
    theta(int a,int b)
        {
        //this is a non-global scope
        int c,d;
        c = a + b;
        c = kappa(c) + X;
        d = c * c + a;
        return d * b;
        }

In this example, we can immediately say the formal parameters, a and b, are local with respect to the scope of the body of function theta. Furthermore, variables c and d are defined in the function body, so they are local as well, with respect to the scope of the body of function theta. It is rather wordy to say "local with respect to the scope of the body of the function theta", so computer scientists will almost always shorten this to "local with respect to theta" or just "local" if it is clear the discussion is about a particular function or scope. We will use this shortened phrasing from here on out. Thus a, b, c, and d are local with respect to theta. The function name theta is local to the global scope since the function theta is defined in the global scope.

The Non-local Variable Pattern

In the previous section, we determined the local variables of the function. By the process of elimination, that means the names kappa and X are non-local. The name of a function is itself non-local with respect to its body: the name theta, if it is referenced within the body of theta, is non-local as well.

In a function body, any variable that is not a formal parameter and is also not defined immediately within the function body must be non-local.

The Visible Variable Pattern

A variable is accessible or visible with respect to a particular scope if it is in scope. A variable is in scope if it is local or was defined in a scope that encloses the particular scope. Some scope A encloses some other scope B if, by moving (perhaps repeatedly) leftward from scope B, that is, outward to a lower nesting level, scope A can be reached. Here is an example:

    int Z = 5;

    int
    iota(int x)
       {
       return x + Z;
       }

The variable Z is local with respect to the global scope and is non-local with respect to iota. However, we can move leftward from the scope of iota one nesting level and reach the global scope where Z is defined. Therefore, the global scope encloses the scope of iota and thus Z is visible from iota and its value can be accessed within iota. Indeed, the global scope encloses all other scopes and this is why the built-in functions are accessible at any nesting level.

Here is another example that has two enclosing scopes:

    int X = 3;

    int
    phi(int a)
        {
        int Y = a - 1;
        if (isEven(a))
            {
            int b = X + 1;
            return a * b;
            }
        else
            {
            return X * Y;
            }
        }

The global scope locally defines two names, X and phi. If we look at function phi, we see that it has two local variables, the formal parameter a and the locally defined variable Y. In the if statement, we see that the true branch is a block. This block is a new nesting level and therefore a new scope. This new scope has the locally defined variable b, but references the non-local variable X. The false branch is also a block and a new scope, but it has no local variables. However, it references the non-local variables X and Y.

Are all these non-local variables accessible? Consider the first non-local reference to X. Moving leftward from the true branch of the if, we reach the body of phi, so the scope of phi encloses the scope of the true branch. Moving leftward again, we reach the global scope, where X is defined. Therefore, the global scope encloses (transitively) the scope of the true branch, so X is visible and accessible within the true branch. For the same reasons, both X and Y are visible and accessible within the false branch of the if.

In the next section, we explore how a variable can be inaccessible.

The Tinted Windows Pattern

The scope of local variables is like a car with tinted windows, with the variables defined within riding in the back seat. If you are outside the scope, you cannot peer through the car windows and see those variables. You might try and buy some x-ray glasses, but they probably wouldn't work. Here is an example:

    int
    alpha(int a)
        {
        if (isEven(a))
            {
            int b = a + 1;
            printf("%d\n",a * b);
            }
        printf("%d\n",b * 2);
        }

The print statement at the end of the function causes an error:

    error: 'b' undeclared (first use in this function)
         printf("%d\n",b * 2);
                       ^

The rule for figuring out which variables are in scope and which are not is: you cannot see into an enclosed scope. In the example, the scope of the if is enclosed by the scope of alpha. Therefore, in alpha, any references to variables local to the if block are invalid. Contrast this with the non-local pattern: In the if block, any references to variables local to alpha (and the global scope, by transitivity) are valid.

Tinted Windows with Parallel Scopes

The tinted windows pattern also applies to parallel scopes. Consider this code:

    int
    gamma(int a)
        {
        return a + delta(a);
        }

    int
    delta(int x)
        {
        // starting point 1
        printf("the value of a is %d\n",a); //x-ray!
        return x + 1;
        }

Note that the global scope encloses both the scope of gamma and the scope of delta. However, the scope of gamma does not enclose the scope of delta. Neither does the scope of delta enclose the scope of gamma.

One of these functions references a variable that is not in scope. Can you guess which one?

The function delta references a variable not in scope.

Let's see why by first examining gamma to see whether or not its non-local references are in scope. The only local variable of function gamma is a. The only referenced non-local is delta. Moving leftward from the body of gamma, we reach the global scope where both gamma and delta are defined. Therefore, delta is visible with respect to gamma since it is defined in a scope (the global scope) that encloses gamma.

Now to investigate delta. The only local variable of delta is x and the only non-local that delta references is a. Moving outward to the global scope, we see that there is no variable a defined there, therefore the variable a is not in scope with respect to delta.

When we try to compile this program, we get an error similar to the following:

    error: 'a' undeclared (first use in this function)
        printf("the value of a is %d\n",a); //x-ray!
                                        ^

The lesson to be learned here is that we cannot see into the local scope of the body of function gamma, even if we are at a similar nesting level. Nesting level doesn't matter. We can only see variables in our own scope and those in enclosing scopes. All other variables cannot be seen.

Therefore, if you ever see a variable-not-defined error, you either have spelled the variable name wrong or you are trying to use x-ray vision to see somewhere you can't.

Alternate terminology

Sometimes, enclosed scopes are referred to as inner scopes while enclosing scopes are referred to as outer scopes. In addition, both locals and any non-locals found in enclosing scopes are considered accessible or visible or in scope, while non-locals that are not found in an enclosing scope are considered inaccessible, or not visible or out of scope. We will use all these terms in the remainder of the textbook.

Three Scope Rules

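Here are three simple rules you can use to help you figure out the scope of a particular variable:

  1. Formal parameters go in.
  2. The function name goes out.
  3. You can see out, but you cannot see in.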

The first rule is shorthand for the fact that formal parameters belong to the scope of the function body. Since the function body is "inside" the function definition, we can say the formal parameters go in.

The second rule reminds us that function names are defined in the global scope, so they go out with respect to the function body.

The third rule tells us all the variables that belong to ever-enclosing scopes are accessible and therefore can be referenced by the innermost scope. The opposite is not true. A variable in an enclosed scope cannot be referenced by an enclosing scope. If you forget the directions of this rule, think of tinted windows. You can see out of a tinted window, but you can't see in.

Shadowing

Consider the rule that says that references to variables or names in an outer scope are visible. There is one exception to that rule; consider the following code fragment:

    int a = 4;

    void
    epsilon(int a)
        {
        printf("a is %d\n",a);
        return;
        }
    ...
    epsilon(13);
    ...

In this example, the global scope defines two names, a and epsilon. The scope of the body of epsilon defines the name a. When epsilon is called with some argument (in the example, 13), what is printed for the value of a, the value passed in and bound to the formal parameter a or the value of the global variable a?

If we write a program around this code fragment, we see the following output:

    a is 13

Clearly, the C compiler preferred the formal parameter over the global version. The reason is, when resolving the value of a variable, the most local scope is searched for a binding for that name. Should the local scope not hold a binding, the immediate enclosing scope is searched. If no binding is found there, the enclosing scope of that scope is searched, and so on, until the global scope is reached. Should no binding be found in the global scope, an undefined variable error is generated by the compiler.

In the example, a binding for variable a is found in the local scope and the value found there is retrieved for the print statement. As a consequence, it is impossible to retrieve the value of the global binding for a. It is said that the formal parameter shadows the global variable. Being in the shadow of the formal parameter means it cannot be seen. In general, when a more local variable has the same name as a less local variable that is also in scope, the more local variable shadows the less local version.

In the example, if you wished to reference both the global a and the formal parameter a, you would need to rename the formal parameter.

Multiple definitions and declarations

C does not allow multiple definitions having the same name. Thus, one cannot have two (or more) functions named the same or two variables in the same scope with the same name. On the other hand, one can have multiple declarations. For example, it is perfectly legal to repeat a function prototype/signature/declaration:

    char *f(int,double);
    char *f(int,double); //duplicative, but OK

In contrast, the compiler complains about:

    int square(int x) { return x * x; }
    int square(int x) { return x * x; } //duplicative and illegal

Variables work the same way, except there is a slight wrinkle in C. In the global scope, the following is legal in C:

    int x;
    int x; //duplicative, but OK in the global scope

The C compiler interprets one of the lines as a declaration and the other as a definition. However, if the above lines were found in a non-global scope, say a function body, the compiler would complain about multiple definitions:

    int almostSquare(int y)
        {
        int x;
        int x;        //duplicative and illegal
        x = y - 1;
        return x * y;
        }

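Within a non-global scope, both lines are considered definitions, thus violating the multiple definition rule for C. Tagging one of the two lines with the extern keyword does not help: inside a function body, an extern declaration refers to a global variable, and it cannot share a scope with a local variable of the same name (gcc reports that the extern declaration follows a declaration with no linkage). What extern does allow in a non-global scope is repeated declarations of a global variable:

    int x;                   //global definition

    int almostSquare(int y)
        {
        extern int x;        //declaration of the global x
        extern int x;        //duplicative, but OK
        x = y - 1;
        return x * y;
        }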

Initializing a variable forces the compiler to treat a potential declaration as a definition. In the global scope:

   int x = 13;   //definition
   int x;        //declaration

the first line must be interpreted as a definition as the variable is initialized. To avoid a multiply defined variable error, the compiler interprets the second line as a declaration. However, the following is illegal in all scopes:

    int x = 13; //definition
    int x = 13; //also a definition, therefore illegal

Do not tag a variable as extern and initialize it at the same time:

    extern int x = 100; //don't ever do this

If you do this, you are telling the compiler this is only a declaration, not a definition (via extern), but at the same time, you are telling the compiler this is a definition (via the initialization).

You might be thinking, this is really a non-issue since nobody in his or her right mind would define/declare the same variable in the same scope. Such a situation, it turns out, happens quite commonly. Suppose a file named c.h contains the following line:

    int x;

and both modules a.c and b.c include c.h. Then a.c and b.c are compiled together to make the executable:

    gcc -Wall a.c b.c

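To the compiler, definitions in both a.c and b.c are all in the global scope, so the compiler sees two declarations of x, since c.h is included in both. Traditionally, one of those declarations is turned into a definition and everything works. (GCC 10 and later default to the -fno-common option, under which even this case is reported as a multiple definition of x at link time; compiling with -fcommon restores the traditional behavior.) However, if the line in c.h is changed to: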

    int x = 100;

then, when a.c and b.c are compiled together, there are two definitions of x, one per module, and the build fails with a multiple-definition error.

To sum all this up in two simple rules, we have:

  1. Never initialize a variable in a .h file.
  2. If a global variable is used across modules, only initialize the variable in one module.

lusth@cs.ua.edu

