
Variables

Suppose you found an envelope lying on the street and on the front of the envelope was printed the name numberOfDogsTeeth. Suppose further that you opened the envelope and inside was a piece of paper with the number 42 written upon it. What might you conclude from such an encounter? Now suppose you kept walking and found another envelope labeled meaningOfLifeUniverseEverything and, again, upon opening it you found a slip of paper with the number 42 on it. Further down the road, you find two more envelopes, entitled numberOfDotsOnPairOfDice and StatueOfLibertyArmLength, both of which contain the number 42.

Finally, you find one last envelope labeled sixTimesNine and inside of it you, yet again, find the number 42. At this point, you're probably thinking "somebody has an odd affection for the number 42" but then the times table that is stuck somewhere in the dim recesses of your brain begins yelling at you saying "54! It's 54!". After this goes on for an embarrassingly long time, you realize that 6 * 9 is not 42, but 54. So you cross out the 42 in the last envelope and write 54 instead and put the envelope back where you found it.

This strange little story, believe it or not, has profound implications for writing programs that both humans and computers can understand. For programming languages, the envelope is a metaphor for something called a variable, which can be thought of as a label for a place in memory where a literal value can reside. In other words, a variable can be thought of as a convenient name for a value. In many programming languages, one can change the value at that memory location, much like replacing the contents of an envelope. A variable is our first encounter with a concept known as abstraction, a concept that is fundamental to the whole of Computer Science.

Variables

Most likely, you've encountered the term variable before. Consider the slope-intercept form of an algebraic equation of a particular line:

y = 2x - 3

You probably can tell from this equation that the slope of this line is 2 and that it intercepts the y-axis at -3. But what role do the letters y and x actually play? The names x and y are placeholders that stand for the x- and y-coordinates of any conceivable point on that line. Without placeholders, the line would have to be described by listing every point on the line. Since there are an infinite number of points, clearly an exhaustive list is not feasible. As you learned in your algebra class, the common name for a placeholder for a specific value is the term variable.

One can generalize the above line, resulting in an equation that describes every non-vertical line.

y = mx + b

Here, the variable m stands for the slope and b stands for the y-intercept. Clearly, this equation was not dreamed up by an English-speaking Computer Scientist; a cardinal rule is to choose good names or mnemonics for variables, such as s for slope and i for intercept. But alas, for historical reasons, we are stuck with m and b.

The term variable is also used in most programming languages, including C, and the term has roughly the same meaning. The difference is that programming languages use the envelope metaphor, while the algebraic meaning of variable is an equivalence to a value. The difference is purely philosophical and not worth going into at this time.

Creating, combining, and printing variables

Suppose you found three envelopes, marked m, x, and b, and inside those three envelopes you found the numbers 6, 9, and -12 respectively. If you were asked to make a y envelope, what number should you put inside? If the number 42 in the sixTimesNine envelope in the previous story did not bother you (e.g., your internal times table was nowhere to be found), perhaps you might need a little help in completing your task. We can have C calculate this number with the following dialog:

    //test - copy this code and paste it at the quickc prompt
    int m = 6;
    int x = 9;
    int b = -12;
    int y = m * x + b;
    printf("y is %d\n",y);           //should print: y is 42

One creates variables in C by first giving the type of the value the variable is to hold. In this case, we are creating variables that hold integers; the keyword int tells us that. Next comes the name of the variable. If one wishes to immediately give the variable a value (which we do), we follow the variable name with an equals sign and the value we wish the variable to have. Note that this value can be a literal or some combination of literals and variables. In any case, the type of the expression (generally) should match the type of the variable.

The code that creates a variable is known as a variable definition. When a variable is defined, the compiler allocates some space in memory to hold the value of the variable. Sometimes, we wish to inform the compiler that a variable will be defined elsewhere. For that task, we use a variable declaration. To declare that a variable exists somewhere else, we use the extern keyword. For example, to say that the variable x, defined above, exists, we would say:

    extern int x;      //variable declaration: x exists!

This tells the compiler that subsequent occurrences of x in the code refer to an integer and that space for x has already been allocated. Many people, even long-time C programmers, confuse the terms definition and declaration. The reason is that C does not treat declarations and definitions consistently. For example, in some contexts, C can treat:

    int x;

as a declaration, but in others, C treats it as a definition. Just remember, if you see a type and a variable name, it's (likely) a definition. If you see the word extern in front, it's a declaration.

Continuing on with the example above, we combine the values of m, x, and b using multiplication and addition to determine the value of variable y. C, when asked to compute the value of an expression containing variables, as in:

    m * x + b

goes to those envelopes and retrieves the values stored there and combines them as directed. The last line of code:

    printf("y is %d\n",y);

shows off some of the sophistication of the printf function. As we discussed when we introduced the main function, functions can take in raw data. In this case, the printf function is being passed two pieces of information, a string and a variable, separated by a comma. The string guides printf on how to do the printing. Inside the string is the character sequence:

    %d

This informs printf to look for an additional value beyond the string and substitute that value for the %d. The d in the %d indicates that printf should print that value as an integer. Note also that when we give a variable to a function to be used as data, we actually send the value of the variable. In our code, the value of y will be 42, so we should see:

    y is 42

displayed when we run the program. The %d is known as a format directive. Inserting our test code into the template program tester.c gives us the following source code:

    //template for testing code
    //download with: wget troll.cs.ua.edu/ACP-C/tester.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>

    int main(int argc,char **argv)
        {
        //test code goes here
        int m = 6;
        int x = 9;
        int b = -12;
        int y = m * x + b;

        printf("y is %d\n",y);

        return 0;
        }

Compiling:

    lusth@warka:~/cs100$ gcc -Wall -g tester.c -o tester

and running this program produces the output:

    lusth@warka:~/cs100$ tester
    y is 42

just as expected. Pasting the test code at the quickc prompt gives us the same result.

Here are some more examples of variable creation:

    //test
    int dots = 42;
    int bones = 206;
    printf("%d\n",dots);            //should print: 42
    printf("%d\n",bones);           //should print: 206
    int CLXIV = bones - dots;
    printf("%d\n",CLXIV);           //should print: 164

After a variable is created, the variable and its value can be used interchangeably. For example, it is an easy matter to set up an equivalence between the variable PI and the real number 3.14159.

    //test
    double PI = 3.14159;
    double radius = 10.0;
    double area = PI * radius * radius;
    double circumference = 2 * PI * radius;
    printf("area is %f, while circumference is %f\n",area,circumference);

Compiling and running this code inside the template program should yield the output:

    area is 314.159000, while circumference is 62.831800

Recall that the double type is used for variables that hold real numbers. Notice how the expressions used to compute the values of the variables area and circumference are more readable than if 3.14159 were used instead of PI. In fact, that is one of the main uses of variables: to make code more readable. The second is ease of change: if the value of PI should change (e.g., if a more accurate value of PI is desired), we would only need to change the definition of PI. From the example above, we can also see that to print a real number, we need to use the %f format directive in the guide string for printf.

Variables and memory

A variable in C refers to a memory location. At that memory location, the value of the variable is stored. One can imagine memory as a series of boxes. Using m as an example, the variable m identifies one of those boxes and the value of m is placed inside that box:

[Figure: a box labeled m containing the value 6]

Just like a house on a street (the street being memory and the house being the box), every box in memory has an address. Having variables makes the actual address irrelevant, but let's just say the variable m refers to the box at address 1976 in memory. If actual addresses are important, we will place them above the box:

[Figure: the same box with the address 1976 written above it]

Since the actual address of a memory location rarely matters, we will most often use a simpler notation for variables and their values:

m: 6

This notation should be read as "m identifies a location in memory that holds the value 6". Often, we will shorten this to "m is 6".

Variable naming

Like many languages, C is quite restrictive with regard to legal variable names. A variable name must begin with a letter or an underscore and may be followed by any number of letters, digits, or underscores.

Variables are the next layer in a programming language, resting on the literal expressions and combinations of expressions (which are expressions themselves). In fact, variables can be thought of as an abstraction of the literals and collections of literals. As an analogy, consider your name. Your name is not you, but it is a convenient (and abstract) way of referring to you. In the same way, variables can be considered as the names of things. A variable isn't the thing itself, but a convenient way of referring to the thing.

While C lets you name variables in wild ways:

    int _1_2_3_iiiiii__ = 7;

you should temper your creativity if it gets out of hand. For example, rather than use the variable m for the slope, we could use the name slope instead:

    int slope = 6;

We could have also used a different name:

    int _e_p_o_l_s_ = 6;

The name _e_p_o_l_s_ is a perfectly good variable name from C's point of view. It is a particularly poor name from the point of view of making your C programs readable by you and others. It is important that your variable names reflect their purpose. In the example above, which is the better name: b, i, intercept, or _t_p_e_c_r_e_t_n_i_ to represent the intercept of a line?

Uninitialized variables

If you create an integer variable, but don't initialize it:

    int x;

the value held by that variable is some random integer. Thus, if your program uses the value of an uninitialized variable, not only is it likely to compute incorrect results, it will also likely compute different results each time it is run. The same is true of all other types of variables as well.

An uninitialized variable is said to be filled with garbage, which leads to the old saying about computer programs:

garbage in, garbage out

Activities

Note: all activities assume a Unix-style system, either Linux, Mac OSX, or Cygwin on Windows.

name         purpose
variables    practicing with C variables

lusth@cs.ua.edu

