
Input and Output

Generally all computational entities, be it variables, functions, or programs, require some sort of input and output. Input roughly corresponds to the process of supplying data and output roughly corresponds to the process of retrieving data. For a variable, assignment is the way values (data) are supplied to the variable, while referencing the variable (as part of an expression) is the way that data is retrieved. For functions, the input is the set of arguments that are supplied in a function call, while the output of a function is its return value. For programs, input refers to pulling in data and output refers to pushing out results. In this chapter, we focus on managing the input and output of programs.

Input

Input is the process of obtaining data that a program needs to perform its task. There are generally three ways for a program to obtain data: reading from the keyboard, reading from the command line, and reading from a file.

We examine these three approaches in turn.

Reading from the keyboard

To read from the keyboard, one uses the scanf function.

Reading integers

To read an integer, an integer directive is supplied to scanf:

    //test
    int num;
    printf("Please enter an integer: ");
    scanf("%d",&num);
    printf("the number you entered was %d\n",num);

The scanf function takes a guide string and a pointer to a memory location. Moreover, scanf returns when it has seen an integer followed by a newline on the input. At a level lower than scanf, C waits until an entire line has been input before scanf can start its task.

In the example, we give the address of num; the & operator returns the address of the num variable. We have seen pointers to arrays. Pointers to variables are really the same thing. The difference is a pointer to a single variable can be thought of as an array with a single slot. Let's rewrite the above example a little bit:

    //test
    int num;
    int *addr;              //an 'array' pointer
    addr = &num;            //&num has type 'int *'
    printf("Please enter an integer: ");
    scanf("%d",addr);
    printf("num is %d\n",num);
    printf("addr[0] is %d\n",addr[0]);
    printf("*addr is %d\n",*addr);

Suppose num is placed at memory location 1004. Immediately prior to the first printf, the situation looks like:

The addr variable, as desired, holds the address of num. Now the value of addr is passed as the second argument to scanf. The scanf function reads in a number and places it in the memory location specified by the second argument. In this case, it is placed at memory location 1004. So, if the user enters the number 42, then 42 is stored as the value of num:

The final three print statements all output the value 42. The first does so since the value read in by scanf was stored in num. The second does so since addr can be considered a pointer to an array of length 1:

The final print statement prints 42 since *addr is an alternate form of addr[0]. When we take the address of a variable, we will rarely use the array indexing form used by the second print statement, preferring instead the pointer dereferencing form of the third print statement.

Reading real numbers

Reading a real number is similar to reading an integer, except that the "%lf" directive and the address of a double variable are passed to scanf:

    //test
    double realNumber;
    printf("Please enter a real number: ");
    scanf("%lf",&realNumber);
    printf("the number you entered was %f\n",realNumber);

Reading single characters

For a character, the "%c" directive and the address of a character variable are passed to scanf:

    //test
    char ch;
    printf("Please enter a character: ");
    scanf("%c",&ch);
    printf("the character you entered was <%c>\n",ch);

With the "%c" directive, the next character on the input stream is read, regardless of whether or not it is whitespace. If one places a space before the percent sign in the directive, then scanf will skip over any whitespace and read the first non-whitespace character pending on the input:

    //test
    char ch;
    printf("Please enter some whitespace, then a character: ");
    scanf(" %c",&ch); //note the space in the directive
    printf("the first darkspace character you entered was <%c>\n",ch);

Reading strings

One can use the scanf function to read multiple characters at a time from the keyboard with the "%s" directive:

    //test of VERY DANGEROUS CODE
    char buffer[512]; //room for 511 chars plus the null char
    printf("Please enter a token: ");
    scanf("%s",buffer);
    printf("the token you entered was <%s>\n",buffer);

where a token is a contiguous sequence of non-whitespace characters. Why is this code dangerous? The reason is scanf does not ensure that no more than 511 characters (in this example) are read into array buffer. So if the token entered is, say, 600 characters in length, the token and its terminating null character need 601 bytes, and the last 89 of those bytes will be written past the end of the array, trashing the memory beyond its extent.

A very clever and devious person at some point realized that if the token was long enough, the memory trashing would extend beyond the memory space reserved for the local variables and into the region of memory that stores the location of the caller of the current function. When the current function returns, program control jumps to the caller of the function. By carefully choosing the additional characters, the return location is overwritten with the address of the buffer itself. If the buffer is filled with characters that look like machine code, then this new program would run with the same privileges as the original program. If a program with high privileges is exploited in this way, then the exploiter could potentially take over the system.

This is not just a theoretical problem. The first widespread internet virus, the Morris worm, used this approach and brought large portions of the internet to a halt.

In fact, many of the vulnerabilities being discovered or exploited today are due to this use of scanf and related functions in system-level C code. So, NEVER USE SCANF TO READ IN A STRING! Instead of using scanf, one should read a line one character at a time, checking to see if too many characters are encountered before the newline is seen. This is a rather onerous task, so it has been done for you. To get this code, download the following files:

    wget troll.cs.ua.edu/ACP-C/scanner.c
    wget troll.cs.ua.edu/ACP-C/scanner.h

The scanner.c file contains the definitions of the following functions:

    function       purpose
    readInt        returns the next integer
    readReal       returns the next double
    readChar       returns the next darkspace character
    readRawChar    returns the next character, whitespace or darkspace
    readToken      returns the next token
    readString     returns the next double quoted string
    readLine       returns the remainder of current line

You can read more about these functions in the documentation of scanner.c. You can find the prototypes of the functions in scanner.h.

The scanf code for reading a token can be replaced with a safe call to readToken. To use the scanner, you need to add the following line after the system includes:

    #include "scanner.h"

Then you can make calls to the scanner functions:

    //test
    char *s;  //note that s is a char pointer, not a char array
    printf("Please enter a token: ");
    s = readToken(stdin);     //stdin is the keyboard
    printf("the token you entered was <%s>\n",s);
    free(s); //s points to a malloc'd array so free it when done

To compile code that calls scanner functions, you will need to compile scanner.c with the rest of your program. Suppose your main function is in tester.c and it calls readToken. To compile the program, you would enter the command:

    gcc -Wall -g -o tester tester.c scanner.c

The scanner functions require the passing in of a file pointer. This allows them to read from the keyboard (using stdin) or from a file (more on that later on in the chapter). The first four functions listed are wrappers to scanf. The last three read characters into a statically allocated array. If there is not enough room, an error message is printed and the program is terminated. If there is room, the array is copied into malloc'd memory and a pointer to this memory is returned.

Reading directly into arrays

One can read directly into arrays by using pointer offsets:

    //test
    int a[3];
    printf("give me the first number: ");
    scanf("%d",a+0);
    printf("give me the second number: ");
    scanf("%d",a+1);
    printf("give me the third number: ");
    scanf("%d",a+2);

    printf("[%d]",a[0]);
    printf("[%d]",a[1]);
    printf("[%d]",a[2]);
    printf("\n");

If the three numbers read were 2, 13, and 42, in that order, then the output would be:

    [2][13][42]

Sometimes you will see:

    scanf("%d",&a[1]);

instead of:

    scanf("%d",a+1);

The former is usually written by a programmer who is a little unsure of the relationships between statically allocated arrays and pointers. One can also use the scanner functions:

    ...
    printf("give me the first number: ");
    a[0] = readInt(stdin);
    printf("give me the second number: ");
    a[1] = readInt(stdin);
    printf("give me the third number: ");
    a[2] = readInt(stdin);
    ...

Reading from the command line

The second way to pass information to a program is through command-line arguments. The command line is the line typed in a terminal window that runs a C program (or any other program). Here is a typical command line on a Linux system:

    lusth@warka:~/l1/activities$ prog3

Everything up to and including the dollar sign is the system prompt. As with all prompts, it is used to signify that the system is waiting for input. The user of the system (me) has typed in the command:

    prog3

in response to the prompt. Suppose prog3.c is a file with the following code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <string.h>

    int
    main(int argc,char **argv)
        {
        int i;
        printf("Number of command-line arguments: %d\n",argc);
        printf("Command-line arguments:\n");
        i = 0;
        while (i < argc)
            {
            printf("    %s\n",argv[i]);
            ++i;
            }
        return 0;
        }

We haven't covered loops yet, but all the program does is print out all of the command line arguments. In this case, the output of this program would be:


    lusth@warka:~/l1/activities$ gcc -Wall -g -o prog3 prog3.c
    lusth@warka:~/l1/activities$ prog3
    Number of command-line arguments: 1
    Command-line arguments:
        prog3

From the output, we can see that there was one command-line argument, the name of the executable. Looking more closely at the code:

    printf("Number of command-line arguments: %d\n",argc);

we see that the number of command line arguments is stored in the formal parameter argc. We see also:

    printf("    %s\n",argv[i]);

that the command line arguments themselves are stored in the formal parameter argv. This tells us that argv points to an array of strings. A pointer to an array has a star, and a string (a char *) has a star, hence the typing of argv:

    char **argv;

with two stars.

Any whitespace-delimited tokens following the program file name are stored in argv along with the name of the program being run. For example, suppose we run prog3 with this command:

    lusth@warka:~/l1/activities$ prog3 123 123.4 True hello, world

Then the output would be:

    Number of command-line arguments: 6
    Command-line arguments:
        prog3
        123
        123.4
        True
        hello,
        world

From this result, we can see that all of the tokens are stored in argv and that they are stored as strings, regardless of whether they look like some other entity, such as integer or real number.

If we wish for "hello, world" to be a single token, we would need to enclose the tokens in quotes:

    prog3 123 123.4 True "hello, world"

In this case, the output is:

    Number of command-line arguments: 5
    Command-line arguments:
        prog3
        123
        123.4
        True
        hello, world

There are certain characters that have special meaning to the system. A couple of these are '*' and ';'. To include these characters in a command-line argument, they need to be escaped by inserting a backslash prior to the character. Here is an example:

    prog3 \; \* \\

To insert a backslash, one escapes it with a backslash. The output from this command is:

    Number of command-line arguments: 4
    Command-line arguments:
        prog3
        ;
        *
        \

Although it looks as if there are two backslashes in the last token, there is but a single backslash. Most Linux programs, including the terminal window shell, use two backslashes to indicate a single backslash.

What command-line arguments are

The command line arguments are stored as strings. Therefore, you must use atoi or atof if you wish to use any of the command line arguments as integers or real numbers, respectively. Here is code that tests that there are two command-line arguments beyond the program name, the first representing an integer and the second representing a real:

    //test
    int x;
    double y;
    if (argc != 3)
       {
       fprintf(stderr,"there should be two args beyond the program name\n");
       exit(1);
       }
    x = atoi(argv[1]);
    y = atof(argv[2]);

    printf("1st additional arg is %d\n",x);
    printf("2nd additional arg is %f\n",y);

Given the additional arguments 23 and 4.5, the output should be:

    1st additional arg is 23
    2nd additional arg is 4.500000

Reading from files

The third way to get data to a program is to read the data that has been previously stored in a file.

C uses a file pointer system in reading from a file. To read from a file, the first step is to obtain a pointer to the file. This is known as opening a file. The file pointer will always point to the first unread character in a file. When a file is first opened, the file pointer points to the first character in the file.

Reading files using fscanf

Suppose we wish to read from a file named data. We first obtain a file pointer by opening the file like this:

    FILE *fp = fopen("data","r");
    if (fp == 0)
        {
        fprintf(stderr,"file data could not be opened for reading\n");
        exit(1);
        }

The fopen function takes two arguments, the name of the file and the kind of file pointer to return. We store the file pointer in a variable named fp (a variable name commonly used to hold a file pointer). In this case, we wish for a reading file pointer, so we pass the string "r". We can also open a file for writing; more on that in the next section. In any case, you should always test the return value of fopen; a return value of zero means a problem has occurred.

Once we have the file pointer, we can use fscanf to read various kinds of items. For example, to read an integer, we would use the "%d" directive:

   //test
   int x;
   FILE *fp = fopen("data","r");
   //test of fp omitted
   fscanf(fp,"%d",&x);
   printf("the number read was %d\n",x);
   fclose(fp);

Note how similar the call to fscanf is to scanf. The following two calls are equivalent:

    scanf("%d",&x);
    fscanf(stdin,"%d",&x);

In fact, scanf and fscanf are identical except that scanf hard-wires the file pointer to stdin, the keyboard.

When we are done reading a file, we close it:

    fclose(fp);

Always remember to close your files!

Reading files using the scanner

We can also use the scanner functions to read from a file, by passing a file pointer instead of stdin:

   //test
   int x;
   FILE *fp = fopen("data","r");
   //test of fp omitted
   x = readInt(fp);
   printf("the number read was %d\n",x);
   fclose(fp);

As always, remember to close your files when finished.

Output

Once a program has processed its input, it needs to make its output known, either by displaying results to the user or by storing the results in a file.

Writing to the console

We have been writing to the console using printf for some time now. The printf function is variadic, which means it can take a variable number of arguments:

    //test
    int x = 3;
    double y = 14.4;
    char *z = "hello";
    printf("an integer, %d, a real number, %f, and a string, %s\n",x,y,z);

The arguments after the guide string are matched to the directives in the guide string, in the order given. If there is a mismatch between a directive and its corresponding argument, you will typically get a warning message when you compile with the -Wall option.

Printing quote characters

Suppose I have the string:

    char *str = "Hello";

If I print my string:

    printf("%s",str);

The output looks like this:

    Hello

Notice the double quotes are not printed. But suppose I wish to print quotes around my string, so that the output looks like:

    "Hello"

To do this, the print statement becomes:

   printf("\"%s\"",str);

If I want the double quotes to be in the string itself, str would be assigned thusly:

    str = "\"Hello\"";

If you need a refresher on what the string "\"" means, please see the chapter on strings.

Writing to a file

C also requires a file pointer to write to a file. The fopen function is again used to obtain a file pointer, but this time we desire a writing file pointer, so we send the string "w" as the second argument to fopen:

    FILE *fp = fopen("data.save","w");

Now the variable fp points to a writing file object; we write to the file using fprintf. The correspondence between fprintf and printf is the same as that between fscanf and scanf. The following two calls are equivalent:

    printf("hello, world!\n");
    fprintf(stdout,"hello, world!\n");

The printf function simply hardwires the file pointer to stdout, which represents the console. As with reading, a file opened for writing should be checked for a successful open and should be closed with fclose when writing is complete.

Opening a file in order to write it has the effect of emptying the file of its contents as soon as it is opened. The following code deletes the contents of a file (which is different than deleting the file):

    // delete the contents
    FILE *fp = fopen(fileName,"w");
    //check that fopen did not encounter a problem
    fclose(fp);

If you wish to start writing to a file, but save what was there previously, call the fopen function to obtain an appending file pointer:

    FILE *fp = fopen(fileName,"a");

Subsequent writes to fp will append text to what is already there.

lusth@cs.ua.edu

