It’s a trap! – Inputting sentences in C programs

If you are reading a sentence into a C program, there are a number of choices: scanf, gets, and fgets. Let’s say the string being input is “winter is coming”. The lamest choice is probably scanf, which, with the %s format, stops at the first whitespace character and so reads only the first word of the sentence.

char str[100];
scanf("%s", str);
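To make the first-word behaviour concrete, here is a minimal sketch. The helper name first_word_and_rest is made up for illustration, it reads from any FILE* rather than only stdin so it is easy to try out, and the %99s width assumes the caller passes a 100-element array for the word.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post).
   Reads one word with a scanf-style %s, then collects whatever is
   still waiting on the same line. Assumes word has room for 100 chars.
   Returns the number of leftover characters, or -1 if no word was read. */
int first_word_and_rest(FILE *in, char *word, char *rest, size_t rest_size)
{
    /* %s skips leading whitespace and stops at the next whitespace */
    if (fscanf(in, "%99s", word) != 1)
        return -1;

    /* everything after the first word is still in the stream */
    if (fgets(rest, (int)rest_size, in) == NULL)
        rest[0] = '\0';

    return (int)strlen(rest);
}
```

Called with stdin and the input “winter is coming”, word ends up holding "winter" and rest holds " is coming" followed by the newline.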

The remainder of the string will remain in the input buffer (unless you use the code trick published previously). The most obvious choice is gets(), but it has a real shady background. Depending on the compiler and its version, compiling a program that uses gets() may produce no warnings at all (it did not with the gcc I used)… however when the program is run, the following warning appears: “warning: this program uses gets(), which is unsafe.” Long story short, the gets() function does not perform bounds checking, therefore it’s extremely vulnerable to buffer-overflow attacks, which is why it was removed from the standard library in C11. For example, consider the code below:

char str[100];
gets(str);

If fewer than 100 characters are entered as input, there is no problem whatsoever. However, if more than 99 characters are entered, gets() will not stop writing at the end of the array. Instead, it continues writing past the end and into memory it doesn’t own (scanf with a bare %s does similarly stupid things). The problem can manifest itself in a number of ways: an immediate crash, incorrect program behavior, or possibly no visible effect – it really depends on the amount of extra text entered. Here’s a piece of code to illustrate this phenomenon:


#include <stdio.h>

typedef struct person
{
    char name[5];
    int dob;
} person_t;

int main(void)
{
    person_t me;

    me.dob = 1970;
    printf ("me.dob is %d\n", me.dob);
    printf ("Enter the persons name: ");
    gets(me.name);
    printf ("me.name is %s\n", me.name);
    printf ("me.dob is %d\n", me.dob);

    return 0;
}

Now here’s the program running, with the input for name being “Skywalker”, clearly longer than the 4 characters (plus the terminating null) that name can hold.

me.dob is 1970
Enter the persons name: Skywalker
me.name is Skywalker
me.dob is 114
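For readers who want to poke at the memory layout behind this output without actually overflowing anything, here is a rough sketch. It copies the input into a plain byte image the size of the struct, which is an assumption-laden stand-in for what gets() does to the real object. The helper name dob_after_unchecked_copy is invented for illustration, and the exact result depends on the platform's padding, int size, and byte order.

```c
#include <stddef.h>
#include <string.h>

typedef struct person
{
    char name[5];
    int dob;
} person_t;

/* Hypothetical helper (not from the original post).
   Mimics an unchecked write into name by copying input, including its
   terminating null, into a byte image of person_t, then reads dob back.
   Returns -1 if the input would not even fit inside the image. */
int dob_after_unchecked_copy(const char *input)
{
    unsigned char image[sizeof(person_t)];
    int dob = 1970;
    size_t n = strlen(input) + 1;

    if (n > sizeof image)
        return -1;   /* stay inside our own array */

    memset(image, 0, sizeof image);
    /* place dob where the compiler would place it */
    memcpy(image + offsetof(person_t, dob), &dob, sizeof dob);
    /* the "unchecked" copy: starts at name, ignores name's size */
    memcpy(image + offsetof(person_t, name), input, n);
    /* read back whatever now sits in dob's bytes */
    memcpy(&dob, image + offsetof(person_t, dob), sizeof dob);
    return dob;
}
```

With "Luke" the copy stays inside name and the function returns 1970. With "Skywalker", on a typical little-endian machine with a 4-byte int (where dob sits 8 bytes into the struct), the ninth character 'r' lands in the low byte of dob and the result is 114, the ASCII code of 'r'.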

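Notice how the value stored in dob has changed? Not the best scenario in the world. The effect is so visible here because of the struct: name and dob sit next to each other in memory, so the characters that do not fit in name spill into dob. (On a typical little-endian machine with a 4-byte int, dob starts 8 bytes into the struct, so the ninth character, ‘r’, whose ASCII code is 114, lands in its low byte, and the terminating null zeroes the next byte.) So the way to fix this problem is “you don’t need to use gets(), try fgets(), now move along”. fgets() is just the file version of gets(); however, it allows control over how many characters are read into the string.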

char str[100];
fgets(str,sizeof(str),stdin);

In this case fgets() reads in at most one less than sizeof(str) (so 99) characters from the input stream stdin (standard input) and stores them into the array pointed to by str. Reading stops after an EOF or a newline (Enter). If a newline is read, it is stored into the buffer. A ‘\0’ is stored after the last character in the buffer. The one caveat is that fgets() places the newline character into the string whenever there is enough room to store it, which can cause problems further down the road when processing the string. For example, if we were to enter “winter is coming”, this is what would be stored:

winter is coming\n\0
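One way to check what fgets() actually stored is to look at the length and the last character. The sketch below does that; stored_length is a made-up helper name, and it takes a FILE* so the same code works for stdin or a test file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post).
   Reads one line with fgets() and reports whether the newline was kept.
   Returns the length of what was stored, or -1 on EOF or error. */
int stored_length(FILE *in, char *buf, int size, int *has_newline)
{
    size_t len;

    if (fgets(buf, size, in) == NULL)
        return -1;

    len = strlen(buf);
    /* the newline, if stored, is always the last character */
    *has_newline = (len > 0 && buf[len - 1] == '\n');
    return (int)len;
}
```

For the input “winter is coming” and a 100-element buffer this reports a length of 17 with the newline present. With a 17-element buffer it reports 16 and no newline, because fgets() stops one short of the buffer size and leaves the newline unread.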

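The length of this string is actually 17, because of the ‘\n’ stored at the end. If str had been declared with only 17 elements, the newline would not have been stored at all: fgets() would have read the 16 characters of the sentence and left the newline in the input buffer. So to get rid of a stored newline, the ‘\0’ should be moved up one element in the string using some code of the form: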

str[strlen(str)-1] = '\0';

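But only if the last character stored really is a newline, i.e. the input was shorter than the declared string size (minus 1). Otherwise this chops off a real character, and if str is ever empty, strlen(str)-1 indexes outside the array.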
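Putting that condition into code, a minimal sketch of a line reader might look like the following. The name read_line and its return convention are assumptions made for this example, not part of the standard library, and discarding the rest of an over-long line is just one possible policy.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post).
   Reads one line into buf and removes the trailing newline only if one
   was actually stored. If the line did not fit, the rest of it is
   discarded so the next read starts on a fresh line.
   Returns 1 for a complete line, 0 if the line was cut short,
   -1 on EOF or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    size_t len;
    int c;
    int truncated = 0;

    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* newline was stored: strip it */
        return 1;
    }

    /* no newline stored: consume what is left of this line */
    while ((c = getc(in)) != '\n' && c != EOF)
        truncated = 1;         /* real characters were thrown away */

    return truncated ? 0 : 1;
}
```

A call such as read_line(stdin, str, sizeof(str)) then leaves str without a trailing newline in every case, and the return value says whether any input had to be dropped.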

 
