Arrays are not pointers!

As I have mentioned before, anyone who wants to become proficient in C should own a copy of Expert C Programming: Deep C Secrets, by Peter van der Linden (you can find an online copy on archive.org). In it he covers a number of topics that students usually find confusing, especially pointers. One of the things Mr. van der Linden discusses is how “arrays are not pointers”. Confused? You are likely not alone.

Most C programmers are told at some point that “arrays are the same as pointers”. This is dangerous because it sets the wrong tone, and it is really a half-truth. It’s a bit of a chicken-and-egg problem: arrays look like pointers, and pointers can refer to array objects.

For example, people sometimes think that char s[] is identical to char *s. Outside of a function’s parameter list, they aren’t. The array declaration char s[12] requests that space for 12 characters be set aside, to be known by the name s. The pointer declaration char *p, on the other hand, requests a place that holds a pointer, to be known by the name p. This pointer can point almost anywhere: to any char, to any contiguous array of chars, or frankly nowhere. To illustrate this, consider the following declarations:

char s[] = "tauntaun";
char *p = "dewback";

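These produce two different structures. s is an array of nine chars holding the characters of "tauntaun" plus the terminating '\0'. p is a single pointer object that holds the address of a separate, unnamed array containing "dewback" and its '\0'.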

A reference like m[4] then generates different code, depending on whether m is an array or a pointer. When a compiler sees s[4], it starts at the location of s, moves 4 elements past it, and obtains the character there. When it sees the expression p[4], it starts at the location of p, obtains the pointer value stored there, adds 4 to that pointer, and finally obtains the character pointed to.

The reality is that it’s really easy to get confused in C, especially when books say things like “an array is merely a cleverly disguised pointer”. Here is a further example.

char s[] = "tauntaun";
char *p = s;

In this case, p is a pointer to char; it names a block of memory large enough to hold the address of a single char object. It is initialized to point to the first element of s. So *p is equivalent to s[0], *(p+1) is the same as s[1], etc.

One of the reasons people get confused is that books say things like “the compiler converts array subscripts into pointer dereferences, s[i] becomes *(s+i)”. Or they go further down the rabbit hole and declare that the following are all equivalent:

s[i]
*((s)+(i))
*((i)+(s))
i[s]

These four forms really are equivalent, but that is not something novice C programmers should be exposed to. Why are they equivalent? Because array subscripting is defined in terms of pointer operations. If you don’t believe me, then consider what C’s creator Dennis Ritchie has to say about it [1]. Arrays in C’s predecessors, B and BCPL, were pointers, as they were in the extended language “new B”. But there were issues, specifically with arrays and the creation of structured (record) types: in a structure containing an array there was “no good place to stash the pointer”.

“The solution constituted the crucial jump in the evolutionary chain between typeless BCPL and typed C. It eliminated the materialization of the pointer in storage, and instead caused the creation of the pointer when the array name is mentioned in an expression. The rule, which survives in today’s C, is that values of array type are converted, when they appear in expressions, into pointers to the first of the objects making up the array.”

Dennis Ritchie [1]

Too many books fail to explain when arrays are pointers and when they are not.

  1. Dennis Ritchie, “The Development of the C Language”, Bell Labs (1993)
