C obscenities – weird array notation

There are things one discovers about C that are just plainly stupid. Array notation falls into that category. What seems simple in some languages is made challenging in C due to the multiple ways an array element can be accessed. Part of this is attributable to the relationship between arrays and pointers, which is kind of murky to say the least. In the most basic sense, a C array declaration of the form:

int array[20];

will create an integer array containing 20 elements. The ith array element is then accessed in the following manner: array[i]. What this is really doing is accessing the memory at that particular location, which looks something like this (in pseudocode):

address(array[i]) = address(array[0]) + i * size(int)

For any array in C, that first element, array[0], is the linchpin, allowing access to the rest of the array. So setting the ith element of an array to the value 12 is normally achieved in the following manner:

array[i] = 12;

However, when Dennis Ritchie designed C, he created a unified treatment of arrays and pointers, one that exposes the “pointer-like” qualities of arrays through the “decay convention”: in most expressions, an array is treated as a pointer to its first element. Taking a subscript with value i is therefore equivalent to the operation “pointer-add i and then type-dereference the sum”. For example:

array[i] = *(array + i)

Similarly, for a 2D array:

array[i][j] = *(*(array + i) + j)

In many languages, this type of notation is “hidden” from the programmer. As the number of array dimensions increases, so too does the complexity of the expression needed. So here are the myriad ways that the value 12 can be assigned to the 3rd element of an array a.

a[2] = 12;
*(a+2) = 12;
2[a] = 12;

Whooooooa… what is that last piece of notation – THAT can’t be legal? Oh but it is. In other languages the use of such an obscenity would result in a syntax error of some sort. Not so in C. Because the [] operator is defined as above, and because addition is commutative, the following are equivalent:

array[i] = *(array + i)
array[i] = *(i + array)

which implies

*(i + array) = i[array]
array[i] = i[array]

This is a direct artifact of subscripting being defined in terms of pointer arithmetic, with the array standing in for a pointer to its first element. Nasty… and confusing.

