Memories of memory

One of the limitations (or benefits, if you look at it another way) of contemporary programming languages is that they deal with memory management so you don’t have to. Programmers don’t have to worry about stacks and heaps – they are hidden away in the dark recesses of the machine. C makes the stack and the heap explicit, but in most modern languages the distinction is invisible: values that look like simple local variables are, in most cases, objects allocated on the heap – and most programmers have no clue.

Python, for instance, is implementation-dependent in how it manages memory, although in most cases memory is handled internally by some form of memory manager. CPython, for example, uses a private heap containing all Python objects and data structures.

Does it really matter? In some cases no, not at all. There are situations when I don’t really care where or how something is stored. But there are other situations, such as embedded applications, where knowledge of memory management is paramount (hence the use of C-like languages on embedded systems).
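For readers who have never had to care, C makes the distinction explicit. A minimal sketch:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int on_stack = 42;                        /* automatic storage: lives on the stack,
                                                 gone when the function returns */
    int *on_heap = malloc(sizeof *on_heap);   /* dynamic storage: lives on the heap,
                                                 yours to manage */
    if (on_heap == NULL)
        return 1;
    *on_heap = 42;
    printf("%d %d\n", on_stack, *on_heap);
    free(on_heap);                            /* forget this and the memory leaks */
    return 0;
}

In most modern languages both values would look identical in the source, and the runtime would quietly decide where they live.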




Design of a string type for C

As mentioned in a previous post, strings in C are kind-of blah. Part of this has to do with the use of a string terminator, and the fact that they are not first-class objects. What about using length counts instead? How would this be achieved? One way would be to store the length count in a single byte at the start of the string, at index 0. This does have the effect of limiting the length of a string to 255 characters, but this really shouldn’t be a problem. If you are storing larger strings, there is likely a better data structure, or you could simply use an array of characters. Differentiating strings from character arrays in this way is similar to how Fortran deals with arrays. Too complicated? Unlikely, especially for the novice programmer, who would no longer have to deal with the string-terminator fiasco. Strings could also be indexed from 1..n without losing the precious “0-index”, which is used for storing the length of the string.

So what would a rejigged string look like? A simple string, with a maximum length of 255 characters, might look like this (adopting the keyword used by ALGOL 68):

string s;

This means that s[1] to s[255] would contain the characters of the string and s[0] would contain its length. A longer string might be achieved by using the modifier long. For example:

long string[2000] s;

This would create a string with 2000 characters. Making an array of strings might be accomplished by:

string[30] s[40];

This would create an array of 40 strings, each 30 characters in length. Another annoying feature of C is that it provides library functions for converting strings to numbers, but nothing as direct for the reverse. This could be fixed by having a cast operator (string), which avoids having to reach for sprintf().
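For the record, the workaround today looks something like this (a trivial sketch):

#include <stdio.h>

int main(void) {
    char buf[32];
    int n = 42;
    snprintf(buf, sizeof buf, "%d", n);   /* number-to-string: no direct cast exists */
    printf("\"%s\"\n", buf);
    return 0;
}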

Of course, in an ideal world these strings would be even more efficient if they were coupled with the ability to use substrings s[i:j], overload the + operator for concatenation, == for equality and use !s to return the string length, but then, maybe I’m thinking of another language…
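None of this syntax exists in C, of course, but the underlying idea is easy enough to emulate with a struct. A minimal sketch (the names here are illustrative, not an actual proposal):

#include <stdio.h>
#include <string.h>

/* A length-prefixed string: the length plays the role of s[0],
   so the characters can occupy indices 1..255 with no terminator. */
typedef struct {
    unsigned char len;   /* the "s[0]" length byte, capping strings at 255 */
    char ch[256];        /* ch[1]..ch[255] hold the characters */
} string255;

static void str_set(string255 *s, const char *c) {
    size_t n = strlen(c);
    if (n > 255)
        n = 255;                   /* clamp to the 255-character limit */
    s->len = (unsigned char)n;
    memcpy(&s->ch[1], c, n);
}

static int str_eq(const string255 *a, const string255 *b) {
    return a->len == b->len && memcmp(&a->ch[1], &b->ch[1], a->len) == 0;
}

int main(void) {
    string255 s, t;
    str_set(&s, "hello");
    str_set(&t, "hello");
    printf("length = %d, s[1] = %c, equal = %d\n", s.len, s.ch[1], str_eq(&s, &t));
    return 0;
}

Operator overloading (+, ==, !s) is of course off the table in C, which is exactly why these would have to be language features rather than library ones.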

How lucid is Lucid?

In 1976, a language named Lucid appeared from the University of Waterloo. What is interesting about this unconventional “data flow” programming language is that the order of statements is irrelevant, and assignment statements are equations. It seems to have been designed primarily for carrying out mathematical proofs, describing an algorithm in terms of assignments and loops. Here is a simple Lucid program to calculate the integer square root of a number N:

1  N = first input
2  first I = 0
3  first J = 1
4  next J = J + 2×I + 3
5  next I = I + 1
6  output = I as soon as J>N

Now let’s look at how it works:

  1. inputs N
  2. initializes the loop variable I
  3. initializes the loop variable J
  4. repeatedly generates successive values of J
  5. repeatedly generates successive values of I
  6. terminates the loop and outputs the result

Of course to the average programmer, this seems kind-of intuitive, not completely out of left field like some languages (Lisp anyone?). The authors describe the language as spartan – containing NO procedures (as distinct from functions), data structures, control structures or I/O. They go on to remark that “not having to worry about control flow is remarkably liberating” [1].

I doubt there is still a compiler anywhere, but it is a cool language to at least explore on paper.
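In the meantime, the dataflow program maps readily onto a conventional loop. A rough C rendering (with the caveat that Lucid’s next definitions take effect simultaneously, so J must be updated from the old value of I):

#include <stdio.h>

int main(void) {
    int n;
    if (scanf("%d", &n) != 1)      /* N = first input */
        return 1;

    int i = 0;                     /* first I = 0 */
    int j = 1;                     /* first J = 1; invariant: j == (i+1)*(i+1) */
    while (j <= n) {               /* output = I as soon as J > N */
        j = j + 2 * i + 3;         /* next J = J + 2*I + 3 (uses the old I) */
        i = i + 1;                 /* next I = I + 1 */
    }
    printf("%d\n", i);             /* the integer square root of n */
    return 0;
}

For N = 10 the pairs (I, J) run through (0,1), (1,4), (2,9), (3,16), and the loop stops with I = 3, the integer square root of 10.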

[1] Ashcroft, E.A., Wadge, W.W., “Some common misconceptions about Lucid”, ACM SIGPLAN Notices, 15(10), pp.15-26 (1980).

The shortcut if

In a 1973 paper critiquing Pascal, A.N. Habermann commented on the if statement. He suggested that the code:

i := if i=7 then 1 else i+1

more clearly expresses that a value is assigned to i than the statement:

if i=7 then i:=1 else i:=i+1

What do you think? Is the embedded statement easier to read? Maybe for an experienced programmer; perhaps not so much for a novice. It is similar to the problem found in C with the ternary conditional:

i = i==7 ? 1 : i+1;

This suffers from a lack of readability, mostly because of the use of two symbols, ? and :, to represent then and else. Using the actual words if, then and else would be too verbose. And it doesn’t seem logical to everyone to embed a decision statement within an assignment. But it does make for nice compact code.

Habermann, A.N., “Critical comments on the programming language Pascal”, Acta Informatica, 3, pp.47-57 (1973).

Pascal’s Achilles Heel

The Pascal programming language was designed for teaching. Anyone who learned programming in the 1970s and 80s likely did so using Pascal. One of the main idiosyncrasies in the design of Pascal is the use of semicolons. In C, semicolons terminate statements, so it is hard to use them in the wrong context. In Pascal, semicolons are statement separators, adopted from the syntax of ALGOL. This basically means they are omitted in places where the layout of the program would make them redundant. For example, consider this piece of code in C:

1 while (!odd(y))
2 {
3    y = y / 2;
4    x = sqrt(x);
5 }

whereas in Pascal the code would look like this:

1 while not odd(y) do
2 begin
3    y := y div 2;
4    x := sqr(x)
5 end;

The two statements on lines 3 and 4 are separated by a semicolon. The semicolon after the end on line 5 separates the while loop from the next statement. Most Pascal compilers will also accept the following:

1 while not odd(y) do
2 begin
3    y := y div 2;
4    x := sqr(x);
5 end;

But failing to add the semicolon at the end of line 5, as in:

1 while not odd(y) do
2 begin
3    y := y div 2;
4    x := sqr(x);
5 end
6 y := y - 1;
7 z := x * z;

This will result in an error of the form:

Fatal: Syntax error, ";" expected but "identifier Y" found
Fatal: Compilation aborted

Similarly, a semicolon before an else statement will effectively chop the if statement in two, causing an error. The following is the correct way:

if i > j
then maxi := i
else maxi := j;

Basically if you are writing programs in Pascal, remember the following two rules:
a semicolon before ELSE is wrong;
a semicolon before END is unnecessary. 

The devolution of usability

One of my hobbies is woodworking, and one magazine I liked a lot before it disappeared was Woodworking Magazine. Or rather, it disappeared by merging with Popular Woodworking to become Popular Woodworking Magazine. What is amazing is the evolution of the websites, because normally websites improve over time. Not so in this case. The first image shows the webpage of Woodworking Magazine in 2005. The magazine carried no ads, and its website reflected this: no ads, and a very clean front page in which the current issue is clearly the mainstay.


In comparison, consider the Popular Woodworking site in 2005. It too depicted the current issue of the magazine, and was quite clean, even though there were some ads on the website. The information on the left side of the webpage was well organized, making it easy to find relevant information.


After the merger circa 2010, the website too evolved into a hybrid (shown here in 2011). The menu has transformed from vertical to horizontal, and a video stream has been added.


Finally a snapshot of the website from 2016. It is now an extremely busy website festooned with advertising.


Compare this to the Fine Woodworking website, which offers a much cleaner browsing experience. There are ads, but they sit lower down, so as not to crowd out the content in the opening portion of the website. Everything is easy to find, and the use of whitespace makes things stand out.


Structural erosion in old code, i.e. rot

Old code rots. Not in the traditional sense of the word, of course: there is no physical decay, and it doesn’t smell. It usually manifests itself as code that slows over time, or just stops working. So in reality it may be more accurate to say that it erodes, in the same way that steel rusts, slowly weakening. Structural erosion of code occurs for a number of reasons. Sometimes the environment itself changes: code is ported from an old system to a new one, and runs, yet the increased speed of the new system may have a negative impact on the old code because of assumptions baked into its algorithms.
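A classic culprit (my example, not from the original code in question) is the hand-calibrated delay loop:

#include <stdio.h>

/* A delay loop tuned by hand for one particular CPU. On a machine
   ten times faster, the pause is ten times shorter, and any timing
   the surrounding code relied on silently breaks. */
static void delay_ms(int ms) {
    volatile long i;               /* volatile keeps the compiler from
                                      optimizing the loop away */
    for (i = 0; i < 10000L * ms; i++)
        ;                          /* busy-wait */
}

int main(void) {
    puts("polling device...");
    delay_ms(500);                 /* "half a second" - only on the original hardware */
    puts("done");
    return 0;
}

Port that to faster hardware and it still compiles, still runs, and no longer does what its author intended.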

A good example is old technology. iPhones (or any phones, for that matter) last for a certain number of years. At some point the technology changes, and they are no longer capable of updating the operating system. The iPhone then becomes a tomb for the software within it. Without an OS update, it is likely that before long apps will no longer be able to be updated either. Or maybe an app is no longer supported, so it may not function properly if transferred to a new iPhone. The same thing happens with compilers. Really old code rots because it requires a *lot* of changes to make it function, possibly because the software development environment has changed too much.

The same erosion shows up in websites, where dependencies such as links cease to exist, causing the website to become dysfunctional.

The user interfaces of Star Trek – vocal

One of the more interesting aspects of the computer systems on the Enterprise is the human-computer interface. Computer stations are equipped with audio I/O and a seemingly unlimited vocabulary of unrestricted English. Here’s an example:

Computer. Digest log recordings for past five solar minutes. Correlate hypotheses. Compare with life forms register. Question: Could such an entity within discussed limits exist in this galaxy? (Episode: Wolf in the Fold)

There is no way, with current technology, that we could achieve such an understanding of English, or of any language for that matter. The request also implies quite a high level of intelligence on the part of the computer itself.

What about the whole speech thing?
So the Enterprise relies heavily on speech recognition and semantic comprehension of natural language. Speech recognition takes phonemes (speech sounds) and tries to assemble them into words. In Star Trek, recognition of spoken words has been completely solved. In 1977 the state of the art was roughly 1000 words recognized for a single speaker. Is it any better today? Today we have Siri, maybe the forefront of speech I/O. Microsoft apparently has a word-error-rate (WER) of only 6.3%, slightly lower than IBM’s Watson team at 6.9%. In 1995 the WER was 43% (IBM). Speech recognition has always been challenging because every person’s speech is so different, but great strides are being made.
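(For reference, WER is conventionally computed by aligning the recognizer’s output against a reference transcript: WER = (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words, and N is the number of words in the reference.)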

Aside from this, semantic comprehension – actual understanding – is a completely different ballgame. What progress has there been on the design of algorithms that analyze the meaning of statements?

Schmucker, K.J., Tarr, R.M., “The computers of Star Trek”, BYTE, Dec. pp.12-14, 172-183 (1977)

Survival usability

If you were stuck on an island in the middle of the Pacific Ocean, what tool would you want to have? What would be the smallest, most usable survival tool? What about this: the READYMAN Wilderness Survival Card. In fact READYMAN makes survival cards for fishing, medical situations, and even hostage escape.


It has 22 tools: fish hooks (9), snare locks (4), arrow heads (3), sewing needles (2), saw blades (2), an awl, and a set of tweezers.



Are some people just clueless?

I don’t profess to love technology; in fact I am very wary of it. But I do enjoy programming, and I realize that computers aren’t going away anytime soon. So it saddens me when people within universities have no clue whatsoever that one of the top growth areas is computer science. Why? Partly because it’s relevant. Students who do CS get jobs. Good-paying jobs. It’s hard to find a job, in the sciences anyway, that doesn’t require some programming skill to process the vast amounts of data being produced. We have come to the point where it is hard to ignore computer science education.

Here’s the bottom line. CS enrolment in most places has ballooned in the past few years. At Cornell, enrolment has gone from 175 in 2011 to 684 in 2016. In fact, institutions across North America have seen CS enrolments double and triple in the past five years. Just a little blip really. Nothing at all provocative about those numbers. On the other side of things, Canada’s demographics are changing – over the next 10 years the number of 17-24 year olds will actually decline. Fewer students = less government funding. This could be offset by increasing enrolment in CS, OR by offering new forms of degrees – 4-year co-op degrees, more 3-year degrees, industry-focused degrees, etc.

The worst thing to do? Ignore computer science. Put no effort into building innovative degrees, or investing in infrastructure.

Innovation needs computer science. But apparently not everyone understands that.