Programming languages are the interface to the machine

We interact with machines in many ways, but the most primeval means is by writing programs. Programs become the soul of the machine, without them, the machine is useless, soulless. The programming language is the medium between the world of humans and the world of 1’s and 0’s.  Yet despite 60 odd years of programming language design, we are yet to design one which is easy for everyone to use. This is partially due to the intricacies of the machine, but also due to the fact that we don’t design for language usability per se. The easiest language for people to learn programming language concepts is a simple one – very little syntax, easy to decipher. The problem of course is more complex than that because programming does not only involve writing code.

There is the task of solving a problem, then developing an algorithm, and converting that algorithm into a programming language. Nothing is trivial. For example it is easy to state a problem of the form: “find all faces in a photograph“. The problem *seems* simple because to humans it is. Developing an algorithm is considerably more complex.

There are also differences between natural language and the programming language, what some would term linguistic transfer. Programming is difficult for novices because of how humans interpret a language. A case in point is the use of  while to indicate a loop.

while (n < 1000)
  -- do something constructive
end

The term while could be construed as a period of time, but more often it is misinterpreted as being applied continuously (i.e. a continuously active test), versus being evaluated once per iteration of the loop. In English one could say “while the sun is shining, play outside“, which is a loop of sorts, but not one we actively think about. We don’t look at our environment every minute to determine whether the sun is still shining. A novice programmer may misconstrue a while loop to terminate the minute the condition is satisfied.  It is an instinctive thing. Ada may actually have had a good idea with the concept of a loop where the exit is introduced within the loop itself.

loop
   exit when (n < 1000);
   -- do something
end loop;

We know a physical loop is continuous, be it in the form of a rope loop, or a loop in a road (or even the new Apple HQ building at Apple Park). Leaving the loop in the road involves making a conscious choice to leave while in the loop. This actually makes sense in many ways because there is one generic loop to deal with most situations. It could also deal with the case where one wants to do something n times, but that may itself be better expressed using something like a repeat statement:

repeat n times
   -- do something
end

Or what about this?

do repeat i with values 1 to n
   -- do something
end do

Too wordy? Likely. The same issues are encountered by the if statement. Novice programmers might not yet understand the sequential life of a program, and so they might consider an if statement to be “waiting” continuously, instead of being executed in sequence. Something like “if it starts raining, open an umbrella“. It might seem easy to the experienced programmer, but not so to the novice.

Is there a definitive solution? Unlikely. The use of the existing programming vernacular is ingrained into programmers. We have never really delved into the terms used for loops since the for loop appeared in Algol, and the while loop shortly afterwards. It might be possible to make small changes over time… but computer science rarely does things in small ways. We likely have to find better ways of teaching people to program.

 

 

Advertisements

Debunking myths about programming languages

Super programming languages are best.
There is no such thing as a “super” programming language. There is no “secret weapon” that will make writing a piece of software easier, and faster. If the algorithm isn’t any good, it doesn’t matter what the language is, it won’t save it. Garbage in, garbage out.

All languages are created equal.
Ah, no. Certain languages are good at some things, others are good at other things. For instance C is a fantastic systems language, Cobol is great for business stuff, Fortran is good at scientific work, and parallel processing, Ada is good for real-time systems. There is *no* one language to rule them all.

Fortran and Cobol are old, and quite useless.
Really? Fortran is 60 years old, C is 45. There isn’t much difference. You may not hear much about Fortran or Cobol, but that’s because they are quiet achievers. Not brash and loud-moutheed like Java. It doesn’t matter how *old* a language is, as long as it can do what you want it to do.

Object-oriented languages are better.
Not. OO appeared in the mainstream in the mid 1980s in the guise of C++. OO is a *methodology* for designing software, not a religious experience. Many languages (even Cobol) have gone down the OO route. Some people love OO so much one wonders if it is a cult of some sort.

GOTOs are harmful.
Yes, yes they are. Don’t think that you should use them in your programs. The unrestricted use of goto’s stems from the early years of languages such as Fortran. They were “it”, before other structures appeared. If your program is using a lot of them, then there is something wrong. Experienced programmers know where they can use goto’s. If you don’t understand where problems could occur, don’t use goto.

Compiled code always works perfectly.
??? The compiler will try and turn your program into something that can be interpreted by a machine. If the program compiles, it just means that the proper syntax was used, and the compiler was happy. Doesn’t mean there isn’t some hidden logic error in there somewhere. The larger the program and the  more complex the algorithm, the more likelihood that something might go wrong, sometime.

Python is better than C.
Like I said, no one language is better than another. C does well with systems programming, dealing with memory, low level stuff. Python is a good glue language, and is good for rapid development. C is a compiled language, Python is an interpreted language. C’s programs are fast, Python is slower. C is not an easy language for novice programmers, Python is. C requires very few dependencies, Python requires many depending on what you want to do. They are different, and that’s okay. And besides Python is written in C.

 

 

 

A simple programming language

In “The Humble Programmer”, written by Dijkstra in 1972 he made the following statement:

Another lesson we should have learned from the recent past is that development of ‘richer’ or ‘more powerful’ programming languages was a mistake in the sense that these baroque monstrosities, these conglomerations of idiosyncrasies, are really unmanageable both mechanically and mentally.”

And to be honest, he was probably right… but that wasn’t even the worst of it. If he thought Algol 68 was a monster, C++ and Ada would balloon to gargantuan proportions. Have we lost the ability to create simple programming languages? I don’t necessarily mean for the purpose of creating commercial applications – the nature of interaction with hardware, networking etc, makes languages with large scale structure necessary. But along the way we have used these commercial languages to teach novices to program – and it hasn’t really worked.

One of the reasons why students become disengaged with introductory programming courses may be the vast amounts of language knowledge required, on top of learning problem solving skills, and algorithm design. It may be too much for people not use to thinking in a “programming” mode. Learning about language-specific syntax, and memory management, design, testing, blah, blah, blah – it may be overwhelming. Which is surprising, but then those of us who learned programming in the 1980s  learned to program in Pascal, or maybe Fortran. Before languages added things like OO, became “functional”, and masters of everything that a language could be, they were simple – no really they were.

There were languages created specifically for teaching programming. Pascal was one. S-Algol was another (it appeared after Pascal).

Here is a program in S-Algol which performs a Bubble sort:

let n = readi
let x = vector 1::n of 0.0
for i = 1 to n do x(i) := readr
for i = 1 to n-1 do
   for j = 1 to n-i do
      if x(j) > x(j+1) do
      begin
         let temp = x(j)
         x(j) := x(j+1)
         x(j+1) := temp
      end
write "The sorted numbers are 'n'"
for i = 1 to n do write x(i), "'n"?

The structure of this language was very simple. It was from a pre-OO time when the programming world was simpler, happier.

 

Why Algol 68 was weird (and ultimately failed)

Look at this piece of code written in Algol 68, which finds the largest element of an array:

proc max = (ref[,] real a) real:
begin real big := -maxreal
   co assign the maximum real number allowed co;
   for i from 1 lwb a to 1 upb a do
   for j from 2 lwb a to 2 upb a do
   if a[i,j] > big then big := a[i,j] fi
   od od;
   big
end

It summarizes many of the things that were wrong with the language, and likely why it would have been a challenging language to teach people to program in.

  1. It uses a *lot* of acronyms. lwb is short for lower-bound.
  2. It uses control-structure terminators (for bracketing) that are the reverse of something in the beginning of the control structure… but not consistently. For example the for-do loop ends with od, and yet the if-then clause ends with fi. Shouldn’t that end in neht, or should the for loop end in rof? Not the smartest choice in the world.

There are elements of Algol68 that are inherently functional, however on the whole the language is mired by complexity. For example, consider the six forms of coercion (down from eight in the first report), something that would likely overly confuse the novice programmer: deproceduring, dereferencing, uniting, widening, rowing, voiding. Algol68 suffered from (i) being developed by committee – far too many people wanting far too many things incorporated into the language, and (ii) not being implemented widely enough, because implementation was a tricky process.

Check out this piece of code, which calculates a factorial:

Which is not exactly easy to read. Guess they didn’t care much about formatting either – not unusual for the time.

Reading a copy of the “Report on the Algorithmic Language ALGOL 68” might send you into a mild seizure, and even the Algol68 Genie learning guide is not for the faint-of-heart. I think if my first class in programming had involved Algol68, I would not have continued in computer science.

I think Algol 68 is summed up by this quote from  Aldous Huxley in his book Those Barren Leaves: It is the great dead language of the future. By the time Algol68 made its appearance, there were other hounds already barking at the gates… simpler languages that would take the lead.

 

 

Abbreviations in programming languages

Ever wonder where those wonderful abbreviations in C came from? You know += ? If you delve into Algol 68, you will notice some of them there.

Until Algol 68 datatypes were mostly indicated as keywords such as integer, real,  character. As the new kid on the block, Algol 68 introduced abbreviations that unlike PL/I were not optional. So integer became int, character became char, Boolean became bool (pity C didn’t make booleans integral from day one). For the novice programmer, this does tend to make programs more challenging to read. It also coined the term void. Algol 68 also allowed datatypes to be lengthened and shortened, in a way which will be inherently familiar to C programmers:

long real, long int, short real, short int, long long real, long long int

See the similarity? Of course it doesn’t end there. Algol 68 also allowed a form of optimization as it relates to operators:

sum plusab x

Look familiar? This is the “plus and becomes” operator, which can also be written as +:=, and in C as += . It’s like a piece of assembler jumping into your code… and the presence of this in the code was meant to signal the compiler that this code can be optimized. There were others of course: minusab, timesab, divab, and overab/modab for integers.

 

Experiential programming (with Julia) (ii)

So once the novice programmer is comfortable with dealing with small pieces of data, the purview of problems can be extended to say deal with 2D data such as simple monochromatic images. So let’s look at the “noise” problem. Some images can contain a type of noise called “impulse” or salt-and-pepper noise – random black (0) and white (255) pixels. Below is an (extreme) example of noise on a small image of a dome:

In this case, the algorithm is again a simple one, based on median filtering the image. This involves looking at each pixel in the image, calculating the median value of some region around that pixel, and then assigning this value to a new image in the same location. This could be explained i using some form of diagram showing how a picture is made up of individual pixels. Basically a median filter removes “extreme” pixels, which could represent the noise.

It will extend the experiences with both arrays, loops, and the median() function. In algorithmic terms.

  1. Read in a monochrome image called noisyimage.txt (it’s a simple text image), and store it in a 2D array (matrix).
  2. Find the size of the [image] array, and create a new array to store the “cleaned” image.
  3. For every element (pixel) in the 2D array, select a 3×3 neighbourhood around the pixel, and calculate the median value.
  4. Assign this median value as the new pixel in the output image.
  5. Write the output image in a file called cleanimage.txt.

An important part of experiential learning with respect to programming is also program comprehension. So instead of creating the code, from the algorithm it is also possible to provide the code and allow students to read the code and try and associate the parts of the code with the parts of the algorithm.  The Julia program to perform this noise suppression is only 9 lines in length. Here is what the program looks like:

imgIn = readdlm("noisyimage.txt",Int16::Type)
dx, dy = size(imgIn)
imgOut = zeros(dx,dy)

for i=2:dx-1, j=2:dy-1
   block = imgIn[i-1:i+1,j-1:j+1]
   newPixel = median(block)
   imgOut[i,j] = trunc(UInt8,newPixel)
end

writedlm("cleanimage.txt",imgOut)

Colour-coding helps to associate like regions of code. For example, the code in red here is associated with input/output, blue with setting data up, and purple with actually performing the median filtering.

Swift fixes C’s “issues”

Regardless of what you think about Apple’s new language Swift – it has fixed many of the underlying issues with C, and it’s descendants. The number one thing may be that they actually gave the language a new name that it *not* related in anyway to C. Some will see it as some form of evolution of Objective-C, but it’s not. Language design has always involved taking design prompts from existing languages. Since its inception in the early 1970s, few have attempted to fix some of the inadequacies of its language structure. What are some of its “fixes”?

Strings are fully supported

Unlike C, where strings are character arrays, strings in Swift are represented by the String type. This makes life easier, because operations like equality can now be done using ==. The downside? It is  not possible to access an element of a string via an integer index (although it is still possible to access it). There is a cornucopia of information on Swift strings here at Ole Begemann’s blog.

Assignments do not return a value

One of the problems in C is code of the form:

if (x = y){
    i = x + 1;
    vortex[i] = 0;
}

This effectively does not compare x and y, but assigns the value of y to x which is always true (unless y is zero). The assignment operator in Swift does not itself return a value, preventing the use of  i=0 instead of i==0 in if statements.

No need to use break statements in switch blocks.

The case statements in Swift do not fall through the bottom of each case and into the next one by default. To do that you have to explicitly use the fallthrough clause.

switch does intervals.. AND compound cases

Finally! A switch statement that allows for intervals in the case clauses.

let grade = 73
var lettergrade: String
switch grade {
    case 0..<50: lettergrade = "F"
    case 50..<60: lettergrade = "D"
    case 60..<70: lettergrade = "C" 
    case 70..<80: lettergrade = "B"
    case 80..100: lettergrade = "A"
}

And even compound cases.

let ch: Character = "o"
switch ch {
    case "a", "e", "i", "o", "u": print("vowel")
    default: print("consonant")
}

Integer overflows are trapped as a run-time error.

Can be allowed through the use of a special integer operators: &+, &-, &*, &/, and &%.

Braces in if are not optional

In C, you can write a single statement after an if, and there is no issue. Multiple statements require the use of braces { }, to contain the statements. This can lead to programming errors in code when people forget to include the braces. For example:

if (x > y)
    printf("%d is greater than %d\n", x, y);
    max = x;

Here the last statement is executed regardless of whether x is greater than y or not, because the braces have been omitted. Swift does not allow for the omission of braces around the statement, they are mandatory… so no more issues like the example above. (Or dangling-else for that matter)

The use of for-in

Swift provides a for of the for loop which allows iteration over “containers”, or numeric ranges. For example:

for index in 1...5 {
    print("\(index) times 5 is \(index * 5)")
}

printf be gone!

As you will notice from the previous example, the values are printed out using an overloaded print statement.

do-while becomes repeat-while

Just a small thing I guess, but the word repeat is more meaningful from a looping perspective than do (especially to novice programmers).

NO GOTO

Another language decides that enough of goto is enough. It’s not there, so don’t look for it. Surprisingly you can create a similar feature (and here is a blog post that shows you how).

FUNCTIONS RETURN MORE THAN ONE VALUE

Yes, like many of its contemporaries, Swift allows more than one value to be returned via a tuple.

func circle(radius: Double) -> (Double, Double) {
    var area=0.0, circumference=0.0 
    area = 3.14159 * radius * radius
    circumference = 2.0 * 3.14159 * radius
    return (area, circumference)
}

This is called in this manner:

let values = circle(7.0)
print("area is \(values.0) and circumference is \(values.1)")

There are *many* other features, like the use of optional parentheses in control structures. I kind of like the type inference as well. Both these are the same.

var life:Int = 42
var life = 42

You can also run Swift as an interactive session from the terminal. For more basics try Swift by Example.

 

Experiential programming (with Julia) (i)

In experiential programming, the concepts of learning to program are taught as a side-effect of solving a problem. This is sometimes called problem-based learning. The syntax of a language is introduced when it is needed to solve a problem. The novice programmer then obtains an understanding of how a particular piece of programming language syntax can be used in an algorithmic context. Consider a simple example which finds the median of a series of user-input numbers. In its simplest context, the problem involves three steps:

  1. Obtain the user-input numbers.
  2. Calculate the median value.
  3. Output the median to the user.

So first, we ask the user how many numbers they wish to enter.

println("How many numbers? ")

This introduces the most fundamental output device in Julia, the function println(). Next this piece of information has to be obtained from the user.

n = parse(chomp(readline()))

This seems more complicated, but it basically reads the line of input, “chomps” the <return> from the end, and then turns the input into a number.

Next, a storage medium is created to store the values entered. This vehicle is an array of numbers, in this case with zeros as starting values.

t = zeros(n)

Now a loop can be introduced to allow for the n numbers to be input.

for i=1:n
   t[i] = parse(chomp(readline()))
end

The final part of the puzzle involves calculating the median value of this set, and outputting  this value out to the user. This is relatively easy, using the function median().

m = median(t)

Now output it:

println("The median value = ", m)

This 8-line program works quite nicely and introduces some of the basic programming ideas:

  • input from and output
  • using a container to store more than one piece of data
  • iterative structures called loops
  • using a built in function, in this case a statistical function

There is no need at this point to discuss datatypes – the way the program has been written, all input numbers will be assumed to be float numbers¹. The great thing about this Julia program is the lack of things that can make life more complicated for a novice programmer. There are no objects, no datatypes, and no having to deal with memory, or formatting instructions for input/output statements.

println("How many numbers? ")
n = parse(chomp(readline()))
t = zeros(n)
for index=1:n
   t[index] = parse(chomp(readline()))
end
m = median(t)
println("The median value = ", m)

Yes, there are some words which will seem strange to novice programmers,i.e. parse and chomp. But, there are many other things that exist in other languages that are transparent here. Performing a median calculation in C would require building a function called median(). And unlike Python, (and C) where arrays begin at element 0, arrays in Julia begin at element 1… so an array with 10 elements is indexed from 1 to 10. Now this code could be easily modified so that median() is replaced with mean(), or var() (variance).

¹Experienced programmers may loathe this idea of using floating-point numbers as the default, but it does not harm anything and makes life way easier for the novice. They may also be troubled by the dynamic typing, but this takes away the problems associated with declaring the variables index for example.

Now this could be used as the jump off point for asking the novice to modify the code to add a feature like calculating the sum of the numbers, and outputting this to the user. (In Julia this is as easy as using the sum() function.

 

Why programming for non-CS students is different

There is a lot of literature in the CS community about the reasons students drop-out of introductory programming courses. There are two streams of students: those who will go on to immersive careers in computer science (e.g. software engineering), and those who will use programming as a tool in their discipline, maybe to analyze process data. The latter includes fields like finance, physics, and bioinformatics. They are two vastly different groups of people, and it is almost impossible to create one course that suits everyone. Teaching a course based on a language such as C or C++, which requires knowledge of low level concepts such as memory management, will alienate non-CS students. Spend too much time on high-level use of a language, and the CS-majors will miss critical information on low-level concepts.

Often it comes down to difficulties associated with the programming language being used to teach programming concepts. For example, non-CS students don’t really need to know about object-oriented programming – it often just complicates things from the perspective of building programs. Game developer John Carmack once said “Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function“. They also don’t need to know that much about memory, and certainly not to the complexities of dynamic memory, or indeed complex ways of coding things to make a program more efficient, e.g. vectorization.

So what to teach a non-CS student in the way of learning to program?

  1. How programs can be used to solve real-world problems. This often involves the use of real-data, e.g. digital images. Computer-centric examples are often irrelevant for non-CS individuals.
  2. Focus on problem solving.
  3. How to formulate a problem as an algorithm. For example if the problem is  removing random noise from a monochromatic photograph – how do we do this?
  4. Learning the basic constructs of programming – making decisions, repeating things, how to store data (from a high-level perspective).
  5. Writing the programs to implement the the algorithms
  6. How to test programs.

Choice of language is extremely important, and from that perspective Julia is a perfect language to teach novice programmers who don’t want to make a career of it, but want the skills to help them in their disciplines. Teaching programming can also be moved to a more experiential process, whereby languages syntax is introduced as they are used in the problem solving context.

The design of dead languages…

Once you are versed in a few languages, in principle you should be able to read any piece of code and decipher some idea of what it is doing. Take for example the following piece of code below. Obviously from the code you can quickly discern that it calculates Fibonacci numbers, but what happens in the code, and what language is it?

Firstly, don’t fret about the language, likely you have never seen it before… it is S-algol, short for St. Andrews Algol (a dialect of Algol 60). It doesn’t differ terribly from any other procedural language. The first part of the program revolves around the two phrases let and write. Obviously write is used for output. The use of let likely seems strange to many programmers. It basically “introduces” a variable, i.e. declares it with an initial value. Interestingly Swift uses let to assign constants. The while loop is fairly trivial (no parentheses, compound statement delineated by begin-end).

The rest of the program is pretty self-explanatory. There is an if-do statement, which uses = for equality, and relegates assignment to the classic := operator. Basically this program prints the first 42 Fibonacci numbers in a 6 column by 7 row format – the “‘n” outputs a newline after six columns. You’ll notice that semicolon is used, but only to separate two or more statements on a single line (that almost makes sense).

Once you know the basic control structures used in programming, it shouldn’t be that hard to decipher most pieces of code (languages like APL are an exception of course). Even dead languages have a tale to tell about their design, and what features could evolve into future languages. What would I choose from S-algol?

  • The use of the words reads and write is extremely intuitive – print always has associations with a physical printer. The reads function is quiet unique because it is used in the form x := reads , which does not require parameters.
  • The language uses dynamic typing, which is odd for a compiled language, however the language was designed to “deduce” the type from its declaration using let.
  • Strings are integral types (Yah!). String constants are easy to create: let s := “olympus” . The ++ operator can be used to concatenate strings.
  • Arrays are vectors, such that let x := vector 1::8 of 0 creates an 8-element integer array containing zeroes. very clean syntax. The use of the term vector is good because it builds on existing knowledge.

Negatives? The languages uses both = and := for assignment in vectors to distinguish between constants and variables. This seems kind-of vague from the examples I have seen, and could lead to confusion.

P.S. If you want to play with the S-algol compiler you can find the sourse code here.