Musings on why the Towers of Hanoi is hard to comprehend

So we have looked at the recursive solution to the Towers of Hanoi (ToH), and it seems somewhat magical. If you work through the trace line by line, you can see how the problem is in its essence recursive. So why is the Towers of Hanoi such a horrible recursive problem for novices? Although the problem seems easy, and the solution exudes elegance, explaining that solution is far more convoluted, which makes ToH a poor fit for the novice programmer.


Part of the problem lies with the problem itself. It is best illustrated in a tactile manner, so that it can be physically manipulated. It works on paper too, but obviously playing a game is much easier.

Before its use as one of the poster-children for recursion, the “Tower of Hanoi” was actually used in experimental psychology. One study from 1962 looked at the effects of verbalization on problem solving [1]; more specifically, it investigated how verbalizing during problem solving affected problem solving effectiveness. Here however they refer to the problem as the “three circle problem”, i.e. “the task of transferring discs graduated in size from one circle to another in a triangular configuration of three circles”. This work was based on previous work from 1932 [2], yet the task is a doppelgänger of the Tower of Hanoi.

The work [2] looked at solving the problem with a variable number of discs, from 3 to 8, even asking participants to “try and find a general rule by which the problem may be solved”. The authors posed a series of differing trials, some with generalized instructions, others with none. When participants had finished solving 8 discs, they were asked for (i) a general statement of the most efficient way of transferring any number of discs, and (ii) a formula for computing the least number of moves necessary for transferring any number of discs. What was clear from the experiments is that the task, although a tactile one, was difficult for the subjects. Some took upwards of 3 hours, and one participant made nearly 15,000 moves. It was evident that the groups provided with generalized instructions fared better than those with none, who were not even able to formulate a generalization.

Indeed there has been some exploration into why problems like Tower of Hanoi are hard to solve from a problem solving perspective [3]. But why the difficulty? Here are some insights [3]:

  • The problem rules can be a major source of difficulty.
  • Rules consistent with real world knowledge aid problem solving, as does the physical representation in the ToH problem.
  • Intensive training on the rules significantly decreases the difficulty of the problems.
  • Participants cycle through a substantial amount of exploratory behaviour before they reach a point where they can proceed quickly to their goal.

Kotovsky et al. [3] concluded that problems that impose higher processing loads require more training. A problem solver has a limited amount of processing capacity that is easily overloaded by the memory requirements of unfamiliar problem rules. The difficulty in solving the ToH algorithmically is likely grounded in some of the psychological aspects of problem solving. In part it may be the digital nature of how the problem is presented. External representations such as a physical Tower of Hanoi game can serve as memory aids, extending working memory [4]. But external representations can also be intrinsic to the problem; without them the problem may actually change in nature. They can be used to guide, constrain, and even determine cognitive behaviour [4].

The problem is that ToH, like similar problems, can be solved using a physical game, or even on paper; it is the process of making the solution recursive that is the issue. When solving the problem, people aren’t necessarily doing so using a recursive cognitive process. A computational function is recursive if (and only if) one of its processes is a call to itself. This results in an internal application of the same function, while keeping whatever it was doing before in memory, creating a chain of deferred functions. While it might be possible to solve the ToH with three discs using working memory, the more discs added to the problem, the harder this becomes.

[1] Gagne, R.M., Smith, Jr., E.C., “A study of the effects of verbalization on problem solving”, Journal of Experimental Psychology, 63(1), pp.12-18 (1962)
[2] Ewert, P.H., Lambert, J.F., “Part II: The effect of verbal instructions upon the formation of a concept”, Journal of General Psychology, 7, pp.400-412 (1932)
[3] Kotovsky, K., Hayes, J.R., Simon, H.A., “Why are some problems hard? Evidence from Tower of Hanoi”, Cognitive Psychology, 17, pp.248-294 (1985)
[4] Zhang, J., “The nature of external representations in problem solving”, Cognitive Science, 21(2), pp.179-217 (1997)

Recursion and the human mind

Although not completely natural, recursive thinking is uniquely a human trait, and it has been suggested that it is one of the characteristics that distinguishes human language from other forms of animal communication [1]. Corballis [2] suggests that recursion gives humans the ability to “mentally travel in time”. This is based on the idea of episodic memory, which refers to particular episodes of life that can be relived in our minds. This differs from semantic memory, which is the storehouse of knowledge. Canadian neuroscientist Endel Tulving [3] suggests that recovery of semantic memories requires noetic awareness, which is simply knowing. Conversely, the recovery of episodic memories requires autonoetic awareness, or self-knowing. Episodic memories are considered to be recursive, as they involve making mental references to an earlier mental self. Therefore the recursiveness of thought is intrinsically tied to the notion of self, or as Corballis puts it, “not merely knowing that one is a physical object, but knowing that one knows, or knowing that one has mental states”.

This concept is fundamental to the evolution of technology. Tool use is by no means unique to humans; however, humans are the only animals that have been observed using a tool to make a tool, which is then used to make a tool [4]. A blacksmith’s hammer is used to craft a hammer, which is then used to craft a hammer, and so on. Development of tools requires mental travel into the past to explore previous hammer designs, as well as into the future to envision new ones.

However the identification of recursive processes used on a daily basis is elusive, as they are often attributed to other methods [5]. Imagine, if you will, a tree that has been felled. The act of de-limbing the tree (i.e. cutting off all its branches to leave just the trunk) could be thought of as a recursive process:

       delimb(tree):
          if (tree has limbs)
             cut off lowest limb
             delimb(tree)

The problem is that this process is just as easily described in an iterative manner:

       while (tree has limbs)
          cut off lowest limb

And in reality, this is how the process is perceived by humans. Humans don’t actively think about recursion. Sometimes we may use recursion to solve problems, but we likely don’t realize it.

[1] Hauser, M.D., Chomsky, N., Fitch, W.T., “The faculty of language: What is it, who has it, and how did it evolve?”, Science, 298, pp.1569-1579 (2002)
[2] Corballis, M., “The uniqueness of human recursive thinking: the ability to think about thinking may be the critical attribute that distinguishes us from all other species”, American Scientist, 95(3), pp.240-248 (2007)
[3] Tulving, E., Elements of Episodic Memory, London, Oxford University Press (1983)
[4] Beck, B.B., Animal Tool Behavior: The Use and Manufacture of Tools by Animals, Garland STPM Press, New York (1980)
[5] Vitale, B., “Elusive recursion: A trip in recursive land”, New Ideas in Psychology, 7(3), pp.253-276 (1989)

Programming is…

With apologies to Douglas Adams [1]:

Programming is hard. Really hard. You just won’t believe how vastly, hugely, mind-bogglingly hard it is. I mean, you may think that making a decent Macaron is hard, but that’s just peanuts to programming.

[1] Adams, D., The Hitchhiker’s Guide to the Galaxy, Harmony Books (1979)

Teaching Grade Ones to code – Are U nuts???

Ontario has released its new math curriculum for elementary schools, and although a pandemic is not the most optimal time, it is to be applauded that there is some movement in teaching real math skills like everyday finance. There is one sticking point though, and that is introducing coding in first grade. I’ve talked about this before (Coding? Why kids don’t need it), so I am hoping someone doesn’t do anything completely crazy and start teaching programming of any sort. If they start with basic problem solving skills, that will go a long way to inducting students into the world of programming. And I say programming, because coding is a terrible word to use – coding means just that: writing actual code.

The term programming is better, because it is more inclusive of the world of solving problems using a machine. Review the problem, look for solutions, turn those solutions into algorithms, code the algorithms using a programming language, test the solutions, etc. Even worse is if they use the wrong tools, or heaven forbid it is taught by people who have no understanding of programming. Doing it wrong in grade one could have long-lasting effects.

Here are the supposed skills for coding in Grade 1:

C3.1 solve problems and create computational representations of mathematical situations by writing and executing code, including code that involves sequential events
C3.2 read and alter existing code, including code that involves sequential events, and describe how changes to the code affect the outcomes

Like, are they kidding? They are kidding, right? This is grade 1, and they expect them to write code? It will be interesting to see what sort of environment that is done in. Of course, read on to the expectations for grade 8 and they aren’t that much different, which suggests there likely isn’t a complete plan.

Lots of people seem to think kids should learn to code, yet one of the biggest problems with students coming into university is that they can’t solve problems. I would say the optimal solution is having them spend grades 1-4 just concentrating on problem solving. There are thousands of mathematical and physical puzzles that are interesting, and that help build those skills. Or better still, spend some time problem solving in the physical world; it doesn’t even matter if the problem being solved has no relation to programming, or machines. Building things with Lego is an awesome way to build those skills (and I’m not talking step-by-step models).

There are definite benefits to learning to code, but not everyone is interested in “coding”, and not everyone is interested in STEM. Here are some supposed reasons as to why kids should learn to code:

  • Coders are in high demand – Now maybe, but in 15-20 years?
  • Coding provides a competitive advantage – Being able to think outside the box is an advantage. Coding is useless if you’re not solving a problem.
  • Coding knowledge allows students to better understand the world – Complete c**p, getting out into the world helps us understand the world better.
  • Coding is fun and satisfying – To some maybe, not to all. Many students find it completely boring, because they are taught by boring people using boring materials.
  • Coding improves creativity – Ahhh, no. Doing creative things helps improve creativity. The Fibonacci sequence is way more interesting in real life, e.g. plants, than it is on a computer.
  • Coding improves problem solving – Coding implements solutions to problems, so problem solving helps coding. Coding is just a tool.
  • Coding improves collaboration – So does play.
  • Coding improves communication – As above… in fact technology seems to do quite the opposite.

Companies pushing this idea of coding for kids are running businesses. These companies say that it’s fun because “If you know how to code you can build apps, video games and websites!” Hogwash. There have been very few studies since the 1980s on how programming could be implemented in elementary schools (and that’s because research funding bodies don’t generally fund research into education, which is kind of ironic).

Coding is a means to an end. It is a function of using a programming language, which is just a tool, to implement a series of instructions – logical instructions. So the problem solving skills attained for programming are constrained by the logical nature of the 1s and 0s. Emphasizing these skills comes at the expense of real-world skills that don’t fit neatly into a binary sphere. If the pandemic has taught us anything, it is that skills like making bread are just as important as reading and writing (and shockingly you can use it to teach math, chemistry, and problem solving!).

Done properly, with the right resources, and piloted in a few schools, this could be a fantastic learning opportunity for students. Done poorly, with a lack of coherent strategy and poorly trained teachers, the result will do more harm than good.

The value of a humanities education

In Australia there is a move afoot to change the university fee structure, and it makes you question how far out to lunch they are. Universities in Australia are federally “controlled”, for want of a better term, meaning that fee structures are mandated by the government. They recently decided to change the fee structure to benefit subjects deemed essential to recovering from the pandemic, like STEM and healthcare. Humanities (liberal arts) subjects bear the brunt of this transformation, with student contributions (setting aside the Government subsidies) increasing by 113%. It seems as though they don’t consider humanities subjects as contributing as much to society as STEM subjects do.

Here’s the thing: you can’t base an entire society on STEM-type subjects; it makes for a very one-sided view of the world. To the layman there may be no intrinsic value in a person with a history degree, but that’s because the people making such statements know very little about the world. History plays an incredible role in most fields, including STEM. Does it matter how people lived in pre-Roman Britain? We can always learn something from looking into the past; maybe it would provide us with an insight into more sustainable practices. If anything, it provides a balance to the somewhat rigid perspective of STEM.

In addition, people taking humanities degrees have a vastly different perspective of the world, partially because they learn different skills. How many essays do STEM students typically write? And by this I mean pieces of work which undertake a critical analysis of something, not some “academic” paper. How often is a science student allowed to be creative? How often does a computer scientist critically analyze how a particular programming language evolved? (They did once, but those CS pioneers who knew how to write are long gone.) STEM has one view of the world, and it is not one that wavers, or is open to new and varying opinions. The humanities are central to our society; diminish them, and you diminish society as a whole.

Not everyone wants to be a scientist, or a mathematician, or a computer scientist. Some people want to explore human culture, and history. We should not value higher education solely on what it can financially contribute to society. A society based solely on money is a fickle one indeed. Some people seem to believe that the future is only paved in STEM, but there is great risk in not providing a balanced education. Having done my undergrad science degree at an Australian university, it was not until I sat in a history course during my honours year that I realized I had probably done the wrong degree. There were few, if any, opportunities for science students to take humanities subjects, and I believe it is still like that in many institutions.

History helps bind us to our past, and the mistakes we have made along the way. In an ever-changing world, it is one of the few things we can learn from, so as not to make the same mistakes again (and let’s face it, many of our gravest mistakes have stemmed from too much reliance on technology in one form or another). Where would our world be without art? To throw the humanities on the “not useful” pile is reckless and short-sighted.

Young people should make their own decisions about what they want to do with their lives. It should not be the teacher’s choice, nor the parents’, nor some out-of-touch politician’s. Making a choice to do a degree based solely on its perceived financial merits is also not the best approach. Money isn’t everything. A person should do a degree that speaks to them, something they are passionate about.

Recursion – The Towers of Hanoi (iii)

Solving the ToH for one disc, or even two discs, is trivial (moving from tower A to C, using B as the intermediary). For two discs we move the smaller top disc to B, the larger bottom disc to C, and then the smaller disc on B to C. This takes three moves. Moving three discs takes a bit more effort. However we already know how to move 2 discs, so the problem can be restated in terms of an existing solution. Move the top two discs in the tower from A to B. Then move the third disc from A to C, and finally move the two-disc stack from B to C. What about if we had four discs… well we know how to move 3 discs, so… well, you get the idea. Here is a visual depiction of how 3 discs can be moved.

Fig. 1: The seven moves required to solve the 3-disc ToH

We can then solve the Towers of Hanoi problem for a stack of discs of height n, by trying to solve it for a stack of height n−1. To solve for a stack of height n−1, we try to solve for a stack of height n−2, etc. Now this can be easily solved using a recursive algorithm, because each smaller stack is a sub-problem of the original. To move n discs from tower A to tower C, using tower B as the intermediary, a rudimentary algorithm would look like this:

  1. Move n−1 discs from A to B.
  2. Move one disc from A to C.
  3. Move n−1 discs from B to C.

From this a more definitive recursive algorithm can be formulated, again moving a tower from A to C using B as the intermediary:

PROCEDURE moveDiscs(N, A, C, B)
   IF N > 0 THEN
      moveDiscs(N-1, A, B, C)
      move the top disc from tower A to tower C
      moveDiscs(N-1, B, C, A)

Here is a basic C function, called moveDiscs(), to perform the ToH recursively (towers A, B, and C have been replaced with the more descriptive from, to, and using).

void moveDiscs(int N, int from, int to, int using) {
   if (N > 0) {
      moveDiscs(N-1, from, using, to);
      printf("move %d -> %d\n", from, to);
      moveDiscs(N-1, using, to, from);
   }
}

One way of deciphering this is to lay out each level of the recursion as it proceeds, basically tracing the recursive process. If the function above is called as moveDiscs(3,1,3,2), it would move 3 discs from tower 1 (A), to tower 3 (C), using tower 2 (B) as the intermediary.

The various “L” labels specify the level of recursion, the numbers ❶→❼ coincide with the example diagram in Fig.1. The STOP sign shows where the function is called recursively, but terminates, signifying that it cannot recurse deeper. In all, although there are only 7 moves (shown in blue) for 3 discs, there are actually 15 recursive calls, albeit 8 are terminations.

Don’t mention the “search”

I like Instagram, although this pandemic is somewhat eating into my ability to post shots of coffees. The problem is the searching. You can search for hashtags, and account names, and geotagged locations, but you can’t just search for generic things, like terms that would occur in a description, or even your own photo stream. So if you tagged a photograph with #gruyerecheese, likely you would be able to find it. If you didn’t, then bad luck – you’re going to have to physically view all your posts (best done on a laptop). Instagram does not offer a native search capability for old content. Annoying? That isn’t the half of it.

For people with hundreds of posts, it would be nice to have some sort of feature that lets you natively search past posts. It just makes sense, although maybe most people don’t care? In the age of searching, not having this sort of facility just limits how quickly I can find something. It’s almost as bad as the photo archive on my iPhone, which at least sorts by place and time. Yes, I get that tagging everything would be useful, but that isn’t always convenient at the time something is being posted. The whole purpose of writing a blurb next to a photo is to provide contextual information. It’s almost as bad as websites with no search capability.

I would almost call it careless, but then Instagram doesn’t have an iPad or Mac app either, which makes you think there is a more sinister reason for not having native search. Or maybe they don’t want to add any more features; I mean, apart from the filter/photo editing side of things, and the data storage, there isn’t much evolving in the software. You have to ask yourself how many tech companies *really* care about the needs of their users?

Recursion – The Towers of Hanoi (ii)

It is in the instruction sheet for the original game that one finds indications of a recursive solution, stated for an arbitrary number of discs, namely:

“The Game easily teaches itself, in solving first the problem for 3, 4, and 5 disks.”

“The Game is always possible and requires double the time each time that an additional disk is placed on the tower. Anyone who knows how to solve the problem for eight disks, for example, moving the tower from peg number 1 to peg number 2, also knows how to solve it for nine disks. One first moves the eight top disks to peg number 3, then moves the ninth disk to peg number 2, and then brings the eight disks from peg number 3 to peg number 2. Now, in augmenting the tower by one disk, the number of moves is doubled, plus one, from the preceding game.”


It is hard to trace the original appearance of the Towers of Hanoi (ToH) solved algorithmically. It had long been used in mathematics to describe the process of induction. In a paper published in 1963, Harold McIntosh [1] discussed an experiment in teaching the use of large computers, in the guise of an IBM 7090. Using Lisp (arguably the first language to implement recursion), he suggested using the ToH and the Chinese Ring Puzzle, both of which are “essentially recursive in nature”, making them good problems for the use of Lisp. The ToH was also described in 1964 by Aiko Hormann [2] for use in Gaku, an early learning machine.

The original puzzle had ten discs, but the programming problem usually only appears with three discs – it is therefore a subset of the original problem. Why? Likely in order to make it simple to describe, and not waste computing resources at a time when they were very limited.

The basic computer science problem consists of three wooden pegs, and n discs, where n is classically equal to 3. The left outer peg contains the n discs of varying size which decrease in diameter from bottom to top. The goal of the Towers of Hanoi problem is to move all the discs from the leftmost peg to the rightmost peg, following two rules, which are the same as in the original game. The first rule states that only one disc can be moved at a time. The second rule states a larger disc cannot be placed upon a smaller disc.

At some point it became popular to use it in programming courses, especially after the addition of recursion to Algol 60, the first procedural language to have this ability. The problem can be solved in a non-recursive manner, but it is inherently inelegant. The recursive solution is simple, and elegant, even if somewhat bemusing to the novice recurser. Here is a basic version of the Towers of Hanoi in Algol 68 (written for Algol 68 Genie):

PROC movedisk = (INT f, t) VOID:
   print(("move ", f, " --> ", t, newline));

PROC towersofhanoi = (INT n, f, t, u) VOID:
   IF n > 0 THEN
      towersofhanoi(n-1, f, u, t);
      movedisk(f, t);
      towersofhanoi(n-1, u, t, f)
   FI;

towersofhanoi(3, 1, 3, 2)

A SNOBOL4 version appeared in a text on programming [3] in the late 1960s.

SNOBOL4 code for the Towers of Hanoi

As you can see it is super simple, and that inherently is one of the problems for understanding it.

  1. McIntosh, H.V., “An Experiment in Teaching the Use of Large Electronic Computers”, The American Mathematical Monthly, 70(2), pp.207-209 (1963) 
  2. Hormann, A., “How a computer system can learn”, IEEE Spectrum, 1(7), pp.110-119 (1964)
  3. Griswold, R.E., Poage, J.F., Polonsky, I.P., The SNOBOL4 Programming Language (2nd ed.), Prentice-Hall (1971).

The internet: Drinking from a hydrant

Getting information off the Internet is like taking a drink from a fire hydrant.

Mitch Kapor (application developer)

The Internet contains a *lot* of information. The problem is that only a percentage of that information is useful, and it often takes quite a long time to filter out the stuff that isn’t. Often to find real information you are forced to look beyond the internet. Recently I was working on a historical backstory to the Towers of Hanoi puzzle, for a book I am attempting to finish on recursion. There is a good amount of information out there on topics such as this, but it takes a good amount of work to collate it. Much of it was in French, so it required some translation, which is easy with the Google Translate app. Thankfully there has been a good effort to digitize old journals, even from the late 19th century. I got somewhat lucky here because there is a cohesive list of all the works from the puzzle’s creator, Édouard Lucas, and a *bunch* of digitized works dating way back at the Bibliothèque nationale de France. Not all such things are easy; for some you need to dig a little deeper.

I am also working on a series of articles relating to radioactivity in vintage lenses. There are two really good articles available from a 1987 issue of The British Journal of Photography. The website of the journal has no access to backdated issues, and so I was stuck for a while, until I ordered them through the interlibrary loan service (RACER) at work. They arrived the next day through email as PDFs. But a person not working at a university may have a hard time finding this information. I had a similar experience a few years back when I tried to find the original article on bokeh, published in Photo Techniques Magazine in 1997. Thankfully Toronto has an *awesome* library system and I was able to obtain the articles from the archives at the Reference Library (you can find some issues from 2007 onwards on the Wayback Machine).

The thing is that finding information on the Internet can be challenging. Some exists in repositories that require special access, and therefore don’t appear as a result in a search query. Others appear in other digital forms, and some don’t exist as digital entities at all. There is no doubt that without the Internet, research would be harder. I’m also currently writing an article deciphering a Mesopotamian recipe for bread, so the availability of resources like Sumerian dictionaries is *awesome*. On the flip-side, trying to trace the background of some woodworking tools is nigh on impossible.

The biggest problem is the amount of chaff on the Internet, i.e. worthless things. Try and find a cohesive definition of something and you might be faced with 101 differing approaches to explaining it. It makes it hard to actually decipher what is going on. A good example is finding a layman’s explanation of the Retinex algorithm for images, which provides enhancement of low-light images, amongst other things. Try and find an article which explains it without math, or without adding too much complexity. If a simple explanation exists, I have yet to find it (maybe because it doesn’t exist).

Teachers aren’t always good learners

There is a lot to dislike about the traditional way of teaching students in a university setting. In my honest opinion, we should be pivoting away from traditional lecture settings, towards learning which is much more organic. There are of course many differing reasons why that idea can be challenging. Foremost is class size. Large classes are inherently more economical – classes of just 20 students don’t work in the global schema that is the university model. As a student, I never really enjoyed many of my classes. They were too large, and I wasn’t really interested in the style of teaching. I taught myself programming by writing programs in Pascal, not by attending lectures (the instructor was somewhat aloof, off on his own planet, and no one could decipher his scratchings on the blackboard). I wasn’t a perfect programmer, but then who really cares? You don’t become an expert on anything after taking one class.

The only class I ever had interest in was a final year class on numerical mathematics using Fortran. The instructor was Gary Bunting, and he was likely the best prof I ever had as an undergrad. The class had three people in it as I recall, and Gary was just down to earth. The computing profs were all aloof; it probably didn’t help that they seemed more theoretical than practical in their knowledge. Some spent more time playing Moria than actually teaching people… but I digress. What made that class fun was the fact that it was taught in a way that made the concepts seem real. It was about working through problems, and finding tangible solutions. The rest of academia seemed boring in comparison.

The worst class I ever took in CS was “Unix and C”. It was taught by an individual who knew a lot about graph theory, but zip about the content of this class. Unix was trivial; it was C that was the problem. The instructor didn’t know how to code in C, so we basically watched B&W videos. It put me off video instruction for decades. At some point though, likely in my mid-20s, I realized that I really didn’t like learning… in the traditional sense anyway. I have always taught myself most things. Baking, carpentry, plumbing, photography, even programming… all things I learned by reading books, watching YouTube videos, and practice. It may seem odd from someone who teaches for a living, but I don’t actually enjoy being at the other side of the process. Even now, the only courses I take are workshops at Lee Valley, and only if I know the instructor.

Don’t get me wrong, I’m a life-long learner – just not in the traditional sense. It’s partially because I’m not a theoretical person – theory is fine, but practical is way better. Probably the reason why I enjoyed woodworking in high school. CS was the only reason I stayed around in academia, because I can build things (although given my interest in history I should have majored in it).