The user interfaces of Star Trek – vocal

One of the more interesting aspects of the computer systems on the Enterprise is the human-computer interface. Computer stations are equipped with audio I/O and accept a seemingly unlimited vocabulary in unrestricted English. Here’s an example:

Computer. Digest log recordings for past five solar minutes. Correlate hypotheses. Compare with life forms register. Question: Could such an entity within discussed limits exist in this galaxy? (Episode: Wolf in the Fold)

With current technology there is no way to achieve that level of understanding of English, or of any language for that matter. The request also implies quite a high level of intelligence on the part of the computer itself.

What about the whole speech thing?
So the Enterprise relies heavily on speech recognition and semantic comprehension of a natural language. Speech recognition takes phonemes (speech sounds) and tries to assemble them into words. In Star Trek, recognition of spoken words has been completely solved. In 1977, the state of the art was roughly 1000 words recognized for a single speaker. Is it any better today? Today we have Siri, maybe the forefront of speech I/O. Microsoft apparently has a word error rate (WER) of only 6.3%, slightly lower than IBM’s Watson team at 6.9%. In 1995, the WER was 43% (IBM). Speech recognition has always been challenging because every person’s speech is so different, but great strides are being made.
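
For anyone wondering what those percentages measure: word error rate is usually defined as the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the correct transcript, divided by the number of words in the correct transcript. A minimal sketch in Python, assuming simple whitespace tokenization (the function name and the example sentences are made up for illustration):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of five gives a WER of 20%.
print(wer("compare with life forms register",
          "compare with life form register"))  # 0.2
```

By that measure, a WER of 6.3% means roughly six errors for every hundred words spoken.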

Aside from this, semantic comprehension, or understanding, is a completely different ballgame. What progress has there been on the design of algorithms to analyze statements?

    before i wrote fig, i wanted to experiment with a language that was designed to be spoken, like in star trek (actually star trek was the inspiration.) the language was called “nudity” because it was bare of syntax that would need to be pronounced. its logo is still used for fig.

    obviously the computers in star trek have a very sophisticated ai, and im not an ai researcher. i would be into ai, except when i was 16 i decided that i was afraid of ai research moving too quickly, and didnt want to contribute even if i could. to see how its being used, i think im happy with that decision– not that all ai is bad or even worrisome.

    i tend to think of the speech that goes into the star trek computers as being mostly like google search queries– and a lot of processing goes into google search queries! but your example clearly defines instructions, still, they are just function calls.

    i mean if youd taken a very non-ai language like nudity and built enough libraries, throw a language processor on top and have processes (libraries) to be called and compared to that language processing (python nlp for example) youd just have to keep getting increasingly sophisticated over the years.

    i think you could have a more narrow language and vocabulary for the purpose of coding in a subset of english, but it would take years and years (and years) to build. of course, this sounds and could be naive– its still what i think about when i try to imagine how programs were created by officers in star trek. was it more like voice-controlled emacs, or google queries, or both?
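
The "they are just function calls" idea from the comment can be sketched in a few lines of Python. This is only a toy under heavy assumptions: the speech has already been transcribed to text, every sentence starts with a known verb, and the handler functions (all hypothetical names, standing in for the "libraries") do nothing but describe what they would do. None of the hard part, actual understanding, is attempted here.

```python
import re

# Hypothetical handlers: placeholders for the libraries a real system would call.
def digest(arg):
    return f"digesting {arg}"

def correlate(arg):
    return f"correlating {arg}"

def compare(arg):
    return f"comparing {arg}"

def question(arg):
    return f"answering: {arg}?"

HANDLERS = {"digest": digest, "correlate": correlate,
            "compare": compare, "question": question}
WAKE_WORD = "computer"


def parse(utterance):
    """Split a transcribed request into (verb, argument) pairs."""
    calls = []
    # A sentence ends at '.', '?' or '!' followed by whitespace.
    for sentence in re.split(r"(?<=[.?!])\s+", utterance.strip()):
        words = sentence.rstrip(".?!").split(maxsplit=1)
        if not words:
            continue
        verb = words[0].rstrip(":").lower()
        if verb == WAKE_WORD:
            continue  # "Computer." only gets the machine's attention
        if verb not in HANDLERS:
            raise ValueError(f"unknown command: {verb}")
        calls.append((verb, words[1] if len(words) > 1 else ""))
    return calls


def run(utterance):
    """Dispatch each parsed command to its handler, in order."""
    return [HANDLERS[verb](arg) for verb, arg in parse(utterance)]


request = ("Computer. Digest log recordings for past five solar minutes. "
           "Correlate hypotheses. Compare with life forms register. "
           "Question: Could such an entity within discussed limits "
           "exist in this galaxy?")
for line in run(request):
    print(line)
```

Even this toy handles the Wolf in the Fold request, but only because the request happens to be phrased as verb-first commands. Anything phrased less rigidly would need the kind of language processing the comment alludes to.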

