Autocorrect Was Invented in the U.S. for a Secret Chinese Computer

Every time autocomplete, or its aggravating variant autocorrect, declares war on you, now you’ll know who to thank. Its inventor’s name was Samuel Hawks Caldwell, and it shall live in infamy. Thanks a lot, pal. Caldwell didn’t set out to annoy generations of computer users. His goal was something else completely, and it’s an odd story.

Back in the late 1950s, China was not the technological power it is today. In those early days of computing it was viewed as backward, and possibly doomed to remain that way because of the difficulty of making its thousands of written characters understandable to computers.

Moved to action by altruistic ambitions, electrical engineer Caldwell began work on a machine called the “Sinotype.” He viewed it as a gift to the Chinese people. “Many will wonder why this work was ever done or why our military establishment devoted substantial funds and attention to the project,” he later wrote. “The answer to this question seems simple and clear. In selling the idea to the military authorities, the writer had only one real argument … to the effect that a machine for composing Chinese would improve communication among men, and that no improvement of communication ever harmed the cause of peace among men.” His backers were not quite so high-minded: the U.S. Army and Air Force and the Carnegie Foundation saw the Sinotype as a way to disseminate anti-Communist propaganda in Chinese on a scale never before imaginable.

You would think expertise in the Chinese language would be a prerequisite to developing a Chinese computer, but Caldwell didn’t know the language. After all, what he taught at MIT was electrical engineering. So he talked with native Chinese students attending the school and learned that they’d been taught to form characters using a specific sequence of brush strokes, much the same way that American children are taught to write “t” by drawing a vertical line and then topping it with a horizontal one. To Caldwell, this meant that “Chinese has a ‘spelling.’”

Caldwell enlisted a professor of Far Eastern Languages at Harvard, Lien-Sheng Yang, to study the structure of Chinese characters, stroke by stroke. It turned out that there were just a handful of common sequences that eventually diverged into specific characters, and that computer analysis could identify the intended character after only five or six strokes, regardless of its final complexity. You see where this is going, right? Autocorrect.
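
For the curious, here is a rough sketch of that idea in Python. It is not Caldwell's actual method or the Sinotype's real coding, just an illustration under stated assumptions: the one-letter stroke codes (h, v, l, r), the six-character lexicon, and the function names are all invented for this example. It treats each character as a "spelling" of strokes and reports how many strokes must be entered before only one candidate is left.

```python
from typing import Optional

# Hypothetical stroke codes, invented for this sketch:
# h = horizontal, v = vertical, l = left-falling, r = right-falling.
# The tiny lexicon is illustrative only, not the Sinotype's vocabulary.
LEXICON = {
    "十": "hv",
    "木": "hvlr",
    "大": "hlr",
    "天": "hhlr",
    "王": "hhvh",
    "人": "lr",
}


def candidates(prefix: str, lexicon: dict = LEXICON) -> list:
    """Return every character whose stroke sequence starts with `prefix`."""
    return sorted(ch for ch, seq in lexicon.items() if seq.startswith(prefix))


def strokes_needed(char: str, lexicon: dict = LEXICON) -> Optional[int]:
    """Smallest number of strokes after which `char` is the only candidate.

    Returns None when the character never becomes unique, which happens
    when its whole sequence is also the start of a longer character.
    """
    seq = lexicon[char]
    for k in range(1, len(seq) + 1):
        if candidates(seq[:k], lexicon) == [char]:
            return k
    return None


if __name__ == "__main__":
    print(candidates("hh"))
    for ch in LEXICON:
        print(ch, strokes_needed(ch))
```

In this toy lexicon, two horizontal strokes still leave two candidates, and a third stroke settles it. A real system would need far more stroke types and some rule for characters that are entirely contained in the start of longer ones, which this sketch simply reports as None.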

[Image: “Autocorrect” in Traditional Chinese.]

To Caldwell, this was the breakthrough that would allow him to create his dreamed-of Chinese computer. (To you and me, it’s something else.) He and Yang derived a vocabulary of 22 strokes from which roughly 2,000 Chinese words could be constructed. Each stroke was assigned a key on the Sinotype keyboard. (22 was the number of keys on a Western keyboard.)

The Sinotype was nearly announced to the world in the summer of 1959, with the U.S. fearing that the Chinese would get there first. Unsure that the machine would work as advertised, and not wanting to risk international embarrassment, the Eisenhower administration hesitated, and the moment passed.

Caldwell died in 1960, and his computer was renamed a few times, as the “Chi-coder,” the “Ideographic Encoder,” and the “Sinotype II,” which moved away from a stroke-based keyboard to the popular Pinyin layout.

And what remains of Caldwell’s work? Well, you know it, you probably hate it, and there we have it: autocomplete and its annoying cousin autocorrect.