DNA Is Multibillion-Year-Old Software

Nature invented software billions of years before we did. “The origin of life is really the origin of software,” says Gregory Chaitin (inventor of mathematical metabiology). Life requires what software does. It is fundamentally algorithmic. And its complexity needs better thinking tools.


1. “DNA is multibillion-year-old software,” says Chaitin (inventor of mathematical metabiology). We’re surrounded by software, but couldn’t see it until we had suitable thinking tools.

2. Alan Turing described modern software in 1936, inspiring John von Neumann to connect software to biology. Before DNA was understood, von Neumann saw that self-reproducing automata needed software. We now know DNA stores information; it’s a biochemical version of Turing’s software tape. But more generally: all that lives must process information. Biology’s basic building blocks are processes that make decisions.

3. Casting life as software provides many technomorphic insights (and mis-analogies), but let’s consider just its informational complexity. Do life’s patterns fit the tools of simpler sciences, like physics? How useful are experiments? Algebra? Statistics?

4. The logic of life is more complex than the inanimate sciences need. The deep structure of life’s interactions is algorithmic (loosely, algorithms = logic with if-then-else controls). Can physics-friendly algebra capture life’s biochemical computations?

5. Describing the “pernicious influence” of mathematics on science, Jack Schwartz says it succeeds in only “the simplest of situations” or when “rare good fortune makes [a] complex situation hinge upon a few dominant simple factors.”

6. Physics has low “causal density” — a great Jim Manzi coinage. Nothing in physics chooses. Or changes how it chooses. A few simple factors dominate, operating on properties that generally combine in simple ways. Its parameters are independent. Its algebra-friendly patterns generalize well (its equations suit stable categories and equilibrium states).

7. Higher-causal-density domains mean harder experiments (many hard-to-control factors that often can’t be varied independently). Fields like medicine can partly counter their complexity by randomized trials, but reliable generalization requires biological “uniformity of response.”

8. Social sciences have even higher causal densities, so “generalizing from even properly randomized experiments” is “hazardous,” Manzi says. “Omitted variable bias” in human systems is “massive.” Randomization does not guarantee that results are representative.

9. Complexity economist Brian Arthur says science’s pattern-grasping toolbox is becoming “more algorithmic … and less equation-based.” But the nascent algorithmic era hasn’t had its Newton yet.

10. With studies in high-causal-density fields, always consider how representative the data is, and ponder whether uniform or stable responses are plausible. Human systems are often highly variable; our behaviors aren’t homogeneous; they can change types; they’re often not in equilibrium.

11. Bad examples: Malcolm Gladwell puts entertainment first (again) by asserting that “the easiest way to raise people’s scores” is to make a test less readable (n = 40 study, later debunked). Also succumbing to unwarranted extrapolation, leading data-explainer Ezra Klein said, “Cutting-edge research shows that the more information partisans get, the deeper their disagreements.” That study neither represents all kinds of information, nor is a uniform response likely (in fact, assuming that would be ridiculous). Such rash generalizations have a far from spotless record.
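
To make points 7, 8, and 10 concrete, here is a minimal Python sketch (not taken from Manzi or any study mentioned above) of how a properly randomized trial can still mislead when responses aren’t uniform. Everything in it is hypothetical: the subgroup names, the effect sizes, and the population mixes are made-up numbers chosen only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subgroup:
    """A slice of a population that responds to a treatment in its own way."""
    name: str
    share: float   # fraction of the population in this subgroup (hypothetical)
    effect: float  # average treatment effect within this subgroup (hypothetical)


def average_effect(groups):
    """Share-weighted average treatment effect across subgroups."""
    total_share = sum(g.share for g in groups)
    return sum(g.share * g.effect for g in groups) / total_share


# Who happened to be enrolled in the trial (assumed mix).
trial_sample = [
    Subgroup("responders", share=0.8, effect=2.0),
    Subgroup("non_responders", share=0.2, effect=-1.0),
]

# The wider population we would like to generalize to (assumed mix).
wider_population = [
    Subgroup("responders", share=0.3, effect=2.0),
    Subgroup("non_responders", share=0.7, effect=-1.0),
]

trial_effect = average_effect(trial_sample)
wider_effect = average_effect(wider_population)

print(f"Average effect in the trial sample:     {trial_effect:+.1f}")
print(f"Average effect in the wider population: {wider_effect:+.1f}")
```

With these invented numbers the trial reports a clearly positive average effect, while the average across the wider population comes out slightly negative. Randomization protects the comparison inside the trial; it does nothing to ensure the trial’s mix of responders matches anyone else’s. If the response really were uniform (the same effect in every subgroup), the two averages would agree no matter how the mixes differed.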

Mismatched causal density and thinking tools create errors. Entire fields are built on assuming such (mismatched) metaphors and methods.

Related: olicausal sciences; Newton pattern vs. Darwin pattern; the two kinds of data (history ≠ nomothetic); life = game theoretic = fundamentally algorithmic.

(Hat tip to Bryan Atkins @postgenetic for pointer to Brian Arthur).

Further reading: Microsoft Plans to Have a DNA-Based Computer by 2020

More: Human DNA Could Store All the World’s Data

Illustration by Julia Suits, The New Yorker Cartoonist & author of The Extraordinary Catalog of Peculiar Inventions.

