November 14, 2005

Google as a Post-Von Neumann Architecture

Highly recommended:

Edge: TURING'S CATHEDRAL by George Dyson: Exactly sixty years ago, at the Institute for Advanced Study in Princeton, New Jersey, mathematician John von Neumann began seeking funding to build a machine that would do this at electronic speeds. "I am sure that the projected device, or rather the species of devices of which it is to be the first representative, is so radically new that many of its uses will become clear only after it has been put into operation."... "Uses which are likely to be the most important are by definition those which we do not recognize at present because they are farthest removed from our present sphere."... When the machine finally became operational in 1951, it had 5 kilobytes of random-access memory: a 32 x 32 x 40 matrix of binary digits, stored as a flickering pattern of electrical charge, shifting from millisecond to millisecond on the surface of 40 cathode-ray tubes.

The codes that inoculated this empty universe were based upon the architectural principle that a pair of 5-bit coordinates could uniquely identify a memory location containing a string of 40 bits. These 40 bits could include not only data (numbers that mean things) but executable instructions (numbers that do things), including instructions to transfer control to another location and do something else. By breaking the distinction between numbers that mean things and numbers that do things, von Neumann unleashed the power of the stored-program computer, and our universe would never be the same.... From an initial nucleus of 4 x 10^4 bits changing state at kilocycle speed, von Neumann's archetype has proliferated to individual matrices of more than 10^9 bits, running at speeds of more than 10^9 cycles per second, interconnected by an extended address matrix encompassing up to 10^9 remote hosts....
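To make the addressing scheme concrete, here is a minimal Python sketch of a 32 x 32 matrix of 40-bit words in which the same words serve as data and as instructions. The coordinate packing follows the description above; the instruction encoding (opcode in the top 8 bits, operand address in the low 10 bits) and the four opcodes are hypothetical, chosen only for illustration, and are not the actual instruction format of the Institute machine.

```python
ROWS = COLS = 32      # each coordinate is 5 bits: 2**5 == 32
WORD_BITS = 40        # every location holds a string of 40 bits

memory = [0] * (ROWS * COLS)          # 1024 locations, all initially zero
total_bits = ROWS * COLS * WORD_BITS  # 40960 bits
total_bytes = total_bits // 8         # 5120 bytes, i.e. 5 kilobytes


def address(row: int, col: int) -> int:
    """Pack a pair of 5-bit coordinates into a single location index."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("each coordinate must fit in 5 bits")
    return (row << 5) | col


# Hypothetical opcodes (assumption, not the historical instruction set).
HALT, LOAD, ADD, JUMP = 0, 1, 2, 3


def encode(opcode: int, operand: int = 0) -> int:
    """Build a 40-bit instruction word: opcode on top, address at the bottom."""
    return (opcode << 32) | operand


def run(start: int) -> int:
    """Fetch and execute words from memory until HALT; return the accumulator."""
    pc, acc = start, 0
    while True:
        word = memory[pc]
        opcode, operand = word >> 32, word & 0x3FF
        if opcode == HALT:
            return acc
        if opcode == LOAD:
            acc = memory[operand]
            pc += 1
        elif opcode == ADD:
            acc = (acc + memory[operand]) % (1 << WORD_BITS)
            pc += 1
        elif opcode == JUMP:
            pc = operand          # transfer control to another location
        else:
            raise ValueError(f"unknown opcode {opcode}")


# Numbers that mean things...
memory[address(31, 0)] = 2
memory[address(31, 1)] = 40
# ...and numbers that do things, stored in the same matrix.
memory[0] = encode(LOAD, address(31, 0))
memory[1] = encode(ADD, address(31, 1))
memory[2] = encode(JUMP, 10)
memory[10] = encode(HALT)

result = run(0)
```

Nothing in the memory itself marks a word as data or as code; only the path taken by the program counter decides which words get executed.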

In the early 1950s, when mean time between memory failures was measured in minutes, no one imagined that a system depending on every bit being in exactly the right place at exactly the right time could be scaled up by a factor of 10^13 in size, and down by a factor of 10^6 in time.... Fifty years later, thanks to solid-state microelectronics, the von Neumann matrix is going strong. The problem has shifted from how to achieve reliable results using sloppy hardware, to how to achieve reliable results using sloppy code. The von Neumann architecture is here to stay....
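One plausible reading of those scale factors, using only the figures quoted above (4 x 10^4 bits at kilocycle speed then; 10^9 bits per machine, 10^9 cycles per second, and up to 10^9 hosts now), can be checked with a few lines of Python. Treating "size" as bits per machine times number of hosts is an assumption about how the figures combine, not something the essay spells out.

```python
# Figures quoted in the excerpt (orders of magnitude only).
initial_bits = 4 * 10**4      # the original 32 x 32 x 40 matrix, rounded
initial_hz = 10**3            # "kilocycle speed"

bits_per_machine = 10**9
hosts = 10**9
current_hz = 10**9

# Assumption: total size = bits per machine x number of interconnected hosts.
size_factor = (bits_per_machine * hosts) // initial_bits
time_factor = current_hz // initial_hz

print(f"size factor: {size_factor:.1e}")   # on the order of 10^13
print(f"time factor: {time_factor:.1e}")   # 10^6
```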

As organisms, we possess two outstanding repositories of information: the information conveyed by our genes, and the information stored in our brains.... [Von Neumann] considered the second example in his posthumously-published The Computer and the Brain: "The message-system used in the nervous system... is of an essentially statistical character," he explained. "In other words, what matters are not the precise positions of definite markers, digits, but the statistical characteristics of their occurrence... a radically different system of notation from the ones we are familiar with in ordinary arithmetics and mathematics... Clearly, other traits of the (statistical) message could also be used: indeed, the frequency referred to is a property of a single train of pulses whereas every one of the relevant nerves consists of a large number of fibers, each of which transmits numerous trains of pulses. It is, therefore, perfectly plausible that certain (statistical) relationships between such trains of pulses should also transmit information.... Whatever language the central nervous system is using, it is characterized by less logical and arithmetical depth than what we are normally used to [and] must structurally be essentially different from those languages to which our common experience refers."...

In a digital computer... everything depends not only on precise instructions, but on HERE, THERE, and WHEN being exactly defined. It is almost incomprehensible that programs amounting to millions of lines of code, written by teams of hundreds of people, are able to go out into the computational universe and function as well as they do given that one bit in the wrong place (or the wrong time) can bring the process to a halt.

Biology has taken a completely different approach. There is no von Neumann address matrix, just a molecular soup, and the instructions say simply "DO THIS with the next copy of THAT which comes along." The results are far more robust. There is no unforgiving central address authority, and no unforgiving central clock. This ability to take general, organized advantage of local, haphazard processes is exactly the ability that (so far) has distinguished information processing in living organisms from information processing by digital computers....
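A toy Python sketch of "DO THIS with the next copy of THAT which comes along" may help show why the result is robust. The soup, the "AUG" template, and the capping action are all made up for illustration and are not a model of real biochemistry; the point is only that the rule never names a location or a time, so the outcome does not depend on the order in which molecules happen to arrive.

```python
import random
from collections import Counter


def cap(molecule: str) -> str:
    """The 'DO THIS' step: mark a matching molecule (hypothetical action)."""
    return molecule + "*"


def react(soup, template, action, seed):
    """Apply `action` to every molecule containing `template`, in random order."""
    rng = random.Random(seed)
    arrivals = list(soup)
    rng.shuffle(arrivals)          # no addresses, no clock: haphazard arrival
    products = Counter()
    for molecule in arrivals:
        if template in molecule:   # 'THAT': anything that fits the template
            products[action(molecule)] += 1
        else:
            products[molecule] += 1
    return products


soup = ["GGAUGC", "CCCUUU", "AUGAAA", "UUUGGG", "GGAUGC"]

# Run the same rule under five different arrival orders.
outcomes = [react(soup, "AUG", cap, seed) for seed in range(5)]
all_same = all(outcome == outcomes[0] for outcome in outcomes)
```

Every run yields the same multiset of products, whereas a program that expected a particular molecule at a particular index would break as soon as the soup was shuffled.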

[T]emplate-based addressing did not catch on widely until Google (and brethren) came along. Google is building a new, content-addressable layer overlying the von Neumann matrix underneath.... We call this a "search engine".... However, once the digital universe is thoroughly mapped, and initialized by us searching for meaningful things and following meaningful paths, it will inevitably be colonized by codes that will start doing things with the results. Once a system of template-based addressing is in place, the door is opened to code that can interact directly with other code....
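The idea of a content-addressable layer on top of location-addressed storage can be sketched as a small inverted index in Python. This is a generic textbook technique under stated assumptions, not a description of how Google actually works; the addresses, the documents, and the `build_index` and `lookup` helpers are all hypothetical.

```python
from collections import defaultdict

# The von Neumann layer: content is reachable only if you know its address.
store = {
    0x10: "von neumann stored program computer",
    0x2A: "molecular soup template matching",
    0x3F: "stored program meets template matching",
}


def build_index(store):
    """Map each term to the set of addresses whose content contains it."""
    index = defaultdict(set)
    for addr, text in store.items():
        for term in text.split():
            index[term].add(addr)
    return index


def lookup(index, query):
    """Return the addresses whose content matches every term in the query."""
    terms = query.split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


index = build_index(store)

by_template = lookup(index, "template matching")   # ask by content...
by_address = store[0x2A]                           # ...or by exact location
```

A caller of `lookup` describes what it wants rather than where it lives, and the index translates that description back into addresses in the underlying matrix.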

My visit to Google? Despite the whimsical furniture and other toys, I felt I was entering a 14th-century cathedral — not in the 14th century but in the 12th century, while it was being built. Everyone was busy carving one stone here and another stone there, with some invisible architect getting everything to fit. The mood was playful, yet there was a palpable reverence in the air. "We are not scanning all those books to be read by people," explained one of my hosts after my talk. "We are scanning them to be read by an AI."...

Posted by DeLong at November 14, 2005 01:07 PM