The Tower of Electric Babel

Imagine what people in ancient Sumer must have thought about writing. It must have seemed astoundingly high tech.

Not only can you use it to talk to people in distant places, it also provides you with a memory sharper than the best storyteller's.

Times have changed somewhat since then, although surprisingly less than we might think.

Many of the basic patterns of life for any literate society were already in place in Sumer -- schools, laws, accounting, taxes, bureaucracies, and other furnishings of the modern world.

We're at last reaching the limits of the effectiveness of writing as the primary means for storing and distributing knowledge. What was good for Sumer may no longer be adequate for us.

We've made some refinements, of course: printing presses -- and very recently a whole office full of gadgets like magnetic disks.

At root, though, we're still recording symbols that stand for spoken words. It's hard to imagine what could possibly come after writing.

Even after overcoming the physical limitations imposed by paper, some towering obstacles to knowledge-processing and communication will still remain.

Those obstacles are inherent in the nature of writing. Actually, they're rooted in the nature of language itself.

The Babel Barrier

Language has certain characteristics that aren't apparent just from everyday use. To see what I mean, please perform a couple of quick "thought experiments."

First, imagine that you have a friend who has been deaf since birth. He's curious to know what it's like to listen to music. He asks you to describe as vividly as you can what you experience when you listen to your favorite Chopin Nocturne.

You give him a description that's so rich and detailed that he even manages to fool some of your other friends into thinking that he can actually hear it too. Although he can't truly know what it's like to hear music, language isn't subtle enough to reveal the discrepancy between his concept of the experience and the experience itself.

Language is like a net that's too coarse to catch the finer nuances of real human experience.

When pressed to admit it, we all know that language is inadequate for sharing personal experience. Nevertheless, think of how often you hear people say things like, "Why don't you understand what I'm telling you? I already told you very clearly how I feel about that!" Such statements betray a tacit belief that language ought to be able to bridge the experience gap between human beings.

For most workaday communication, language works well enough. That's why we're generally not aware of how "grainy" its resolution of experience really is.

Throughout history, the demands that civilization has put on language generally haven't exceeded its capacity.

The next thought experiment exposes a weakness in our "folk" view of how language works.

Imagine that humans at last make contact with an extraterrestrial civilization living in a nearby solar system. We know nothing about each other except what we can learn through sending messages.

We get to work on trying to understand each other's languages. We both have computers that pick out the patterns in unknown languages. Our computers can probably help us figure out the overall syntax and grammar -- the order of the symbols in the messages -- of each other's languages.

All we have to do is to assign meaning to the words we've delineated. This is where both of us run up against a very deep and stubborn barrier to communication. I like to call it the Babel Barrier.

Even if the aliens learned enough about the grammar of human languages to construct fairly correct sentences in English, German, Japanese, or Navajo, they still wouldn't have the faintest clue what those sentences meant.

The heart of our difficulty is that language in itself tells nothing about how the speaker experiences the world. Language relies on experiences held in common by speaker and hearer, experiences that serve as reference points.

Without those landmarks, language turns out to be useless. In other words, language only lets you talk about what everybody already knows.

This reveals a misconception that most people -- including many linguists -- have about language: that language "contains" information, and works like a pipeline to carry that information from one brain to another.

The fallacy lies in believing that language directly transmits information.

Rather than thinking of language as a pipeline or channel, think instead of the game of charades. The audience has the problem of trying to guess what the actor is thinking about.

Making random guesses would take far too long. There are too many possibilities. The actor tries to help the audience guess by reducing the number of possibilities. This happens in a couple of ways. The first is by process of elimination.

By accumulating the results of each successful guess, you can home in very quickly on the right answer, even starting from an astronomically large number of possibilities.
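To get a feel for how powerful elimination is, here's a minimal sketch (my illustration, not part of the original thought experiment) showing that each yes-or-no answer cuts the field roughly in half, so about twenty guesses are enough to single out one possibility from a million.

    # Each yes/no answer eliminates about half of the remaining possibilities,
    # so the number of guesses needed grows very slowly with the size of the field.
    def guesses_needed(possibilities):
        """Count how many halving guesses it takes to isolate a single item."""
        guesses = 0
        while possibilities > 1:
            possibilities = (possibilities + 1) // 2  # rule out roughly half
            guesses += 1
        return guesses

    print(guesses_needed(1000000))  # 20 guesses suffice for a million possibilities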

The second principle involves the use of common reference points between the actor's experience and the audience's. The actor makes a gesture or performs a pantomime to suggest something that she believes the audience has also experienced and knows about.

Communicating through language is a fantastically complex process where speaker and listener try to assess one another's experience of the world and grope for some common ground.

Unlike the way computers process language, humans don't just mechanically code and decode messages according to a set of rules. Language is a guessing game.

It's a game of trying out different concepts and scenarios, and seeking a "best fit" between what the speaker has said and what the listener knows would make the most sense on the basis of past experiences.

This is a process of natural selection, similar perhaps in some ways to the evolution of species, or to other biological systems such as the immune system.

Normally, this process runs its course so quickly and effortlessly that we're not aware that we actually guessed what the other person was thinking.

The role of context, both linguistic and experiential, is absolutely essential. When talking on the telephone, you have little trouble understanding what someone is saying until you have to pick out a word for which you have no background or context.

Noise and imperfections in the phone line may make it necessary to say "V as in Victor" or "D as in dog" when spelling out context-poor words.

Language is a remarkably creative process. Humans are ingenious at establishing context by making reference to common experience. The fact that computers are neither creative nor very knowledgeable about human experience is what makes it impossible for them to truly understand meaning in language.

Computers can only manipulate the symbols of language according to preset rules, and then only under very special circumstances.

Indexing and the limits of language

Jorge Luis Borges wrote a short story called "The Library of Babel" exploring the idea of a library that contains every book that could ever be written, whether meaningful or nonsensical. These books use every possible combination of letters, whether they make real words in any language or not.

This is an incredibly large library, of course! People are born in the library and spend their entire lives there, looking in vain for meaning in the books. Scattered randomly throughout the library are some truly excellent books.

Sometimes people find fragments of meaning in books, and sometimes people impose meaning on them, but the immense majority of the books are perfect nonsense.

It's harder to find a meaningful book among all the nonsense ones than it would be to write it from scratch in the first place.

The problem of indexing books in an immense library has some interesting quirks. Imagine a library that contains a copy of every book from every library in the world.

Not only does it have books, it also has magazines, newspapers, technical journals, music, and film. It has at least one copy of everything published in every medium.

This is the modern descendant of the fabled Library of Alexandria. It's connected to the global computer Utility, so anyone may conveniently browse through it using an electronic TV-book.

Clearly, the challenge will now be to locate what you need in this bewilderingly large collection.

The librarians who oversee this Library want to engineer a magnificent indexing system that'll make it possible to instantly locate every book making reference to any given subject.

An index that just located material by title, by subject, or by keyword descriptions wouldn't be effective.

The Library is so vast -- and growing quickly all the time -- that a reader might not even know which subject or keyword to ask for.

An indexing system that prompted the reader for a particular subject or keyword would be "begging the question." The reader might not have any idea that a certain interesting subject even existed.

A piece of knowledge may belong to more than one of the categories in the index, or it might not actually fit in any of the categories at all. Most knowledge gets caught between subject categories. Indexing books by subject can never be precise enough.

Even worse, many ideas "reside" in more than one book. Therefore, simply indexing books wouldn't be good enough. Nor would indexing chapters, because the correspondence between chapters and single ideas wouldn't be one-to-one either.

The marvelous indexing system that the librarians want to create would have to have an entry for every distinctly identifiable idea. It would then have to cross-index all the materials in the Library to include each instance of each idea.

Obviously, such an index would itself be too large to be useful, so there'd have to be indices for the index, and indices for the indices to the index, and on and on indefinitely.

Another short story by Borges comes to mind, in which mapmakers in a mythical country seek to make the perfect map. In order to show every detail, the map ends up being the same size as the territory it represents!

This problem has the flavor of fractal geometry. Benoit Mandelbrot, one of the inventors of fractals, studied the mathematics of cartography.

He observed that trying to determine the exact length of a coastline raises a perplexing dilemma. The more detail you include in your map of the coastline, the longer the total length of the coast becomes, because you're including more of the zig-zags from the little coves and promontories.

In principle, the coastline could be indefinitely long, because you'd have to count the jaggedness of individual atoms in the grains of sand. How long is the coastline, then, when it's really a line that partly spills over into two dimensions?
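Mandelbrot summarized the effect with a simple relation: if the coastline has a fractal dimension D greater than 1, the measured length grows as the measuring ruler shrinks, roughly L = F * ruler^(1 - D). The sketch below uses invented values (the constant F, and a dimension of about 1.25, close to the figure often quoted for the west coast of Britain) purely as an illustration.

    # Illustrative only: an invented constant F and an assumed fractal dimension D.
    # As the ruler gets shorter, the measured length keeps growing.
    D = 1.25
    F = 5000.0
    for ruler_km in (100.0, 10.0, 1.0, 0.1):
        length_km = F * ruler_km ** (1 - D)
        print("ruler %6.1f km -> measured length about %8.0f km" % (ruler_km, length_km))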

In this same way, the closer you look at a given category of knowledge, the more sub-categories you'll see, and the more closely you look at those sub-categories, the more sub-sub-categories you'll see, and so on without end.

The librarians in this monstrously large Library need a master librarian who has read and understood all the books in the Library, and can answer any question about the knowledge they contain.

Since this is a job no human could possibly do, the librarians try to build a machine to do it. However, they immediately run up against the barrier faced by any automaton designed to process language.

The electronic librarian would have to operate according to a set of rules, or a set of rules about other sets of rules, and so on indefinitely. It wouldn't be practical to program such a machine to handle the concepts it will encounter in the books.

The programmers would themselves have to read and understand the books, and then specify the rules for mechanically processing all the concepts in them.

They would have to invent rules to cover all the ways in which each concept could relate to all the other concepts. That would take a terrifically long time.

Suppose that you could simply hire hundreds of thousands of human librarians to read and memorize the books, and then to sit next to phones, ready to answer questions regarding what they've read.

Well, wouldn't you still face the problem of finding out which librarian has the answer? Also, once you found the right librarian, wouldn't you then have to spend a long time trying to understand what the librarian has read?

Wouldn't the librarian have to refer to knowledge stored in books that neither of you had read, thus causing comprehension problems for both of you?

An electronic Library that assembled all human knowledge would be so gigantic that we'd get lost in it.

We're reminded of the plight of shipwrecked sailors adrift on the ocean: "Water, water, everywhere, but not a drop to drink." To be useful, knowledge -- like water -- has to be available when, where, and in the form that we need it.

Elements of knowledge processing

By useful knowledge, I mean knowledge that helps you make sense of your environment and solve your problems.

It helps you decide which course of action will most likely lead to the results you want, and it helps you distinguish successful choices from unsuccessful ones.

Just like other resources, knowledge has to be processed. Unprocessed knowledge has many undesirable qualities. It may be incomplete, irrelevant, unreliable, or incomprehensible.

The heart of knowledge processing is the filter. A filter works by eliminating all items that don't match a set of conditions which you've chosen in advance.

Notice that eliminating what you don't want is functionally equivalent to extracting what you do want.

Suppose you have to find diamonds lost in a mound of dirt. Your first problem is choosing what kind of filter to use. A "diamond filter" would have to be able to distinguish diamond from dirt.

It would have to test for characteristics that diamonds always have and dirt doesn't, or else characteristics that dirt always has but diamonds don't. You might have to use more than one filter -- such as one for hardness and one for moisture content, for example.

Objects that passed the test for hardness would go on to get tested for moisture. By chaining filters together you can achieve more refined sifting. The filter might also consist of the sharp eyes and nimble fingers of a human worker hired to do the job.
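Here's a minimal sketch, in the spirit of the diamond example, of what chaining filters looks like; the item properties and thresholds are invented for illustration.

    # Each filter passes along only the items that meet its condition.
    def hardness_filter(items):
        return [x for x in items if x["hardness"] >= 10]   # diamonds are very hard

    def moisture_filter(items):
        return [x for x in items if x["moisture"] == 0.0]  # diamonds hold no moisture

    mound = [
        {"name": "clod",    "hardness": 2,  "moisture": 0.3},
        {"name": "pebble",  "hardness": 7,  "moisture": 0.0},
        {"name": "diamond", "hardness": 10, "moisture": 0.0},
    ]

    # Chaining: whatever passes the hardness test goes on to the moisture test.
    candidates = moisture_filter(hardness_filter(mound))
    print([x["name"] for x in candidates])   # ['diamond']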

After passing the whole mound of dirt through the filter, you'll have created some useful knowledge. Hopefully, you'll have successfully sorted the mound into two new groups, one heap labeled "Dirt," and one small, but beautiful pile labeled "Diamonds."

In reality, things won't go that easily, of course. What you'll probably get is a whole lot of piles, each pile with a label telling how likely it is to contain the diamonds -- 80% chance, 50% chance, less than 10% chance, and so on.

Still, you'll have more useful knowledge than you did before you filtered the mound.

You'll have to invent a more stringent filter system. The drawback with the tighter filters, though, is that they're much more expensive to use and they take much longer to sort through the dirt.

You only have enough time and money for sorting one or two heaps, so of course you pick the ones that the previous, cheaper filters said had the best chance of containing the diamonds.

Unluckily for you, when you sift through those smaller piles you still just come up with dirt. Looks like the mound didn't contain any diamonds after all! The trouble is that you didn't properly gather the dirt.

Gathering is an important principle of knowledge-processing, just as filtering is. Gathering is the inverse of filtering. Whereas filtering means working to extract only the knowledge you need, gathering means working to obtain all the knowledge you need.

Choosing the right filter to give you the results you want may not be easy. An important part of knowledge-processing, therefore, is testing filters to see which ones work best for you.

The process of finding appropriate filters requires gathering and filtering, also. You have to gather all the filters that might work, then filter the filters to find those that work the best.

The more complex the knowledge-matter that you're working with, the more you'll need filters for filters for filters, and so on.

Knowledge is useful only to the extent that it has been adequately collected, selected, connected, and corrected.

Collect means to gather knowledge thoroughly and intelligently so you can be reasonably sure you've got all available knowledge that pertains to solving whatever problem you're facing.

Well-collected knowledge is ready-at-hand. It's quickly and easily accessible. Ill-collected knowledge, on the other hand, is scattered and fragmentary.

Poorly collected knowledge may also be hidden -- partially or totally. Well-collected knowledge gives you some assurance of completeness: the more thoroughly knowledge has been collected, the surer you can be that you're not missing anything you'd find interesting at that moment.

If pertinent knowledge exists, but you don't have access to it, or if you don't even know it exists, then for you that knowledge is poorly collected.

For knowledge to be well-collected, it must be available quickly and conveniently whenever it's needed. To the extent that knowledge is at all difficult to obtain, it isn't well-collected.

Collecting knowledge requires effort. Once collected, however, it must be effortlessly accessible.

A collection of knowledge can never be perfectly complete. Knowledge processing always comes at a price. Mostly, what it costs is time. If you use machines -- computers -- to help you process knowledge, then you'll also have to supply energy to build and run them.

You want to make your collection as complete as you can afford it to be. In many cases, you'll bump up against certain limits where consuming more time and energy brings diminishing returns.

Not only do you want all the knowledge that's available, you also want to make sure you get only the knowledge that's strictly germane to the situation you're dealing with.

If your collection contains irrelevant or inappropriate knowledge, then it hasn't been sufficiently selected. To select knowledge means to filter it, eliminating the knowledge that doesn't apply to the task at hand.

Knowledge selection is a sorting process. You classify knowledge according to how well it fits with the problem you need it for.

To do that, you have to make judgments or choices -- which also rely on knowledge that's been collected and selected.

As is true with collecting, selecting can never be perfectly accurate. You can never be completely sure the criteria you use to sort knowledge are appropriate -- you may include some knowledge you never actually use, or you may leave out some that you end up needing.

The more finely you sort knowledge, the more time and energy it'll cost you. Looser filters are faster and cheaper, but they're more likely to let irrelevant knowledge pass through, and they may also clumsily reject knowledge that's useful.

Tighter filters, on the other hand, usually consume more time and energy when sorting.

You have to be able to assimilate the knowledge that you've collected and selected. In other words, it has to connect with the knowledge you already have.

To be well-connected, knowledge must be presented in a carefully selected order. It has to relate closely with what the learner -- the listener or reader -- already knows.

The medium that's presenting the new knowledge -- let's call it the teacher -- must create a bridge between the new knowledge and the learner's existing knowledge. The length of that bridge depends on how much the learner already knows about the subject being presented.

If the knowledge is abstruse and largely unrelated to the learner's everyday experience, the bridge needed to connect it will be very long.

The bridge may connect the learner with the new knowledge using either a series of short steps or a series of longer ones.

Shorter steps supply more detailed intermediate knowledge, and take less for granted about what the learner already knows. The longer the steps, the more background knowledge the learner has to be able to supply.

Because of the Babel Barrier, there's a limit to how small you can make the steps when using language as the medium for presenting the new knowledge.

The teacher has the responsibility of finding out what the learner already knows and building the bridge out of small enough intermediate steps so that the learner can fully assimilate the new knowledge. In other words, the teacher has to learn about the learner.

In many cases, explicit "book" knowledge can't be readily assimilated because it fails to connect well enough with the learner's experience. Written language has disadvantages in connecting knowledge.

First of all, written material rarely gets prepared especially for just one learner. Because each learner has a unique knowledge background, text can easily miss the mark and take steps that the learner can't follow.

Another problem with text is that it can't be interrupted; the reader can't stop the writer and ask for clarifications. Sometimes knowledge has to be presented using some medium other than language or written text.

Detailed visual illustrations or personal demonstrations might help, for example.

After being collected, selected, and connected, knowledge also has to be corrected for it to be useful. Correct means to assess how reliable the knowledge is.

Correction involves evaluating the reliability of knowledge sources. To correct knowledge, you need knowledge about knowledge.

You have to find out where the knowledge comes from and how it was collected and selected.

Whenever you collect or select knowledge, you have to make judgments about what items might turn out to be useful. Those judgments are always a matter of guesswork.

Knowledge correction means keeping track of what those judgments were, and evaluating how reliable they are.

Quite obviously, correction is an imperfect process because you can never know exactly how accurate your collection and selection criteria were.

Anti-information

I like to refer to knowledge that isn't useful as anti-information. Anti-information is more than knowledge that just doesn't happen to be useful, though; it's worse.

It clogs up the pipeline, so to speak. It distracts you and hogs your attention. Anti-information saps your powers of mental concentration, making it difficult to digest the information you really need.

Everyone's favorite example of anti-information is junk mail. Advertising, in general, is anti-informative. The communications media available today are still relatively crude. 

They nearly always direct the same message to a very large, undifferentiated group of people. The target audience is nearly always much smaller than the number of people to whom the message is actually sent.

The advertiser and the potential clients have different information needs. The advertiser is mostly interested in collecting, while the advertisees are more interested in selecting. The advertiser wants the most complete list possible of potential clients.

On the other hand, most people are more interested in not getting bothered by ads for things they don't want to buy. The disparity between the interests of the advertisers and those of the advertisees is what creates the anti-information.

Another name for anti-information is noise. Noise occurs whenever more knowledge gets collected than can be selected. According to classical information theory, you can overcome any amount of noise in a communication channel if you put enough redundancy in your coded message.

Redundancy means backing up your message by saying the same thing in sufficiently many ways.

Languages have a lot of redundancy built into them. In the English sentence Mary drinks milk, for example, the s is redundant. If you just say Mary drink milk everyone will understand.

The s supplies redundancy by confirming that there's only one person drinking. Even if Mary came out garbled, you'd still be expecting there to be only one person drinking, and so you'd be listening for clues about who it might be.
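The crudest form of redundancy is simple repetition: send each bit several times and let the receiver take a majority vote. The sketch below is only an illustration of that idea, with an invented noise level.

    import random

    # Each bit is sent three times over a noisy channel; the receiver
    # takes a majority vote, so a single flipped copy does no harm.
    def send_with_noise(bit, flip_probability=0.1):
        return bit ^ (random.random() < flip_probability)

    def transmit(bits, copies=3):
        received = []
        for bit in bits:
            votes = sum(send_with_noise(bit) for _ in range(copies))
            received.append(1 if votes > copies // 2 else 0)
        return received

    message = [1, 0, 1, 1, 0, 0, 1]
    print(transmit(message))  # usually matches the original despite the noise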

Providing context is an important way to reduce anti-information. Context provides redundancy. What the listener already knows about the world provides clues for what to expect the speaker to be saying.

It gives the listener some handles with which to sort out possible meanings that just wouldn't make any sense. Without sufficient context, a message can quickly turn into a lot of noise.

Anti-information can truly be debilitating. A classic case is the jet pilot who, dazzled by a barrage of signals from the instruments, fails to notice an obvious danger.

Aircraft designers have to find ways to feed the pilot only the knowledge that is useful and necessary. In other words, the instruments have to create useful knowledge by collecting, selecting, connecting, and correcting exactly what the pilot needs to respond appropriately moment by moment.

The more complex the environment, the more useful knowledge is needed to make successful decisions. Also, the more complex the environment, the more harmful anti-information becomes.

The commonly heard expression, "suffering from information overload," is in fact the opposite of what it seems. It means you don't have enough context to know how to respond under the complex circumstances at hand.

There's really no such thing as having too much knowledge, then, although anti-information can truly strangle you.

Filters of filters

A filter is a machine that makes choices. A piece of paper that lets coffee drip through while retaining the grounds works as a coffee filter by choosing between coffee and grounds.

The coffee filter "remembers" your choice to select the coffee and reject the grounds.

Some filters choose other filters. A filter of filters rejects all filters that don't give the desired results. By filtering filters, you get a more complex and subtle level of choice than a simple filter can provide.

As an example, think of how each person makes choices about which clothes to buy. The shopper makes judgments based on personal taste, the price of the clothes, their quality, and so on.

Each person becomes a kind of clothing filter.

Another person, a fashion writer, sits in judgment on the choices the shoppers have made. The fashion writer sorts the shoppers into groups according to how well their wardrobes fit with what the writer thinks is good taste. That makes the fashion writer a filter of filters.

Suppose you're looking for some fashion advice. You read what several fashion writers have to say about style, and you choose the one that gives you the most insight. You've become a filter of filters of filters.
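In code, a filter of filters is just a higher-order filter: something that keeps only the filters whose output passes a further test. The sketch below, with invented clothing data and an invented "judge," is purely illustrative.

    # Two simple clothing filters, each embodying one shopper's choices.
    def cheap(items):      return [x for x in items if x["price"] < 50]
    def well_made(items):  return [x for x in items if x["quality"] > 7]

    # The filter of filters: reject any filter whose results fail the judge's test.
    def filter_of_filters(filters, items, looks_right):
        return [f for f in filters if looks_right(f(items))]

    wardrobe = [
        {"price": 30, "quality": 9},
        {"price": 40, "quality": 2},
        {"price": 80, "quality": 3},
    ]

    def judges_approval(chosen):
        return len(chosen) > 0 and all(c["quality"] > 5 for c in chosen)

    good_filters = filter_of_filters([cheap, well_made], wardrobe, judges_approval)
    print([f.__name__ for f in good_filters])   # ['well_made']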

The more effective the filtering, the less anti-information or noise interrupts the flow of business in the system. Filter structures -- filters of filters of filters -- can grow to be so effective that they actually adapt to changing circumstances.

They can cope with accidental and random events that would otherwise bog the system down with noise.

Adaptive, noise-resistant systems with powerful filter structures have an enormous advantage over simpler, weaker systems. They can harvest more diversity.

Using a more powerful filter is like casting a larger net. The more possibilities such a system can select from, the greater the chance that it will catch an interesting piece of knowledge.

Anti-information and digitization

With so much knowledge, we're beginning to strangle on anti-information. One of the most difficult challenges is to sift the wisdom out of all the knowledge that we're creating. "Wisdom" is another name for useful knowledge.

People who've worked in a field for many years usually have wisdom, because they've processed the knowledge in that field and made it useful. As clever apes, our instinct is to reach for some kind of tool to help solve our problems.

It's natural for us now to turn to the computer to help with knowledge-processing. Will computers create wisdom, though, or will they just create more anti-information?

The French mathematician and philosopher Blaise Pascal built one of the first mechanical calculators, a machine that could add and subtract. Charles Babbage, a 19th century Englishman, designed the first automatic calculating machine.

It was mechanical, relying on gears and levers to do the calculations. He called it the "difference engine," although only a portion of it was ever completed.

Early in the history of computers, mathematicians showed that any calculating machine that can add, subtract, multiply, and divide can in principle perform other mathematical operations as well.
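To see why, consider the square root, which isn't one of the four basic operations. Here's a minimal sketch (my illustration, not a description of any historical machine) of how a square root can be computed by nothing more than repeated adding and dividing, using Newton's method of successive refinement.

    # Newton's method: start with a guess and repeatedly average it with n / guess.
    # Only addition and division are used, yet the result converges to sqrt(n).
    def square_root(n, steps=20):
        guess = n
        for _ in range(steps):
            guess = (guess + n / guess) / 2
        return guess

    print(square_root(2))   # about 1.41421356...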

After his work on the difference engine, which could perform only arithmetic operations, Babbage wanted to build a machine that could perform any mathematical operation.

He called his planned universal calculating machine the "analytical engine." It was to be an enormous, steam-powered machine.

Babbage exhausted all his resources trying to build his analytical engine. Unfortunately, machine-tooling skills in the 19th century were inadequate for crafting the sophisticated gears and levers the computer would have needed.

The analytical engine had to wait until the 20th century for electronics, where electric current takes the place of clumsy metal parts. Although he never succeeded in building an actual universal computer, his plans for it were detailed enough that it was evident how it would work.

Lady Ada Lovelace, Babbage's confidante and the daughter of Lord Byron, wrote many interesting observations about computers and their potential, which are accurate even now.

During the Second World War, some of the first electronic computing machines were designed and built in England at Bletchley Park, where the mathematician Alan Turing led much of the codebreaking work.

The Bletchley Park machines helped change the course of the war when the Allies used them to crack Nazi Germany's formidable ciphers, including the Enigma code, which had previously been thought impenetrable.

The US Government, meanwhile, also built early electronic computers to do the mountains of calculations needed by the physicists who were developing the first nuclear weapons.

The Second World War was the first war whose outcome was decided largely because of computers.

The earliest electronic computers were "hard-wired." They were built to perform specific mathematical tasks. If they were needed for other tasks, they'd have to be re-wired.

John von Neumann was a co-inventor of the programmable electronic digital computer. Rather than being hard-wired to perform specific tasks, the programmable computer follows a set of logical instructions, called the "program," that tells it how to perform a given task.

By changing the program, you change the task the machine performs, without having to rebuild the computer each time.

Von Neumann was a Hungarian mathematician and engineer who was so prolific at laying the foundations for new theories that he's sometimes called the midwife of 20th century science.

Wherever there was a difficult and important theoretical problem, you could be sure that von Neumann was there, working on it. The fact that he deemed the invention of software an interesting problem suggests how important a question it really is.

Since all a computer does is calculate, you have to convert any other task you want it to solve into a problem of doing calculations. When a computer works with text, for example, it may convert each letter and punctuation mark into a number.

Or, it may treat each page as a map. Just like a street map that has coordinates for helping you find a particular street, each position (called a pixel) on the page map also has a unique number.

The computer keeps track of the color of each pixel. No matter how complex the text or graphics on the page is, the computer just sees it as numbers matched up with pixels.

Changing the page means nothing more than changing the patterns of numbers.
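As a small illustration (the character codes and the tiny "page" below are just examples), here is how text and a page of pixels boil down to numbers.

    # Each character of text becomes a number.
    text = "Sumer"
    print([ord(ch) for ch in text])        # [83, 117, 109, 101, 114]

    # A tiny 3-by-3 "page": each (row, column) position holds a color value,
    # here 0 for a white pixel and 1 for a black one.
    page = {(row, col): 0 for row in range(3) for col in range(3)}
    page[(1, 1)] = 1                       # changing the page is just changing numbers
    print(page[(1, 1)], page[(0, 2)])      # 1 0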

Nearly all computers in use today are electronic digital computers. Notice that the word "digital" refers to "digits," coming from the Latin digitus, meaning "finger."

All the jobs a digital computer does, from synthesizing music to guiding an airplane on automatic pilot, it does by "counting fingers," just like a child, but overwhelmingly more quickly and accurately than any child could ever do.

In everyday life, we usually count using base ten, with numbers from zero to nine. Obviously, this way of counting developed most naturally because we have ten fingers.

Not every culture uses base ten, however. The ancient Gauls, who lived in France when the Romans arrived, counted in base twenty; they included their toes as well as their fingers.

That habit of counting still lingers in modern French, incidentally. To say "eighty" in French, you say quatre-vingts, which means, literally, "four twenties." For computers, it's easier to count in base two, or binary.

Even though the binary system uses only zeroes and ones, it can still write every number that base ten can. Binary is easier for the computer to use because it's simpler to represent electronically.

A zero could be "off" and a one could be "on," for example, or a zero could be "low voltage" and a one could be "high voltage."
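A quick illustration of counting in base two, written here as a short Python sketch:

    # The same small numbers, written in base ten and then in base two.
    for n in range(6):
        print(n, format(n, "b"))
    # 0 0
    # 1 1
    # 2 10
    # 3 11
    # 4 100
    # 5 101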

The binary numbers that instruct the computer what to calculate make a kind of code. That binary code is called "machine language." The first programmable computers were very difficult to use because the programs had to be written in machine language.

During the 'fifties, computer engineers began inventing computer languages that more closely resembled human languages.

Today, there are lots of "high level" languages, such as Pascal, C, BASIC, or Fortran, which make it much easier and faster for programmers to write code for the computer.

Even though computer programs differ widely, they nearly all share three working principles -- looping, nesting, and branching.

Looping is sometimes called iteration. "Iterate" comes from the Latin iterum, meaning "again." To iterate means to perform a given operation repeatedly.

The computer may cycle in a kind of loop, doing the same calculation again and again. This is a simple but very powerful principle. Small though a shovelful of dirt may be, if you move enough shovelfuls, you can move a mountain.

Because computers calculate very quickly and very accurately, they can perform many iterations in just a short time.

The final instruction in a program or a section of a program may be to go back to the beginning and start again. That's what makes the computer run in a loop.

A loop that runs inside another loop is nested. Nesting means putting structures -- loops, or whole programs -- inside other structures.

Nesting is closely related to recursion, in which a procedure invokes itself. Some computer languages let the programmer nest "deeply," putting loops within loops within loops. Nesting is a more sophisticated form of iteration.

The third principle of computer programming, branching, works like a switch on a railroad track. A program with branching has more than one set of instructions which it can execute.

This is like having more than one program. The main program can branch to any of the subprograms. Usually, each branch, or subprogram, gets executed only when specific conditions are met.

Branches are useful for controlling the program, especially for escaping from loops. Without the possibility of branching out of a loop, the computer could get stuck performing the same iterations endlessly.

Through looping, nesting, and branching, a program can instruct a computer to perform many useful tasks.
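A minimal sketch, with invented numbers, showing all three principles at work in a few lines of Python:

    # Find two small factors of a target number.
    target = 12
    found = None
    for i in range(1, 10):            # looping: repeat the same work
        for j in range(1, 10):        # nesting: a loop inside a loop
            if i * j == target:       # branching: take a different path when the test succeeds
                found = (i, j)
                break                 # escape the inner loop
        if found:
            break                     # escape the outer loop as well
    print(found)                      # (2, 6)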

It's important to realize, however, that no matter how clever the computer may seem, it's just a machine that does swift and accurate calculations following a set of strict instructions precisely written by humans.

Modern computers have capacious and efficient memories that store the results of calculations for later use by the program. This gives them an added dimension of complexity and sophistication.

As computer engineering progresses, computers are getting faster at calculating, and computer programs are becoming nimbler at exploiting the computer's calculating power for performing very subtle and complex tasks.

Today, there's even software that helps programmers write new programs, as well as libraries of programs from which master programs can automatically extract snippets of programming code to insert in new programs.

Computers are remarkable machines, arguably the most brilliant tools ever designed by humans.

Nevertheless, it's amazing how resoundingly stupid computers really are. Any intelligence that a computer might seem to have is in fact the programmer's intelligence, not the computer's.

The computer is nothing but an automaton that carries out the programmer's instructions. Some people argue that individual neurons in the brain hardly show any intelligence, either, and that intelligence lies in the connections between the neurons, and that the whole is much greater than the sum of its parts.

Isn't a program somehow greater than the sum of its instructions? Perhaps it is, but I think we'll always discover that the emergent characteristics of the program as a whole are somehow traceable to the programmer's thoughts.

In the last century, there were noted scientists who "proved" that aeronautics would never be possible. Most of us take for granted that not too long from now we'll be carrying on fairly normal conversations with machines.

It'd be foolish to say this will never be done, but it's a much more difficult problem than it seems.

A true talking machine would be fundamentally different from any computer that now exists.

Symbols and the experiential gap

Alan Turing invented a famous test for artificial intelligence. He was imagining a time in the future when computers would be so clever that people would want to know whether a machine could actually have a mind.

That's a difficult question, because no one knows for sure what a mind really is.

Turing's test is thought-provoking and ingeniously simple. Using only a keyboard and terminal, a person communicates both with the computer being tested for artificial intelligence and with another human being.

The person doing the testing poses questions to both the computer and the human subject, trying to decide which one is a mind and which one isn't. If the tester fails to determine which is which, then the machine passes the test.

It's natural to think of how the brain works in terms of the leading technologies of the day. Freud used many steam engine metaphors in describing the mind. Later, telephone analogies were the rage.

Today, many people can't resist the temptation to think of the brain as a fancy, squishy computer. In fact, the brain could hardly be more unlike a modern digital computer.

The "brain-as-a-computer" fallacy is based on two major beliefs. First, that a mind is actually "software" that can reside in any hardware complex enough to run it, whether made of living tissue or, for example, silicon; and second, that many of the general principles describing how software works in a silicon-based computer should also apply to the "wetware" mind that functions in the brain.

According to this view, the human mind is an epiphenomenon of the brain. What matters is the structure of a mind. Any sufficiently complex system with an appropriate structure could serve the same function and deliver a mind just as well as the brain does.

It's becoming increasingly clear, however, that computer engineering reveals little about how the human brain actually works.

Many of our deep-seated notions about the mind come out of the history of European philosophy. Aristotle began the tradition of formal logic with his study of syllogisms, such as the famous one: All men are mortal; Socrates was a man; therefore, Socrates was mortal.

During the Middle Ages, the Scholastics entwined formal logic with religious ideas about the world. Later, Descartes tried to show that the mind was rational and reducible to formal logic.

With Frege came the development of modern logic, out of which grew the logical positivist movement in philosophy, associated with philosophers like Whitehead, Russell, and Carnap.

The logical positivists believed that all knowledge could be expressed as a list of propositions. A proposition might say, for example, that the sky appears blue.

Then, you'd need propositions stating what "blue" means, what the sky is, and so on. If you had enough propositions, you could eventually capture all knowledge.

The logical positivists tried to discover the rules governing all knowledge. They believed those rules would tell which propositions were true and which must be false.

Their dream was that one day they could deduce all truth from a simple set of propositions (to put the whole universe on a T-shirt, I suppose).

As the Turing test showed, language comprehension is at the heart of what we expect from a mind. The way computers process language is not at all the same as the way that humans comprehend language, however.

Computers process language by mechanically taking apart and putting together words and sentences according to a fixed set of rules or propositions such as the logical positivists were trying to build.

The problem with this, of course, is that the list of propositions gets too long to be useful, even assuming you had enough time to catalog them and enter them into a computer's memory.

Computer engineers write programs that parse language to pick out the verbs, nouns, and other parts of speech, on the basis of rules about how those words should appear.

The programs are usually quite clever at deciding what an ambiguous word might mean by looking at other words around it.

That way, besides being able to handle a phrase like "books in the library," a program could also handle phrases like "booking a criminal," "doing the company books," "booking a flight," "hitting the books," and so forth.
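A hedged sketch of the kind of rule such a program might apply; the clue words and sense labels are invented for illustration, not taken from any real parser.

    # Decide which sense of "book" is meant by looking at the surrounding words.
    RULES = [
        ({"library", "read", "hitting"},   "book as a printed volume"),
        ({"flight", "reserve", "ticket"},  "book as a verb: to reserve"),
        ({"criminal", "police", "charge"}, "book as a verb: to charge a suspect"),
        ({"company", "accounts", "audit"}, "books as financial records"),
    ]

    def disambiguate(phrase):
        words = set(phrase.lower().split())
        for clues, sense in RULES:
            if words & clues:          # any clue word nearby selects that sense
                return sense
        return "sense unknown"

    print(disambiguate("booking a flight"))        # book as a verb: to reserve
    print(disambiguate("doing the company books")) # books as financial records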

Even if the computer program could handle all these phrases correctly, would it truly be understanding the language the way we do? A philosopher named John Searle invented a famous anecdote to illustrate how empty the computer's understanding of language really is.

It's called the Chinese Room paradox. Suppose you didn't know a word of Chinese. You decide to take part in an experiment in which you're put in a closed room that allows nothing to pass in or out except messages written in Chinese.

Inside the room is a reference book telling how to put the Chinese words together to make proper Chinese sentences, even without knowing what they mean.

A person sticks a message written in Chinese through a slot in the wall, and so you go ahead and consult your reference books, which have pictures of the Chinese characters along with detailed rules about how to handle each one.

By mechanically following those rules, you could assemble a plausible reply and pass it back out through the slot. You might actually fool the person outside into thinking that you really knew Chinese -- yet you wouldn't have understood a single word.

The propositional or sentential approach to language processing is bound to fail in the end. Even if programmers ever succeeded in designing machines that could correctly parse language and match the correct dictionary definition to each word, the list of propositions needed to process those dictionary definitions would be too long to be of any use. 

A language-parsing machine would end up having to process an unmanageably large number of propositions.

When processing the subtleties of language meaning, a computer ends up choking on all the rules it has to execute. Some people might argue that if a computer were only fast enough, it could still truly understand language.

In practice, however, as the complexity of the knowledge that a computer must process increases, the number of necessary rules grows faster still. Even running on a supercomputer, a program might take years to truly comprehend even simple concepts.

Words only point to the highlights -- the salient features -- of experience. Language communicates knowledge in an inherently disconnected way. Even though certain pieces of knowledge may be connected to one another, it's always up to the reader to figure out what those connections are.

The reader has to fill in the gaps and discover the connections envisioned by the writer to understand what the writer was actually thinking or experiencing.

Language has to cope with knowledge in terms of finite connections (although it's possible that poetry doesn't have that limitation).

A human reader can guess what the writer is thinking by "reading between the lines." A computer can't do that. It must have an explicit rule or proposition for everything it's expected to do.

Language is discrete. Real human experience isn't. Experience is continuously and richly varying. No two moments of experience may ever be precisely the same.

Although languages are truly complex, they can never be complex enough to match the richness of experience. Human experience seems infinite when taken moment by moment.

The closer you look, the more details you see. Although the number of possible sentences in a language is essentially limitless, there will always remain gaps in the power of the language to describe experience.

The Babel Barrier is the limit faced by all symbol systems in describing continuously varying experience. It's as solid as the boundary between numbers you can count to and those that are too large to count to.

Computers -- counting machines -- can't soar above this boundary as we so naturally do, because they're rule-based.

The data deluge

The global computer Utility, the paperless society, and the universal electronic Library will create such a flood of communication that creating truly useful knowledge and purging anti-information will become a major challenge.

Trying to use computers to cope with this raging storm of data will prove futile. A computer, or any automaton, quickly gets hung up on the Babel Barrier, just like the aliens that tried to understand human languages.

A language, or any other symbol system, imposes a certain limit upon knowledge-processing machines, and therefore on the size and usefulness of libraries.

Any library that uses language would face this limit. A library is a kind of communication center, linking up the thoughts and experiences of people from diverse times and places. 

How will we cope with the coming knowledge explosion if we're crippled by the Babel Barrier? Could it be that language, which for so long has served us so well, might soon prove inadequate?

What kind of machine could store knowledge and make it useful without using language? What's needed is a true information-processing machine, not just a symbol-processing machine like today's computers.

Let's take a closer look at information.

Michael Webb, 1992
