Except for various cartoon characters, the Geico Gecko and Mr. Ed, animals can’t speak. Yet they have a lot to say to scientists trying to figure out the origins of human language.

Speaking isn’t the only avenue for language. After all, linguistic messaging can be transmitted by hand signals. Or handwriting. Or texting. But speech is the original and most basic mode of human communication. So understanding its origins ought to generate deeper comprehension of language more generally. And a first step toward that understanding, cognitive scientist W. Tecumseh Fitch believes, is realizing that key aspects of vocal language are not, as traditionally contended, limited to humans.

He’s not talking about a TV-show horse, of course, or animated narrators of insurance advertisements. Fitch’s point is that many creatures from the real-world animal kingdom offer clues about how the capacity for speech came to be.

It’s true that humans, and humans alone, evolved the complex set of voice, hearing and brain-processing skills enabling full-scale sophisticated vocal communication. Yet animals can make complicated sounds; parrots can mimic human speech and cats can clearly convey that it’s time for a treat. Many animals possess an acute sense of hearing and are able to distinguish random noises from intentional communication. So even though only humans possess the complete linguistic package, the components of language ability “have very deep evolutionary roots,” says Fitch, of the University of Vienna. In fact, he suggests, just a handful of changes in the communication repertoire of humankind’s ancestors endowed people with the full faculty of language.

Much of the physiological apparatus for hearing and speaking is found in all land-dwelling vertebrates — the tetrapods — including mammals, birds, amphibians and reptiles. “Humans share a significant proportion of our basic machinery of hearing and vocal production with other tetrapods,” Fitch writes in the Annual Review of Linguistics.

Life-forms occupying numerous branches of the tree of life possess anatomical tools for producing and perceiving vocal communication. Where human ability exceeds that of our predecessors, Fitch says, is in the sophistication of the brain circuitry adapted to the uniquely human capacity for complex linguistic expression.

Historically, language experts have proposed anatomical explanations for humans’ special language facility. Just as the opposable thumb permitted tool use, some authorities theorized that the lower location of the voice box in the vocal tract enabled the articulation of meaningful sounds. Or the human hearing apparatus, encompassing hair cells and eardrum and three little bones, provided the discerning ear capable of interpreting nuanced vocalizations. But in reviewing the scientific literature, Fitch finds that speech’s structural subcomponents, used for producing vocalizations and perceiving patterns in those sounds, have appeared in multiple organisms over evolutionary time.

An anatomical view of the human head and throat highlights some key structures involved in speech, including the larynx, or voice box, vocal cords (folds), windpipe, tongue and lips.

Among primates, only humans can learn to produce novel vocal sounds, but that difference isn’t explained by anatomy — the basic structure of the human voice box and vocal tract is similar to that found in other mammals. Cartilage and muscle within the vocal folds of the voice box (or larynx) give mammals better control over vocalizing than other vertebrates have. Fleshy tongues and lips are also mammalian features that aid in speech production.

CREDIT: CLAUS LUNAU / SCIENCE SOURCE

The ear’s sensory hair cells, which convert sound vibrations into nerve impulses, go back as far as jellyfish, for instance. Genes instrumental to producing the hair cells are similar in insects and humans.

In some cases, a particular trait evolved independently in different lineages. But often a trait evolved once and then was passed down through a long line of descendants. Such “homologous” traits “provide the equivalent of a time machine allowing us to reconstruct an evolutionary sequence of ancestral forms,” Fitch notes. Independently arising traits, on the other hand, provide data points helpful for testing evolutionary hypotheses. Combined, the homologous inherited traits and the independent analogous traits have produced deep and novel insight into speech’s evolutionary origins.

Among the tetrapods, mammals evolved much more sensitive hearing, able to cope with a wider range of frequencies and therefore more able to process nuances of vocalizations. Humankind’s primate ancestors, for instance, possessed highly capable hearing. “There is nothing about the human ear that is strikingly different from that of other primates,” Fitch writes. “Our peripheral hearing apparatus was in place, in our primate ancestors, in essentially modern form long before we evolved the capacity for speech.”

But perhaps successful speech perception required “vocal tract normalization” — the ability to recognize the same words spoken by different voices (such as a child versus an old man). Humans are not alone in that ability, however. Zebra finches trained to recognize vowels when listening to a male voice can still make the distinction when the speaker is a woman.

Maybe the key human-only skill is the ability to figure out which of the world’s many complex sounds are vocal efforts to communicate. In the part of the human brain that responds to sounds (the auditory cortex in the temporal lobe), some of the circuitry is specialized for voices as opposed to other sounds. But such voice-specific circuitry also exists in nonhuman primates and perhaps even dogs. “The data lead to the conclusion that the primate auditory system had already evolved to a ‘speech-ready’ level of sophistication long before spoken language evolved in our species,” Fitch writes.

If hearing skill isn’t the source of human linguistic power, maybe the human-only aptitude for speech lies in the ability to produce it. Nonhuman primates can make vocal noises but, unlike the apes in the Planet of the Apes movies, cannot articulate the nuanced sounds of speech. But it’s not obvious why not, as the basic blueprint for the human vocal tract has been around for 70 million years and is shared by most mammals. Even the lower position of the voice box — the descended larynx — is not exclusively human. And that anatomical adjustment isn’t necessary for complicated vocalization, anyway. Experiments have shown that some primates have vocal tracts capable of ample vocal agility.

“An unmodified primate or mammal vocal tract would be perfectly adequate to produce intelligible spoken language,” Fitch writes.

Evolutionary tree showing the descent of major groups, called clades, of human relatives from a common ancestor. Clades listed include (from bottom to top) eukaryotes, vertebrates, tetrapods, amniotes, hominids and hominins. Scientists have used such groupings for insight into the evolution of spoken language in humans, a trait not shared by any other animal.

By studying clades — groups of species related by a common lineage of descent (simplified tree shown here) — scientists can compile clues about the evolution of speech. Many aspects of human speech and hearing, for instance, rely on features found in all tetrapods, a clade that includes mammals, reptiles and amphibians. Of particular interest are homologous and analogous traits. All mammal species, for instance, have three middle-ear bones, a homologous trait inherited from a common ancestor. Neural connections between parts of songbird brains important for vocalization may be analogous to neural connections between speech-related parts of human brains; those connections evolved independently in different lineages but may both be important for speech production.

Besides all that, parrots and many other bird species, some bats and even elephants can mimic vocal sounds. So humans’ distinctive speech can’t depend solely on vocal production ability. Considering all the evidence, the vocal and auditory skills of various animals tell a tale of multiple preludes to the human speech story. That tale reveals that humans acquired speech not via anatomical innovation for vocalizing and hearing, but by novel neural connections that control the anatomical hardware.

After all, speech requires more than producing and perceiving sounds. A speaker’s brain must decide what sounds to produce and issue instructions for producing them to the body’s vocal apparatus. And a listener’s brain must be able to decode auditory signals it receives and then issue commands for a vocal response. People are skillful at producing sounds in response to other sounds — it’s why you can repeat a word out loud after the first time you hear it.

Such controlled vocalization of a word is different from just making noise. Most animals possess neural circuitry for producing “innate” vocalizations: Dogs bark, squirrels chatter and seagulls squawk. Even humans have their own innate vocalizations, including crying, laughter and screams. But among primates, only humans have the “capacity to produce novel, learned vocalizations beyond the innate call repertoire,” Fitch notes.

Today the dominant hypothesis explaining that ability is the presence of special connections between brain regions involved in controlling speech and hearing. Innate calls — in humans and all other mammals — are initiated by direct signals from the brain stem. Indirect messaging from the cortex (the brain’s more advanced outer layer) enables voluntary suppression or production of innate calls. Unlike other primates, humans possess direct connections between nerve cells in the cortex and the nerve cells that control the muscles operating the larynx. Some apes and monkeys have direct connections from cortex to the muscles controlling the lips and tongue, but not to the muscles controlling the larynx. (Circuitry connecting the auditory cortex to the motor cortex also seems more extensively developed in humans.)

Evidence supporting the view that such direct neural connections explain human speech comes from other species that can “talk,” such as parrots and songbirds that can learn novel vocalizations. These species do have direct neural connections to their voice-generating apparatus, while birds that are not vocal learners don’t.

Underlying the evolution of the brain circuitry responsible for human speech skill are genetic modifications that remain largely mysterious.

Diagram shows simplified views of the human brain. One highlights the direct neural connection linking the motor cortex to the muscles in the voice box. The second shows nerve connections, shared with primates, between Broca’s region and the auditory cortex. It also shows additional connections between these brain areas running through the parietal cortex and found only in humans.

Among primates, it seems that only humans possess direct connections (shown at left) from the part of the brain that controls motion to nerve cells in the brain stem (black dot) that control the larynx muscles responsible for vocalizing sounds. Shown at right: Other primates as well as humans possess nerve connections (dashed line) between brain areas involved in language (Broca’s region) and hearing (auditory cortex). But human brains have more fully developed additional connection pathways (blue and red lines) that researchers hypothesize play key roles in producing and processing speech.

“The genetic underpinnings of … [neural] connections involved in human vocal control are virtually unknown,” writes Fitch. But genetic analysis of ancient organisms, including testing of DNA found in fossils, is an emerging research field. “Thus, genetic data perhaps provide the most promising and exciting empirical pathway for future research on the biology and evolution of speech.”

As Fitch notes, speech is not the whole story of human language. Vocal communication is a central feature, but language encompasses much more, as linguist and neuropsychologist Angela Friederici pointed out at a recent meeting of the Society for Neuroscience.

“Language is more than speech,” said Friederici, director of the Max Planck Institute for Human Cognitive and Brain Sciences, in Leipzig, Germany. “Speech … uses a limited set of vowels and consonants to form words. Language, however, is a system consisting of words … and a set of rules called grammar or syntax to form phrases and sentences.”

Nonhuman primates can learn the meaning of individual words, she notes, but aren’t capable of combining words into meaningful sequences of any substantial length. That ability also depends on circuitry connecting different parts of the brain, as research by Friederici, her collaborators and other scientists is now showing.

Understanding that circuitry depends on comparing the cellular architecture and nerve fiber tracts of the human brain with those of the brains of animals with lesser linguistic power. So in a way, scientists may be able to ask animals for clues not only to the evolution of speech, but to language skills more generally as well. Sort of like going straight to the source and asking the horse.