Several years ago, Christian Rutz started to wonder whether he was giving his crows enough credit. Rutz, a biologist at the University of St. Andrews in Scotland, and his team were capturing wild New Caledonian crows and challenging them with puzzles made from natural materials before releasing them again. In one test, birds faced a log drilled with holes that contained hidden food, and could get the food out by bending a plant stem into a hook. If a bird didn’t try within 90 minutes, the researchers removed it from the dataset.

But, Rutz says, he soon began to realize he was not, in fact, studying the skills of New Caledonian crows. He was studying the skills of only a subset of New Caledonian crows that quickly approached a weird log they’d never seen before — maybe because they were especially brave, or reckless.

The team changed their protocol. They began giving the more hesitant birds an extra day or two to get used to their surroundings, then trying the puzzle again. “It turns out that many of these retested birds suddenly start engaging,” Rutz says. “They just needed a little bit of extra time.”

Scientists are increasingly realizing that animals, like people, are individuals. They have distinct tendencies, habits and life experiences that may affect how they perform in an experiment. That means, some researchers argue, that much published research on animal behavior may be biased. Studies claiming to show something about a species as a whole (that green sea turtles migrate a certain distance, say, or that chaffinches respond in a particular way to the song of a rival) may say more about individual animals that were captured or housed in a certain way, or that share certain genetic features. That’s a problem for researchers who seek to understand how animals sense their environments, gain new knowledge and live their lives.

“The samples we draw are quite often severely biased,” Rutz says. “This is something that has been in the air in the community for quite a long time.”

In 2020, Rutz and his colleague Michael Webster, also at the University of St. Andrews, proposed a way to address this problem. They called it STRANGE.

This video from one of Christian Rutz’s experiments shows a wild New Caledonian crow bending a plant stem into a hook to retrieve food from a hole. Although some birds were hesitant to approach the materials at first, Rutz realized that many of them could solve the puzzle with extra time.

CREDIT: B.C. KLUMP ET AL / BMC BIOLOGY 2015

Personalities aren’t just for people

Why “STRANGE”? In 2010, an article in Behavioral and Brain Sciences suggested that the people studied in much of the published psychology literature are WEIRD (drawn from Western, Educated, Industrialized, Rich and Democratic societies) and are “among the least representative populations one could find for generalizing about humans.” Researchers might draw sweeping conclusions about the human mind when really they’ve studied only the minds of, say, undergraduates at the University of Minnesota.

A decade later, Rutz and Webster, drawing inspiration from WEIRD, published a paper in the journal Nature called “How STRANGE are your study animals?”

They proposed that their fellow behavior researchers consider several factors about their study animals, which they termed Social background, Trappability and self-selection, Rearing history, Acclimation and habituation, Natural changes in responsiveness, Genetic makeup, and Experience.

“I first began thinking about these kinds of biases when we were using mesh minnow traps to collect fish for experiments,” Webster says. He suspected, and then confirmed in the lab, that more active sticklebacks were more likely to swim into these traps. “We now try to use nets instead,” Webster says, to catch a wider variety of fish.

That’s Trappability. Other factors that might make an animal more trappable than its peers, besides its activity level, include a bold temperament, a lack of experience or simply being hungrier for bait.

Other research has shown that pheasants housed in groups of five performed better on a learning task (figuring out which hole contained food) than those housed in groups of only three — that’s Social background. Jumping spiders raised in captivity were less interested in prey than wild spiders (Rearing history), and honeybees learned best in the morning (Natural changes in responsiveness). And so on.

It might be impossible to remove every bias from a group of study animals, Rutz says. But he and Webster want to encourage other scientists to think through STRANGE factors with every experiment, and to be transparent about how those factors might have affected their results.

“We used to assume that we could do an experiment the way we do chemistry — by controlling a variable and not changing anything else,” says Holly Root-Gutteridge, a postdoctoral researcher at the University of Lincoln in the United Kingdom who studies dog behavior. But research has been uncovering individual patterns of behavior — scientists sometimes call it personality — in all kinds of animals, from monkeys to hermit crabs.

“Just because we haven’t previously given animals the credit for their individuality or distinctiveness doesn’t mean that they don’t have it,” Root-Gutteridge says.

This failure of human imagination, or empathy, mars some classic experiments, Root-Gutteridge and coauthors noted in a 2022 paper focused on animal welfare issues. For example, experiments by psychologist Harry Harlow in the 1950s involved baby rhesus macaques and fake mothers made from wire. They allegedly gave insight into how human infants form attachments. But given that these monkeys were torn from their mothers and kept unnaturally isolated, the authors ask, are the results really generalizable? Or do Harlow’s findings apply only to his uniquely traumatized animals?

Looking for more copycats

“All this individual-based behavior, I think this is very much a trend in behavioral sciences,” says Wolfgang Goymann, a behavioral ecologist at the Max Planck Institute for Biological Intelligence and editor-in-chief of Ethology. The journal officially adopted the STRANGE framework in early 2021, after Rutz, who is one of the journal’s editors, suggested it to the board.

Goymann didn’t want to create new hoops for already overloaded scientists to jump through. Instead, the journal simply encourages authors to include a few sentences in their methods and discussion sections, Goymann says, addressing how STRANGE factors might bias their results (or how they’ve accounted for those factors).

“We want people to think about how representative their study actually is,” Goymann says.

Several other journals have recently adopted the STRANGE framework, and since their 2020 paper Rutz and Webster have run workshops, discussion groups and symposia at conferences. “It’s grown into something that is bigger than we can run in our spare time,” Rutz says. “We are excited about it, really excited, but we had no idea it would take off in the way it did.”

His hope is that widespread adoption of STRANGE will lead to findings in animal behavior that are more reliable. The problem of studies that can’t be replicated has lately received much attention in certain other sciences, human psychology in particular.

Psychologist Brian Nosek, executive director of the Center for Open Science in Charlottesville, Virginia, and a coauthor of the 2022 paper “Replicability, Robustness, and Reproducibility in Psychological Science” in the Annual Review of Psychology, says animal researchers face challenges similar to those faced by researchers who focus on human behavior. “If my goal is to estimate human interest in surfing and I conduct my survey on a California beach, I am not likely to get an estimate that generalizes to humanity,” Nosek says. “When you conduct a replication of my survey in Iowa, you may not replicate my finding.”

The ideal approach, Nosek says, would be to gather a study sample that’s truly representative, but that can be difficult and expensive. “The next best alternative is to measure and be explicit about how the sampling strategy may be biased,” he says.

That’s just what Rutz hopes STRANGE will achieve. If researchers are more transparent and thoughtful about the individual characteristics of the animals they’re studying, he says, others might be better able to replicate their work — and be sure the lessons they’re taking away from their study animals are meaningful, and not quirks of experimental setups. “That’s the ultimate goal.”

In his own crow experiments, he doesn’t know whether giving shyer birds extra time has changed his overarching results. But it did give him a larger sample size, which can mean more statistically robust results. And, he says, if studies are better designed, it could mean that fewer animals need to be caught in the wild or tested in the lab to reach firm conclusions. Overall, he hopes that STRANGE will be a win for animal welfare.

In other words, what’s good for science could also be good for the animals — seeing them “not as robots,” Goymann says, “but as individual beings that also have a value in themselves.”