Published: Jan. 18, 2017

When Matthew Keller found, in a more powerful follow-up study, that he could not duplicate his own 2012 finding tying inbreeding to the chances of developing schizophrenia, he wanted to make sure the scientific record was clear.

Science can be a very messy business, with many more hypotheses posited, studied, discounted and quietly discarded than most people realize, according to Matthew Keller, an associate professor in behavioral, psychiatric and statistical genetics at the University of Colorado Boulder.

“There are plenty of false starts, some of which go on for years and years until someone shows them to be false,” said Keller, whose lab researches genetic architecture of psychiatric disorders. “It’s not unlikely that a lot of the science that’s published—perhaps in some domains, most of what is published—is wrong. That sounded startling to scientists a few years ago, but people are increasingly realizing that for certain sub-disciplines of science, it may be true.”

So when Keller found, in a second, better-powered study, that he could not duplicate his own 2012 finding linking inbreeding to the likelihood of developing schizophrenia, he wanted to make sure that subsequent research would not go down the same faulty track.

The second study, “No Reliable Association between Runs of Homozygosity and Schizophrenia in a Well Powered Replication Study,” was published in October in PLOS Genetics, the same journal that published the initial study, “Runs of Homozygosity Implicate Autozygosity as a Schizophrenia Risk Factor.”


“I thought it was crucial that we not only publish it, but publish in the same journal that we published the original finding,” Keller said. “I’m very grateful the journal did publish our second study. Most editors simply don’t want to publish null findings.” (A null finding does not support the hypothesis.)

To be sure, many researchers aren’t too inclined to discount their own findings, either. Keller himself has not ruled out that the positive results of his first study are valid, but in the second study, the disqualifying data speak for themselves.

“I have to admit that I’m disappointed in that it didn’t replicate, but I take great pride in publishing this report,” he said. “At first I thought we must have done something wrong—maybe miscoded something. And my poor graduate student (PhD candidate Emma Johnson, the lead author for the second study), I forced her to jump through so many hoops, but in the end the research was sound.”

That first study found a modest but nevertheless reliable association between inbreeding and schizophrenia—leading Keller to state that the odds of developing schizophrenia increase by approximately 17 percent for every additional percent of the genome that shows evidence of inbreeding. Inbreeding is a well-known factor in predicting occurrences of diseases caused by an abnormality in a single gene, referred to as monogenic disorders, but for more complex diseases, epidemiological studies require huge patient lists to generate enough data to establish an association.
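To make the reported effect size concrete, the arithmetic behind it can be sketched as follows. This is purely illustrative of how a per-percent odds increase compounds; the function name and the 2-percent example are not figures from the study:

```python
# Illustrative arithmetic only: an effect of "17 percent higher odds per
# additional percent of the genome showing inbreeding" compounds
# multiplicatively, like interest.
def odds_multiplier(percent_genome_inbred, per_percent_increase=0.17):
    """Multiplicative change in odds for a given percent of autozygous genome."""
    return (1 + per_percent_increase) ** percent_genome_inbred

# A hypothetical genome that is 2 percent autozygous would carry roughly
# 1.17 * 1.17 ~ 1.37 times the baseline odds under this model.
print(round(odds_multiplier(2), 4))  # 1.3689
```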

“The effects are tiny; to detect an association you need huge sample sizes,” Keller explained. “Once you get those sample sizes, you can see lots and lots of effects.”

To create those databases, researchers from around the globe have formed consortiums, so the 22,000-person database used in the original study included patients and control populations from a number of different countries. Combining them into a single database, researchers can then examine the degree of inbreeding in schizophrenic patients compared to the degree of inbreeding in the control group.

“Both studies used very large consortium data, but they all came from smaller studies—for instance, a study in Denmark would take data from 1,000 patients with schizophrenia and a similarly sized control group,” he said.

The measure of inbreeding is essentially determined by how many sites along the chromosomes carry identical alleles—meaning the individual received the same allele from both mother and father. But rather than simply tallying the proportion of duplicated allele pairs—homozygotes, in the language of the discipline—researchers look for long strings of homozygosity.

“Everyone is inbred to a certain degree, but there is large variation in how far back you have to go before you start finding the same people in different branches of the family tree,” Keller said. “Some people, you only have to go back a few generations; others, much further back. The runs of homozygosity are a more sensitive measure of recent inbreeding.”

A run of homozygosity means that a stretch of the genome is identical on both chromosomes because both copies descend from the same ancestor, and if recessive alleles sit within that run, their full negative effect will be revealed. In monogenic disorders that may be a direct path from mutation to disease—but in more complex disorders there could be thousands of combinations leading to a disorder.
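The run-based measure described above can be sketched in a few lines. This is a toy illustration, not the consortium pipeline—real ROH callers (such as PLINK’s) work on physical distance along the chromosome and tolerate occasional genotyping errors—and the `min_sites` threshold here is an invented parameter:

```python
# Toy sketch: scan a sequence of genotype calls ("hom" = both alleles
# identical, "het" = different) and report contiguous runs of homozygosity
# of at least min_sites consecutive sites. min_sites is illustrative only.
def find_roh(genotypes, min_sites=5):
    """Return (start_index, run_length) for each qualifying homozygous run."""
    runs, start = [], None
    for i, g in enumerate(genotypes):
        if g == "hom":
            if start is None:
                start = i          # a new run begins
        else:
            if start is not None and i - start >= min_sites:
                runs.append((start, i - start))
            start = None           # a heterozygous site ends the run
    if start is not None and len(genotypes) - start >= min_sites:
        runs.append((start, len(genotypes) - start))
    return runs

# Two qualifying runs; the short hom-hom pair in the middle is ignored.
calls = ["het"] + ["hom"] * 6 + ["het", "hom", "hom", "het"] + ["hom"] * 5
print(find_roh(calls))  # [(1, 6), (11, 5)]
```

The key design point, mirroring the article, is that isolated homozygous sites contribute nothing: only long uninterrupted stretches count, which is what makes the measure sensitive to recent shared ancestry.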

Keller remains unsure why his results failed to replicate but suspects that the use of consortium data may be partly to blame. In essence, control group establishment varies widely in each study that contributed data, so that could have easily confounded the results.

“When the research differs in how people are ascertaining cases and controls, it’s very easy to get confounding results in this particular type of study,” he said. For instance, if controls are drawn disproportionately from highly educated and well-traveled people, their levels of inbreeding could be less pronounced than in cases drawn from the general population, which might include, for example, people from long-established farming communities.

“This confounding could either be covering up an association in the second study, or it could have created the association in the first study,” Keller said. The second study used a large population—more than 30,000 people—but Keller did not include the original 22,000 people from the first study.

“I still have my suspicions that the first finding was right, but my confidence has plummeted from 80 percent right after we published that study to maybe 25 percent now,” he said. Still, publishing the null results not only gives subsequent researchers a warning sign, but also a glimpse into the possible data glitches and the intrinsic difficulty of moving forward on this path.

“The main problem now is there is no additional data collected that can help us solve the riddle,” he said. “There’s no reason to do the same study on another sample of 30,000 people again, because we can’t rule out there are these confounding problems. We’ve got to find a dataset that has collected relevant information, not just about the disorder, but about potential confounding variables, to make the study worth re-running.”