Last November, Luciano Maiani, the Director-General of CERN, the European Laboratory for Particle Physics ("where the Web was born"), was faced with an excruciating decision. The laboratory's main accelerator, the Large Electron-Positron (LEP) collider, had been scheduled to be shut down. A more powerful machine, the Large Hadron Collider (LHC), was to be built in the same 27-kilometer circular tunnel that straddles the border between France and Switzerland near Geneva.
One of the prime purposes of the LHC was to search for a particle called the Higgs boson. Finding the Higgs would put the final piece in place for the highly successful standard model of particles and forces. Not finding it would be equally significant.
Higgs particles, if they exist, pervade all of space and provide an explanation for the origin of mass. In the standard model, all particles are intrinsically massless but gain mass (that is, inertia) by bouncing off Higgs particles.
The mass of the Higgs is uncertain. But, as time has gone on, better estimates have been made. Gradually these have come down to the point where lower-energy experiments currently running might have a chance to see it. Four independent experimental groups working at LEP have examined their data for signs of the lower-mass Higgs. Last August, two of the groups reported that they had seen a total of four candidates at a mass of 115 GeV (the proton mass is 0.938 GeV). The other two groups had no candidates, but their data were not inconsistent with such a signal.
After seeing these results, the Director-General delayed the shutdown until November 4 to give the groups time to get more data. One group found two more candidates, and a third group also reported two new ones, although these were all questionable and, in fact, one previous candidate went away.
The combined results from the four experiments were determined, by a rather complicated analysis, to have a statistical significance level of about 99.8 percent. In probability theory, this means that an effect as large as or larger than the one reported would appear as a random artifact about once, on average, if the same experiment were repeated 500 times.
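The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration, not the analysis the LEP groups performed: the function names are hypothetical, and the conversion to Gaussian standard deviations assumes a one-sided test, which is not stated in the reports described here. The 95 and 99.99 percent levels are included purely for comparison.

```python
from statistics import NormalDist


def one_in_n(significance: float) -> float:
    """Average number of repetitions per chance result at this significance level."""
    return 1.0 / (1.0 - significance)


def one_sided_sigma(significance: float) -> float:
    """Equivalent number of Gaussian standard deviations (one-sided: an assumption)."""
    return NormalDist().inv_cdf(significance)


# Print the "one in N" odds and the rough sigma equivalent for each level.
for level in (0.95, 0.998, 0.9999):
    print(f"{level:.2%} -> about 1 in {one_in_n(level):,.0f}, "
          f"{one_sided_sigma(level):.2f} sigma (one-sided)")
```

A significance level of 99.8 percent leaves a 0.2 percent chance of a fluke, and one divided by 0.002 is the 500 quoted above.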
The experimenters and much of the particle physics community pleaded with the Director-General to extend the LEP run into 2001. They argued that Fermilab, the recently upgraded U.S. accelerator located in Illinois, might discover the Higgs before LHC came on line in 2006. After "extended consultation with the appropriate scientific committees," Maiani decided to shut down LEP and proceed with LHC construction. Apparently, he was not convinced by the data presented. He promised to do his best to speed up LHC construction by asking for more resources.
It will be fascinating to see how it all plays out. In the meantime, we can use this as an excellent example of how standards vary from field to field. In fields such as medicine and psychology, a significance level of 95 percent is usually adequate for publication. Anything at 99.8 percent would be accepted as a valid observation.
I can understand the desire in health fields to get research results into therapeutic use as soon as possible. However, the 95 percent standard implies that every twentieth experiment, on the average, will report statistical artifacts as real effects. Is it any wonder that people are confused when they read media reports of studies that say one thing, only to be contradicted a few months later by studies that say the opposite?
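A small simulation makes the one-in-twenty figure concrete. The sketch below is a toy model under stated assumptions, not a description of any real study: each hypothetical "experiment" draws 25 Gaussian measurements with no true effect, applies a two-sided test at the 95 percent level, and counts how often the noise alone passes. The function name, sample size, and number of trials are all made up for illustration.

```python
import math
import random


def false_alarm_rate(trials: int = 5000, samples: int = 25, seed: int = 0) -> float:
    """Fraction of no-effect experiments that still pass a 95 percent cut."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_cut = 1.96               # two-sided 95 percent threshold for a Gaussian statistic
    alarms = 0
    for _ in range(trials):
        # Pure noise: the true mean is zero, so any "effect" is an artifact.
        data = [rng.gauss(0.0, 1.0) for _ in range(samples)]
        z = (sum(data) / samples) * math.sqrt(samples)
        if abs(z) > z_cut:
            alarms += 1
    return alarms / trials


print(f"fraction of null experiments passing the cut: {false_alarm_rate():.3f}")
```

The fraction should come out close to 0.05, that is, roughly one artifact for every twenty experiments in which nothing real is happening.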
Back in the 1960s, when I was a graduate student, observations of new elementary particles were being reported almost every week, and getting published, only to fail to be independently confirmed. As I recall the situation, the top physics journal, Physical Review Letters, commissioned an analysis by Berkeley physicist Arthur Rosenfeld. He counted all the experiments being done and all the various ways the data were being examined. Since large computers had just become available to help with the work, many more ways of looking at the data had become possible. Rosenfeld concluded that random effects would be expected to occur quite regularly at the loose publication standards being applied at the time.
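The logic of that kind of counting argument can be sketched as follows. The numbers are illustrative and are not Rosenfeld's actual figures, the function names are hypothetical, and the sketch assumes that every look at the data is statistically independent, which real analyses seldom are.

```python
def expected_false_positives(looks: int, alpha: float) -> float:
    """Average number of chance 'discoveries' among independent looks at noise."""
    return looks * alpha


def prob_at_least_one(looks: int, alpha: float) -> float:
    """Chance that at least one of the looks produces a fluke."""
    return 1.0 - (1.0 - alpha) ** looks


# alpha is the chance that a single look at pure noise passes the cut.
for alpha in (0.05, 0.002, 0.0001):
    for looks in (20, 1000):
        print(f"alpha={alpha}, looks={looks}: "
              f"expected flukes={expected_false_positives(looks, alpha):.2f}, "
              f"P(at least one)={prob_at_least_one(looks, alpha):.3f}")
```

With a loose cut, a fluke somewhere becomes close to certain once enough experiments and enough ways of slicing the data are in play; tightening the cut by orders of magnitude is what brings the expected number of flukes back below one.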
At Rosenfeld's suggestion, the journal then adopted a 99.99 percent significance criterion for the publication of any claim of discovery of a new particle or other extraordinary effect. This implied that only one out of 10,000 similar experiments would produce the observed effect as a statistical fluctuation. The recent Higgs reports failed this test and were not publishable, at least as a claim of discovery.
I am not suggesting that medicine and psychology adopt the one-in-ten-thousand standard, which is probably not achievable in those fields. However, they should be able to do better than one-in-twenty. I suggest that all fields of science make a clear distinction between ordinary and extraordinary claims. A paper providing evidence that an apple a day keeps the doctor away may be accepted for publication at a minimal standard. However, any claim that violates existing, well-established scientific knowledge is extraordinary and should require extraordinary evidence, including a very low probability of being a chance artifact.
In particular, many claims of alternative medicine are sufficiently extraordinary that a high publication standard should be applied to these. Examples include homeopathy, which violates the atomic theory of matter, and therapeutic touch, which relies on a human energy field that has never been detected. An especially egregious example was the 1999 report published in the Archives of Internal Medicine that claimed evidence that intercessory prayer speeded the recovery of hospitalized heart patients. This study barely met the 95 percent threshold, and even this significance estimate has been questioned. Certainly the claim here was extraordinary and should never have been published at such a low level of significance.