HARKing: Hypothesizing After the Results are Known

By Norbert L. Kerr

Summary

This paper by psychologist Norbert Kerr discusses the emerging practice in science of asserting a hypothesis after an experiment has concluded (post hoc). In science, it has generally been accepted that a hypothesis should be advanced before any results are known (a priori). The paper coins the term HARKing, which stands for “Hypothesizing After the Results are Known”. Kerr explains why scientists HARK and why they should not.

Kerr stresses that his critique of HARKing does not apply to induction, the derivation of a general principle from several observed instances. Induction is a form of post hoc reasoning, whereas HARKing is the presentation of a post hoc hypothesis as if it had been conceived a priori.

Scientists have always been trained to state their hypothesis a priori and then devise an experiment to test it. In recent years, textbooks on science communication have taught a second option: publishing what makes the most sense after you have seen the results.

Another long-established practice, reporting negative findings, was challenged by Bem (1987), who argued that in some cases, such as exploratory research, readers do not care about your hypothesis. They are only interested in your findings.

Kerr describes five different types of HARKing:

Pure HARKing: Present the most compelling framework that explains your results, regardless of whether it was plausible or anticipated before the study.
Pure HARKing + Straw Man: As before, but with additional failed hypotheses intended to show a winner among competing hypotheses.
Suppress Loser Hypotheses: Only publish hypotheses seen as plausible both a priori and post hoc. Failed hypotheses are suppressed, even if they were initially plausible.
Post Hoc Plausibility + Necessity of Anticipation: This expands the Suppress Loser type to include any hypothesis that was anticipated a priori, even if it was not initially judged plausible.
Empirical Inspiration: Publish any plausible, anticipated hypotheses, but add any additional hypotheses that are plausible after the research has been conducted.

How widespread is the problem of HARKing? Kerr points out the following circumstantial evidence:

Many papers elicit skepticism when they add a “too-convenient qualifier” to explain their results. (“We saw a higher effect in males, as we expected.”)
A theory that seems too good to be true: a remarkably strong result is found even though existing theory could have made completely different predictions.
A study design that cannot adequately test its stated hypothesis, which suggests the hypothesis was supplied after the fact to cover for the design.

This circumstantial evidence is also backed up by data from surveys of scientists who have observed HARKing among colleagues. Pure HARKing and Empirical Inspiration were the most commonly found types. The surveys showed that Pure HARKing and Suppress Losers were frowned upon, but Empirical Inspiration was recommended as strongly as traditional a priori hypotheses.

Scientists have clear incentives to HARK. Confirmatory results are typically more desirable than a result that disconfirms a theory, thus making it a “bad theory”. Positive findings are more likely to be published. And an author who admits to adding a new hypothesis to their paper will likely be asked to redo their study. Readers are less interested in inconclusive results and failed hypotheses. They want to know what works. HARKing is also more likely to produce a good story with a “happy ending”, such as the confirmation of a “good theory”.

Given how widespread HARKing is, it is important to know what its harms are. First, it increases the rate of false positive results (Type I errors). Compounding this is a bias against replication studies, which could uncover these false positives. These errors are easier to make than they are to fix. Second, Kerr argues that prediction is better than accommodation, which means constructing a theory to fit the data. A prediction can be falsified, whereas an accommodation cannot (in the short term, at least).
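The first harm can be illustrated with a small simulation. This is only a sketch under stated assumptions, not an analysis from Kerr's paper: every simulated study has no true effect, measures 10 independent outcomes in two groups of 20, and tests each with a two-sided z-test at the 0.05 level. The function names and all parameter values are hypothetical choices made for the illustration. The "a priori" researcher tests only the outcome named in advance, while the "HARKed" researcher looks at all outcomes and presents whichever one reached significance as the hypothesis.

```python
import random
from statistics import NormalDist, mean


def p_value_two_group(n, rng):
    """Two-sided z-test p-value for two groups drawn from the same N(0, 1) population."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    # Variance is assumed known (1 in each group), so the standard error is sqrt(2 / n).
    z = (mean(a) - mean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_studies=2000, n_outcomes=10, n=20, alpha=0.05, seed=0):
    """Return (a priori rate, HARKed rate) of 'significant' findings when no effect exists."""
    rng = random.Random(seed)
    a_priori_hits = 0
    harked_hits = 0
    for _ in range(n_studies):
        p_values = [p_value_two_group(n, rng) for _ in range(n_outcomes)]
        # A priori: only the first outcome was hypothesized before the study.
        if p_values[0] < alpha:
            a_priori_hits += 1
        # HARKed: any significant outcome is written up as "the" hypothesis.
        if min(p_values) < alpha:
            harked_hits += 1
    return a_priori_hits / n_studies, harked_hits / n_studies


a_priori_rate, harked_rate = false_positive_rates()
print(f"a priori false positive rate: {a_priori_rate:.3f}")
print(f"HARKed false positive rate:   {harked_rate:.3f}")
```

Under these assumptions the a priori rate should land near the nominal 0.05, while the HARKed rate should land near 1 - 0.95^10, roughly 0.40. The exact figures depend on the random seed and on the assumed number of independent outcomes.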

Kerr argues that HARKing is unethical, even though guidelines published by scientific authorities do not forbid it. It violates a scientist’s fundamental duty to report their work honestly. HARKing also breeds cynicism when scientists realize how far the actual practice of science falls from its ideals.

HARKing also results in fewer discoveries. If scientists ignore disconfirmatory results, they might miss the type of serendipitous result that so often leads to breakthroughs. Theories are made worse by narrowly fitting hypotheses to observed data, as opposed to robust, generalizable theories made by predictive hypotheses. It also reduces the incentive to generate alternative hypotheses.

Kerr summarizes his points on the costs of HARKing:

Turns false positives into “hard-to-eradicate theories”.
Proposes theories which cannot be disproven.
Disguises post hoc explanations, which tend to be less useful, as a priori ones.
Leaves out important information about what failed.
Takes unjustified statistical license, which inflates the chance of significant findings.
Presents an inaccurate model of science to students.
Encourages “fudging” in other gray areas.
Makes us less receptive to serendipitous discoveries.
Encourages narrow, context-bound theories.
Encourages retention of too-broad, disconfirmable old theory.
Reduces generation of alternative hypotheses.
Violates basic ethics of science.

Some argue that science is self-correcting, so we should not worry too much about the long-term damage caused by HARKing. But Kerr does not believe we should rely on the resilience of science. The damage could be cumulative and it erodes the culture of science.

HARKing is a difficult problem to eradicate, but Kerr presents several remedies:

More education for scientists in the areas of research methods and philosophy of science.
Address this practice in professional guidelines and codes of conduct.
Make it a basis for rejection of an article.
Alter the incentives in the publication process to include a greater emphasis on negative or disconfirmatory findings. Encourage, or even require, replications of studies.
Exploratory research and post hoc hypotheses should be explicitly identified as such.

Kerr concludes by dispelling “the illusion that we gain something of value by insisting that authors pretend to have ‘known it all along’.” HARKing costs science more than whatever benefits it purports to deliver.
