Conducted June 12, 2024, this inaugural meeting of the BSI Journal Club looked at “Effect of exercise training for five years on all cause mortality in older adults—the Generation 100 study: randomised controlled trial.” The paper was published in October of 2020 in The British Medical Journal.


A video recording of the event will be available soon.


Bob Kaplan’s meeting recap:

At our first journal club, I discussed the paper “Effect of exercise training for five years on all cause mortality in older adults—the Generation 100 study: randomised controlled trial.”

The goal of The Journal Club is to build a solid tool kit of skills for decoding medical literature. When you watch or attend The Journal Club, you will improve your ability to analyze and interpret the research literature.

The following is a short debrief. 

The background and purpose

“The sceptics of using exercise as medicine argue that the literature lacks high-quality evidence on the effect of exercise on hard endpoints,” write the investigators in an accompanying editorial. “Indeed, whether physical activity actually reduces mortality has never been tested in a randomized setting, nor have the effects of different exercise intensities. With these challenges in mind, The Cardiac Exercise Research Group at the Norwegian University of Science and Technology launched the Generation 100 study in 2011. The Generation 100 study is the longest and largest randomized controlled exercise trial evaluating the effect of supervised exercise training versus physical activity recommendations on mortality in older adults.”

The hypothesis

In this study, the primary hypothesis was that “systematic exercise training [moderate intensity continuous training group (MICT) and high-intensity interval training (HIIT) group] lowers all cause mortality compared with giving advice to follow the national guidelines for physical activity [control group].”

Just above the results section of the Abstract is this: “Main outcome measure All cause mortality. An exploratory hypothesis was that HIIT lowers mortality more than MICT.”

The result

“All cause mortality did not differ between the control group and combined MICT and HIIT group.” However, the investigators reportedly observed a lower all cause mortality trend after HIIT compared with controls and MICT.

The interpretation

“Despite no associated effect on all cause mortality in the exercise groups combined, the findings that HIIT was associated with a reduced risk of mortality compared with MICT suggest that performing high intensity physical activity should have a key role in physical activity prevention programmes.

[. . .] 

“The central implication [of the physical activity guideline recommendations] is that either shorter duration vigorous physical activity or longer duration moderate physical activity or a combination of the two, that amount to the same amount of work each week, will have the same favourable health outcomes, with vigorous physical activity being the time efficient alternative. The physical activity guidelines have not been tested in large long term prospective randomised clinical trials, and information about their effect in older adults is lacking. We suggest that future guidelines for physical activity, at least for older adults, should be more specific in requiring that at least part of the physical activity should be performed at high intensity.”


The “Gen100 study” (my abbreviation) had several limitations, briefly described below.

Intention-to-treat (ITT) analysis. ITT is considered the gold standard for analyzing and reporting results in RCTs. According to ITT principles: (1) participants must be analyzed in the treatment groups to which they were assigned, regardless of the treatments they actually received, and (2) investigators must include outcome data for all participants, regardless of whether they were lost to follow-up, dropped out of the study, or died. For example, in a 10-year placebo-controlled RCT testing the effect of a caffeine pill on weight loss, if a participant assigned to the caffeine pill accidentally received the placebo instead and died three years into the trial, investigators must (1) attribute his outcome to the caffeine group, and (2) proceed as though he had not died, estimate how much weight he would have lost had he completed the trial, and include that estimate in the analysis. If they exclude the participant instead, their study is biased, according to the ITT principle. (The Gen100 study used an ITT analysis.)

The ITT switcheroo. This happens when the answer to the question of assignment is given as though it were the answer to the question of adherence. An ITT result tells you what happened to people who were assigned to a treatment, not what happens to people who actually follow it. This is fraud, according to Gerard Dallal, a Senior Scientist in Biostatistics at Tufts. If you’ve read several RCTs without being familiar with ITT analysis and how it works, you won’t believe how many times this has occurred in front of your eyes, or what its implications are. The Gen100 study is a classic case.
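To make the assignment-versus-adherence distinction concrete, here is a minimal Python sketch. The participants, arm names, and outcomes are invented for illustration and are not Gen100 data; the helper `mortality_by` is hypothetical. Grouping by the assigned arm gives the ITT answer, while grouping by what people actually did gives an "as-treated" answer that no longer benefits from randomisation.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    assigned: str  # arm chosen by randomisation
    received: str  # what the participant actually did
    died: bool


def mortality_by(participants, key):
    """Return {arm: death rate}, grouping by the attribute named in `key`."""
    totals, deaths = {}, {}
    for p in participants:
        arm = getattr(p, key)
        totals[arm] = totals.get(arm, 0) + 1
        deaths[arm] = deaths.get(arm, 0) + int(p.died)
    return {arm: deaths[arm] / totals[arm] for arm in totals}


# Invented toy data: two people in the exercise arm did not exercise,
# and one person in the control arm exercised anyway.
participants = [
    Participant("exercise", "exercise", False),
    Participant("exercise", "exercise", False),
    Participant("exercise", "control", True),
    Participant("exercise", "control", False),
    Participant("control", "control", True),
    Participant("control", "control", False),
    Participant("control", "exercise", False),
    Participant("control", "control", False),
]

# ITT: analyze by assignment, whatever the participant actually did.
itt = mortality_by(participants, "assigned")

# As-treated: analyze by behaviour. This answers a different question
# and is open to the biases randomisation was meant to remove.
as_treated = mortality_by(participants, "received")

print("ITT:", itt)
print("As-treated:", as_treated)
```

In this toy data the ITT comparison shows no difference between arms, while the as-treated comparison appears to favour exercise. Reporting the first number as if it answered the second question (or the reverse) is the switcheroo.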

The healthy volunteer effect (HVE). An insidious form of selection bias that plagues most of observational epidemiology, but can also occur in RCTs. People who are health-conscious and engage in activities that are beneficial for them differ in many ways from people who do not, often compromising the ability to determine true effects or generalize the results. The Gen100 study was flooded by the HVE, which renders any generalization of the result to the general population untenable.

Contamination. The unintentional exposure of participants in one study group to the intervention or conditions intended for another group, which interferes with reaching proper conclusions from the trial’s data. In the Gen100 study, each of the three arms was contaminated with both of the comparison arms.
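A rough way to see why contamination matters is to treat each arm’s observed risk as a blend of the risk with and without the intervention. The sketch below does that arithmetic in Python. Every number in it is invented, the function name `observed_risk` is hypothetical, and it assumes the intervention has the same effect on everyone, which a real trial would not guarantee.

```python
def observed_risk(risk_exposed, risk_unexposed, fraction_exposed):
    """Risk seen in an arm where only `fraction_exposed` actually got the intervention."""
    return fraction_exposed * risk_exposed + (1 - fraction_exposed) * risk_unexposed


# Invented risks: 4% mortality with the intervention, 8% without it.
RISK_EXPOSED = 0.04
RISK_UNEXPOSED = 0.08

# Clean trial: everyone in the intervention arm is exposed, nobody in control is.
clean_difference = (
    observed_risk(RISK_EXPOSED, RISK_UNEXPOSED, 0.0)
    - observed_risk(RISK_EXPOSED, RISK_UNEXPOSED, 1.0)
)

# Contaminated trial: 60% of the intervention arm is exposed,
# and 50% of the control arm does the intervention on its own.
contaminated_difference = (
    observed_risk(RISK_EXPOSED, RISK_UNEXPOSED, 0.5)
    - observed_risk(RISK_EXPOSED, RISK_UNEXPOSED, 0.6)
)

print(f"Clean difference:        {clean_difference:.3f}")
print(f"Contaminated difference: {contaminated_difference:.3f}")
```

Under these assumptions the difference between arms shrinks to a tenth of its clean value, so a real effect could easily look like no effect at all.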

HARKing. Hypothesizing after the results are known. Sometimes investigators use the results of their study to generate and test new hypotheses in the same study, and present the results as if the hypothesis had been made before the study started. The finding that HIIT was associated with a reduced risk of mortality compared with MICT in the Gen100 study was produced from HARKing. What’s more remarkable is that they used this hypothesis-generating observation to suggest the physical activity guidelines should be rewritten. For the p-value snobs, this post hoc observation was not even statistically significant.

Confounding by prior treatment. Prior exposure can obscure the true effects of the intervention under investigation, making it difficult to attribute outcomes solely to the trial treatment. About 90% of all participants in the Gen100 study were getting regular exercise before the study even began, perhaps for decades.

Active control group. An active control group is one in which participants engage (or are asked to engage) in some task during the intervention period. The only thing deliberately controlled for in the Gen100 study was access to voluntary supervised exercise offered by the investigators. However, both the control group and the treatment groups could do as much supervised exercise as they wanted independently. If the control group is effectively assigned to the same treatment as the treatment group, the trial should be treated as an uncontrolled study.

These limitations are by no means exclusive to the Gen100 study, especially in lifestyle intervention trials. If there’s one thing to take away from this journal club: you need to understand the ITT principle.