The hierarchy of evidence

Updated: Oct 8

From public health guidelines to population-specific recommendations, this information should be evidence-based and, more precisely, drawn from the best evidence available. To understand what the best evidence is, we first need to understand which study designs are best, that is, which ones produce the most accurate reflection of the true effect.

The hierarchy of evidence is a grading system applied to study designs, based on the quality of data each design produces. It helps categorise studies according to which design is best capable of answering a question. It can be a helpful tool to refer to when doing some independent research into whether a nutritional claim is valid or not. I should add, this is more relevant for suspicious diet/supplement claims you see advertised online than for official public health guidelines, but it's nonetheless important to make sure recommendations stem from good evidence.

There are limitations to the hierarchy of evidence, as will be discussed today, but again, it is a helpful reference point provided you keep these limitations in mind.

  • A classic hierarchy of evidence pyramid

At the higher end of the pyramid are randomised experimental designs, while lying towards the bottom are study designs that are observational and normally retrospective in nature. The value of a study depends on the accuracy of the results produced from each design. Let's take a closer look!

Meta-analysis of randomised controlled trials (RCTs)

A meta-analysis of RCTs is considered the gold standard because it pools data from different RCTs, providing a bigger sample size and a more accurate representation of the true effect of the therapy/intervention. It's a statistical analysis of the best available data. There are limitations which need to be considered when reviewing a meta-analysis. If one pools together 10 RCTs, of which 7 demonstrate a positive effect and 3 show no positive effect, this leads to large variation in the overall results and wide confidence intervals. If this is the case, we can assume there are variables that have not been accounted for in the RCTs themselves, causing inconsistencies in the results.
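To make the pooling idea a little more concrete, here is a minimal Python sketch. It is an illustration only, not a substitute for proper meta-analysis software. It assumes ten hypothetical trials with made-up effect sizes and standard errors (seven showing a benefit, three showing none). The function name pool_trials and every number are invented for this example. It pools the trials with inverse-variance weighting, computes Cochran's Q and I² as measures of inconsistency, and then uses the DerSimonian-Laird random-effects approach, which is one common way of letting that inconsistency widen the confidence interval.

```python
import math


def pool_trials(effects, std_errors):
    """Pool trial effects with inverse-variance weights (illustrative only)."""
    k = len(effects)
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)

    # Fixed-effect estimate: assumes every trial measures the same true effect
    fixed = sum(w * e for w, e in zip(weights, effects)) / sum_w
    fixed_se = math.sqrt(1.0 / sum_w)

    # Cochran's Q and I^2 describe how inconsistent the trials are
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-trial variance (tau^2)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects estimate: the extra variance widens the interval
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_re = sum(re_weights)
    random_est = sum(w * e for w, e in zip(re_weights, effects)) / sum_re
    random_se = math.sqrt(1.0 / sum_re)

    def ci95(estimate, se):
        return (estimate - 1.96 * se, estimate + 1.96 * se)

    return {
        "fixed": fixed,
        "fixed_ci": ci95(fixed, fixed_se),
        "random": random_est,
        "random_ci": ci95(random_est, random_se),
        "q": q,
        "i2": i2,
    }


# Made-up data: 7 trials with a positive effect, 3 with no effect
effects = [0.5] * 7 + [0.0] * 3
std_errors = [0.1] * 10

result = pool_trials(effects, std_errors)
print(f"Pooled effect (fixed):  {result['fixed']:.2f}")
print(f"I^2 (inconsistency):    {result['i2']:.0f}%")
print(f"Fixed-effect 95% CI:    {result['fixed_ci']}")
print(f"Random-effects 95% CI:  {result['random_ci']}")
```

With these invented numbers the pooled effect is 0.35 and I² comes out at roughly 83%, which would usually be read as substantial inconsistency between trials. The random-effects interval is noticeably wider than the fixed-effect one, which is the "wide confidence intervals" problem described above.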

It's important to take a thorough look at the study inclusion criteria of a meta-analysis as well as the studies themselves. If half the studies included in the analysis have significant limitations in their design or results, then the overarching results from the meta-analysis will largely be flawed. A key criterion when seeking good quality reviews and analyses is making sure the studies included in the analysis answer the same question. We will be discussing the common flaws in nutritional methodologies in a later blog post, so do keep an eye out!


The reason RCTs are highly valued in research is the randomisation process itself. Randomisation is a method to help reduce bias and confounding factors that could influence the true treatment effect. Within a sample population, there will be known and unknown variables that act as confounding factors. Randomisation, provided the sample size is large enough, should distribute these variables evenly between the arms of the study. Observational and retrospective studies are unable to account for these biases and unknown confounding factors in the same way.
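A quick simulation can show why sample size matters here. This is a minimal sketch under stated assumptions: each participant carries a hypothetical unmeasured confounder (imagine a baseline activity score, drawn here from a normal distribution with made-up parameters), participants are shuffled into two arms, and we measure how different the arms end up on that confounder. The function names randomise and arm_imbalance are invented for this example.

```python
import random
import statistics


def randomise(participants, rng):
    """Shuffle participants and split them into two equal-sized arms."""
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def arm_imbalance(n, rng):
    """Absolute difference between arms in the mean of a hidden confounder."""
    # Hypothetical unmeasured confounder; the researcher never sees this
    confounder = [rng.gauss(50, 10) for _ in range(n)]
    treatment, control = randomise(confounder, rng)
    return abs(statistics.mean(treatment) - statistics.mean(control))


rng = random.Random(42)  # fixed seed so the run is repeatable

# Average imbalance across 200 simulated trials of each size
small_trials = statistics.mean([arm_imbalance(20, rng) for _ in range(200)])
large_trials = statistics.mean([arm_imbalance(2000, rng) for _ in range(200)])

print(f"Average imbalance with 20 participants:   {small_trials:.2f}")
print(f"Average imbalance with 2000 participants: {large_trials:.2f}")
```

In the larger trials the two arms should end up far closer on the hidden variable, even though nobody measured it. That is the property observational designs can't reproduce.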

How the design of a study accounts for these biases determines the overall validity of the study. There are scenarios in which a study reports a significant or non-significant effect for which bias may be responsible, leading to an overestimation or underestimation of the association between the intervention and the outcome. Moreover, the statistical significance test itself does not account for these biases in the methodology. Unfortunately, it's not uncommon for people to interpret the results without taking into account the methodology and potential biases, leading to misreporting of the evidence.

Observational studies

At the lower end of the hierarchy are studies that are observational and retrospective in nature. These include case reports and cross-sectional studies in which data from a population is analysed and correlations are drawn between exposure and outcomes. For example, take a population of type 2 diabetics: participants' lifestyle factors are analysed, including diet, exercise and alcohol consumption. This data is then pooled, and common traits prevalent within this population are correlated with the outcome (type 2 diabetes).

The limitation of these studies is that it is very difficult to isolate a causative variable, as there are so many confounding factors that are not accounted for, leading to an overestimation or underestimation of the true effect. Despite these limitations, they do have their use in research. If strong correlations are consistently observed, this can indicate a meaningful effect from an exposure. This effect can then be tested in a more controlled and rigorous environment, like an RCT. Moreover, observational studies can also identify potential long-term side effects of an intervention that can't be picked up in many RCTs. Observational studies have an unfair reputation in research, and their value should not be underestimated.
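Here is a small sketch of how a confounder can create a correlation out of nothing. Everything in it is simulated and hypothetical: a single confounder drives both the exposure and the outcome, while the exposure itself has no effect on the outcome at all. The helper names pearson and partial_corr are written for this example. The "adjusted" figure uses the standard partial correlation formula, which is only possible because, in this toy world, we happen to have measured the confounder.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after adjusting for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical confounder that influences both exposure and outcome
confounder = [rng.gauss(0, 1) for _ in range(n)]
exposure = [z + rng.gauss(0, 1) for z in confounder]
# Note: the outcome depends on the confounder only, never on the exposure
outcome = [z + rng.gauss(0, 1) for z in confounder]

crude = pearson(exposure, outcome)
adjusted = partial_corr(exposure, outcome, confounder)

print(f"Crude correlation (confounder ignored): {crude:.2f}")
print(f"Adjusted correlation (confounder known): {adjusted:.2f}")
```

The crude correlation should land around 0.5 despite there being no real effect, and it should shrink towards zero once the confounder is adjusted for. In a real observational study the troublesome confounders are often the ones nobody measured, so that adjustment isn't available.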

The lower end of the pyramid

Right at the bottom of the pyramid are expert opinions & mechanistic studies. The placement of expert opinion is understandable, considering it's an author's opinion carrying their own biases based on their own experience, with no control of any confounding factors. Mechanistic evidence is often misinterpreted by public figures, as there are numerous occasions where mechanistic evidence is not replicated in human outcome data. As an example, there is mechanistic evidence to suggest that too much polyunsaturated fat intake has detrimental effects on inflammation; however, all human outcome evidence suggests otherwise, and in fact, polyunsaturated fat intake has numerous health benefits, potentially including a reduction in inflammation. Mechanistic evidence can be considered a complement to human outcome data, helping to find out how an effect happens.

Of course, the best evidence will present significant human outcome data with mechanistic evidence to back it up, but whether mechanisms are present or absent, human outcome data should be prioritised.

To summarise...

It's important to critically analyse all study designs, and not assume that just because a meta-analysis is at the top of the pyramid, it must provide better data than a single RCT or a well-constructed observational or mechanistic study. All pros and cons should be considered when using evidence from different study designs. It's also important to read through other aspects of the methodology, like the inclusion and exclusion criteria, treatment allocation, blinding, follow-up, research environment and outcome measures, to get a better idea of whether the study is valid and provides data that reflects the true effect. After all, each statistical model is simply estimating how well the results produced from the study reflect the truth.
