The GRADE approach (Grading of Recommendations Assessment, Development and Evaluation) is a method of assessing the certainty in evidence (also known as quality of evidence or confidence in effect estimates) and the strength of recommendations in health care.[1] It provides a structured and transparent evaluation of the importance of outcomes of alternative management strategies, acknowledgment of patients' and the public's values and preferences, and comprehensive criteria for downgrading and upgrading certainty in evidence. It has important implications for those summarizing evidence for systematic reviews, health technology assessments, and health guidelines, as well as for other decision makers.[2]
Background and history
The GRADE Working Group began in 2000 as a collaboration of methodologists, guideline developers, biostatisticians, clinicians, public health scientists and other interested members. The group developed and implemented a common, transparent and sensible approach to grading the quality of evidence (also known as certainty in evidence or confidence in effect estimates) and strength of recommendations in healthcare.[3][4] GRADE follows a documented process for developing its guidance and other articles.[5]
GRADE official articles, guidance group, project groups and centers, networks, and formalization of Evidence-to-Decision frameworks[6]
As GRADE adoption expanded, the need for sustained methodological support and capacity building became apparent. Starting in 2010, the first GRADE Centers and Networks were established. These entities supported training, implementation, and feedback from diverse contexts, helping to ensure consistent application while allowing for contextual adaptation.[7]
During the same period, the DECIDE Project, funded by the European Commission, played a central role in formalizing Evidence-to-Decision (EtD) frameworks. DECIDE supported the development and testing of EtD frameworks for different types of decisions, the publication of EtD guidance articles, and the implementation of EtD frameworks within GRADEpro. This work transformed EtD from an early concept into a standardized and operational component of GRADE.[8]
Governance, methodological stewardship, and the development of GRADE guidance[5]
As GRADE matured and its applications expanded across clinical medicine, public health, diagnostics, and health systems, the need for formal methodological stewardship and governance became increasingly apparent. What had initially functioned as an informal working group required clearer structures to ensure coherence, transparency, and consistency in how new methodological developments were proposed, debated, approved, and disseminated.[5]
The GRADE Guidance Group
In response to this need, the GRADE Guidance Group (often referred to as G3) was established in the early 2010s as the core governance body of the GRADE Working Group. The Guidance Group provides strategic oversight and methodological stewardship for GRADE. Under its chair (currently Holger Schünemann), the group's responsibilities include:[5]
- setting priorities for methodological development,
- reviewing and approving proposals for new guidance or major updates,
- ensuring consistency across guidance documents,
- safeguarding the conceptual integrity of GRADE, and
- coordinating across the growing number of contributors, centers, and networks.
The creation of the Guidance Group marked an important transition in GRADE's evolution, from a predominantly informal collaboration to a self-governing methodological enterprise. The Guidance Group does not replace the broader GRADE Working Group; rather, it provides structure and continuity while maintaining GRADE's collaborative and consensus-driven ethos. Current members (as of 2025) are Elie Akl, Sue Brennan, Philipp Dahm, Marina Davoli, Monica Hultcrantz, Miranda Langendam, Joerg Meerpohl, Reem Mustafa, Ignacio Neumann, Holger Schünemann, Nicole Skoetz, and Jun Xia.[7]
GRADE Project Groups
At that time, substantive methodological advances within GRADE began to be developed through GRADE Project Groups. These groups are convened to address specific methodological questions or gaps, for example, how to apply GRADE to animal research, rare diseases, public health interventions, health systems decisions, or how to assess domains such as imprecision, publication bias, equity, or values.[5]
Project groups are typically multidisciplinary and international, bringing together methodologists, content experts, and end users. Their work commonly involves:[5]
- reviewing existing methods and frameworks,
- conducting empirical or conceptual methodological work,
- testing proposals in real guideline or decision-making contexts, and
- iteratively refining approaches through discussion and application.
Project groups operate under the oversight of the GRADE Guidance Group, which reviews proposals, monitors progress, and evaluates final outputs before endorsement. This structure has allowed GRADE to scale methodologically without fragmenting into competing or incompatible approaches.[5]
Formalization of GRADE guidance and concept articles[5]
As the volume and diversity of GRADE-related publications increased, the Working Group recognized the importance of clearly distinguishing official GRADE guidance from conceptual discussions, applications, or commentaries. In response, a formal article was published describing the processes by which GRADE Guidance and GRADE Concept articles are developed, reviewed, and approved.[5]
That article clarified, among other points:
- the distinction between GRADE Guidance articles, which provide authoritative, endorsed methodological instructions, and GRADE Concept articles, which explore ideas, extensions, or emerging areas without yet constituting formal guidance;
- the role of the GRADE Guidance Group in approving guidance proposals and final manuscripts;
- expectations regarding transparency, documentation of methods, and consensus-building; and
- the importance of linking guidance development to real-world testing and application.
This formalization was a critical step in maintaining trust and clarity as GRADE became widely used. It helped readers, guideline developers, and organizations understand which publications represent official GRADE methods, which are exploratory or developmental, and how new guidance evolves from concept to endorsed standard.[5]
GRADE components
The GRADE approach classifies recommendations that follow from an evaluation of the evidence as strong or weak. A recommendation to use or not use an option (e.g. an intervention) should be based on the trade-offs between the desirable consequences of following the recommendation on the one hand and the undesirable consequences on the other. If desirable consequences outweigh undesirable consequences, decision makers will recommend an option, and vice versa. The uncertainty associated with the trade-off between the desirable and undesirable consequences determines the strength of recommendations.[9] The criteria that determine this balance of consequences are listed in Table 2. Furthermore, the approach provides decision-makers (e.g. clinicians, other health care providers, patients and policy makers) with a guide to using those recommendations in clinical practice, public health and policy. To achieve simplicity, the GRADE approach classifies the quality of evidence in one of four levels: high, moderate, low, and very low.
Certainty of evidence
GRADE rates the certainty of evidence as follows:[6]
| High | There is high confidence that the true value of the estimate of interest is at one side of a threshold of interest or within a specific range. |
| Moderate | There is moderate confidence that the true value of the estimate of interest is at one side of a threshold of interest or within a certain range. The true value of the estimate may deviate slightly from the target of the certainty rating (i.e. may possibly fall in a different range). |
| Low | There is low confidence that the true value of the estimate of interest is at one side of a threshold of interest or within a certain range. The true value of the estimate may deviate from the target of the certainty rating (i.e. likely fall in a different range). |
| Very low | There is very low confidence that the true value of the estimate of interest is at one side of a threshold of interest or within a certain range. The true value of the estimate may deviate significantly from the target of the certainty rating (i.e. probably fall in a different range). |
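The four levels form an ordered scale, and the downgrading and upgrading mentioned in the introduction can be pictured as movement along that scale. The following is a minimal sketch in Python, not part of GRADE or GRADEpro. It assumes that each concern moves the rating by a whole number of levels and that the result is clamped to the four-level scale. The names `Certainty` and `rate_certainty` are hypothetical, and the starting level is left to the caller because in practice it is a judgment about the body of evidence.

```python
from enum import IntEnum


class Certainty(IntEnum):
    """The four GRADE certainty levels as an ordered scale."""
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


def rate_certainty(start: Certainty, levels_down: int = 0, levels_up: int = 0) -> Certainty:
    """Move a starting rating down and up, clamped to the scale.

    Assumption for illustration only: real GRADE ratings come from
    structured judgments, not from arithmetic on level counts.
    """
    if levels_down < 0 or levels_up < 0:
        raise ValueError("levels_down and levels_up must be non-negative")
    raw = int(start) - levels_down + levels_up
    # Clamp so the result never leaves the four-level scale.
    clamped = max(int(Certainty.VERY_LOW), min(int(Certainty.HIGH), raw))
    return Certainty(clamped)


if __name__ == "__main__":
    # Evidence that starts high and is rated down by two levels.
    print(rate_certainty(Certainty.HIGH, levels_down=2).name)
```

Clamping reflects that the scale has no level below very low or above high, so further concerns cannot push a rating off either end.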
The GRADE working group has developed a software application that facilitates the use of the approach, allows the development of summary tables and contains the GRADE handbook. The software is free for non-profit organizations and is available online.[10] The GRADE approach to assess the certainty in evidence is widely applicable, including to questions about diagnosis,[11][12] prognosis,[13][14] network meta-analysis[15] and public health.[16]
Strength of recommendation
Factors and criteria that determine the direction and strength of a recommendation:
| Factor and criteria* | How the factor influences the direction and strength of a recommendation |
|---|---|
| Problem (this factor can be integrated with the balance of the benefits and harms and burden) | The problem is determined by the importance and frequency of the health care issue that is addressed (burden of disease, prevalence or baseline risk). If the problem is of great importance, a strong recommendation is more likely. |
| Values and preferences | This describes how important health outcomes are to those affected, how variable they are and if there is uncertainty about this. The less variability or uncertainty there is about values and preferences for the critical or important outcomes, the more likely is a strong recommendation. |
| Quality of the evidence | The confidence in any estimate of the criteria determining the direction and strength of the recommendation will determine if a strong or conditional recommendation is offered. However, the overall quality that is assigned to the recommendation is that of the evidence about effects on population-important outcomes. The higher the quality of evidence the more likely is a strong recommendation. |
| Benefits and harms and burden | This requires an evaluation of the absolute effects of both the benefits and harms and their importance. The greater the net benefit or net harm the more likely is a strong recommendation for or against the option. |
| Resource implications | This describes how resource intense an option is, if it is cost-effective and if there is incremental benefit. The more advantageous or clearly disadvantageous these resource implications are the more likely is a strong recommendation. |
| Equity (this factor is often addressed under values and preferences, and frequently also includes resource considerations) | The greater the likelihood to reduce inequities or increase equity and the more accessible an option is, the more likely is a strong recommendation. |
| Acceptability (this factor can be integrated with the balance of the benefits and harms and burden) | The greater the acceptability of an option to all or most stakeholders, the more likely is a strong recommendation. |
| Feasibility (this factor includes considerations about values and preferences, and resource implications) | The more feasible an option is to implement, the more likely is a strong recommendation. |
\* Factors for which overlap is described are often not shown separately in a decision table.
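The basic logic described under GRADE components, in which the balance of consequences sets the direction and the uncertainty about that balance sets the strength, can be illustrated with a deliberately simplified sketch. It assumes the desirable and undesirable consequences have already been summarized as two comparable numbers and the uncertainty as a single yes/no judgment. GRADE itself relies on panel judgment across all the factors in the table, not on a formula. The names `Recommendation` and `sketch_recommendation` are hypothetical, as is the handling of an even balance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    direction: str  # "for", "against", or "none"
    strength: str   # "strong" or "conditional"


def sketch_recommendation(desirable: float, undesirable: float,
                          trade_off_uncertain: bool) -> Recommendation:
    """Toy illustration of direction and strength, not a GRADE algorithm."""
    if desirable > undesirable:
        direction = "for"
    elif undesirable > desirable:
        direction = "against"
    else:
        # Assumption: an even balance gives no clear direction.
        return Recommendation("none", "conditional")
    # Uncertainty about the trade-off weakens the recommendation.
    strength = "conditional" if trade_off_uncertain else "strong"
    return Recommendation(direction, strength)


if __name__ == "__main__":
    print(sketch_recommendation(3.0, 1.0, trade_off_uncertain=False))
    print(sketch_recommendation(1.0, 3.0, trade_off_uncertain=True))
```

In a real Evidence-to-Decision process, each factor in the table would feed into the judgment about the balance and its uncertainty; the sketch collapses that into three inputs only to keep the relationship visible.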
Usage
Over 100 organizations (including the World Health Organization,[17] the UK National Institute for Health and Care Excellence (NICE), the Canadian Task Force for Preventive Health Care, the Colombian Ministry of Health and Social Protection,[citation needed] and the Saudi Arabian Ministry of Health[18]) have endorsed and/or are using GRADE to evaluate the quality of evidence and strength of health care recommendations.[citation needed]
Criticism
The use of the GRADE approach has been criticized when it is applied to summarize evidence from nutritional science and from studies of dietary, lifestyle, and environmental exposures. The criticism rests on the perception that the GRADE system allows only randomized controlled trials (RCTs) to be rated as high-certainty evidence, but this perception is incorrect.[19] Evidence from non-randomized studies may be rated as high certainty if the studies take measures to control for confounding.[19]
References
- ^ Schünemann, HJ; Best, D; Vist, G; Oxman, AD (2003). "Letters, numbers, symbols, and words: How best to communicate grades of evidence and recommendations?". Canadian Medical Association Journal. 169 (7): 677–80.
- ^ Guyatt, GH; Oxman, AD; Vist, GE; Kunz, R; Falck-Ytter, Y; Alonso-Coello, P; Schünemann, HJ (2008). "GRADE: an emerging consensus on rating quality of evidence and strength of recommendation". BMJ. 336 (7650): 924–26. doi:10.1136/bmj.39489.470347.ad. PMC 2335261. PMID 18436948.
- ^ Guyatt, GH; Oxman, AD; Schünemann, HJ; Tugwell, P; Knottnerus, A (2011). "GRADE guidelines: A new series of articles in the Journal of Clinical Epidemiology". Journal of Clinical Epidemiology. 64 (4): 380–382. doi:10.1016/j.jclinepi.2010.09.011. PMID 21185693.
- ^ "GRADE home". Gradeworkinggroup.org. Retrieved 16 August 2019.
- ^ a b c d e f g h i j Schünemann, Holger J.; Brennan, Sue; Akl, Elie A.; Hultcrantz, Monica; Alonso-Coello, Pablo; Xia, Jun; Davoli, Marina; Rojas, Maria Ximena; Meerpohl, Joerg J.; Flottorp, Signe; Guyatt, Gordon; Mustafa, Reem A.; Langendam, Miranda; Dahm, Philipp (July 2023). "The development methods of official GRADE articles and requirements for claiming the use of GRADE – A statement by the GRADE guidance group". Journal of Clinical Epidemiology. 159: 79–84. doi:10.1016/j.jclinepi.2023.05.010.
- ^ a b Neumann, Ignacio; Schünemann, Holger (December 15, 2025). The GRADE Book. GRADE Working Group.
- ^ a b "The GRADE Working Group". The GRADE Working Group. December 15, 2025. Retrieved December 15, 2025.
- ^ Alonso-Coello, Pablo; Schünemann, Holger J; Moberg, Jenny; Brignardello-Petersen, Romina; Akl, Elie A; Davoli, Marina; Treweek, Shaun; Mustafa, Reem A; Rada, Gabriel; Rosenbaum, Sarah; Morelli, Angela; Guyatt, Gordon H; Oxman, Andrew D; the GRADE Working Group (2016-06-28). "GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction". BMJ. 353: i2016. doi:10.1136/bmj.i2016. ISSN 1756-1833.
- ^ Andrews, J; Guyatt, GH; Oxman, AD; Alderson, P; Dahm, P; Falck-Ytter, Y; Nasser, M; Meerpohl, J; Post, PN; Kunz, R; Brozek, J; Vist, G; Rind, D; Akl, EA; Schünemann, HJ (2013). "GRADE guidelines: 15. Going from evidence to recommendations: the significance and presentation of recommendations". Journal of Clinical Epidemiology. 66 (7): 719–725. doi:10.1016/j.jclinepi.2012.03.013. PMID 23312392.
- ^ "GRADEpro". Gradepro.org. Retrieved 16 August 2019.
- ^ Schünemann, HJ; Oxman, AD; Brozek, J; Glasziou, P; Jaeschke, R; Vist, G; Williams, J; Kunz, R; Craig, J; Montori, V; Bossuyt, P; Guyatt, GH (2008). "GRADEing the quality of evidence and strength of recommendations for diagnostic tests and strategies". BMJ. 336 (7653): 1106–1110. doi:10.1136/bmj.39500.677199.ae. PMC 2386626. PMID 18483053.
- ^ Brozek, JL; Akl, EA; Jaeschke, R; Lang, DM; Bossuyt, P; Glasziou, P; Helfand, M; Ueffing, E; Alonso-Coello, P; Meerpohl, J; Phillips, B; Horvath, AR; Bousquet, J; Guyatt, GH; Schünemann, HJ (2009). "Grading quality of evidence and strength of recommendations in clinical practice guidelines: part 2 of 3. The GRADE approach to grading quality of evidence about diagnostic tests and strategies". Allergy. 64 (8): 1109–16. doi:10.1111/j.1398-9995.2009.02083.x. PMID 19489757. S2CID 8865010.
- ^ Iorio, A; Spencer, FA; Falavigna, M; Alba, C; Lang, E; Burnand, B; McGinn, T; Hayden, J; Williams, K; Shea, B; Wolff, R; Kujpers, T; Perel, P; Vandvik, PO; Glasziou, P; Schünemann, H; Guyatt, G (2015). "Use of GRADE for assessment of evidence about prognosis: rating confidence in estimates of event rates in broad categories of patients". BMJ. 350: h870. doi:10.1136/bmj.h870. PMID 25775931.
- ^ Spencer, FA; Iorio, A; You, J; Murad, MH; Schünemann, HJ; Vandvik, PO; Crowther, MA; Pottie, K; Lang, ES; Meerpohl, JJ; Falck-Ytter, Y; Alonso-Coello, P; Guyatt, GH (2012). "Uncertainties in baseline risk estimates and confidence in treatment effects". BMJ. 345: e7401. doi:10.1136/bmj.e7401. PMID 23152569.
- ^ Puhan, MA; Schünemann, HJ; Murad, MH; Li, T; Brignardello-Petersen, R; Singh, JA; Kessels, AG; Guyatt, GH (2014). "A GRADE Working Group approach for rating the quality of treatment effect estimates from network meta-analysis". BMJ. 349: g5630. doi:10.1136/bmj.g5630. PMID 25252733.
- ^ Burford, BJ; Rehfuess, E; Schünemann, HJ; Akl, EA; Waters, E; Armstrong, R; Thomson, H; Doyle, J; Pettman, T (2012). "Assessing evidence in public health: the added value of GRADE". J Public Health. 34 (4): 631–5. doi:10.1093/pubmed/fds092. PMID 23175858.
- ^ "GRADEpro". Gradepro.org. Retrieved 16 August 2019.
- ^ "The Saudi Center for Evidence Based Healthcare (EBHC) - Clinical Practice Guidelines". 2016-02-25. Archived from the original on 2016-02-25. Retrieved 2021-02-19.
- ^ a b Schünemann, Holger J.; Cuello, Carlos; Akl, Elie A.; Mustafa, Reem A.; Meerpohl, Jörg J.; Thayer, Kris; Morgan, Rebecca L.; Gartlehner, Gerald; Kunz, Regina; Katikireddi, S Vittal; Sterne, Jonathan; Higgins, Julian PT; Guyatt, Gordon (July 2019). "GRADE guidelines: 18. How ROBINS-I and other tools to assess risk of bias in nonrandomized studies should be used to rate the certainty of a body of evidence". Journal of Clinical Epidemiology. 111: 105–114. doi:10.1016/j.jclinepi.2018.01.012. PMC 6692166. PMID 29432858.