Summarizing evidence for systematic reviews
Health technology assessment
Clinical practice guidelines.
A unifying, transparent, and sensible system for rating the quality of a body of evidence and grading the strength of recommendations.
Added: Oct 27, 2025
Slides: 49 pages
Slide Content
GRADE AND GRADING OF EVIDENCE Dr Dipika Bansal, MD DM Clinical Research Unit, NIPER SAS Nagar
Grading of Recommendations, Assessment, Development and Evaluation (GRADE): a collaboration between the GRADE Working Group (https://www.gradeworkinggroup.org/), Cochrane, and members of McMaster University* (*https://community.cochrane.org/help/tools-and-software/gradepro-gdt). A unifying, transparent, and sensible system for rating the quality of a body of evidence and grading the strength of recommendations.
Implications: summarizing evidence for systematic reviews; health technology assessment; clinical practice guidelines.
Systematic Review Process Overview
GRADE key highlights: outcome-centric; qualitative evaluation; quantitative evaluation; transparency; documentation; rates the evidence pool, not individual studies.
What are we grading? Two components, the first being the quality of the body of evidence: the extent to which certainty in the estimate of effect is adequate to support a decision (high, moderate, low, very low).
GRADE: Recommendations. 1. Four categories of quality of evidence: high, moderate, low, and very low; based on the methodological quality of the evidence and the likelihood of bias; rated by outcome and across outcomes. 2. Two grades of recommendation: conditional (weak) or strong, for or against an intervention.
Determinants of quality (RCTs and observational studies). Five factors that lower quality: risk of bias; inconsistency or heterogeneity; indirectness (PICO applicability); imprecision (sample size and CI); publication bias. Three factors that increase quality: large magnitude of effect; all plausible confounding considered; dose-response gradient.
What are we grading? Certainty of evidence (are we sure?) and recommendation strength (are the consequences more desirable than undesirable?). Example: effectiveness of sulthiame as add-on therapy compared with placebo or another antiepileptic drug for the treatment of West syndrome; recommendation strength: strong for the intervention.
Balance of desirable and undesirable consequences. Desirable: lower risk of seizure, better QoL, less medication, fewer symptoms. Undesirable: more resource utilization, more ADRs, higher burden. Criteria in the evidence-to-decision (recommendation) framework: research evidence, panel's judgement, cost-effectiveness, resources required, acceptability, feasibility.
What are we grading? Two components. (1) Quality of the body of evidence: the extent to which confidence in the estimate of effect is adequate to support a decision (high, moderate, low, very low). (2) Strength of recommendation: strong or weak.
Summary of Findings. A summary of findings (SoF) table is a tabular presentation of key information about the relevant outcomes of alternative health care interventions. It presents information about the body of evidence, key numerical results, and a summary judgment about the certainty of the underlying evidence for each outcome. The SoF table has been chosen by the Cochrane Collaboration to present the main findings of a systematic review.
GRADE summary of findings: aimed at a broader audience (end users of systematic reviews and guidelines); concise; reports the magnitude of effect and the certainty (or quality) of evidence.
Evidence profile: a summary of the evidence for a given question. It presents relevant information about the body of evidence, key numerical results, a detailed quality assessment, and an explicit judgment of each factor that determines the quality. Used by guideline producers.
GRADE evidence profile: EVIDENCE PROFILE = DETAILED QUALITY ASSESSMENT + SUMMARY OF FINDINGS. For guideline developers and authors; comprehensive; gives key numerical results and an explicit judgment of each factor.
How to rate the evidence: GRADE's approach begins with the study design. Randomized controlled trials (RCTs) start as high-quality evidence and observational studies as low-quality evidence supporting estimates of intervention effects. Levels: High (starting point for randomised trials), Moderate, Low (starting point for observational studies), Very low. Evidence can be downgraded or upgraded from its starting level.
Determinants of quality/certainty of a body of evidence: 1. Limitations in detailed study design and execution (risk of bias criteria). 2. Inconsistency (or heterogeneity). 3. Indirectness (PICO and applicability). 4. Imprecision. 5. Publication bias.
Factors that increase the quality: large magnitude of effect; plausible residual bias (or confounding) that would work against the observed effect; dose-response gradient.
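The rating logic on these slides (start from the study design, move down for each serious concern, move up for each strengthening factor) can be sketched in a few lines of Python. This is only an illustration: the function name `rate_certainty` and the idea of passing total level counts are assumptions made for the example, and in practice each downgrade or upgrade is a judgment call rather than simple arithmetic.

```python
# Certainty levels from lowest to highest, as named on the slides.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting points by study design: RCTs start high, observational studies low.
START_INDEX = {"rct": 3, "observational": 1}


def rate_certainty(design, downgrades=0, upgrades=0):
    """Return the certainty level for one outcome (hypothetical helper).

    design     -- "rct" or "observational"
    downgrades -- total levels removed across the five lowering factors
    upgrades   -- total levels added across the three raising factors
    """
    index = START_INDEX[design] - downgrades + upgrades
    # The scale is bounded: never below "very low" or above "high".
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# RCT evidence with one serious concern (for example, imprecision).
print(rate_certainty("rct", downgrades=1))
# Observational evidence with a dose-response gradient.
print(rate_certainty("observational", upgrades=1))
```

Both calls land on "moderate", which shows why the final grade alone does not reveal the study design behind it; the evidence profile records the individual judgments.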
What Are the Components of Quality Assessment?
Risk of bias (methodological validity): low risk of bias ≈ more accurate estimates of effect. Bias refers to systematic errors that make results differ from the truth; aspects considered include study design, sample size, and statistical analyses. The risk of bias is assessed using tools such as RoB 2 and ROBIS: https://methods.cochrane.org/bias/resources/rob-2-revised-cochrane-risk-bias-tool-randomized-trials
Risk of bias table for RCTs (Cochrane Collaboration). Columns: Bias; Authors' judgement; Support for judgement. For each bias domain below, the authors' judgement is one of: low risk of bias, high risk of bias, unclear risk of bias.
Random sequence generation (selection bias)
Allocation concealment (selection bias)
Blinding of participants and personnel (performance bias)
Blinding of outcome assessment (detection bias)
Incomplete outcome data (attrition bias)
Selective reporting (reporting bias)
1. Risk of bias. Example: Deferasirox for managing transfusional iron overload in people with sickle cell disease (Review)
Risk of bias: outcome specific. Do not average risk of bias across the studies; evaluate the extent to which each trial contributes toward the estimate of the magnitude of effect.
2. Inconsistency (heterogeneity) between study results: variation in effect size (point estimates vary widely across studies); confidence intervals (CIs) show minimal or no overlap. Consistency (homogeneity): the degree of similarity in the effect sizes of different studies, with effect sizes in the same direction and within a narrow range. Assess visually (range of effect sizes) and statistically (Cochran's Q test or the I² statistic).
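As a rough illustration of the statistical check, the sketch below computes Cochran's Q and I² from per-study effect estimates and standard errors using fixed-effect inverse-variance weights. The function name and the input numbers are hypothetical; effects are assumed to be on an additive scale such as log odds ratios.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    effects    -- per-study effect estimates (e.g. log odds ratios)
    std_errors -- per-study standard errors, same order as effects
    """
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variation beyond what chance alone would give.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log odds ratios from four trials.
pooled, q, i2 = cochran_q_and_i2([0.10, 0.30, 0.35, 0.65], [0.12, 0.15, 0.20, 0.25])
print(f"pooled={pooled:.3f}  Q={q:.2f}  I2={i2:.1f}%")
```

Q is compared against a chi-squared distribution with (number of studies - 1) degrees of freedom; I² is usually read alongside the visual overlap of confidence intervals rather than on its own.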
Directness: the evidence includes PICO elements that adhere to the specified clinical question (generalizability, transferability, applicability). Example: ACTH for infantile spasms (0-1 yr.), considering population age in the studies. Direct: RCTs included infants (<1 year). Indirect: RCTs included older children or adults, or RCTs included a different type of seizure from the one in the research question.
Figure: Hierarchy of outcomes according to their patient-importance, for the effect of phosphate-lowering drugs in patients with renal failure and hyperphosphatemia. Importance of endpoints on a 1-9 scale: critical for decision making (mortality 9, myocardial infarction 8, fractures 7); important, but not critical for decision making (pain due to soft tissue calcification / function 6); of low patient-importance (flatulence). Surrogates of declining importance: coronary calcification, bone density, and soft tissue calcification (lower by one level for indirectness); Ca2+/P product (lower by two levels for indirectness).
Precision: are the confidence intervals wide or the studies underpowered? Precision is the degree of certainty; the CI expresses the range in which the truth plausibly lies. Precision is assessed by considering the following factors: wide confidence intervals; small number of events; small sample size.
Rate down for imprecision if the recommendation or clinical course of action would differ depending on whether the upper or the lower boundary of the CI represented the truth.
Optimal information size (OIS): the total number of patients included must meet the threshold sample size generated by a conventional sample size calculation. If it is less than the threshold, rate down (-1).
Optimal information size: if the total number of patients included in a systematic review is less than the number of patients generated by a conventional sample size calculation for a single adequately powered trial, consider rating down for imprecision.
Required sample size (assuming α of 0.05 and β of 0.2) for RRR of 20%, 25%, and 30% across varying control event rates. For example, if the best estimate of the control event rate was 0.2 and one specifies an RRR of 25%, the OIS is approximately 2,000 patients (GRADE guidelines 6, Journal of Clinical Epidemiology 64 (2011) 1283-1293).
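The OIS threshold can be approximated with a conventional two-group sample size formula for comparing proportions. The sketch below uses the standard normal-approximation formula with unpooled variances, which is an assumption; the cited guideline's figure may rest on a slightly different formula, so results will be close to the plotted values rather than identical. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def optimal_information_size(control_event_rate, rrr, alpha=0.05, beta=0.2):
    """Total patients (both arms) needed to detect a relative risk reduction.

    Normal-approximation formula for two proportions, two-sided alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    p_control = control_event_rate
    p_treated = p_control * (1 - rrr)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    n_per_group = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * ceil(n_per_group)


def rate_down_for_ois(total_patients, ois):
    """True if the pooled sample falls short of the OIS threshold."""
    return total_patients < ois


ois = optimal_information_size(control_event_rate=0.2, rrr=0.25)
print(ois)                           # roughly 1,800, near the slide's "approximately 2,000"
print(rate_down_for_ois(950, ois))   # a hypothetical review with 950 patients
```

For the slide's example (control event rate 0.2, RRR 25%) this formula gives a total a little above 1,800, in line with the approximate figure read from the published chart.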
4. Publication bias. Consider rating down if: (a) you find systematic reviews performed early, when only a few initial studies are available, which will overestimate effects when "negative" studies face delayed publication (early positive studies, particularly if small in size, are suspect); or (b) you find only small "positive" studies, mainly if sponsored by industry.
Figure: Funnel plot of odds ratio (x-axis) against standard error (y-axis). Symmetrical: no publication bias. (Egger M, Cochrane Colloquium Lyon 2001)
Figure: Funnel plot of odds ratio (x-axis) against standard error (y-axis). Asymmetrical: publication bias? (Egger M, Cochrane Colloquium Lyon 2001)
Visual assessment: significant or positive results are more likely to cluster on one side of the line, while non-significant or negative results may be sparse or scattered on the other side.
Statistical assessment. Egger's test: tests for a systematic relationship between the effect size estimates and their standard errors; a significant p-value (typically < 0.05) suggests the presence of publication bias. Begg's test: examines the correlation between the effect size estimates and their variances; a significant p-value indicates publication bias. Cumulative meta-analysis: shows how the effect size changes as studies are added one by one; a decrease or change in direction of the effect over the studies' chronology indicates publication bias.
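A minimal sketch of Egger's regression, assuming its usual formulation: the standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error), and an intercept far from zero points to funnel plot asymmetry. The function name and the data are hypothetical, and the sketch stops at the t statistic; turning it into a p-value needs a t distribution with the returned degrees of freedom, from a table or a statistics package.

```python
from math import sqrt


def egger_intercept(effects, std_errors):
    """Return (intercept, standard error of intercept, degrees of freedom)."""
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance of the ordinary least squares fit.
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_intercept = sqrt(residual_ss / df * (1.0 / n + x_mean ** 2 / sxx))
    return intercept, se_intercept, df


# Hypothetical log odds ratios in which smaller studies show larger effects.
effects = [0.10, 0.20, 0.45, 0.70, 0.90]
std_errors = [0.05, 0.10, 0.20, 0.30, 0.40]
intercept, se_int, df = egger_intercept(effects, std_errors)
t_stat = intercept / se_int if se_int > 0 else float("inf")
print(f"intercept={intercept:.2f}  t={t_stat:.2f}  df={df}")
```

With few studies the test has little power, so a non-significant result does not rule out publication bias.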
Dose-response relationship: the strength of a cause-and-effect link between an intervention or exposure and an outcome. E.g., studying the effect of ACTH on cessation of spasms, the extent of spasm cessation depends on the dose of ACTH. Low dose: slight spasm cessation. Moderate dose: somewhat greater spasm cessation. High dose: complete spasm cessation. If there is a clear and consistent dose-response relationship between an intervention and an outcome, it can provide stronger evidence for the effectiveness of that intervention.
Dose-response gradient: in childhood lymphoblastic leukemia, risk for CNS malignancies 15 years after cranial irradiation. No radiation: 1% (95% CI 0% to 2.1%); 12 Gy: 1.6% (95% CI 0% to 3.4%); 18 Gy: 3.3% (95% CI 0.9% to 5.6%).
Large magnitude of effect: the treatment being studied has a big and noticeable effect on improving or worsening a person's health. For example, ACTH produces an electroclinical response in 90% of patients, while placebo produces one in only 10%. A large magnitude of effect suggests that the treatment is very effective at achieving the desired outcome, making it more likely to be recommended for use in healthcare.
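One way to make "large" concrete is with the risk ratio thresholds commonly cited in GRADE guidance, which are not stated on this slide and are taken here as an assumption: a risk ratio above 2 or below 0.5 may justify rating up one level, and above 5 or below 0.2 two levels. The sketch reads the slide's 90% and 10% figures as response proportions, which is also an assumption; the function names are hypothetical.

```python
def risk_ratio(risk_treated, risk_control):
    """Ratio of the outcome proportion in the treated vs. control group."""
    return risk_treated / risk_control


def large_effect_upgrade(rr):
    """Suggested number of levels to rate up for magnitude of effect.

    Thresholds (an assumption, from general GRADE guidance):
    RR > 2 or < 0.5 gives 1 level; RR > 5 or < 0.2 gives 2 levels.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    # Treat protective and harmful effects symmetrically via the reciprocal.
    ratio = rr if rr >= 1 else 1.0 / rr
    if ratio > 5:
        return 2
    if ratio > 2:
        return 1
    return 0


rr = risk_ratio(0.90, 0.10)          # slide example: 90% vs 10% response
print(round(rr, 1), large_effect_upgrade(rr))
```

Rating up in this way is normally reserved for evidence with no serious concerns in the downgrading domains, so the function's output is a prompt for judgment, not a verdict.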
Plausible residual bias: different types of bias (errors) in a research study may work in opposite directions, making it challenging to determine the overall reliability of the study's findings. Selection bias: researchers accidentally select healthier participants for the group taking the new drug. Measurement bias: the equipment used in the study might not be very accurate. GRADE helps researchers and healthcare professionals weigh these biases and decide how much confidence to place in the study's results when making recommendations.
Strength of Evidence Grades (I): reflect a global assessment that takes the required domains directly into account and incorporates judgments about the additional domains as needed. They aim to provide "actionable" information for a variety of different users, readers, and stakeholders, and to be transparent in how the strength-of-evidence grades are reached.
Strength of Evidence Grades: for each comparison of interest, rate the strength of evidence for each major benefit (e.g., positive effects on health outcomes such as physical function or quality of life, or effects on laboratory measures or other surrogate variables) and each major harm (ranging from rare, serious, or life-threatening adverse events to common but bothersome effects). For both benefits and harms, focus on the outcomes most relevant to patients, clinicians, and policymakers.
References
Guyatt G. et al. GRADE guidelines: 1. Introduction: GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology 64 (2011) 383-394.
Balshem H. et al. GRADE guidelines: 3. Rating the quality of evidence. Journal of Clinical Epidemiology 64 (2011) 401-406.
Guyatt G. et al. for the GRADE Working Group. Rating quality of evidence and strength of recommendations: What is "quality of evidence" and why is it important to clinicians? BMJ. 2008 May 3;336(7651):995-8.
Nair, Abhijit S. "Publication bias - Importance of studies with negative results!" Indian Journal of Anaesthesia vol. 63,6 (2019): 505-507. doi:10.4103/ija.IJA_142_19
Neto et al. "Effects of photobiomodulation in the treatment of fractures: a systematic review and meta-analysis of randomized clinical trials." Lasers in Medical Science vol. 35,3 (2020): 513-522.
“Extraordinary claims require extraordinary evidence.” ― Carl Sagan