Randomized controlled trials

Analysis, interpretation, intended and unintended effects

Elena Dudukina, PhD, Department of Clinical Epidemiology, AU & AUH

2023-02-08

1

What is an RCT?

  • An RCT is an interventional study
  • The exposure status of the participants is randomly assigned and concealed
  • We track individuals over time to determine their outcomes (longitudinal study)

D.A. Grimes, K.F. Schulz (2002): An overview of clinical research: the lay of the land, The Lancet, Vol. 359 (9300), pp. 57-61

2

Why do we need RCTs?

  • An experimental design with random allocation of treatment is the only design that, when conducted properly, is expected to yield exchangeable groups (no measured or unmeasured confounding) and therefore supports causal conclusions
    • The observed risks of the outcome in the treated and untreated in an RCT are the closest we can get to the ideal counterfactual risks of the outcome in the treated and untreated

Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC

3

What is exchangeability?

  • When the groups of exposed and unexposed are exchangeable, the risk of the outcome in the treated group would have been the same as the risk of the outcome in the untreated group had the treatment been switched between the groups (M. Hernán)
  • “Exchangeability means people who are treated and untreated have the same probability of the outcome happening due to all mechanisms other than treatment” (E. Murray)

  • The RCT is the only study design for which exchangeability (the no-confounding assumption) is expected to hold (unlike observational research)

  • Exchangeable: 🐈🐈🐈🐈🐈🐈🐈🐈🐈🐈 vs 🐈🐈🐈🐈🐈🐈🐈🐈🐈🐈

  • Not exchangeable: 🐈🐈🐈🐈🐈🐈🐈🐈🐈🐈 vs 🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️
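
A minimal simulation sketch of why random allocation is expected to produce exchangeable groups. The prognostic factor ("bruised"), its 30% prevalence, and the 0.8 vs 0.4 treatment probabilities are assumptions chosen only for illustration; they are not from the lecture.

```python
import random

rng = random.Random(2023)  # fixed seed so the sketch is reproducible

# Hypothetical prognostic factor ("bruised"), present in about 30% of units.
n = 10_000
bruised = [rng.random() < 0.30 for _ in range(n)]

# Random allocation: a coin flip that ignores the prognostic factor.
random_arm = [rng.random() < 0.5 for _ in range(n)]

# Non-random allocation: bruised units are more likely to be treated
# (assumed probabilities 0.8 vs 0.4, for illustration only).
chosen_arm = [rng.random() < (0.8 if b else 0.4) for b in bruised]


def prevalence(flags):
    """Share of True values in a list of booleans."""
    return sum(flags) / len(flags)


def imbalance(arm):
    """Prevalence of the factor among treated minus among untreated."""
    treated = [b for b, a in zip(bruised, arm) if a]
    untreated = [b for b, a in zip(bruised, arm) if not a]
    return prevalence(treated) - prevalence(untreated)


random_imbalance = imbalance(random_arm)
chosen_imbalance = imbalance(chosen_arm)
print(f"random allocation:     {random_imbalance:+.3f}")
print(f"non-random allocation: {chosen_imbalance:+.3f}")
```

Under random allocation the factor is balanced up to chance, so the groups are comparable with respect to it. Under the non-random rule the treated group carries far more of the factor, which is the cats-vs-squirrels situation.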

4

Are these fruits exchangeable?

vs

  • Yes (both groups are red apples)
  • Fair comparison
6

Follow-up in RCTs

7

Mock RCT: Apples vs apples

  • RCT in the population of apples

  • Recruit apples with a set of inclusion and exclusion criteria
    • Fresh (grown in the most recent summer season)
    • Red
    • Sweet
    • Medium size (120 g)
    • N=200 (100 apples per treatment arm)
    • Grown locally
8

Design of the RCT

  • Exposure: different regimes of washing
  • Outcome: "time-to-spoiled" (max. 30 days)
  • Randomly assign apples to the treatment arm (washing with water + soap) or the standard-treatment arm (washing with water only)
    • Randomization by algorithm with a 1:1 allocation ratio
    • People who wash the apples cannot be blinded, but people who store the apples and follow them up for the outcome are blinded to how the apples were washed
  • Follow the apples up for the outcome
  • Is loss to follow-up possible?
  • The people who follow up the apples for the outcome are students
  • They do this during their lunch break
  • Some apples will be eaten before they have a chance to spoil (lost to follow-up)
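
The 1:1 random allocation of the 200 apples can be sketched as follows. This is one simple way to do it (shuffling a balanced list of arm labels); the function name, the apple identifiers, and the seed are hypothetical, and the lecture does not specify which algorithm is meant.

```python
import random


def randomize_1_to_1(unit_ids, seed):
    """Allocate units to two equally sized arms by shuffling a balanced label list."""
    if len(unit_ids) % 2 != 0:
        raise ValueError("an even number of units is needed for exact 1:1 allocation")
    half = len(unit_ids) // 2
    labels = ["water + soap"] * half + ["water only"] * half
    random.Random(seed).shuffle(labels)  # seeded so the allocation can be audited
    return dict(zip(unit_ids, labels))


# Hypothetical identifiers for the N=200 recruited apples.
apple_ids = [f"apple-{i:03d}" for i in range(1, 201)]
allocation = randomize_1_to_1(apple_ids, seed=42)

arm_sizes = {
    arm: list(allocation.values()).count(arm)
    for arm in ("water + soap", "water only")
}
print(arm_sizes)  # {'water + soap': 100, 'water only': 100}
```

Shuffling a balanced label list fixes the arm sizes at exactly 100 each. Independent coin flips per apple would also be random allocation, but they would give 1:1 only on average.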
10

Apples vs non-apples

  • In a non-interventional setting, epidemiologic studies are especially susceptible to bias, including confounding

  • Non-exchangeability → the study compares apples with non-apples and erroneously attributes the observed association to the effect of the exposure (washing with water alone vs with water + soap) on the outcome (time to spoiling), when confounding can explain the association

vs

  • Not a fair comparison
12

RCTs features

  • Random allocation to intervention groups

  • Both groups (active vs control) will be treated identically in all respects except for the intervention

  • Allocation concealment: personnel enrolling and randomizing patients are prevented from knowing or predicting the upcoming treatment allocation

  • Blinding: patients and investigators remain unaware of what treatment was given until the study is completed

    • Double blind studies are not always feasible/appropriate
    • Triple blinded studies: investigators, participating subjects and data analysts have no knowledge of the treatment allocation
  • Patients are often analysed according to the group to which they were allocated, irrespective of whether they actually received the intended intervention (intention to treat analysis, ITT)

14

Ethical and practical aspects of RCTs

  • Is it ethical to randomize?
  • Is it ethical to withhold treatment and randomize to placebo?
  • Is it possible to randomize lifestyle/behavior/weight or body mass index/gestational age for infants/having certain conditions?
  • Costly
  • Time-consuming
  • Highly selected population → will findings from the RCT generalize to the heterogeneous real-world (patient) population?
16

Why are allocation concealment & blinding important?

  • Directed acyclic graphs (DAGs) for RCTs

18
  • The path Z → Y is the effect of the randomization (randomized treatment) on the outcome Y
  • When evaluating the effect of Z, we evaluate the intention-to-treat effect

    • Patients are analysed within the group to which they were allocated, irrespective of whether they actually received the intended intervention
  • The path A → Y is the effect of the treatment actually received on the outcome

    • Per-protocol effect
    • Observational in nature, since the treatment actually received is determined after randomization
    • Confounding by L: L is associated with A and with Y and is not on the pathway from A to Y

19

Analyzing RCTs

  • Did someone find an error in the video lecture?
Total population, N | Random treatment assignment (Z), n | 28-day mortality (Y=1), n
1000 | Z=1 (treated), n=482 | 262
     | Z=0 (comparator/placebo), n=518 | 290
  • Risk difference (RD): risk(treated) − risk(comparator) = $\frac{262}{482} - \frac{290}{518} \approx 0.544 - 0.560 = -0.016$
  • Risk ratio (RR): risk(treated) / risk(comparator) = $\frac{262}{482} \big/ \frac{290}{518} \approx 0.97$
  • Small magnitude of effect
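
As a check on the arithmetic, the two effect measures can be recomputed from the counts in the table. This is a minimal sketch; the variable names are illustrative.

```python
# Counts from the table: deaths within 28 days and arm sizes by assignment Z.
deaths_treated, n_treated = 262, 482
deaths_comparator, n_comparator = 290, 518

risk_treated = deaths_treated / n_treated
risk_comparator = deaths_comparator / n_comparator

# Absolute scale: difference of the two risks (negative means fewer deaths with Z=1).
risk_difference = risk_treated - risk_comparator
# Relative scale: ratio of the two risks (below 1 means lower risk with Z=1).
risk_ratio = risk_treated / risk_comparator

print(f"risk (Z=1): {risk_treated:.3f}")         # 0.544
print(f"risk (Z=0): {risk_comparator:.3f}")      # 0.560
print(f"risk difference: {risk_difference:.3f}")  # -0.016
print(f"risk ratio: {risk_ratio:.2f}")            # 0.97
```

The risk difference is negative because the risk is lower in the arm assigned to treatment.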
20

Interpretation of the effects on absolute and relative scale

  • Percent relative effect: (1 − 0.97) = 0.03 = 3% decrease in risk on the relative scale

  • Risk difference

    • Among those who were assigned to receive treatment (Z=1), ~2 lives per 100 were saved compared with those who were assigned no treatment (Z=0)
  • Relative risk

    • Those who were assigned to receive treatment (Z=1) had 0.97 times the risk of dying compared with those who were assigned no treatment (Z=0)
    • Equivalently, individuals who were assigned to receive treatment (Z=1) had a 3% lower risk of dying, on the relative scale, than those who were assigned no treatment (Z=0)
21

Features of intention-to-treat (ITT) effect

  • Effect of randomization (Z)
  • Causal interpretation since no confounding expected
  • In an ideal world everyone adheres perfectly to the assigned treatment → the ITT effect is the same as the per-protocol effect

22

Per-protocol effect features

  • The effect of treatment (A) on the outcome (Y)
  • Distribution of the outcome in actually treated (A=1) vs actually untreated (A = 0) → “as-treated”
  • The treatment actually received no longer depends only on the randomization; it can also depend on patients' characteristics (L)

  • Patients' outcome is not independent of patients' characteristics (L)

  • Patients' characteristics (L) are confounders

    • Associated with the exposure
    • Associated with the outcome
    • Not on the causal pathway between the exposure and the outcome
  • When estimating the per-protocol effect, we need to adjust for L
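
One way to adjust for L is standardization over the distribution of L. The sketch below uses invented counts with a single binary characteristic L (for example, severe illness at baseline). It assumes L is the only confounder and is fully measured, which a real trial would need to justify.

```python
# Hypothetical counts: (L, A) -> (number of patients, number of deaths).
# A = treatment actually received, L = baseline characteristic (1 = severely ill).
counts = {
    (0, 1): (400, 40),
    (0, 0): (200, 40),
    (1, 1): (100, 30),
    (1, 0): (300, 180),
}
STRATA = (0, 1)


def crude_risk(a):
    """Risk among those who received A=a, ignoring L ("as-treated" comparison)."""
    patients = sum(counts[(stratum, a)][0] for stratum in STRATA)
    deaths = sum(counts[(stratum, a)][1] for stratum in STRATA)
    return deaths / patients


def standardized_risk(a):
    """Stratum-specific risks under A=a, weighted by the overall distribution of L."""
    total = sum(patients for patients, _ in counts.values())
    risk = 0.0
    for stratum in STRATA:
        stratum_size = counts[(stratum, 0)][0] + counts[(stratum, 1)][0]
        patients, deaths = counts[(stratum, a)]
        risk += (stratum_size / total) * (deaths / patients)
    return risk


crude_rr = crude_risk(1) / crude_risk(0)
adjusted_rr = standardized_risk(1) / standardized_risk(0)
print(f"crude risk ratio:        {crude_rr:.2f}")
print(f"standardized risk ratio: {adjusted_rr:.2f}")
```

In these invented data the severely ill are less likely to take the treatment and more likely to die, so the crude comparison exaggerates the benefit. The standardized risk ratio matches the ratio within each stratum of L.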

24

Loss-to-follow-up

  • Source of selection bias in RCTs
  • Washing apples with water and soap keeps apples fresh longer vs washing with water alone

  • Students who evaluate the apple outcomes occasionally eat the freshest-looking apples
    • Students are hungry, so fresher apples are more likely to be lost to follow-up
  • By doing so, they eat apples predominantly from the "water + soap" group (without knowing it)
  • How would this affect the final results?
26

Loss-to-follow-up

  • At the end of follow-up
  • Observed without accounting for loss to follow-up

  • RR (fresh=1) = 0.5/0.6 = 0.83
  • Apples washed with water and soap appear to have a 17% lower relative probability (10 percentage points lower absolute probability) of being fresh than apples washed with water alone
27

Loss-to-follow-up

  • At the end of follow-up
  • After accounting for loss to follow-up

  • RR (fresh=1) = 0.8/0.7 = 1.14
  • Apples washed with water and soap have a 14% higher probability (on the relative scale) of being fresh than apples washed with water alone
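
The reversal between the two analyses can be reproduced with simple counts. The numbers below are hypothetical, chosen so that the proportions match the ones quoted (0.5 vs 0.6 among the remaining apples, 0.8 vs 0.7 among all randomized apples). The sketch also assumes that every eaten apple would have stayed fresh until the end of follow-up.

```python
# Hypothetical counts per arm of 100 apples at the end of follow-up.
# "fresh" includes apples eaten by students (assumed to have stayed fresh).
arms = {
    "water + soap": {"fresh": 80, "spoiled": 20, "fresh_eaten": 60},
    "water only": {"fresh": 70, "spoiled": 30, "fresh_eaten": 25},
}


def proportion_fresh(arm, include_lost):
    """Proportion fresh among all randomized apples, or only among those not eaten."""
    fresh, spoiled = arm["fresh"], arm["spoiled"]
    if not include_lost:
        fresh -= arm["fresh_eaten"]  # eaten apples vanish from the analysis
    return fresh / (fresh + spoiled)


def risk_ratio(include_lost):
    """Ratio of proportions fresh, "water + soap" vs "water only"."""
    soap = proportion_fresh(arms["water + soap"], include_lost)
    water = proportion_fresh(arms["water only"], include_lost)
    return soap / water


rr_observed = risk_ratio(include_lost=False)
rr_full = risk_ratio(include_lost=True)
print(f"RR among apples still in the study: {rr_observed:.2f}")  # 0.83
print(f"RR among all randomized apples:     {rr_full:.2f}")      # 1.14
```

Because more fresh apples disappear from the "water + soap" arm, restricting the analysis to the remaining apples flips the direction of the estimate.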
28

Loss-to-follow-up

  • Differential loss to follow-up (informative censoring)
  • Can occur in any follow-up study, including RCTs
  • When present, the effect is estimated only among those who were not lost to follow-up → conditioning on remaining in the study → selection bias

29

Recap

  • Randomization secures exchangeability between the treatment arms

  • Intention-to-treat (ITT) effect is an effect of the randomization on the outcome

  • Per protocol analysis aims to investigate the effect of the actual treatment on the outcome

  • If adherence to the treatment is not perfect, the Intention-to-treat effect ≠ Per-protocol effect

  • Loss to follow-up can introduce selection bias into RCT results and needs to be addressed in the analysis

30

Break

31

Interpreting randomized controlled trials

  • Objective: valid and precise estimate of the effect of an exposure on the occurrence of an outcome

  • Exchangeability between treatment and control groups (no confounding)

32

Descriptive table (“Table 1”)

33

Accuracy of the effect estimate

  • The overall goal: precise + valid
  • Simulation example

34

Bias in epidemiologic research

  • Random error

    • Sampling (random) variability
    • Chance
  • Systematic error

    • Selection bias
    • Measurement (information) bias
    • Confounding (not expected in RCT)
    • Random non-exchangeability
35

Precision

  • No random error
  • Confidence interval width

36

Confidence interval (CI)

  • Estimate of the uncertainty due to random error alone
  • Interpretation of 95% CI
    • If many studies were conducted, 95% of the 95% confidence intervals computed from them would contain the true effect size
    • This holds only if all the assumptions used to compute the intervals are correct

37

95% CI

  • Across repeated studies, 95% of the 95% CIs contain the true effect parameter

Watch 95% CI simulation
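
The repeated-studies interpretation can be illustrated with a small simulation. It is a sketch under simplifying assumptions: each "study" estimates a mean from normally distributed data with a known standard deviation, and all parameter values are arbitrary choices for illustration.

```python
import math
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Assumed values for illustration only.
TRUE_MEAN, SIGMA, N_PER_STUDY, N_STUDIES = 10.0, 2.0, 50, 2000
Z_95 = 1.96  # normal quantile for a two-sided 95% interval

# With a known sigma, every study has the same interval half-width.
half_width = Z_95 * SIGMA / math.sqrt(N_PER_STUDY)

covered = 0
for _ in range(N_STUDIES):
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N_PER_STUDY)]
    estimate = sum(sample) / N_PER_STUDY
    # Does this study's 95% CI contain the true parameter?
    if estimate - half_width <= TRUE_MEAN <= estimate + half_width:
        covered += 1

coverage = covered / N_STUDIES
print(f"share of 95% CIs containing the true mean: {coverage:.3f}")
```

The share should land close to 0.95. Any single interval either contains the true value or does not; the 95% describes the procedure over many repetitions, and only when the model assumptions hold.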

38

Validity

  • No systematic error
  • Simulation example

39

Validity in studies

  • Internal validity

    • Validity of the inferences about the source population
  • External validity (= generalizability)

    • Validity of the inferences about people outside source population
  • Internal validity is a prerequisite for external validity

40

Null hypothesis significance testing

  • Not applicable for interpretation of the results of observational studies

  • H0 - null hypothesis (hypothesis of no association)

  • Risk difference is zero or the risk ratio is 1

  • Is a hypothesis of no association between two variables in a superpopulation

  • The groups we compare were sampled in a random fashion from a superpopulation

  • Is not about the observed study groups

41

Null hypothesis significance testing

  • The difference is not statistically significant means
    • One cannot reject the null hypothesis that the superpopulation groups do not differ
    • Does not mean that the two observed groups are the same

Rothman, K., Greenland, S., & Lash, T.L. (2008). Modern Epidemiology, 3rd Edition. Philadelphia, PA: Lippincott Williams & Wilkins.
https://www.nature.com/articles/d41586-019-00857-9

42

Type I and II errors

  • Type I error: rejecting a null hypothesis that is true (false positive)
  • Type II error: failing to reject a null hypothesis that is false (false negative)

45

What a p-value is not

46

What is a p-value?

  • The p-value is the probability that a test statistic (computed from the data) would be greater than or equal to its observed value, assuming that the test hypothesis is correct and all other assumptions hold

  • Results are claimed to be “significant” or “not significant” according to whether the p-value is less than or greater than an arbitrary cutoff value, usually 0.05, which is called the alpha level of the test
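
As an illustration of the definition, the sketch below computes a two-sided p-value for the 28-day mortality counts used earlier (262/482 vs 290/518). The choice of a pooled two-proportion z-test with a normal approximation is an assumption made here; the lecture does not name a specific test.

```python
import math


def two_proportion_z_test(events_1, n_1, events_0, n_0):
    """Pooled z-test of equal risks; returns (z, two-sided p-value)."""
    risk_1, risk_0 = events_1 / n_1, events_0 / n_0
    pooled = (events_1 + events_0) / (n_1 + n_0)  # common risk under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_0))
    z = (risk_1 - risk_0) / se
    # P(|Z| >= |z|) for a standard normal Z, i.e. a statistic at least as extreme.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p_value = two_proportion_z_test(262, 482, 290, 518)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

The p-value comes out at roughly 0.6. If the null hypothesis and the test assumptions were true, a difference at least this large would be unremarkable.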

48

P-value

  • The observed difference can be statistically significant

  • When the model used to compute it is wrong

    • Bias
  • Due to chance

    • 0.05 alpha level → statistically significant difference 5% of the time if the null hypothesis is correct
    • the alpha cutoff point is arbitrary
  • Dichotomization of study results based on p-values is harmful

    • Cherry-picking of “significant” results
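
The "5% of the time" statement can be checked by simulating many trials in which the null hypothesis is true by construction. The true risk, arm size, number of trials, and the pooled z-test are all assumptions made for this sketch.

```python
import math
import random


def two_sided_p_value(events_1, events_0, n_per_arm):
    """Pooled two-proportion z-test with equal arm sizes (normal approximation)."""
    pooled = (events_1 + events_0) / (2 * n_per_arm)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation in the outcome, so no evidence against H0
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = (events_1 - events_0) / n_per_arm / se
    return math.erfc(abs(z) / math.sqrt(2))


rng = random.Random(11)  # fixed seed so the sketch is reproducible
TRUE_RISK, N_PER_ARM, N_TRIALS, ALPHA = 0.5, 200, 2000, 0.05


def simulate_events():
    """Number of events in one arm; both arms share the same true risk (H0 holds)."""
    return sum(rng.random() < TRUE_RISK for _ in range(N_PER_ARM))


significant = sum(
    two_sided_p_value(simulate_events(), simulate_events(), N_PER_ARM) < ALPHA
    for _ in range(N_TRIALS)
)
false_positive_share = significant / N_TRIALS
print(f"share of null trials with p < {ALPHA}: {false_positive_share:.3f}")
```

Even though no trial has a real effect, about one in twenty crosses the 0.05 threshold. Reporting only those trials is the cherry-picking problem.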
49

How to think about p-value?

  • A small p-value → the data are unusual if all the assumptions used to compute the test statistic (including the null hypothesis) were correct

  • When a study is large, very minor effects or small assumption violations can lead to "statistically significant" results of the null hypothesis tests

Rothman, K., Greenland, S., & Lash, T.L. (2008). Modern Epidemiology, 3rd Edition. Philadelphia, PA: Lippincott Williams & Wilkins.
https://www.nature.com/articles/d41586-019-00857-9
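
The point about large studies can be made concrete by holding a very small effect fixed and changing only the sample size. The risks (0.50 vs 0.51), the arm sizes, and the pooled z-test are hypothetical choices for this sketch.

```python
import math


def two_sided_p_value(risk_1, risk_0, n_per_arm):
    """Pooled two-proportion z-test for observed risks in two equally sized arms."""
    pooled = (risk_1 + risk_0) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = (risk_1 - risk_0) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Same one-percentage-point difference, observed in trials of different size.
RISK_TREATED, RISK_COMPARATOR = 0.50, 0.51

p_small_trial = two_sided_p_value(RISK_TREATED, RISK_COMPARATOR, 1_000)
p_large_trial = two_sided_p_value(RISK_TREATED, RISK_COMPARATOR, 1_000_000)

print(f"n = 1,000 per arm:     p = {p_small_trial:.2f}")
print(f"n = 1,000,000 per arm: p = {p_large_trial:.1e}")
```

The effect size is identical in both cases; only the precision differs. The p-value alone therefore says little about whether an effect is large enough to matter.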

50

Example

  • All-cause 28-day mortality was lower in patients receiving treatment #1 (35%) than in those receiving treatment #2 (43%)
  • Risk difference: −8.5% (95% CI, −18.2% to 1.2%)
  • Hazard ratio (approximates incidence rate ratio): 0.75
  • 95% CI of hazard ratio: 0.55-1.02
  • p-value: 0.06
  • Why did the authors conclude no beneficial effect of treatment #1 vs treatment #2?
  • What do you conclude?
53

Recap

  • Table 1: balance is desirable but may not be perfect

  • Accuracy = Validity + Precision

  • Precision = no random error

  • Validity = no systematic error

  • Selection bias + information bias

  • Internal validity is a prerequisite for external validity

  • Interpret the RCT results concentrating on the validity and precision, but not p-values

54

Break

55

Intended and unintended effects in RCTs

Intended effects

  • Efficacy
  • Beneficial effects of the interventions/treatments
  • Primary outcomes (end points)
  • Should be explicitly indicated in the report of an RCT
56

Intended and unintended effects in RCTs

Unintended effects

  • Harms
  • Adverse events
  • Undesired effects of interventions/treatments
57

Unintended effects

  • “Harms should always be viewed as important whether they are labelled primary or secondary” (CONSORT statement)

  • Existence and nature of adverse effects

  • Withdrawal of participants due to an adverse event → loss-to-follow-up (selection bias)

  • Often unexpected and unpredictable

  • Harms-related stopping of an RCT

58

Reporting

  • CONSORT (Consolidated Standards of Reporting Trials)

  • Separately reporting anticipated and unexpected adverse events

  • For each study arm

  • The absolute risk of each adverse event, including recurrent

  • Number of participants withdrawn due to harms

  • Interpretation

    • Results, balancing benefits and harms
    • Considering other evidence
    • Systematic review and meta-analysis
  • Benefit-risk balance

  • Not all reported adverse events are necessarily caused by the intervention

59

Example

60

After the RCT

  • Completely unpredictable adverse effects → hard to study in RCT

  • Post-authorization safety study (PASS) (non-interventional)

  • Carried out after a medicine has been authorized to obtain data on safety

61

Recap

  • Intended effects in RCTs are the beneficial effects of the interventions/treatments

  • Intended effect → efficacy

  • Unintended effects in RCTs are undesired effects, or harms

    • Expected/predictable
    • Unexpected/unpredictable
  • Adverse effects can lead to loss-to-follow-up (selection bias) in RCT

  • Importance of reporting

  • Unpredictable adverse effects → Post-authorization safety study (PASS) (non-interventional)

62
