PAN Data — Epigenetic Age & Vitamin D

Author

Ishaan Ranjan

Introduction

There is growing evidence that vitamin D supplementation is associated with slower “biologic” aging—the age reflected by molecular biomarkers rather than the number of birthdays on the calendar. In this analysis I’ll use data from the Precision Aging Network (PAN) to examine three established DNA‑methylation–based “epigenetic clocks”—Horvath 2013, AltumAge, and DunedinPACE.

Background/Tools

  • Epigenetic clocks: DNA‑methylation signatures that act as molecular clocks. We use three complementary models:

    • Horvath 2013 — 353 CpGs, pan‑tissue; returns an epigenetic age in years. Age acceleration = EpigenAge − ChronAge.
    • AltumAge — ≈21 000 CpGs, deep‑learning model trained on >140 data sets; interpreted like Horvath.
    • DunedinPACE — 173 CpGs; outputs the pace of aging (1.00 = average, >1 = faster, <1 = slower).
  • Why three clocks? Horvath and AltumAge give a snapshot of cumulative aging, while DunedinPACE acts like a speedometer, telling us how rapidly aging processes are unfolding now. Using all three offers both level and rate perspectives on biological aging.

  • PAN dataset: Community‑dwelling older adults with phenotypic and lifestyle data including supplement use.

Quick Interpretation Cheat Sheet

| Clock | Output | “Higher” value implies |
| --- | --- | --- |
| Horvath 2013 | years (epigenetic age) | Biological age older than calendar age (accelerated aging) |
| AltumAge | years (epigenetic age) | Same as Horvath 2013 but with higher precision |
| DunedinPACE | ratio (bio‑years per chrono‑year) | Aging faster than average (> 1.0) |

  • Key R packages: tidyverse for data wrangling, table1 for publication‑ready descriptive tables, rms for regression modeling, and ggplot2 and ggpubr for displaying data.
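
As a quick illustration of the cheat sheet, the base R sketch below computes age acceleration (EpigenAge − ChronAge) and labels a DunedinPACE value relative to 1.00. The data frame `toy` and its three participants are invented for illustration and are not PAN data; only the column names mirror those used later in this analysis.

```r
# Toy data: three hypothetical participants (values invented for illustration)
toy <- data.frame(
  age         = c(60, 65, 70),        # chronological age in years
  horvath2013 = c(58.5, 66.0, 73.2),  # Horvath epigenetic age in years
  dunedinpace = c(0.95, 1.00, 1.08)   # biological years per calendar year
)

# Age acceleration = epigenetic age - chronological age
# (positive means "older" than calendar age)
toy$horvath_accel <- toy$horvath2013 - toy$age

# DunedinPACE is already a rate, so it is read against the reference value 1.00
toy$pace_label <- ifelse(toy$dunedinpace > 1, "faster",
                         ifelse(toy$dunedinpace < 1, "slower", "average"))

print(toy)
```

The same subtraction would apply to AltumAge, since it is also reported in years.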

Literature Review

1.  Vitamin D & Telomere Preservation in the VITAL Trial

Source: Medical News Today, May 28 2025

Vitamin D’s anti-aging promise first caught popular attention in a secondary analysis of Harvard’s large VITAL randomized trial, recently profiled by Medical News Today.

  • Design & cohort. VITAL enrolled 25,871 generally healthy U.S. adults (women ≥ 55 y; men ≥ 50 y) who were randomized to 2,000 IU/day vitamin D3, 1 g/day marine omega-3s, both, or placebo for roughly five years. In a nested lab sub-study (~1,000 participants; >2,500 leukocyte samples), investigators measured telomere length at baseline, 2 years, and 4 years.

  • Key finding. Participants receiving vitamin D showed minimal telomere shortening, whereas placebo groups exhibited the expected attrition. Omega-3s alone had no significant effect. Translating the telomere signal, the authors estimate vitamin D may confer the biological equivalent of ~3 years less aging over four years.

  • Sub-group nuances. Benefits were strongest in non-White participants and in those not taking statins; BMI did not modify the effect.

  • Limitations. The sub-study was post hoc, its participants were older and mostly White, and a substantial share of samples was missing by year 4, so findings are hypothesis-generating rather than definitive.

  • Relevance to our project. Although telomere length is a cruder marker than DNA-methylation clocks, the VITAL data reinforce the plausibility that vitamin D can slow cellular aging—aligning with our epigenetic-clock hypothesis and justifying its inclusion as a mechanistic link.

2.  DO-HEALTH Bio-Age Trial: Epigenetic-Clock Outcomes after 3 Years

Source: Bischoff-Ferrari et al. Nature Aging (2025)

DO-HEALTH is the largest factorial RCT to date testing daily vitamin D (2,000 IU), omega-3s (1 g), and a home exercise program (SHEP) in 2,157 European adults aged ≥ 70. A Swiss Bio-Age sub-study (n = 777) assayed four next-generation DNA-methylation clocks at baseline and 3 years.

  • Headline results.

    • Omega-3 alone significantly slowed PhenoAge, GrimAge2 and DunedinPACE.

    • Additive benefit: combining omega-3 with vitamin D and exercise yielded a further reduction in PhenoAge (–0.24 to –0.32 SD ≈ 2.9–3.8 months younger).

    • Vitamin D alone did not reach significance on any clock but amplified omega-3 when combined.

  • Clinical context. The same three-way regimen previously lowered prefrailty (–39 %) and invasive cancer (–61 %) in the parent trial, suggesting molecular and clinical signals converge.

  • Strengths. Randomized, double-blind design; standardized fasting morning blood draws; multi-clock approach captures different biological pathways.

  • Limitations. Only two time points (baseline, 3 y) raise measurement-error concerns; Swiss subset may not generalize to less-healthy or non-European elders; clock changes were modest in magnitude.

  • Implications for our study. DO-HEALTH’s epigenetic evidence supports our choice of Horvath-style clocks and underscores the potential synergy between vitamin D and co-interventions. It also cautions that vitamin D effects may surface only in concert with other lifestyle factors—an angle we can test with interaction terms in our PAN dataset.
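
One way to probe such synergy is an interaction term in a linear model. The sketch below is illustrative only: it runs on simulated data, uses base R `lm()` rather than the rms functions used later, and `omega3` is a hypothetical co-intervention indicator that is not among the PAN variables selected in this analysis.

```r
set.seed(42)
n <- 200

# Simulated data; `omega3` is a hypothetical column, not part of the PAN subset
sim <- data.frame(
  age       = rnorm(n, mean = 65, sd = 7),
  vitamin_d = factor(sample(c("no", "yes"), n, replace = TRUE)),
  omega3    = factor(sample(c("no", "yes"), n, replace = TRUE))
)
sim$dunedinpace <- 1 + 0.002 * (sim$age - 65) + rnorm(n, sd = 0.1)

# vitamin_d * omega3 expands to both main effects plus their interaction
fit_int <- lm(dunedinpace ~ age + vitamin_d * omega3, data = sim)

# The interaction row tests whether the vitamin D effect differs by omega-3 use
coef_table <- summary(fit_int)$coefficients
print(round(coef_table, 4))
```

Because the simulated outcome ignores both supplements, any apparent interaction here is noise; the point is only the model syntax.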

Research Question

Is self‑reported vitamin D supplementation associated with lower (“younger”) values in any of the three epigenetic clock measures after accounting for chronological age?

Creating Analysis Dataset

Below we reproduce the setup and data‑loading chunks from the original file, with annotations explaining what each step does.

Data Preparation

The code chunk below reads the PAN dataset (saved as keys2025data.csv), keeps the variables needed for the analysis, and drops any participant missing age, the stratification variable vitd_supplement (renamed vitamin_d), or any of the three clock values. This mimics the filtering step in the diabetes tutorial, where rows with missing outcome data were excluded.

Code
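# Load packages (assumes they are installed): rms for datadist()/ols()/Predict(),
# table1 for descriptive tables, tidyverse for data wrangling
library(rms)
library(table1)
library(tidyverse)

# Read PAN dataset and create analysis subset
raw_pan <- read_csv("keys2025data.csv")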

# Keep only variables required for analysis, renaming on the fly
pan <- raw_pan %>%
  select(
    hml_id,
    age,
    vitamin_d  = vitd_supplement,   # renamed here
    dose_mg,
    horvath2013,
    altumage,
    dunedinpace,
    sex,
    education = edu_yrs_hml,       # renamed here
    bmi,
    health_medical_cancer
  ) %>%
  drop_na(age, vitamin_d, horvath2013, altumage, dunedinpace)  # drop rows missing age, exposure, or any clock

# Recode the cancer indicator (0/1) as a labelled factor
pan$cancer <- factor(pan$health_medical_cancer,
                     levels = c(0, 1),
                     labels = c("no", "yes"))

# Save subset for reproducibility
write_csv(pan, "pan_subset.csv")

# Reshape from wide to long (one row per participant per clock) for optional analyses
pan_long <- pan %>%
  pivot_longer(
    cols = c(horvath2013, altumage, dunedinpace),
    names_to = "clock",
    values_to = "epigenetic_age"
  )

# rms needs the predictor distributions stored with datadist() before
# Predict() and the rms plotting functions can be used
pand <- datadist(pan)
options(datadist='pand')

Table 1

Table 1 describes the study cohort stratified by vitamin‑D supplementation status (“no” vitamin D vs. “yes” vitamin D) and for the overall sample.
What to look for: Compare the means/medians in each column. If vitamin‑D users systematically differ (older age, higher educational attainment, lower BMI) those variables could confound any relationship between vitamin D and the epigenetic clocks.
Why it matters: The more similar the two columns are, the more confident we can be that the analyses isolate the effect of vitamin D rather than underlying demographic or health differences.

Table 1. PAN Demographics and Summary Statistics

Code
# --- Create the descriptive table ------------------------------------------
tab <- table1(
  ~ horvath2013 + altumage + dunedinpace +
    age + sex + education + bmi + cancer | vitamin_d,
  data               = pan,
  overall            = "Overall",
  render.continuous  = "Mean(SD)",
  render.missing = NULL,
  digits             = 3,
  res                = 300
)

tab 
| | no (N=369) | yes (N=59) | Overall (N=428) |
| --- | --- | --- | --- |
| horvath2013 | 64.3(7.35) | 65.7(6.66) | 64.5(7.27) |
| altumage | 60.9(5.81) | 62.1(4.75) | 61.1(5.69) |
| dunedinpace | 1.01(0.110) | 1.04(0.116) | 1.01(0.111) |
| age | 64.2(7.70) | 66.0(6.71) | 64.5(7.59) |
| sex | | | |
| Female | 238 (64.5%) | 54 (91.5%) | 292 (68.2%) |
| Male | 131 (35.5%) | 5 (8.5%) | 136 (31.8%) |
| education | 16.1(2.30) | 15.9(2.15) | 16.0(2.27) |
| bmi | 26.9(5.47) | 29.1(7.20) | 27.2(5.78) |
| cancer | | | |
| no | 338 (91.6%) | 53 (89.8%) | 391 (91.4%) |
| yes | 31 (8.4%) | 6 (10.2%) | 37 (8.6%) |
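
To put a number on how different the two columns are, one common summary is the standardized mean difference. It is not part of the original analysis, so treat it as an optional add-on. The sketch below plugs in the means and SDs printed in Table 1; the helper name `smd()` is hypothetical.

```r
# Standardized mean difference: difference in means divided by the
# root of the average of the two group variances
# (helper name is hypothetical, introduced only for this illustration)
smd <- function(m_yes, sd_yes, m_no, sd_no) {
  (m_yes - m_no) / sqrt((sd_yes^2 + sd_no^2) / 2)
}

# Means and SDs copied from Table 1 (vitamin D "yes" vs "no")
balance <- data.frame(
  variable = c("age", "education", "bmi"),
  smd = c(
    smd(66.0, 6.71, 64.2, 7.70),
    smd(15.9, 2.15, 16.1, 2.30),
    smd(29.1, 7.20, 26.9, 5.47)
  )
)
balance$smd <- round(balance$smd, 2)
print(balance)
```

With the Table 1 values this gives roughly 0.25 for age, -0.09 for education, and 0.34 for BMI, so age and BMI differ more between groups than education does.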

Vitamin D with VITAL

For comparison, we show the VITAL demographic data below, labeled Table 2. VITAL Study Demographics and Summary Statistics.

A recent report on VITAL suggested that vitamin D supplementation was associated with a difference of about 3 years in biologic age (https://www.medicalnewstoday.com/articles/vitamin-d-suppltements-may-slow-biological-aging-preserve-telomere-length). Another study (DO-HEALTH) found that 3 years of supplementation produced a slight improvement in DunedinPACE, but the effect was greater with fish oil than with vitamin D, which is at odds with the VITAL findings.

Code
# Read in .csv ----------
vital_dat <- read.csv("vital_data_2019.csv")

# Create labeled levels for categorical variables ---------------
vital_dat$vitamin_d <- factor(vital_dat$vitdactive,
                              levels = c(1, 0),
                              labels = c("Yes", "No"))

vital_dat$fish_oil <- factor(vital_dat$fishoilactive,
                             levels = c(1, 0),
                             labels = c("Yes", "No"))

vital_dat$sex_factor <- factor(vital_dat$sex,
                               labels = c("Male", "Female"))

vital_dat$race_factor <- factor(vital_dat$race,
                                labels = c("Non-Hispanic White", "Black", "Hispanic", "Asian", "Native American or Alaskan", "Others/Unknown"))

vital_dat$malcancer_factor <- factor(vital_dat$malca,
                                     levels = c(1, 0),
                                     labels = c("Malignant Cancer", "No Malignant Cancer"))

vital_dat$majorcvd_factor <- factor(vital_dat$majorcvd,
                                    levels = c(1, 0),
                                    labels = c("Has Major CVD", "No Major CVD"))

# Create nice labels for variables used in table -----------
label(vital_dat$ageyr) <- "Age"
label(vital_dat$sex_factor) <- "Sex"
label(vital_dat$race_factor) <- "Race"
label(vital_dat$bmi) <- "BMI"
label(vital_dat$vitamin_d) <- "Randomization to Active Vitamin D"
label(vital_dat$fish_oil) <- "Randomization to Active N-3 Fatty Acids"
label(vital_dat$malcancer_factor) <- "Cancer"
Code
table1(~ ageyr + sex_factor + race_factor + bmi + malcancer_factor | vitamin_d,
       data = vital_dat, render.missing = NULL, render.continuous="Mean(SD)")
| | Yes (N=12927) | No (N=12944) | Overall (N=25871) |
| --- | --- | --- | --- |
| Age | 66.6(7.05) | 66.6(7.07) | 66.6(7.06) |
| Sex | | | |
| Male | 6380 (49.4%) | 6406 (49.5%) | 12786 (49.4%) |
| Female | 6547 (50.6%) | 6538 (50.5%) | 13085 (50.6%) |
| Race | | | |
| Non-Hispanic White | 9013 (69.7%) | 9033 (69.8%) | 18046 (69.8%) |
| Black | 2553 (19.7%) | 2553 (19.7%) | 5106 (19.7%) |
| Hispanic | 516 (4.0%) | 497 (3.8%) | 1013 (3.9%) |
| Asian | 188 (1.5%) | 200 (1.5%) | 388 (1.5%) |
| Native American or Alaskan | 118 (0.9%) | 110 (0.8%) | 228 (0.9%) |
| Others/Unknown | 259 (2.0%) | 264 (2.0%) | 523 (2.0%) |
| BMI | 28.1(5.68) | 28.1(5.79) | 28.1(5.74) |
| Cancer | | | |
| Malignant Cancer | 793 (6.1%) | 824 (6.4%) | 1617 (6.3%) |
| No Malignant Cancer | 12134 (93.9%) | 12120 (93.6%) | 24254 (93.7%) |

DO-Health Table 1

VITAL vs PAN vs DO-Health Comparison

  • VITAL (relative to PAN)

    • similar age (66.6 vs 64.5)

    • similar cancer percentages (6.3% vs 8.6%)

    • higher BMI (28.1 vs 27.2)

    • equal male:female split (about 50:50), whereas PAN was 68% female

  • DO-HEALTH (relative to PAN)

    • older age (75 vs 64)

    • fewer females (60% vs 68%)

    • lower BMI (25.7 vs 27.2)

    • lower education (13.5 years vs 16 years)

    • similar cancer percentages

    • 1/3 participants vitamin D-deficient at baseline

Determining Correlation between Clocks in PAN (epigenetic vs chronologic age)

Below we visualize how each epigenetic clock (Horvath 2013, AltumAge, and DunedinPACE) relates to participants’ chronological age.
Each scatter plot is faceted by self‑reported Vitamin D supplement use (“Yes” / “No”), and includes the linear regression equation, the \(R^2\) statistic, and the p‑value for the age–clock association.
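
The plotting code is not shown in this section, so the following is only a sketch of how the annotated statistics (regression equation, \(R^2\), and p‑value) could be computed per supplement group in base R. It runs on simulated data; `sim` and the helper `fit_stats()` are hypothetical names, and the simulated slope and noise level are arbitrary.

```r
set.seed(1)
n <- 120

# Simulated stand-in for the PAN subset (values are not real data)
sim <- data.frame(
  age       = runif(n, 50, 80),
  vitamin_d = rep(c("no", "yes"), times = c(90, 30))
)
sim$horvath2013 <- 10 + 0.85 * sim$age + rnorm(n, sd = 4)

# Fit clock ~ age within one group and pull out what the plot annotates
fit_stats <- function(d) {
  fit <- lm(horvath2013 ~ age, data = d)
  s   <- summary(fit)
  data.frame(
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r_squared = s$r.squared,
    p_value   = s$coefficients["age", "Pr(>|t|)"]
  )
}

# One row of statistics per vitamin D group
by_group <- do.call(rbind, lapply(split(sim, sim$vitamin_d), fit_stats))
print(round(by_group, 4))
```

Swapping `horvath2013` for `altumage` or `dunedinpace` in the formula would give the corresponding statistics for the other two clocks.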

Takeaways

PAN vs DO-HEALTH (1st and 2nd gen clocks):

  • In PAN, the Horvath clock correlated strongly with chronological age (r = 0.76)

  • In DO-HEALTH, PhenoAge (r = 0.60), GrimAge (r = 0.92), and GrimAge2 (r = 0.71) showed comparable correlations

PAN vs DO-HEALTH (3rd gen clock):

  • PAN DunedinPACE correlation was r = 0.19

  • DO-HEALTH DunedinPACE correlation was r = 0.19

Statistical Analysis

Methods

We use linear regression both to examine the relationship between biologic and chronologic age and to evaluate whether there is evidence that vitamin D supplementation slows biologic aging. We compare these results with two large, recently published studies (VITAL and DO-HEALTH). We do not see a relationship between Horvath biologic age and vitamin D supplementation in either the univariable analysis or the multivariable analysis adjusting for other variables (the two plots from the rms modeling show this).

Results

The output shows a statistically significant linear relationship between biologic and chronologic age; the \(R^2\) of 0.585 indicates good predictive capacity (for human cohorts, an \(R^2\) of 0.2 is typical for prediction). Interestingly, the Horvath epigenetic age is higher than chronologic age by a little over 1 year on average, suggesting that the PAN population has a slightly higher biologic age than chronologic age.
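
For a regression with a single predictor, the correlation coefficient is the square root of \(R^2\), carrying the sign of the slope. The short check below applies that to the value reported above; it assumes the 0.585 comes from a model of Horvath age on chronological age alone.

```r
r_squared <- 0.585      # R^2 reported above for Horvath vs chronological age
r <- sqrt(r_squared)    # positive root, since epigenetic age rises with age
print(round(r, 2))
```

This rounds to 0.76, matching the Horvath correlation listed in the Takeaways.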

Code
multivariable.mod <- ols(horvath2013 ~ sex + education + bmi + cancer + vitamin_d, data=pan)

plot(anova(multivariable.mod), res=300, margin="P")

Code
multivariable.mod <- ols(
  horvath2013 ~ sex + education + bmi + cancer + vitamin_d,
  data = pan
)

# Panelled partial-effect plots with a custom y-axis title
plot(
  Predict(multivariable.mod),
  res  = 300,
  ylab = "Horvath Clock"   # custom y-axis label
)

Phenotype Significance Values with Horvath

  • Sex is significantly associated with epigenetic age using the Horvath clock

  • Vitamin D supplementation was not significantly associated with lower epigenetic age

Specific Phenotypes with Horvath

  • Male sex is associated with higher epigenetic age compared with female sex

  • Interestingly, vitamin D supplementation is associated with higher epigenetic age, though again the difference is not statistically significant

  • Having cancer is associated with higher epigenetic age, though it is not statistically significant
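
The research question asks about the association after accounting for chronological age, while the model formula above does not list age as a covariate. One possible extension, offered only as a sketch and not as the original method, is to add age to the model. The example uses simulated data and base R `lm()` in place of `rms::ols()`; the simulated clock depends on age only, so the true vitamin D effect is zero.

```r
set.seed(7)
n <- 300

# Simulated data (not PAN); about 15% of participants are "yes"
sim <- data.frame(
  age       = rnorm(n, mean = 65, sd = 7),
  vitamin_d = factor(sample(c("no", "yes"), n, replace = TRUE,
                            prob = c(0.85, 0.15)))
)
sim$horvath2013 <- 5 + 0.9 * sim$age + rnorm(n, sd = 4.5)

# Same exposure, with and without age as a covariate
unadjusted <- lm(horvath2013 ~ vitamin_d, data = sim)
adjusted   <- lm(horvath2013 ~ age + vitamin_d, data = sim)

# Compare the vitamin D estimate and its standard error across the two models
compare <- rbind(
  unadjusted = summary(unadjusted)$coefficients["vitamin_dyes", 1:2],
  adjusted   = summary(adjusted)$coefficients["vitamin_dyes", 1:2]
)
print(round(compare, 3))
```

Adjusting for age removes the large share of clock variance that age explains, which is why the standard error of the vitamin D coefficient shrinks in the adjusted model.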

DunedinPACE Analysis

Code
multivariable.dunedin <- ols(
  dunedinpace ~ sex + education + bmi + cancer + vitamin_d,
  data = pan
)
plot(anova(multivariable.dunedin), res=300, margin="P")

Code
# Fit model for DunedinPACE
multivariable.dunedin <- ols(
  dunedinpace ~ sex + education + bmi + cancer + vitamin_d,
  data = pan
)

# Panelled partial-effect plots with a custom y-axis title
plot(
  Predict(multivariable.dunedin),
  res  = 300,
  ylab = "DunedinPACE Clock"
)

Phenotype Significance Values with DunedinPACE

  • There is a statistically significant association between DunedinPACE (pace of aging) and BMI, education, and sex

  • Vitamin D supplementation was not significantly associated with pace of aging

Specific Phenotypes with DunedinPACE

  • Higher BMI and male sex are associated with a faster pace of aging

  • Higher education is associated with a slower pace of aging

  • Both vitamin D supplementation and cancer are associated with a faster pace of aging, though neither association is statistically significant

Discussion

  • Chronologic age correlated strongly with epigenetic age from the Horvath clock, but only weakly with the DunedinPACE clock

    • Very similar to what was seen in the DO-HEALTH study
  • Precision Aging Network data did not confirm epigenetic/biologic age differences with vitamin D supplementation

    • Unlike both the VITAL and DO-HEALTH studies, PAN is not a randomized controlled trial; this analysis uses a cross-sectional snapshot of observational data

    • Both VITAL and DO-HEALTH assessed biologic aging after several years (roughly 3 to 4) of treatment/supplementation; we don’t know how long PAN participants had been taking vitamin D

    • PAN itself is a longitudinal study, so a future direction of research is to include duration of supplementation