class: center, middle, inverse, title-slide .title[ # Tracing Mitochondrial Inheritance of Alzheimer’s Disease
👪 ] .subtitle[ ## A Family-Based Case-Control Study of 4ish Million Cousins ] .author[ ### S. Mason Garrison
Wake Forest University ] --- layout: true <div class="my-footer"> <span> <a href="https://DataScience4Psych.github.io/DataScience4Psych/" target="_blank">S. Mason Garrison</a> </span> </div> --- <!-- Current genetic models of Alzheimer's disease (AD) overlook mitochondrial DNA (mtDNA), despite its biological links to neurodegeneration. mtDNA with its strictly maternal inheritance without recombination, may modulate neurodegenerative processes through altered energy metabolism and reactive oxygen species production. To assess mtDNA's influence on AD, we analyzed the Utah Population Database—a comprehensive resource of 11 million individuals across 200,000 extended families, from which we identified case-control pairs and their families, totaling 4.8 million people spanning 4-to-17 generations. AD diagnoses were ascertained using linked death certificates (ICD revisions 6–10) and electronic medical records from Intermountain Healthcare and the University of Utah Health Sciences Center (ICD-9 and ICD-10). Using BGmisc (Garrison, Hunter et al. 2024) to algorithmically reconstruct pedigrees, we identified multiple genealogical clusters, including one that contained approximately 4.3 million relatives. From all clusters, we systematically extracted distant cousin dyads to evaluate extended cousin similarity in AD outcomes, jointly accounting for degrees of nuclear relatedness (6.25%, 3.125%, 1.5625%, etc.) and maternal versus paternal lineage. This design allowed us to estimate nuclear (h²) and mitochondrial (mt²) heritability simultaneously, testing whether matrilineal cousins display heightened similarity in AD relative to their paternal-line counterparts. Verifying this pattern would clarify mtDNA's role in the development of AD, guiding investigations into how maternal inheritance shapes neurodegeneration. --> # Hello world! <!-- Slide 1: Introduction (1 minute) --> <!-- Script: Good [morning/afternoon/evening], everyone. My name is S. 
Mason Garrison, and I'm from Wake Forest University. I'm excited to continue the saga that Mike has begun. We have had big math problems, so who am I to change the vibes? So behold a tale of my big data problems. Today, I’ll be presenting our attempt to trace mitochondrial inheritance patterns in Alzheimer’s Disease using a family-based case-control design. This talk is fundamentally about inference under constraint. Because while our dataset is enormous, with 4.8 million people embedded in extended pedigrees, our question hinges on what the data don’t tell us just as much as what they do. --> <!-- This is work with Nithya Mylakumar, Xuanyu Lyu, Michael Hunter, Margie Gatz, Ken Smith, and Alex Burt, and it was supported by the National Institute on Aging [RF1-AG073189].--> <div class="figure" style="text-align: center"> <img src="data:image/png;base64,#d00_slide_files/figure-html/unnamed-chunk-2-1.png" alt="QR code for these slides" width="30%" /> <p class="caption">QR code for these slides</p> </div> .footnote[.center[ [r-computing-lab.github.io/slides/00_bga_2025/d00_slide](https://r-computing-lab.github.io/slides/00_bga_2025/d00_slide.html#1) ] ] --- # Mitochondrial Heritability of Alzheimer’s Disease ### Evidence from the Utah Population Database - Mason Garrison, - Nithya Mylakumar M. - Xuanyu Lyu, - Michael Hunter, - Margie Gatz, - Ken Robert Smith, - *Alex Burt* **NIA RF1-AG073189** --- # The Power of Mitochondria (mtDNA) ### "It's the powerhouse of the cell" `⚡` .small[Every Middle School Biology Teacher] <!-- Slide 2: Why Look at Mitochondrial DNA in AD? --> <!-- Script: Alzheimer’s disease is widely recognized as heritable, but nearly all existing models focus on nuclear DNA. And yet mitochondrial DNA, or mtDNA, has multiple biological pathways connecting it to neurodegeneration. MtDNA is maternally inherited, non-recombining, and central to energy production and oxidative stress regulation. All of these are plausible mechanistic pathways for Alzheimer’s risk. 
Mutations in mtDNA accumulate with age, and dysfunctional mitochondria are repeatedly observed in AD-affected brain tissue. But critically, even though these pathways are biologically plausible, they’ve been underexamined at the population level. So we asked: is there evidence of mtDNA-linked familial resemblance for AD in large-scale genealogical data? --> .pull-left-wide[ - Alzheimer’s disease is widely recognized as heritable, but - nearly all existing models focus on nuclear DNA. - And yet, 🧬 mtDNA has multiple biological pathways connecting it to neurodegeneration. - MtDNA is - maternally inherited, - non-recombining, and - central to energy production and - oxidative stress regulation. - All of these are plausible mechanistic pathways for Alzheimer’s risk. (Swerdlow, 2018; Coskun & Busciglio, 2012; Wallace, 2005) ] -- .pull-right-narrow[ - Mutations in mtDNA accumulate with age, and - dysfunctional mitochondria are repeatedly observed in AD-affected brain tissue (e.g., Coskun et al., 2004). ] <!-- Swerdlow, R. H. (2018). Mitochondria and mitochondrial cascades in Alzheimer’s disease. Journal of Alzheimer’s Disease, 62(3), 1403-1416. --> <!-- Wallace, D. C. (2005). A mitochondrial paradigm of metabolic and degenerative diseases, aging, and cancer: a dawn for evolutionary medicine. Annual Review of Genetics, 39(1), 359-407. --> <!-- Coskun, P. E., & Busciglio, J. (2012). Oxidative stress and mitochondrial dysfunction in Down’s syndrome: relevance to aging and dementia. Current Gerontology and Geriatrics Research, 2012(1), 383170. --> <!-- Coskun, P. E., Beal, M. F., & Wallace, D. C. (2004). Alzheimer's brains harbor somatic mtDNA control-region mutations that suppress mitochondrial transcription and replication. Proceedings of the National Academy of Sciences, 101(29), 10726-10731. --> --- # The Power of Mitochondria (mtDNA) ### "It's the powerhouse of the cell" `⚡` .small[Every Middle School Biology Teacher] - And yet, despite this biological plausibility... 
-- - .midi[they've been almost entirely ignored at the population level.] -- ### So we asked: .center[Is there evidence of mtDNA-linked familial resemblance for AD in large-scale genealogical data?] --- background-image: url(data:image/png;base64,#img/Slide9_cropped.PNG) background-size: 98% background-position: center background-repeat: no-repeat class: middle # The --- background-image: url(data:image/png;base64,#img/Slide10_cropped.PNG) background-size: 94% background-position: center background-repeat: no-repeat # The Data: What We Have --- background-image: url(data:image/png;base64,#img/Slide11_cropped.PNG) background-size: 93% background-position: center background-repeat: no-repeat # The Data: Who We Have --- background-image: url(data:image/png;base64,#img/Slide12_cropped.PNG) background-size: 93% background-position: center background-repeat: no-repeat # The Data: Who We Have --- # The Data: What We Used <!-- Script: We used the Utah Population Database—one of the world’s largest and deepest genealogical datasets, with over 11 million individuals linked across multigenerational pedigrees. For this project, we extracted a subset of 4.8 million individuals embedded in multigenerational family trees, anchored around AD cases and matched controls. These pedigrees span up to 17 generations, seeded with founders from 19th-century Utah (Skolnick et al., 1979; O’Brien et al., 1994). --> <!-- --> <!-- Linked data included birth and death certificates, family history records, and ICD-coded diagnoses from EMRs and death records, covering ICD-6 through ICD-10. Critically, these AD indicators come from two sources: (1) death records and (2) electronic medical records from Intermountain Healthcare and University of Utah Health Sciences. Which brings us to the first constraint: diagnostic coverage. 
--> .pull-left[ - For this project, we extracted a subset of 4.8 million individuals: - anchored around 100,000 AD cases and their matched controls, - organized into extended family trees, - spanning up to 17 generations. ] .pull-right[ - Linked records central to our phenotype: - ICD-coded diagnoses from EMRs and death records (ICD-6 to ICD-10). - Critically, AD indicators came from two sources: - death certificates and - electronic medical records from Intermountain Healthcare and the University of Utah Health Sciences Center. ] --- # The Data: What We Made .pull-left[ - We reconstructed extended pedigrees: - using **BGmisc**, our custom R package for extended behavior genetic analysis .small[(Garrison, Hunter, Lyu, Trattner, & Burt, 2024)], - in combination with graph theory, - and computed path-based relatedness estimates .small[(Hunter, Garrison et al., RnR)]. - For each dyad, we traced: - nuclear relatedness, - maternal vs. paternal lineage, - mtDNA, and potential shared environment. ] -- .pull-right[ - To illustrate this on a human scale, we simulated a family: - spanning 6 generations, - using `simulatePedigree()` and plotted with `ggpedigree` (Garrison, 2025), - visualizing two relatedness types: - **additive** (left) and - **mitochondrial** (right). - The full dataset contains 4.8 million people— - but this mini-pedigree captures the logic of the design. ] --- ## Additive Relatedness .center[
] --- ## Mitochondrial Relatedness .center[
] <!-- Script: Our approach was to algorithmically reconstruct extended pedigrees and extract cousin dyads at various degrees of relatedness. We used BGmisc, our custom R package for extended behavior genetic analysis (Garrison, Hunter, Lyu, Trattner, & Burt, 2024), in combination with graph theory algorithms to identify cousin pairs and compute path-based relatedness estimates. --> <!-- For each dyad, we traced whether the relation passed through the maternal or paternal line, whether they shared mtDNA, and whether they had any shared environment. We then calculated polychoric correlations for AD outcomes across these bins, stratified by degree of relatedness and lineage. --> --- # If plotly doesn't behave... <img src="data:image/png;base64,#d00_slide_files/figure-html/unnamed-chunk-10-1.png" width="60%" style="display: block; margin: auto;" /> --- class: middle # Foreshadowing... In theory, this is where the talk would shift to showing the results of our analysis, but... -- ## But in practice? - We hit constraints. And not subtle ones... - Two of them: - family size and - diagnostic coverage --- # Constraint 1: Family Size - In a previous iteration of this project, we examined 'longevists' - individuals who lived to be in the top 10% of the population for age at death. - If you attended that talk, you may remember that we found strong cousin resemblance for longevity, but that the heart of that talk was how to handle our big family... -- .pull-left.center[ <div class="figure" style="text-align: center"> <img src="data:image/png;base64,#img/longevity.png" alt="A pedigree plot of the SSA mortality from 1900 to the present." 
width="99%" /> <p class="caption">A pedigree plot of the SSA mortality from 1900 to the present.</p> </div> ] .pull-right[ <img src="data:image/png;base64,#img/mtdna.png" width="55%" style="display: block; margin: auto;" /> ] --- background-image: url(data:image/png;base64,#img/Slide15.PNG) background-size: 94% background-position: 50% 105% background-repeat: no-repeat # Constraint 1: Family Size <!-- Script: In this project, we’re not looking at longevists, but at Alzheimer’s cases. And while the Utah Population Database is large, it’s not infinite. So we had to make some hard choices about how to define our cases and controls. --> -- <img src="data:image/png;base64,#img/sweetsummer.gif" width="80%" style="display: block; margin: auto;" /> --- background-image: url(data:image/png;base64,#img/Slide15.PNG) background-size: 94% background-position: 50% 105% background-repeat: no-repeat # Constraint 1: Family Size --- # Constraint 1: Family Size - I wish our family was that ... .tiny[small.] -- - I wish it was that ... .tiny[manageable.] - But it’s not. -- - To give you a sense of the scale of what we’re working with... --- # Constraint 1: Family Size .pull-left[ <img src="data:image/png;base64,#img/stacked_bar_log.png" width="99%" style="display: block; margin: auto;" /> ] .pull-right[ - Yes, that scale is logarithmic. - And yes, that really is a lot of cousins. - For context: - the human body contains approximately `2 × 10^13` blood cells, - and our dataset has a comparable order of magnitude in cousin pairs. - In theory, we have: - `4,160,931^2` possible cousin pairs in our big family. - That’s approximately: - 17 trillion 313 billion 346 million 786 thousand 761. ] --- # Without log scaling... .pull-left-wide[ <img src="data:image/png;base64,#img/stacked_bar_pattern_all.png" width="85%" style="display: block; margin: auto;" /> ] -- .pull-right-narrow[ - Same plot. - Just without the log scale. 
- This is what happens when you try to visualize 4,160,931 cousins - on a linear scale. - The y-axis range is so large that the smaller relationship categories are hard to see. ] --- # Constraint 2: Diagnostic Coverage ## Who the Data Don’t Show <!-- Script: Even with a population this large, the usable signal is fragile. Many individuals have no diagnostic data at all. Others have only a single diagnostic entry, or an AD-related cause of death recorded via ICD-9. There’s also no uniform ascertainment: some participants have decades of EMR data, others none. --> .pull-left[ - Even with a population this large, the usable signal is fragile. - Diagnostic coverage is incomplete: - some individuals have clear AD diagnoses, - some have indicators of **no** AD, and - many have no diagnostic data at all. ] -- .pull-right[ - Our classification is necessarily binary: - AD case vs. non-case. - But this dichotomy obscures a third group: - those with unknown diagnostic status. - And even for known cases, we lose nuance in: - severity, - age of onset, and - diagnostic certainty. ] --- class: middle # What We Hoped to Find --- # What We Hoped to Find <!-- Script: If mtDNA contributes to AD, we would expect cousin pairs who share maternal lineage (and thus mtDNA) to resemble each other more than those of equal nuclear relatedness but through a paternal line. So, for example, maternal 3rd cousins might show higher concordance for AD than paternal 3rd cousins, despite both having 0.78125% expected nuclear overlap. --> .pull-left[ - If mtDNA contributes to AD, - then cousin pairs who share **maternal lineage** (and thus mtDNA) - should resemble each other more than cousin pairs with equal nuclear relatedness through the **paternal line**. ] .pull-right[ - For example: - Maternal 3rd cousins (0.78125% nuclear relatedness) - should show **higher concordance** for AD - than paternal 3rd cousins, - if mtDNA contributes to risk. 
] --- # What We Found <!-- Script: We observed strikingly negative phi coefficients for AD status across all degrees of relatedness, indicating that cousins were less likely to share AD diagnoses than expected by chance. This is inconsistent with the known heritability of AD; it also suggests that our diagnostic coverage is incomplete. It did not matter whether the dyad was maternal or paternal, or how closely related they were. It didn't matter how carefully we modeled the quality of our diagnostic data, or how we controlled for nuclear relatedness. The cousin similarity design did not detect any signal of mtDNA-linked familial resemblance for AD. And frankly, in hindsight, this is not surprising. We received 100,000 people with clear AD diagnoses, and 100,000 matched controls. We also received all the demographic data for their relatives. What we didn't receive were clear diagnoses for those relatives. For many of them, that's not reasonable to get. We have 4.8 million people, but we only have 100,000 AD cases, and 100,000 non-AD controls. That doesn't mean that we have 4.6 million people who don't have AD. It means that we have 4.6 million people whose AD status we don't know. --> .pull-left[ - We observed **strongly negative phi coefficients** - for AD status across all degrees of relatedness. - Cousins were **less likely** to share an AD diagnosis than expected by chance. - This contradicts the **known heritability of AD** - and immediately suggests problems with the data. ] -- .pull-right[ - The pattern held regardless of: - whether the dyad was maternal or paternal, - how closely related they were, - how well we modeled diagnostic quality. - The cousin similarity design detects a signal - but it's in the wrong direction. 
] --- # Results: What We Found <img src="data:image/png;base64,#img/phis.png" width="75%" style="display: block; margin: auto;" /> --- # Hindsight - In hindsight, this is not surprising: - We had 100,000 well-classified AD cases and 100,000 matched controls, - but the remaining 4.6 million people? --- # They're missing... <img src="data:image/png;base64,#img/stacked_bar_pattern.png" width="75%" style="display: block; margin: auto;" /> --- # They're missing... by design <img src="data:image/png;base64,#img/stacked_bar_pattern.png" width="75%" style="display: block; margin: auto;" /> --- # What We Learned Instead <!-- Script: So what did we learn? We learned that the cousin similarity design is a powerful tool for evaluating mtDNA using pedigree structure: not through genotyping, but through patterns of resemblance. But we also learned that the data we have are not sufficient to answer our original question. --> <!-- Script: What actually failed here? Because it’s not the sample size. It’s not the analysis. It’s the design. --> .pull-left[ - Constraint 1: Family size - Turns out, this design is plenty powerful. - Even a small slice of the family was enough to detect effects. - Just not the effects we were hoping for. ] -- .pull-right[ - Constraint 2: Diagnostic coverage - The data we have are not sufficient to answer our original question. - We need more complete diagnostic coverage for relatives. - We need to be able to distinguish between AD and non-AD relatives. - We need to be able to distinguish between non-AD and unknown relatives. ] --- # What We Learned Instead - If anything, the case-control design induced selection bias. - Which means: - We didn’t test resemblance across families; - we tested it within a preselected outcome set. - We can’t yet resolve whether mtDNA meaningfully contributes to AD via familial resemblance. - But we’ve constructed the architecture to test that question, one that can be deployed as richer phenotype linkages emerge. 
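---

# Backup: Reading a Phi Coefficient

A minimal sketch of the statistic behind the cousin-resemblance results, assuming each dyad is cross-classified in a 2 × 2 table of AD status. The cell labels are illustrative notation, not taken from the analysis code: $n_{11}$ counts dyads in which both cousins are cases, $n_{00}$ dyads in which neither is, and $n_{10}$ and $n_{01}$ the discordant dyads.

$$\phi = \frac{n_{11}\,n_{00} - n_{10}\,n_{01}}{\sqrt{(n_{11}+n_{10})\,(n_{01}+n_{00})\,(n_{11}+n_{01})\,(n_{10}+n_{00})}}$$

- $\phi$ is negative whenever $n_{10}\,n_{01} > n_{11}\,n_{00}$, that is, when discordant dyads outweigh concordant ones.
- How a relative with unknown status is coded decides which cell a dyad lands in, so $\phi$ is only interpretable when both members' statuses are actually known.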
--- # Implications and Next Steps <!-- Script: So what are the implications? Substantively, we can’t yet resolve whether mtDNA meaningfully contributes to AD via familial resemblance. But we’ve constructed the architecture to test that question, one that can be deployed as richer phenotype linkages emerge. Methodologically, this demonstrates a scalable framework for evaluating mtDNA using pedigree structure: not through genotyping, but through patterns of resemblance. --> - Substantively, we can’t yet resolve whether mtDNA meaningfully contributes to AD via familial resemblance. - But we’ve constructed the architecture to test that question, one that can be re-deployed with more complete phenotype linkages. - Methodologically, this demonstrates a scalable framework for evaluating mtDNA using pedigree structure: not through genotyping, but through patterns of resemblance. - The most fruitful next step is reducing missingness in our phenotype data. --- # Acknowledgements <!-- Script: This work was funded by the National Institute on Aging [RF1-AG073189] and supported by the Utah Population Database. The analytic infrastructure was developed in the open-source R package BGmisc (Garrison et al., 2024). Thanks to my co-authors and to the data stewards at UPDB. --> - This work was funded by the National Institute on Aging [RF1-AG073189] and supported by the Utah Population Database. --- ## Any Questions? Feel free to ask any questions now, or reach out to me after the talk via email _garrissm@wfu.edu_ or on GitHub _github.com/smasongarrison_. <img src="data:image/png;base64,#d00_slide_files/figure-html/qr_ds4p-1.png" width="30%" style="display: block; margin: auto;" /> .footnote[.center[ [r-computing-lab.github.io/slides/00_bga_2025/d00_slide.html](https://r-computing-lab.github.io/slides/00_bga_2025/d00_slide.html) ] ]
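---

# Backup: Counting Cousin Pairs

A worked version of the pair-count arithmetic from the family-size slides, taking $n = 4{,}160{,}931$ as the number of people in the big family. The symbol $n$ and the split into ordered and unordered pairs are notation added here for illustration.

$$n^2 = 4{,}160{,}931^2 = 17{,}313{,}346{,}786{,}761 \approx 1.7 \times 10^{13}$$

$$\binom{n}{2} = \frac{n(n-1)}{2} = 8{,}656{,}671{,}312{,}915 \approx 8.7 \times 10^{12}$$

- $n^2$ counts ordered pairs, including each person paired with themselves.
- $\binom{n}{2}$ counts each distinct dyad once.
- Either way, the count stays on the order of $10^{13}$.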