Workshop Discussion
Whether you're taking the ISG course live, or you just have questions outside of the live workshops, please feel free to post (or answer!) questions to the ISG Workshop Forum. It is open 24/7, 365 days a year! If you'd like to follow updates about the ISG workshop, feel free to follow our Facebook page.
------------------------------------------------------------------------
NOTE: This page is still under construction. We will be improving its
functionality, organization, and appearance over the next few months.
------------------------------------------------------------------------
Below, you will find material that can be used at any time to learn or brush up on statistical genetics topics.
Online Workshop
Structural Equation Modeling
A2. Introduction to Analysis of Twin Data Using R and OpenMx - Part 1
Elizabeth Prom-Wormley, June 2022 Virtual Workshop
This video will introduce students to twin modeling using R and OpenMx. By the end of the video, students will be able to (1) recognize the major steps involved in an OpenMx model and (2) translate the implementation of a linear regression between a statistical equation, a structural equation model, and an OpenMx model.
A3. Introduction to Analysis of Twin Data Using R and OpenMx - Part 2
Elizabeth Prom-Wormley, June 2022 Virtual Workshop
This video builds on the basics introduced in Part 1. By the end of the video, students will be able to: (1) summarize the goals of analyzing twin data for a single phenotype, (2) summarize the general process by which to analyze twin data for a single phenotype, (3) apply basic R functions for twin-focused data analysis and visualization, and (4) translate the implementation of a basic ACE model between a structural equation model and an OpenMx model.
A4. Path Analysis and its Application in Models Using Twin Data
Elizabeth Prom-Wormley, June 2022 Virtual Workshop
This video presents the basic rules used for path tracing in structural equation modeling, which is the basis for developing basic and more complex twin models. By the end of the video, students will be able to: (1) identify the advantages of applying path tracing rules and their use in structural equation models, (2) summarize the basic path tracing rules, and (3) apply basic path tracing rules to derive the expected parameters generated from simple regression models of unrelated individuals as well as basic twin models.
AP1. Practical: ACE Models
Brad Verhulst & Katrina Grasby, June 2022 Virtual Workshop
This practical details several ways that an ACE model can be parameterised in OpenMx. It covers introducing siblings into a twin model and how measured genetic data can be incorporated instead of assuming that DZ twin pairs have a genetic correlation of 0.5.
AP2. Practical: Getting acquainted with twin modeling using OpenMx
Elizabeth Prom-Wormley, June 2022 Virtual Workshop
This practical will help students learn the basics of running a simple twin analysis, including challenges and useful strategies. Students will explore the development and estimation of parameters from Saturated and ACE models.
B1. Introduction to Structural Equation Modeling
Michael C. Neale, June 2021 Virtual Workshop
The basics of variation (means and variances) are considered, followed by a description of (i) the tracing rules of path analysis and (ii) the matrix representation of path models. The discussion is illustrated with a simple common factor model and considers model identification. The 27-minute presentation concludes with the rationale for always using open-source software for scientific purposes, as using closed-source code is inappropriate for science.
B2. Twin Data and Likelihood
Michael C. Neale, June 2021 Virtual Workshop
This half-hour talk describes the specification of the ACE model that is widely used in human behavioral genetic studies. It considers how a model where a, c and e path coefficients are estimated differs from one where the variance components a2, c2 and e2 are directly estimated (plot spoiler: the latter allows variance components to go negative, which has less interpretability but better statistical properties).
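As a rough illustration of the variance-component parameterisation described above, the sketch below computes the expected MZ and DZ twin-pair covariance matrices from the three components. It is not taken from the talk: the function name is hypothetical, it is written in plain Python rather than OpenMx, and it assumes the usual classical twin design expectations (MZ pairs share all additive genetic variance, DZ pairs share half, and both share all common environment).

```python
def ace_expected_covariances(va, vc, ve):
    """Expected 2x2 twin-pair covariance matrices under an ACE model.

    va, vc, ve are the variance components themselves (a2, c2, e2),
    so negative values are accepted, as in the direct-variance approach.
    Hypothetical helper for illustration only.
    """
    total = va + vc + ve      # phenotypic variance of each twin
    cov_mz = va + vc          # MZ pairs share all of A and all of C
    cov_dz = 0.5 * va + vc    # DZ pairs share half of A and all of C
    mz = [[total, cov_mz], [cov_mz, total]]
    dz = [[total, cov_dz], [cov_dz, total]]
    return mz, dz


mz, dz = ace_expected_covariances(va=0.5, vc=0.2, ve=0.3)
print("Expected MZ covariance matrix:", mz)
print("Expected DZ covariance matrix:", dz)
```

Because the function works on the components directly, a call such as `ace_expected_covariances(0.8, -0.1, 0.3)` is also valid, which is the situation the path-coefficient version cannot represent.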
B3. Linear Regression and Genetic Covariance Structure Modeling
Conor Dolan, June 2021 Virtual Workshop
B8. The Direct Symmetric Matrix Approach to Fitting Twin Models
Brad Verhulst, June 2022 Virtual Workshop
BP2. Practical on testing genetic mean effects in twin models
Conor Dolan, June 2021 Virtual Workshop
C1. Assumptions of the Classical Twin Design and Biases when Violated
Matthew C Keller, June 2022 Virtual Workshop
Assumptions of the CTD and discussion of biases that occur when these assumptions are violated. Participants will learn how to calculate biases by hand.
C2. Extended Twin Family Designs: The Motivation for Using Them
Matthew C Keller, June 2022 Virtual Workshop
The motivation for using Extended Twin Family Designs (ETFDs), including how ETFDs can reduce biases that occur in other designs arising from assortative mating and passive G-E covariance arising from vertical transmission.
C3. Extended Twin Family Designs: Path Tracing
Matthew C Keller, June 2022 Virtual Workshop
This lecture shows how to derive expectations of variances and covariances in ETFDs using path tracing rules. We use a “Nuclear Twin Family Design” as an example throughout.
CP1. Practical: CTD and ETFD
Matthew C Keller, June 2022 Virtual Workshop
This practical uses the “Interactive worksheet” link along with the “CTD.NTFD.R” file located under the “Practical Files” link to go through parameter indeterminacy in the CTD and how adding information (e.g., ETFDs) can help deal with this indeterminacy. This is a great way to check your understanding of the lectures in this subsection.
D1. Modeling Heterogeneity in Twin Studies
Sarah Medland, June 2022 Virtual Workshop
This video talks about different models of heterogeneity and the terminology that is used in this literature.
E1. Multivariate Twin Modeling Chapter 1
Michael Hunter, June 2021 Virtual Workshop
This video takes us from the univariate ACE model to a multivariate ACE model: from ACE to MACE. We begin with a review of the univariate ACE model, and then extend this to two phenotypes as a Cholesky model, introducing the concept of genetic correlation along the way.
E2. Multivariate Twin Modeling Chapter 2
Michael Hunter, June 2021 Virtual Workshop
This video introduces a theory-driven multivariate behavior genetics model: the common pathway model. The common pathway model, also called the psychometric factor model, first creates a phenotypic factor model and then biometrically decomposes the factor and residual variances.
E3. Multivariate Twin Modeling Chapter 3
Michael Hunter, June 2021 Virtual Workshop
This video introduces a theory-driven multivariate behavior genetics model: the independent pathway model. The independent pathway model, also called the biometric factor model, first creates biometric factors for the A, C, and E variance components and then further biometrically decomposes the residual variances.
F1. Multivariate twin models: from univariate to bivariate
Conor Dolan, June 2022 Virtual Workshop
In this lecture, the univariate twin model is extended to the bivariate twin model. It is explained how the variance in two traits is decomposed into additive genetic, non-additive genetic, and environmental effects.
F2. Multivariate twin models: from bivariate to multivariate
Conor Dolan, June 2022 Virtual Workshop
In this lecture, the bivariate twin model is extended to the multivariate twin model. It is explained how the variance in four traits is decomposed into additive genetic, non-additive genetic, and environmental effects.
F3. Multivariate twin models: independent pathway models part 1
Dirk Pelt, June 2022 Virtual Workshop
In this lecture, the independent pathway model is introduced. A general specification of the common factor model is provided first, which is applied to genetic and environmental correlation matrices to arrive at the independent pathway model.
F4. Multivariate twin models: independent pathway models part 2
Dirk Pelt, June 2022 Virtual Workshop
In this lecture, several competing independent pathway models are tested. It explains how results from independent pathway models can be interpreted and presented.
F5. Multivariate twin models: the common pathway model
Conor Dolan, June 2022 Virtual Workshop
The common pathway model is discussed, also in relation to independent pathway models.
FP2. Practical - Independent & common pathway
Conor Dolan, June 2022 Virtual Workshop
In this practical we use the knowledge from the lectures in this section to estimate multivariate twin models, and independent and common pathway models in R using the OpenMx package. Two datasets are used: a skinfold dataset with 4 phenotypes and a dataset with Neuroticism items.
G4. Multivariate Longitudinal Modeling with Genetically Informative Data
Michael C. Neale, June 2022 Virtual Workshop
Extending the Eaves et al Markov model for genetically informative data to the multivariate case is described. At the time of writing, this model has not yet been implemented in a software script, so some suggestions as to how to do so are included in the 30-minute talk.
G5. Markov Modeling with Genetically Informative Data
Michael C. Neale, June 2022 Virtual Workshop
A 30-minute talk about Markov modeling generally, with specific reference to the seminal 1986 contribution of Professor Eaves, which described Markov processes for genetic and environmental variance components. The consequences of including random intercepts for these processes are described.
G6. Biometrical Age-Based Latent Growth Curve Modeling
Michael C. Neale, June 2022 Virtual Workshop
This 26-minute talk describes genetically informative latent growth curve modeling, including how age at participation, rather than wave of assessment, is used so that the model captures development instead of the effects of being measured several times. It also shows how genetic variance in the cortical structure of the brain decreases with age, while its heritability increases.
Statistical Genetics
1. Introduction to Workshop and Basics of Statistical Genetics
Benjamin Neale, June 2021 Virtual Workshop
Genomics
A1. Introduction to common variation - Part 1
Lucía Colodro-Conde, June 2021 Virtual Workshop
This video introduces some fundamental terms and concepts, including types of genetic variation and allele frequency.
A2. Introduction to common variation - Part 2
Lucía Colodro-Conde, June 2021 Virtual Workshop
This video introduces reference panels, linkage disequilibrium, differences across ancestry, the genome browser, and genomic assembly (or build).
A3. Measuring the Genome
Katrina Grasby, June 2022 Virtual Workshop
This video presents key concepts including what a genetic variant is, DNA strand, and ambiguous versus unambiguous alleles. Types of genetic variation including SNPs, insertions and deletions, biallelic and multiallelic sites are described. The basics of acquiring sequence data and genotype data are introduced.
A4. Quality Control
Katrina Grasby, June 2021 Virtual Workshop
A brief description of how genotyped data is generated from DNA and the steps to clean and check genotyped data prior to imputation and/or using it in analyses.
AP1. Introduction to PLINK
Lucía Colodro-Conde, June 2021 Virtual Workshop
An introduction to PLINK file formats, using the command line, and websites.
B1. Introduction to GWAS - Part 1
Katrina Grasby, June 2021 Virtual Workshop
A basic introduction to GWAS and the commonly used regression models.
B2. Introduction to GWAS - Part 2
Katrina Grasby, June 2021 Virtual Workshop
This introduces ideas of population stratification as a confounder, the large multiple testing burden in GWAS, the challenge of power, and the need for replication.
Biobank-based PheWAS and Saddlepoint approximation test
Rounak Dey, June 2021 Virtual Workshop
This is a lecture about using the Saddlepoint approximation in Biobank-based PheWAS to account for unbalanced case-control ratios for binary phenotypes, which has been implemented in the R library SPAtest.
BP1. Practical: GWAS
Lucía Colodro-Conde & Katrina Grasby, June 2021 Virtual Workshop
These are the files used in the practical held during the 2021 virtual workshop.
C1. Imputation
Sarah Medland, June 2021 Virtual Workshop
This video talks about the concepts involved in imputation of SNP data using public reference panels.
C2. Meta analysis
Sarah Medland, June 2021 Virtual Workshop
This video talks about the common methods used to conduct a GWAS meta-analysis and the software that is commonly used to do this.
DP2. Practical: Polygenic Risk Scores
Baptiste Couvy-Duchesne, June 2022 Virtual Workshop
This practical focuses on calculating a polygenic risk score on a toy dataset (using PRSice) and evaluating its prediction accuracy in a sample of related individuals. The practical involves using several software tools and packages (OpenMx, R, GCTA) and includes key data management steps (merge, wide-long data formatting).
E1. Introduction to Rare+SAIGE
Wei Zhou, June 2021 Virtual Workshop
This video briefly outlines the lectures on GWASs in large biobanks and data sets for common and rare variants.
E2. GWAS in large-scale biobanks and cohorts
Wei Zhou, June 2021 Virtual Workshop
This is a lecture to introduce the R package SAIGE, which was developed for conducting GWASs in large-scale biobanks, while accounting for unbalanced case-control ratios, handling sample relatedness, and being scalable for large data sets.
E3. Rare variant association tests
Zhangchen Zhao, June 2021 Virtual Workshop
This is a lecture about set-based association tests for testing the association between rare genetic variants and human diseases and traits. It introduces the function called SKATbinary implemented in the SKAT library that conducts set-based association tests while accounting for unbalanced case-control ratios for binary phenotypes.
E4. Rare variant association tests in large-scale biobanks and cohorts
Wei Zhou, June 2021 Virtual Workshop
This is a lecture about conducting set-based association tests in large-scale biobanks and cohorts. It introduces the R package SAIGE-GENE, which was developed for conducting exome-wide or genome-wide set-based tests for rare variants in large-scale biobanks, while accounting for unbalanced case-control ratios, handling sample relatedness, and being scalable for large data sets.
Introduction to Mendelian randomization - Part 2
David M. Evans, June 2021 Virtual Workshop
How does Mendelian randomization work?
Introduction to Mendelian randomization - Part 3
David M. Evans, June 2021 Virtual Workshop
Calculating causal effect estimates via Mendelian randomization
Introduction to Mendelian randomization - Part 4
David M. Evans, June 2021 Virtual Workshop
An example using Mendelian randomization
Introduction to Mendelian randomization - Part 5
David M. Evans, June 2021 Virtual Workshop
Limitations to Mendelian randomization
Introduction to Mendelian randomization - Part 6
David M. Evans, June 2021 Virtual Workshop
Introduction to the MR Base website
Sensitivity analyses in Mendelian randomization studies - Part 1
David M. Evans, June 2021 Virtual Workshop
Inverse variance weighted MR analysis. The importance of “strand” when conducting two sample MR studies across cohorts
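For readers who want to see the arithmetic behind an inverse variance weighted analysis, here is a minimal sketch of the standard fixed-effect IVW estimator computed from summary statistics. It is not code from the video: the function and argument names are hypothetical, and it assumes the SNP effects on the exposure and on the outcome have already been harmonised to the same effect allele (the "strand" issue mentioned above).

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted MR estimate.

    Each list holds one value per SNP. Effects are assumed to be
    harmonised to the same effect allele beforehand.
    Hypothetical helper for illustration only.
    """
    # Weight each SNP by the inverse variance of its outcome association.
    weights = [1.0 / (se ** 2) for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    standard_error = (1.0 / denominator) ** 0.5
    return estimate, standard_error


# Toy summary statistics (made-up numbers, not real data).
bx = [0.10, 0.20, 0.40]
by = [0.03, 0.06, 0.12]
se = [0.01, 0.02, 0.01]
print(ivw_estimate(bx, by, se))
```

This is equivalent to a weighted regression of the outcome effects on the exposure effects with the intercept fixed at zero.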
Sensitivity analyses in Mendelian randomization studies - Part 2
David M. Evans, June 2021 Virtual Workshop
Horizontal pleiotropy in Mendelian randomization studies, heterogeneity testing and multivariable Mendelian randomization
Sensitivity analyses in Mendelian randomization studies - Part 3
David M. Evans, June 2021 Virtual Workshop
MR-Egger regression
Sensitivity analyses in Mendelian randomization studies - Part 4
David M. Evans, June 2021 Virtual Workshop
The MR median estimator
Sensitivity analyses in Mendelian randomization studies - Part 5
David M. Evans, June 2021 Virtual Workshop
The MR modal based estimator
Sensitivity analyses in Mendelian randomization studies - Part 6
David M. Evans, June 2021 Virtual Workshop
Reverse causal instruments and Steiger filtering
Statistics
Special Topics
Estimating and modeling SNP h2 and rg
A1. Introduction to genetic relatedness
Katrina Grasby, June 2022 Virtual Workshop
In this video various terms that describe genetic relatedness are introduced. The role of recombination in genetic variation and linkage disequilibrium is described. The concepts of identity-by-state and identity-by-descent are compared.
A2. Deriving the Phenotypic Covariance
David Evans, June 2022 Virtual Workshop
This video shows how the phenotypic covariance is parameterized under the M-GCTA model.
A3. Deriving the Phenotypic Variance
David Evans, June 2022 Virtual Workshop
This video shows how the phenotypic variance is parameterized under the M-GCTA model.
A4. Estimate Variance Components
David Evans, June 2022 Virtual Workshop
This video introduces the variance components model underlying the GCTA software package.
B1. From correlation coefficients to variance components
Baptiste Couvy-Duchesne, June 2022 Virtual Workshop
Models to estimate heritability (twin or SNP h2) are typically linear models that can be seen as extensions of the simple linear model between two variables, from which one estimates a correlation. Here, we start from the simplest model and progressively make it more complex.
B2. From correlation coefficients to variance components: part 2 - these models look familiar
Baptiste Couvy-Duchesne, June 2022 Virtual Workshop
This section covers Twin and SNP heritability, h2 from whole genome sequencing, non-additive genetic effects and GWAS approaches. To better compare and understand the different approaches and models we position them in the statistical framework of linear models.
B3. From correlation coefficients to variance components: Part 3 - We can write a lot more models
Baptiste Couvy-Duchesne, June 2022 Virtual Workshop
We continue the exploration of the statistical landscape, including polygenic risk scores (calculation and evaluation) and longitudinal models, and conclude with how Structural Equation Modeling (SEM) can also be decomposed into a set of linear models.
C0. Heritability of individual level data - Welcome message, part 0
Loïc Yengo, June 2021 Virtual Workshop
This is a very short video giving an overview of the lecture about the estimation of additive genetic (co-)variance using individual-level genomic data.
C1. Heritability of individual level data - Introduction, part 1
Loïc Yengo, June 2022 Virtual Workshop
This video briefly introduces the concepts of heritability and genetic correlation, and illustrates what these concepts could be used for.
C2. Heritability of individual level data - Concepts and tools, part 2
Loïc Yengo, June 2021 Virtual Workshop
This video introduces how to measure genetic relatedness between individuals using genomic data and how to use these measures to estimate the heritability of a trait (or a disease).
C3. Heritability of individual level data - Methods, part 3
Loïc Yengo, June 2021 Virtual Workshop
This video introduces two estimators of the SNP-based heritability: (1) Haseman-Elston regression, which is a method-of-moments estimator, and (2) the Restricted Maximum Likelihood (REML) method, which is a likelihood-based estimator.
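To make the method-of-moments idea concrete, the sketch below shows the cross-product form of Haseman-Elston regression: for every pair of distinct individuals, the product of their phenotypes is regressed on their genomic relatedness, and the slope is taken as the heritability estimate. This is an illustrative sketch, not the lecture's code. The function name is hypothetical, and it assumes the phenotype has already been standardised to mean 0 and variance 1 and that a genetic relationship matrix is supplied as a list of lists.

```python
def haseman_elston(grm, y):
    """Cross-product Haseman-Elston regression (illustrative sketch).

    grm: n x n genetic relationship matrix (list of lists).
    y:   phenotypes, assumed already standardised (mean 0, variance 1).
    Returns the OLS slope of y_i * y_j on grm[i][j] over pairs i < j.
    """
    relatedness, products = [], []
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            relatedness.append(grm[i][j])
            products.append(y[i] * y[j])
    mean_x = sum(relatedness) / len(relatedness)
    mean_z = sum(products) / len(products)
    sxx = sum((x - mean_x) ** 2 for x in relatedness)
    sxz = sum((x - mean_x) * (z - mean_z)
              for x, z in zip(relatedness, products))
    return sxz / sxx  # slope, interpreted as the SNP-based heritability
```

Only off-diagonal pairs are used, so the estimate relies on how phenotypic similarity tracks genomic similarity between different people, without any likelihood assumptions.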
C4. Heritability of individual level data - Interpretation, part 4
Loïc Yengo, June 2021 Virtual Workshop
This video discusses how to interpret estimates of SNP-based heritability, what can bias those estimates and what implication these estimates have for Genome-Wide Association Studies.
C5. Heritability of individual level data - Overview of research topics, part 5
Loïc Yengo, June 2021 Virtual Workshop
This video presents examples of active research related to the estimation of heritability from SNP data. It addresses computational efficiency, estimation of the contribution of non-additive genetic effects, and how mate choice may affect the interpretation of these estimates.
CP1. Practical: GREML
Loïc Yengo, June 2021 Virtual Workshop
This practical is based on using the software package GCTA to estimate the heritability of a trait for which causal SNP effects depend on allele frequency and linkage disequilibrium patterns (Lecture Part 4).
CP2. Practical: LDSC
Loïc Yengo, June 2021 Virtual Workshop
This practical is on using the software package LDSC, which implements the Linkage Disequilibrium Score Regression method to estimate the heritability of a trait (or a disease) using summary statistics from a Genome-Wide Association Study.
D1. GCTA - Genetic Relationship Matrix
David Evans, June 2022 Virtual Workshop
How to calculate a genetic relationship matrix (GRM)
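As a rough sketch of what such a calculation involves, the code below builds a GRM from a small genotype matrix using the commonly cited standardised-genotype formula, A_ij = (1/M) Σ_k (x_ik − 2p_k)(x_jk − 2p_k) / (2p_k(1 − p_k)). This is plain Python for illustration, not GCTA itself. The function name is hypothetical, allele frequencies are assumed to be estimated from the sample, monomorphic SNPs are skipped, and the same formula is applied to the diagonal for simplicity, which may differ from what a given software package does.

```python
def genetic_relationship_matrix(genotypes):
    """Build a GRM from an n x m matrix of allele counts (0, 1 or 2).

    Hypothetical helper for illustration only; rows are individuals,
    columns are SNPs, and there is no missing data.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency of the counted allele at each SNP.
    freqs = [sum(row[k] for row in genotypes) / (2.0 * n) for k in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    used = [k for k in range(m) if 0.0 < freqs[k] < 1.0]
    if not used:
        raise ValueError("no polymorphic SNPs available")
    grm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            total = 0.0
            for k in used:
                p = freqs[k]
                total += ((genotypes[i][k] - 2 * p)
                          * (genotypes[j][k] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[i][j] = grm[j][i] = total / len(used)
    return grm


toy = [[0, 2], [1, 1], [2, 0]]  # 3 individuals, 2 SNPs (made-up data)
for row in genetic_relationship_matrix(toy):
    print(row)
```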
D2. M-GCTA
David Evans, June 2022 Virtual Workshop
This video introduces the M-GCTA model for estimating maternal genetic variance components.
D3. Estimating the Importance of Maternal Genetic Effects on Offspring Phenotypes with "M-GCTA"
David Evans, June 2022 Virtual Workshop
Introduction to the G-REML method and GCTA software package.
E1. Estimating Parental Effects using Polygenic Scores Part I: Model Introduction
Jared V. Balbona, June 2022 Virtual Workshop
SEM-PGS is a model that uses genetic and phenotypic data to estimate parental effects (both genetic and environmental) as well as assortative mating. Here, we cover the underlying logic of SEM-PGS and explain how it is able to obtain these estimates.
E2. Estimating Parental Effects using Polygenic Scores Part II: Model Extensions
Jared V. Balbona, June 2022 Virtual Workshop
In this second video, we cover how SEM-PGS can be used to study assortative mating, and discuss several potential model extensions that can be used to address different types of questions.
F1. Genomic SEM Introduction
Andrew Grotzinger, June 2021 Virtual Workshop
This video provides a broad overview of Genomic Structural Equation Modeling (Genomic SEM). The video is particularly focused on background information (e.g., motivations for developing the method) and results produced from empirical applications to GWAS summary data.
F2. Lavaan syntax and SEM introduction
Andrew Grotzinger, June 2021 Virtual Workshop
In this video, the basics of structural equation modeling (SEM) are introduced. In addition, the video illustrates how SEMs are specified using Lavaan syntax, which is what Genomic SEM uses for model estimation.
G2. Explaining how S and V are estimated and what they are
Michel Nivard, June 2021 Virtual Workshop
GP1. Practical: Working through the examples on the wiki one by one: munge; ldsc; usermodel functions
Michel Nivard, June 2021 Virtual Workshop
GP2. Practical: Working through the examples on the wiki one by one: sumstats and GWAS functions
Andrew Grotzinger, June 2021 Virtual Workshop
This video introduces how to use the sumstats function and multivariate GWAS functions (userGWAS; commonfactorGWAS) in Genomic SEM. These functions are often used to estimate the effect of a SNP on a latent factor and to produce the QSNP heterogeneity metric. However, this suite of functions can more generally be used to examine the effect of individual SNPs within a multivariate system of genetically overlapping traits.
H1. The Augmented Classical Twin Design
David Evans, June 2022 Virtual Workshop
This video introduces a new structural equation model called the “Augmented Classical Twin Design” which relaxes the Equal Environments assumption in twin studies.
Computing
These are topics related to installation and basic usage of software used in other topics.
L1. Introduction to the Unix/Linux command line
Jeffrey Lessem, June 2021 Virtual Workshop
A brief introduction to using the Unix/Linux command line focusing on tasks that will be necessary for practicals at the Workshop. It covers basic concepts that people who have never used a command line should be familiar with.
R2. Downloading and Installing R and RStudio
Elizabeth Prom-Wormley, June 2021 Virtual Workshop
This video is well-suited for those who have had limited/no exposure to R. Students will receive a step-by-step approach to downloading R onto their computers, installing R, and installing RStudio as well as some basic background on how R works.
R3. Find, Open, and Review Files in R
Elizabeth Prom-Wormley, June 2021 Virtual Workshop
This video is well-suited for those who have had limited exposure to R. It will walk students through using files within R and conducting preliminary investigations of the variables in a dataset within base R.
R4. Data management in R
Elizabeth Prom-Wormley, June 2021 Virtual Workshop
This video is well-suited for those who have had limited exposure to R. It will guide students through basic functions for data management within base R.
R5. Graphics and Basic Statistics in R
Elizabeth Prom-Wormley, June 2021 Virtual Workshop
This video is well-suited for those who have had limited exposure to R. It will guide students through basic functions for graphics and data visualization within base R.
R6. Working with Twin Data in R
Elizabeth Prom-Wormley, June 2021 Virtual Workshop
This video will focus on working with twin data. In particular, this video will help students establish the basics of running basic analyses and visualizing twin data.
Mendelian Randomization
Guidelines for Responsible Conduct
Workshop Contact Information
The Workshop Coordinator can be contacted with questions related to registration, scheduling, and travel at IBGworkshop@colorado.edu