## Workshop Discussion

Whether you're taking the ISG course live or just have questions outside of the live workshops, please feel free to post (or answer!) questions on the ISG Workshop Forum. It is open 24/7, 365 days a year! If you'd like to follow updates about the ISG workshop, feel free to follow our Facebook page.

NOTE: This page is still under construction. We will be improving its functionality, organization, and appearance over the next few months.

Below, you will find material that can be used at any time to learn or brush up on statistical genetics topics.

## Online Workshop

## Structural Equation Modeling

### All

#### A2. Introduction to Analysis of Twin Data Using R and OpenMx - Part 1

Elizabeth Prom-Wormley, June 2022 Virtual Workshop

This video will introduce students to twin modeling using R and OpenMx. By the end of the video, students will be able to: (1) recognize the major steps involved in an OpenMx model and (2) translate the implementation of a linear regression between a statistical equation, a structural equation model, and an OpenMx model.

#### A3. Introduction to Analysis of Twin Data Using R and OpenMx - Part 2

Elizabeth Prom-Wormley, June 2022 Virtual Workshop

This video builds on the basics introduced in Part 1. By the end of the video, students will be able to: (1) summarize the goals of analyzing twin data for a single phenotype, (2) summarize the general process by which to analyze twin data for a single phenotype, (3) apply basic R functions for twin-focused data analysis and visualization, and (4) translate the implementation of a basic ACE model between a structural equation model and an OpenMx model.

#### A4. Path Analysis and its Application in Models Using Twin Data

Elizabeth Prom-Wormley, June 2022 Virtual Workshop

This video presents the basic rules used for path tracing in structural equation modeling, which is the basis for developing basic and more complex twin models. By the end of the video, students will be able to: (1) identify the advantages of applying path tracing rules and their use in structural equation models, (2) summarize the basic path tracing rules, and (3) apply basic path tracing rules to derive the expected parameters generated from simple regression models of unrelated individuals as well as basic twin models.

#### AP1. Practical: ACE Models

Brad Verhulst & Katrina Grasby, June 2022 Virtual Workshop

This practical details several ways that an ACE model can be parameterised in OpenMx. It covers introducing siblings into a twin model and how measured genetic data can be incorporated instead of assuming that DZ twin pairs have a genetic correlation of 0.5.

#### AP2. Practical: Getting acquainted with twin modeling using OpenMx

Elizabeth Prom-Wormley, June 2022 Virtual Workshop

This practical will help students learn the basics of running a simple twin analysis, including challenges and useful strategies. Students will explore the development and estimation of parameters from Saturated and ACE models.

#### B1. Introduction to Structural Equation Modeling

Michael C. Neale, June 2021 Virtual Workshop

The basics of variation (means and variances) are considered, followed by a description of (i) the tracing rules of path analysis and (ii) the matrix representation of path models. The discussion is illustrated with a simple common factor model, and considers model identification. The 27-minute presentation concludes with the rationale for always using open-source software for scientific purposes, as using closed-source code is inappropriate for science.

#### B2. Twin Data and Likelihood

Michael C. Neale, June 2021 Virtual Workshop

This half-hour talk describes the specification of the ACE model that is widely used in human behavioral genetic studies. It considers how a model where the a, c and e path coefficients are estimated differs from one where the variance components a^{2}, c^{2} and e^{2} are directly estimated (plot spoiler: the latter allows variance components to go negative, which has less interpretability but better statistical properties).
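
As a rough illustration of the two parameterizations (a sketch that is not taken from the talk), the expected twin covariances can be written either from path coefficients or from variance components. It assumes the usual ACE expectations: MZ pairs share all additive genetic variance and DZ pairs share half. The function names are hypothetical.

```python
def ace_expected_from_variances(A, C, E):
    """Expected variance and twin covariances when A, C and E are estimated directly.

    A, C and E are free parameters here, so an estimate may be negative.
    """
    return {
        "variance": A + C + E,
        "cov_mz": A + C,        # MZ twins share all of A and all of C
        "cov_dz": 0.5 * A + C,  # DZ twins share half of A (on average) and all of C
    }


def ace_expected_from_paths(a, c, e):
    """Expected variance and twin covariances when path coefficients are estimated.

    The variance components are squared paths, so they can never be negative.
    """
    return ace_expected_from_variances(a * a, c * c, e * e)


# Path version: components are forced to be non-negative.
print(ace_expected_from_paths(a=0.5, c=0.5, e=0.5))

# Direct version: a negative C is allowed (illustrative values only).
print(ace_expected_from_variances(A=0.75, C=-0.125, E=0.375))
```

In the second call the implied DZ covariance is less than half the MZ covariance, a pattern that the path version cannot reproduce because c squared cannot fall below zero.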

#### B3. Linear Regression and Genetic Covariance Structure Modeling

Conor Dolan, June 2021 Virtual Workshop

#### B8. The Direct Symmetric Matrix Approach to Fitting Twin Models

Brad Verhulst, June 2022 Virtual Workshop

#### BP2. Practical on testing genetic mean effects in twin models

Conor Dolan, June 2021 Virtual Workshop

#### C1. Assumptions of the Classical Twin Design and Biases when Violated

Matthew C Keller, June 2022 Virtual Workshop

Assumptions of the CTD and discussion of biases that occur when these assumptions are violated. Participants will learn how to calculate biases by hand.

#### C2. Extended Twin Family Designs: The Motivation for Using Them

Matthew C Keller, June 2022 Virtual Workshop

The motivation for using Extended Twin Family Designs (ETFDs), including how ETFDs can reduce biases that occur in other designs arising from assortative mating and passive G-E covariance arising from vertical transmission.

#### C3. Extended Twin Family Designs: Path Tracing

Matthew C Keller, June 2022 Virtual Workshop

This lecture shows how to derive expectations of variances and covariances in ETFDs using path tracing rules. We use a “Nuclear Twin Family Design” as an example throughout.

#### CP1. Practical: CTD and ETFD

Matthew C Keller, June 2022 Virtual Workshop

This practical uses the “Interactive worksheet” link along with the “CTD.NTFD.R” file located under the “Practical Files” link to go through parameter indeterminacy in the CTD and how adding information (e.g., ETFDs) can help deal with this indeterminacy. This is a great way to check your understanding of the lectures in this subsection.

#### D1. Modeling Heterogeneity in Twin Studies

Sarah Medland, June 2022 Virtual Workshop

This video talks about different models of heterogeneity and the terminology that is used in this literature.

#### E1. Multivariate Twin Modeling Chapter 1

Michael Hunter, June 2021 Virtual Workshop

This video takes us from the univariate ACE model to a multivariate ACE model: from ACE to MACE. We begin with a review of the univariate ACE model, and then extend this to two phenotypes as a Cholesky model, introducing the concept of genetic correlation along the way.
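
To make the idea of genetic correlation concrete, here is a minimal sketch (not taken from the video) for two phenotypes. It assumes a lower-triangular Cholesky factor of additive genetic paths with hypothetical entries `a11`, `a21` and `a22`; the function names are also hypothetical.

```python
import math


def additive_genetic_covariance(a11, a21, a22):
    """Return (var1, cov12, var2) implied by the Cholesky paths.

    The genetic covariance matrix is L times its transpose, with
    L = [[a11, 0], [a21, a22]].
    """
    var1 = a11 ** 2
    cov12 = a11 * a21
    var2 = a21 ** 2 + a22 ** 2
    return var1, cov12, var2


def genetic_correlation(a11, a21, a22):
    """Genetic covariance standardized by the two genetic standard deviations."""
    var1, cov12, var2 = additive_genetic_covariance(a11, a21, a22)
    return cov12 / math.sqrt(var1 * var2)


# Illustrative path values only.
print(round(genetic_correlation(a11=1.0, a21=0.6, a22=0.8), 3))
```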

#### E2. Multivariate Twin Modeling Chapter 2

Michael Hunter, June 2021 Virtual Workshop

This video introduces a theory-driven multivariate behavior genetics model: the common pathway model. The common pathway model, also called the psychometric factor model, first creates a phenotypic factor model and then biometrically decomposes the factor and residual variances.

#### E3. Multivariate Twin Modeling Chapter 3

Michael Hunter, June 2021 Virtual Workshop

This video introduces a theory-driven multivariate behavior genetics model: the independent pathway model. The independent pathway model, also called the biometric factor model, first creates biometric factors for the A, C, and E variance components and then further biometrically decomposes the residual variances.

#### F1. Multivariate twin models: from univariate to bivariate

Conor Dolan, June 2022 Virtual Workshop

In this lecture, the univariate twin model is extended to the bivariate twin model. It is explained how the variance in two traits is decomposed into additive genetic, non-additive genetic, and environmental effects.

#### F2. Multivariate twin models: from bivariate to multivariate

Conor Dolan, June 2022 Virtual Workshop

In this lecture, the bivariate twin model is extended to the multivariate twin model. It is explained how the variance in four traits is decomposed into additive genetic, non-additive genetic, and environmental effects.

#### F3. Multivariate twin models: independent pathway models part 1

Dirk Pelt, June 2022 Virtual Workshop

In this lecture, the independent pathway model is introduced. A general specification of the common factor model is provided first, which is applied to genetic and environmental correlation matrices to arrive at the independent pathway model.

#### F4. Multivariate twin models: independent pathway models part 2

Dirk Pelt, June 2022 Virtual Workshop

In this lecture, several competing independent pathway models are tested. It explains how results from independent pathway models can be interpreted and presented.

#### F5. Multivariate twin models: the common pathway model

Conor Dolan, June 2022 Virtual Workshop

The common pathway model is discussed, also in relation to independent pathway models.

#### FP2. Practical - Independent & common pathway

Conor Dolan, June 2022 Virtual Workshop

In this practical we use the knowledge from the lectures in this section to estimate multivariate twin models, and independent and common pathway models, in R using the OpenMx package. Two datasets are used: a skinfold dataset with 4 phenotypes and a dataset with Neuroticism items.

#### G4. Multivariate Longitudinal Modeling with Genetically Informative Data

Michael C. Neale, June 2022 Virtual Workshop

This 30-minute talk describes how to extend the Eaves et al. Markov model for genetically informative data to the multivariate case. At the time of writing, this model has not yet been implemented in a software script, so some suggestions for how to do so are included.

#### G5. Markov Modeling with Genetically Informative Data

Michael C. Neale, June 2022 Virtual Workshop

A 30-minute talk about Markov modeling generally, with specific reference to the seminal 1986 contribution of Professor Eaves, which described Markov processes for genetic and environmental variance components. The consequences of including random intercepts for these processes are described.

#### G6. Biometrical Age-Based Latent Growth Curve Modeling

Michael C. Neale, June 2022 Virtual Workshop

This 26-minute talk describes genetically informative latent growth curve modeling, including how age at participation, rather than wave of assessment, is used so that the model captures development instead of the effects of being measured several times. It also shows how genetic variance in the cortical structure of the brain decreases with age, while its heritability increases.

## Statistical Genetics

### All

#### 1. Introduction to Workshop and Basics of Statistical Genetics

Benjamin Neale, June 2021 Virtual Workshop

## Genomics

### All

#### A1. Introduction to common variation - Part 1

Lucía Colodro-Conde, June 2021 Virtual Workshop

This video introduces some fundamental terms and concepts, including types of genetic variation and allele frequency.
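
As a small illustration of allele frequency (a sketch, not material from the video), the function below counts alternate alleles at one biallelic site. It assumes genotypes are coded as 0, 1 or 2 copies of the alternate allele, with `None` for missing calls; the function name is hypothetical.

```python
def alt_allele_frequency(genotypes):
    """Frequency of the alternate allele at one biallelic site.

    `genotypes` holds the number of alternate alleles per person (0, 1 or 2);
    None marks a missing genotype and is skipped.
    """
    observed = [g for g in genotypes if g is not None]
    if not observed:
        raise ValueError("no non-missing genotypes")
    # Each diploid person carries two alleles at the site.
    return sum(observed) / (2 * len(observed))


genotypes = [0, 1, 2, 1, None]
freq = alt_allele_frequency(genotypes)
print(f"alternate allele frequency: {freq:.3f}")
print(f"minor allele frequency: {min(freq, 1 - freq):.3f}")
```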

#### A2. Introduction to common variation - Part 2

Lucía Colodro-Conde, June 2021 Virtual Workshop

This video introduces reference panels, linkage disequilibrium, differences across ancestry, the genome browser, and genomic assembly (or build).

#### A3. Measuring the Genome

Katrina Grasby, June 2022 Virtual Workshop

This video presents key concepts, including what a genetic variant is, DNA strands, and ambiguous and unambiguous alleles. Types of genetic variation, including SNPs, insertions and deletions, and biallelic and multiallelic sites, are described. The basics of acquiring sequence data and genotype data are introduced.

#### A4. Quality Control

Katrina Grasby, June 2021 Virtual Workshop

A brief description of how genotyped data is generated from DNA and the steps to clean and check genotyped data prior to imputation and/or using it in analyses.

#### AP1. Introduction to PLINK

Lucía Colodro-Conde, June 2021 Virtual Workshop

An introduction to PLINK file formats, using the command line, and websites.

#### B1. Introduction to GWAS - Part 1

Katrina Grasby, June 2021 Virtual Workshop

A basic introduction to GWAS and the commonly used regression models.

#### B2. Introduction to GWAS - Part 2

Katrina Grasby, June 2021 Virtual Workshop

This introduces ideas of population stratification as a confounder, the large multiple testing burden in GWAS, the challenge of power, and the need for replication.
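
The multiple testing burden can be illustrated with a Bonferroni correction. This is a minimal sketch, assuming roughly one million independent tests, which is the usual motivation for the conventional genome-wide significance threshold of 5e-8. The function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: about one million independent common-variant tests.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(f"{threshold:.1e}")
```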

#### Biobank-based PheWAS and Saddlepoint approximation test

Rounak Dey, June 2021 Virtual Workshop

This is a lecture about using the Saddlepoint approximation in Biobank-based PheWAS to account for unbalanced case-control ratios for binary phenotypes, which has been implemented in the R library SPAtest.

#### BP1. Practical: GWAS

Lucía Colodro-Conde & Katrina Grasby, June 2021 Virtual Workshop

These are the files used in the practical held during the 2021 virtual workshop.

#### C1. Imputation

Sarah Medland, June 2021 Virtual Workshop

This video talks about the concepts involved in imputation of SNP data using public reference panels.

#### C2. Meta analysis

Sarah Medland, June 2021 Virtual Workshop

This video talks about the common methods used to conduct a GWAS meta-analysis and the software that is commonly used to do this.
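
One commonly used approach is the inverse-variance weighted fixed-effect meta-analysis. The sketch below is illustrative only and is not the code of any particular meta-analysis package; it assumes each cohort reports an effect estimate and standard error for the same SNP on the same scale and for the same effect allele. The function name is hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    Returns the pooled effect, its standard error and the z statistic.
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need one standard error per effect estimate")
    # More precise cohorts (smaller standard errors) receive larger weights.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Illustrative values for two cohorts.
beta, se, z = fixed_effect_meta(betas=[0.1, 0.3], standard_errors=[0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f}")
```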

#### DP2. Practical: Polygenic Risk Scores

Baptiste Couvy-Duchesne, June 2022 Virtual Workshop

This practical focuses on calculating a polygenic risk score on a toy dataset (using PRSice) and evaluating its prediction accuracy in a sample of related individuals. The practical involves using several software and packages (OpenMx, R, GCTA) and includes key data management steps (merge, wide-long data formatting).
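
At its core, a polygenic risk score is a weighted sum of allele counts. The sketch below shows only that sum and is not how PRSice is implemented; clumping, thresholding and allele matching are left out. It assumes dosages count the same effect allele that the GWAS weights refer to, and the function name is hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one person.

    `dosages` are effect-allele counts (0 to 2) and `weights` are the
    per-allele GWAS effect sizes for the same SNPs, in the same order.
    """
    if len(dosages) != len(weights):
        raise ValueError("need one weight per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


# Illustrative values for three SNPs.
print(polygenic_score(dosages=[2, 1, 0], weights=[0.5, -0.25, 1.0]))
```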

#### E1. Introduction to Rare+SAIGE

Wei Zhou, June 2021 Virtual Workshop

This video gives a brief outline of the lectures on GWASs of common and rare variants in large biobanks and data sets.

#### E2. GWAS in large-scale biobanks and cohorts

Wei Zhou, June 2021 Virtual Workshop

This is a lecture to introduce the R package SAIGE, which was developed for conducting GWASs in large-scale biobanks, while accounting for unbalanced case-control ratios, handling sample relatedness, and being scalable for large data sets.

#### E3. Rare variant association tests

Zhangchen Zhao, June 2021 Virtual Workshop

This is a lecture about set-based association tests for testing the association between rare genetic variants and human diseases and traits. It introduces the function called SKATbinary implemented in the SKAT library that conducts set-based association tests while accounting for unbalanced case-control ratios for binary phenotypes.

#### E4. Rare variant association tests in large-scale biobanks and cohorts

Wei Zhou, June 2021 Virtual Workshop

This is a lecture about conducting set-based association tests in large-scale biobanks and cohorts. It introduces the R package SAIGE-GENE, which was developed for conducting exome-wide or genome-wide set-based tests for rare variants in large-scale biobanks, while accounting for unbalanced case-control ratios, handling sample relatedness, and being scalable for large data sets.

#### Introduction to Mendelian randomization - Part 2

David M. Evans, June 2021 Virtual Workshop

How does Mendelian randomization work?

#### Introduction to Mendelian randomization - Part 3

David M. Evans, June 2021 Virtual Workshop

Calculating causal effect estimates via Mendelian randomization
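
For a single genetic instrument, one standard estimator is the Wald ratio. The sketch below is not taken from the video; it assumes a valid instrument and uses a first-order approximation for the standard error that ignores uncertainty in the SNP-exposure effect. The function and argument names are hypothetical.

```python
def wald_ratio(beta_snp_exposure, beta_snp_outcome, se_snp_outcome):
    """Causal effect estimate of exposure on outcome from one instrument.

    The estimate is the SNP-outcome effect divided by the SNP-exposure effect.
    The standard error is a first-order approximation that treats the
    SNP-exposure effect as known.
    """
    if beta_snp_exposure == 0:
        raise ValueError("the SNP must be associated with the exposure")
    estimate = beta_snp_outcome / beta_snp_exposure
    standard_error = se_snp_outcome / abs(beta_snp_exposure)
    return estimate, standard_error


# Illustrative values only.
print(wald_ratio(beta_snp_exposure=0.5, beta_snp_outcome=0.25, se_snp_outcome=0.125))
```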

#### Introduction to Mendelian randomization - Part 4

David M. Evans, June 2021 Virtual Workshop

An example using Mendelian randomization

#### Introduction to Mendelian randomization - Part 5

David M. Evans, June 2021 Virtual Workshop

Limitations to Mendelian randomization

#### Introduction to Mendelian randomization - Part 6

David M. Evans, June 2021 Virtual Workshop

Introduction to the MR Base website

#### Sensitivity analyses in Mendelian randomization studies - Part 1

David M. Evans, June 2021 Virtual Workshop

Inverse variance weighted MR analysis. The importance of “strand” when conducting two sample MR studies across cohorts

#### Sensitivity analyses in Mendelian randomization studies - Part 2

David M. Evans, June 2021 Virtual Workshop

Horizontal pleiotropy in Mendelian randomization studies, heterogeneity testing and multivariable Mendelian randomization

#### Sensitivity analyses in Mendelian randomization studies - Part 3

David M. Evans, June 2021 Virtual Workshop

MR-Egger regression

#### Sensitivity analyses in Mendelian randomization studies - Part 4

David M. Evans, June 2021 Virtual Workshop

The MR median estimator

#### Sensitivity analyses in Mendelian randomization studies - Part 5

David M. Evans, June 2021 Virtual Workshop

The MR modal based estimator

#### Sensitivity analyses in Mendelian randomization studies - Part 6

David M. Evans, June 2021 Virtual Workshop

Reverse causal instruments and Steiger filtering

## Statistics

### All

## Special Topics

## Estimating and modeling SNP h^{2} and r_{g}

### All

#### A1. Introduction to genetic relatedness

Katrina Grasby, June 2022 Virtual Workshop

In this video various terms that describe genetic relatedness are introduced. The role of recombination in genetic variation and linkage disequilibrium is described. The concepts of identity-by-state and identity-by-descent are compared.
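
Identity-by-state can be counted directly from observed genotypes, whereas identity-by-descent has to be inferred. The sketch below (not from the video) computes the mean proportion of alleles shared identical-by-state by two people. It assumes biallelic SNPs coded 0, 1 or 2 with no missing data; the function name is hypothetical.

```python
def mean_ibs(genotypes_a, genotypes_b):
    """Mean proportion of alleles shared identical-by-state across SNPs.

    At each SNP two people share 2, 1 or 0 alleles by state, which for
    0/1/2 genotype coding equals 2 minus the absolute genotype difference.
    """
    if not genotypes_a or len(genotypes_a) != len(genotypes_b):
        raise ValueError("need the same non-empty set of SNPs for both people")
    shared = [2 - abs(a - b) for a, b in zip(genotypes_a, genotypes_b)]
    return sum(shared) / (2 * len(shared))


# Illustrative genotypes at three SNPs.
print(mean_ibs([0, 1, 2], [0, 2, 0]))
```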

#### A2. Deriving the Phenotypic Covariance

David Evans, June 2022 Virtual Workshop

This video shows how the phenotypic covariance is parameterized under the M-GCTA model.

#### A3. Deriving the Phenotypic Variance

David Evans, June 2022 Virtual Workshop

This video shows how the phenotypic variance is parameterized under the M-GCTA model.

#### A4. Estimate Variance Components

David Evans, June 2022 Virtual Workshop

This video introduces the variance components model underlying the GCTA software package.

#### B1. From correlation coefficients to variance components

Baptiste Couvy-Duchesne, June 2022 Virtual Workshop

Models to estimate heritability (twin or SNP h2) are typically linear models that can be seen as extensions of the simple linear model between two variables, from which one estimates a correlation. Here, we start from the simplest model and progressively complexify it.

#### B2. From correlation coefficients to variance components: part 2 - these models look familiar

Baptiste Couvy-Duchesne, June 2022 Virtual Workshop

This section covers Twin and SNP heritability, h2 from whole genome sequencing, non-additive genetic effects and GWAS approaches. To better compare and understand the different approaches and models we position them in the statistical framework of linear models.

#### B3. From correlation coefficients to variance components: Part 3 - We can write a lot more models

Baptiste Couvy-Duchesne, June 2022 Virtual Workshop

We continue the exploration of the statistical landscape, including polygenic risk scores (calculation and evaluation) and longitudinal models, and conclude by showing how Structural Equation Modeling (SEM) can also be decomposed into a set of linear models.

#### C0. Heritability of individual level data - Welcome message, part 0

Loïc Yengo, June 2021 Virtual Workshop

This is a very short video giving an overview of the lecture about the estimation of additive genetic (co-)variance using individual-level genomic data.

#### C1. Heritability of individual level data - Introduction, part 1

Loïc Yengo, June 2022 Virtual Workshop

This video briefly introduces the concepts of heritability and genetic correlation, and illustrates what these concepts could be used for.

#### C2. Heritability of individual level data - Concepts and tools, part 2

Loïc Yengo, June 2021 Virtual Workshop

This video introduces how to measure genetic relatedness between individuals using genomic data and how to use these measures to estimate the heritability of a trait (or a disease).

#### C3. Heritability of individual level data - Methods, part 3

Loïc Yengo, June 2021 Virtual Workshop

This video introduces two estimators of SNP-based heritability: (1) Haseman-Elston regression, which is a method-of-moments estimator, and (2) Restricted Maximum Likelihood (REML), which is a likelihood-based method.
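
The idea behind Haseman-Elston regression can be sketched in a few lines. This is an illustrative simplification and not the implementation used in the lecture or in any software package: it assumes phenotypes have already been standardized, takes a precomputed genetic relationship matrix, and fits a regression through the origin of phenotype cross-products on relatedness. The function name is hypothetical.

```python
def haseman_elston_h2(phenotypes, grm):
    """Slope from regressing y_j * y_k on the GRM entry A_jk over pairs j < k.

    `phenotypes` are assumed to be standardized (mean 0, variance 1) and
    `grm` is a symmetric matrix given as a list of lists.
    """
    n = len(phenotypes)
    numerator = 0.0
    denominator = 0.0
    for j in range(n):
        for k in range(j + 1, n):
            relatedness = grm[j][k]
            numerator += relatedness * phenotypes[j] * phenotypes[k]
            denominator += relatedness ** 2
    if denominator == 0.0:
        raise ValueError("all off-diagonal relatedness values are zero")
    return numerator / denominator


# Toy values chosen for easy arithmetic, not realistic data.
toy_phenotypes = [0.5, 0.5, -0.5]
toy_grm = [
    [1.0, 0.5, -0.5],
    [0.5, 1.0, -0.5],
    [-0.5, -0.5, 1.0],
]
print(haseman_elston_h2(toy_phenotypes, toy_grm))
```

Because it is a moment estimator, the result is not constrained to lie between 0 and 1 in small samples.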

#### C4. Heritability of individual level data - Interpretation, part 4

Loïc Yengo, June 2021 Virtual Workshop

This video discusses how to interpret estimates of SNP-based heritability, what can bias those estimates and what implication these estimates have for Genome-Wide Association Studies.

#### C5. Heritability of individual level data - Overview of research topics, part 5

Loïc Yengo, June 2021 Virtual Workshop

This video presents examples of active research related to the estimation of heritability from SNP data. It addresses issues related to computational efficiency, estimation of the contribution of non-additive genetic effects, and how mate choice may impact the interpretation of these estimates.

#### CP1. Practical: GREML

Loïc Yengo, June 2021 Virtual Workshop

This practical is based on using the software package GCTA to estimate the heritability of a trait for which causal SNP effects depend on allele frequency and linkage disequilibrium patterns (Lecture Part 4).

#### CP2. Practical: LDSC

Loïc Yengo, June 2021 Virtual Workshop

This practical uses the software package LDSC, which implements the Linkage Disequilibrium Score Regression method to estimate the heritability of a trait (or a disease) using summary statistics from a Genome-Wide Association Study.

#### D1. GCTA- Genetic Relationship Matrix

David Evans, June 2022 Virtual Workshop

How to calculate a genetic relationship matrix (GRM)
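
A minimal sketch of a GRM calculation follows. It is not the GCTA source code; it uses the commonly cited standardization in which each SNP is centered by twice its allele frequency and scaled by 2p(1 - p). It assumes complete genotype data coded 0, 1 or 2, allele frequencies estimated from the sample itself, and a hypothetical function name.

```python
def genetic_relationship_matrix(genotypes):
    """GRM from a people-by-SNPs table of alternate-allele counts (0, 1 or 2).

    Entry (j, k) averages, over polymorphic SNPs, the product of the two
    people's centered genotypes divided by 2p(1 - p).
    """
    n_people = len(genotypes)
    n_snps = len(genotypes[0])
    grm = [[0.0] * n_people for _ in range(n_people)]
    used_snps = 0
    for i in range(n_snps):
        column = [person[i] for person in genotypes]
        p = sum(column) / (2 * n_people)  # sample allele frequency
        if p == 0.0 or p == 1.0:
            continue  # a monomorphic SNP carries no information
        used_snps += 1
        scale = 2 * p * (1 - p)
        centered = [x - 2 * p for x in column]
        for j in range(n_people):
            for k in range(n_people):
                grm[j][k] += centered[j] * centered[k] / scale
    if used_snps == 0:
        raise ValueError("no polymorphic SNPs")
    return [[value / used_snps for value in row] for row in grm]


# Toy data: three people, three SNPs.
for row in genetic_relationship_matrix([[0, 1, 2], [1, 1, 0], [2, 0, 1]]):
    print([round(value, 3) for value in row])
```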

#### D2. M-GCTA

David Evans, June 2022 Virtual Workshop

This video introduces the M-GCTA model for estimating maternal genetic variance components.

#### D3. Estimating the Importance of Maternal Genetic Effects on Offspring Phenotypes with "M-GCTA"

David Evans, June 2022 Virtual Workshop

Introduction to the G-REML method and GCTA software package.

#### E1. Estimating Parental Effects using Polygenic Scores Part I: Model Introduction

Jared V. Balbona, June 2022 Virtual Workshop

SEM-PGS is a model that uses genetic and phenotypic data to estimate parental effects (both genetic and environmental) as well as assortative mating. Here, we cover the underlying logic of SEM-PGS and explain how it is able to obtain these estimates.

#### E2. Estimating Parental Effects using Polygenic Scores Part II: Model Extensions

Jared V. Balbona, June 2022 Virtual Workshop

In this second video, we cover how SEM-PGS can be used to study assortative mating, and discuss several potential model extensions that can be used to address different types of questions.

#### F1. Genomic SEM Introduction

Andrew Grotzinger, June 2021 Virtual Workshop

This video provides a broad overview of Genomic Structural Equation Modeling (Genomic SEM). It focuses particularly on background information (e.g., motivations for developing the method) and results produced from empirical applications to GWAS summary data.

#### F2. Lavaan syntax and SEM introduction

Andrew Grotzinger, June 2021 Virtual Workshop

In this video the basics of structural equation modeling (SEM) are introduced. In addition, the video illustrates how SEMs are specified using Lavaan syntax, which is what Genomic SEM uses for model estimation.

#### G2. Explaining how S and V are estimated and what they are

Michel Nivard, June 2021 Virtual Workshop

#### GP1. Practical: Working through the examples on the wiki one by one: munge; ldsc; usermodel functions

Michel Nivard, June 2021 Virtual Workshop

#### GP2. Practical: Working through the examples on the wiki one by one: sumstats and GWAS functions

Andrew Grotzinger, June 2021 Virtual Workshop

This video introduces how to use the sumstats function and multivariate GWAS functions (userGWAS; commonfactorGWAS) in Genomic SEM. These functions are often used to estimate the effect of a SNP on a latent factor and to produce the QSNP heterogeneity metric. However, this suite of functions can more generally be used to examine the effect of individual SNPs within a multivariate system of genetically overlapping traits.

#### H1. The Augmented Classical Twin Design

David Evans, June 2022 Virtual Workshop

This video introduces a new structural equation model called the “Augmented Classical Twin Design” which relaxes the Equal Environments assumption in twin studies.

## Computing

These are topics related to installation and basic usage of software used in other topics.

### All

#### L1. Introduction to the Unix/Linux command line

Jeffrey Lessem, June 2021 Virtual Workshop

A brief introduction to using the Unix/Linux command line focusing on tasks that will be necessary for practicals at the Workshop. It covers basic concepts that people who have never used a command line should be familiar with.

#### R2. Downloading and Installing R and RStudio

Elizabeth Prom-Wormley, June 2021 Virtual Workshop

This video is well-suited for those who have had limited/no exposure to R. Students will receive a step-by-step approach to downloading R onto their computers, installing R, and installing RStudio as well as some basic background on how R works.

#### R3. Find, Open, and Review Files in R

Elizabeth Prom-Wormley, June 2021 Virtual Workshop

This video is well-suited for those who have had limited exposure to R. It will walk students through using files within R and conducting preliminary investigations of the variables in a dataset within base R.

#### R4. Data management in R

Elizabeth Prom-Wormley, June 2021 Virtual Workshop

This video is well-suited for those who have had limited exposure to R. It will guide students through basic functions for data management within base R.

#### R5. Graphics and Basic Statistics in R

Elizabeth Prom-Wormley, June 2021 Virtual Workshop

This video is well-suited for those who have had limited exposure to R. It will guide students through basic functions for graphics and data visualization within base R.

#### R6. Working with Twin Data in R

Elizabeth Prom-Wormley, June 2021 Virtual Workshop

This video will focus on working with twin data. In particular, this video will help students establish the basics of running basic analyses and visualizing twin data.

## Mendelian Randomization

## Guidelines for Responsible Conduct

## Workshop Contact Information

The Workshop Coordinator can be contacted with questions related to registration, scheduling, and travel at IBGworkshop@colorado.edu