Allison Atteberry
Associate Professor
Research & Evaluation Methodology

Institute of Behavioral Science, Room 333
University of Colorado Boulder
249 UCB
Boulder, CO 80309

Allison Atteberry is an associate professor in the Research and Evaluation Methodology (REM) program, within the CU Boulder School of Education. She received her PhD in 2011 from the Stanford School of Education in educational policy analysis, with a minor in statistics. Atteberry conducts research on teacher- and school-level interventions designed to improve the quality of instruction experienced by historically underserved students. As a field, we are increasingly aware of how difficult it is to determine whether policies, practices, and interventions have the intended impacts, and so Atteberry approaches her work with a strong interest in what constitutes compelling evidence of causal effects in quantitative research.

In terms of methods, Atteberry teaches and uses both econometric and statistical approaches to education policy analysis. She has a particular interest in the estimation of education production functions in the context of value-added modeling, as well as randomized control trials, instrumental variables, regression discontinuity, propensity score matching, fixed effects, and difference-in-differences causal models. Atteberry also enjoys using hierarchical linear models given their unique suitability for asking sociological questions in nested settings (e.g., repeated observations nested within students, nested within schools, etc.).

Atteberry’s academic interests center on policies and interventions that are intended to help provide effective teachers to the students who need them most. This has led her to focus on the identification, selection, development, and retention of teachers who have measurable impacts on student achievement. Specific topics include teacher preparation, high-quality professional development, mentoring and peer collaboration, efforts to use measures of effectiveness formatively to improve practice, policies that target district responses to teachers and schools based on measures of effectiveness, and incentives for the strongest teachers to work in hard-to-serve schools.


  • PhD Policy Analysis, School of Education, Stanford University, 2011
  • Graduate Minor, Department of Statistics, Stanford University, 2010
  • BA Sociology; BA Political Science, University of Chicago, 2005

Recent Projects

  1. The Causal Effects of Full- vs. Half-Day Pre-K: A Randomized Control Trial. Are children more successful in early schooling if they have access to full-day pre-K? Westminster Public School District is piloting a new full-day pre-K program to supplement its existing half-day offerings. Due to oversubscription to the pilot, we used a lottery to randomly assign children to receive a full-day spot or an offer of half-day. As of 2019, all three cohorts have completed their pre-K year, and we have estimated the causal effects on student early literacy, special education referrals, and socio-emotional outcomes at the end of the pre-K year and in the fall of kindergarten (see EEPA publication here). We are also following study participants through third grade. We have a number of additional manuscripts under way, including studies of how time is used in these classrooms, how the provision of full-day pre-K affects family and home lives, and whether children who experience full-day pre-K have different likelihoods of receiving special education designations. With colleagues: Daphna Bassok, Vivian Wong.
  2. The Misattribution of Summers in Teacher Value-Added. In a 2020 ER article, I take up one methodological concern about teacher value-added model (VAM) based effect estimates: State tests are given only once annually in the spring. As a result, teachers’ estimated effects are based on their students’ test scores from the previous- to the current-spring, thus subsuming the summer before teachers even meet those students. Using a unique dataset with both fall and spring scores, I also estimated VAMs using current-fall to next-fall timing, which instead includes the summer after a given school year. Teachers’ VA-based scores from these two VAM timings—both of which purport to capture the same teacher’s effect in the same school year—turn out to be essentially unrelated to one another (ρ=0.13). This finding is concerning. There is no clear reason to prefer spring-to-spring over fall-to-fall time frames, and this choice would lead to an entirely different ranking of teachers.
  3. Trends in Student- and Teacher Outcomes during the Era of Denver ProComp. How have student achievement and the teacher workforce changed since Denver began one of the first experiments with teacher merit pay in the U.S.? Denver Public Schools (DPS) started ProComp in 2006, and the system includes up to 10 different incentives for various forms of teaching effort and effectiveness. We explore whether highly effective teachers are more likely to remain in DPS since ProComp began. Have strong teachers been more likely to seek employment in DPS since the onset of the program? Has student achievement improved? We use interrupted time series methods to examine whether trends in outcomes are consistent with the rollout of this historic pay-for-performance system. The manuscript is in press at Teachers College Record, and a link to the current draft is here.
  4. The Role of Summers in Achievement Disparities. What role does summer vacation play in the expansion of achievement gaps during K-12? We use a unique dataset with longitudinal records across K-12 for over half a million students to examine when during school-age years achievement gaps widen the most: during summers or between the first and last day of the school year. The article is in press at AERJ, and an Annenberg EdWorkingPaper version of the paper [No. 19-82] is available here.
  5. A Research-Practice Partnership with Denver Public Schools on Supporting their Teacher Workforce. Professors Allison Atteberry and Mimi Engel from the CU Boulder School of Education are in the early stages of forming a long-term, research-practice partnership (RPP) with the Denver Public School District (DPS), called the Teacher Workforce Collaborative (TWC).  TWC connects CU professors with DPS’ Talent Management team. The focus of the RPP is closing Denver’s large and persistent achievement gaps. The mechanism for doing so—strengthening the District’s teacher workforce—is the focus of the Partnership. 
  6. Not Where You Start, But How Much You Grow: An Addendum to the Coleman Report. I remember learning, in my first year of graduate school, that the canonical 1966 Coleman Report established that only 10-20% of the variation in student achievement lies between schools. Schools, it seemed, were simply not a powerful lever to shape students’ outcomes, a takeaway that shook the field. Yet this “schools don’t matter” narrative is difficult to reconcile with the readily observable differences across schools, the formative nature of schooling, and a large body of subsequent evidence documenting aspects of schooling that can improve students’ trajectories. In an article in press at ER, I revisit the Coleman analysis, but instead of just decomposing the variance in students’ achievement levels within and between schools, I also decompose the variance in students’ achievement growth rates. Coleman himself promoted this approach but did not have the necessary longitudinal data. Unlike achievement levels, most (over 70%) of the variation in student test score growth rates lies between schools. These findings are not definitive, as they touch on some key measurement issues. The article therefore calls for replication in other data settings. The results are nonetheless intriguing, challenging one of the dominant narratives about schools as weak influencers of student outcomes.
  7. Opening the Gates: Detracking and the International Baccalaureate.  There is broad agreement about the benefits of taking AP and/or IB courses in high school. Nonetheless, student access to such courses remains uneven and inequitable, due largely to the practice of tracking students by perceived “ability.” These tracking practices are often defended based on the contention that detracking and mixed-ability classes are impractical or unworkable. In this 2019 TCR article, we study a reform that combines two basic elements: detracking in Grades 6 through 10 plus open IB enrollment in Grades 11 and 12. The results associated with this detracking reform challenged two widespread beliefs. First, the school’s highest achievers continued to succeed in the more heterogeneous IB classes. Second, the average IB scores for the school’s lower achievers were the same or higher after detracking began, even though many more such students enrolled in those courses. 

Teacher Workforce Collaborative, a Research-Practice Partnership

Overview: Professors Allison Atteberry and Mimi Engel from the CU Boulder School of Education are in the early stages of forming a long-term, research-practice partnership (RPP) with the Denver Public School District (DPS), called the Teacher Workforce Collaborative (TWC).  TWC connects CU professors with DPS’ Talent Management team. The focus of the RPP is closing Denver’s large and persistent achievement gaps. The mechanism for doing so—strengthening the District’s teacher workforce—is the focus of the Partnership. 

Motivation: Racial/ethnic and socioeconomic achievement gaps in DPS are among the largest in the United States. Hispanic-White and Black-White gaps are 2 to 3 times larger than average gaps nationally and are at the 97th percentile of all U.S. districts. The sheer magnitude of this problem necessitates that TWC focus on strategies that have the greatest potential for moving the needle. Research consistently shows that teachers are the number one in-school malleable factor for improving student outcomes. However, research also documents troubling inequalities in access to effective teaching among schools within districts. Non-White and low-income students also typically attend schools with less experienced teachers and higher teacher turnover rates. The connection between the teacher workforce and youth inequality therefore arises from reduced access to effective teachers in schools serving students from historically marginalized populations.

Activities: The TWC Core Team, composed of researchers at both institutions, comes together on a bimonthly basis to craft and pursue a Joint Research Agenda related to understanding and strengthening DPS’ policies to recruit, place, develop, and retain the strongest teachers for the students who need them most. We produce internal policy documents for our DPS Partners, and we publish relevant findings in peer-reviewed journals. In addition, we develop junior researchers’ skills to conduct high-quality, place-based research.

Courses Taught

    EDUC 8240: Quantitative Methods in Educational Research II

    Statistical analysis can be a powerful tool in understanding social, educational, psychological, and developmental processes. In cases where it is impossible or impractical to collect data on every individual, classroom, teacher, and school of interest, statistical analysis allows us to examine data on a sample of individuals (or classrooms, schools, etc.) in order to infer patterns in a larger population. For example, we might want to examine data on achievement test scores and per-pupil spending for a sample of schools to determine whether there is an association between spending and achievement patterns in the population. Or we might want to examine the association between race/ethnicity and achievement patterns. Moreover, if we find such an association, we might wish to ask additional questions, such as whether race/ethnic differences in achievement patterns can be accounted for by race/ethnic differences in family socioeconomic characteristics or in school quality.

    In this course we learn to answer such questions using regression analysis—a statistical tool that allows us 1) to describe average patterns of association among multiple variables observed in a sample and 2) to make inferences about the patterns of association among these variables in a population. Regression analysis is a powerful statistical method with many variations. Our goal in this course is to develop an understanding of the basic methods, including their limitations, and to develop skill in using regression analysis to answer educational research questions. Finally, because an important part of any analysis is communicating the results to an audience, we will also place considerable emphasis on learning to present (in writing, tables, and figures) the results of regression analyses. By the end of the semester, students in this course should be sufficiently skilled in regression analyses that they can critically examine published research using regression and can carefully perform their own analyses.


    EDUC 7456: Multilevel Models

    Why study multilevel models? It turns out that most human behavior takes place in nested settings. Here are just a few examples: children nested in families; students nested in teachers (and in turn in schools); households nested in counties; school districts nested in states; and repeated observations over time nested within individuals. The fundamental phenomenon of interest in much behavioral research involves individuals being affected by the groups or organizations in which they live or to which they belong. This course is certainly methodological, but it’s also conceptual in the sense that we will develop a lens for understanding how social organizations shape people’s lives.

    Multilevel models (MLM), also known as hierarchical linear models (HLM), random-effects or random-coefficient models, variance component models, and (generalized) linear mixed models, are used when the units of observation (e.g., students) are grouped within clusters (e.g., schools). In such clustered data, observations for the same cluster cannot be assumed to be mutually independent for given covariate values as required for conventional regression models. Longitudinal or repeated measures data can also be thought of as clustered data with measurement occasions clustered within subjects; hierarchical models for longitudinal data are also known as growth curve models. This course will consider the statistical foundations of hierarchical linear models and focus on their application in behavioral and social research.
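
    As a minimal illustration of the clustering described above, a two-level random-intercept model for students i nested in schools j might be written as follows. The notation (y_ij, x_ij, u_j, tau^2, sigma^2, rho) is assumed here for illustration and is not taken from the course materials.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}, \\
u_j &\sim N(0, \tau^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
\rho &= \frac{\tau^2}{\tau^2 + \sigma^2}.
\end{align}
```

    Here rho is the intraclass correlation, the share of residual variance that lies between clusters. With u_j and epsilon_ij independent, rho is also the correlation between two observations from the same cluster given the covariates, so whenever rho > 0 the independence assumption of conventional regression does not hold.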


    EDUC 7326: Quasi-Experimental Design for Causal Inference in Social Sciences

    Assessing the causal effects of social and educational policies and practices is one important aim of educational and social science research. Educational researchers may want to know, for example, what effect a particular teaching practice has on student learning, what effects accountability policies have on teaching practices, or what effect early childhood education programs have on school readiness. Sociologists may want to know what the effects of certain neighborhood conditions are on child development, or what the effects of social networks are on individual behavior.

    Historically, however, much educational and social science research has not been designed in such a way as to allow researchers to make credible causal inferences about the effects of educational and social practices and policies. In part, this is because many quantitative studies in education and the social sciences are essentially correlational in nature—they may show that there are statistical associations among sets of policy and practice variables and outcomes, but they do not provide convincing evidence of the causal linkages among these variables.

    In recent decades, however, the so-called counterfactual or potential outcomes model (also called the “Rubin Causal Model”) and related developments have dramatically changed the way that social scientists have thought of causality. The new causal framework is not so much a set of technical models, but a precise logical framework for thinking about causality—and what constitutes evidence of causality—in the social sciences.
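
    One way to make this framework concrete is the sketch below. The notation is assumed here for illustration: D_i is a binary treatment indicator, Y_i(1) and Y_i(0) are the potential outcomes of unit i with and without treatment, and delta_i is the unit-level effect.

```latex
\begin{align}
\delta_i &= Y_i(1) - Y_i(0), \\
Y_i &= D_i\,Y_i(1) + (1 - D_i)\,Y_i(0), \\
\text{ATE} &= E[\,Y_i(1) - Y_i(0)\,], \\
E[\,Y_i \mid D_i = 1\,] - E[\,Y_i \mid D_i = 0\,] &= \text{ATE}
  \quad \text{if } \bigl(Y_i(1), Y_i(0)\bigr) \perp D_i .
\end{align}
```

    Only one of the two potential outcomes is ever observed for each unit, so delta_i cannot be computed directly. Random assignment makes the independence condition in the last line hold by design; without randomization, a simple difference in group means generally mixes the causal effect with pre-existing differences between the groups.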

    This course introduces students to a toolkit of quantitative methods to enable them to make valid causal inferences, particularly in the absence of a true randomized experiment. The methods covered in the course include 1) randomized experiments; 2) instrumental variables; 3) the use of natural and quasi-experiments; 4) longitudinal methods, including comparative interrupted time-series methods and difference-in-differences methods; 5) regression discontinuity; 6) matching estimators, including propensity score matching; 7) fixed effects estimators; and 8) value-added models. These methods offer considerable power to researchers interested in generating convincing and credible evidence of causal effects.


    Miscellaneous Thoughts on Teaching

    The core of my passion for teaching is simple: I like to take materials or concepts that seem impenetrable and translate them into something transparent and conquerable. This motivation comes from my own, often painful, experiences with math classes growing up, which led me to believe I was “not a math person.” The self-misperception stayed with me through college and didn’t get corrected until I was required to take quantitative methods courses in graduate school. Once there, I discovered a deep passion for statistics and econometrics that drives me to this day. I feel lucky that I had a few great professors who gave me the chance to discover a field I love and overcome a counter-productive self-narrative. In this way, great teachers changed the course of my life, which also inspires my interest in the power of teachers. Unfortunately, not all students get a second chance at confronting their educational fears.

    This kind of experience has been shown to be particularly salient for women and minority students. Indeed, women are less likely to take advanced science courses in high school and to choose science- or math-related careers (Charles 2005). Previous research has shown that women’s avoidance of math and science careers may stem from their performance in mathematics and science in high school and even at the onset of formal schooling (Entwisle & Alexander 1990; Hyde, Fennema, & Lamon 1990; Busch 1995; Downey & Vogt-Yuan 2005). However, research suggests that this phenomenon is less related to actual skill and instead accounted for by underlying psychological experiences in the classroom. Stereotype threat occurs when a person who belongs to a group that has a negative stereotype attached to it (say, women in math classes) subconsciously conforms to the negative stereotype by performing a task to a lesser degree than they would otherwise. This may play a crucial role in women’s seeming underperformance in and aversion to historically “male” subjects (see, e.g., Kiefer & Sekaquaptewa 2007; Nosek, Banaji, & Greenwald 2002).

    It has become very common for schools of education to require all doctoral students to complete a year-long sequence in statistical training, so that they can become informed consumers of research regardless of the methodological approach used. The CU-Boulder School of Education is one such program that has made a commitment to strong methodological training for all students, regardless of background or degree program. However, when students return to doctoral programs in education, they often have spent a fair bit of time away from academia and sometimes have gone years without participating in math-intensive coursework. In addition, graduate students choose to pursue a doctorate in education for a very wide variety of purposes, many of which have little to do with quantitative research. These factors can lead to particularly high levels of anxiety surrounding statistical coursework embedded within education graduate programs. In turn, that anxiety may produce a barrier to deep engagement with the material.

    I adopt several strategies to try to provide many avenues of access for students of all levels in my introductory courses. One example strategy is the use of iClickers (remote-like devices) that students bring to class. The instructor can then design multiple choice questions and solicit anonymous responses relevant to course content. The iClicker System allows for active participation by all students and provides immediate feedback to the instructor about any confusion or misunderstandings of the material being presented. The main potential benefits of iClickers are (a) to increase student engagement in class and (b) to provide an anonymous but direct line of communication between the instructor and students who feel hesitant or scared to reveal their struggles with the material.

    I bring iClickers into my classes primarily to address the fact that I’ve often found that students are hesitant to reveal when they’re struggling. Students sometimes “suffer silently” and assume they are the only one not “getting it.” I begin my courses by encouraging my students to assume any confusion that will arise stems from the instruction, not the learner. Instructors can make mistakes, explain things poorly, or assume prior knowledge they shouldn’t. I tell them that I want every student to feel comfortable speaking up if something is unclear. That said, I understand that advanced statistics courses can involve a certain level of vulnerability, and some students may prefer to use the clicker to interact anonymously. I have found that, in my course evaluations, there are always a few students who tell me the iClickers helped them interact during class even when they didn’t necessarily feel comfortable raising their hand. 

    Selected Publications

    1. Atteberry, A., Loeb, S. L., Wyckoff, J. (2016). “Teacher Churning Within Schools: Impacts on Student Achievement.” Educational Evaluation and Policy Analysis. (See article in Education Week)
    2. Atteberry, A., McEachin, A. (2016). “School’s Out: Summer Learning Loss Across Grade Levels and School Contexts in the Modern Era.” In Alexander, K., Pitcock, S., & Boulay, M. (eds.) Summer Slide: What We Know and Can Do About Summer Learning Loss. New York, NY: Teachers College Press.
    3. Atteberry, A., Loeb, S., Wyckoff, J. (2015). “Do First Impressions Matter? Improvement in Early Career Teacher Effectiveness.” AERA Open Access Journal: October. (See article in Education Week)
    4. Atteberry, A., and Bryk, A. S. (2010). “Analyzing the Role of Social Networks in School-Based Professional Development Initiatives.” In A. J. Daly (Ed.), Social Network Theory and Educational Change. Cambridge, MA: Harvard Education Press.



    • Dr. Atteberry's dissertation, entitled Validity of Value-Added Estimation: Investigations into Meaning and Measure, focused on estimation of causal effects of teachers and schools and implications for accountability systems (Dissertation Committee: Tony Bryk, Susanna Loeb, Sean Reardon).