Research & Evaluation Methodology
School of Education, Room 235
University of Colorado Boulder
Boulder, CO 80309-0249
Lorrie A. Shepard, PhD, is University Distinguished Professor and Dean of the School of Education at the University of Colorado Boulder. Her research focuses on psychometrics and the use and misuse of tests in educational settings. Her technical work has contributed to validity theory, standard setting, and statistical models for detecting test bias. Her research studies on test use have addressed the identification of learning disabilities, readiness screening for kindergarten, grade retention, teacher testing, effects of high-stakes accountability testing, and most recently the use of classroom assessment to support teaching and learning.
Dr. Shepard is past president of the American Educational Research Association and past president of the National Council on Measurement in Education. She was elected to the National Academy of Education in 1992 and served as president of the NAEd from 2005 to 2009. Dr. Shepard has also served as editor of the Journal of Educational Measurement and of the American Educational Research Journal. She received the Distinguished Career Award from the National Council on Measurement in Education, the award for Distinguished Contributions to Research in Education from the American Educational Research Association, the 2005 Henry Chauncey Award for Distinguished Service to Assessment and Education Science from Educational Testing Service, and the 2006 David G. Imig Award for Distinguished Achievements in Teacher Education from the American Association of Colleges for Teacher Education.
PhD Research and Evaluation Methodology, University of Colorado Boulder, 1972
MA Counseling, University of Colorado Boulder, 1970
BA History, Pomona College, 1968
My early work focused on technical aspects of test development such as standard setting, item bias detection, and validity theory. A cardinal principle of validity research, however, is that test validity depends on how a test is used. As a result, I began to consider contexts of test use. For example, should a readiness test—made up of old IQ test items—be used to keep children out of kindergarten?
My most recent work focuses on classroom assessment. Drawing on cognitive research and sociocultural theory, I examine ways that assessment can be used as an integral part of instruction to help students learn.
My teaching interests focus on research methodology and on testing and assessment topics important to preservice teachers. I want to ensure that doctoral students in the School of Education have a solid grounding in both quantitative and qualitative research methods and that they know how to use tool skills associated with these methods in conducting systematic, disciplined studies on important topics. My goal for teacher candidates is that they be well prepared to use formative as well as summative assessments in their classrooms. Especially, they should be able to analyze assessment results to identify and respond to student needs and to revise instruction based on evidence of student learning.
Courses frequently taught
EDUC 5716: Basic Statistical Methods
Course Overview and Objectives
This course is designed to provide a broad overview of statistical concepts and procedures commonly used in the social sciences (e.g., psychology, sociology, education). The first part of the course will focus on descriptive statistics and the second part on inferential statistics. A goal of the course is to help students gain an understanding of statistics found in journal articles and in evaluation and policy reports published at the national, state, and district level. The idea is for you to become a thoughtful consumer of statistics, able to make sense of what the numbers mean--instead of only trusting the author to tell you what they mean--and to be aware of common fallacies. The goal of this first level course is not to make you a proficient data analyst. Students are asked to do simplified calculations by hand as a way of "seeing" how the numbers work. For example, how do changes in individual scores affect changes in a summary statistic? Students are also introduced to use of statistical software, Statistical Package for the Social Sciences (SPSS), as an easy way to obtain graphical displays and statistical summaries.
EDUC 7416: Seminar on Assessment of Student Learning
Course Overview and Objectives
This seminar is a special topics course focused on issues of assessment of learning and academic achievement in the current context of standards-based educational reform. Key ideas and understandings to be developed in the course include the following:
- The traditional measurement concepts of reliability, validity, and fairness. How are these terms understood by measurement specialists? What should their meaning be for curriculum specialists and teacher educators?
- The important differences between assessments conducted in classrooms as an on-going part of the teaching-learning process vs. externally mandated assessments used to monitor trends or to hold schools accountable.
- Behaviorist learning theories of the past have shaped both teaching and testing practices. What are the implications of cognitive, constructivist, or situated learning theories for changing assessment practices?
- The admirable intentions and controversial aspects of standards-based educational reform. What vision of assessment is put forward? Why does assessment have such a prominent place in the arguments for reform?
- What is meant by "authentic" and "direct" assessment? What does it look like in each of the subject areas for assessments to embody meaningful content and processes? What assessment strategies--e.g., observations, performances, portfolios, projects, essays, and presentations--are effective in addressing substantive goals? How can these assessment strategies contribute to the learning process and what kind of information do they provide?
- What is the role of assessment in the learning process? How are expectations communicated to students and feedback used to guide improvement? How can assessments be used to get beyond "knows it" or "doesn't know it" to provide insights about the specifics of students' understandings and misconceptions that are held? How is a classroom culture created whereby teacher evaluations of performance are seen as fair and as valued coaching rather than as damaging to one's relationship with students?
- How should each of the general principles regarding assessment and assessment practices be modified to take account of the specific characteristics of students--especially students' age, language background, and special learning needs?
Service & Outreach
Selected Professional Service
- National Academy of Sciences Board on Testing and Assessment, 1999-present.
- NAEP Validity Studies Panel, National Center for Education Statistics, 1995-present.
- Editor, Journal of Educational Measurement, Volumes 15-17, a publication of the National Council on Measurement in Education, 1977-80.
- Editor, with Mary Lee Smith & Gene Glass, American Educational Research Journal, Volumes 21-23, a publication of the American Educational Research Association, 1983-86.
- Interim Editor, with Hilda Borko, Educational Researcher, a publication of the American Educational Research Association, 2000.
- Co-chair, with Milbrey W. McLaughlin, of the National Academy of Education Panel on Standards-Based Education Reform, 1994-95.
- Co-chair with Sharon Lynn Kagan of the National Education Goals Panel Goal 1 Early Childhood Assessment Resource Group.
- Co-chair, with Steve Gunderson, of the Panel on Improving Education Research (PIER), which provided recommendations to the Congress on reauthorization of the Office of Educational Research and Improvement, 1999-2000.
- President, American Educational Research Association, 1999-2000. Member of the AERA Council and Executive Board, 1998-2001.
- Vice President, National Academy of Education, 1993-97.
- President, National Council on Measurement in Education, 1982-83; Member of the NCME executive board as president-elect and immediate past-president 1981-84.
Selected University Service
- Vice-Chancellor's Advisory Committee on promotion and tenure, 1981-86
- Program Review Panel, Boulder Campus, 1988-91
- Salary Equity Methodology Committee, Boulder Campus, 1990-93
- Chair, Boulder Faculty Assembly Enrollment Task Force, 1994-95
- Member, Vice-Chancellor's Academic Planning Committee, 1995-96
- Boulder Faculty Assembly 1993-1996; Executive Committee, 1994-99.
- Member, Chancellor’s Committee on Federal Relations, Boulder Campus, 1998-present
School of Education
- Dean, 2001-2004, Interim Dean, 1996-98
- Director of Graduate Studies, School of Education, 1988-95
- Chair, Research and Evaluation Methodology, 1975-present
Research Articles: Accommodations
Shepard, L.A., Taylor, G., & Betebenner, D. (1998). Inclusion of Limited-English-Proficient Students in Rhode Island’s Grade 4 Mathematics Performance Assessment. Technical Report 486. Los Angeles, CA: Center for the Study of Evaluation. www.cse.ucla.edu/products/Reports/TECH486.pdf (PDF)
Other resources on testing accommodations for special education students and English-language learners:
Research Articles: Classroom Assessment
Shepard, L.A. (2003). Reconsidering large-scale assessment to heighten its relevance to learning. In J. M. Atkin & J. E. Coffey (Eds.), Everyday Assessment in the Science Classroom. Arlington, VA: National Science Teachers Association.
Shepard, L.A. (2000). The role of assessment in a learning culture. Educational Researcher, 29 (7), 4-14.
Shepard, L.A. (1997). Measuring achievement: What does it mean to test for robust understanding? William H. Angoff Memorial Lecture Series, Third Annual. Educational Testing Service.
Shepard, L.A. (1989). Why we need better assessments. Educational Leadership, 46, 4-9.
Shepard, L.A. (1991). Psychometricians' beliefs about learning. Educational Researcher, 20, 2-16.
Shepard, L.A. (1991). Will national tests improve student learning? Phi Delta Kappan, 72, 232-238. www.cse.ucla.edu/products/Reports/TECH342.pdf
Shepard, L.A. (1992). Commentary: What policy makers who mandate tests should know about the new psychology of intellectual ability and learning. In B.R. Gifford & M.C. O'Connor (Eds.), Changing assessments: Alternative views of aptitude, achievement, and instruction. Boston: Kluwer Academic Publishers.
Shepard, L.A. (1993). The place of testing reform in educational reform: A reply to Cizek. Educational Researcher, 22, 10-13.
Shepard, L.A. (1995). Using assessment to improve learning. Educational Leadership 52, 38-43.
Shepard, L.A. (1996). Measuring achievement: What does it mean to test for robust understandings? William H. Angoff Memorial Lecture Series. Princeton, NJ: Educational Testing Service. www.ets.org
Shepard, L.A. (2001). The role of classroom assessment in teaching and learning. In V. Richardson (Ed.), The Handbook of Research on Teaching, Fourth Edition. Washington, DC: American Educational Research Association. Technical Report 517. National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
Shepard, L.A., & Bliem, C.L. (1995). Parents' thinking about standardized tests and performance assessments. Educational Researcher, 24, 25-32.
Shepard, L.A., Flexer, R.J., Hiebert, E.H., Marion, S.F., Mayfield, V., & Weston, T.J. (1996). Effects of introducing classroom performance assessments on student learning. Educational Measurement: Issues and Practice, 15, 7-18.
Research Articles: Grade Retention
Heubert, J. P., & Hauser, R. M. (Eds.). (1999). High stakes: Testing for tracking, promotion, and graduation. Washington, DC: National Academy Press. [See Chapter 6 on Promotion and Retention.]
Articles by Shepard and Smith:
Shepard, L.A. (1994). Grade repeating. In T. Husen & T.N. Postlethwaite (Eds.), The International Encyclopedia of Education (2nd ed.). Oxford: Pergamon Press.
Shepard, L.A. & Smith, M.L. (1987). Effects of kindergarten retention at the end of first grade. Psychology in the Schools, 24, 346-357.
Shepard, L.A., & Smith, M.L. (Eds.). (1989). Flunking Grades: Research and Policies on Retention. London: The Falmer Press. (PDF)
Shepard, L.A. & Smith, M.L. (1990). Synthesis of research on grade retention. Educational Leadership, 47, 84-88.
Shepard, L.A., Smith, M.L., & Marion, S.F. (1996). Failed evidence on grade retention. Psychology in the Schools, 33, 251-261.
Shepard, L.A., Smith, M.L., & Marion, S.F. (1998). On the success of failure: A rejoinder to Alexander. Psychology in the Schools, 35, 404-406. www3.interscience.wiley.com/cgi-bin/issuetoc?ID=10049909
Smith, M.L. & Shepard, L.A. (1987). What doesn't work: Explaining policies of retention in early grades. Phi Delta Kappan, 68, 129-134. (PDF)
Research Articles: High-Stakes Testing
Shepard, L. A. (2002). Standardized tests and high-stakes assessment. In Guthrie (Ed.), Encyclopedia of Education, Vol. 6, 2nd ed, pp. 2533–2537. New York: Macmillan Reference.
Koretz, D., Linn, R.L., Dunbar, S.B., & Shepard, L.A. (1991, April). The effects of high-stakes testing on achievement: Preliminary findings about generalization across tests. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
Shepard, L. A. (1988, April). Should instruction be measurement driven?: A debate. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.
Shepard, L.A. (1990). Inflated test score gains: Is the problem old norms or teaching the test? Educational Measurement: Issues and Practice, 9, 15-22.
Shepard, L. A. (1991). Will national tests improve student learning? Phi Delta Kappan, 72, 232-238.
Shepard, L.A., & Cutts-Dougherty, K. (1991, April). Effects of high-stakes testing on instruction. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
Taylor, G., Shepard, L., Kinner, F., & Rosenthal, J. (2000). A survey of teachers’ perspectives on high-stakes testing in Colorado: What gets taught, what gets lost? Technical Report 588. National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
Other important references:
Heubert, J. P., & Hauser, R. M. (Eds.). (1999). High stakes: Testing for tracking, promotion, and graduation. Washington, DC: National Academy Press.
U.S. Congress, Office of Technology Assessment, Testing in American Schools: Asking the Right Questions, OTA-SET-519 (Washington, DC: U.S. Government Printing Office, February 1992).
Research Articles: Learning Disabilities
Davis, W.A. & Shepard, L.A. (1983). Specialists' use of tests and clinical judgment in the diagnosis of learning disabilities. Learning Disability Quarterly, 6, 128-138.
Shepard, L.A. (1980). An evaluation of the regression discrepancy method for identifying children with learning disabilities. Journal of Special Education, 14, 79-91.
Shepard, L. (1983). The role of measurement in educational policy: Lessons from the identification of learning disabilities. Educational Measurement: Issues and Practice, 2, 4-8.
Shepard, L.A. (1989). Identification of mild handicaps. In R.L. Linn (Ed.), Educational Measurement. Third Edition. Washington, D.C.: The American Council on Education and MacMillan Publishing Company.
Shepard, L.A. & Smith, M.L. (1983). An evaluation of the identification of learning disabled students in Colorado. Learning Disability Quarterly, 6, 115-127. (PDF)
Shepard, L.A., Smith, M.L., & Vojir, C.P. (1983). Characteristics of pupils identified as learning disabled. American Educational Research Journal, 20, 309-331. (PDF)
Shepard, L.A. (1987). The new push for excellence: Widening the schism between regular and special education. Exceptional Children, 53, 327-329. (PDF)
Research Articles: School Readiness
Bredekamp, S. & Shepard, L. (1989). How best to protect children from inappropriate school expectations, practices, and policies. Young Children, 44, 14-24.
Bredekamp, S., & Shepard, L. (1998). Assessing young children’s learning and development. In R. Brandt (Ed.), Assessing Student Learning: New rules, New Realities. Arlington, VA: Educational Research Service.
Graue, M.E. & Shepard, L.A. (1992). Public school entrance age. In Williams, L.R., & Fromberg, D.P. (Eds.), Encyclopedia of Early Childhood Education. New York: Garland Publishing.
Graue, M.E. & Shepard, L.A. (1989). Predictive validity of the Gesell School Readiness Tests. Early Childhood Research Quarterly, 4, 303-315.
Shepard, L.A. (1991). The influence of standardized tests on early childhood curriculum, teachers, and children. In B. Spodek & O.N. Saracho (Eds.), Issues in Early Childhood Curriculum (Yearbook in Early Childhood Education, Vol. 2). New York: Teachers College Press.
Shepard, L.A. (1992). Psychometric properties of the Gesell Developmental Assessment: A critique. Early Childhood Research Quarterly, 7, 47-52.
Shepard, L.A. (1992). Retention and redshirting of kindergarten children. In Williams, L.R., & Fromberg, D.P. (Eds.), Encyclopedia of Early Childhood Education. New York: Garland Publishing.
Shepard, L.A. (1994). The challenges of assessing young children appropriately. Phi Delta Kappan, 76, 206-212. (PDF)
Shepard, L.A. (1997). Children not ready to learn? The invalidity of school readiness testing. Psychology in the Schools, 34, 85-97.
Shepard, L.A., & Graue, M.E. (1993). The morass of school readiness testing: Research on test use and test validity. In B. Spodek (Ed.), Handbook of Research on the Education of Young Children. New York: Teachers College Press.
Shepard, L., Kagan, S.L., & Wurtz, E. (Eds.) (1998). Principles and Recommendations for Early Childhood Assessments. Washington, D.C.: National Education Goals Panel.
Shepard, L.A. & Smith, M.L. (1986). Synthesis of research on school readiness and kindergarten retention. Educational Leadership, 44, 78-86.
Shepard, L.A. & Smith, M.L. (1988). Escalating academic demand in kindergarten: Counterproductive policies. The Elementary School Journal, 89, 135-145. (PDF)
Shepard, L.A. & Smith, M.L. (1988). Flunking kindergarten: Escalating curriculum leaves many behind. American Educator, 12, 34-38. (PDF)
Smith, M.L. & Shepard, L.A. (1988). Kindergarten readiness and retention: A qualitative study of teachers' beliefs and practices. American Educational Research Journal, 25, 307-333. (PDF)
Research Articles: Standards and Assessment Reform
Shepard, L.A. (2015). If we know so much from research on learning, why are educational reforms not successful? In M.J. Feuer, A.I. Berman, & R.C. Atkinson (Eds.), Past as prologue: The National Academy of Education at 50. Members reflect (pp. 41-51). Washington, DC: National Academy of Education.
Shepard, L.A. (2003). Reconsidering large-scale assessment to heighten its relevance to learning. In J. M. Atkin & J. E. Coffey (Eds.), Everyday assessment in the science classroom, pp. 121-146. Arlington, VA: National Science Teachers Association Press.
Shepard, L. A., & Peressini, D. D. (2002, January). An analysis of the content and difficulty of the CSAP 10th-grade Mathematics Test: A report to the Denver Area School Superintendents’ Council (DASSC). Boulder, CO: School of Education, University of Colorado Boulder.
McLaughlin, M., & Shepard, L.A. (1995). Improving Education through Standards-Based Reform: A Report by the National Academy of Education Panel on Standards-Based Education Reform. Stanford, CA: National Academy of Education. May be ordered from the National Academy of Education at [www.naeducation.org/NAEd_Publications.html]
Shepard, L.A. (1980). Standard setting issues and methods. Applied Psychological Measurement, 4, 447-467. (PDF)
Shepard, L.A. (1980). Technical issues in minimum competency testing. In D.C. Berliner (Ed.), Review of Research in Education, Volume VIII. Itasca, IL: F.E. Peacock Publishers. (PDF)
Shepard, L.A. (1983). Standards for placement and certification. In Anderson, S.B. & Helmick, J.S. (Eds.), On Educational Testing. San Francisco, CA: Jossey-Bass Publishers.
Shepard, L.A. (1984). Standard-setting methods. In R.A. Berk (Ed.), Criterion-referenced measurement: The state-of-the-art. Second Edition. Baltimore, MD: The Johns Hopkins University Press.
Shepard, L.A. (1995). Implications for standard setting of the National Academy of Education Evaluation of National Assessment of Educational Progress Achievement Levels. Proceedings from the Joint Conference on Standard Setting for Large-Scale Assessments. Washington, D.C.: National Assessment Governing Board and National Center for Education Statistics.
Shepard, L., Glaser, R., Linn, R., & Bohrnstedt, G. (1993). Setting performance standards for student achievement: A report of the National Academy of Education Panel on the Evaluation of the 1992 Achievement Levels. Stanford, CA: National Academy of Education. May be ordered from the National Academy of Education at: [www.naeducation.org/NAEd_Publications.html]
Research Articles: Teacher Testing
Shepard, L.A. & Kreitzer, A. (1987). The Texas teacher test. Educational Researcher, 16, 22-31. (PDF)
Research Articles: Test Bias
Camilli, G., & Shepard, L.A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: SAGE Publications.
Camilli, G. & Shepard, L.A. (1987). The inadequacy of ANOVA for detecting test bias. Journal of Educational Statistics, 12, 87-99. (PDF)
Shepard, L.A. (1987). The case for bias in tests of achievement and scholastic aptitude. In Modgil, S. & Modgil, C. (Eds.), Arthur Jensen: Consensus and Controversy. London: Falmer Press.
Shepard, L.A. (1981). Identifying bias in test items. In B.F. Green (Ed.), Issues in testing: Coaching, disclosure and ethnic bias. San Francisco, CA: Jossey Bass.
Shepard, L.A. (1982). Definitions of bias. In R.A. Berk (Ed.), Handbook of Methods for detecting test bias. Baltimore, MD: The Johns Hopkins University Press.
Shepard, L., Camilli, G., & Williams, D.M. (1984). Accounting for statistical artifacts in item bias research. Journal of Educational Statistics, 9, 93-128. (PDF)
Shepard, L.A., Camilli, G., & Williams, D. (1985). Validity of approximation techniques for detecting item bias. Journal of Educational Measurement, 22, 77-105.
Research Articles: Test Misuse
Shepard, L.A. (1991). Negative policies for dealing with diversity: When does assessment and diagnosis turn into sorting and segregation. In E. Hiebert (Ed.), Literacy for a Diverse Society: Perspectives, Practices, and Policies. New York: Teachers College Press.
Shepard, L.A. (1992). Uses and abuses of testing. In Marvin C. Alkin (Ed.), Encyclopedia of Educational Research, Sixth Edition, pp. 1477-1485. New York: MacMillan.
Research Articles: Test Validity
Shepard, L.A. (1993). Evaluating test validity. In L. Darling-Hammond (Ed.), Review of Research in Education, Vol. 19. Washington, DC: American Educational Research Association. (PDF)
Shepard, L.A. (1997). The centrality of test use and consequences for test validity. Educational Measurement: Issues and Practice, 16, 5-8.