Teacher knows best?
This dissertation discusses the (dis)advantages of teacher judgments and of results on the end of primary school test, and how the two can be optimally combined.
Chapter 1, written in close collaboration with psychologists, provides an easy-to-read article on the different types of confidence intervals in common intelligence tests, together with an explanation of the often ignored or unknown assumptions that underlie these confidence intervals. Within Classical Test Theory, tests often report the standard error of measurement (SEm) and/or the confidence intervals that are based on this SEm. One strict assumption underlying this SEm is that all pupils are measured with the same amount of measurement error. Since this assumption is hardly tenable, chapter 2 discusses two alternatives in which it is relaxed. The final chapter of part I, chapter 3, addresses Bayesian approximate measurement invariance.
Many daily decisions influence the learning opportunities of pupils, including ability grouping and instructional decisions, but also selection, grade retention and track allocation. Chapter 4 therefore explores whether and how teacher judgments for almost all Dutch pupils in their transition from primary to secondary education were influenced by pupil and school characteristics. Chapter 5 investigates how teacher judgments can be made explicit through a technique called ‘expert
In the context of the Dutch Eindtoets (EPST), the million-dollar question is whether the EPST-result or the teacher judgment is more suitable in the track allocation process from primary to secondary education. Chapter 6 answers this question, again using a large dataset collected by CBS. Among other things, teacher judgment and EPST-result are compared with the pupil’s placement in secondary education three years later, to investigate how predictive teacher judgment and EPST-result generally are. The final chapter, chapter 7, compares two different checks for prior-data conflict, focusing specifically on how robust these checks are when one of their ingredients (i.e., the choice of distance measure) is altered.