Chapter 1 Introduction
  1.1 Rationales for studying rater variability
  1.2 Status quo of studies on rater variability
  1.3 An overview of this book
  1.4 Definition of key terms
Chapter 2 Literature review: Studies on rater variability in language performance assessment
  2.1 Rater variability in language performance assessment
  2.2 Exploring rater variability using statistical analysis
    2.2.1 Introduction
    2.2.2 Rater reliability in Classical Test Theory
    2.2.3 Rater facet as variance component in Generalizability Theory
    2.2.4 Rater calibration in Many-Facet Rasch Model
    2.2.5 Summary
  2.3 Process-oriented approach to investigating rater variability
    2.3.1 Raters' decision-making: the "black box" behind the final ratings
    2.3.2 Indirect evidence
    2.3.3 Direct investigation of rating process: insights from verbal protocols
  2.4 Factors accounting for rater variability
    2.4.1 External factors
    2.4.2 Internal factors
    2.4.3 Situational factors
  2.5 A framework for comparison between rater groups
  2.6 Summary
Chapter 3 Study 1: Investigating the scoring reliability of CET-SET using Many-Facet Rasch Model
  3.1 Issues in second language speaking assessment
  3.2 Challenges in test validation
  3.3 The context of the study
  3.4 Objectives of the study
  3.5 Methods
    3.5.1 Data
    3.5.2 Instrument MFRM
  3.6 Data analyses and findings
    3.6.1 Facet map
    3.6.2 Candidates
    3.6.3 Tasks
    3.6.4 Items
    3.6.5 Rating scales
    3.6.6 Raters
    3.6.7 Bias analysis
  3.7 Conclusions
  3.8 Implications
  3.9 Further research efforts to be made
Chapter 4 Study 2: Exploring how raters' cognitive and meta-cognitive strategies influence rating accuracy in essay scoring
  4.1 Subjective scoring: A matter of reliability or validity?
  4.2 Exploring rating process: Looking into rater variability
  4.3 Rater cognition studies in writing assessment
  4.4 Methodology
    4.4.1 The context of the study
    4.4.2 Participants
    4.4.3 Materials
    4.4.4 Data collection
    4.4.5 Data analysis
  4.5 Results and discussion
    4.5.1 General patterns of differences in broad categories
    4.5.2 In-depth investigation of differences in the major sub-categories
  4.6 Summary and further discussion
  4.7 Conclusion
Chapter 5 Conclusions
  5.1 Summary of findings
  5.2 Comparison of the two studies
  5.3 Limitations
  5.4 Further research efforts to be made
Appendix I CET-SET rating scale
Appendix II CET4 rating rubrics for the writing task
Appendix III The writing task of the Dec. 2006 administration of CET4 and range finders
Appendix IV Sample essays
Appendix V Instructions and training tasks for think-aloud session
Appendix VI Sample transcripts of raters' thinking aloud
Appendix VII Coding protocols for think-aloud verbal reports
Appendix VIII The coding scheme for raters' cognitive and meta-cognitive strategies
References
Index