Some states that evaluate teachers based partly on student learning use the student growth percentile model, which produces a score assumed to reflect a teacher's effectiveness, both current and future, in helping students achieve academically.

However, a recent REL West study found that half or more of the variance in teacher scores from this model is due to random or otherwise unstable sources rather than to reliable information that could predict future performance. Even when derived by averaging several years of a teacher's scores, effectiveness estimates are unlikely to be reliable enough for high-stakes decisions, such as granting tenure or dismissing a teacher.
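As a rough illustration (not the study's actual analysis), the Spearman-Brown formula suggests why averaging several years of scores may still fall short: if only about half of the variance in a single year's score is stable signal (a single-year reliability of roughly 0.5, consistent with the finding above), a multi-year average improves reliability but may remain below the levels often cited for high-stakes personnel decisions. The reliability value and the number of years below are illustrative assumptions.

```python
def spearman_brown(rho_single, k):
    """Reliability of the average of k parallel measurements,
    given single-measurement reliability rho_single (Spearman-Brown)."""
    return k * rho_single / (1 + (k - 1) * rho_single)

# Illustrative assumption: half the single-year variance is stable signal.
rho = 0.5
for years in (1, 3, 5):
    print(years, round(spearman_brown(rho, years), 3))
```

Under this assumption, a three-year average reaches a reliability of about 0.75 and a five-year average about 0.83, which helps explain why averaging alone may not make these scores dependable enough for consequential decisions.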

This report shows how the study's methods and findings can be used to judge the accuracy of different designs for teacher evaluation systems. It concludes that states may want to be cautious about using scores from the student growth percentile model as a measure of teacher effectiveness in high-stakes decisions.