There’s a lot of confusion about the LSAT’s curve. The LSAT is not actually scored to a curve, but most test-takers think it is.
This series is my effort to explain LSAC’s test-equating process, raw score conversions, percentiles, and why the test isn’t actually curved. Because I dislike statistics (and because most of you probably do, too), this blog post involves very little math. However, it might involve some thinking.
You’ve been warned.
LSAC’s Associate Director of Psychometric Research, Lynda Reese, recently wrote the following to one test-taker who asked about the curve (I’ve added the links):
[T]he LSAT is not graded to a curve…Rather, for every form of the LSAT, a statistical process called test equating is carried out to adjust for minor differences in difficulty between different forms of the test. Specifically, the item response theory (IRT) true score equating method is applied to convert raw scores (the number correct) for each administration to a common 120 to 180 scale. A detailed description of this methodology can be found in…Applications of Item Response Theory to Practical Testing Problems…The equating process assures that a particular LSAT scaled score reflects the same level of ability regardless of the ability level of others who tested on the same day or any slight differences in difficulty between different forms of the test. That is, the equating process assures that LSAT scores are comparable, regardless of the administration at which they are earned.
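To make that quote a bit more concrete, here’s my rough sketch of the idea behind IRT true score equating, based on my reading of the method Ms. Reese references. Under an item response model (I’m assuming the common three-parameter logistic model here), every question has estimated parameters, and summing the item probabilities gives the expected raw score at a given ability level. Equating finds the ability level that corresponds to a raw score on one form, then asks what raw score that same ability level would produce on a different form. Every item parameter and form size below is made up purely for illustration; LSAC’s actual forms and estimates are far larger and more sophisticated.

```python
import math

def p_correct(theta, a, b, c):
    """Three-parameter logistic (3PL) item response function: the probability
    that a test-taker with ability `theta` answers an item with
    discrimination a, difficulty b, and guessing parameter c correctly."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def expected_raw_score(theta, items):
    """Test characteristic curve: the expected raw score (number correct)
    at ability level theta, summed over all items on a form."""
    return sum(p_correct(theta, a, b, c) for a, b, c in items)

def ability_for_raw_score(raw, items, lo=-4.0, hi=4.0):
    """Invert the test characteristic curve by bisection: find the ability
    level whose expected raw score on this form equals `raw`."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if expected_raw_score(mid, items) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical (a, b, c) parameters for two tiny "forms."
# Real forms have ~100 scored items, with parameters estimated by LSAC.
form_x = [(1.0, -0.5, 0.2), (1.2, 0.0, 0.2), (0.8, 0.7, 0.2)]
form_y = [(1.1, -0.3, 0.2), (0.9, 0.2, 0.2), (1.0, 0.9, 0.2)]  # slightly harder

raw_on_x = 2.0
theta = ability_for_raw_score(raw_on_x, form_x)
equivalent_raw_on_y = expected_raw_score(theta, form_y)
print(f"A raw score of {raw_on_x} on form X corresponds to "
      f"roughly {equivalent_raw_on_y:.2f} on form Y")
```

Once equivalent raw scores are linked across forms this way, they can all be mapped onto the common 120–180 scale, which is why a given scaled score is supposed to mean the same thing no matter which form you took.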
I’m not a psychometrics expert, but I decided to go ahead and learn more about how LSAC constructs the exam and ensures different PrepTests are of relatively equal difficulty.
I looked up the book Ms. Reese referenced (and believe me, it wasn’t exactly a walk in the park).
The following is my understanding of how LSAC creates each LSAT and goes about the test-equating process. Feel free to leave questions and comments, especially if you have a decent understanding of statistics, psychometrics, etc. LSAC’s also welcome to leave comments. They haven’t commented on the blog yet, but the door’s always open.
If you’re new to the LSAT, see the LSAT FAQ for more on the basics before getting into all the details.
If you’re not new to the LSAT, read on, starting with these definitions of basic terms and concepts:
Conversion Chart: Chart at the end of each PrepTest that helps you translate a raw score into a score out of 180 (see the toy example after these definitions)
Percentile: The percentage of test-takers whose scores fall below yours. If you score in the 50th percentile, you scored higher than half of all test-takers. If you score in the 97th percentile, you scored higher than 97% of all test-takers.
PrepTest: Previously administered and released LSAT exam
Psychometrics: The study of psychological measurements. As far as we’re concerned, it’s the “science” of standardized testing.
Raw Score: The number of questions you answer correctly on the LSAT
Test-equating/Pre-equating: “a statistical method used to adjust for minor fluctuations in the difficulty of different test forms so that a test taker is neither advantaged nor disadvantaged by the particular form that is given” – LSAC (PDF).
Test form: A particular LSAT exam
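To see how the conversion chart and raw score fit together, here’s a toy example. Every cutoff below is invented for illustration only; none of it comes from a real LSAC chart. The point is just that the same raw score can convert to different scaled scores on different forms, because the charts absorb small differences in form difficulty.

```python
# Hypothetical conversion charts for two forms. Every cutoff below is
# invented for illustration; it is NOT taken from any actual LSAC chart.
# Each entry maps a minimum raw score to a scaled score (out of 180).
form_a_chart = {99: 180, 98: 179, 97: 177, 96: 176, 95: 174}
form_b_chart = {98: 180, 97: 179, 96: 177, 95: 176, 94: 174}  # a slightly harder form

def scaled_score(raw, chart):
    """Simplified chart lookup: return the scaled score for the highest
    cutoff that the raw score meets or exceeds."""
    eligible = [scaled for cutoff, scaled in chart.items() if raw >= cutoff]
    return max(eligible) if eligible else None  # None: below the listed range

# The same raw score converts differently on the two forms, because
# form B's questions are (hypothetically) a bit harder overall.
print(scaled_score(97, form_a_chart))  # 177
print(scaled_score(97, form_b_chart))  # 179
```

In reality, the conversion chart is just a lookup table printed with each PrepTest; the equating process described above is what determines where the cutoffs fall in the first place.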
Scores have to be meaningful and consistent
The LSAT is a standardized exam. This means that a 160 on the February 2010 LSAT should be equivalent to a 160 on the June 2010 LSAT, which should be equivalent to a 160 on the October 2010 LSAT, and so on. Law schools can’t be bothered to compare the particular Logic Games, Logical Reasoning, and Reading Comprehension sections across exams to see whether students with identical scores actually performed at different levels. They can’t be bothered to look at test-takers’ raw scores, either. That’s why they have equated scores out of 180, after all.
Administering the same questions over and over wouldn’t work
One theoretical (and stupid) way to ensure that scores from every administration were comparable would be to create a single LSAT and administer it over and over. This would ensure that all test-takers were treated equally and that the “raw score conversions” were always fair. However, it ignores the fact that test-takers would share information with each other.
People who took the February 2010 LSAT would give (or sell) information about the questions to people taking the same exam in June 2010, and so on. Under such a system, the later you took the exam, the more inflated your score would be, on average. Thus, LSAC can’t just keep giving the exact same questions exam after exam.
For this reason, LSAC needs to create a different exam for each test administration and make those exams relatively equal in difficulty. A 160 on one LSAT (aka “test form”) needs to be equivalent to a 160 on any other LSAT.