Gpakit

A student finishes an AP practice exam in late April, runs their raw multiple-choice count and estimated free-response scores through an online predictor, and gets back "likely 4, borderline 5." The actual score in July comes back a 3. Nothing went wrong in the predictor. AP scoring involves a standard psychometric procedure called equating, and the cutoff scores between score bands quietly shift every year. This post walks through the scoring structure, why the cutoffs move, and what a specific score actually means for college credit.

The two-section structure

Most AP exams have two sections: a multiple-choice section (MC) and a free-response section (FRQ). A smaller number have variations — AP Studio Art is entirely portfolio-based, AP Music Theory includes a listening section, AP Computer Science Principles has a performance task — but the MC-plus-FRQ structure covers the majority of exams, including all of the large-enrollment subjects: U.S. History, Calculus, Biology, English Language, and Psychology.

Section weights differ by exam. AP Calculus AB and BC weight MC and FRQ 50/50. AP U.S. History weights them 55/45 in favour of the essays. AP Biology is 50/50. AP English Literature is 45/55 in favour of the FRQ essays. The exam-specific weights are published on the College Board's course description for each subject, and any AP score predictor worth using reads those weights directly rather than assuming a uniform 50/50 split.

The composite score

Raw MC points and raw FRQ points are combined into a composite score using the subject-specific weights. The composite is scaled to sit on roughly a 100-to-150-point range depending on the exam. For AP Calculus AB, for example, the MC section is 45 questions worth one point each, and the FRQ section is six questions scored out of 9 points each. The raw MC total (0 to 45) and raw FRQ total (0 to 54) are each multiplied by a scaling factor so that each section contributes exactly 50% of the composite, and the results are added.
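The Calculus AB arithmetic above can be sketched in a few lines. The 108-point composite maximum and the per-section scaling factors here are illustrative assumptions, not published constants; the real values are set by the College Board for each exam form:

```python
# Sketch of the composite-score arithmetic described above.
# The 45-question MC section and 54-point FRQ section are each scaled
# so they contribute exactly half of an assumed 108-point composite.

MC_MAX, FRQ_MAX, COMPOSITE_MAX = 45, 54, 108

def calc_ab_composite(mc_raw: int, frq_raw: int) -> float:
    """Combine raw section scores into a 50/50-weighted composite."""
    mc_scaled = mc_raw / MC_MAX * (COMPOSITE_MAX / 2)     # MC worth 54 points
    frq_scaled = frq_raw / FRQ_MAX * (COMPOSITE_MAX / 2)  # FRQ worth 54 points
    return mc_scaled + frq_scaled

print(calc_ab_composite(36, 40))  # e.g. 36/45 on MC and 40/54 on the FRQs
```

For a subject with different weights, the two `COMPOSITE_MAX / 2` terms would simply be replaced by that subject's published MC and FRQ shares.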

The composite is what the College Board actually uses to assign your 1-through-5 score. There is no intermediate "percentage correct" step visible to the student. A student who earns 70% of the composite points might finish with a 4 on one year's Calculus AB exam and a 3 on another year's, depending on where the cutoffs fell that year. Our AP exam score predictor applies the most recent published cutoffs for each subject, with a note flagging that they can shift.
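The composite-to-score step is just a lookup against a cutoff table. The cutoff values below are made up for illustration; real cutoffs vary by subject and by year:

```python
# Map a composite score to a 1-5 AP score using a cutoff table.
# These cutoffs are hypothetical, not any real year's values.

def ap_score(composite: float, cutoffs: dict[int, int]) -> int:
    """Return the highest score band whose cutoff the composite meets."""
    for score in (5, 4, 3, 2):
        if composite >= cutoffs[score]:
            return score
    return 1

cutoffs = {5: 70, 4: 57, 3: 44, 2: 31}  # hypothetical
print(ap_score(68.5, cutoffs))  # just under the 5 cutoff -> 4
```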

Equating: why the cutoffs move

Every year's AP exam is a different test. The questions are new, the difficulty of any individual question is unknown in advance, and the overall exam's difficulty varies from year to year. The College Board's job is to make sure that a "4 in AP Chemistry 2026" means the same level of demonstrated ability as "a 4 in AP Chemistry 2024," even if the exams themselves were harder or easier.

The procedure that accomplishes this is called equating. It is standard practice on any high-stakes standardised exam — the SAT, the LSAT, the MCAT, and the GRE all do it. Equating works by including a small set of "anchor" items whose historical performance is known, observing how the current year's test-takers perform on those anchor items, and statistically adjusting the current year's score scale so that any given ability level maps to the same final score as it did in previous years.
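The idea can be reduced to a toy sketch. This is not the College Board's actual procedure — operational equating uses item response theory, not a simple ratio — but it shows the direction of the adjustment: worse anchor-item performance implies a harder form, so the cutoffs come down.

```python
# Toy anchor-based equating, for intuition only. If this year's cohort
# scores lower on the shared anchor items than the reference cohort did,
# the exam form is presumed harder and every cutoff is scaled down.

def equate_cutoffs(cutoffs: dict[int, float],
                   anchor_mean_ref: float,
                   anchor_mean_now: float) -> dict[int, float]:
    """Rescale cutoffs by the ratio of anchor-item performance."""
    ratio = anchor_mean_now / anchor_mean_ref
    return {score: round(cut * ratio, 1) for score, cut in cutoffs.items()}

# Cohort did ~5% worse on the anchors -> every cutoff drops ~5%.
print(equate_cutoffs({5: 70, 4: 57, 3: 44}, 12.0, 11.4))
```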

The result is that the composite-score cutoff for, say, a 4 in AP Biology might be 67 one year and 71 another. The exam was harder the year the cutoff was 67; the equating process lowered the cutoff to compensate. From the student's perspective, the scale looks like it slid. From the test designer's perspective, the scale was kept fixed and the raw-composite-to-score mapping was adjusted to hold it in place.

Why predictors are always off by roughly ±1

Any AP score predictor is working from last year's cutoffs, or from the average cutoff over the last several years. Because equating adjusts the cutoff every year, there is always some residual uncertainty: last year's 67 cutoff might be this year's 64, or this year's 70. That range translates directly into ±1 of uncertainty on the final score for students close to a boundary.

For students well inside a band — raw composite of 85 out of 100 on an exam where every recent cutoff for a 5 has been between 66 and 72 — the prediction is solid. For students near a boundary (composite of 67 on the same exam), the prediction is fundamentally uncertain, and a good predictor will report both outcomes rather than feign a precision it does not have. The difference between a 4 and a 5 can come down to one or two MC questions.
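One way a predictor can report a range rather than a point estimate is to score the composite against both the highest and the lowest cutoffs seen in recent years. The cutoff history below is invented for illustration:

```python
# Report a (worst-case, best-case) score range by checking the composite
# against the highest and lowest cutoffs over recent years.
# The cutoff history here is made up for illustration.

def predict_range(composite: float,
                  cutoff_history: dict[int, list[float]]) -> tuple[int, int]:
    """Return (worst-case score, best-case score) over historical cutoffs."""
    def score_with(cutoffs):
        for s in (5, 4, 3, 2):
            if composite >= cutoffs[s]:
                return s
        return 1
    worst = score_with({s: max(h) for s, h in cutoff_history.items()})
    best = score_with({s: min(h) for s, h in cutoff_history.items()})
    return worst, best

history = {5: [66, 69, 72], 4: [52, 55, 58], 3: [40, 42, 45], 2: [27, 29, 31]}
print(predict_range(67, history))  # borderline: (4, 5)
print(predict_range(85, history))  # solid: (5, 5)
```

When the two endpoints agree, the prediction is "solid"; when they differ, the honest output is the pair, not a coin flip between them.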

The same equating logic applies to other standardised tests students compare against. Our SAT-to-ACT converter uses the College Board and ACT's official concordance tables, which are themselves built on equating; the converter is exact for the population average but has natural scatter for any individual.

What a specific score actually means

The College Board publishes a brief definition of each score:

  • 5: Extremely well qualified. Typically earns college credit at any institution that accepts AP credit for the subject.
  • 4: Well qualified. Earns college credit at most institutions, though some of the most selective schools require a 5 for certain subjects.
  • 3: Qualified. Earns college credit at many institutions, particularly large public universities, but is commonly rejected at selective private schools.
  • 2: Possibly qualified. Generally does not earn college credit.
  • 1: No recommendation. Does not earn college credit.

The credit policies are set institution by institution and subject by subject. A 3 in AP U.S. History at a flagship state university might satisfy a general-education history requirement worth three to six credit hours; the same 3 at an Ivy League school might satisfy nothing. Before assuming any particular credit outcome, check the specific school's published AP credit policy — most post them on the registrar's website, sorted by exam.

The "placement" use case

College credit is one use of an AP score; course placement is another, and the distinction matters. A student who earns a 4 on AP Calculus BC might receive credit for two semesters of calculus, or they might receive credit for one semester and "placement" into the second — meaning they skip the introductory course but do not bank the credit hours. Placement-only policies are common at engineering schools and in hard-sciences sequences where the department wants tight control over prerequisite knowledge.

A student using AP scores for planning should know which outcome applies at their target schools before deciding whether a marginal 3-or-4 is worth the gamble. If a school awards placement-only regardless of score, the exam may matter less than the student thinks. If a school awards full credit at 4 but nothing at 3, the boundary matters a great deal.

Course grades, curves, and the AP score

An AP course grade and the AP exam score are separate things. A student can earn an A in AP Chemistry and a 2 on the exam, or a B-minus and a 5. The course grade reflects the teacher's assessment of classroom performance — homework, labs, tests, essays. The exam score reflects one three-hour performance on a standardised instrument. Some schools grade AP courses on a curve; our curve grade calculator shows how common curving formulas translate raw test scores into letter grades.

For college admissions, both matter but they do different work. The course grade feeds the student's GPA and class rank, which are used in the admissions decision itself. The exam score, which usually arrives after admissions decisions are made, feeds the credit-and-placement decision after matriculation. Students sometimes conflate the two and over-invest in chasing a 5 on the exam when their GPA impact has already been determined months earlier.

What to do with a borderline predicted score

If a predictor returns "solid 4" or "solid 3," believe it. If it returns "borderline 3/4" or "borderline 4/5," treat the range as a genuine range rather than rounding up or down in your own planning. Focused preparation in the final weeks can realistically lift a borderline score by one band, provided the gap is in content knowledge rather than timing or stamina. The math behind the predictor is right; the residual uncertainty is a feature of the test design, not a flaw in the prediction.
