In 1885 a German psychologist named Hermann Ebbinghaus sat down with lists of nonsense syllables ("WID," "ZOF," "KAP") and memorised them. He then tested himself at varying intervals: 20 minutes, an hour, 9 hours, a day, 2 days, 6 days, 31 days. He recorded how many he still remembered. The retention plot he produced is called the forgetting curve, and the model behind it still underlies the interval formulas used in most major flashcard apps today. Anki, SuperMemo and Quizlet Learn all build on the math Ebbinghaus started and Piotr Wozniak formalised a century later.
The shape of forgetting
Ebbinghaus found that memory of freshly learned material decays roughly exponentially. You lose the largest chunk in the first hour, another chunk in the first day, and progressively smaller fractions thereafter. Retention as a function of time looks approximately like R = e^(−t/S), where S is a per-item "stability" parameter that determines how fast the curve drops. Items with low stability (unfamiliar, unlinked to existing knowledge) decay fast. Items with high stability (well-understood, tied to prior knowledge) decay slowly.
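To get a feel for how much the stability parameter matters, here is a minimal Python sketch that evaluates R = e^(−t/S) at Ebbinghaus's test intervals. The two stability values (1 day and 20 days) and the function name are illustrative assumptions, not figures from his data.

```python
import math


def retention(t_days: float, stability_days: float) -> float:
    """Predicted recall probability after t_days, using R = e^(-t/S)."""
    if stability_days <= 0:
        raise ValueError("stability must be positive")
    return math.exp(-t_days / stability_days)


# Ebbinghaus's test intervals from the introduction, converted to days.
INTERVALS_DAYS = {
    "20 min": 20 / (24 * 60),
    "1 hour": 1 / 24,
    "9 hours": 9 / 24,
    "1 day": 1.0,
    "2 days": 2.0,
    "6 days": 6.0,
    "31 days": 31.0,
}

# Illustrative stabilities (assumptions): a fragile item and a well-anchored one.
LOW_S, HIGH_S = 1.0, 20.0

for label, t in INTERVALS_DAYS.items():
    low = retention(t, LOW_S)
    high = retention(t, HIGH_S)
    print(f"{label:>8}  S={LOW_S:g}d: {low:.2f}   S={HIGH_S:g}d: {high:.2f}")
```

Running it shows the low-stability item collapsing within days while the high-stability item is still mostly intact at the same point, which is the whole case for treating cards differently.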
The critical observation: each successful review increases stability. A card reviewed once and correctly recalled a day later has higher stability than the same card just after learning. Review it again successfully and stability increases further. This is why the intervals between reviews can grow: a well-established card that you haven't seen in 30 days still sits at high retention, so reviewing it earlier would be wasted effort.
Why reviewing just before you'd forget is optimal
Here's the efficiency question. If you review a card every day, you'll remember it, but most of those reviews are redundant: you'd still remember it at day 3 or day 5. Every redundant review is time spent not learning new material. On the other hand, if you review your cards only every 60 days, you'll have forgotten most of them and will have to relearn them from scratch, which is slower per minute than reviewing while the memory still exists.
The sweet spot is reviewing just as retention drops toward the point where recall takes real effort, roughly 85–90% retention probability. At that point retrieval is hard enough to strengthen the memory most (this is the "desirable difficulty" effect from Bjork's research) but not so hard that you fail. Reviews timed this way yield the largest stability increase per minute spent. Our spaced repetition interval calculator implements this logic.
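If you take the exponential model R = e^(−t/S) at face value, the review time for a given target retention falls out by solving for t: t = −S × ln(R). The sketch below does that arithmetic. The function name and the sample stabilities are assumptions for illustration, and real schedulers estimate S rather than being handed it.

```python
import math


def days_until(target_retention: float, stability_days: float) -> float:
    """Days until R = e^(-t/S) falls to target_retention: t = -S * ln(R)."""
    if not 0.0 < target_retention < 1.0:
        raise ValueError("target retention must be strictly between 0 and 1")
    if stability_days <= 0:
        raise ValueError("stability must be positive")
    return -stability_days * math.log(target_retention)


for stability in (1.0, 10.0, 100.0):  # illustrative stabilities, in days
    t90 = days_until(0.90, stability)
    t85 = days_until(0.85, stability)
    print(f"S={stability:>5g}d  review at 90%: day {t90:.1f}   at 85%: day {t85:.1f}")
```

Under this model the interval is directly proportional to S (about 0.105 × S for a 90% target), so anything that multiplies stability multiplies the next interval by the same factor.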
The SM-2 algorithm
Piotr Wozniak, working on the SuperMemo project in the late 1980s, introduced SM-2 in 1987 as the first practical algorithm for choosing review intervals. The logic is straightforward enough to describe completely. Each card has an "ease factor" (EF) that starts at 2.5 and a review counter (n). When you review the card, you rate your recall quality on a 0–5 scale: 0 = blackout, 3 = correct with serious difficulty, 5 = perfect recall.
The next interval is chosen by rule. If n = 1, next interval = 1 day. If n = 2, next interval = 6 days. If n > 2, next interval = previous interval × ease factor, rounded to whole days. The ease factor itself updates after every review: EF' = EF + (0.1 − (5 − q) × (0.08 + (5 − q) × 0.02)), where q is the quality rating. A perfect recall (q = 5) raises EF by 0.10, q = 4 leaves it unchanged, and lower ratings pull it down (by 0.14 at q = 3 and by 0.32 at q = 2), with a floor of 1.3. If q < 3, the card is reset to n = 1 and the intervals start over.
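The ease-factor formula is easier to read as a table of adjustments. Here is a small Python sketch of just that update step; the function name is hypothetical, and the 1.3 floor comes from Wozniak's description of SM-2.

```python
def update_ease(ef: float, q: int) -> float:
    """Apply the SM-2 ease-factor update for a quality rating q (0-5)."""
    if not 0 <= q <= 5:
        raise ValueError("quality must be between 0 and 5")
    new_ef = ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    return max(1.3, new_ef)  # SM-2 never lets the ease factor fall below 1.3


# Show how far each rating moves an ease factor that starts at 2.5.
for q in range(5, -1, -1):
    new_ef = update_ease(2.5, q)
    print(f"q={q}: EF 2.50 -> {new_ef:.2f} (change {new_ef - 2.5:+.2f})")
```

The adjustment is asymmetric on purpose: one blackout costs far more ease than a perfect recall earns back, so a card that keeps giving you trouble settles into shorter intervals quickly.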
A worked example
Imagine you add a new card to your deck on day 0. EF starts at 2.5, n = 0.
- Day 0: initial learn. n becomes 1. Next interval = 1 day.
- Day 1: review, quality = 4 (correct with hesitation). EF stays at 2.50 (q = 4 leaves it unchanged). n becomes 2. Next interval = 6 days.
- Day 7: review, quality = 5 (perfect). EF updates to ~2.60. n becomes 3. Next interval = 6 × 2.60 ≈ 16 days.
- Day 23: review, quality = 4. EF stays ~2.60. Next interval = 16 × 2.60 ≈ 42 days.
- Day 65: review, quality = 2 (incorrect). n resets to 1. Next interval = 1 day. EF drops by 0.32 to ~2.28.
The intervals grow geometrically as long as you keep recalling successfully, which is exactly what the forgetting-curve math predicts should be efficient. A card you've reviewed correctly 6 times might be scheduled 6 months out; a card you recently failed is back in your face tomorrow.
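The schedule in this example can be replayed in a few lines of Python. Treat this as a sketch under stated assumptions rather than a reference implementation: the `Card` class and function names are hypothetical, intervals are rounded to whole days, and the updated ease factor is the one used to compute the next interval, as in the walkthrough.

```python
from dataclasses import dataclass


@dataclass
class Card:
    ef: float = 2.5    # ease factor
    n: int = 0         # review counter
    interval: int = 0  # days until the next review


def review(card: Card, q: int) -> Card:
    """Return the card's state after one review rated q (0-5)."""
    if not 0 <= q <= 5:
        raise ValueError("quality must be between 0 and 5")
    # Ease factor updates after every review, floored at 1.3.
    ef = max(1.3, card.ef + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)))
    if q < 3:
        # Failed recall: restart the interval sequence.
        return Card(ef=ef, n=1, interval=1)
    n = card.n + 1
    if n == 1:
        interval = 1
    elif n == 2:
        interval = 6
    else:
        interval = round(card.interval * ef)  # whole days (assumption)
    return Card(ef=ef, n=n, interval=interval)


def replay(qualities):
    """Replay a list of ratings, starting from a card learned on day 0."""
    card = Card(ef=2.5, n=1, interval=1)  # state after the initial learn
    day = card.interval
    history = []
    for q in qualities:
        card = review(card, q)
        history.append((day, q, round(card.ef, 2), card.interval))
        day += card.interval
    return history


for day, q, ef, interval in replay([4, 5, 4, 2]):
    print(f"day {day:>2}: q={q}  EF={ef:.2f}  next review in {interval} day(s)")
```

Applying the ease-factor formula literally at q = 2 subtracts 0.32, so this sketch ends with an EF of about 2.28 after the lapse on day 65.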
What modern apps changed
Anki uses a variant of SM-2 as its default scheduler (and as of recent versions offers FSRS, a newer probabilistic model). The main variations modern apps layer on top: a "learning" phase for new cards with steps like 10 minutes → 1 day before SM-2 takes over; a separate "lapsed" queue for failed cards; fuzz factors that randomise intervals slightly to prevent all cards from clumping on the same review day; and options for hard/good/easy buttons rather than the full 0–5 scale.
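To make the fuzz idea concrete, here is one way it could look in Python. This is an illustrative sketch only: the ±5% spread, the cutoff for short intervals, and the function name are all assumptions, and it is not how Anki or any other app actually computes its fuzz.

```python
import random


def fuzz_interval(interval_days: int, rng: random.Random, spread: float = 0.05) -> int:
    """Shift an interval by a random whole number of days within +/- spread."""
    if interval_days < 3:
        return interval_days  # leave very short intervals untouched (assumption)
    max_shift = max(1, round(interval_days * spread))
    return interval_days + rng.randint(-max_shift, max_shift)


rng = random.Random(42)  # seeded so the sketch is reproducible
same_day_cards = [42] * 8  # eight cards that would all come due on the same day
print([fuzz_interval(days, rng) for days in same_day_cards])
```

Cards added in one sitting and answered the same way would otherwise move through the schedule in lockstep; a day or two of jitter spreads them across neighbouring days without meaningfully changing retention.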
SuperMemo itself has continued past SM-2 through SM-17 and beyond. Modern variants (FSRS, SM-17) fit richer statistical models to each learner's review history and estimate individual card stability directly, rather than relying on a single ease factor. In practice the efficiency gains over SM-2 are modest for most learners, and SM-2 remains widely used because it's simple, explainable, and works well enough. Our flashcard review scheduler sticks to SM-2 for exactly these reasons.
Why it works better than cramming
The efficiency advantage of spaced repetition over massed practice is well-documented. In a 2006 meta-analysis (Cepeda et al.), spaced review produced roughly twice the long-term retention of massed review for the same total study time. The effect scales with the retention interval — if you need to remember something for a month, spacing matters a little; if you need to remember it for a year, spacing matters a lot.
Mechanistically, the current best explanation is consolidation: each successful retrieval across a spaced interval engages mechanisms that move memories from hippocampus-dependent storage toward more durable neocortical representation. Massed review doesn't trigger the same consolidation cycles, because the repeated retrievals happen before any meaningful decay has occurred.
When spaced repetition doesn't fit
Spaced repetition is excellent for discrete items: vocabulary, formulas, historical dates, medical terminology, anatomy labels, code snippets. It's less useful for skills that require integrating many items in complex procedures — writing a literary essay, solving a novel proof, debugging an unfamiliar system. For those, the research points toward deliberate practice on progressively harder problems, not card-based review.
A reasonable rule of thumb: if you can state the item as a question-answer pair where the answer is short and definite, spaced repetition will probably help. If the task is "write a 5-page argument" or "solve this open-ended problem," you need a different method. Most students end up using spaced repetition for the declarative component of a subject (facts, definitions, core mechanisms) and problem-based practice for the procedural component. Our study hours planner can help you split time between both.
The core insight from Ebbinghaus is unchanged 140 years later: memories decay predictably, review near the threshold of forgetting is maximally efficient, and successful review increases stability non-linearly. The arithmetic in every flashcard app is just an implementation detail of that observation.