Interleaving: Why Mixing Your Practice Beats Mastering One Topic at a Time

Apr 20

Written By AC

Open any standard math textbook and you'll find a predictable structure. A lesson on slope is followed by a block of slope problems. A lesson on area is followed by a block of area problems. By the time a student picks up a pencil, the strategy is already known, not because the student identified it, but because the assignment announced it. An analysis of six seventh-grade mathematics textbooks found that, on average, about three-quarters of practice problems were grouped this way: a lesson, then a matching block of repetition (Dedrick et al., 2016).

This arrangement is called blocked practice, and its dominance makes intuitive sense. If you want to learn slope, practice slope. If you want to learn area, practice area. Finish one topic before moving on to the next. Work until it feels fluent, then move forward.

The trouble is that blocked practice trains you to do something slightly different from what exams and real-world applications actually demand. Working through a block of slope problems teaches you to execute the slope procedure. What it skips is the harder task that comes before execution: recognizing that a slope problem is a slope problem when it appears alongside ratio problems, graphing problems, and percentage problems that share similar surface features. On a cumulative exam, no one labels the problems for you. You have to figure out which approach fits before you can apply it.

Interleaving reverses the arrangement. Instead of practicing one type of problem until it feels solid, you work through a set in which consecutive problems require different strategies. A slope problem follows a volume problem, which follows a probability problem. You can't carry the same procedure from one item to the next. Each problem demands a fresh decision: what kind of problem is this, and what should I do?
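To make the contrast concrete, here is a minimal Python sketch of the two arrangements. The problem bank, the labels, and the helper functions blocked_order, interleaved_order, and topic_switches are illustrative assumptions, not taken from any of the studies cited in this post. The round-robin rule is just one simple way to mix topics; it guarantees that neighboring problems differ in topic only when every topic has the same number of problems, as it does here.

```python
from itertools import chain, zip_longest

# Hypothetical problem bank: topic -> problem labels (illustrative only).
PROBLEMS = {
    "slope": ["slope-1", "slope-2", "slope-3"],
    "volume": ["volume-1", "volume-2", "volume-3"],
    "probability": ["prob-1", "prob-2", "prob-3"],
}


def blocked_order(bank):
    """All problems from one topic, then all problems from the next."""
    return [(topic, p) for topic, probs in bank.items() for p in probs]


def interleaved_order(bank):
    """Round-robin across topics: one problem per topic, then repeat."""
    columns = [[(topic, p) for p in probs] for topic, probs in bank.items()]
    rounds = zip_longest(*columns)  # pads shorter topics with None
    return [item for item in chain.from_iterable(rounds) if item is not None]


def topic_switches(sequence):
    """Count how often consecutive problems come from different topics."""
    topics = [topic for topic, _ in sequence]
    return sum(a != b for a, b in zip(topics, topics[1:]))


blocked = blocked_order(PROBLEMS)
interleaved = interleaved_order(PROBLEMS)
print("blocked switches:", topic_switches(blocked))
print("interleaved switches:", topic_switches(interleaved))
```

With this nine-problem bank, the blocked order changes topic only at the two block boundaries, while the interleaved order changes topic at every one of the eight transitions. Each switch is a point where the learner has to decide which strategy applies before using it.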

The Mechanism: Discrimination, Not Variety

The popular framing of interleaving treats it as a way to keep things fresh — avoid boredom, stay engaged, mix it up. That misses the point entirely.

The leading account of why interleaving works centers on discrimination. When different problem types appear in succession, the learner must attend to the features that distinguish one type from another, not just the features that define a single type. A meta-analysis spanning fifty-nine articles and over 8,400 participants tested this directly by examining whether the similarity between categories predicted the size of the interleaving benefit (Brunmair & Richter, 2019). It did. When the categories being learned were similar enough to be confusable, interleaving produced its largest advantages. When categories were already distinct and hard to mix up, the benefit shrank — and for some materials, blocking actually worked better.

Researchers have proposed several accounts of why this pattern holds. One hypothesis suggests that interleaving promotes comparisons between consecutive items, highlighting the features that distinguish one category from another. A refinement of this idea, called the attentional bias framework, specifies when each arrangement helps (Brunmair & Richter, 2019). Interleaved presentation highlights differences between categories; blocked presentation highlights similarities within them. Which matters more depends on the structure of what you're learning. When categories are easy to confuse (their members look alike across groups), interleaving forces the between-category comparisons that clarify the boundaries. When categories are internally diverse (their members don't obviously resemble each other), blocking provides the sustained within-category exposure needed to find the common thread. For most exam preparation, the challenge is the first kind: telling similar-looking problems apart. That's the niche where interleaving earns its advantage.

Neuroimaging evidence is consistent with this account. A narrative review reports that when items are repeated within a block, the brain processes each successive repetition less thoroughly, a phenomenon called neural repetition suppression (Van Hoof et al., 2022). Interleaving appears to attenuate that suppression, maintaining more extensive processing across encounters. The behavioral evidence is stronger than the neural evidence at this point, but they tell a consistent story: blocked practice produces a kind of cognitive habituation that interleaving disrupts.

One important complication: interleaving inherently produces spacing. When you alternate between topics, each encounter with a given topic is separated from the previous encounter by intervening material. That means any observed interleaving benefit is partially confounded with a spacing benefit — the very effect covered in the last post. Some studies have tried to separate the two by equating the temporal distribution of topics across conditions, and interleaving effects have sometimes persisted beyond what spacing alone would predict. But the question of how much interleaving adds beyond spacing remains open. If you choose to interleave your practice, you are also choosing to space it, and both contributions are probably helping.
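The confound is easy to see with a little arithmetic. The sketch below is a toy illustration under assumed conditions (three hypothetical topics labeled A, B, and C, four problems each, and a helper named mean_gap that I made up for this purpose). It counts how many other problems sit between consecutive encounters with the same topic in each arrangement.

```python
from collections import defaultdict
from statistics import mean


def mean_gap(sequence):
    """Average number of intervening problems between consecutive
    encounters with the same topic."""
    positions = defaultdict(list)
    for index, topic in enumerate(sequence):
        positions[topic].append(index)
    gaps = [
        later - earlier - 1
        for indexes in positions.values()
        for earlier, later in zip(indexes, indexes[1:])
    ]
    return mean(gaps)


# Hypothetical 12-problem assignments over three topics.
blocked = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
interleaved = ["A", "B", "C"] * 4

print("blocked mean gap:", mean_gap(blocked))
print("interleaved mean gap:", mean_gap(interleaved))
```

In the blocked assignment, every repeat of a topic follows immediately, so the average gap is zero. In the interleaved assignment, two other problems always intervene. The same twelve problems therefore differ in spacing as well as in mixing, which is why a study has to equate the spacing across conditions before it can attribute a benefit to interleaving itself.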

The Classroom Evidence

One notable test of interleaving in a real educational setting involved 787 seventh-grade students across fifty-four classes and fifteen teachers (Rohrer et al., 2020). Over four months, students completed practice assignments that were either mostly interleaved or mostly blocked. Both groups then received an identical interleaved review assignment before the test, equalizing the last-practice-to-test interval. On an unannounced test one month later, students in the interleaved condition scored markedly higher than those in the blocked condition. The interleaving effect was positive in the classes of every one of the fifteen teachers, and teachers surveyed anonymously before they knew the results expressed support for the approach.

That last detail matters. Teachers watched students struggle more visibly with the interleaved assignments, which often took longer as well, and they still endorsed the method. The interleaved assignments were harder. Students made more errors during practice. But the test results told a different story than the practice sessions did.

A parallel finding comes from undergraduate physics. Over eight weeks, students completed either interleaved or blocked homework assignments (Samani & Pan, 2021). On surprise tests given during the semester, the interleaved group outscored the blocked group, with a median improvement of fifty percent on one test and more than one hundred percent on the other. Students in the interleaved group recalled more relevant formulas and produced correct solutions more often.

Two patterns recur across these studies. The first is a metacognitive illusion: students in the physics study rated interleaved assignments as more difficult and reported less learning from them, yet scored substantially higher on delayed tests (Samani & Pan, 2021). The practice that felt least productive was the most effective. The second is structural. Interleaving doesn't just train discrimination; it also forces retrieval. In a blocked assignment, the formula needed for each problem is the same as the one just used, and students can solve the next problem by glancing at their previous work. In an interleaved assignment, consecutive problems require different approaches, so students must recall the relevant procedure from memory rather than copying it from the line above (Rohrer et al., 2020). This means interleaving quietly embeds two other evidence-based strategies — spacing and retrieval practice — into ordinary homework, without requiring any additional effort from the student or the teacher. In other words, a single change to how practice problems are arranged delivers three mechanisms of durable learning at once.

When Interleaving Doesn't Work

The meta-analysis found a clear pattern across material types: interleaving produced moderate-to-large benefits for learning to distinguish among visual categories (painting styles, bird species, naturalistic photographs), moderate benefits for mathematical problem types, ambiguous results for expository texts, and a negative effect for word learning — where blocking actually outperformed interleaving (Brunmair & Richter, 2019).

The word-learning finding is the boundary condition that matters most, because it reveals the logic behind the pattern. When the learning goal is to figure out what items within a category have in common (as when learning a set of vocabulary words), blocking provides the sustained within-category exposure that supports that goal. Interleaving disrupts exactly the kind of processing the task requires. But when the goal is to distinguish between similar categories, interleaving forces the comparisons that make those distinctions clear.

This means interleaving is not a universal improvement over blocking. It occupies a specific niche: tasks where the central challenge is discriminating among confusable alternatives or choosing the correct procedure from several that could plausibly apply. For mathematics, science, and diagnostic reasoning, fields where similar-looking problems require different solutions, that niche is large and practically important. For tasks that require building up a rich representation of what defines a single category, it isn't.

Two other boundaries are worth noting. First, the controlled evidence comes predominantly from adolescent and undergraduate students. As of 2022, a review of the learning science literature found no published examples of interleaving being formally tested in continuing professional development (Van Hoof et al., 2022). That gap is beginning to close at the graduate medical education level: a national workgroup in family medicine has developed a consensus definition for Longitudinal Interleaved Residency Training, formalizing it as a curricular structure built on spaced repetition and interleaving principles (Zeller et al., 2023), and at least one emergency medicine program has implemented a fully interleaved three-year curriculum replacing the traditional block format (Clayton et al., 2024). These are implementation efforts, not controlled comparisons, so whether interleaving produces the same measurable advantages in professional training that it does in classroom settings remains an open question. But the direction of adoption is clear.

Second, while interleaving reliably improves discrimination and retention, most studies have measured these outcomes on tasks closely resembling the practice format. Whether interleaving produces far transfer, meaning the ability to apply learned strategies to genuinely novel problems that require integrating separately learned topics, is less well established. The existing evidence shows clear benefits for memory and near-transfer problem solving (Samani & Pan, 2021), but the kind of flexible, cross-domain application that most learners ultimately care about has not been tested as directly.

Stacking Strategies: Why More Difficulty Isn't Always Better

The previous two posts covered retrieval practice and spacing. Interleaving completes a trio of strategies that share a counterintuitive feature: they each make practice harder, and they each improve long-term learning relative to easier alternatives.

A natural instinct is to combine all three. Space your sessions, interleave your topics, and retrieve from memory rather than rereading. In some cases, combinations do produce additive benefits — the retrieval practice post noted that pairing spaced scheduling with retrieval produced the strongest retention in an anatomy learning study (Dobson et al., 2016).

But the picture is not that simple. In a study using vocabulary paired associates, interleaving the arrangement of retrieval-practice and restudy trials produced no additional benefit beyond retrieval practice alone (Abel & Roediger, 2007). The testing effect was equally strong whether practice was blocked or interleaved. Mixing the format of practice sessions did not automatically multiply the benefits.

A useful framework for understanding why comes from motor learning research. The Challenge Point Framework proposes that learning depends on the amount of interpretable information a practice task generates, and that this amount is governed by the interaction between task difficulty and the learner's skill level (Guadagnoli & Lee, 2004). When conditions are too easy, the task generates little information to learn from. When conditions exceed the learner's capacity to interpret what's happening, additional difficulty stops helping. The optimal challenge point — the level of difficulty that maximizes learning — shifts as expertise grows. Interleaving, spacing, and retrieval each independently increase the functional difficulty of practice. For a learner already managing one or two of these demands, adding a third may push difficulty past the point where additional challenge is productive. The practical lesson is that these strategies are not meant to be stacked indiscriminately. Each one works, but the value of adding another source of difficulty depends on what the learner is already managing and how far along they are in mastering the material.

What This Looked Like for Me

For most of my studying before medical school, I worked through one subject at a time. It felt efficient: build momentum, move through problems faster, finish with a sense of progress. Interleaving entered my routine not because I planned it, but because my study system demanded it.

Over the first two months of medical school, I built a workflow around third-party video resources and Anki. The rhythm was simple: watch a video or two on a topic, do the corresponding Anki cards, then move to the next topic. Within any given organ system, that meant bouncing between pharmacology, microbiology, physiology, and pathology in a single session. The interleaving was a byproduct of trying to cover ground, not a deliberate strategy. But I noticed something: when a video was too short or basic, I wouldn't jump straight to the cards. I'd watch another video or two first, so that the Anki review would pull from multiple topics at once. I was making the retrieval harder on purpose, even before I had a name for why that worked.

The payoff showed up most clearly during board prep. I was strict about running every question block in random mode, mixing all disciplines and organ systems rather than filtering by topic. I wanted practice that matched the test: no labels, no grouping, just the next question. That meant every problem required a discrimination step before I could even begin solving it. It was slower and less comfortable than topic-specific blocks, but it was the closest thing to the actual cognitive demands of the exam.

I still use blocking when it fits. When I encounter a topic for the first time, I need sustained exposure just to build a working understanding before interleaving can do anything useful. You can't discriminate between categories you haven't learned yet. But once I have a basic grasp of each topic, mixing them during practice produces results that focused, topic-by-topic drilling can't match.

The Core Lesson

Interleaving works because it forces a cognitive step that blocked practice skips: figuring out which strategy to use before you use it. That step — discrimination — is exactly what exams and applied settings demand, and it is exactly what students get the least practice doing when their assignments group problems by topic.

The evidence is strong for mathematics and visual category learning, moderate for science, and absent or negative for tasks where the goal is to learn what items within a category share rather than what distinguishes categories from each other. Interleaving is not a universal replacement for blocking. It is a targeted strategy for situations where confusable alternatives need to be told apart.

If you've been following this series, a pattern should be emerging. Retrieval practice feels less productive than rereading but produces stronger retention. Spacing feels less efficient than massed practice but builds more durable memory. And now: interleaving feels harder and less satisfying than blocking but produces better discrimination and long-term performance. Three strategies, three versions of the same uncomfortable trade-off. The next post examines whether that pattern is coincidental or whether it reflects something systematic about how difficulty and learning relate.

AceMedEd

This post is part of a series on the science of learning. Each post covers one evidence-based principle and how to apply it to your own studying. Follow us on Instagram @acemeded to keep up with future blog posts and related content.

References

Abel, M., & Roediger, H. L. (2007). Comparing the testing effect under blocked and mixed practice. Memory, 26(7), 898–909.

Brunmair, M., & Richter, T. (2019). Similarity matters: A meta-analysis of interleaved learning and its moderators. Psychological Bulletin, 145(11), 1029–1052.

Clayton, L., Wells, M., Solano, J., Alter, S., Hughes, P., & Shih, R. (2024). Educational concepts: A longitudinal interleaved curriculum for emergency medicine residency training. JACEP Open, 5, e13223.

Dedrick, R. F., Rohrer, D., & Stershic, S. (2016). Content analysis of practice problems in 7th grade mathematics textbooks: Blocked vs. interleaved practice. Paper presented at the Annual Meeting of the American Educational Research Association, Washington, DC.

Dobson, J. L., Perez, J., & Linderholm, T. (2016). Distributed retrieval practice promotes superior recall of anatomy information. Anatomical Sciences Education, 10(4), 339–347.

Guadagnoli, M. A., & Lee, T. D. (2004). Challenge point: A framework for conceptualizing the effects of various practice conditions in motor learning. Journal of Motor Behavior, 36(2), 212–224.

Rohrer, D., Dedrick, R. F., Hartley, P., & Cheung, C.-N. (2020). A randomized controlled trial of interleaved mathematics practice. Journal of Educational Psychology, 112(1), 40–52.

Samani, J., & Pan, S. C. (2021). Interleaved practice enhances memory and problem-solving ability in undergraduate physics. npj Science of Learning, 6(1), 32.

Van Hoof, T. J., Sumeracki, M. A., & Madan, C. R. (2022). Science of learning strategy series: Article 3, interleaved practice. Journal of Continuing Education in the Health Professions, 42(4), 265–268.

Zeller, T. A., Beben, K., Kong, M., Martonffy, I., Patterson, S., Deas, W., Heo, M., & Keister, D. M. (2023). Longitudinal interleaved residency training: A consensus definition. Family Medicine, 55(5), 311–316.