Python 3.11 alpha 4 dropped this week, and the benchmarks are turning heads. The Faster CPython project, led by Guido van Rossum and Mark Shannon with Microsoft’s backing, is showing real results. Early benchmarks suggest CPython 3.11 will be roughly 10-60% faster than 3.10 across a range of workloads, with the project’s stated goal being a 2x speedup over several releases.
For a language that’s been famously slow for three decades, this is a big deal. And the approach they’re taking is technically fascinating.
The Specializing Adaptive Interpreter
The headline feature driving performance gains in 3.11 is PEP 659 — the specializing adaptive interpreter. Rather than trying to optimize Python code ahead of time (which is incredibly difficult given the language’s dynamic nature), the interpreter now watches how code actually executes and optimizes hot paths at runtime.
Here’s how it works in practice. When a bytecode instruction executes enough times (currently 8), the interpreter replaces it with a specialized version. For example, if a BINARY_OP add instruction consistently receives two integers, it gets replaced with BINARY_OP_ADD_INT, which skips the type checks and generic dispatch and goes straight to integer addition.
If the specialization guess turns out to be wrong (say, someone passes a string where an integer was expected), the specialized instruction “deoptimizes” back to the generic version. This is similar in principle to what JIT compilers in V8 and HotSpot do, but CPython is doing it at the bytecode level without a JIT. That’s an important distinction — it means the performance gains come without the memory overhead and warm-up costs of a full JIT compiler.
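To make the specialize/deoptimize cycle concrete, here’s a toy sketch in plain Python of how an adaptive instruction might behave. This is purely illustrative — CPython implements this in C inside the bytecode dispatch loop, and the names here (AdaptiveAdd, WARMUP) are invented for the example.

```python
class AdaptiveAdd:
    """Toy model of one adaptive bytecode instruction."""

    WARMUP = 8  # execution threshold before specializing

    def __init__(self):
        self.counter = 0
        self.specialized = False

    def execute(self, a, b):
        if self.specialized:
            # Guard: the fast path is only valid for the specialized types
            if type(a) is int and type(b) is int:
                return a + b  # fast path, no generic dispatch
            self.specialized = False  # guess was wrong: deoptimize

        # Generic path: count executions and specialize once hot
        self.counter += 1
        if self.counter >= self.WARMUP and type(a) is int and type(b) is int:
            self.specialized = True
        return a + b


instr = AdaptiveAdd()
for _ in range(8):
    instr.execute(1, 2)
print(instr.specialized)        # True: instruction has been specialized
instr.execute("a", "b")         # type guard fails
print(instr.specialized)        # False: instruction deoptimized
```

The real interpreter also tracks deoptimization counts so that instructions that keep missing their guards stop trying to re-specialize, but the core idea is exactly this count-guard-fallback loop.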
The implementation adds new specialized opcodes for common patterns:
- LOAD_ATTR_INSTANCE_VALUE for attribute access on regular objects
- BINARY_OP_ADD_INT and BINARY_OP_ADD_FLOAT for arithmetic
- CALL_FUNCTION_BUILTIN for calls to built-in functions
- Several others for common dictionary and list operations
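You can watch specialization happen with the dis module. On 3.11+, dis.dis accepts an adaptive=True flag that shows the quickened instructions after a function has been warmed up (the exact specialized names may vary between alphas); on older versions the same call without the flag just shows the generic bytecode.

```python
import dis
import io
import sys

def add_ints(a, b):
    return a + b

# Warm the function up so the adaptive interpreter can specialize it
for _ in range(100):
    add_ints(1, 2)

out = io.StringIO()
if sys.version_info >= (3, 11):
    # adaptive=True displays specialized (quickened) instructions
    dis.dis(add_ints, file=out, adaptive=True)
else:
    dis.dis(add_ints, file=out)
print(out.getvalue())
```

On a 3.11 alpha you should see a specialized add opcode in place of the generic one; on 3.10 and earlier you’ll see plain BINARY_ADD.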
Lazy Python Frames
Another significant optimization in 3.11 is the lazy creation of Python frame objects. In previous versions, every function call created a full frame object on the Python heap. In 3.11, frames are created on the C stack by default and only materialized as Python objects when something actually needs them (like a debugger or sys._getframe()).
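The materialization is observable from Python code: calling sys._getframe() (or attaching a debugger) forces a real frame object into existence, after which the usual frame attributes work exactly as they always have. A quick sketch:

```python
import sys

def inner():
    # In 3.11, this function runs on a lightweight C-stack frame until
    # sys._getframe() forces a full Python frame object to be created.
    frame = sys._getframe()
    return frame.f_code.co_name, frame.f_back.f_code.co_name

def outer():
    return inner()

print(outer())  # ('inner', 'outer')
```

Code that never touches frames never pays for them — which is the whole point.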
This sounds like a minor implementation detail, but function call overhead is one of Python’s most significant bottlenecks. Reducing the cost of every function call has a cascading effect across essentially all Python code. Mark Shannon’s PEP 659 implementation notes show that frame creation was one of the top consumers of CPU time in typical Python workloads.
The practical result: function-call-heavy code (which is most well-structured Python code) gets a meaningful speedup for free, without any code changes.
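If you want to estimate call overhead on your own interpreter, timeit makes a rough comparison easy. This measures a loop of do-nothing function calls against an empty loop; the absolute numbers will vary by machine and Python version, so treat it as a ballpark, not a benchmark.

```python
import timeit

def noop():
    pass

N = 1_000_000

# Baseline: an empty loop of N iterations
baseline = timeit.timeit("for _ in range(N): pass",
                         globals={"N": N}, number=1)

# Same loop, but each iteration pays for one function call
with_calls = timeit.timeit("for _ in range(N): noop()",
                           globals={"N": N, "noop": noop}, number=1)

per_call = (with_calls - baseline) / N
print(f"approximate per-call overhead: {per_call * 1e9:.0f} ns")
```

Running this under 3.10 and a 3.11 alpha side by side is a simple way to see the frame-object change for yourself.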
Why This Time Is Different
I’ll confess to some initial skepticism. We’ve heard “Python will get faster” before. Unladen Swallow (Google’s attempt to add LLVM-based JIT compilation to CPython) was abandoned in 2011. PyPy has been “the fast Python” for over a decade but never achieved mainstream adoption due to compatibility issues with C extensions.
Three things make the Faster CPython project different:
First, it’s happening inside CPython itself. No alternative runtime, no compatibility layer, no separate ecosystem. When 3.11 ships, everyone who upgrades gets the performance improvements. That’s a distribution advantage that no external project can match.
Second, Guido is leading it. Having the language’s creator driving performance work means the project has both deep language knowledge and political capital within the Python community. Technical decisions that might be contentious coming from an outsider are accepted more readily when Guido’s name is attached.
Third, Microsoft is funding it. Guido joined Microsoft’s Developer Division in 2020, and the company is paying for a small team to work on this full-time. That’s the kind of sustained investment that open source performance work needs — you can’t optimize a decades-old interpreter in weekends and spare time.
What This Means for Python Developers
If you’re writing Python today, here’s what’s actionable:
Don’t change your code for performance yet. The optimizations in 3.11 work best on idiomatic Python. Writing “clever” code to work around interpreter limitations often makes things worse when the optimizer improves. Write clean, straightforward Python and let the interpreter do its job.
Start testing against 3.11 alpha. If you maintain a library, set up CI against 3.11 now. The sooner compatibility issues surface, the easier they are to fix. Most well-maintained packages should work without changes, but C extensions occasionally need updates.
Watch the benchmark results. The pyperformance benchmark suite is the standard measure. Current results show impressive gains on compute-heavy benchmarks (spectral_norm is 50%+ faster) and modest gains on I/O-heavy code (where the interpreter speed matters less).
Keep realistic expectations. Even with a 25% average speedup, Python is still going to be slower than Go, Rust, or Java for CPU-bound work. The goal isn’t to make Python competitive in raw performance — it’s to make Python fast enough that performance isn’t the reason you choose a different language.
My Take
I’ve been writing Python since the 2.x days, and I’ve watched the “Python is too slow” debate play out for years. The pragmatic answer has always been that Python’s productivity advantages outweigh its performance costs for most workloads, and for the hot paths, you drop into C extensions, Cython, or NumPy.
What excites me about Faster CPython is that it’s a credible plan to shrink that performance gap without sacrificing what makes Python great. The specializing interpreter is technically elegant — it works with Python’s dynamic nature rather than against it.
The 3.11 release is scheduled for October 2022, and based on what I’m seeing in the alphas, it’s going to be one of the most significant CPython releases in years. Not for new syntax or features, but for making existing code run faster. Sometimes the most impactful improvements are the ones that require zero changes from the user.
This is part of my Developer Landscape series, tracking the trends and shifts that shape how we build software.
