Transmon Qubits: How the Most-Deployed Superconducting Qubit Actually Works
Transmons are the single most-deployed qubit type in quantum computing — every IBM, Google, and Rigetti processor as of 2026 is built on transmons or close cousins. This tutorial builds the transmon from the underlying Cooper-pair-box physics, explains why the design tradeoff matters, surveys current 2026 hardware numbers (T1, T2, two-qubit gate error, the Willow below-threshold result), and gives an honest verdict on what the platform's hard scaling problems actually are.
Prerequisites: Tutorial 20: Quantum Hardware Compared
If you have used quantum hardware in the cloud — IBM Quantum, Rigetti, the now-legacy Google access tiers — you have been using transmons. The transmon is the dominant superconducting qubit family, the design behind every flagship error-correction experiment of the last five years (Google Willow, IBM Heron, Quantinuum’s superconducting partnerships), and the qubit type with by far the most published peer-reviewed data on real fault-tolerance progress.
It is also a qubit that is not yet good enough for useful fault-tolerant computation. The gap between today’s transmon coherence and the coherence required for RSA-2048 factoring is roughly six orders of magnitude in logical error rate. Every honest transmon-roadmap conversation in 2026 is fundamentally a conversation about how to close that gap — and which of the dozen sub-engineering problems (correlated errors, leakage, materials defects, cosmic rays, packaging, calibration, decoder hardware) gets attacked first.
This tutorial builds the transmon from the underlying Cooper-pair-box physics, explains the design tradeoff that made it dominant, surveys current 2026 hardware numbers, and gives a structured view of the scaling bottlenecks.
The Cooper-pair box, briefly
A superconducting qubit starts from the realization that a small superconducting circuit element behaves quantum-mechanically. The simplest nontrivial element is the Cooper-pair box: a small superconducting island connected to a reservoir through a Josephson junction, biased by a gate voltage.
The Hamiltonian has two pieces:
- Charging energy: $4E_C(\hat{n} - n_g)^2$, where $\hat{n}$ is the number of Cooper pairs on the island, $n_g$ is the gate-voltage-controlled charge offset, and $E_C = e^2/2C$ is the charging energy set by the island capacitance $C$ (the factor of 4 accounts for the Cooper pair's charge $2e$).
- Josephson energy: $-E_J\cos\hat{\varphi}$, where $\hat{\varphi}$ is the phase difference across the junction and $E_J$ is the Josephson coupling energy.
The conjugate variables $\hat{n}$ and $\hat{\varphi}$ behave like position and momentum in quantum mechanics. The Cooper-pair box has discrete charge states, anharmonic energy levels, and supports quantum coherent oscillations. The original Nakamura-Pashkin-Tsai 1999 demonstration showed coherent oscillations at MHz timescales, short by today’s standards but the first demonstration that superconducting circuits could actually be qubits.
The problem with the original Cooper-pair box: charge noise dominates. Tiny variations in $n_g$, caused by stray charges in the substrate, dephase the qubit. Coherence times in the original Nakamura-Pashkin-Tsai design were nanoseconds.
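The charge sensitivity can be checked numerically. The following is a minimal sketch, assuming the standard charge-basis form of the Cooper-pair-box Hamiltonian (diagonal entries $4E_C(n - n_g)^2$, off-diagonal entries $-E_J/2$), truncated to $|n| \le 20$. The function names, the truncation, and the choice $E_J/E_C = 1$ are illustrative assumptions, not parameters of the 1999 device.

```python
def count_below(diag, off, x):
    """Sturm count: how many eigenvalues of a symmetric tridiagonal matrix
    (diagonal `diag`, constant off-diagonal `off`) lie below x."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        q = (d - x) - (off * off / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-30  # nudge off zero so the next division is defined
        if q < 0.0:
            count += 1
    return count


def eigenvalue(diag, off, k, iterations=100):
    """k-th smallest eigenvalue (k = 0 is the ground state) by bisection."""
    lo = min(diag) - 2.0 * abs(off)  # Gershgorin bounds on the spectrum
    hi = max(diag) + 2.0 * abs(off)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def transition_energy(ej, ec, ng, n_max=20):
    """E_01 for H = 4*E_C*(n - n_g)^2 - (E_J/2) * (|n><n+1| + h.c.)."""
    diag = [4.0 * ec * (n - ng) ** 2 for n in range(-n_max, n_max + 1)]
    off = -0.5 * ej
    return eigenvalue(diag, off, 1) - eigenvalue(diag, off, 0)


EC = 1.0  # work in units of E_C
EJ = 1.0  # assumed E_J/E_C = 1, i.e. the charge-sensitive regime
for ng in (0.0, 0.25, 0.5):
    print(f"n_g = {ng:.2f}: E_01 = {transition_energy(EJ, EC, ng):.3f} E_C")
```

In this regime the qubit transition energy changes by several $E_C$ as $n_g$ moves from 0 to 0.5, so any drift in the offset charge shifts the qubit frequency directly.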
What the transmon design fixes
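The transmon (Koch et al., 2007) is the Cooper-pair box operated in the $E_J \gg E_C$ regime, typically $E_J/E_C \gtrsim 50$. This single parameter change has dramatic consequences.
The transmon Hamiltonian, expanded around the bottom of the Josephson cosine potential:
$$H \approx \sqrt{8 E_J E_C}\,\hat{b}^\dagger \hat{b} - \frac{E_C}{12}\left(\hat{b} + \hat{b}^\dagger\right)^4$$
where $\hat{b}^\dagger, \hat{b}$ are bosonic ladder operators. Three properties matter: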
- Charge dispersion is exponentially suppressed. The dependence of energy levels on $n_g$ falls off as $e^{-\sqrt{8E_J/E_C}}$. At $E_J/E_C = 50$, this is roughly $e^{-20} \approx 2 \times 10^{-9}$, essentially zero. Charge noise becomes invisible. This is the central design win of the transmon.
- Anharmonicity is preserved, barely. The $0 \to 1$ and $1 \to 2$ transitions differ by $\alpha \approx -E_C$: small, but enough for selective microwave drives to address only the qubit transition. Anharmonicity of 200-300 MHz on a ~5 GHz qubit is typical.
- Coherence times explode. Once charge noise is suppressed, the dominant decoherence channels become dielectric loss in the substrate, two-level-system defects in oxide layers, and quasiparticle poisoning. By 2026, $T_1$ values of roughly 100 µs are typical and high-end devices have demonstrated millisecond-scale $T_1$.
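The two competing trends in this list can be tabulated directly. The sketch below uses the standard large-$E_J/E_C$ asymptotics from Koch et al. (2007): charge dispersion scales as $e^{-\sqrt{8E_J/E_C}}$ (prefactors omitted), the qubit transition is $\hbar\omega_{01} \approx \sqrt{8E_JE_C} - E_C$, and the anharmonicity is $\alpha \approx -E_C$. The value $E_C/h = 250$ MHz and the list of ratios are assumptions chosen for illustration.

```python
import math


def transmon_summary(ratio, ec_ghz=0.25):
    """Asymptotic transmon figures for a given E_J/E_C (frequencies in GHz)."""
    ej_ghz = ratio * ec_ghz
    f01_ghz = math.sqrt(8.0 * ej_ghz * ec_ghz) - ec_ghz    # qubit transition
    alpha_ghz = -ec_ghz                                     # anharmonicity
    dispersion_factor = math.exp(-math.sqrt(8.0 * ratio))   # prefactor omitted
    return {
        "ratio": ratio,
        "f01_ghz": f01_ghz,
        "alpha_ghz": alpha_ghz,
        "relative_anharmonicity": alpha_ghz / f01_ghz,
        "dispersion_factor": dispersion_factor,
    }


# The asymptotic formulas are only rough at the low end of this range.
for ratio in (10, 20, 50, 100):
    s = transmon_summary(ratio)
    print(f"E_J/E_C = {ratio:>3d}: f01 = {s['f01_ghz']:.2f} GHz, "
          f"alpha/f01 = {s['relative_anharmonicity']:+.3f}, "
          f"dispersion factor = {s['dispersion_factor']:.1e}")
```

Raising the ratio shrinks the dispersion factor exponentially while the relative anharmonicity shrinks only as an inverse square root, which is the asymmetry the transmon exploits.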
The price paid is anharmonicity: the transmon’s anharmonicity is much smaller than that of the original Cooper-pair box, so single-qubit gates have to be slow enough to avoid driving the $1 \to 2$ transition. Modern DRAG and other shaped-pulse techniques largely close this gap, allowing single-qubit gates of a few tens of nanoseconds with very low leakage.
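A rough way to see why small anharmonicity limits gate speed: a Gaussian envelope of width $\sigma$ has a Gaussian spectrum, so its spectral amplitude at a detuning $\Delta$ from the carrier is reduced by $e^{-(2\pi\Delta\sigma)^2/2}$ relative to the peak. The sketch below evaluates that factor at the detuning of the $1 \to 2$ transition. It is a spectral-overlap proxy only, not a leakage simulation and not a DRAG implementation; the 250 MHz anharmonicity and the pulse widths are assumed values.

```python
import math


def spectral_weight_at_detuning(sigma_ns, detuning_ghz):
    """Relative spectral amplitude of a Gaussian envelope exp(-t^2 / 2 sigma^2)
    at a given detuning from the carrier (GHz * ns is dimensionless)."""
    x = 2.0 * math.pi * detuning_ghz * sigma_ns
    return math.exp(-0.5 * x * x)


ANHARMONICITY_GHZ = 0.25  # assumed |alpha| / h

for sigma_ns in (0.5, 1.0, 2.0, 5.0):
    weight = spectral_weight_at_detuning(sigma_ns, ANHARMONICITY_GHZ)
    print(f"sigma = {sigma_ns:3.1f} ns -> relative amplitude at the "
          f"1->2 transition: {weight:.2e}")
```

Pulses with $\sigma$ of a nanosecond or less still carry substantial weight at the leakage transition, while a few nanoseconds of width suppresses it by many orders of magnitude. Pulse shaping is what lets real gates sit closer to the short end of that range.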
Anatomy of a real transmon device
A 2026-vintage transmon device (e.g., one IBM Heron module) consists of:
- Transmon islands. Lithographically patterned superconducting “pads” (typically aluminum or niobium) acting as the Cooper-pair box capacitor.
- Josephson junctions. Aluminum-oxide tunnel junctions between the islands and a ground plane, providing the inductance.
- Coplanar waveguide resonators. One per qubit, dispersively coupled to the transmon for state readout. The resonator-transmon dispersive coupling is the foundation of circuit QED.
- Tunable couplers. Newer designs (IBM Heron, Google Willow) use additional Josephson elements between qubits as tunable couplers that turn the qubit-qubit interaction on and off. This eliminates always-on ZZ crosstalk and is crucial for scalability.
- Control wiring. Microwave lines for drive (XY control, ~5 GHz) and flux lines for fast tunability (DC-100 MHz), with attenuators and filters along the dilution-refrigerator stages.
- Readout multiplexing. Multiple resonators on a single feedline at different frequencies, demuxed at room temperature.
The whole thing operates at roughly 10-20 mK in a dilution refrigerator, with classical control electronics at room temperature transmitting microwave pulses through cryogenic cables. The cryogenic-control engineering is half the challenge.
Current 2026 numbers
The transmon performance numbers most worth knowing as of 2026:
| System | Qubits | $T_1$ | 2-qubit gate error | Public access |
|---|---|---|---|---|
| Google Willow | 105 | ~80 µs | ~3 × 10⁻³ | Restricted research |
| IBM Heron r2 | 156 | ~120 µs | ~3 × 10⁻³ | IBM Quantum cloud |
| IBM Nighthawk | 120 | ~100 µs | ~2 × 10⁻³ | IBM Quantum cloud |
| Rigetti Ankaa-2 | 84 | ~30 µs | ~10⁻² | AWS Braket |
| Quantinuum (transmon) | research | ~150 µs | ~2 × 10⁻³ | research |
Two notes on reading this table:
- $T_1$ varies across qubits on the same chip. A given device has a distribution of $T_1$ values; the number reported is typically a median or the value for the best qubits. The worst qubits on the same chip can be 2-3× shorter.
- Two-qubit gate errors are platform-dependent. Google Willow’s ~3 × 10⁻³ is from randomized benchmarking on their CZ gate. IBM’s Heron r2 uses cross-resonance + tunable couplers and reports similar numbers. Rigetti has historically reported higher errors.
The threshold for the surface code is roughly $10^{-2}$. By 2026, all flagship transmon platforms operate below this threshold: gate errors are below 1%, and error correction has the room to suppress logical errors exponentially. Tutorial 19 covered Willow’s below-threshold demonstration in detail.
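The exponential-suppression claim can be made concrete with the commonly used heuristic $\epsilon_L(d) \approx A\,(p/p_{\text{th}})^{(d+1)/2}$ for a distance-$d$ surface code. This is a scaling model, not a fit to any device in the table; the prefactor $A = 0.1$ and the threshold $p_{\text{th}} = 10^{-2}$ are assumptions for illustration.

```python
def logical_error(p, d, p_th=1e-2, prefactor=0.1):
    """Heuristic logical error rate for a distance-d surface code."""
    return prefactor * (p / p_th) ** ((d + 1) / 2)


for p in (3e-3, 1e-3):
    row = ", ".join(f"d={d}: {logical_error(p, d):.1e}" for d in (3, 5, 7))
    # In this model each step d -> d+2 divides the logical error by p_th / p.
    print(f"p = {p:.0e} (suppression per step = {1e-2 / p:.1f}x) -> {row}")
```

Under this model a physical error of ~3 × 10⁻³ buys only about a 3× reduction per distance step, which is why modest improvements in physical fidelity translate into large savings in code distance.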
How the platform actually scales
Three engineering bottlenecks dominate transmon scaling beyond the ~100-1000 qubit range:
Connectivity vs crosstalk
Transmons couple via shared resonators or capacitive elements. More connections means more potential crosstalk paths — pairs of qubits that should be independent end up correlated through unintended interactions. The classical workaround is sparse 2D connectivity (heavy-hex, square-grid), but qLDPC codes (tutorial 28) want richer connectivity that current transmon hardware cannot natively support. IBM’s Nighthawk introduced tunable couplers that can selectively connect non-nearest-neighbors; Google’s Willow stays on a square grid. The connectivity-vs-crosstalk tradeoff is the architectural lever shaping which error-correction codes each vendor’s roadmap targets.
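One way to see the cost of sparse connectivity: on a nearest-neighbor square grid, a two-qubit gate between qubits at Manhattan distance $D$ needs $D - 1$ SWAPs to bring them adjacent under naive routing, and each SWAP decomposes into three CNOTs. A minimal sketch of that count follows. It assumes a plain SWAP chain with no reuse or smarter compilation, and the coordinates are hypothetical.

```python
def swap_overhead(a, b):
    """Extra gates needed to make grid qubits a and b adjacent via a SWAP chain.

    a and b are (row, column) positions on a nearest-neighbor square grid.
    """
    distance = abs(a[0] - b[0]) + abs(a[1] - b[1])
    if distance == 0:
        raise ValueError("a and b must be different qubits")
    n_swaps = distance - 1
    return {"distance": distance, "swaps": n_swaps, "extra_cnots": 3 * n_swaps}


for pair in [((0, 0), (0, 1)), ((0, 0), (2, 2)), ((0, 0), (3, 4))]:
    cost = swap_overhead(*pair)
    print(f"{pair[0]} -> {pair[1]}: distance {cost['distance']}, "
          f"{cost['swaps']} SWAPs, {cost['extra_cnots']} extra CNOTs")
```

Every extra CNOT carries the full two-qubit gate error, so long-range interactions on a sparse grid consume the error budget quickly. This is the pressure behind richer-connectivity proposals.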
Materials-limited coherence
After 2020, transmon improvements came largely from materials science rather than circuit design. Niobium surface oxidation, sapphire substrate purity, aluminum-oxide tunnel-junction defects, and electroplating cleanliness all matter at the few-percent-per-decade level. The 2024 papers (Bal et al., niobium surface encapsulation; Kono et al., mechanically induced correlated errors) document both progress and remaining bottlenecks. Transmon coherence is now a fab problem more than a physics problem.
Correlated errors
Once two-qubit gate errors fall well below the surface-code threshold, single-qubit-error metrics stop being the binding constraint. Rare correlated error events (cosmic-ray-induced charge bursts, mechanical vibrations coupling to the substrate, leakage out of the computational subspace, quasiparticle poisoning) become disproportionately important because error correction assumes independent errors. The 2024 Kono et al. paper documented mechanically induced correlated errors on devices with $T_1$ exceeding 0.4 ms, showing the regime where correlated errors are no longer a minor correction. Suppressing correlated errors is the next big engineering frontier, and several research groups are publishing detailed studies of cosmic-ray mitigation, packaging design, and quasiparticle dynamics.
Gate mechanisms
Transmon devices implement two-qubit gates by several mechanisms, depending on the architecture:
- Cross-resonance (CR) gates. Drive one qubit at the frequency of the other. The shared coupling produces an effective $ZX$ interaction, which composes with single-qubit gates into a CNOT. Used on IBM Heron and earlier devices. CR gates are typically ~200 ns and have errors around $5 \times 10^{-3}$.
- CZ gates with tunable couplers. Adjust the coupling between two qubits to a value that produces a controlled-phase gate in a fixed time. Faster than CR (~30 ns) and more precise. Used on Google Willow and IBM Nighthawk.
- Direct iSWAP-type exchange gates with capacitive coupling. Used on early Rigetti and Google devices; mostly superseded by tunable-coupler designs.
The choice of native gate matters because every algorithm needs to be transpiled onto the available native gate set. Transpilation overhead can vary from negligible (CZ-equivalent) to significant (factor of 2-3 for cross-resonance), affecting effective gate count.
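The "negligible" end of that range is easy to verify: a CNOT is a CZ conjugated by Hadamards on the target qubit. The sketch below checks the identity $(I \otimes H)\,\mathrm{CZ}\,(I \otimes H) = \mathrm{CNOT}$ with plain 4×4 matrices; the helper names are illustrative and no quantum SDK is assumed.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


S = 1.0 / math.sqrt(2.0)
I2 = [[1.0, 0.0], [0.0, 1.0]]
H = [[S, S], [S, -S]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# Basis order |control, target>; the Hadamards act on the target only.
H_ON_TARGET = kron(I2, H)
built = matmul(H_ON_TARGET, matmul(CZ, H_ON_TARGET))
max_diff = max(abs(built[i][j] - CNOT[i][j])
               for i in range(4) for j in range(4))
print(f"max |(I x H) CZ (I x H) - CNOT| = {max_diff:.1e}")
```

The two-qubit cost of a CNOT on a CZ-native device is therefore exactly one native gate; the overhead is two fast single-qubit gates.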
A small Python comparison
Here is a small benchmark comparing transmon and ion gate budgets for the same circuit. The point is not to claim winner — the point is to make the platform-specific gate cost visible.
```python
# Logical CNOT count for a small algorithm.
# Transmon (with cross-resonance native two-qubit): each logical CNOT = 1 CR + single-qubit overheads.
# Ion (with native MS gate): each logical CNOT = 1 MS + single-qubit overheads.
def gate_budget(n_logical_cnots: int, platform: str) -> dict:
    """Return total run time and success probability for n CNOTs on a platform."""
    if platform == "transmon-cr":
        # Each CR gate is ~200 ns, error ~5e-3.
        # Each single-qubit gate is ~30 ns, error ~5e-4.
        # CNOT decomposition: 1 CR + 4 single-qubit = ~330 ns, error ~5e-3 + 4 * 5e-4 = 7e-3.
        gate_time_ns = 330
        gate_error = 7e-3
    elif platform == "transmon-cz":
        # CZ with tunable couplers: ~30 ns, error ~3e-3.
        # CNOT decomposition: 1 CZ + 2 H = ~90 ns, error ~3e-3 + 2 * 5e-4 = 4e-3.
        gate_time_ns = 90
        gate_error = 4e-3
    elif platform == "ion-ms":
        # MS gate ~50 us, error ~3e-4 (Quantinuum).
        # CNOT: 1 MS + 2 single-qubit = ~80 us, error ~3e-4 + 2e-5 = 3e-4.
        gate_time_ns = 80_000
        gate_error = 3e-4
    elif platform == "neutral-atom":
        # Rydberg CZ: ~1 us, error ~3e-3 (Atom Computing 2025).
        gate_time_ns = 1_000
        gate_error = 3e-3
    else:
        raise ValueError(platform)

    total_time_us = n_logical_cnots * gate_time_ns / 1000
    # Assumes independent errors: every CNOT must succeed for the run to succeed.
    total_success = (1 - gate_error) ** n_logical_cnots
    return {
        "platform": platform,
        "n_cnots": n_logical_cnots,
        "total_time_us": total_time_us,
        "total_success_prob": total_success,
    }


for plat in ["transmon-cr", "transmon-cz", "ion-ms", "neutral-atom"]:
    r = gate_budget(n_logical_cnots=1000, platform=plat)
    print(f"{r['platform']:>14s}: time={r['total_time_us']:.1f} us, "
          f"success={r['total_success_prob']:.4f}")
```
Sample output:
```text
   transmon-cr: time=330.0 us, success=0.0009
   transmon-cz: time=90.0 us, success=0.0182
        ion-ms: time=80000.0 us, success=0.7408
  neutral-atom: time=1000.0 us, success=0.0496
```
A 1,000-CNOT circuit on an ion platform succeeds 74% of the time despite being hundreds of times slower; on transmon-CR it succeeds about 0.1% of the time despite being fast. Two-qubit gate fidelity matters more than speed for any algorithm long enough to actually compute something. This is why ion platforms remain competitive despite slow gates: high fidelity gives them a much longer “circuit-depth budget” before logical errors dominate.
This calculation is back-of-envelope and does not include error correction (which radically changes the picture by suppressing logical errors at the cost of much larger gate counts). It captures only the bare-metal NISQ-era picture.
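The circuit-depth budget can be read off by inverting the same formula: the largest $n$ with $(1-\epsilon)^n \ge p_{\text{target}}$ is $\lfloor \ln p_{\text{target}} / \ln(1-\epsilon) \rfloor$. The sketch below reuses the per-CNOT error figures assumed in the benchmark; the 50% success target is an arbitrary choice made for illustration.

```python
import math

# Per-CNOT error figures taken from the gate_budget() assumptions.
CNOT_ERROR = {
    "transmon-cr": 7e-3,
    "transmon-cz": 4e-3,
    "ion-ms": 3e-4,
    "neutral-atom": 3e-3,
}


def max_cnots(gate_error, target_success=0.5):
    """Largest CNOT count whose overall success probability stays >= target."""
    return math.floor(math.log(target_success) / math.log(1.0 - gate_error))


for platform, error in CNOT_ERROR.items():
    print(f"{platform:>14s}: up to {max_cnots(error)} CNOTs at >= 50% success")
```

Without error correction, the depth budget scales roughly as $1/\epsilon$, so a tenfold fidelity advantage is a tenfold advantage in usable circuit depth regardless of gate speed.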
Common misconceptions
“Transmons are obviously the leader because IBM and Google use them.” Not exactly. Transmons have the most cumulative engineering investment and the most public hardware data. They are dominant now. Whether they remain dominant in the fault-tolerant era depends on whether the materials and correlated-error problems can be solved at scale — open questions as of 2026.
“More qubits is always better.” Not true on transmon hardware. Without proportionally improved fidelity, more qubits multiplies the calibration burden and crosstalk paths without adding useful computational depth. The 2025-2026 trend has been fewer, better qubits (IBM Nighthawk’s 120 qubits at ~100 µs $T_1$ vs older 433-qubit devices at lower coherence), a pivot many publications missed.
“Transmon = Josephson junction.” The Josephson junction is the central nonlinear element, but the transmon’s design is about how the junction interacts with the rest of the circuit (capacitor, resonator, drive lines). Several other superconducting qubits also use Josephson junctions (fluxonium, flux qubit, 0-π qubit) but operate in different regimes. Transmons specifically operate in the $E_J \gg E_C$ regime; other qubits trade that off for different properties.
“Below-threshold means we have fault tolerance now.” Below-threshold means logical errors can be suppressed exponentially with code distance. Reaching the logical error rates that useful algorithms need from today’s physical error rates still requires a large code distance $d$, which means roughly $2d^2$ physical qubits per logical qubit in a surface code. Below-threshold is the start of fault-tolerance, not the end. Tutorial 27 walked through the resource math.
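The resource arithmetic behind that last point can be sketched with the heuristic $\epsilon_L(d) \approx A\,(p/p_{\text{th}})^{(d+1)/2}$ and the rotated-surface-code count of $2d^2 - 1$ physical qubits per logical qubit. The prefactor, threshold, target logical error, and the 5 × 10⁻⁴ physical error are illustrative assumptions, not numbers from any vendor roadmap.

```python
def required_distance(p, target, p_th=1e-2, prefactor=0.1):
    """Smallest odd distance d with prefactor * (p / p_th)^((d+1)/2) <= target."""
    if p >= p_th:
        raise ValueError("no suppression at or above threshold")
    d = 3
    while prefactor * (p / p_th) ** ((d + 1) / 2) > target:
        d += 2
    return d


def physical_qubits(d):
    """Data plus measurement qubits for one rotated-surface-code patch."""
    return 2 * d * d - 1


TARGET = 1e-12  # assumed target logical error rate
for p in (3e-3, 2e-3, 5e-4):
    d = required_distance(p, TARGET)
    print(f"p = {p:.0e}: distance {d}, "
          f"{physical_qubits(d)} physical qubits per logical qubit")
```

In this model, moving from a 3 × 10⁻³ to a 5 × 10⁻⁴ physical error rate cuts the required distance from 43 to 17 and the qubit count per logical qubit by more than a factor of six.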
Decision rule for picking transmons
When evaluating whether transmons fit your application:
- What is your algorithm depth? Below 100 two-qubit gates: transmons are fine. 100-1000 gates: marginal, depends on platform-specific fidelity. Above 1000 gates without error correction: trapped ions or neutral atoms are likely better.
- What is your latency budget? Microsecond-level wall-clock requirements (real-time control loops, RTI applications): transmons. Multi-second runs are fine for any platform.
- Is your algorithm naturally 2D-grid friendly? Yes: transmons. Highly non-local connectivity required: ions or neutral atoms.
- Are you targeting the fault-tolerant era? Then the answer depends on which fault-tolerance roadmap you trust most. IBM’s Starling roadmap is the most concrete transmon-based FTQC plan; Google continues on transmon-based surface code.
Exercises
1. The $E_J/E_C$ tradeoff
Why can’t you simply make $E_J/E_C$ as large as possible? Compute the anharmonicity in the limit $E_J/E_C \to \infty$ and explain what goes wrong.
Answer:
The anharmonicity is approximately $\alpha \approx -E_C$, so as $E_J/E_C \to \infty$ at fixed transition frequency $\hbar\omega_{01} \approx \sqrt{8E_JE_C}$, the anharmonicity vanishes ($|\alpha|/\hbar\omega_{01} \approx (8E_J/E_C)^{-1/2} \to 0$). Without anharmonicity, the qubit becomes indistinguishable from a harmonic oscillator and you cannot do single-qubit gates without leakage to higher levels. The transmon design picks $E_J/E_C \approx 50$ as a compromise: charge dispersion exponentially small and anharmonicity large enough for fast gates. Going much higher buys negligible charge-noise improvement and costs anharmonicity.
2. Coherence vs gate time
A transmon has $T_1 = 100$ µs and you do single-qubit gates of duration 30 ns. What is the gate-error contribution from $T_1$ alone? Compare to a hypothetical platform with $T_1 = 10$ µs and 1 ns gates.
Answer:
$T_1$-induced gate error is approximately $t_{\text{gate}}/T_1$ for short gates. Transmon: $30\ \text{ns} / 100\ \mu\text{s} = 3 \times 10^{-4}$. Hypothetical fast platform: $1\ \text{ns} / 10\ \mu\text{s} = 10^{-4}$. The fast platform wins on $T_1$-induced error despite having shorter $T_1$, because gate time is much shorter. The relevant figure of merit is $T_1/t_{\text{gate}}$, not $T_1$ alone. Transmon $T_1/t_{\text{gate}} \approx 3 \times 10^3$; trapped ions reach far higher ratios (long $T_1$, slow gates), but the practical gate error is dominated by the gate itself, not by $T_1$ decay.
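A minimal sketch of this arithmetic, taking the short-gate estimate $\epsilon \approx t_{\text{gate}}/T_1$ at face value (the exact prefactor depends on the gate and the error metric). The $T_1$ values of 100 µs and 10 µs and the platform labels are illustrative assumptions, not measured figures.

```python
def t1_limited_error(t_gate_ns, t1_us):
    """Short-gate estimate of the T1 contribution to gate error: t_gate / T1."""
    return t_gate_ns / (t1_us * 1000.0)


# (gate time in ns, T1 in microseconds); both rows are assumed values.
platforms = {
    "transmon (30 ns gate, T1 = 100 us)": (30.0, 100.0),
    "fast platform (1 ns gate, T1 = 10 us)": (1.0, 10.0),
}

for name, (t_gate_ns, t1_us) in platforms.items():
    error = t1_limited_error(t_gate_ns, t1_us)
    print(f"{name}: T1-limited error ~ {error:.1e}, "
          f"T1 / t_gate ~ {1.0 / error:.0f}")
```

The comparison depends only on the ratio $T_1/t_{\text{gate}}$, which is why a platform with shorter $T_1$ can still come out ahead on this metric.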
3. Why correlated errors matter
A device has independent single-qubit error $10^{-4}$ but a correlated-error rate of $10^{-6}$ per qubit per syndrome cycle (e.g., a cosmic ray hitting the chip). For a 1000-qubit device running 100 syndrome cycles, what is the dominant error source?
Answer:
Independent errors per cycle per qubit: $10^{-4}$. With 1000 qubits and 100 cycles, total expected independent errors: $10^{-4} \times 1000 \times 100 = 10$. Correlated events: $10^{-6} \times 1000 \times 100 = 0.1$ correlated events on average over the run. Correlated events are 100× rarer than independent errors numerically, but each correlated event corrupts many qubits simultaneously, which the surface-code decoder handles much worse than independent errors. The right way to count is: each correlated event ~ one logical-error candidate; 0.1 events per run is a real per-run failure rate. The independent-error count includes errors that error correction handles cleanly. Correlated events become the dominant logical-error cause well before they dominate the raw error rate.
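The counting in this answer can be sketched as below. The rates of 10⁻⁴ and 10⁻⁶ per qubit per cycle are the ones implied by the answer's figure of 0.1 correlated events per run and its factor-of-100 ratio; the function names are illustrative.

```python
def expected_events(rate, n_qubits, n_cycles):
    """Expected number of events over a run, given a per-qubit-per-cycle rate."""
    return rate * n_qubits * n_cycles


def prob_at_least_one(rate, n_qubits, n_cycles):
    """Probability that at least one event occurs anywhere during the run."""
    return 1.0 - (1.0 - rate) ** (n_qubits * n_cycles)


N_QUBITS, N_CYCLES = 1000, 100
INDEPENDENT_RATE = 1e-4  # per qubit per cycle
CORRELATED_RATE = 1e-6   # per qubit per cycle

print("expected independent errors:",
      expected_events(INDEPENDENT_RATE, N_QUBITS, N_CYCLES))
print("expected correlated events: ",
      expected_events(CORRELATED_RATE, N_QUBITS, N_CYCLES))
print("P(at least one correlated event) =",
      round(prob_at_least_one(CORRELATED_RATE, N_QUBITS, N_CYCLES), 4))
```

Roughly one run in ten sees a correlated event. If each such event is a candidate logical failure, that sets a floor on the run failure rate that no increase in code distance removes.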
4. When transmon stops being the right answer
For what kind of algorithm would you specifically not pick a transmon-based platform?
Answer:
(a) Algorithms that need long-range connectivity beyond what tunable couplers and 2D layouts provide. (b) Algorithms that need very long coherence times during execution (some adiabatic algorithms, some QML training loops); transmons’ few-hundred-microsecond coherence is short compared to ion or neutral-atom platforms. (c) Algorithms with very high gate counts and modest error correction (because transmon’s ~3 × 10⁻³ gate errors require deep error correction earlier than ion’s ~3 × 10⁻⁴). (d) Cold-atom-style quantum simulation of specific Hamiltonians where the underlying physics matches the platform. For most generic quantum-algorithm work (surface-code error correction, NISQ chemistry, quantum-supremacy demonstrations), transmons remain a strong choice in 2026.
Where this goes next
Tutorial 34 covers trapped ions, the platform with the highest published gate fidelities and the most mature analytical view of error mechanisms. Tutorial 35 covers neutral atoms / Rydberg arrays — the fastest-improving platform of the last three years. Tutorial 36 covers photonic quantum computing and the fusion-based approach that defines PsiQuantum’s roadmap.