hardware advanced · 21 min read · April 29, 2026

Transmon Qubits: How the Most-Deployed Superconducting Qubit Actually Works

Transmons are the single most-deployed qubit type in quantum computing — every IBM, Google, and Rigetti processor as of 2026 is built on transmons or close cousins. This tutorial builds the transmon from the underlying Cooper-pair-box physics, explains why the design tradeoff matters, surveys current 2026 hardware numbers (T1, T2, two-qubit gate error, the Willow below-threshold result), and gives an honest verdict on what the platform's hard scaling problems actually are.

Prerequisites: Tutorial 20: Quantum Hardware Compared

If you have used superconducting quantum hardware in the cloud (IBM Quantum, Rigetti, the now-legacy Google access tiers), you have been using transmons. The transmon is the dominant superconducting qubit family, the design behind the flagship superconducting error-correction experiments of the last five years (Google Willow, IBM Heron), and the qubit type with by far the most published peer-reviewed data on real fault-tolerance progress.

It is also a qubit that is not yet good enough for useful fault-tolerant computation. The gap between the logical error rates demonstrated on today’s transmon hardware and those required for RSA-2048 factoring is roughly six orders of magnitude. Every honest transmon-roadmap conversation in 2026 is fundamentally a conversation about how to close that gap, and about which of the many engineering sub-problems (correlated errors, leakage, materials defects, cosmic rays, packaging, calibration, decoder hardware) gets attacked first.

This tutorial builds the transmon from the underlying Cooper-pair-box physics, explains the design tradeoff that made it dominant, surveys current 2026 hardware numbers, and gives a structured view of the scaling bottlenecks.

The Cooper-pair box, briefly

A superconducting qubit starts from the realization that a small superconducting circuit element behaves quantum-mechanically. The simplest nontrivial element is the Cooper-pair box: a small superconducting island connected to a reservoir through a Josephson junction, biased by a gate voltage.

The Hamiltonian has two pieces:

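Charging energy: $4 E_C (n - n_g)^2$ , where $n$ is the number of excess Cooper pairs on the island, $n_g$ is the gate-voltage-controlled charge offset, and $E_C = e^2 / 2C$ is the single-electron charging energy set by the island’s total capacitance $C$ . The factor of 4 accounts for the charge $2e$ of a Cooper pair; this is the convention under which the later formulas ( $\sqrt{8 E_J E_C}$ , anharmonicity $-E_C$ ) hold.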
Josephson energy: $-E_J \cos\phi$ , where $\phi$ is the phase difference across the junction and $E_J$ is the Josephson coupling energy.

The conjugate variables $(n, \phi)$ behave like position and momentum in quantum mechanics. The Cooper-pair box has discrete charge states and anharmonic energy levels, and it supports quantum coherent oscillations. The original Nakamura-Pashkin-Tsai 1999 demonstration showed coherent oscillations that survived for only a few nanoseconds: short by today’s standards, but the first demonstration that superconducting circuits could actually be qubits.

The problem with the original Cooper-pair box: charge noise dominates. Tiny variations in $n_g$ , caused by stray charges in the substrate, shift the qubit frequency and dephase the qubit. This is why coherence times in the original Nakamura-Pashkin-Tsai design were on the order of nanoseconds.
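To make the charge-noise problem concrete, here is a minimal sketch using the textbook two-charge-state approximation of the Cooper-pair box, which is only reasonable when $E_J \ll E_C$ and $n_g$ is near the degeneracy point $n_g = 1/2$ . It assumes the $4 E_C (n - n_g)^2$ convention with $E_C = e^2/2C$ ; the function name and the parameter values are illustrative and are not taken from the 1999 device.

```python
import math

def cpb_splitting_ghz(n_g: float, e_c_ghz: float, e_j_ghz: float) -> float:
    """Qubit splitting when only the charge states n = 0 and n = 1 are kept.

    The two charge states differ in charging energy by 4*E_C*(1 - 2*n_g)
    and are coupled by E_J/2, so the splitting is the hypotenuse of the two.
    """
    charging_detuning = 4.0 * e_c_ghz * (1.0 - 2.0 * n_g)
    return math.hypot(charging_detuning, e_j_ghz)

# Illustrative charge-regime parameters (assumption): E_C = 5 GHz, E_J = 2 GHz.
E_C_GHZ, E_J_GHZ = 5.0, 2.0

for n_g in (0.50, 0.49, 0.45):
    f01 = cpb_splitting_ghz(n_g, E_C_GHZ, E_J_GHZ)
    print(f"n_g = {n_g:.2f}: splitting = {f01:.3f} GHz")
```

With these numbers, moving $n_g$ from 0.50 to 0.45 changes the qubit frequency from 2 GHz to about 2.8 GHz, so even small drifts in offset charge translate into large frequency fluctuations.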

What the transmon design fixes

The transmon (Koch et al., 2007) is the Cooper-pair box operated in the regime $E_J \gg E_C$ — typically $E_J / E_C \sim 50$ . This single parameter change has dramatic consequences.

The transmon Hamiltonian, expanded around the bottom of the Josephson cosine potential:

$$H_\text{transmon} \;\approx\; \sqrt{8 E_J E_C} \, b^\dagger b \;-\; \frac{E_C}{12} (b^\dagger + b)^4,$$

where $b, b^\dagger$ are bosonic ladder operators. Three properties matter:

Charge dispersion is exponentially suppressed. The dependence of energy levels on $n_g$ falls off as $\exp(-\sqrt{8 E_J / E_C})$ . At $E_J / E_C = 50$ , this is roughly $10^{-9}$ — essentially zero. Charge noise becomes invisible. This is the central design win of the transmon.
Anharmonicity is preserved, barely. The $|0\rangle \to |1\rangle$ and $|1\rangle \to |2\rangle$ transitions differ by $E_C$ — small, but enough for selective microwave drives to address only the qubit transition. Anharmonicity of $\sim 200$ MHz on a $\sim 5$ GHz qubit is typical.
Coherence times explode. Once charge noise is suppressed, the dominant decoherence channels become dielectric loss in the substrate, two-level-system defects in oxide layers, and quasiparticle poisoning. By 2026, $T_1$ values of $\sim 100\,\mu s$ are typical and high-end devices have demonstrated $T_1 > 0.4$ ms.
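The first two properties can be checked with a few lines of arithmetic. The sketch below evaluates the first-order estimates that follow from the expanded Hamiltonian: transition frequency $\sqrt{8 E_J E_C} - E_C$ , anharmonicity $-E_C$ , and the charge-dispersion factor $\exp(-\sqrt{8 E_J / E_C})$ . The function name and the specific $E_J$ , $E_C$ values are assumptions chosen only so that $E_J / E_C = 50$ .

```python
import math

def transmon_estimates(e_j_ghz: float, e_c_ghz: float) -> dict:
    """First-order transmon estimates; energies are expressed as frequencies in GHz."""
    ratio = e_j_ghz / e_c_ghz
    plasma_ghz = math.sqrt(8.0 * e_j_ghz * e_c_ghz)
    return {
        "ratio": ratio,
        "f01_ghz": plasma_ghz - e_c_ghz,   # |0> -> |1> transition frequency
        "anharmonicity_ghz": -e_c_ghz,     # f12 - f01
        "charge_suppression": math.exp(-math.sqrt(8.0 * ratio)),
    }

# Illustrative values (assumption): E_J = 12.5 GHz, E_C = 0.25 GHz.
est = transmon_estimates(e_j_ghz=12.5, e_c_ghz=0.25)
print(f"E_J/E_C            = {est['ratio']:.0f}")
print(f"f01                = {est['f01_ghz']:.2f} GHz")
print(f"anharmonicity      = {est['anharmonicity_ghz'] * 1000:.0f} MHz")
print(f"charge suppression = {est['charge_suppression']:.2e}")
```

For these inputs the suppression factor is $e^{-20}$ , about $2 \times 10^{-9}$ , matching the order of magnitude quoted above.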

The price paid is anharmonicity: the transmon’s anharmonicity is much smaller than that of the original Cooper-pair box, so a single-qubit pulse must be spectrally narrow enough, and therefore long enough, to avoid driving the $|1\rangle \to |2\rangle$ transition. Modern DRAG and other shaped-pulse techniques largely close this gap, allowing $\sim 30$ ns single-qubit gates with leakage below $10^{-3}$ .
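As a rough illustration of what a shaped pulse looks like, the sketch below builds a first-order DRAG-style envelope: a Gaussian in-phase component plus a quadrature component proportional to the Gaussian’s time derivative divided by the anharmonicity. The function name, the `drag_scale` parameter, the pulse width, and the sign convention are all assumptions made for illustration; conventions differ between papers, and real control stacks calibrate the coefficient experimentally.

```python
import math

def drag_envelope(t_ns: float, duration_ns: float = 30.0, sigma_ns: float = 7.5,
                  anharmonicity_ghz: float = -0.2, drag_scale: float = 1.0):
    """Return (in_phase, quadrature) envelope values at time t_ns.

    in_phase   : Gaussian centered on the middle of the pulse (peak value 1).
    quadrature : -drag_scale * d(in_phase)/dt / alpha, with alpha in rad/ns.
    """
    t_center = duration_ns / 2.0
    gauss = math.exp(-((t_ns - t_center) ** 2) / (2.0 * sigma_ns ** 2))
    d_gauss_dt = -(t_ns - t_center) / sigma_ns ** 2 * gauss
    alpha_rad_per_ns = 2.0 * math.pi * anharmonicity_ghz
    quadrature = -drag_scale * d_gauss_dt / alpha_rad_per_ns
    return gauss, quadrature

# Sample the envelope every 5 ns across a 30 ns pulse.
for t in range(0, 31, 5):
    i_val, q_val = drag_envelope(float(t))
    print(f"t = {t:2d} ns: I = {i_val:.3f}, Q = {q_val:+.4f}")
```

The quadrature component vanishes at the pulse center and is antisymmetric around it, which is the qualitative signature of a derivative-based correction.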

Anatomy of a real transmon device

A 2026-vintage transmon device (e.g., one IBM Heron module) consists of:

Transmon islands. Lithographically patterned superconducting “pads” (typically aluminum or niobium) acting as the Cooper-pair box capacitor.
Josephson junctions. Aluminum-oxide tunnel junctions between the islands and a ground plane, providing the inductance.
Coplanar waveguide resonators. One per qubit, dispersively coupled to the transmon for state readout. The resonator-transmon dispersive coupling is the foundation of circuit QED.
Tunable couplers. Newer designs (IBM Heron, Google Willow) use additional Josephson elements between qubits as tunable couplers that turn the qubit-qubit interaction on and off. This eliminates always-on ZZ crosstalk and is crucial for scalability.
Control wiring. Microwave lines for drive (XY control, ~5 GHz) and flux lines for fast tunability (DC-100 MHz), with attenuators and filters along the dilution-refrigerator stages.
Readout multiplexing. Multiple resonators on a single feedline at different frequencies, demuxed at room temperature.

The whole thing operates at $\sim 10$ mK in a dilution refrigerator, with classical control electronics at room temperature transmitting microwave pulses through cryogenic cables. The cryogenic-control engineering is half the challenge.

Current 2026 numbers

The transmon performance numbers most worth knowing as of 2026:

System	Qubits	$T_1$	2-qubit gate error	Public access
Google Willow	105	~80 µs	~3 × 10⁻³	Restricted research
IBM Heron r2	156	~120 µs	~3 × 10⁻³	IBM Quantum cloud
IBM Nighthawk	120	~100 µs	~2 × 10⁻³	IBM Quantum cloud
Rigetti Ankaa-2	84	~30 µs	~10⁻²	AWS Braket
Quantinuum (transmon)	research	~150 µs	~2 × 10⁻³	research

Two notes on reading this table:

$T_1$ varies across qubits on the same chip. A given device has a distribution of $T_1$ values; the number reported is typically a median or a best-qubit figure. The worst qubits on the same chip can have $T_1$ values 2-3× shorter.
Two-qubit gate errors are platform-dependent. Google Willow’s $\sim 3 \times 10^{-3}$ is from randomized benchmarking on their CZ gate. IBM’s Heron r2 pairs fixed-frequency transmons with tunable couplers, uses CZ as its native two-qubit gate, and reports similar numbers. Rigetti has historically reported higher errors.

The threshold for the surface code is roughly $10^{-2}$ . By 2026, all flagship transmon platforms operate below this threshold — gate errors are below 1%, and error correction has the room to suppress logical errors exponentially. Tutorial 19 covered Willow’s below-threshold demonstration in detail.

How the platform actually scales

Three engineering bottlenecks dominate transmon scaling beyond the ~100-1000 qubit range:

Connectivity vs crosstalk

Transmons couple via shared resonators or capacitive elements. More connections means more potential crosstalk paths — pairs of qubits that should be independent end up correlated through unintended interactions. The classical workaround is sparse 2D connectivity (heavy-hex, square-grid), but qLDPC codes (tutorial 28) want richer connectivity that current transmon hardware cannot natively support. IBM’s Nighthawk introduced tunable couplers that can selectively connect non-nearest-neighbors; Google’s Willow stays on a square grid. The connectivity-vs-crosstalk tradeoff is the architectural lever shaping which error-correction codes each vendor’s roadmap targets.

Materials-limited coherence

After 2020, transmon $T_1$ improvements came largely from materials science rather than circuit design. Niobium surface oxidation, sapphire substrate purity, aluminum-oxide tunnel-junction defects, and electroplating cleanliness all matter at the few-percent-per-decade level. The 2024 papers (Bal et al., niobium surface encapsulation; Kono et al., mechanically induced correlated errors) document both progress and remaining bottlenecks. Transmon coherence is now a fab problem more than a physics problem.

Correlated errors

Below $10^{-3}$ gate error, single-qubit-error metrics stop being the binding constraint. Rare correlated error events — cosmic-ray-induced charge bursts, mechanical vibrations coupling to the substrate, leakage out of the computational subspace, quasiparticle poisoning — become disproportionately important because error correction assumes independent errors. The 2024 Kono et al. paper documented mechanically-induced correlated errors on devices with $T_1 > 0.4$ ms, showing the regime where correlated errors are no longer a minor correction. Suppressing correlated errors is the next big engineering frontier, and several research groups are publishing detailed studies of cosmic-ray mitigation, packaging design, and quasiparticle dynamics.

Gate mechanisms

Transmon devices implement two-qubit gates by several mechanisms, depending on the architecture:

Cross-resonance (CR) gates. Drive one qubit at the frequency of the other. The shared coupling produces an effective $ZX$ interaction, which composes with single-qubit gates into a CNOT. Used on IBM’s fixed-coupling processors that preceded Heron (for example Eagle). CR gates are typically $\sim 200$ ns and have errors around $5 \times 10^{-3}$ .
CZ gates with tunable couplers. Adjust the coupling between two qubits to a value that produces a controlled-phase gate in a fixed time. Faster than CR ( $\sim 30$ ns) and more precise. Used on Google Willow, IBM Heron, and IBM Nighthawk.
Direct $iSWAP$ with capacitive coupling. Used on early Rigetti and Google devices; mostly superseded by tunable-coupler designs.

The choice of native gate matters because every algorithm needs to be transpiled onto the available native gate set. Transpilation overhead can vary from negligible (CZ-equivalent) to significant (factor of 2-3 for cross-resonance), affecting effective gate count.
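For the CZ case the transpilation step is small enough to verify by hand. The sketch below checks numerically that a CNOT equals one CZ sandwiched between two Hadamards on the target qubit, using plain Python lists rather than any quantum SDK. The helper name `matmul` and the basis ordering are choices made for this example.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = 1.0 / math.sqrt(2.0)

# Basis order |control target> = |00>, |01>, |10>, |11>.
H_ON_TARGET = [   # identity on the control, Hadamard on the target
    [s,  s, 0, 0],
    [s, -s, 0, 0],
    [0,  0, s,  s],
    [0,  0, s, -s],
]
CZ = [
    [1, 0, 0,  0],
    [0, 1, 0,  0],
    [0, 0, 1,  0],
    [0, 0, 0, -1],
]
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# CNOT = (I x H) . CZ . (I x H)
built = matmul(H_ON_TARGET, matmul(CZ, H_ON_TARGET))
max_dev = max(abs(built[i][j] - CNOT[i][j]) for i in range(4) for j in range(4))
print(f"largest deviation from CNOT: {max_dev:.1e}")
```

The deviation is at the level of floating-point rounding, so on CZ-native hardware a logical CNOT costs one two-qubit gate plus two single-qubit gates.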

A small Python comparison

Here is a small benchmark comparing transmon, trapped-ion, and neutral-atom gate budgets for the same circuit. The point is not to claim a winner; the point is to make the platform-specific gate cost visible.

# Back-of-envelope budget for a circuit of n logical CNOTs run back to back.
# Transmon (cross-resonance native two-qubit gate): each CNOT = 1 CR + single-qubit overheads.
# Transmon (CZ via tunable coupler): each CNOT = 1 CZ + 2 Hadamards.
# Ion (native MS gate): each CNOT = 1 MS + single-qubit overheads.
# All numbers are rough per-platform estimates, and errors are assumed independent.

def gate_budget(n_logical_cnots: int, platform: str) -> dict:
    """Return total duration and success probability for n sequential CNOTs."""
    if platform == "transmon-cr":
        # Each CR gate is ~200 ns, error ~5e-3.
        # Each single-qubit gate is ~30 ns, error ~5e-4.
        # CNOT decomposition: 1 CR + 4 single-qubit = ~320 ns, error ~5e-3 + 4 * 5e-4 = 7e-3.
        gate_time_ns = 320
        gate_error = 7e-3
    elif platform == "transmon-cz":
        # CZ with tunable couplers: ~30 ns, error ~3e-3.
        # CNOT decomposition: 1 CZ + 2 H = ~90 ns, error ~3e-3 + 2 * 5e-4 = 4e-3.
        gate_time_ns = 90
        gate_error = 4e-3
    elif platform == "ion-ms":
        # MS gate ~50 us, error ~3e-4 (Quantinuum).
        # CNOT: 1 MS + 2 single-qubit = ~80 us, error ~3e-4 + 2e-5, rounded to 3e-4.
        gate_time_ns = 80_000
        gate_error = 3e-4
    elif platform == "neutral-atom":
        # Rydberg CZ: ~1 us, error ~3e-3 (Atom Computing 2025).
        gate_time_ns = 1_000
        gate_error = 3e-3
    else:
        raise ValueError(f"unknown platform: {platform}")

    total_time_us = n_logical_cnots * gate_time_ns / 1000
    # Probability that every CNOT succeeds, assuming independent errors.
    total_success = (1 - gate_error) ** n_logical_cnots
    return {
        "platform": platform,
        "n_cnots": n_logical_cnots,
        "total_time_us": total_time_us,
        "total_success_prob": total_success,
    }

for plat in ["transmon-cr", "transmon-cz", "ion-ms", "neutral-atom"]:
    r = gate_budget(n_logical_cnots=1000, platform=plat)
    print(f"{r['platform']:>14s}: time={r['total_time_us']:.1f} us, "
          f"success={r['total_success_prob']:.4f}")

Sample output:

   transmon-cr: time=320.0 us, success=0.0009
   transmon-cz: time=90.0 us, success=0.0182
        ion-ms: time=80000.0 us, success=0.7408
  neutral-atom: time=1000.0 us, success=0.0496

A 1,000-CNOT circuit on an ion platform succeeds about 74% of the time despite being hundreds of times slower; on transmon-CR it succeeds only about 0.1% of the time despite being fast. Two-qubit gate fidelity matters more than speed for any algorithm long enough to actually compute something. This is why ion platforms remain competitive despite slow gates: high fidelity gives them a much longer “circuit-depth budget” before accumulated gate errors dominate.

This calculation is back-of-envelope and does not include error correction (which radically changes the picture by suppressing logical errors at the cost of much larger gate counts). It captures only the bare-metal NISQ-era picture.
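The “circuit-depth budget” can be made explicit by inverting the success formula: the largest number of CNOTs $n$ with $(1 - \epsilon)^n$ still above a chosen target. The sketch below does this for the per-CNOT errors used in the comparison above; the function name and the 50% target are arbitrary choices for illustration, and the same independent-error assumption applies.

```python
import math

def max_cnots(gate_error: float, target_success: float = 0.5) -> int:
    """Largest n such that (1 - gate_error)**n >= target_success."""
    return math.floor(math.log(target_success) / math.log(1.0 - gate_error))

# Per-CNOT errors taken from the gate-budget comparison.
per_cnot_error = {
    "transmon-cr": 7e-3,
    "transmon-cz": 4e-3,
    "ion-ms": 3e-4,
    "neutral-atom": 3e-3,
}

for name, err in per_cnot_error.items():
    print(f"{name:>14s}: up to {max_cnots(err)} CNOTs at >= 50% success")
```

Under these assumptions the budget is roughly a hundred CNOTs for cross-resonance transmons and a couple of thousand for the ion numbers, which is the same ordering the 1,000-CNOT comparison shows.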

Common misconceptions

“Transmons are obviously the leader because IBM and Google use them.” Not exactly. Transmons have the most cumulative engineering investment and the most public hardware data. They are dominant now. Whether they remain dominant in the fault-tolerant era depends on whether the materials and correlated-error problems can be solved at scale — open questions as of 2026.

“More qubits is always better.” Not true on transmon hardware. Without proportionally improved fidelity, more qubits multiplies the calibration burden and crosstalk paths without adding useful computational depth. The 2025-2026 trend has been fewer, better qubits (IBM Nighthawk’s 120 qubits at $T_1 \sim 100\,\mu s$ vs older 433-qubit devices at lower coherence) — a pivot many publications missed.

“Transmon = Josephson junction.” The Josephson junction is the central nonlinear element, but the transmon’s design is about how the junction interacts with the rest of the circuit (capacitor, resonator, drive lines). Several other superconducting qubits also use Josephson junctions (fluxonium, flux qubit, 0-π qubit) but operate in different regimes. Transmons specifically operate in the $E_J \gg E_C$ regime; other qubits trade that off for different properties.

“Below-threshold means we have fault tolerance now.” Below-threshold means logical errors can be suppressed exponentially with code distance. Achieving $10^{-12}$ logical error from $10^{-3}$ physical error requires distance $\sim 27$ , which means $\sim 1{,}500$ physical qubits per logical qubit. Below-threshold is the start of fault-tolerance, not the end. Tutorial 27 walked through the resource math.
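The qubit-count figure in the last item can be reproduced with a one-line formula, assuming the rotated surface-code layout in which a distance- $d$ patch uses $d^2$ data qubits and $d^2 - 1$ measurement qubits. The layout choice and the function name are assumptions for this sketch; other layouts and any routing or magic-state overhead would change the count.

```python
def surface_code_qubits(distance: int) -> int:
    """Physical qubits in one rotated-surface-code patch of odd distance d."""
    if distance < 1 or distance % 2 == 0:
        raise ValueError("distance must be a positive odd integer")
    data_qubits = distance * distance
    measure_qubits = distance * distance - 1
    return data_qubits + measure_qubits

for d in (3, 5, 7, 27):
    print(f"d = {d:2d}: {surface_code_qubits(d)} physical qubits per logical qubit")
```

At $d = 27$ this gives 1,457 physical qubits, consistent with the $\sim 1{,}500$ quoted above.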

Decision rule for picking transmons

When evaluating whether transmons fit your application:

What is your algorithm depth? Below 100 two-qubit gates: transmons are fine. 100-1000 gates: marginal, depends on platform-specific fidelity. Above 1000 gates without error correction: trapped ions or neutral atoms are likely better.
What is your latency budget? Microsecond-level wall-clock requirements (real-time control loops, RTI applications): transmons. Multi-second runs are fine for any platform.
Is your algorithm naturally 2D-grid friendly? Yes: transmons. Highly non-local connectivity required: ions or neutral atoms.
Are you targeting the fault-tolerant era? Then the answer depends on which fault-tolerance roadmap you trust most. IBM’s Starling roadmap is the most concrete transmon-based FTQC plan; Google continues on transmon-based surface code.

Exercises

1. The $E_J/E_C$ tradeoff

Why can’t you simply make $E_J/E_C$ as large as possible? Compute the anharmonicity in the limit $E_J/E_C \to \infty$ and explain what goes wrong.

Answer

The anharmonicity is approximately $-E_C$ , so as $E_J/E_C \to \infty$ at fixed transition frequency $\omega_{01} \approx \sqrt{8 E_J E_C}$ , the anharmonicity vanishes ( $E_C \to 0$ ). Without anharmonicity, the qubit becomes indistinguishable from a harmonic oscillator and you cannot do single-qubit gates without leakage to higher levels. The transmon design picks $E_J/E_C \sim 50$ as a compromise: charge dispersion exponentially small and anharmonicity large enough for fast gates. Going much higher buys negligible charge-noise improvement and costs anharmonicity.
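The tradeoff can be written in one line. Holding the transition frequency fixed and using only the approximations already stated in this answer ( $\omega_{01} \approx \sqrt{8 E_J E_C}$ and anharmonicity $\alpha \approx -E_C$ ):

$$E_C \approx \frac{\omega_{01}}{\sqrt{8 E_J / E_C}} \quad\Longrightarrow\quad \frac{|\alpha|}{\omega_{01}} \approx \left( 8 \, \frac{E_J}{E_C} \right)^{-1/2}.$$

At $E_J / E_C = 50$ the relative anharmonicity is $1/20 = 5\%$ , or about 250 MHz on a 5 GHz qubit. Pushing the ratio to 5000 shrinks it to $1/200 = 0.5\%$ , about 25 MHz, while the charge-dispersion factor only moves from $e^{-20}$ to $e^{-200}$ , an improvement that is irrelevant once other loss channels dominate. Anharmonicity falls as a power law while charge sensitivity falls exponentially, which is why a moderate ratio is enough.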

2. Coherence vs gate time

A transmon has $T_1 = 100\,\mu s$ and you do single-qubit gates of duration 30 ns. What is the gate-error contribution from $T_1$ alone? Compare to a hypothetical platform with $T_1 = 10\,\mu s$ and 1 ns gates.

Answer

$T_1$ -induced gate error is approximately $t_\text{gate} / T_1$ for short gates. Transmon: $30 \text{ ns} / 100\,\mu\text{s} = 3 \times 10^{-4}$ . Hypothetical fast platform: $1 \text{ ns} / 10\,\mu\text{s} = 10^{-4}$ . The fast platform wins on $T_1$ -induced error despite having shorter $T_1$ , because gate time is much shorter. The relevant figure of merit is $T_1 / t_\text{gate}$ , not $T_1$ alone. Transmon $T_1 / t_\text{gate} \sim 3000$ ; trapped ions $T_1 / t_\text{gate} \sim 10^7$ (long $T_1$ , slow gates), but the practical gate error is dominated by the gate itself, not by $T_1$ decay.
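The arithmetic in this answer is easy to script. The sketch below computes the ratio $t_\text{gate} / T_1$ used above and, for comparison, the exact probability $1 - e^{-t_\text{gate}/T_1}$ that an excited qubit decays during the gate. The function name is hypothetical, and treating the decay probability as the gate error is the same simplification the answer makes.

```python
import math

def t1_error(gate_time_ns: float, t1_us: float):
    """Return (t_gate / T1, exact decay probability during the gate)."""
    ratio = gate_time_ns / (t1_us * 1_000.0)   # convert T1 from us to ns
    decay_prob = -math.expm1(-ratio)           # 1 - exp(-ratio), numerically stable
    return ratio, decay_prob

for label, t_gate_ns, t1_us in [("transmon", 30.0, 100.0),
                                ("hypothetical fast platform", 1.0, 10.0)]:
    ratio, decay = t1_error(t_gate_ns, t1_us)
    print(f"{label}: t_gate/T1 = {ratio:.1e}, decay probability = {decay:.1e}, "
          f"T1/t_gate = {1.0 / ratio:.0f}")
```

For gates this short the linear approximation and the exact expression agree to well under a percent.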

3. Why correlated errors matter

A device has independent single-qubit error $10^{-4}$ but a correlated-error rate of $10^{-6}$ per qubit per syndrome cycle (e.g., a cosmic ray hitting the chip). For a 1000-qubit device running 100 syndrome cycles, what is the dominant error source?

Answer

Independent errors per cycle per qubit: $10^{-4}$ . With 1000 qubits and 100 cycles, total expected independent errors: $10^{-4} \cdot 10^3 \cdot 10^2 = 10$ . Correlated events: $10^{-6} \cdot 10^3 \cdot 10^2 = 0.1$ correlated events on average over the run. Correlated events are 100× rarer than independent errors numerically, but each correlated event corrupts many qubits simultaneously, which the surface-code decoder handles much worse than independent errors. The right way to count is: each correlated event ~ one logical-error candidate; 0.1 events per run is a real per-run failure rate. The independent-error count includes errors that error correction handles cleanly. Correlated events become the dominant logical-error cause well before they dominate the raw error rate.
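The two expected counts, and the per-run chance of seeing at least one correlated event, can be computed directly. The sketch below assumes correlated events arrive as a Poisson process, which is an added modeling assumption and not something stated in the exercise; the function name is likewise illustrative.

```python
import math

def expected_events(rate_per_qubit_cycle: float, n_qubits: int, n_cycles: int) -> float:
    """Expected number of events over the whole run."""
    return rate_per_qubit_cycle * n_qubits * n_cycles

N_QUBITS, N_CYCLES = 1000, 100

independent = expected_events(1e-4, N_QUBITS, N_CYCLES)
correlated = expected_events(1e-6, N_QUBITS, N_CYCLES)

# Poisson assumption: P(at least one event) = 1 - exp(-mean).
p_any_correlated = -math.expm1(-correlated)

print(f"expected independent errors: {independent:.1f}")
print(f"expected correlated events:  {correlated:.2f}")
print(f"P(at least one correlated event per run): {p_any_correlated:.3f}")
```

Roughly one run in ten sees a correlated event under these assumptions, which is large compared with any useful logical-error target even though the raw count is 100× smaller than the independent-error count.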

4. When transmon stops being the right answer

For what kind of algorithm would you specifically not pick a transmon-based platform?

Answer

(a) Algorithms that need long-range connectivity beyond what tunable couplers and 2D layouts provide. (b) Algorithms that need very long coherence times during execution (some adiabatic algorithms, some QML training loops); transmons’ few-hundred-microsecond coherence is short compared to ion or neutral-atom platforms. (c) Algorithms with very high gate counts and modest error correction (because transmon’s $\sim 10^{-3}$ gate errors require deep error correction earlier than ion’s $\sim 10^{-4}$ ). (d) Cold-atom-style quantum simulation of specific Hamiltonians where the underlying physics matches the platform. For most generic quantum-algorithm work — surface-code error correction, NISQ chemistry, quantum-supremacy demonstrations — transmons remain a strong choice in 2026.

Where this goes next

Tutorial 34 covers trapped ions, the platform with the highest published gate fidelities and the most mature analytical view of error mechanisms. Tutorial 35 covers neutral atoms / Rydberg arrays — the fastest-improving platform of the last three years. Tutorial 36 covers photonic quantum computing and the fusion-based approach that defines PsiQuantum’s roadmap.

