Series
QML Reality Check
Quantum machine learning is two things at once: a real research area with open questions, and a marketing category routinely oversold against weak baselines. This series benchmarks named QML methods against seriously tuned classical ones on real datasets, and publishes whatever the numbers say.
Pre-registration
For each entry below, we publish the dataset, the contenders, the evaluation metric, and the random seeds before running the benchmark — committed to git so you can verify we didn't choose the flattering configuration after the fact. The notebook produces the number; we publish whichever direction it points.
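In practice, a pre-registration can be a single machine-readable file whose hash is committed before any run. A minimal stdlib-only sketch of the idea (the field names and values here are hypothetical, not our actual manifest format):

```python
import hashlib
import json

# Hypothetical pre-registration record: dataset, contenders, metric, and
# seeds are all fixed *before* the benchmark notebook is ever executed.
prereg = {
    "dataset": "wisconsin_breast_cancer",
    "contenders": ["xgboost", "rbf_svm", "vqc", "quantum_kernel_svm"],
    "metric": "test_accuracy",
    "seeds": [0, 1, 2, 3, 4],
}

# Canonical serialization (sorted keys), then hash. Committing the digest
# (or the file itself) to git timestamps the configuration.
blob = json.dumps(prereg, sort_keys=True).encode()
digest = hashlib.sha256(blob).hexdigest()
print(digest[:12])
```

Because the serialization is canonical, anyone can recompute the digest from the published file and confirm the configuration wasn't altered after the results came in.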
Published
- Published · Wisconsin Breast Cancer Diagnostic (569 samples, 30 features, binary)
Wisconsin Breast Cancer — XGBoost vs QML
Contenders: XGBoost · RBF SVM · Variational Quantum Classifier · Quantum Kernel SVM · MLP · Logistic Regression
XGBoost ~98%, QML methods ~93%. The PCA step needed to compress 30 features into a handful of qubits accounts for most of the gap.
- Published · MNIST 0/1/2 (3-class subset, 5,000 samples)
MNIST subset — quantum CNN vs LeNet
Contenders: LeNet-5 · Quantum CNN (Cong-Choi-Lukin) · Variational classifier · Tiny MLP · RBF SVM
LeNet-5 hits 99.67%. Best QCNN (8 qubits, with data re-uploading) hits 87.21%. Even a 2,500-param MLP (97.94%) beats every quantum variant. ~12 pp gap.
- Published · S&P 500 sector returns, n=20 assets, 10 random instances
Portfolio optimization — Markowitz vs QAOA vs simulated annealing
Contenders: Markowitz QP · Simulated annealing · Goemans-Williamson · QAOA-1, 3, 5, 7 (warm and cold)
Markowitz wins (Sharpe 1.412 in 40 ms). Best QAOA hits 88% of optimal Sharpe in 47 s. Closed-form QP is the right tool here.
- Published · H₂, LiH, BeH₂, H₂O — STO-3G basis
Molecular simulation — VQE vs CCSD(T)
Contenders: VQE-UCCSD · VQE-HEA · classical CCSD(T) · DMRG (FCI as ground truth)
VQE-UCCSD MATCHES classical at chemical accuracy on every molecule. The first benchmark where quantum doesn't lose — but doesn't win on runtime either.
- Published · Random 3-regular graphs, n ∈ {8, 12, 16, 24}
MaxCut — Goemans-Williamson vs QAOA-p
Contenders: Goemans-Williamson SDP · QAOA-1 · QAOA-3 (warm-started) · QAOA-5 (warm-started)
GW wins at every size. QAOA-1 averages an approximation ratio of 0.70; GW averages 0.93. Warm-started QAOA-5 closes to 0.91 but still trails. No quantum advantage on this benchmark.
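The ratios quoted in the MaxCut entry are cut values normalized by the exact optimum, which brute force can still supply at n ≤ 24. A stdlib-only sketch of that normalization on a toy 3-regular graph (the cube graph; this instance and the naive baseline cut are illustrative, not the benchmark code):

```python
import itertools

# Toy instance: the 3-regular cube graph on 8 vertices. The real benchmark
# draws random 3-regular graphs; this fixed edge list is just for illustration.
edges = [(0, 1), (1, 2), (2, 3), (3, 0),
         (4, 5), (5, 6), (6, 7), (7, 4),
         (0, 4), (1, 5), (2, 6), (3, 7)]
n = 8

def cut_value(bits):
    # Number of edges whose endpoints land on opposite sides of the partition.
    return sum(1 for u, v in edges if bits[u] != bits[v])

# Exact optimum by exhaustive search over all 2^n partitions. This is what
# makes the approximation ratios in the table exact rather than estimated.
best = max(cut_value(bits) for bits in itertools.product((0, 1), repeat=n))

naive = cut_value((0, 0, 0, 0, 1, 1, 1, 1))  # split the two 4-cycles apart
print(best, round(naive / best, 2))  # → 12 0.33
```

A solver's approximation ratio is simply its cut value divided by `best`; any heuristic, quantum or classical, slots into the same formula, which is how QAOA and GW end up on one axis.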
Pre-registered, coming soon
Why this exists
Open the most-cited QML papers from 2020–2024. Most benchmark on toy datasets (MNIST 0/1) where a 3-layer classical MLP gets 99%, against baselines like logistic regression. The "quantum advantage" claim survives only against the weak baseline.
Vendor blogs amplify this. They have to: their quantum-cloud business depends on the "QML works" narrative being directionally true. They are structurally incapable of publishing a series like this, because the first finding is "for tabular classification, XGBoost wins."
We can publish it. So we do. If the next finding is "but on this molecular simulation, VQE wins by 8 mHa" — we publish that too, with the same rigor. The point isn't to bash QML. It's to give working developers an honest answer to "should I reach for quantum here?"
For the methodology behind these benchmarks and our broader editorial stance, see Editorial Independence.