post quantum crypto advanced · 16 min read · By LIPAI WANG · April 29, 2026

Falcon (FN-DSA): The Compact Lattice Signature Standard

Falcon, standardized as FN-DSA in NIST FIPS 206, is a post-quantum signature scheme built from NTRU lattices and floating-point Gaussian sampling. It produces signatures roughly 3.6× smaller than ML-DSA at comparable security (666 bytes for Falcon-512 versus 2,420 bytes for ML-DSA-44), but at the cost of a much harder implementation: constant-time Gaussian sampling is notoriously subtle. This tutorial covers the math, the implementation pitfalls, and when Falcon is the right post-quantum signature choice.

Prerequisites: Tutorial 22: ML-KEM and ML-DSA in Practice

NIST’s post-quantum cryptography standardization picked three signature schemes from the final round: ML-DSA (FIPS 204, lattice-based, the default choice), SPHINCS+ (FIPS 205, hash-based, the conservative-stateless backup), and Falcon (FIPS 206, NTRU-lattice-based, the compact-signature option). Tutorial 22 covered ML-DSA in detail. This tutorial covers Falcon — what it does well, where it is hard, and when to pick it over ML-DSA.

The headline number: Falcon signatures are ~3.6× smaller than ML-DSA at a comparable security level. A Falcon-512 signature is 666 bytes; an ML-DSA-44 (formerly Dilithium2) signature is 2,420 bytes. For protocols where signatures are transmitted often or stored long-term, this size difference compounds.

The cost: Falcon’s signing operation requires constant-time Gaussian sampling over discrete lattices, which is one of the trickiest implementation problems in post-quantum cryptography. Naive implementations leak side-channel information that can break the scheme entirely. Several published “constant-time Falcon” implementations have had subtle bugs that leaked the secret key.

This tutorial covers Falcon’s math (NTRU lattices, the trapdoor sampling), the implementation pitfalls, the security argument, and a decision rule for picking Falcon vs ML-DSA in real protocols.

NTRU lattices, briefly

Falcon is built on the NTRU lattice problem. Roughly: given the polynomial ring $R = \mathbb{Z}[X]/(X^n + 1)$ with $n$ a power of 2, and a public polynomial $h \in R_q = R/qR$ for some prime $q$, the NTRU problem is to find polynomials $f, g \in R$ with small coefficients such that $h \equiv g \cdot f^{-1} \pmod q$.

The “small coefficients” condition is what makes the problem hard. Computing $g/f$ for any $f, g$ is easy modular arithmetic; finding the small representative is a lattice-shortest-vector problem and is believed to be quantum-hard.

Falcon’s key generation:

Sample small polynomials $f, g \in R$ from an appropriate Gaussian distribution, resampling if $f$ is not invertible modulo $q$.
Compute $h = g \cdot f^{-1} \pmod{q}$. This is the public key.
Compute additional polynomials $F, G \in R$ such that $f G - g F = q$ (the NTRU equation). The quadruple $(f, g, F, G)$ is the secret key (a short basis of the NTRU lattice).

The public key is one polynomial; the secret key is a basis of an NTRU lattice. The mathematical structure of $(f, g, F, G)$ gives a trapdoor — a specific algorithm that uses these polynomials to find short lattice vectors near any target, without solving the general NTRU problem.
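The first two key-generation steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the ring degree `N = 8` is far too small for security, the coefficients are drawn uniformly from a small range instead of from Falcon's Gaussian, and the third step (solving the NTRU equation for $F, G$) is omitted. The helper names `negacyclic_mul`, `invert`, and `toy_keygen` are made up for this example.

```python
import random

Q = 12289  # Falcon's modulus
N = 8      # toy degree for illustration; Falcon uses n = 512 or 1024

def negacyclic_mul(a, b, q=Q):
    """Multiply a and b in Z_q[X]/(X^n + 1) by schoolbook convolution."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                out[k] = (out[k] + a[i] * b[j]) % q
            else:
                # X^n = -1, so wrapped-around terms change sign
                out[k - n] = (out[k - n] - a[i] * b[j]) % q
    return out

def invert(f, q=Q):
    """Return f^{-1} in Z_q[X]/(X^n + 1), or None if f is not invertible.

    Solves f * x = 1 as a linear system mod q by Gauss-Jordan elimination.
    """
    n = len(f)
    # Column j of the system matrix holds the coefficients of f * X^j.
    cols = [negacyclic_mul(f, [int(i == j) for i in range(n)], q) for j in range(n)]
    # Augmented rows [M | e_0], where e_0 encodes the constant polynomial 1.
    rows = [[cols[j][i] for j in range(n)] + [int(i == 0)] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, q)  # modular inverse, Python 3.8+
        rows[col] = [(v * inv) % q for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % q for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def toy_keygen(seed=1):
    """Sample small f, g and return (f, g, h) with h = g * f^{-1} mod q."""
    rng = random.Random(seed)
    while True:
        f = [rng.randint(-2, 2) for _ in range(N)]
        g = [rng.randint(-2, 2) for _ in range(N)]
        f_inv = invert(f)
        if f_inv is not None:
            return f, g, negacyclic_mul(g, f_inv)

f, g, h = toy_keygen()
# The defining relation of the public key: f * h = g (mod q).
assert negacyclic_mul(f, h) == [c % Q for c in g]
```

Note how `h` looks like a uniformly random polynomial mod $q$ even though it is built from two tiny ones. Recovering small `f, g` from `h` alone is the hard direction.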

How Falcon signs

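To sign a message $m$:

Hash $m$, together with a fresh random salt $r$, to a target polynomial $c \in R_q$.
Use the trapdoor (the secret key) to sample a short pair $(s_1, s_2)$ with $s_1 + s_2 h \equiv c \pmod q$. Equivalently, this finds a point of the NTRU lattice close to the target defined by $c$.
The signature is the salt $r$ together with a compact encoding of $s_2$. The verifier can recompute $s_1$ from $c$, $s_2$, and $h$.

To verify:

Recompute the hash target $c$ from the message and the salt carried in the signature.
Recover $s_1 = c - s_2 h \bmod q$ using the public key $h$, and check that $(s_1, s_2)$ is short, that is, its Euclidean norm is below a fixed bound set by the parameter set.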
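The shape of the verification check can be sketched as follows. This is an illustration, not Falcon's verifier: it assumes the signature carries a polynomial `s2`, that the verifier recomputes `s1 = c - s2*h mod q`, and that acceptance means the squared norm of `(s1, s2)` is below a bound. Hashing, the salt, and signature decoding are left out, the degree is a toy 8, and `TOY_BOUND_SQ` is an arbitrary value chosen for the demo. The "valid" pair is planted by construction because no trapdoor sampler is implemented here.

```python
import random

Q = 12289  # Falcon's modulus

def negacyclic_mul(a, b, q=Q):
    """Multiply a and b in Z_q[X]/(X^n + 1) by schoolbook convolution."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                out[k] = (out[k] + a[i] * b[j]) % q
            else:
                out[k - n] = (out[k - n] - a[i] * b[j]) % q
    return out

def centered(x, q=Q):
    """Map x mod q to its representative in (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

def verify_shape(c, s2, h, bound_sq, q=Q):
    """Recompute s1 = c - s2*h mod q and accept iff ||(s1, s2)||^2 <= bound_sq."""
    prod = negacyclic_mul(s2, h, q)
    s1 = [centered(ci - pi, q) for ci, pi in zip(c, prod)]
    norm_sq = sum(v * v for v in s1) + sum(centered(v, q) ** 2 for v in s2)
    return norm_sq <= bound_sq

# Stand-in public key: any polynomial mod q works for showing the check.
rng = random.Random(0)
h = [rng.randrange(Q) for _ in range(8)]

# Plant a short pair and derive the target c it is consistent with.
s1 = [1, -2, 0, 3, -1, 0, 2, -3]
s2 = [0, 1, -1, 2, 0, -2, 1, 1]
c = [(a + b) % Q for a, b in zip(s1, negacyclic_mul(s2, h))]

TOY_BOUND_SQ = 100  # arbitrary demo bound, not a Falcon parameter
ok = verify_shape(c, s2, h, TOY_BOUND_SQ)        # planted short pair
tampered = [5000] + s2[1:]                       # one huge coefficient
bad = verify_shape(c, tampered, h, TOY_BOUND_SQ)
```

Anyone can find some pair satisfying the congruence. Only the trapdoor holder can find a short one for a target they did not choose, which is why the norm check carries the security.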

The math is elegant. The implementation is hard.

The Gaussian sampling problem

The “find a short vector near $c$ ” step requires sampling from a discrete Gaussian distribution over the lattice. Each step involves:

Computing a target offset.
Sampling integer-valued Gaussian coefficients (specifically, samples from a discrete Gaussian on $\mathbb{Z}$ centered at a real number, with a specific variance).
Combining these into a lattice point near the target.

The challenge: the Gaussian sampling must be constant-time. Side-channel attacks can extract the secret key from timing variations in sampling. A constant-time discrete Gaussian sampler is non-trivial — naive rejection-sampling implementations leak through rejection counts.

Falcon’s reference implementation uses a “tree” of Gaussian samplers and floating-point arithmetic to achieve constant-time behavior. The implementation is around 4,000 lines of careful C, and several published bugs have shown how easy it is to get wrong.
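To make the rejection-count leak concrete, here is a deliberately naive sampler. It is a sketch of what not to do, assuming a uniform proposal over a fixed window around the center and a hypothetical helper `mean_attempts` to average the loop count. The point is that the number of loop iterations depends on the standard deviation, and in a Falcon-style signer the per-step standard deviation is derived from the secret basis.

```python
import math
import random

def naive_sampler(mu, sigma, rng, half_width=20):
    """Sample z in Z with probability proportional to exp(-(z - mu)^2 / (2 sigma^2)).

    Returns (z, attempts). NOT constant time: the loop count is observable.
    """
    center = round(mu)
    attempts = 0
    while True:
        attempts += 1
        z = rng.randint(center - half_width, center + half_width)
        accept_prob = math.exp(-((z - mu) ** 2) / (2 * sigma ** 2))
        if rng.random() < accept_prob:
            return z, attempts

def mean_attempts(sigma, trials=2000, seed=7):
    """Average number of loop iterations per sample for a given sigma."""
    rng = random.Random(seed)
    return sum(naive_sampler(0.3, sigma, rng)[1] for _ in range(trials)) / trials

narrow = mean_attempts(1.3)  # narrower Gaussian: more rejections
wide = mean_attempts(1.8)    # wider Gaussian: fewer rejections
print(f"mean attempts: sigma=1.3 -> {narrow:.1f}, sigma=1.8 -> {wide:.1f}")
```

An attacker who can time the loop learns something about `sigma` from the average iteration count. Hardened samplers are designed so that running time is statistically independent of the center, the standard deviation, and the output.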

Concrete sizes

Here are the parameter sizes for the standardized variants:

Scheme	Public key	Secret key	Signature	Security
Falcon-512	897 B	1,281 B	666 B	NIST level 1 (~AES-128 quantum)
Falcon-1024	1,793 B	2,305 B	1,280 B	NIST level 5 (~AES-256 quantum)
ML-DSA-44	1,312 B	2,560 B	2,420 B	NIST level 2
ML-DSA-65	1,952 B	4,032 B	3,309 B	NIST level 3
ML-DSA-87	2,592 B	4,896 B	4,627 B	NIST level 5
SPHINCS+-128s	32 B	64 B	7,856 B	NIST level 1

Two takeaways:

Falcon has the smallest signatures of the NIST-standardized post-quantum schemes at comparable security levels. This is its competitive advantage.
Public keys are similar across lattice schemes — small enough for most applications, large compared to elliptic-curve schemes (32 bytes for Ed25519).

For protocols where signatures are sent over the wire (TLS handshakes, blockchain transactions, code-signing), Falcon’s compactness is meaningful. For protocols where signatures are computed offline and stored, the size advantage is less critical.
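The ratios are easy to check directly from the table. The snippet below is a small sketch that copies the byte counts above into a dictionary; the helper names `sig_ratio` and `wire_cost` are made up here, and "wire cost" assumes a protocol that ships one public key and one signature together.

```python
# Bytes: (public key, secret key, signature), copied from the table above.
SIZES = {
    "Falcon-512": (897, 1281, 666),
    "Falcon-1024": (1793, 2305, 1280),
    "ML-DSA-44": (1312, 2560, 2420),
    "ML-DSA-65": (1952, 4032, 3309),
    "ML-DSA-87": (2592, 4896, 4627),
    "SPHINCS+-128s": (32, 64, 7856),
}

def sig_ratio(bigger, smaller):
    """How many times larger `bigger`'s signature is than `smaller`'s."""
    return SIZES[bigger][2] / SIZES[smaller][2]

def wire_cost(name):
    """Bytes sent when a public key and one signature travel together."""
    public_key, _, signature = SIZES[name]
    return public_key + signature

print(f"ML-DSA-44 / Falcon-512 signature:  {sig_ratio('ML-DSA-44', 'Falcon-512'):.2f}x")
print(f"ML-DSA-87 / Falcon-1024 signature: {sig_ratio('ML-DSA-87', 'Falcon-1024'):.2f}x")
print(f"key + signature: Falcon-512 {wire_cost('Falcon-512')} B, "
      f"ML-DSA-44 {wire_cost('ML-DSA-44')} B")
```

When the public key travels with the signature, as in a certificate chain, the advantage shrinks to about 2.4× (1,563 versus 3,732 bytes), because the public keys are closer in size than the signatures.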

Speed comparison

Performance varies substantially across implementations. Reference numbers (from the SUPERCOP benchmark suite, 2025-vintage):

Scheme	Sign (cycles)	Verify (cycles)	Signature size
Falcon-512	~28M	~76K	666 B
ML-DSA-44	~600K	~520K	2,420 B

At these figures, ML-DSA signs about 47× faster than Falcon (~600K versus ~28M cycles), while Falcon verifies about 7× faster than ML-DSA (~76K versus ~520K cycles). The asymmetry: Falcon’s complexity is concentrated in signing (Gaussian sampling), while ML-DSA spreads it more evenly.

For high-throughput signing scenarios (a server signing many messages per second), ML-DSA is often the right choice. For verification-heavy workloads with rare signing (TLS clients, embedded devices verifying firmware), Falcon’s compact signature can be worth the slow signing.

Implementation pitfalls

Falcon’s reference implementation has had multiple security disclosures, all related to the Gaussian sampling:

Floating-point determinism issues. The reference implementation uses double-precision floating-point for the sampling tree. Different platforms produce slightly different results, and in some cases the timing varies enough to leak information about secret-key components.
Side-channel leakage in modular reduction. Reducing intermediate polynomial coefficients modulo $q$ can leak through timing if not done in constant time.
Incorrect rejection bounds. The Gaussian sampler must reject some samples to maintain the correct distribution. Bugs in rejection logic have appeared in multiple implementations.

The general lesson: Falcon is much harder to implement correctly than ML-DSA. The cryptographic community now generally recommends Falcon only when the size advantage is critical, and only with carefully audited implementations.

The 2025 NIST guidance: use ML-DSA as the default, fall back to SPHINCS+ for long-term-stable hash-based security, and use Falcon when signature size is the binding constraint and a high-quality vetted implementation is available.

Decision rule

Use Falcon when:

Signature size is critical. Bandwidth-constrained protocols (resource-constrained IoT, blockchain transactions, satellite communications) where every byte matters.
Signing happens offline. If signing is rare and verification is frequent, Falcon’s slow signing is amortized.
You have access to a high-quality, audited Falcon implementation. Rolling your own is dangerous; there are now several vetted libraries (PQClean, liboqs).

Use ML-DSA when:

Signing throughput matters. Server signing many messages per second.
You need the simplest, safest implementation. ML-DSA is structurally easier to implement in constant time, and its implementations have had roughly three years to mature.
Default cryptographic policy is conservative. ML-DSA is the NIST default for a reason.

Use SPHINCS+ when:

You need conservative hash-based security. No lattice assumptions, only hash-function security.
Signature size doesn’t matter. SPHINCS+ has the largest signatures by far (~8 KB).
You expect long-term security across decades. Hash-function security is structurally more conservative than lattice problems for very long horizons.
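The rule above can be written down as a small function, which is handy for encoding it in a design checklist or policy test. This is one possible reading of the rule, not an official procedure: the `Requirements` fields, the function name, and the threshold of 10 signatures per second are all assumptions made for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Requirements:
    size_critical: bool              # is signature size the binding constraint?
    signs_per_second: float          # expected signing rate per signer
    audited_falcon_available: bool   # is a vetted Falcon library on hand?
    avoid_lattice_assumptions: bool = False  # hash-based security required?

def pick_signature_scheme(req, max_falcon_signs_per_second=10.0):
    """Return "SPHINCS+", "Falcon", or "ML-DSA" following the decision rule.

    The throughput threshold is an illustrative assumption, not a standard value.
    """
    if req.avoid_lattice_assumptions:
        return "SPHINCS+"
    falcon_ok = (
        req.size_critical
        and req.audited_falcon_available
        and req.signs_per_second <= max_falcon_signs_per_second
    )
    # ML-DSA is the default whenever Falcon's preconditions are not all met.
    return "Falcon" if falcon_ok else "ML-DSA"

firmware = Requirements(True, 1 / (7 * 24 * 3600), True)   # weekly signing
api_server = Requirements(False, 500.0, True)              # high throughput
archive = Requirements(False, 0.01, False, avoid_lattice_assumptions=True)

for name, req in [("firmware", firmware), ("api_server", api_server), ("archive", archive)]:
    print(name, "->", pick_signature_scheme(req))
```

The ordering encodes a choice: a requirement to avoid lattice assumptions overrides size concerns, and Falcon is selected only when every one of its preconditions holds.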

The 2026 production picture: ~80% of new post-quantum deployments use ML-DSA; ~15% use Falcon for size-critical applications; ~5% use SPHINCS+ for stateless conservative use cases.

Common misconceptions

“Falcon is more secure than ML-DSA because it has smaller signatures.” No. Smaller signatures reflect a more compact mathematical structure, not stronger security. Each parameter set targets a stated NIST security level (Falcon-512 targets level 1, ML-DSA-44 level 2), and signature size says nothing about which level is met. Choose based on the right tradeoff for your use case, not on signature size as a security proxy.

“Constant-time Gaussian sampling is solved.” It is well-studied but easy to get wrong in implementation. The 2025 PQShield reference implementations are well-audited; older reference code is not. Use a current, audited library.

“Falcon’s NTRU lattice is broken.” No fundamental break has been published. There are subexponential-time classical algorithms for some NTRU problems with structured parameters, but Falcon’s parameters were chosen to avoid these. As of 2026, Falcon’s mathematical security is intact. The vulnerabilities have been in implementation, not in the underlying problem.

“Falcon can replace ML-DSA in any protocol.” It can replace ML-DSA, but the slow signing and implementation difficulty matter. Most TLS deployments use ML-DSA; some specialized blockchain protocols use Falcon for the signature compactness.

Exercises

1. Why Falcon signatures are smaller

Compare the structure of a Falcon signature (a short lattice vector) and an ML-DSA signature (a vector of polynomial coefficients mod $q$ plus auxiliary data). Why is Falcon’s representation more compact?

Answer

A Falcon signature is essentially a single short polynomial whose coefficients are small integers clustered around zero, so a variable-length encoding compresses them well. An ML-DSA signature carries a vector of several polynomials with much wider coefficient ranges, plus auxiliary data (a challenge hash and a hint vector). Falcon’s NTRU-lattice structure yields more compressible output; ML-DSA’s structure trades compression for implementation simplicity. The size factor is roughly 3.6× (666 versus 2,420 bytes at the smallest parameter sets), which is meaningful for bandwidth-constrained applications.

2. The constant-time challenge

Why is constant-time discrete Gaussian sampling especially difficult compared to constant-time uniform sampling?

Answer

Uniform sampling over $\{0, 1, \ldots, M-1\}$ is straightforward: generate uniform integers, take modulo (or rejection-sample if bias matters). Constant-time uniform sampling is well-understood and library-supported. Discrete Gaussian sampling, by contrast, requires (a) computing the probability of each integer in the support according to the Gaussian PDF, (b) sampling proportional to those probabilities. Both steps involve floating-point arithmetic or table lookups that can leak through timing. Floating-point operations have data-dependent timing on most architectures, table lookups have data-dependent cache behavior, and any conditional branching to handle edge cases adds more leakage vectors. The accumulated complexity is what makes Falcon’s constant-time implementation an active research area rather than a solved problem.
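One common way to remove the data-dependent control flow is a cumulative table that is always scanned in full. The sketch below illustrates the idea for a fixed-width half-Gaussian on the non-negative integers. It is an assumption-laden illustration, not Falcon's sampler: the table is built with floating-point `math.exp` at 32-bit precision, the function names are made up, and Python itself gives no timing guarantees, so only the structure (no early exit, no secret-dependent branch or index) carries over to a real implementation.

```python
import math
import random

def build_cdt(sigma, tail=10, bits=32):
    """Cumulative table for a half-Gaussian on {0, 1, 2, ...}, scaled to 2^bits."""
    zmax = math.ceil(tail * sigma)
    weights = [math.exp(-(z * z) / (2 * sigma * sigma)) for z in range(zmax + 1)]
    total = sum(weights)
    cdt, acc = [], 0.0
    for w in weights:
        acc += w
        cdt.append(int(acc / total * (1 << bits)))
    cdt[-1] = 1 << bits  # every possible u falls strictly below the last entry
    return cdt

def sample_half_gaussian(cdt, u):
    """Return the number of table entries <= u, touching every entry.

    There is no early exit: the same sequence of comparisons runs for every u.
    """
    z = 0
    for threshold in cdt:
        z += int(u >= threshold)
    return z

cdt = build_cdt(1.8)
rng = random.Random(3)
samples = [sample_half_gaussian(cdt, rng.getrandbits(32)) for _ in range(1000)]
print("largest sample:", max(samples), "table entries:", len(cdt))
```

A table like this only covers one fixed standard deviation. Handling an arbitrary real center and a secret-dependent standard deviation on top of it, without reintroducing leaks, is where most of the difficulty described above comes from.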

3. When Falcon’s slow signing matters

A blockchain protocol expects nodes to sign ~100 transactions per second. Compute the per-server signing throughput required, and decide whether Falcon’s ~28M-cycle signing is feasible.

Answer

100 signatures/sec × 28M cycles/sig = $2.8 \times 10^9$ cycles/sec required. A modern 3 GHz CPU has $3 \times 10^9$ cycles/sec available. One core dedicated to signing is at the edge of feasibility for Falcon at this throughput. Multi-core or batch signing helps but adds complexity. ML-DSA at ~600K cycles/sig requires only $6 \times 10^7$ cycles/sec — a tiny fraction of one core. For a 100-sig/sec blockchain node, ML-DSA is the clearly better choice from a server-throughput perspective; Falcon’s signature compactness is offset by the throughput cost. Falcon would be a better fit for a system where signature size is more constrained (small device transmitting over a narrow band).
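The same arithmetic as a reusable check, assuming the cycle counts from the speed table and a 3 GHz core. The function name `core_utilization` is made up for this sketch, and real throughput also depends on factors it ignores (memory, scheduling, other work sharing the core).

```python
def core_utilization(sigs_per_sec, cycles_per_sig, clock_hz=3e9):
    """Fraction of one core consumed by signing at the given rate."""
    return sigs_per_sec * cycles_per_sig / clock_hz

falcon_load = core_utilization(100, 28e6)   # Falcon-512 at ~28M cycles/signature
mldsa_load = core_utilization(100, 600e3)   # ML-DSA-44 at ~600K cycles/signature

print(f"Falcon-512: {falcon_load:.0%} of one core")
print(f"ML-DSA-44:  {mldsa_load:.0%} of one core")
```

At roughly 93% of a core, Falcon leaves almost no headroom for bursts, while ML-DSA sits near 2%.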

4. Picking the right post-quantum signature for embedded firmware

A firmware signing system signs firmware images once per release (~weekly) and verifies them on millions of devices. Each device has 32 KB of RAM and limited CPU. Pick a signature scheme.

Answer

Constraints: rare signing (weekly), frequent verification (each device on every boot), tight memory. Falcon-512 is the clear choice. The slow signing is amortized over a week; the compact signature (666 bytes) fits in tight memory; Falcon verification is fast enough for embedded CPUs. ML-DSA-44 would also work but uses ~3.6× more memory for the signature (2,420 bytes), which matters in 32 KB RAM. SPHINCS+ is too large (8 KB signature). Falcon is the right choice exactly when the bandwidth/memory constraint is the binding one and signing is rare. This is the canonical “Falcon use case” and explains why it is included in the NIST standards despite being harder to implement.

Where this goes next

Tutorial 50 covers SPHINCS+ — the hash-based signature alternative that doesn’t depend on lattice or NTRU assumptions, providing a structurally different conservative-security path.
