Side-Channel Attacks on PQC: What Hardware Engineers Need to Know
There’s a dangerous assumption circulating in the security community: that post-quantum cryptographic algorithms are inherently safe because they’re mathematically hard to break. This conflates algorithmic security with implementation security — and in the real world, implementations are what get attacked.
The NIST PQC standards — ML-KEM for key encapsulation and ML-DSA for digital signatures — are built on the hardness of lattice problems. No known quantum algorithm efficiently solves these problems. But an attacker with physical access to your device doesn’t need a quantum computer. They need an oscilloscope, an EM probe, and patience.
The Gap Between Theory and Silicon
Every cryptographic operation consumes power. That power consumption varies depending on the data being processed. This is the fundamental physical reality that enables side-channel attacks, and no amount of mathematical elegance in the algorithm design changes it.
Lattice-based cryptography relies heavily on polynomial arithmetic, particularly the Number Theoretic Transform (NTT). The NTT is essentially a modular arithmetic version of the Fast Fourier Transform, and it’s the computational core of both ML-KEM and ML-DSA.
Here’s the problem: NTT butterfly operations involve multiplications and additions where the operands directly depend on secret key material. Each butterfly computes:
a' = a + w * b (mod q)
b' = a - w * b (mod q)
where w is a twiddle factor and a, b are coefficients that correlate with secret data. The multiplication w * b produces a data-dependent power signature that is measurable, repeatable, and exploitable.
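To make the data dependence concrete, here is the butterfly in Python, using the ML-KEM modulus q = 3329; the coefficient values are invented for illustration:

```python
# A single Cooley-Tukey NTT butterfly over Z_q, with the ML-KEM
# modulus q = 3329. The values fed in below are illustrative only.
Q = 3329

def ct_butterfly(a: int, b: int, w: int) -> tuple[int, int]:
    """Compute (a + w*b mod q, a - w*b mod q).

    The product w*b depends directly on the coefficient b; on real
    hardware, its power signature is what CPA targets.
    """
    t = (w * b) % Q          # data-dependent multiplication
    return (a + t) % Q, (a - t) % Q

a2, b2 = ct_butterfly(a=1234, b=567, w=17)
```

The operation is invertible (divide the sum and difference of the outputs by 2 and by w modulo q), which is why the inverse NTT can undo it exactly.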
Real Attack Vectors on ML-KEM and ML-DSA
Recent academic research has demonstrated practical attacks against PQC implementations:
Power Analysis on ML-KEM Decapsulation: The decapsulation process in ML-KEM involves a secret polynomial multiplication that directly leaks through power consumption. Correlation Power Analysis (CPA) targeting the NTT operations can recover the secret key with as few as 10,000 traces on unprotected software implementations running on ARM Cortex-M4.
Timing Attacks on ML-DSA Signing: The rejection sampling loop in ML-DSA (Dilithium) signing is a natural timing target. The iteration count itself is designed to be independent of the key, but non-constant-time rejection checks can reveal which coefficient failed the bound check, and that information correlates with the secret key, enabling a timing attack that requires no physical access at all.
Electromagnetic Analysis: EM probes placed near the processor die can capture localized emanations from specific functional units. This is particularly devastating because it can target individual NTT butterfly units even in the presence of algorithmic noise from other concurrent operations.
Template Attacks on Key Generation: The most sophisticated variant — an attacker first profiles power consumption on an identical device with known keys, then uses this template to extract the key from the target device with minimal traces.
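To illustrate the statistical core of CPA, here is a simulated sketch in Python: the "traces" are Hamming-weight leakage of a hypothetical secret multiplication plus Gaussian noise, and the secret coefficient, noise level, and trace count are all invented for the demo. Real attacks apply the same correlation step to measured power traces.

```python
# Simulated Correlation Power Analysis against a secret modular
# multiplication mod q = 3329. Everything numeric here is invented.
import random

Q = 3329
SECRET = 1717                      # hypothetical secret coefficient

def hw(x: int) -> int:
    """Hamming weight: a standard model for power leakage."""
    return bin(x).count("1")

def simulate_trace(b: int, rng: random.Random) -> float:
    """One 'power sample': HW of the secret product plus noise."""
    return hw((SECRET * b) % Q) + rng.gauss(0, 1.0)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def cpa_recover(inputs, traces, candidates):
    """Return the candidate whose predicted leakage correlates best."""
    best, best_r = None, -1.0
    for k in candidates:
        model = [hw((k * b) % Q) for b in inputs]
        r = abs(pearson(model, traces))
        if r > best_r:
            best, best_r = k, r
    return best

rng = random.Random(42)
inputs = [rng.randrange(Q) for _ in range(300)]
traces = [simulate_trace(b, rng) for b in inputs]
# Search a small candidate window around the secret, for brevity.
recovered = cpa_recover(inputs, traces, range(1700, 1730))
```

The correct candidate stands out because only it predicts the data-dependent component of every trace; wrong guesses decorrelate almost completely.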
Why Software Countermeasures Fall Short
The standard software countermeasures against side-channel attacks include:
Random delays: Insert random wait cycles between operations. This increases the number of traces needed but doesn’t eliminate the leakage. With sufficient traces (typically 10-100x more), the attack still succeeds.
Operation shuffling: Randomize the order of independent NTT butterfly operations. This provides first-order protection but is defeated by higher-order statistical analysis.
Boolean masking in software: Split each secret value into random shares and operate on shares independently. This is theoretically sound but practically flawed in software — register reuse, memory bus contention, and compiler optimizations frequently create unintended leakage points that bypass the masking.
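The share-splitting idea behind Boolean masking can be sketched in a few lines of Python. This is a toy model: linear operations like XOR work share-wise, while nonlinear operations (AND, modular multiplication) need fresh randomness and are exactly where software implementations tend to leak.

```python
# First-order Boolean masking: the secret never exists in one place.
import secrets

def mask(x: int, bits: int = 16) -> tuple[int, int]:
    """Split x into two random shares such that x = s0 XOR s1."""
    s0 = secrets.randbits(bits)
    return s0, x ^ s0

def unmask(s0: int, s1: int) -> int:
    return s0 ^ s1

def masked_xor(a_shares, b_shares):
    """XOR is linear over GF(2), so it can be computed share-wise;
    no intermediate value ever recombines the secret."""
    (a0, a1), (b0, b1) = a_shares, b_shares
    return a0 ^ b0, a1 ^ b1
```

In software, nothing stops the compiler or the register file from briefly holding both shares in a way that leaks their combination; in hardware, the share separation can be enforced wire by wire.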
The fundamental issue is that software runs on hardware it doesn’t fully control. The CPU’s pipeline, cache hierarchy, branch predictor, and bus arbitration all create side channels that software cannot observe or mitigate.
Hardware Countermeasures: The Only Complete Solution
Hardware implementations of PQC can provide side-channel resistance that is architecturally guaranteed, not merely hoped for:
Gate-level Boolean Masking: In a hardware NTT implementation, each butterfly operation can be decomposed into masked gates where the shares never recombine on any physical wire. The masking is maintained through every logic gate, flip-flop, and interconnect — something impossible to guarantee in software.
Arithmetic Masking for NTT: The polynomial coefficients can be additively masked before entering the NTT datapath. Converting between Boolean and arithmetic masking is expensive in software (requiring the Goubin conversion algorithm) but can be implemented as a dedicated conversion unit in hardware with constant-time, constant-power behavior.
Hiding through Controlled Randomization: Hardware enables insertion of random dummy operations at the clock-cycle level, noise injection on the power supply, and randomized register allocation — all transparent to the functional operation but devastating to statistical side-channel analysis.
Dual-rail Logic: Advanced implementations use complementary logic styles (like WDDL or SABL) where every gate switches exactly once per clock cycle regardless of the data value, fundamentally eliminating data-dependent power variation at the source.
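Because the butterfly is linear in its inputs, additive masking commutes with it, which is what makes a masked NTT datapath tractable. A Python sketch of that property, again using q = 3329 (the share values are random, but correctness holds for any split):

```python
# Arithmetic (additive) masking through an NTT butterfly. The twiddle
# factor w is public, so the butterfly is linear in (a, b) and can be
# applied to each share pair independently.
import secrets

Q = 3329

def amask(x: int) -> tuple[int, int]:
    """Split x into additive shares: x = (r + (x - r)) mod q."""
    r = secrets.randbelow(Q)
    return r, (x - r) % Q

def butterfly(a: int, b: int, w: int) -> tuple[int, int]:
    t = (w * b) % Q
    return (a + t) % Q, (a - t) % Q

def masked_butterfly(a_sh, b_sh, w):
    """Run the butterfly on each share pair; by linearity the share
    sums equal the unmasked outputs, yet no wire carries a secret."""
    out0 = butterfly(a_sh[0], b_sh[0], w)
    out1 = butterfly(a_sh[1], b_sh[1], w)
    return (out0[0], out1[0]), (out0[1], out1[1])
```

Nonlinear steps elsewhere in ML-KEM (sampling, compression, the final comparison) are where the expensive mask conversions come in; the NTT itself masks cheaply for exactly this reason.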
Evaluation and Certification Requirements
The security community is increasingly recognizing that PQC implementations must be evaluated for side-channel resistance:
Common Criteria AVA_VAN.5 (the highest vulnerability analysis level) now expects demonstration of resistance against advanced physical attacks including DPA, CPA, and template attacks.
NIST’s Implementation Security requirements for PQC explicitly acknowledge that side-channel resistance is essential for any deployment handling sensitive data.
KCMVP (Korean Cryptographic Module Validation Program) is updating its requirements to include side-channel evaluation for PQC modules.
Test Vector Leakage Assessment (TVLA) using Welch’s t-test has become the standard first-pass evaluation method. A secure implementation must pass TVLA at first order (and ideally higher orders) across all intermediate values.
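The core of a fixed-vs-random TVLA evaluation is just Welch's t-statistic compared against the conventional ±4.5 threshold at each sample point. A minimal Python sketch on synthetic samples, where the "leak" is an artificial mean shift in the fixed-input class:

```python
# Welch's t-test as used in TVLA (fixed-vs-random). The traces here
# are synthetic single samples, invented purely for the demo.
import random

def welch_t(xs, ys) -> float:
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    return (mx - my) / (vx / nx + vy / ny) ** 0.5

rng = random.Random(1)
# A leaky device: the fixed-input class has a shifted mean.
fixed = [rng.gauss(0.3, 1.0) for _ in range(2000)]
rand_ = [rng.gauss(0.0, 1.0) for _ in range(2000)]
t = welch_t(fixed, rand_)
leaks = abs(t) > 4.5   # first-order TVLA fail
```

A full evaluation repeats this at every sample point of the trace and, for higher orders, on preprocessed (centered, squared, or cross-multiplied) samples.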
Practical Guidance for Hardware Engineers
If you’re designing PQC hardware IP, here’s the minimum security baseline:
First-order masking on all operations involving secret key material. This includes NTT butterflies, polynomial sampling, and comparison operations in decapsulation/verification.
Constant-time execution guaranteed by hardware design — no data-dependent branching, no variable-latency multipliers, no early-termination optimizations in the cryptographic datapath.
Fresh randomness for every operation — integrate a TRNG (True Random Number Generator) that provides sufficient entropy for mask generation. For a first-order masked NTT on a single 256-coefficient ML-KEM polynomial, that is roughly 3,072 random bits (256 coefficients × 12 bits per mask), and an ML-KEM-768 operation masks several such polynomials.
Physical separation between the cryptographic core and the rest of the SoC. Dedicated power domains, separate clock trees, and shielding layers prevent cross-talk leakage.
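As one concrete element of the constant-time requirement, the ciphertext comparison in decapsulation must be branch-free: an early-exit comparison leaks the position of the first mismatch. A Python sketch of the standard accumulate-differences pattern (Python itself gives no timing guarantees; the point is the branch-free structure a hardware comparator would mirror):

```python
# Branch-free equality check: every byte is always examined, and no
# control flow depends on where a mismatch occurs.
def ct_equal(a: bytes, b: bytes) -> bool:
    if len(a) != len(b):      # lengths are public in this setting
        return False
    acc = 0
    for x, y in zip(a, b):
        acc |= x ^ y          # accumulate differences without branching
    return acc == 0
```

In production Python code, `hmac.compare_digest` from the standard library serves this purpose; in RTL, the same structure is a wide XOR followed by an OR-reduction tree with fixed latency.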
Conclusion
The post-quantum cryptography standards are mathematically sound. But mathematics doesn’t execute — hardware does. And hardware leaks. The only way to build PQC implementations that are secure against real-world physical adversaries is to build the countermeasures into the silicon itself.
Side-channel attacks on PQC are not theoretical. They’re published, reproduced, and improving. The time to design resistant hardware is before deployment, not after the first breach.
