Notebook 02 — The LWE problem#
Goal: see Learning With Errors (LWE) operationally on tiny examples — build a sample by hand, then run Regev’s single-bit public-key encryption end-to-end.
LWE in one paragraph#
Learning With Errors (LWE): given many samples
where \(s \in \mathbb{Z}_q^n\) is a fixed secret, each \(a_i \in \mathbb{Z}_q^n\) is uniformly random, and each \(e_i\) is small Gaussian noise, recover \(s\).
The decision version: distinguish real LWE samples from uniformly random \((a, b)\). Both search-LWE and decision-LWE are conjectured hard; they provably reduce to worst-case lattice problems.
import numpy as np
rng = np.random.default_rng(1)
n, q, sigma = 4, 17, 1.0
s = rng.integers(0, q, n)
a = rng.integers(0, q, n)
e = int(np.round(rng.normal(0, sigma)))
b = (int(a @ s) + e) % q
print('s =', s)
print('a =', a)
print('<a,s> =', int(a @ s), 'mod', q, '=', int(a @ s) % q)
print('e =', e)
print('b =', b)
s = [ 8 8 12 16]
a = [ 0 2 13 16]
<a,s> = 428 mod 17 = 3
e = 1
b = 4
Regev’s single-bit public-key encryption#
Key generation. Sample a secret \(s \in \mathbb{Z}_q^n\) and \(m\) LWE samples; the public key is the list of pairs \((a_i, b_i)\).
Encrypt a bit \(\mu \in \{0, 1\}\): pick a random subset \(S\) of samples and output
Decrypt with \(s\): compute \(v = b^* - \langle a^*, s \rangle \bmod q\). If \(v\) is close to \(0\), output \(0\); if close to \(q/2\), output \(1\).
Correctness: \(v = \sum_{i \in S} e_i + \mu \lfloor q/2 \rfloor\), which is a small noise term plus either \(0\) or \(q/2\). If the accumulated noise stays below \(q/4\) in magnitude, decryption is unambiguous.
from pqc_edu.lwe import toy_keygen, toy_encrypt, toy_decrypt
rng = np.random.default_rng(0)
pk, sk = toy_keygen(n=16, q=257, sigma=1.0, rng=rng)
print('secret key s:', sk.s)
print('public key has', pk.samples.shape[0], 'samples, each',
pk.samples.shape[1] - 1, 'coefficients + b')
secret key s: [218 163 131 69 79 10 19 4 45 209 166 234 129 155 249 187]
public key has 42 samples, each 16 coefficients + b
results = []
for _ in range(20):
bit = int(rng.integers(0, 2))
ct = toy_encrypt(pk, bit, rng)
recovered = toy_decrypt(sk, ct)
results.append((bit, recovered, bit == recovered))
print(f"{'bit':>5} {'decrypted':>10} {'ok':>5}")
for b, r, ok in results:
print(f"{b:>5} {r:>10} {str(ok):>5}")
print('all correct?', all(ok for _, _, ok in results))
bit decrypted ok
0 0 True
0 0 True
0 0 True
1 1 True
0 0 True
1 1 True
1 1 True
1 1 True
1 1 True
1 1 True
1 1 True
1 1 True
1 1 True
0 0 True
1 1 True
1 1 True
0 0 True
1 1 True
0 0 True
0 0 True
all correct? True
Noise budget demo#
If \(\sigma\) is too large, decryption can fail. The accumulated error from summing a random subset of samples can push \(b^*\) out of the unambiguous region around \(0\) or \(\lfloor q/2 \rfloor\). Real schemes carefully pick parameters so the failure probability is negligible (well below \(2^{-100}\)).
rng = np.random.default_rng(0)
pk, sk = toy_keygen(n=16, q=257, sigma=8.0, rng=rng)
fails = 0
for _ in range(200):
bit = int(rng.integers(0, 2))
ct = toy_encrypt(pk, bit, rng)
if toy_decrypt(sk, ct) != bit:
fails += 1
print(f'failures: {fails}/200 with sigma=8.0')
failures: 3/200 with sigma=8.0
Issues with this toy scheme#
This construction is correct but impractical:
One bit per ciphertext is wasteful (ciphertext is \(\sim n \log q\) bits for 1 bit of plaintext).
The public key matrix is huge — \(m \times n\) entries in \(\mathbb{Z}_q\).
No active-attacker protection — a malicious ciphertext can reveal \(s\).
ML-KEM (notebook 06) fixes all three: module lattices for compact keys, the NTT for fast polynomial multiplication, and the Fujisaki-Okamoto transform for CCA security.
But first: can we actually break toy LWE? → 03_attacking_toy_lwe.ipynb.