Building an automated missile interception algorithm (ongoing)

This post is intentionally dense: lists, blockquotes, math, tables, images with captions, and footnotes that link out (and even sneak in a line of code). If the layout survives this, it survives anything.¹

Overview

The project is a 3D-ish intercept simulation: incoming threats move along ballistic-ish arcs; a defender learns when to launch and how to steer mid-course corrections so the interceptor meets the threat. Think less “production missile defense” and more “gym environment where I can iterate on reward shaping without filing export paperwork.”²

Unordered laundry list of what actually exists in the repo today:

Simulation core: discrete time steps, configurable gravity, drag hacks³, noisy sensors.
Policy: started with a hand-tuned PD-ish baseline, moving toward PPO/SAC-style updates (see log).
Visualization: WebGL preview + offline GIF exports for debugging angles and bone rig stupidity.

What “done” means (for now)

Reproducible training runs with a fixed seed.
Intercept rate above random on a held-out threat distribution.
A single command to record a rollout GIF for the README.
Stop lying to myself in the README about how finished any of this is.

Note: The hardest part hasn’t been neural networks. It’s time. When your simulation step isn’t aligned with how you log events, you get “ghost” intercepts that never happened — only your instrumentation thinks they did.

Nested blockquote, because why not:

Early prototype mantra:
Make it run. Make it right. Make it fast.
(I’m still somewhere between one and two.)

Log and Notes

A longer, messier stream of thoughts + experiments lives in the dedicated notes page: missile interception — log + notes. Below is the “executive summary” version with figures.

Still from the intercept visualization — threats, defender, and debug vectors Prototype view: intercept geometry + debug vectors (colors are meaningless; joy is real).

RL training GIF — policy fiddling with bone / angle targets Episode clip: angle / “bone” targets moving as the policy figures out it should not yeet the interceptor into the floor.

Earlier episode capture Older rollout export — kept around as a reminder that “bad but moving” beats “perfect but never run.”

Vocabulary (definition list)

Kramdown-style definitions — handy for glossaries:

RL loop: Agent observes state $s_t$, emits action $a_t$, environment returns $s_{t+1}$ and reward $r_t$. Rinse⁴.
Interceptor: The controllable object trying to meet the threat; not assumed to have unlimited lateral acceleration.
Threat: Anything you want to not reach the protected volume — modeled as a point mass first, fancier later.

Simulation sketch (code)

Python — environment step (toy)

@dataclass
class Vec3:
    x: float
    y: float
    z: float

    def __add__(self, o: "Vec3") -> "Vec3":
        return Vec3(self.x + o.x, self.y + o.y, self.z + o.z)

    def scale(self, s: float) -> "Vec3":
        return Vec3(self.x * s, self.y * s, self.z * s)

def step(pos: Vec3, vel: Vec3, acc: Vec3, dt: float) -> tuple[Vec3, Vec3]:
    """Semi-implicit Euler — good enough until energy blows up."""
    vel_next = vel + acc.scale(dt)
    pos_next = pos + vel_next.scale(dt)
    return pos_next, vel_next

Bash — one-liner to grep my own chaos

rg -n "TODO|FIXME|WTF" src/ notes/ --glob '!_site/**'

JavaScript — time slider mental model (what broke my brain)

// Discrete ticks; "continuous" feel is just small dt + good interpolation.
function clamp(x, lo, hi) {
  return Math.min(hi, Math.max(lo, x));
}

export function advanceTime(t, dt, tMax) {
  return clamp(t + dt, 0, tMax);
}

GLSL — fake “heat” in the debug view (fragment idea)

precision mediump float;
varying vec2 vUv;
uniform float uTime;
uniform vec2 uThreatPos;
uniform vec2 uInterceptPos;

void main() {
  float dThreat = distance(vUv, uThreatPos);
  float dIntercept = distance(vUv, uInterceptPos);
  float heat = exp(-12.0 * dThreat) + 0.6 * exp(-10.0 * dIntercept);
  vec3 col = mix(vec3(0.05, 0.07, 0.12), vec3(1.0, 0.35, 0.1), heat);
  gl_FragColor = vec4(col, 1.0);
}

JSON — config fragment

{
  "simulation": {
    "dt": 0.02,
    "max_episode_seconds": 45,
    "integrator": "semi_implicit_euler"
  },
  "rewards": {
    "intercept_bonus": 100.0,
    "distance_shaping": true
  }
}

Reward hacking (ordered list of failure modes)

Agent learns to stall just inside the success radius without intercepting — looks good in logs, is cowardice.
Huge terminal reward causes value explosions; advantage estimates go brrr.
Shaped distance rewards fight terminal sparse rewards unless you schedule curriculum.
You “fix” (3) by adding more terms until the reward is a Christmas tree and nobody knows what’s being optimized.

Table: what I thought vs what the metrics said

Phase	What I believed	What eval showed
Week 1	“Distance shaping will help exploration”	Learned to orbit
Week 2	“Bigger intercept bonus fixes orbit”	Learned to slam into terrain
Week 3	“Penalize crash”	Learned to do nothing
Week 4	“Tune penalties”	Finally something like intercept

Links, footnotes, and cross-refs

Internal: writing index · this site’s notes
External resources I keep reopening: Spinning Up in Deep RL · Stable-Baselines3 docs

Reward shaping sketch (don’t copy-paste blindly):⁵

Table stress tests (long cells, code, TeX, dollars)

Symptom	Likely cause	Quick check
Loss goes `nan` after ~2k steps	Bad learning rate or `log(0)` in policy	`torch.isfinite(loss).all()`; clamp `logits`
Interceptor orbits forever	Distance-only shaping with no terminal intercept term	Inspect $\mathbb{E}[r_T]$ vs shaped $\sum \gamma^k r_k$
Policy ignores threat	Observation normalization off or wrong frame	Print `obs.min()`, `obs.max()` each eval

Footer: three-column layout with inline code and inline math $\mathbb{E}[\cdot]$.

Budget line item	Amount	Notes
GPU hours	$12/hr spot × 40h	Escaped dollars for spreadsheet brain
Coffee	$4.50 × 2/day × 30	Same — `\$` not math
Emotional damage	Priceless	Long text cell: this row exists to see whether a joking label plus a medium-length explanation still wraps cleanly when the table is full-width and the type is Crimson Pro at ~1.1rem. If anything clips or the baseline looks wrong, that’s a signal to tweak `td` padding or `vertical-align`.

Footer: currency + a deliberately verbose third column.

Integrator	Update	Stability
Euler	`v += adt; x += vdt`	Meh
Semi-implicit	`v += adt; x += vdt` (order swap)	Better energy
RK4	four `k` stages	Overkill for this prototype

Counted block: first table should get “Table 1 —” if your CSS counter is active.

Constraint	Expression	Code
Speed cap	$\\|v\\| \leq v_{\max}$	`v = v * min(1.0, v_max / (torch.norm(v)+1e-8))`
No-fly	altitude $h \geq h_{\min}$	`penalty = torch.relu(h_min - h) ** 2`

Second table in same block — should read “Table 2 —”.

Extreme table torture

Pure layout stress: horizontal scroll, vertical scroll, dense grids, mixed modalities. If anything looks off, tweak .page-content table in fugg2.css — not your prose.

Absurdly wide grid (12 columns × 17 data rows)

C0	C1	C2	C3	C4	C5	C6	C7	C8	C9	C10	C11
`row_00`	(x_{0,1})	`np.float32`	v0_3	v0_4	`pow(2,0)`	v0_6	v0_7	v0_8	v0_9	v0_10	v0_11
`row_01`	(x_{1,1})	`torch.Tensor`	v1_3	v1_4	v1_5	v1_6	v1_7	v1_8	v1_9	v1_10	v1_11
`row_02`	(x_{2,1})	`torch.Tensor`	v2_3	v2_4	v2_5	v2_6	v2_7	v2_8	v2_9	v2_10	v2_11
`row_03`	(x_{3,1})	`np.float32`	v3_3	v3_4	v3_5	v3_6	v3_7	v3_8	v3_9	v3_10	v3_11
`row_04`	(x_{4,1})	`torch.Tensor`	v4_3	v4_4	`pow(2,4)`	v4_6	v4_7	v4_8	v4_9	v4_10	v4_11
`row_05`	(x_{5,1})	`torch.Tensor`	v5_3	v5_4	v5_5	v5_6	v5_7	v5_8	v5_9	v5_10	v5_11
`row_06`	(x_{6,1})	`np.float32`	v6_3	v6_4	v6_5	v6_6	v6_7	v6_8	v6_9	v6_10	v6_11
`row_07`	(x_{7,1})	`torch.Tensor`	v7_3	v7_4	v7_5	v7_6	v7_7	v7_8	v7_9	v7_10	Mega-cell: word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word …
`row_08`	(x_{8,1})	`torch.Tensor`	v8_3	v8_4	`pow(2,0)`	v8_6	v8_7	v8_8	v8_9	v8_10	v8_11
`row_09`	(x_{9,1})	`np.float32`	v9_3	v9_4	v9_5	v9_6	v9_7	v9_8	v9_9	v9_10	v9_11
`row_10`	(x_{10,1})	`torch.Tensor`	v10_3	v10_4	v10_5	v10_6	v10_7	v10_8	v10_9	v10_10	v10_11
`row_11`	(x_{11,1})	`torch.Tensor`	v11_3	v11_4	v11_5	v11_6	v11_7	v11_8	v11_9	v11_10	v11_11
`row_12`	(x_{12,1})	`np.float32`	v12_3	v12_4	`pow(2,4)`	v12_6	v12_7	v12_8	v12_9	v12_10	v12_11
`row_13`	(x_{13,1})	`torch.Tensor`	v13_3	v13_4	v13_5	v13_6	v13_7	v13_8	v13_9	v13_10	v13_11
`row_14`	(x_{14,1})	`torch.Tensor`	v14_3	v14_4	v14_5	v14_6	v14_7	v14_8	v14_9	v14_10	v14_11
`row_15`	(x_{15,1})	`np.float32`	v15_3	v15_4	v15_5	v15_6	v15_7	v15_8	v15_9	v15_10	v15_11
`row_16`	(x_{16,1})	`torch.Tensor`	v16_3	v16_4	`pow(2,0)`	v16_6	v16_7	v16_8	v16_9	v16_10	v16_11

Wide grid footer — check overflow-x, cell padding, and last-column long text.

Absurdly tall runbook (4 columns × 54 data rows)

#	Status	Command / token	Note
0	`OK`	`torchrun --nproc_per_node=2 train.py --seed 0`	( \sum_{k=1}^{n} k = n(n+1)/2 ) for sanity check 0
1	`WARN`	`pytest -q tests/test_001.py`	tick 1
2	`FAIL`	`pytest -q tests/test_002.py`	tick 2
3	`SKIP`	`pytest -q tests/test_003.py`	tick 3
4	`OK`	`pytest -q tests/test_004.py`	tick 4
5	`WARN`	`torchrun --nproc_per_node=2 train.py --seed 5`	tick 5
6	`FAIL`	`pytest -q tests/test_006.py`	tick 6
7	`SKIP`	`pytest -q tests/test_007.py`	tick 7
8	`OK`	`pytest -q tests/test_008.py`	tick 8
9	`WARN`	`pytest -q tests/test_009.py`	tick 9
10	`FAIL`	`torchrun --nproc_per_node=2 train.py --seed 10`	tick 10
11	`SKIP`	`pytest -q tests/test_011.py`	( \sum_{k=1}^{n} k = n(n+1)/2 ) for sanity check 11
12	`OK`	`pytest -q tests/test_012.py`	tick 12
13	`WARN`	`pytest -q tests/test_013.py`	tick 13
14	`FAIL`	`pytest -q tests/test_014.py`	tick 14
15	`SKIP`	`torchrun --nproc_per_node=2 train.py --seed 15`	tick 15
16	`OK`	`pytest -q tests/test_016.py`	tick 16
17	`WARN`	`pytest -q tests/test_017.py`	tick 17
18	`FAIL`	`pytest -q tests/test_018.py`	tick 18
19	`SKIP`	`pytest -q tests/test_019.py`	tick 19
20	`OK`	`torchrun --nproc_per_node=2 train.py --seed 20`	tick 20
21	`WARN`	`pytest -q tests/test_021.py`	tick 21
22	`FAIL`	`pytest -q tests/test_022.py`	( \sum_{k=1}^{n} k = n(n+1)/2 ) for sanity check 22
23	`SKIP`	`pytest -q tests/test_023.py`	Long note: lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, lorem ipsum dolor sit amet, end.
24	`OK`	`pytest -q tests/test_024.py`	tick 24
25	`WARN`	`torchrun --nproc_per_node=2 train.py --seed 25`	tick 25
26	`FAIL`	`pytest -q tests/test_026.py`	tick 26
27	`SKIP`	`pytest -q tests/test_027.py`	tick 27
28	`OK`	`pytest -q tests/test_028.py`	tick 28
29	`WARN`	`pytest -q tests/test_029.py`	tick 29
30	`FAIL`	`torchrun --nproc_per_node=2 train.py --seed 30`	tick 30
31	`SKIP`	`pytest -q tests/test_031.py`	tick 31
32	`OK`	`pytest -q tests/test_032.py`	tick 32
33	`WARN`	`pytest -q tests/test_033.py`	( \sum_{k=1}^{n} k = n(n+1)/2 ) for sanity check 33
34	`FAIL`	`pytest -q tests/test_034.py`	tick 34
35	`SKIP`	`torchrun --nproc_per_node=2 train.py --seed 35`	tick 35
36	`OK`	`pytest -q tests/test_036.py`	tick 36
37	`WARN`	`pytest -q tests/test_037.py`	tick 37
38	`FAIL`	`pytest -q tests/test_038.py`	tick 38
39	`SKIP`	`pytest -q tests/test_039.py`	tick 39
40	`OK`	`torchrun --nproc_per_node=2 train.py --seed 40`	tick 40
41	`WARN`	`pytest -q tests/test_041.py`	tick 41
42	`FAIL`	`pytest -q tests/test_042.py`	tick 42
43	`SKIP`	`pytest -q tests/test_043.py`	tick 43
44	`OK`	`pytest -q tests/test_044.py`	( \sum_{k=1}^{n} k = n(n+1)/2 ) for sanity check 44
45	`WARN`	`torchrun --nproc_per_node=2 train.py --seed 45`	tick 45
46	`FAIL`	`pytest -q tests/test_046.py`	tick 46
47	`SKIP`	`pytest -q tests/test_047.py`	tick 47
48	`OK`	`pytest -q tests/test_048.py`	tick 48
49	`WARN`	`pytest -q tests/test_049.py`	tick 49
50	`FAIL`	`torchrun --nproc_per_node=2 train.py --seed 50`	tick 50
51	`SKIP`	`pytest -q tests/test_051.py`	tick 51
52	`OK`	`pytest -q tests/test_052.py`	tick 52
53	`WARN`	`pytest -q tests/test_053.py`	tick 53

Tall runbook footer — row hover, alternating statuses, one obese note row.

Dense numeric matrix (16 × 16, every cell filled)

	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
0	0	17	34	51	68	85	102	119	136	153	170	187	204	221	238	255
1	31	48	65	82	99	116	133	150	167	184	201	218	235	252	269	286
2	62	79	96	113	130	147	164	181	198	215	232	249	266	283	300	317
3	93	110	127	144	161	178	195	212	229	246	263	280	297	314	331	348
4	124	141	158	175	192	209	226	243	260	277	294	311	328	345	362	379
5	155	172	189	206	223	240	257	274	291	308	325	342	359	376	393	410
6	186	203	220	237	254	271	288	305	322	339	356	373	390	407	424	441
7	217	234	251	268	285	302	319	336	353	370	387	404	421	438	455	472
8	248	265	282	299	316	333	350	367	384	401	418	435	452	469	486	503
9	279	296	313	330	347	364	381	398	415	432	449	466	483	500	517	534
10	310	327	344	361	378	395	412	429	446	463	480	497	514	531	548	565
11	341	358	375	392	409	426	443	460	477	494	511	528	545	562	579	596
12	372	389	406	423	440	457	474	491	508	525	542	559	576	593	610	627
13	403	420	437	454	471	488	505	522	539	556	573	590	607	624	641	658
14	434	451	468	485	502	519	536	553	570	587	604	621	638	655	672	689
15	465	482	499	516	533	550	567	584	601	618	635	652	669	686	703	720

Dense matrix footer — typography at small sizes, column alignment, thead vs tbody.

Mixed chaos (empty cells, long strings — pipes only as delimiters)

A	B	C
	`single-space`	(empty first column)
`pipe-in-code-ok`	`grep a\\\|b file`	or-op in regex
U+1F680	Not an emoji font test	just characters
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx	short	`mid`
short	yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy	`tail`

Chaos footer — empty cells, code with pipe characters inside backticks, ultra-long strings.

Screenshot archaeology

Sometimes the most “real” artifact is a random screengrab from a late night:

Debug screenshot — UI spaghetti era Exported screenshot from an older debug UI — buttons everywhere, soul nowhere.

Status: ongoing. If you’re reading this in the future and the project is still “ongoing,” then: same.

Meta-footnote: this paragraph exists to stress-test footnotes-to-gutter.js, sidenotes, and long footnote bodies. Inline check: python -m pip install -e . ↩
Not a weapons system. Toy research code. If you need real intercept math, start with proportional navigation and a proper aerospace textbook — e.g. Zarchan, Tactical and Strategic Missile Guidance. ↩
Drag is not “real” yet — it is a scalar fudge you tune until trajectories look plausible: drag_coeff * velocity ** 2 with clamps. ↩
If you want the formalism: MDP. Implementation detail: my state is not Markov yet — there’s hidden history in the integrator unless I augment the observation. ↩
$r_t = -\alpha \|p_{\text{int}} - p_{\text{threat}}\|2 + \beta \mathbb{1}{\text{hit}}$. In code: reward = -alpha * dist + beta * float(hit). Compare with Hugging Face RL notes for the bigger picture. ↩

Constraint	Expression	Code
Speed cap	\(\\|v\\| \leq v_{\max}\)	`v = v * min(1.0, v_max / (torch.norm(v)+1e-8))`
No-fly	altitude \(h \geq h_{\min}\)	`penalty = torch.relu(h_min - h) ** 2`

	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
0	0	17	34	51	68	85	102	119	136	153	170	187	204	221	238	255
1	31	48	65	82	99	116	133	150	167	184	201	218	235	252	269	286
2	62	79	96	113	130	147	164	181	198	215	232	249	266	283	300	317
3	93	110	127	144	161	178	195	212	229	246	263	280	297	314	331	348
4	124	141	158	175	192	209	226	243	260	277	294	311	328	345	362	379
5	155	172	189	206	223	240	257	274	291	308	325	342	359	376	393	410
6	186	203	220	237	254	271	288	305	322	339	356	373	390	407	424	441
7	217	234	251	268	285	302	319	336	353	370	387	404	421	438	455	472
8	248	265	282	299	316	333	350	367	384	401	418	435	452	469	486	503
9	279	296	313	330	347	364	381	398	415	432	449	466	483	500	517	534
10	310	327	344	361	378	395	412	429	446	463	480	497	514	531	548	565
11	341	358	375	392	409	426	443	460	477	494	511	528	545	562	579	596
12	372	389	406	423	440	457	474	491	508	525	542	559	576	593	610	627
13	403	420	437	454	471	488	505	522	539	556	573	590	607	624	641	658
14	434	451	468	485	502	519	536	553	570	587	604	621	638	655	672	689
15	465	482	499	516	533	550	567	584	601	618	635	652	669	686	703	720

	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
0	0	17	34	51	68	85	102	119	136	153	170	187	204	221	238	255
1	31	48	65	82	99	116	133	150	167	184	201	218	235	252	269	286
2	62	79	96	113	130	147	164	181	198	215	232	249	266	283	300	317
3	93	110	127	144	161	178	195	212	229	246	263	280	297	314	331	348
4	124	141	158	175	192	209	226	243	260	277	294	311	328	345	362	379
5	155	172	189	206	223	240	257	274	291	308	325	342	359	376	393	410
6	186	203	220	237	254	271	288	305	322	339	356	373	390	407	424	441
7	217	234	251	268	285	302	319	336	353	370	387	404	421	438	455	472
8	248	265	282	299	316	333	350	367	384	401	418	435	452	469	486	503
9	279	296	313	330	347	364	381	398	415	432	449	466	483	500	517	534
10	310	327	344	361	378	395	412	429	446	463	480	497	514	531	548	565
11	341	358	375	392	409	426	443	460	477	494	511	528	545	562	579	596
12	372	389	406	423	440	457	474	491	508	525	542	559	576	593	610	627
13	403	420	437	454	471	488	505	522	539	556	573	590	607	624	641	658
14	434	451	468	485	502	519	536	553	570	587	604	621	638	655	672	689
15	465	482	499	516	533	550	567	584	601	618	635	652	669	686	703	720

	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
0	0	17	34	51	68	85	102	119	136	153	170	187	204	221	238	255
1	31	48	65	82	99	116	133	150	167	184	201	218	235	252	269	286
2	62	79	96	113	130	147	164	181	198	215	232	249	266	283	300	317
3	93	110	127	144	161	178	195	212	229	246	263	280	297	314	331	348
4	124	141	158	175	192	209	226	243	260	277	294	311	328	345	362	379
5	155	172	189	206	223	240	257	274	291	308	325	342	359	376	393	410
6	186	203	220	237	254	271	288	305	322	339	356	373	390	407	424	441
7	217	234	251	268	285	302	319	336	353	370	387	404	421	438	455	472
8	248	265	282	299	316	333	350	367	384	401	418	435	452	469	486	503
9	279	296	313	330	347	364	381	398	415	432	449	466	483	500	517	534
10	310	327	344	361	378	395	412	429	446	463	480	497	514	531	548	565
11	341	358	375	392	409	426	443	460	477	494	511	528	545	562	579	596
12	372	389	406	423	440	457	474	491	508	525	542	559	576	593	610	627
13	403	420	437	454	471	488	505	522	539	556	573	590	607	624	641	658
14	434	451	468	485	502	519	536	553	570	587	604	621	638	655	672	689
15	465	482	499	516	533	550	567	584	601	618	635	652	669	686	703	720