/* ============================================================
   PART IV — LSTM
   ============================================================ */

(function() {
  const ANCHOR = document.getElementById('anchor-lstm');

  const html = `
<!-- ========== 22 The conveyor-belt idea ========== -->
<article id="lstm-idea" class="screen" data-screen-label="22 The conveyor-belt idea">
  <div class="section-head">
    <div class="section-eyebrow">Part IV · Section 22 · 5 min</div>
    <h2>LSTM: a conveyor belt of memory, with valves.</h2>
    <p class="section-lede">An LSTM keeps an extra piece of state, the <em>cell state</em>, that flows along nearly untouched unless a gate explicitly writes to it or erases part of it.</p>
  </div>
  <div class="prose">
    <p>The vanilla RNN had one piece of state, <em>h</em>, that got rewritten at every step. That rewrite is exactly where information from earlier steps gets lost. The LSTM's central idea is to <b>separate the highway from the workshop</b>:</p>

    <ul>
      <li>The <b>cell state <em>C</em></b> is a long-running memory that flows from step to step <em>almost untouched</em>. Picture a conveyor belt.</li>
      <li>The <b>hidden state <em>h</em></b> is the working output, the same role <em>h</em> played in the vanilla RNN.</li>
      <li>Three small <b>gates</b> decide what to forget, what to write, and what to read.</li>
    </ul>

    <div class="fig">
      <div class="fig-title"><strong>The conveyor belt</strong><span>cell state C flows mostly straight through</span></div>
      <div id="fig-belt"></div>
    </div>
    <div class="caption">The straight horizontal arrow is the cell state. Most of the time it passes through nearly unchanged; the gates intervene only where they need to.</div>

    <p>The genius is in that "almost." If the gates choose, the cell state can carry information across hundreds of steps with little or no decay. The gradient flowing backward along the belt is multiplied only by the forget gate's activation at each step, so as long as that gate stays near 1 it doesn't vanish.</p>
  </div>
</article>

<!-- ========== 23 Three gates ========== -->
<article id="lstm-gates" class="screen" data-screen-label="23 Three gates">
  <div class="section-head">
    <div class="section-eyebrow">Part IV · Section 23 · 7 min</div>
    <h2>Three gates: forget, input, output.</h2>
    <p class="section-lede">Each gate is a sigmoid layer producing a vector of values between 0 and 1, multiplied element-wise into the signal it controls: the old cell state, the candidate, or the read-out. 0 = closed, 1 = open.</p>
  </div>
  <div class="prose">
    <div class="fig fig-wide">
      <div class="fig-title"><strong>The three gates</strong><span>sigmoid → mask → modify cell</span></div>
      <div id="fig-three-gates"></div>
    </div>
    <div class="caption">Each gate looks at the current input and previous hidden state, and decides — independently and per-dimension — how much information to let through.</div>

    <h3>Forget gate · what to drop</h3>
    <div class="eq" style="font-size:18px">f<span class="sub">t</span> = σ(W<span class="sub">f</span>·[h<span class="sub">t−1</span>, x<span class="sub">t</span>] + b<span class="sub">f</span>)<span class="lbl">multiplies into C — values close to 0 erase memory</span></div>
    <p>"Should I remember the subject of the sentence after seeing a period? Probably not — close the gate."</p>

    <h3>Input gate · what to write</h3>
    <div class="eq" style="font-size:18px">i<span class="sub">t</span> = σ(W<span class="sub">i</span>·[h<span class="sub">t−1</span>, x<span class="sub">t</span>] + b<span class="sub">i</span>) &nbsp;&nbsp; C̃<span class="sub">t</span> = tanh(W<span class="sub">C</span>·[h<span class="sub">t−1</span>, x<span class="sub">t</span>] + b<span class="sub">C</span>)<span class="lbl">i decides how much; C̃ is the candidate to write</span></div>
    <p>"Just saw a new subject. Open the input gate, write it to memory."</p>

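    <h3>Cell update · combining forget and write</h3>
    <div class="eq" style="font-size:18px">C<span class="sub">t</span> = f<span class="sub">t</span> ⊙ C<span class="sub">t−1</span> + i<span class="sub">t</span> ⊙ C̃<span class="sub">t</span><span class="lbl">keep what the forget gate allows, add what the input gate admits</span></div>
    <p>This is the only place the cell state changes: the old memory is scaled by f, then the gated candidate is added.</p>

    <h3>Output gate · what to read</h3>
    <div class="eq" style="font-size:18px">o<span class="sub">t</span> = σ(W<span class="sub">o</span>·[h<span class="sub">t−1</span>, x<span class="sub">t</span>] + b<span class="sub">o</span>) &nbsp;&nbsp; h<span class="sub">t</span> = o<span class="sub">t</span> ⊙ tanh(C<span class="sub">t</span>)<span class="lbl">controls what part of memory becomes the hidden output</span></div>
    <p>"Predicting a verb? I need the subject's number: open the output gate on those dims."</p>

    <p>These equations together describe the entire LSTM step. They look intimidating; they're really just four small dense layers, three with sigmoid, one with tanh, glued together with element-wise multiplications and one addition.</p>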
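
    <p>As a rough sketch of how these pieces fit together, the whole step can be written in a few lines of NumPy. The function name <code>lstm_step</code>, the weight layout (one matrix per gate acting on the concatenation [h, x]), and the toy sizes are illustrative assumptions, not Keras internals:</p>
    <pre class="code"><span class="code-tag">numpy · sketch</span>import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W and b are dicts keyed by 'f', 'i', 'c', 'o' (assumed layout)."""
    hx = np.concatenate([h_prev, x_t])        # [h_(t-1), x_t]
    f = sigmoid(W['f'] @ hx + b['f'])         # forget gate
    i = sigmoid(W['i'] @ hx + b['i'])         # input gate
    c_tilde = np.tanh(W['c'] @ hx + b['c'])   # candidate values
    o = sigmoid(W['o'] @ hx + b['o'])         # output gate
    c_t = f * c_prev + i * c_tilde            # cell update
    h_t = o * np.tanh(c_t)                    # hidden output
    return h_t, c_t

# Toy sizes (hypothetical): 3 input features, 4 hidden units
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in 'fico'}
b = {k: np.zeros(n_hid) for k in 'fico'}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h.shape, c.shape)</pre>
    <p>Both returned vectors have one entry per hidden unit. Production implementations usually fuse the four matrices into one for speed; keeping them separate here mirrors the equations.</p>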
  </div>
</article>

<!-- ========== 24 One step walked through ========== -->
<article id="lstm-walk" class="screen" data-screen-label="24 One step walked through">
  <div class="section-head">
    <div class="section-eyebrow">Part IV · Section 24 · 6 min · animated</div>
    <h2>A single LSTM step, animated.</h2>
    <p class="section-lede">Watch a token enter the cell. Watch the gates compute. Watch the cell state update. Watch the new hidden state emerge.</p>
  </div>
  <div class="prose">
    <div class="fig fig-wide" id="fig-lstm-walk-mount"></div>
    <div class="caption">Phase 1: forget. Phase 2: input + candidate. Phase 3: cell update. Phase 4: output. The cell state belt is amber; the hidden state line is dashed black; each gate pulses as it activates, and the belt thickens during the cell update.</div>

    <h3>Why this fixes the vanishing gradient</h3>
    <p>Trace the gradient backward along the cell state. At each step it's multiplied by the <em>forget gate's</em> activation, which the network can <em>learn</em> to keep close to 1 when long-range memory matters. Compare this to the vanilla RNN, where the gradient is multiplied at every step by the same weight matrix <em>W</em><sub>hh</sub> (and by the tanh derivative, which is at most 1).</p>
    <p>Result: LSTMs can learn dependencies across hundreds, sometimes thousands of time steps. Vanilla RNNs typically struggle beyond roughly 10–20.</p>
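    <p>To get a feel for the numbers, here is a back-of-the-envelope check in Python. It treats the per-step factor as a single constant scalar, which is a simplifying assumption (real gates are vectors that change from step to step):</p>
    <pre class="code"><span class="code-tag">python · sketch</span>steps = 100
for factor in (1.0, 0.99, 0.9, 0.5):
    # scale of the gradient after flowing back through 100 steps
    print(factor, factor ** steps)</pre>
    <p>A factor of 0.99 leaves roughly a third of the gradient after 100 steps; 0.9 leaves about 0.00003; 0.5 leaves effectively nothing. A learned forget gate can sit near 1 exactly where it matters.</p>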
  </div>
</article>

<!-- ========== 25 Gate playground ========== -->
<article id="lstm-play" class="screen" data-screen-label="25 Gate playground">
  <div class="section-head">
    <div class="section-eyebrow">Part IV · Section 25 · 6 min · interactive</div>
    <h2>Gate playground: set the gates, watch memory survive (or die).</h2>
    <p class="section-lede">Drag the three gate sliders. The chart shows what happens to a piece of information that enters memory at step 0, over 30 time steps.</p>
  </div>
  <div class="prose">
    <div class="fig fig-wide" id="fig-lstm-play-mount"></div>
    <div class="caption">Try this: set <em>forget</em> = 1, <em>input</em> = 1, <em>output</em> = 1. The signal written at step 0 is preserved across all 30 steps. Now drop forget to 0.6 and watch decay return.</div>

    <h3>Three preset scenarios to try</h3>
    <ul>
      <li><b>Holding memory.</b> forget = 1.0, input = 1.0, output = 1.0. The pulse written at step 0 arrives untouched at step 29.</li>
      <li><b>Decay (vanilla RNN-like).</b> forget = 0.7. The signal drops to ~0.04 by step 9 and to about 0.00003 by step 29: long gone.</li>
      <li><b>Overwrite.</b> forget = 0.0, input = 1.0. Each step wipes the previous cell state, so the pulse written at step 0 is gone by step 1. In a real LSTM, this is how old memory gets replaced by a new input.</li>
    </ul>
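
    <p>The same experiment can be reproduced outside the browser. A minimal sketch, assuming constant gate values and a single candidate of 1.0 offered at step 0; the function name <code>simulate</code> is made up for illustration, and unlike the widget it does not force the cell to 1.0 when the input gate is near zero:</p>
    <pre class="code"><span class="code-tag">python · sketch</span>import math

def simulate(forget, inp, output, steps=30):
    """Constant gates; a candidate of 1.0 is offered at step 0 only."""
    c, trace = 0.0, []
    for t in range(steps):
        cand = 1.0 if t == 0 else 0.0
        c = forget * c + inp * cand                 # cell update
        trace.append((c, output * math.tanh(c)))    # (cell state, hidden state)
    return trace

for name, gates in {'hold': (1.0, 1.0, 1.0), 'decay': (0.7, 1.0, 1.0)}.items():
    c_final, h_final = simulate(*gates)[-1]
    print(name, round(c_final, 5), round(h_final, 5))</pre>
    <p>With forget = 1.0 the cell still holds 1.0 at step 29, while the hidden state reads tanh(1) ≈ 0.76, because the read-out is squashed even when the output gate is fully open.</p>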
  </div>
</article>

<!-- ========== 26 LSTM in Keras ========== -->
<article id="lstm-code" class="screen" data-screen-label="26 LSTM in Keras">
  <div class="section-head">
    <div class="section-eyebrow">Part IV · Section 26 · 4 min</div>
    <h2>An LSTM in Keras.</h2>
    <p class="section-lede">Same Keras one-liner discipline. Same training loop. The architecture takes care of itself.</p>
  </div>
  <div class="prose">
    <pre class="code"><span class="code-tag">tensorflow / keras</span><span class="kw">from</span> tensorflow.keras <span class="kw">import</span> layers, models

<span class="com"># Stock-price next-day prediction · time series</span>
WINDOW, FEATURES = <span class="num">60</span>, <span class="num">5</span>          <span class="com"># 60 days × (open, high, low, close, vol)</span>

model = models.<span class="fn">Sequential</span>([
    layers.<span class="fn">Input</span>(shape=(WINDOW, FEATURES)),
    layers.<span class="fn">LSTM</span>(<span class="num">64</span>, return_sequences=<span class="kw">True</span>),    <span class="com"># emit every step so a second LSTM can stack</span>
    layers.<span class="fn">Dropout</span>(<span class="num">0.2</span>),
    layers.<span class="fn">LSTM</span>(<span class="num">32</span>),                          <span class="com"># last-step output</span>
    layers.<span class="fn">Dense</span>(<span class="num">1</span>)                              <span class="com"># next-day close</span>
])
model.<span class="fn">compile</span>(optimizer=<span class="str">'adam'</span>, loss=<span class="str">'mse'</span>)
model.<span class="fn">fit</span>(X_train, y_train, epochs=<span class="num">20</span>, batch_size=<span class="num">32</span>,
          validation_split=<span class="num">0.1</span>)</pre>

    <h3>Things to notice</h3>
    <ul>
      <li><b>Stacked LSTMs.</b> First layer with <code>return_sequences=True</code> emits a state per step; that becomes the input to the second LSTM. Second one keeps only the final state for the dense head.</li>
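      <li><b>Same compile/fit pattern.</b> The framework handles BPTT for you. Gradient clipping is not applied by default; pass <code>clipnorm</code> or <code>clipvalue</code> to the optimizer if gradients explode.</li>
      <li><b>Input shape <code>(window, features)</code>.</b> Time first, features second; the batch dimension is implicit.</li>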
    </ul>
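
    <p>The snippet above assumes <code>X_train</code> and <code>y_train</code> already exist. One way to build them from a (days, features) array, and to opt in to gradient clipping, is sketched below. The random <code>series</code> is placeholder data rather than real prices, and the column order (close at index 3) is taken from the comment in the model snippet:</p>
    <pre class="code"><span class="code-tag">numpy / keras · sketch</span>import numpy as np
from tensorflow import keras

WINDOW, FEATURES = 60, 5
series = np.random.rand(500, FEATURES).astype('float32')   # placeholder, not real prices

# Each sample is 60 consecutive days; the target is the following day's close (column 3)
n = len(series) - WINDOW
X_train = np.stack([series[i:i + WINDOW] for i in range(n)])
y_train = series[WINDOW:, 3]
print(X_train.shape, y_train.shape)

# Clipping is opt-in: cap each gradient's norm at 1.0
optimizer = keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
# then: model.compile(optimizer=optimizer, loss='mse')</pre>
    <p>For real time series, split train and validation by date rather than at random, so the model is never validated on days that precede its training data.</p>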

    <div class="callout">
      <div class="callout-title">When to use LSTM vs Transformer in 2025</div>
      <p>Transformers won natural language. For most text tasks today — translation, summarization, sentiment — start with a pretrained transformer (BERT, T5, or an LLM API). Reach for LSTMs when: (1) sequences are very long but mostly local in dependency, (2) you have small data and limited compute, or (3) the input is genuinely streaming (audio, IoT, financial ticks) and you want a stateful model. They're far from obsolete; just no longer the default for text.</p>
    </div>
  </div>
</article>

<!-- ========== 27 Closing ========== -->
<article id="closing" class="screen" data-screen-label="27 What to study next">
  <div class="section-head">
    <div class="section-eyebrow">Wrap · Section 27 · 4 min</div>
    <h2>What you know now, and where to go.</h2>
    <p class="section-lede">Two hours ago we started with a single neuron. You now understand the four architectures that powered most of deep learning's first decade — and the conceptual seeds for what came after.</p>
  </div>
  <div class="prose">
    <div class="fig fig-soft">
      <div class="fig-title"><strong>The arc you walked</strong><span>each rung adds an inductive bias</span></div>
      <div id="fig-arc"></div>
    </div>

    <h3>Where to go next</h3>
    <div class="card-row">
      <div class="card"><div class="card-tag">depth</div><div class="card-h">Train the four models</div><div class="card-b">Build the four Keras snippets in a Colab. There's a chasm between reading and running. Cross it this week.</div></div>
      <div class="card"><div class="card-tag">breadth</div><div class="card-h">Transformers</div><div class="card-b">Self-attention generalizes the recurrence idea: every step looks at every other step. The architecture behind GPT and friends. Read "The Illustrated Transformer."</div></div>
      <div class="card"><div class="card-tag">vision</div><div class="card-h">ResNet & Vision Transformers</div><div class="card-b">Residual connections solved CNN depth. ViTs apply attention to image patches. The frontier of computer vision is here.</div></div>
      <div class="card"><div class="card-tag">tooling</div><div class="card-h">PyTorch + JAX</div><div class="card-b">Once Keras feels fluent, go one level lower. PyTorch for research; JAX for performance.</div></div>
    </div>

    <h3>The single most important habit</h3>
    <p class="aside">When you read a new architecture paper, ask one question: <em>what inductive bias is being built in?</em> CNNs assume locality. RNNs assume temporal order. LSTMs assume long-range memory matters. Transformers assume any pair of positions might matter. Once you can name the bias, the model stops being a black box and becomes an opinion about your data.</p>

    <p>Two hours well spent. Now go train something.</p>

    <div style="margin-top:48px; padding-top:24px; border-top:1px solid var(--line); display:flex; justify-content:space-between; align-items:baseline; font-family:var(--mono); font-size:11px; color:var(--muted); letter-spacing:0.06em; text-transform:uppercase;">
      <span>END · 27 of 27 sections</span>
      <span>Deep Learning · 2-hour session</span>
    </div>
  </div>
</article>
  `;
  if (ANCHOR) ANCHOR.insertAdjacentHTML('afterend', html);

  // ============================================================
  // Static figures
  // ============================================================
  const belt = document.getElementById('fig-belt');
  if (belt) {
    belt.innerHTML = `
      <svg viewBox="0 0 760 200" width="100%">
        <!-- conveyor belt: cell state -->
        <line x1="40" y1="60" x2="720" y2="60" stroke="#b8860b" stroke-width="6" stroke-linecap="round"/>
        <text x="40" y="40" font-family="JetBrains Mono" font-size="10" fill="#b8860b" letter-spacing="1">CELL STATE C — long-running memory</text>

        <!-- ticks along belt with valve symbols -->
        ${[180, 360, 540].map((x, i) => `
          <circle cx="${x}" cy="60" r="14" fill="#fbfaf6" stroke="#b8860b" stroke-width="1.6"/>
          <line x1="${x-7}" y1="60" x2="${x+7}" y2="60" stroke="#b8860b" stroke-width="2"/>
          <line x1="${x}" y1="53" x2="${x}" y2="67" stroke="#b8860b" stroke-width="2"/>
          <text x="${x}" y="92" text-anchor="middle" font-family="JetBrains Mono" font-size="10" fill="#b8860b">gate</text>
        `).join('')}

        <!-- hidden state line below -->
        <line x1="40" y1="140" x2="720" y2="140" stroke="#1a1a1a" stroke-width="1" stroke-dasharray="3 3"/>
        <text x="40" y="160" font-family="JetBrains Mono" font-size="10" fill="#8a877f" letter-spacing="1">HIDDEN STATE h — working output, like vanilla RNN</text>

        <!-- input arrows -->
        ${[180, 360, 540].map(x => `
          <line x1="${x}" y1="180" x2="${x}" y2="148" stroke="#1f6feb" stroke-width="1.2" marker-end="url(#beltArr)"/>
          <text x="${x+12}" y="188" font-family="JetBrains Mono" font-size="10" fill="#1f6feb">x</text>
          <line x1="${x}" y1="74" x2="${x}" y2="132" stroke="#b8860b" stroke-width="1" stroke-dasharray="2 2"/>
        `).join('')}

        <defs><marker id="beltArr" markerWidth="8" markerHeight="8" refX="6" refY="4" orient="auto"><polygon points="0 0, 8 4, 0 8" fill="#1f6feb"/></marker></defs>

        <!-- moving particle along belt -->
        <circle r="4" fill="#b8860b">
          <animate attributeName="cx" values="40;720" dur="6s" repeatCount="indefinite"/>
          <animate attributeName="cy" values="60;60" dur="6s" repeatCount="indefinite"/>
        </circle>
      </svg>
    `;
  }

  const tg = document.getElementById('fig-three-gates');
  if (tg) {
    tg.innerHTML = `
      <svg viewBox="0 0 880 280" width="100%">
        <!-- cell box -->
        <rect x="60" y="40" width="760" height="200" fill="#fbfaf6" stroke="#1a1a1a" stroke-width="1.4" rx="8"/>
        <text x="440" y="30" text-anchor="middle" font-family="JetBrains Mono" font-size="10" fill="#8a877f" letter-spacing="1">LSTM CELL · ONE STEP</text>

        <!-- top conveyor: cell state -->
        <line x1="80" y1="80" x2="800" y2="80" stroke="#b8860b" stroke-width="4"/>
        <text x="86" y="74" font-family="Fraunces" font-size="13" fill="#b8860b" font-weight="600">C<tspan font-size="9">t-1</tspan></text>
        <text x="780" y="74" font-family="Fraunces" font-size="13" fill="#b8860b" font-weight="600">C<tspan font-size="9">t</tspan></text>

        <!-- bottom hidden -->
        <line x1="80" y1="220" x2="800" y2="220" stroke="#1a1a1a" stroke-width="1" stroke-dasharray="3 3"/>
        <text x="86" y="238" font-family="Fraunces" font-size="13" font-weight="600">h<tspan font-size="9">t-1</tspan></text>
        <text x="780" y="238" font-family="Fraunces" font-size="13" font-weight="600">h<tspan font-size="9">t</tspan></text>

        <!-- forget gate -->
        <g>
          <circle cx="200" cy="80" r="18" fill="#f6e4d8" stroke="#c84e1d" stroke-width="1.6"/>
          <text x="200" y="85" text-anchor="middle" font-family="Fraunces" font-size="14" font-weight="600" fill="#c84e1d">×</text>
          <line x1="200" y1="120" x2="200" y2="98" stroke="#c84e1d" stroke-width="1.4" marker-end="url(#gArr)"/>
          <rect x="170" y="124" width="60" height="32" fill="#fbfaf6" stroke="#c84e1d" stroke-width="1.4" rx="3"/>
          <text x="200" y="144" text-anchor="middle" font-family="Fraunces" font-size="13" font-weight="600" fill="#c84e1d">σ</text>
          <text x="200" y="172" text-anchor="middle" font-family="JetBrains Mono" font-size="10" fill="#c84e1d">FORGET f</text>
        </g>

        <!-- input gate + candidate -->
        <g>
          <circle cx="380" cy="80" r="18" fill="#dde7f7" stroke="#1f6feb" stroke-width="1.6"/>
          <text x="380" y="85" text-anchor="middle" font-family="Fraunces" font-size="14" font-weight="600" fill="#1f6feb">+</text>
          <rect x="350" y="124" width="60" height="32" fill="#fbfaf6" stroke="#1f6feb" stroke-width="1.4" rx="3"/>
          <text x="365" y="144" text-anchor="middle" font-family="Fraunces" font-size="13" font-weight="600" fill="#1f6feb">σ</text>
          <text x="395" y="144" text-anchor="middle" font-family="Fraunces" font-size="13" font-weight="600" fill="#1f6feb">tanh</text>
          <line x1="380" y1="120" x2="380" y2="98" stroke="#1f6feb" stroke-width="1.4" marker-end="url(#gArr2)"/>
          <text x="380" y="172" text-anchor="middle" font-family="JetBrains Mono" font-size="10" fill="#1f6feb">INPUT i · candidate C̃</text>
        </g>

        <!-- output gate -->
        <g>
          <circle cx="640" cy="170" r="18" fill="#d6e8de" stroke="#1a7a4c" stroke-width="1.6"/>
          <text x="640" y="175" text-anchor="middle" font-family="Fraunces" font-size="14" font-weight="600" fill="#1a7a4c">×</text>
          <line x1="640" y1="80" x2="640" y2="152" stroke="#1a7a4c" stroke-width="1.2" stroke-dasharray="2 2"/>
          <text x="668" y="120" font-family="Fraunces" font-size="11" fill="#1a7a4c">tanh(C)</text>
          <line x1="640" y1="188" x2="640" y2="220" stroke="#1a7a4c" stroke-width="1.4" marker-end="url(#gArr3)"/>
          <rect x="610" y="120" width="60" height="32" fill="#fbfaf6" stroke="#1a7a4c" stroke-width="1.4" rx="3" opacity="0"/>
          <text x="640" y="252" text-anchor="middle" font-family="JetBrains Mono" font-size="10" fill="#1a7a4c">OUTPUT o</text>
        </g>

        <defs>
          <marker id="gArr" markerWidth="6" markerHeight="6" refX="5" refY="3" orient="auto"><polygon points="0 0, 6 3, 0 6" fill="#c84e1d"/></marker>
          <marker id="gArr2" markerWidth="6" markerHeight="6" refX="5" refY="3" orient="auto"><polygon points="0 0, 6 3, 0 6" fill="#1f6feb"/></marker>
          <marker id="gArr3" markerWidth="6" markerHeight="6" refX="5" refY="3" orient="auto"><polygon points="0 0, 6 3, 0 6" fill="#1a7a4c"/></marker>
        </defs>
      </svg>
    `;
  }

  const arc = document.getElementById('fig-arc');
  if (arc) {
    arc.innerHTML = `
      <svg viewBox="0 0 760 220" width="100%">
        ${[
          {x:60,  c:'#1f6feb', bg:'#dde7f7', n:'ANN',  s:'+ nonlinearity'},
          {x:240, c:'#c84e1d', bg:'#f6e4d8', n:'CNN',  s:'+ spatial weight sharing'},
          {x:420, c:'#1a7a4c', bg:'#d6e8de', n:'RNN',  s:'+ temporal weight sharing'},
          {x:600, c:'#b8860b', bg:'#f1e4c2', n:'LSTM', s:'+ gated memory'},
        ].map((s, i, arr) => `
          <circle cx="${s.x+60}" cy="100" r="44" fill="${s.bg}" stroke="${s.c}" stroke-width="1.6"/>
          <text x="${s.x+60}" y="96" text-anchor="middle" font-family="Fraunces" font-size="20" font-weight="600" fill="${s.c}">${s.n}</text>
          <text x="${s.x+60}" y="170" text-anchor="middle" font-family="JetBrains Mono" font-size="10" fill="#8a877f" letter-spacing="1">${s.s}</text>
          ${i < arr.length-1 ? `<path d="M ${s.x+108} 100 L ${arr[i+1].x+12} 100" stroke="#8a877f" stroke-width="1.2" stroke-dasharray="3 3" marker-end="url(#arcArr)"/>` : ''}
        `).join('')}
        <defs><marker id="arcArr" markerWidth="8" markerHeight="8" refX="6" refY="4" orient="auto"><polygon points="0 0, 8 4, 0 8" fill="#8a877f"/></marker></defs>
      </svg>
    `;
  }

  // ============================================================
  // React figures
  // ============================================================
  window.__mountLstmFigures = function() {
    if (window.__lstmMounted) return;
    if (!window.React || !window.ReactDOM) return;
    window.__lstmMounted = true;
    const { useState, useEffect, useMemo } = React;

    // ---------- LSTM walk-through ----------
    function LstmWalkFig() {
      const [phase, setPhase] = useState(0); // 0..3
      const [playing, setPlaying] = useState(true);
      useEffect(() => {
        if (!playing) return;
        const id = setInterval(() => setPhase(p => (p+1) % 4), 1500);
        return () => clearInterval(id);
      }, [playing]);

      const W = 920, H = 320;

      // colors based on phase
      const fActive = phase >= 0;
      const iActive = phase >= 1;
      const cellUpd = phase >= 2;
      const oActive = phase >= 3;

      const phaseLabels = [
        '1. forget — what to drop from cell state',
        '2. input — what new info to write',
        '3. update — combine forget + input into new cell state',
        '4. output — read out the new hidden state',
      ];

      return (
        <div>
          <div className="fig-title"><strong>LSTM step, phase by phase</strong>
            <span style={{color:'#c84e1d', fontWeight:600}}>{phaseLabels[phase]}</span>
          </div>
          <svg viewBox={`0 0 ${W} ${H}`} width="100%">
            {/* cell box */}
            <rect x="60" y="40" width={W-120} height="220" fill="#fbfaf6" stroke="#1a1a1a" strokeWidth="1.2" rx="8"/>
            {/* top belt */}
            <line x1="80" y1="100" x2={W-80} y2="100" stroke="#b8860b" strokeWidth={cellUpd ? 6 : 4} opacity={cellUpd ? 1 : 0.6}/>
            <text x="92" y="92" fontFamily="Fraunces" fontSize="14" fontWeight="600" fill="#b8860b">C<tspan fontSize="10">t-1</tspan></text>
            <text x={W-110} y="92" fontFamily="Fraunces" fontSize="14" fontWeight="600" fill="#b8860b">C<tspan fontSize="10">t</tspan></text>

            {/* bottom hidden */}
            <line x1="80" y1="240" x2={W-80} y2="240" stroke="#1a1a1a" strokeWidth="1" strokeDasharray="3 3"/>
            <text x="92" y="258" fontFamily="Fraunces" fontSize="14" fontWeight="600">h<tspan fontSize="10">t-1</tspan></text>
            <text x={W-110} y="258" fontFamily="Fraunces" fontSize="14" fontWeight="600" fill={oActive ? '#1a7a4c' : '#8a877f'}>h<tspan fontSize="10">t</tspan></text>

            {/* forget gate */}
            <g opacity={fActive ? 1 : 0.25}>
              <circle cx="220" cy="100" r="22" fill="#f6e4d8" stroke="#c84e1d" strokeWidth={fActive ? 2.2 : 1.4}/>
              <text x="220" y="106" textAnchor="middle" fontFamily="Fraunces" fontSize="16" fontWeight="600" fill="#c84e1d">×</text>
              <text x="220" y="148" textAnchor="middle" fontFamily="JetBrains Mono" fontSize="10" fill="#c84e1d">forget · f</text>
              {phase === 0 && <circle cx="220" cy="100" r="22" fill="none" stroke="#c84e1d" strokeWidth="2"><animate attributeName="r" values="22;30;22" dur="1.5s" repeatCount="indefinite"/><animate attributeName="opacity" values="1;0;1" dur="1.5s" repeatCount="indefinite"/></circle>}
            </g>

            {/* input gate + candidate */}
            <g opacity={iActive ? 1 : 0.25}>
              <circle cx="420" cy="100" r="22" fill="#dde7f7" stroke="#1f6feb" strokeWidth={iActive ? 2.2 : 1.4}/>
              <text x="420" y="106" textAnchor="middle" fontFamily="Fraunces" fontSize="16" fontWeight="600" fill="#1f6feb">+</text>
              <text x="420" y="148" textAnchor="middle" fontFamily="JetBrains Mono" fontSize="10" fill="#1f6feb">input · i ⊙ C̃</text>
              {phase === 1 && <circle cx="420" cy="100" r="22" fill="none" stroke="#1f6feb" strokeWidth="2"><animate attributeName="r" values="22;30;22" dur="1.5s" repeatCount="indefinite"/><animate attributeName="opacity" values="1;0;1" dur="1.5s" repeatCount="indefinite"/></circle>}
            </g>

            {/* output gate */}
            <g opacity={oActive ? 1 : 0.25}>
              <circle cx="700" cy="180" r="22" fill="#d6e8de" stroke="#1a7a4c" strokeWidth={oActive ? 2.2 : 1.4}/>
              <text x="700" y="186" textAnchor="middle" fontFamily="Fraunces" fontSize="16" fontWeight="600" fill="#1a7a4c">×</text>
              <line x1="700" y1="106" x2="700" y2="158" stroke="#1a7a4c" strokeWidth="1.2" strokeDasharray="2 2"/>
              <text x="730" y="140" fontFamily="Fraunces" fontSize="11" fill="#1a7a4c">tanh(C)</text>
              <line x1="700" y1="202" x2="700" y2="240" stroke="#1a7a4c" strokeWidth="1.4"/>
              <text x="700" y="278" textAnchor="middle" fontFamily="JetBrains Mono" fontSize="10" fill="#1a7a4c">output · o</text>
              {phase === 3 && <circle cx="700" cy="180" r="22" fill="none" stroke="#1a7a4c" strokeWidth="2"><animate attributeName="r" values="22;30;22" dur="1.5s" repeatCount="indefinite"/><animate attributeName="opacity" values="1;0;1" dur="1.5s" repeatCount="indefinite"/></circle>}
            </g>

            {/* current input x_t arrow up from below */}
            <line x1="540" y1="290" x2="540" y2="240" stroke="#1f6feb" strokeWidth="1.4" markerEnd="url(#walkArr)"/>
            <text x="540" y="305" textAnchor="middle" fontFamily="Fraunces" fontSize="14" fontWeight="600" fill="#1f6feb">x<tspan fontSize="10">t</tspan></text>

            <defs><marker id="walkArr" markerWidth="6" markerHeight="6" refX="5" refY="3" orient="auto"><polygon points="0 0, 6 3, 0 6" fill="#1f6feb"/></marker></defs>
          </svg>
          <div className="fig-controls">
            <button className="btn-ghost btn-sm" onClick={() => setPlaying(p => !p)}>{playing ? '⏸ Pause' : '▶ Play'}</button>
            <span className="ctrl-label" style={{marginLeft:12}}>phase</span>
            <input type="range" min="0" max="3" value={phase} onChange={e => { setPlaying(false); setPhase(+e.target.value); }} />
            <span className="ctrl-value">{phase+1}/4</span>
          </div>
        </div>
      );
    }
    const lwm = document.getElementById('fig-lstm-walk-mount');
    if (lwm) ReactDOM.createRoot(lwm).render(<LstmWalkFig/>);

    // ---------- Gate playground ----------
    function LstmPlayFig() {
      const [forget, setForget] = useState(1.0);
      const [input, setInput] = useState(0.0);
      const [output, setOutput] = useState(1.0);
      const T = 30;
      // simulate: a candidate of 1.0 is offered at step 0 only (0 afterwards), so
      // each step c = forget*c + input*cand, and h = output * tanh(c).
      const data = useMemo(() => {
        const arr = [];
        let c = 0;
        for (let t = 0; t < T; t++) {
          // initial pulse: at step 0, write a 1.0 candidate via input gate
          const cand = (t === 0) ? 1.0 : 0.0;
          c = forget * c + input * cand;
          // override: at step 0, force c = 1 if user has input gate near 0 — this lets us study just forget.
          if (t === 0 && input < 0.1) c = 1.0;
          const h = output * Math.tanh(c);
          arr.push({ c, h });
        }
        return arr;
      }, [forget, input, output]);

      const W = 880, H = 280;
      const px0 = 60, py0 = 30, pw = W - 120, ph = H - 80;
      const X = (i) => px0 + (i / (T-1)) * pw;
      const Yc = (v) => py0 + (1 - Math.max(-0.1, Math.min(1.1, v))) * ph * 0.5;
      const Yh = (v) => py0 + ph*0.5 + (1 - Math.max(-0.1, Math.min(1.1, v))) * ph * 0.5;

      const cPath = data.map((d,i)=>(i?'L':'M')+X(i).toFixed(1)+' '+Yc(d.c).toFixed(1)).join(' ');
      const hPath = data.map((d,i)=>(i?'L':'M')+X(i).toFixed(1)+' '+Yh(d.h).toFixed(1)).join(' ');

      const cDots = data.map((d,i) => <circle key={'cd'+i} cx={X(i)} cy={Yc(d.c)} r={2.5} fill="#b8860b"/>);
      const hDots = data.map((d,i) => <circle key={'hd'+i} cx={X(i)} cy={Yh(d.h)} r={2.5} fill="#1a7a4c"/>);

      const finalC = data[T-1].c;
      const survival = finalC > 0.5 ? 'memory survives' : finalC > 0.1 ? 'memory fading' : 'memory lost';
      const survivalColor = finalC > 0.5 ? '#1a7a4c' : finalC > 0.1 ? '#b8860b' : '#c84e1d';

      return (
        <div>
          <div className="fig-title"><strong>Gate playground</strong><span>same memory pulse, different gates</span></div>
          <svg viewBox={`0 0 ${W} ${H}`} width="100%">
            <rect x={px0} y={py0} width={pw} height={ph} fill="none" stroke="#c9c2ad" strokeWidth="0.6"/>
            {/* zero lines */}
            <line x1={px0} y1={Yc(0)} x2={px0+pw} y2={Yc(0)} stroke="#c9c2ad" strokeWidth="0.4" strokeDasharray="2 2"/>
            <line x1={px0} y1={Yh(0)} x2={px0+pw} y2={Yh(0)} stroke="#c9c2ad" strokeWidth="0.4" strokeDasharray="2 2"/>
            {/* divider */}
            <line x1={px0} y1={py0+ph*0.5} x2={px0+pw} y2={py0+ph*0.5} stroke="#c9c2ad" strokeWidth="0.6"/>

            <text x={px0} y={py0-8} fontFamily="JetBrains Mono" fontSize="10" fill="#b8860b" letterSpacing="1">CELL STATE C</text>
            <text x={px0} y={py0+ph*0.5-6} fontFamily="JetBrains Mono" fontSize="10" fill="#1a7a4c" letterSpacing="1">HIDDEN STATE h = o · tanh(C)</text>

            <path d={cPath} stroke="#b8860b" strokeWidth="2" fill="none"/>
            <path d={hPath} stroke="#1a7a4c" strokeWidth="2" fill="none"/>
            {cDots}{hDots}

            <text x={px0} y={py0+ph+18} fontFamily="JetBrains Mono" fontSize="10" fill="#8a877f">step 0</text>
            <text x={px0+pw} y={py0+ph+18} fontFamily="JetBrains Mono" fontSize="10" fill="#8a877f" textAnchor="end">step {T-1}</text>
          </svg>

          <div className="fig-controls" style={{marginTop:12}}>
            <span className="ctrl-label" style={{minWidth:80, color:'#c84e1d'}}>forget gate</span>
            <input type="range" min="0" max="1" step="0.01" value={forget} onChange={e => setForget(+e.target.value)}/>
            <span className="ctrl-value" style={{color:'#c84e1d'}}>{forget.toFixed(2)}</span>
          </div>
          <div className="fig-controls">
            <span className="ctrl-label" style={{minWidth:80, color:'#1f6feb'}}>input gate</span>
            <input type="range" min="0" max="1" step="0.01" value={input} onChange={e => setInput(+e.target.value)}/>
            <span className="ctrl-value" style={{color:'#1f6feb'}}>{input.toFixed(2)}</span>
          </div>
          <div className="fig-controls">
            <span className="ctrl-label" style={{minWidth:80, color:'#1a7a4c'}}>output gate</span>
            <input type="range" min="0" max="1" step="0.01" value={output} onChange={e => setOutput(+e.target.value)}/>
            <span className="ctrl-value" style={{color:'#1a7a4c'}}>{output.toFixed(2)}</span>
          </div>
          <div style={{marginTop:12, display:'flex', justifyContent:'space-between', alignItems:'center'}}>
            <div style={{display:'flex', gap:8}}>
              <button className="btn-ghost btn-sm" onClick={() => { setForget(1.0); setInput(1.0); setOutput(1.0); }}>preset · hold</button>
              <button className="btn-ghost btn-sm" onClick={() => { setForget(0.7); setInput(0.0); setOutput(1.0); }}>preset · vanilla RNN decay</button>
              <button className="btn-ghost btn-sm" onClick={() => { setForget(0.0); setInput(1.0); setOutput(1.0); }}>preset · overwrite</button>
            </div>
            <div style={{fontFamily:'JetBrains Mono', fontSize:11, color:survivalColor, fontWeight:600}}>
              after 30 steps · C = {finalC.toFixed(3)} · {survival}
            </div>
          </div>
        </div>
      );
    }
    const lpm = document.getElementById('fig-lstm-play-mount');
    if (lpm) ReactDOM.createRoot(lpm).render(<LstmPlayFig/>);
  };
})();
