/* global React */

function Chapter03() {
  return (
    <section className="chapter" id="ch-03" data-screen-label="03 LLM primitives">
      <div className="chapter-header">
        <div className="eyebrow">Chapter 03 · Primitives</div>
        <h1 className="chapter-title">Messages, prompts, and chat models — the bottom of the stack.</h1>
        <p className="chapter-lede">
          Before any agent, before any chain, there are messages. If you understand exactly what gets sent to the API on
          each call and what comes back, you understand 80% of what LangChain does.
        </p>
      </div>

      <SectionTitle num="3.1">Everything is a message list</SectionTitle>
      <p>
        Modern chat APIs (OpenAI, Anthropic, Gemini, etc.) all share the same basic shape: you send a list of messages,
        you get a message back. LangChain wraps that shape in a small class hierarchy that gives you typing and helpers:
      </p>
      <CodeBlock file="messages.py">{`from langchain_core.messages import (
    SystemMessage,    # Instructions to the model. Usually first.
    HumanMessage,     # User input.
    AIMessage,        # Model output. May contain tool_calls.
    ToolMessage,      # Result of a tool execution.
)

messages = [
    SystemMessage(content="You are a helpful weather assistant."),
    HumanMessage(content="What's the weather in Tokyo?"),
]`}</CodeBlock>

      <p>
        Every chat model in LangChain — <code>ChatOpenAI</code>, <code>ChatAnthropic</code>, <code>ChatGoogleGenerativeAI</code>, <code>ChatOllama</code> —
        has the same interface: <code>.invoke(messages) → AIMessage</code>.
      </p>

      <CodeBlock file="basic_call.py">{`from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
response = model.invoke(messages)

print(type(response))     # <class 'langchain_core.messages.ai.AIMessage'>
print(response.content)   # "I don't have real-time data, but..."`}</CodeBlock>

      <Callout kind="intuition" title="AIMessage is the universal output type">
        Whether the model returned plain text, a tool call, or both, you always get back an <code>AIMessage</code>.
        The presence of <code>response.tool_calls</code> is what tells the agent loop "the model wants to act."
      </Callout>
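
      <p>
        To make that concrete, here is a minimal sketch (the <code>get_weather</code> tool is a stand-in): bind a tool to
        the model, then branch on whether the reply carries <code>tool_calls</code>. The point is only that the
        answer-or-act signal lives on the <code>AIMessage</code> itself.
      </p>
      <CodeBlock file="tool_call_check.py">{`from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It is sunny in {city}."  # stand-in for a real lookup

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
model_with_tools = model.bind_tools([get_weather])

response = model_with_tools.invoke("What's the weather in Tokyo?")

if response.tool_calls:                # the model wants to act
    call = response.tool_calls[0]
    print(call["name"], call["args"])  # get_weather {'city': 'Tokyo'}
else:                                  # the model answered in plain text
    print(response.content)`}</CodeBlock>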

      <SectionTitle num="3.2">Prompt templates — parameterized messages</SectionTitle>
      <p>
        Hardcoding strings into <code>HumanMessage(content=...)</code> doesn't scale. <code>ChatPromptTemplate</code> lets
        you define message slots with placeholders, then fill them at invoke time:
      </p>
      <CodeBlock file="prompt_template.py">{`from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {style} assistant. Respond in {language}."),
    ("human", "{question}"),
])

messages = prompt.format_messages(
    style="terse",
    language="English",
    question="Why is the sky blue?",
)
# Returns a real list[BaseMessage] you can pass to .invoke()`}</CodeBlock>
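
      <p>
        The formatted list plugs straight into any chat model. Here is a minimal sketch of two routes, reusing the
        template above (the <code>|</code> composition works because templates and models are both Runnables, the
        interface mentioned in 3.3):
      </p>
      <CodeBlock file="prompt_to_model.py">{`from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Route 1: format the messages yourself, then call the model
# ("prompt" is the ChatPromptTemplate defined above in prompt_template.py)
response = model.invoke(prompt.format_messages(
    style="terse", language="English", question="Why is the sky blue?"
))

# Route 2: compose template and model into one pipeline
chain = prompt | model
response = chain.invoke(
    {"style": "terse", "language": "English", "question": "Why is the sky blue?"}
)
print(response.content)`}</CodeBlock>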

      <p>
        For agents specifically, you'll see <code>MessagesPlaceholder</code> a lot. It reserves a slot for
        the <em>growing</em> message list (the scratchpad):
      </p>
      <CodeBlock file="agent_prompt.py">{`from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research assistant. Use tools when needed."),
    MessagesPlaceholder("chat_history"),     # past turns
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"), # tool calls + tool results
])`}</CodeBlock>
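
      <p>
        Each placeholder expands to however many messages you hand it: zero, one, or fifty. A quick sketch with an
        illustrative two-turn history and an empty scratchpad:
      </p>
      <CodeBlock file="agent_prompt_fill.py">{`from langchain_core.messages import AIMessage, HumanMessage

# "prompt" is the ChatPromptTemplate defined above in agent_prompt.py
messages = prompt.format_messages(
    chat_history=[
        HumanMessage(content="Hi, I'm researching solar panels."),
        AIMessage(content="Happy to help. What would you like to know?"),
    ],
    input="Compare monocrystalline and polycrystalline efficiency.",
    agent_scratchpad=[],  # empty on the first pass; grows as tools run
)
# system + 2 history turns + human input = 4 messages on this call`}</CodeBlock>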

      <SectionTitle num="3.3">Anatomy of a model call</SectionTitle>
      <p>Three model methods you'll use constantly:</p>
      <ul>
        <li><code>model.invoke(messages)</code> — synchronous, returns one <code>AIMessage</code>.</li>
        <li><code>model.stream(messages)</code> — yields <code>AIMessageChunk</code>s as tokens arrive.</li>
        <li><code>model.batch([msgs1, msgs2, ...])</code> — runs many inputs in parallel, returns a list of <code>AIMessage</code>s in the same order.</li>
      </ul>
      <p>And async variants: <code>ainvoke</code>, <code>astream</code>, <code>abatch</code>. Memorize this pattern — every Runnable supports it.</p>
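
      <p>
        A compact sketch of all three on one model (the questions are placeholders):
      </p>
      <CodeBlock file="call_styles.py">{`from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
messages = [HumanMessage(content="What's the weather in Tokyo?")]

# invoke: one call, one AIMessage
answer = model.invoke(messages)

# stream: AIMessageChunks as the tokens arrive
for chunk in model.stream(messages):
    print(chunk.content, end="", flush=True)

# batch: independent message lists, results come back in input order
answers = model.batch([
    [HumanMessage(content="Capital of France?")],
    [HumanMessage(content="Capital of Japan?")],
])`}</CodeBlock>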

      <SectionTitle num="3.4">Configuring the model</SectionTitle>
      <CodeBlock file="model_config.py">{`model = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,            # as deterministic as the provider allows; for agents, almost always 0
    max_tokens=1024,
    timeout=30,
    max_retries=2,
    model_kwargs={
        "response_format": {"type": "json_object"},  # provider-specific
    },
)`}</CodeBlock>

      <Callout kind="tip" title="For agents: temperature=0">
        Tool-calling agents need predictable structured output. A creative model that hallucinates a tool name half the
        time will burn your budget. Set <code>temperature=0</code> for the orchestrating LLM. Crank it up for sub-tasks
        like "write the email body."
      </Callout>
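
      <p>
        In practice that usually means two separately configured instances rather than one shared model object. A
        minimal sketch (the model name and the 0.7 are illustrative):
      </p>
      <CodeBlock file="two_temperatures.py">{`from langchain_openai import ChatOpenAI

orchestrator = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # decides which tool to call
writer = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)      # drafts the email body`}</CodeBlock>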

      <Callout kind="gotcha" title="Token limits silently truncate">
        If your message history exceeds the model's context window, you get a hard API error or — worse — silent
        truncation depending on the provider. Always know your context budget. We discuss trimming in Chapter 10.
      </Callout>
    </section>
  );
}

window.Chapter03 = Chapter03;
