/* global React */

function Chapter03() {
  return (
    <section className="chapter" id="ch-03" data-screen-label="03 LLM primitives">
      <div className="chapter-header">
        <div className="eyebrow">Chapter 03 · Primitives</div>
        <h1 className="chapter-title">Messages, prompts, and chat models — the bottom of the stack.</h1>
        <p className="chapter-lede">
          Before any agent, before any chain, there are messages. If you understand exactly what gets sent to the API on
          each call and what comes back, you understand 80% of what LangChain does.
        </p>
      </div>

      <SectionTitle num="3.1">Everything is a message list</SectionTitle>
      <p>
        Modern chat APIs (OpenAI, Anthropic, Gemini, etc.) all share the same basic shape: you send a list of messages,
        you get a message back. LangChain wraps that shape in a small class hierarchy that gives you typing and helpers:
      </p>
      <CodeBlock file="messages.py">{`from langchain_core.messages import (
    SystemMessage,    # Instructions to the model. Usually first.
    HumanMessage,     # User input.
    AIMessage,        # Model output. May contain tool_calls.
    ToolMessage,      # Result of a tool execution.
)

messages = [
    SystemMessage(content="You are a helpful weather assistant."),
    HumanMessage(content="What's the weather in Tokyo?"),
]`}</CodeBlock>

      <p>
        Every chat model in LangChain — <code>ChatOpenAI</code>, <code>ChatAnthropic</code>, <code>ChatGoogleGenerativeAI</code>, <code>ChatOllama</code> —
        has the same interface: <code>.invoke(messages) → AIMessage</code>.
      </p>

      <CodeBlock file="basic_call.py">{`from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
response = model.invoke(messages)

print(type(response))     # <class 'langchain_core.messages.ai.AIMessage'>
print(response.content)   # "I don't have real-time data, but..."`}</CodeBlock>

      <Callout kind="intuition" title="AIMessage is the universal output type">
        Whether the model returned plain text, a tool call, or both, you always get back an <code>AIMessage</code>.
        The presence of <code>response.tool_calls</code> is what tells the agent loop "the model wants to act."
      </Callout>
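
      <p>
        To make that concrete, here is a minimal sketch (the <code>get_weather</code> tool is a stand-in): bind a tool to
        the model, then branch on whether the reply carries <code>tool_calls</code>. The point is only that the
        answer-or-act signal lives on the <code>AIMessage</code> itself.
      </p>
      <CodeBlock file="tool_call_check.py">{`from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It is sunny in {city}."  # stand-in for a real lookup

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
model_with_tools = model.bind_tools([get_weather])

response = model_with_tools.invoke("What's the weather in Tokyo?")

if response.tool_calls:                # the model wants to act
    call = response.tool_calls[0]
    print(call["name"], call["args"])  # get_weather {'city': 'Tokyo'}
else:                                  # the model answered in plain text
    print(response.content)`}</CodeBlock>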

      <SectionTitle num="3.2">Prompt templates — parameterized messages</SectionTitle>
      <p>
        Hardcoding strings into <code>HumanMessage(content=...)</code> doesn't scale. <code>ChatPromptTemplate</code> lets
        you define message slots with placeholders, then fill them at invoke time:
      </p>
      <CodeBlock file="prompt_template.py">{`from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {style} assistant. Respond in {language}."),
    ("human", "{question}"),
])

messages = prompt.format_messages(
    style="terse",
    language="English",
    question="Why is the sky blue?",
)
# Returns a real list[BaseMessage] you can pass to .invoke()`}</CodeBlock>
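
      <p>
        The formatted list plugs straight into any chat model. Here is a minimal sketch of two routes, reusing the
        template above (the <code>|</code> composition works because templates and models are both Runnables, the
        interface mentioned in 3.3):
      </p>
      <CodeBlock file="prompt_to_model.py">{`from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Route 1: format the messages yourself, then call the model
# ("prompt" is the ChatPromptTemplate defined above in prompt_template.py)
response = model.invoke(prompt.format_messages(
    style="terse", language="English", question="Why is the sky blue?"
))

# Route 2: compose template and model into one pipeline
chain = prompt | model
response = chain.invoke(
    {"style": "terse", "language": "English", "question": "Why is the sky blue?"}
)
print(response.content)`}</CodeBlock>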

      <p>
        For agents specifically, you'll see <code>MessagesPlaceholder</code> a lot. It reserves a slot for
        the <em>growing</em> message list (the scratchpad):
      </p>
      <CodeBlock file="agent_prompt.py">{`from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research assistant. Use tools when needed."),
    MessagesPlaceholder("chat_history"),     # past turns
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"), # tool calls + tool results
])`}</CodeBlock>
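
      <p>
        Each placeholder expands to however many messages you hand it: zero, one, or fifty. A quick sketch with an
        illustrative two-turn history and an empty scratchpad:
      </p>
      <CodeBlock file="agent_prompt_fill.py">{`from langchain_core.messages import AIMessage, HumanMessage

# "prompt" is the ChatPromptTemplate defined above in agent_prompt.py
messages = prompt.format_messages(
    chat_history=[
        HumanMessage(content="Hi, I'm researching solar panels."),
        AIMessage(content="Happy to help. What would you like to know?"),
    ],
    input="Compare monocrystalline and polycrystalline efficiency.",
    agent_scratchpad=[],  # empty on the first pass; grows as tools run
)
# system + 2 history turns + human input = 4 messages on this call`}</CodeBlock>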

      <SectionTitle num="3.3">Anatomy of a model call</SectionTitle>
      <p>Three model methods you'll use constantly:</p>
      <ul>
        <li><code>model.invoke(messages)</code> — synchronous, returns one <code>AIMessage</code>.</li>
        <li><code>model.stream(messages)</code> — yields <code>AIMessageChunk</code>s as tokens arrive.</li>
        <li><code>model.batch([msgs1, msgs2, ...])</code> — runs many inputs in parallel, returns a list of <code>AIMessage</code>s in the same order.</li>
      </ul>
      <p>And async variants: <code>ainvoke</code>, <code>astream</code>, <code>abatch</code>. Memorize this pattern — every Runnable supports it.</p>
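
      <p>
        A compact sketch of all three on one model (the questions are placeholders):
      </p>
      <CodeBlock file="call_styles.py">{`from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
messages = [HumanMessage(content="What's the weather in Tokyo?")]

# invoke: one call, one AIMessage
answer = model.invoke(messages)

# stream: AIMessageChunks as the tokens arrive
for chunk in model.stream(messages):
    print(chunk.content, end="", flush=True)

# batch: independent message lists, results come back in input order
answers = model.batch([
    [HumanMessage(content="Capital of France?")],
    [HumanMessage(content="Capital of Japan?")],
])`}</CodeBlock>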

      <SectionTitle num="3.4">Configuring the model</SectionTitle>
      <CodeBlock file="model_config.py">{`model = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,            # as deterministic as the provider allows; for agents, almost always 0
    max_tokens=1024,
    timeout=30,
    max_retries=2,
    model_kwargs={
        "response_format": {"type": "json_object"},  # provider-specific
    },
)`}</CodeBlock>

      <Callout kind="tip" title="For agents: temperature=0">
        Tool-calling agents need predictable structured output. A creative model that hallucinates a tool name half the
        time will burn your budget. Set <code>temperature=0</code> for the orchestrating LLM. Crank it up for sub-tasks
        like "write the email body."
      </Callout>
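
      <p>
        In practice that usually means two separately configured instances rather than one shared model object. A
        minimal sketch (the model name and the 0.7 are illustrative):
      </p>
      <CodeBlock file="two_temperatures.py">{`from langchain_openai import ChatOpenAI

orchestrator = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # decides which tool to call
writer = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)      # drafts the email body`}</CodeBlock>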

      <Callout kind="gotcha" title="Token limits silently truncate">
        If your message history exceeds the model's context window, you get a hard API error or — worse — silent
        truncation depending on the provider. Always know your context budget. We discuss trimming in Chapter 10.
      </Callout>
    </section>
  );
}

window.Chapter03 = Chapter03;
