Advanced State & Persistence
Master LangGraph Persistence — checkpointers, threads, crash recovery, SQLite savers, time travel replay, and Streamlit session integration.
Persistence is the foundational layer that powers short-term conversation memory, fault tolerance, Human-in-the-Loop workflows, and debugging replay features. This module details the mechanics of saving, modifying, and restoring graph execution checkpoints.
Persistence & Checkpointers
Why this matters
Checkpointers persist graph state across turns, which production chat and human-in-the-loop workflows both require.
A checkpointer saves the graph state after each super-step, so a conversation can resume in a later session from where it stopped.
from langgraph.checkpoint.memory import MemorySaver
memory = MemorySaver()
graph = builder.compile(checkpointer=memory)
config = {"configurable": {"thread_id": "user-123"}}
graph.invoke({"messages": ["hi"]}, config)
- MemorySaver: in-memory, for development; SqliteSaver: persistent local database.
- thread_id in config scopes checkpoint history per user/session.
Common mistakes
- Forgetting to compile the graph with a checkpointer when persistence is required.
- List state fields without reducers: each update overwrites the list instead of appending to it.
- Infinite loops in cyclic graphs with no max iteration or termination edge.
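The reducer mistake is easier to see with the merge rule written out. The sketch below is a plain-Python model of reducer semantics, not LangGraph's implementation; `apply_update` and both state classes are hypothetical names used only for illustration. A field annotated with a reducer (here `operator.add`) combines the old and new values, while a field without one is simply replaced.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class NoReducerState(TypedDict):
    messages: list  # no reducer: each update replaces the list


class ReducerState(TypedDict):
    messages: Annotated[list, operator.add]  # reducer: updates are concatenated


def apply_update(schema, state: dict, update: dict) -> dict:
    """Merge `update` into `state`, using a reducer when the field declares one."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in state:
            merged[key] = metadata[0](state[key], value)  # reducer(old, new)
        else:
            merged[key] = value  # plain overwrite
    return merged


start = {"messages": ["hi"]}
overwritten = apply_update(NoReducerState, start, {"messages": ["hello"]})
appended = apply_update(ReducerState, start, {"messages": ["hello"]})
```

With the same input, `overwritten` keeps only the new message while `appended` keeps both, which is the difference between a chat that forgets and one that accumulates history.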
Interview checkpoints
- Q: Explain checkpointers in LangGraph. A: One-sentence definition + one API.
- Q: What is a common bug? A: A missing checkpointer, a list field without a reducer, or a cycle with no termination edge.
Practice
- Basic: Sketch a minimal checkpointer example.
- Intermediate: Run a notebook cell demonstrating checkpointers.
- Advanced: Break checkpointing intentionally and read the LangSmith trace.
Recap
- You can explain checkpointers clearly.
- You know one mistake to avoid.
- You see how this connects to the next lesson.
Next: Fault Tolerance
Fault Tolerance & Time Travel
Why this matters
Fault tolerance and time travel let you resume threads and inspect past checkpoints after failures.
Fault tolerance resumes a failed run from the last successfully saved checkpoint instead of starting over. Time travel lets you inspect a prior checkpoint or fork a new run from it.
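To make the two ideas concrete, here is a toy model in plain Python. It is not LangGraph's API: `STEPS`, `run_step`, and `run_from` are hypothetical names. Each completed step appends a snapshot to a history list; recovery continues from the last snapshot, and time travel copies an earlier snapshot, changes it, and continues from there.

```python
import copy

STEPS = ["fetch", "summarize", "publish"]  # illustrative pipeline


def run_step(name, state, fail_on=None):
    """Run one step, or raise to simulate a crash."""
    if name == fail_on:
        raise RuntimeError(f"{name} crashed")
    return {**state, "done": state["done"] + [name]}


def run_from(history, fail_on=None):
    """Continue from the latest snapshot, saving one snapshot per completed step."""
    state = copy.deepcopy(history[-1])
    for name in STEPS[len(state["done"]):]:
        state = run_step(name, state, fail_on)
        history.append(copy.deepcopy(state))
    return history


# First attempt crashes at "publish"; earlier snapshots survive.
history = [{"done": [], "note": "v1"}]
try:
    run_from(history, fail_on="publish")
except RuntimeError:
    pass
done_before_crash = list(history[-1]["done"])

# Fault tolerance: resume from the last good snapshot; only "publish" runs.
run_from(history)

# Time travel: fork from the snapshot taken after "fetch" with a modified value.
fork = [copy.deepcopy(history[1])]
fork[0]["note"] = "v2"
run_from(fork)
```

The original history is left untouched by the fork, so both lines of execution remain available for comparison.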
- Human-in-the-loop: interrupt before sensitive nodes, resume after approval.
- Multi-thread: one graph, many thread_id values for concurrent users.
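Both bullets rely on the same mechanism: a saved checkpoint per thread that records where execution stopped. The following is a toy model under stated assumptions, not LangGraph code; `NODES`, `INTERRUPT_BEFORE`, `saved`, and `run` are hypothetical names. In LangGraph itself, the pause is usually requested with the `interrupt_before` argument to `compile`.

```python
NODES = ["draft", "send_email"]    # hypothetical node names
INTERRUPT_BEFORE = {"send_email"}  # pause before this sensitive node

saved = {}  # thread_id -> {"state": ..., "next": index of the next node}


def run(thread_id, approved=False):
    """Run or resume one thread, pausing before a sensitive node until approved."""
    checkpoint = saved.get(thread_id, {"state": {"log": []}, "next": 0})
    state, i = checkpoint["state"], checkpoint["next"]
    while i < len(NODES):
        if NODES[i] in INTERRUPT_BEFORE:
            if not approved:
                # Save progress and hand control back to the human.
                saved[thread_id] = {"state": state, "next": i}
                return {"status": "paused", "state": state}
            approved = False  # one approval covers one interrupt
        state = {"log": state["log"] + [NODES[i]]}
        i += 1
    saved[thread_id] = {"state": state, "next": i}
    return {"status": "done", "state": state}


first = run("alice")                   # pauses before send_email
other = run("bob")                     # independent thread, pauses separately
resumed = run("alice", approved=True)  # continues from alice's checkpoint
```

Approving one thread does not affect the other, because each `thread_id` has its own saved position.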
Common mistakes
- Forgetting to compile the graph with a checkpointer when persistence is required.
- List state fields without reducers: each update overwrites the list instead of appending to it.
- Infinite loops in cyclic graphs with no max iteration or termination edge.
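For the cyclic-graph mistake, the two usual safeguards are a termination condition in the routing function and a hard cap on steps. The sketch below shows both in plain Python; `work`, `route`, `run_loop`, and `MAX_STEPS` are illustrative names, not LangGraph identifiers. In LangGraph, the cap is typically supplied as `recursion_limit` in the config passed to `invoke`.

```python
MAX_STEPS = 5  # hard cap, illustrative value


def work(state):
    """One loop iteration: count the attempt."""
    return {**state, "attempts": state["attempts"] + 1}


def route(state):
    """Termination edge: leave the cycle once the goal is reached."""
    return "end" if state["attempts"] >= state["goal"] else "work"


def run_loop(state, max_steps=MAX_STEPS):
    """Cycle through work -> route until route says "end" or the cap is hit."""
    for _ in range(max_steps):
        state = work(state)
        if route(state) == "end":
            return state
    raise RuntimeError(f"no termination after {max_steps} steps")


finished = run_loop({"attempts": 0, "goal": 3})

try:
    run_loop({"attempts": 0, "goal": 100})
    hit_limit = False
except RuntimeError:
    hit_limit = True
```

The cap turns a silent infinite loop into a visible error, which is far easier to spot in a trace.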
Interview checkpoints
- Q: Explain fault tolerance in LangGraph. A: One-sentence definition + one API.
- Q: What is a common bug? A: A missing checkpointer, a list field without a reducer, or a cycle with no termination edge.
Practice
- Basic: Sketch a minimal fault tolerance example.
- Intermediate: Run a notebook cell demonstrating fault tolerance.
- Advanced: Break fault tolerance intentionally and read the LangSmith trace.
Recap
- You can explain fault tolerance clearly.
- You know one mistake to avoid.
- You see how this connects to the next lesson.
Next: LangSmith
