LangGraph · Module 3

Advanced State & Persistence

Master LangGraph Persistence — checkpointers, threads, crash recovery, SQLite savers, time travel replay, and Streamlit session integration.

⏱ 30 min read · Module 3 of 6 · Updated: May 2026

Persistence is the foundational layer that powers short-term conversation memory, fault tolerance, Human-in-the-Loop workflows, and debugging replay features. This module details the mechanics of saving, modifying, and restoring graph execution checkpoints.

Day 5

Persistence & Checkpointers

Why this matters

Checkpointers persist graph state across turns — required for production chat and human-in-the-loop.

A checkpointer saves a snapshot of the graph state after every super-step. A later call that uses the same thread_id loads the latest snapshot, so the conversation picks up where it left off.

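from langgraph.checkpoint.memory import MemorySaver

# `builder` is assumed to be the StateGraph you have already defined.
memory = MemorySaver()  # keeps checkpoints in process memory only
graph = builder.compile(checkpointer=memory)

# thread_id selects which checkpoint history this call reads and writes.
config = {"configurable": {"thread_id": "user-123"}}
graph.invoke({"messages": ["hi"]}, config)

  • MemorySaver keeps checkpoints in memory (development and tests); SqliteSaver writes them to a local SQLite database so they survive restarts.
  • thread_id in config scopes checkpoint history per user or session.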

Common mistakes

  • Forgetting to compile the graph with a checkpointer when persistence is required.
  • List state fields without reducers — updates overwrite instead of append.
  • Infinite loops in cyclic graphs with no max iteration or termination edge.

Interview checkpoints

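  • Q: Explain checkpointers in LangGraph. A: A checkpointer saves graph state after each super-step, keyed by thread_id; you attach one with builder.compile(checkpointer=...).
  • Q: Common bug? A: Compiling without a checkpointer, a list field with no reducer, or a cycle with no termination edge.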

Practice

  1. Basic: Sketch a minimal checkpointer example.
  2. Intermediate: Run a notebook cell that demonstrates a checkpointer.
  3. Advanced: Break checkpointing intentionally (for example, compile without a checkpointer) and read the LangSmith trace.

Recap

  • You can explain checkpointers clearly.
  • You know one mistake to avoid.
  • You see how this connects to the next lesson.

Next: Fault Tolerance

Day 6

Fault Tolerance & Time Travel

Why this matters

Fault tolerance and time travel let you resume threads and inspect past checkpoints after failures.

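Fault tolerance means a failed run resumes from the last successfully saved checkpoint instead of starting over. Time travel lets you inspect any earlier checkpoint in a thread, or fork a new run from it.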

  • Human-in-the-loop: interrupt before sensitive nodes, resume after approval.
  • Multi-thread: one graph, many thread_id values for concurrent users.

Common mistakes

  • Forgetting to compile the graph with a checkpointer when persistence is required.
  • List state fields without reducers — updates overwrite instead of append.
  • Infinite loops in cyclic graphs with no max iteration or termination edge.

Interview checkpoints

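  • Q: Explain fault tolerance in LangGraph. A: With a checkpointer attached, a failed run resumes from the last saved checkpoint instead of restarting; graph.get_state(config) shows where the thread stopped.
  • Q: Common bug? A: Compiling without a checkpointer, so there is no saved state to resume from or replay.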

Practice

  1. Basic: Sketch a minimal fault tolerance example.
  2. Intermediate: Run a notebook cell that demonstrates fault tolerance.
  3. Advanced: Make a node fail intentionally, resume the thread, and read the LangSmith trace.

Recap

  • You can explain fault tolerance clearly.
  • You know one mistake to avoid.
  • You see how this connects to the next lesson.

Next: LangSmith
