Case Study / LangGraph system

RabbitHole

Messy public debates and complex legal scenarios resist simple, consensus-driven answers. LLMs constrained only by prompt instructions frequently overrun their token limits and fail to stay within their assigned perspective.

Approach

Built a stateful multi-agent courtroom architecture with a moderator node scheduling parallel advocate/critic perspectives. Implemented a self-correcting RAG sub-graph combining Pinecone/BM25 search, Jina Reranking, and Self-RAG hallucination auditing.
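The self-correcting retrieval loop can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: every name here (`answer_with_audit`, `AuditedAnswer`, the toy stand-ins) is hypothetical, the search and reranking services are replaced by injected callables, and the audit is reduced to a single yes/no check on the draft.

```python
from dataclasses import dataclass
from typing import Callable, List

# All names below are hypothetical stand-ins, not the project's actual API.
Fetch = Callable[[str], List[str]]            # query -> evidence passages
Generate = Callable[[str, List[str]], str]    # (query, passages) -> draft answer
Audit = Callable[[str, List[str]], bool]      # (draft, passages) -> supported?


@dataclass
class AuditedAnswer:
    draft: str
    source: str      # "index" or "web"
    attempts: int
    supported: bool


def answer_with_audit(query: str, retrieve: Fetch, web_search: Fetch,
                      generate: Generate, is_supported: Audit) -> AuditedAnswer:
    """Try indexed evidence first; fall back to web search if the audit fails."""
    draft, name, attempts = "", "index", 0
    for attempts, (name, fetch) in enumerate(
            [("index", retrieve), ("web", web_search)], start=1):
        passages = fetch(query)
        draft = generate(query, passages)
        if is_supported(draft, passages):
            return AuditedAnswer(draft, name, attempts, True)
    # Both evidence sources failed the audit: return the last draft, flagged.
    return AuditedAnswer(draft, name, attempts, False)


# Toy stand-ins so the sketch runs without any external service.
def toy_generate(query: str, passages: List[str]) -> str:
    return passages[0] if passages else "unknown"


def toy_is_supported(draft: str, passages: List[str]) -> bool:
    return any(draft in p for p in passages)


result = answer_with_audit(
    "When was the statute amended?",
    retrieve=lambda q: [],                                   # index finds nothing
    web_search=lambda q: ["The statute was amended in 2019."],
    generate=toy_generate,
    is_supported=toy_is_supported,
)
```

Passing the retrievers, generator, and auditor in as callables keeps the control flow testable offline; in a graph-based system each of them would presumably be its own node.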

Model / System

Hierarchical LangGraph orchestration featuring State-Based Schema Constraints to restrict token usage, a Corrective RAG (CRAG) fallback using Jina Web Search, and dual-tier model routing (Llama-3.3-70B and smaller models).
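One way to picture a state-based schema constraint is a typed record that rejects out-of-bounds agent output before it enters shared state. The sketch below is an assumption-laden illustration: the `Turn` fields, the role names, and the limits are invented for the example, and word count is used as a rough proxy for tokens.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical limits and role names, chosen only for illustration.
ALLOWED_ROLES = {"advocate", "critic"}
MAX_CLAIM_WORDS = 40        # word count as a rough stand-in for a token cap
MAX_EVIDENCE_ITEMS = 3


@dataclass(frozen=True)
class Turn:
    """One agent contribution, validated on construction."""
    role: str
    claim: str
    evidence: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {self.role!r}")
        if len(self.claim.split()) > MAX_CLAIM_WORDS:
            raise ValueError("claim exceeds the word cap")
        if len(self.evidence) > MAX_EVIDENCE_ITEMS:
            raise ValueError("too many evidence items")


def append_turn(state: List[Turn], raw: dict) -> List[Turn]:
    """Validate raw model output and return a new state with the turn added."""
    return state + [Turn(**raw)]


state = append_turn([], {"role": "critic", "claim": "The precedent does not apply."})
```

Because the limits live in the schema and not in the prompt, an oversized or off-role response fails loudly at the state boundary and can be retried, instead of silently inflating the context.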

Result

Cut execution latency (MTTV) by about 51% (from 19.8 s to 9.8 s) while eliminating token overruns. Delivered a modular, inspectable Python engine alongside a Next.js/React debug interface.

Technical highlights

What to inspect.

Strict State-Based Schema Constraints that resolve the perspective compliance issues seen with prompt-only instructions.

Self-RAG hallucination auditor loop using Jina reranking, with a Corrective RAG (CRAG) web search fallback.

Dynamic model routing and moderator partitioning that reduce daily token consumption under Groq rate limits.
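The routing idea in the last highlight can be sketched as a budget-aware model picker. This is a hedged illustration only: the task names, the `TokenBudget` class, the 20% reserve, and the `small-model` label are assumptions made for the example, and the model labels are descriptive strings, not provider API identifiers.

```python
from dataclasses import dataclass

LARGE_MODEL = "Llama-3.3-70B"   # label taken from the write-up
SMALL_MODEL = "small-model"     # placeholder; the smaller models are not named

# Hypothetical set of tasks assumed to need the larger model.
HEAVY_TASKS = {"argue", "adjudicate"}


@dataclass
class TokenBudget:
    """Tracks token spend against an assumed daily cap."""
    daily_limit: int
    used: int = 0

    def remaining(self) -> int:
        return max(self.daily_limit - self.used, 0)

    def charge(self, tokens: int) -> None:
        self.used += tokens


def pick_model(task: str, estimated_tokens: int, budget: TokenBudget,
               reserve_fraction: float = 0.2) -> str:
    """Use the large model for heavy tasks only while a reserve stays intact."""
    reserve = int(budget.daily_limit * reserve_fraction)
    if task in HEAVY_TASKS and budget.remaining() - estimated_tokens >= reserve:
        return LARGE_MODEL
    return SMALL_MODEL


budget = TokenBudget(daily_limit=1000)
first_choice = pick_model("argue", 100, budget)    # plenty of budget left
budget.charge(850)
later_choice = pick_model("argue", 100, budget)    # reserve would be breached
```

Holding back a reserve means the large model is the first thing to be dropped as the daily cap approaches, so lighter steps can still complete on the smaller tier.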

Repository