The ErrorCapture Pattern

When you're building with LLMs, errors don't look like traditional stack traces. They're fuzzy, context-dependent, and often happen in the space between your code and the model's interpretation of your prompt.

After months of debugging LLM applications in production, we developed what we call the ErrorCapture pattern—a structured approach to making AI-related failures debuggable.

The Problem

Traditional error handling assumes you know what can go wrong:

try {
  const result = await processData(input);
  return result;
} catch (error) {
  if (error instanceof ValidationError) {
    // handle validation failures
  } else if (error instanceof NetworkError) {
    // handle network failures
  } else {
    // anything you didn't anticipate propagates up
    throw error;
  }
}

But LLM errors are different. The model might:

  • Return a valid response that's semantically wrong
  • Hallucinate a function that doesn't exist
  • Format output incorrectly in subtle ways
  • Time out unpredictably under load

None of these fit cleanly into traditional error categories.
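
Take the second case. A tool-calling response can be perfectly valid JSON and still reference a function you never exposed; nothing throws, so a plain try/catch never notices. Here's a minimal sketch of the kind of semantic check that catches it (the response shape and the knownTools set are hypothetical, not tied to any particular provider):

interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

const knownTools = new Set(['searchOrders', 'getCustomer']);

function validateToolCalls(raw: string): string[] {
  let calls: unknown;
  try {
    calls = JSON.parse(raw); // syntactically fine...
  } catch {
    return ['response is not valid JSON'];
  }
  if (!Array.isArray(calls)) {
    return ['expected an array of tool calls'];
  }

  const errors: string[] = [];
  for (const call of calls as ToolCall[]) {
    // ...but the model may have invented a tool that doesn't exist
    if (!knownTools.has(call.name)) {
      errors.push(`unknown tool: ${call.name}`);
    }
  }
  return errors;
}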

The ErrorCapture Pattern

Instead of trying to predict every failure mode, we capture the full context around every LLM interaction:

interface LLMCapture {
  // Input context
  prompt: string;
  systemPrompt: string;
  inputVariables: Record<string, unknown>;
  
  // Output context
  rawResponse: string;
  parsedResponse: unknown;
  
  // Execution context
  model: string;
  temperature: number;
  latencyMs: number;
  tokenUsage: { prompt: number; completion: number };
  
  // Validation context
  validationErrors: string[];
  retryCount: number;
}

Every LLM call gets wrapped in a capture function that records all of this automatically.
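
To make that concrete, here is what a single capture might look like for a call that returned well-formed JSON but failed a semantic check. Every value below is made up for illustration:

const example: LLMCapture = {
  prompt: 'Summarize the ticket and assign a priority (low, medium, or high).',
  systemPrompt: 'You are a support triage assistant. Respond with JSON only.',
  inputVariables: { ticketId: 'T-1042' },
  rawResponse: '{"summary": "Customer requests a refund", "priority": "urgent"}',
  parsedResponse: { summary: 'Customer requests a refund', priority: 'urgent' },
  model: 'gpt-4o-mini',
  temperature: 0.2,
  latencyMs: 1840,
  tokenUsage: { prompt: 412, completion: 38 },
  validationErrors: ['priority must be one of: low, medium, high'],
  retryCount: 1,
};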

Why This Works

When something goes wrong, you have everything you need to debug:

  1. Reproducibility: You can replay the exact prompt with the exact parameters
  2. Pattern recognition: You can query captures to find similar failures
  3. LLM-assisted debugging: You can feed the capture to another LLM to help diagnose the issue (all three are sketched below)
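
Here's a minimal sketch of what those three look like in practice, assuming captures are appended to a simple in-memory array and that LLMConfig accepts the fields stored on the capture; a real store and query layer is its own topic:

const captureLog: LLMCapture[] = [];

// 1. Reproducibility: rebuild the config from a capture and run it again.
async function replayCapture(capture: LLMCapture) {
  return callLLM({
    prompt: capture.prompt,
    systemPrompt: capture.systemPrompt,
    inputVariables: capture.inputVariables,
    model: capture.model,
    temperature: capture.temperature,
  });
}

// 2. Pattern recognition: find captures whose validation errors mention a phrase.
function findSimilarFailures(phrase: string): LLMCapture[] {
  return captureLog.filter((c) =>
    c.validationErrors.some((e) => e.includes(phrase))
  );
}

// 3. LLM-assisted debugging: turn a capture into a prompt for a second model.
function captureToDebugPrompt(c: LLMCapture): string {
  return [
    'This LLM call failed validation. Diagnose the likely cause.',
    `Prompt: ${c.prompt}`,
    `Raw response: ${c.rawResponse}`,
    `Validation errors: ${c.validationErrors.join('; ')}`,
  ].join('\n');
}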

The key insight: make the LLM's context visible at every step, not just when errors occur.

Implementation Tips

Start by logging everything, then tune it down once you understand your failure modes:

async function capturedLLMCall<T>(
  config: LLMConfig,
  parser: (raw: string) => T
): Promise<{ result: T; capture: LLMCapture }> {
  const startTime = Date.now();
  // Start from safe defaults so the capture is complete even when the call fails early.
  const capture: LLMCapture = {
    prompt: config.prompt,
    systemPrompt: config.systemPrompt,
    inputVariables: config.inputVariables ?? {}, // assumes LLMConfig carries the template variables
    rawResponse: '',
    parsedResponse: null,
    model: config.model,
    temperature: config.temperature,
    latencyMs: 0,
    tokenUsage: { prompt: 0, completion: 0 },
    validationErrors: [],
    retryCount: 0,
  };

  try {
    const response = await callLLM(config);
    capture.rawResponse = response.text;
    capture.tokenUsage = response.usage;
    capture.latencyMs = Date.now() - startTime;

    const parsed = parser(response.text);
    capture.parsedResponse = parsed;

    return { result: parsed, capture };
  } catch (error) {
    // Record how long we waited and what went wrong, then rethrow with the capture attached.
    capture.latencyMs = Date.now() - startTime;
    capture.validationErrors = [String(error)];
    throw new CapturedError(error, capture);
  }
}
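
The CapturedError wrapper referenced above isn't defined in the snippet. One minimal way to write it, plus a hypothetical call site showing how captures end up in the captureLog array from the earlier sketch (summaryConfig and parseSummary are placeholders for your own config and parser):

class CapturedError extends Error {
  constructor(
    public readonly original: unknown,
    public readonly capture: LLMCapture
  ) {
    super(`LLM call failed: ${String(original)}`);
    this.name = 'CapturedError';
  }
}

async function summarizeTicket() {
  try {
    const { result, capture } = await capturedLLMCall(summaryConfig, parseSummary);
    captureLog.push(capture); // record successes too, not just failures
    return result;
  } catch (err) {
    if (err instanceof CapturedError) {
      captureLog.push(err.capture); // everything needed to replay and diagnose
    }
    throw err;
  }
}

Logging the capture on the success path is deliberate: the replay and query sketches above are only useful if you can compare failures against the full population of calls, not just the ones that blew up.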

What's Next

In future posts, I'll cover:

  • How we built a query interface for captures
  • Using captures to generate test cases automatically
  • The infrastructure cost of logging everything (and how to optimize it)

The ErrorCapture pattern has fundamentally changed how we debug LLM applications. It's not just about catching errors—it's about making the invisible visible.


Have you tried similar patterns? I'd love to hear what's worked for you. Find me on Twitter.