Building Model-Agnostic AI Agents

Core takeaway: The adapter pattern is cheap insurance against vendor lock-in. Define a unified chat() + normalize_tools() + normalize_messages() interface, standardize internally on OpenAI tool format — everything else is implementation detail. The cost difference between models can exceed 20x.

Most AI Agent frameworks are born tied to a specific model vendor. LangChain was originally built around OpenAI. Claude Code is naturally Anthropic-exclusive. But in practice, you often need to switch between models — for cost, latency, capability matching, or simply to avoid lock-in.

This article shows you how to build an Agent that works with any model — Claude, GPT, DeepSeek, Llama, or locally deployed open models — by swapping a single line of configuration.

Why Model-Agnostic Matters

What is Model-Agnostic Architecture

Scenario	What You Need	Lock-In Problem
Production deployment	GPT-4o for complex tasks, Claude for long-form writing	Your code has OpenAI SDK hardcoded
Cost optimization	DeepSeek for simple queries (10x cheaper), GPT for hard ones	Tool definitions only work with one format
Privacy-sensitive data	Local Llama 3 for internal docs, cloud API for public tasks	Different message formats break your pipeline
Model evaluation	A/B test 3 models on the same Agent task	Can't swap models without code changes

Model-agnostic means your Agent's core logic doesn't depend on any specific model's API format. The Agent loop — observe → think → act → observe — stays identical regardless of which model powers the "think" step.

The Adapter Interface

Implementing Real Adapters

Here are concrete implementations for the three most common model families. Notice how each handles tool calling differently.

OpenAI Adapter (GPT-4o, GPT-4, GPT-3.5)

Anthropic Adapter (Claude Sonnet, Claude Opus)

OpenAI-Compatible Adapter (DeepSeek, vLLM, Ollama, local models)

Many models (DeepSeek, Llama via vLLM/Ollama, Groq) use the OpenAI-compatible API. One adapter covers them all — just change the base_url:

Tool Format Normalization

Different providers have slightly different tool schemas. The key insight: standardize on OpenAI's function-calling format as the internal representation, and let each adapter convert to its native format.

Model Selection Strategy

With a model-agnostic architecture, you can route tasks to the optimal model based on characteristics:

The Complete Agent

Here's the full model-agnostic Agent. The core loop never changes — only the adapter does:

Testing Your Model-Agnostic Layer

How do you know the adapter is working correctly? Test with a simple tool-calling task across all models:

Existing Frameworks (and When to Roll Your Own)

Key Takeaways

Feature	OpenAI	Anthropic	Google Gemini
Tool wrapper	`{"type": "function", "function": {...}}`	Bare object, no wrapper	`{"functionDeclarations": [...]}`
Schema field	`parameters` (JSON Schema)	`input_schema` (JSON Schema)	`parameters` (OpenAPI-like)
Tool result role	`role: "tool"`	`tool_result` content block	`role: "tool"`
Parallel calls	Supported natively	Supported natively	Not supported

Task Type	Recommended Model	Reason
Complex reasoning, math, code	Claude Opus / GPT-4o	Highest reasoning accuracy
Simple Q&A, summarization	DeepSeek / Llama 3 70B	5-10x cheaper, good enough
Long-form writing	Claude Sonnet	Excellent prose quality
Chinese content	DeepSeek / Qwen	Native Chinese performance
Sensitive internal data	Local Llama / Qwen	Data never leaves your infra
Real-time (< 500ms)	Groq / GPT-4o-mini	Ultra-low latency

Framework	Model Support	Best For	When to Skip
smolagents (HuggingFace)	Any HF model + external APIs	Quick prototyping, HF ecosystem users	Need fine control over tool loop
DSPy	10+ providers via adapters	Prompt optimization, A/B testing models	Simple tool-calling agents (overkill)
LangChain	Wide but historically OpenAI-first	Complex RAG pipelines, many integrations	Simplicity; LangChain adds abstraction overhead
Custom adapter (this article)	Any model, full control	Production systems, specific requirements	You only use one model

Frequently Asked Questions

Q: Do I really need a model-agnostic architecture?: A: If you genuinely use one model with no plans to switch — no. But if you're building a product, platform, or anything that might outlive your current model choice — the adapter pattern pays for itself the first time you need to switch. It's cheap insurance against vendor lock-in.
Q: What's different about tool calling across models?: A: OpenAI wraps tools as {"type":"function","function":{...}}, Anthropic uses bare objects with an input_schema field, Gemini uses functionDeclarations arrays. This article standardizes on OpenAI's format internally, with each adapter handling its own conversion.
Q: Why standardize on OpenAI format internally?: A: OpenAI's function-calling format has become the de facto standard. DeepSeek, Groq, vLLM, Ollama, and many others all use OpenAI-compatible APIs. Using this format as your internal representation minimizes conversion overhead.
Q: How much can smart routing actually save?: A: Simple queries (<100 chars) on DeepSeek at ~$0.14/M tokens vs. complex reasoning on Claude at ~$3/M tokens — a 20x cost difference. The article provides a reference smart_route() implementation you can adapt.
Q: What are the adapter pattern's gotchas?: A: Tool format translation bugs are silent — hard to catch in production. Test every adapter with the same simple tool-calling scenario (e.g., "get weather"). Also, Anthropic requires extracting the system prompt as a separate parameter — an easy detail to miss.

Citable Definition

Model-Agnostic Agent: An AI Agent architecture pattern where the core decision loop (observe → think → act → observe) is decoupled from any specific LLM provider's API format. Through the Adapter Pattern, a unified interface contract is defined — typically comprising chat() (send messages and receive responses), normalize_tools() (tool definition format normalization), and normalize_messages() (message format normalization) — with each model provider (OpenAI, Anthropic, DeepSeek, local Llama, etc.) implementing that interface. The Agent core logic always operates on the unified format; switching models requires only replacing the adapter instance with zero business code changes. This is a low-cost insurance policy against vendor lock-in.

Next Steps

📖 Basics: Your First AI Agent — Hands-On Code — if you haven't built an Agent yet, start here and get your first one running
📖 Advanced: Building an Agent Framework from Scratch — integrate the model-agnostic adapter pattern into a complete production-grade framework
📖 Related: Agent Error Recovery — model switching is itself a powerful recovery strategy: auto-fallback to a backup model when the primary fails