LLM API Format Comparison: Understanding Different API Standards
One of the most frustrating aspects of working with multiple AI providers is that each uses a different API format. Anthropic's Messages API, OpenAI's Chat Completions API, and Google's Gemini API all accomplish similar tasks, but with different request structures, response formats, and parameter names. This guide compares the major LLM API formats, shows how to work with each, and explains how a compatibility layer can hide most of those differences.
The Three Major API Formats
OpenAI Chat Completions Format
The Chat Completions format has become the de facto industry standard. Most third-party tools, libraries, and relay services accept it, which makes it the most portable choice when you need to target several providers.
// OpenAI Chat Completions: request
POST /v1/chat/completions
{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is an API?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
// OpenAI Chat Completions: response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "An API is..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
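As a rough illustration of how a client might read the response above, the sketch below pulls out the assistant text, the finish reason, and the token total. The helper name `readOpenAIResponse` is made up for this example, and it assumes a single, non-streaming choice.

```javascript
// Hypothetical helper: reads the first choice of a Chat Completions response.
// Missing fields fall back to '' or null instead of throwing.
function readOpenAIResponse(response) {
  const choice = response.choices?.[0];
  return {
    text: choice?.message?.content ?? '',
    finishReason: choice?.finish_reason ?? null,
    totalTokens: response.usage?.total_tokens ?? null
  };
}

// Sample payload copied from the response shown above.
const openaiSample = {
  choices: [
    {
      index: 0,
      message: { role: 'assistant', content: 'An API is...' },
      finish_reason: 'stop'
    }
  ],
  usage: { prompt_tokens: 25, completion_tokens: 150, total_tokens: 175 }
};

console.log(readOpenAIResponse(openaiSample));
```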
Anthropic Messages Format
Anthropic's Messages API has several design differences from OpenAI's format. The system prompt is a separate top-level field, and the response structure differs significantly.
// Anthropic Messages: request
POST /v1/messages
{
  "model": "claude-sonnet-4-20250514",
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "What is an API?"}
  ],
  "max_tokens": 1024,
  "temperature": 0.7
}
// Anthropic Messages: response
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-20250514",
  "content": [
    {
      "type": "text",
      "text": "An API is..."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}
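Because the reply text lives in an array of content blocks rather than a single string, reading it takes one extra step. The sketch below is one way to do that; `readAnthropicText` and `anthropicTotalTokens` are illustrative names, not part of any SDK. It also adds up the token counts, since the usage object above carries no total field.

```javascript
// Hypothetical helper: concatenates the text of every "text" content block.
// Other block types (for example tool_use) are skipped.
function readAnthropicText(response) {
  return (response.content ?? [])
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
}

// Hypothetical helper: the usage object has no total, so compute one.
function anthropicTotalTokens(response) {
  const usage = response.usage ?? {};
  return (usage.input_tokens ?? 0) + (usage.output_tokens ?? 0);
}

// Sample payload copied from the response shown above.
const anthropicSample = {
  content: [{ type: 'text', text: 'An API is...' }],
  stop_reason: 'end_turn',
  usage: { input_tokens: 25, output_tokens: 150 }
};

console.log(readAnthropicText(anthropicSample), anthropicTotalTokens(anthropicSample));
```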
Google Gemini Format
Google's Gemini API uses yet another structure, with "parts" instead of simple content strings and "candidates" instead of "choices".
// Gemini: request
POST /v1/models/gemini-2.5-pro:generateContent
{
  "system_instruction": {
    "parts": [{"text": "You are a helpful assistant."}]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "What is an API?"}]
    }
  ],
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024
  }
}
// Gemini: response
{
  "candidates": [
    {
      "content": {
        "parts": [{"text": "An API is..."}],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 25,
    "candidatesTokenCount": 150,
    "totalTokenCount": 175
  }
}
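Reading a Gemini reply means walking from "candidates" down to "parts". A minimal sketch, assuming only the first candidate matters and that every part of interest carries a "text" field (`readGeminiText` is a made-up name for this example):

```javascript
// Hypothetical helper: joins the text parts of the first candidate.
// Parts without a "text" field contribute an empty string.
function readGeminiText(response) {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? '').join('');
}

// Sample payload copied from the response shown above.
const geminiSample = {
  candidates: [
    {
      content: { parts: [{ text: 'An API is...' }], role: 'model' },
      finishReason: 'STOP'
    }
  ],
  usageMetadata: { promptTokenCount: 25, candidatesTokenCount: 150, totalTokenCount: 175 }
};

console.log(readGeminiText(geminiSample));
```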
Key Differences at a Glance
- System prompt: OpenAI uses a message with role "system". Anthropic has a top-level "system" field. Gemini uses "system_instruction".
- Content structure: In responses, OpenAI returns the message content as a simple string. Anthropic returns an array of content blocks. Gemini returns an array of "parts".
- Response wrapper: OpenAI returns "choices". Anthropic returns "content". Gemini returns "candidates".
- Token counting: All three report usage but with different field names (prompt_tokens vs. input_tokens vs. promptTokenCount).
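- Streaming format: All three can stream over SSE (Gemini through its streamGenerateContent method with alt=sse), but with different event structures and delta formats.
- Max tokens parameter: "max_tokens" (OpenAI/Anthropic) vs. "maxOutputTokens" (Gemini). Anthropic requires "max_tokens" on every request, and OpenAI's newer models expect "max_completion_tokens" instead.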
- Stop reason: "stop"/"length" (OpenAI) vs. "end_turn"/"max_tokens" (Anthropic) vs. "STOP"/"MAX_TOKENS" (Gemini).
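The stop-reason and token-count differences in the list above can be smoothed over with two small lookup tables. This is a sketch under stated assumptions: the function names are invented here, the "other" bucket is an arbitrary choice for any reason not listed above, and the total is recomputed as input plus output rather than read from the provider.

```javascript
// Provider stop reasons from the list above, mapped onto OpenAI's vocabulary.
const STOP_REASONS = {
  openai: { stop: 'stop', length: 'length' },
  anthropic: { end_turn: 'stop', max_tokens: 'length' },
  gemini: { STOP: 'stop', MAX_TOKENS: 'length' }
};

// Field names for [input tokens, output tokens] in each provider's usage object.
const USAGE_FIELDS = {
  openai: ['prompt_tokens', 'completion_tokens'],
  anthropic: ['input_tokens', 'output_tokens'],
  gemini: ['promptTokenCount', 'candidatesTokenCount']
};

// Hypothetical helper: anything not in the table becomes 'other'.
function normalizeStopReason(provider, reason) {
  const table = STOP_REASONS[provider];
  if (!table) throw new Error(`Unknown provider: ${provider}`);
  return Object.prototype.hasOwnProperty.call(table, reason) ? table[reason] : 'other';
}

// Hypothetical helper: returns one usage shape regardless of provider.
function normalizeUsage(provider, usage = {}) {
  const fields = USAGE_FIELDS[provider];
  if (!fields) throw new Error(`Unknown provider: ${provider}`);
  const inputTokens = usage[fields[0]] ?? 0;
  const outputTokens = usage[fields[1]] ?? 0;
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens };
}

console.log(normalizeStopReason('anthropic', 'end_turn'));
console.log(normalizeUsage('gemini', { promptTokenCount: 25, candidatesTokenCount: 150 }));
```

Keeping the tables as plain data makes it cheap to add a provider or a new stop reason without touching the functions.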
Streaming Format Differences
Streaming differences are particularly tricky to handle because each provider uses different SSE event structures:
// OpenAI streaming chunk
data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}
// Anthropic streaming events
event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}
// Gemini streaming chunk
data: {"candidates":[{"content":{"parts":[{"text":"Hello"}]}}]}
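To make the difference concrete, the sketch below extracts the text fragment from a single SSE "data:" line for each provider, using the chunk shapes shown above. It is deliberately minimal: `extractStreamText` is a hypothetical name, it ignores "event:" lines, and it assumes each data line holds one complete JSON object. It also skips the "[DONE]" sentinel that OpenAI sends at the end of a stream.

```javascript
// Hypothetical helper: returns the text carried by one SSE line, or '' if none.
function extractStreamText(provider, line) {
  if (!line.startsWith('data:')) return ''; // e.g. "event: ..." lines
  const payload = line.slice('data:'.length).trim();
  if (payload === '' || payload === '[DONE]') return '';
  const chunk = JSON.parse(payload);

  switch (provider) {
    case 'openai':
      return chunk.choices?.[0]?.delta?.content ?? '';
    case 'anthropic':
      // Only text deltas carry reply text; other event types are ignored.
      return chunk.type === 'content_block_delta' && chunk.delta?.type === 'text_delta'
        ? chunk.delta.text
        : '';
    case 'gemini':
      return (chunk.candidates?.[0]?.content?.parts ?? [])
        .map((part) => part.text ?? '')
        .join('');
    default:
      throw new Error(`Unknown provider: ${provider}`);
  }
}

console.log(extractStreamText('openai', 'data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}'));
```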
Tool/Function Calling Differences
Function calling (tool use) varies significantly between providers:
- OpenAI: Uses a "tools" array whose entries have type "function" and describe their parameters with JSON Schema.
- Anthropic: Uses a "tools" array whose entries carry an "input_schema" (JSON Schema), and returns tool_use content blocks.
- Gemini: Uses "functionDeclarations" nested inside a "tools" array, with parameters described by a subset of the OpenAPI schema.
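The request-side shapes can be compared by converting one OpenAI-style tool definition into the other two. The sketch below only covers tool definitions, not tool-call responses. The converter names and the `get_weather` tool are invented for illustration, and passing the JSON Schema straight through to Gemini is an assumption that holds only for simple schemas.

```javascript
// Hypothetical converter: OpenAI tool definition -> Anthropic tool definition.
function toAnthropicTool(openaiTool) {
  const { name, description, parameters } = openaiTool.function;
  return { name, description, input_schema: parameters };
}

// Hypothetical converter: OpenAI tool definitions -> Gemini "tools" value.
// Assumes the schemas use only features Gemini's schema subset accepts.
function toGeminiTools(openaiTools) {
  return [
    {
      functionDeclarations: openaiTools.map((tool) => ({
        name: tool.function.name,
        description: tool.function.description,
        parameters: tool.function.parameters
      }))
    }
  ];
}

// Example tool in OpenAI's format (the tool itself is made up).
const weatherTool = {
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Look up the current weather for a city.',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city']
    }
  }
};

console.log(JSON.stringify(toAnthropicTool(weatherTool), null, 2));
console.log(JSON.stringify(toGeminiTools([weatherTool]), null, 2));
```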
Building a Format Translation Layer
If you need to support multiple providers directly, here is the pattern for a minimal translation layer:
// Convert an OpenAI-format request into the target provider's request body.
// The caller must supply a model name that is valid for that provider.
function fromOpenAIFormat(provider, request) {
  // Anthropic and Gemini both take the system prompt outside the message list.
  const systemMsg = request.messages.find((m) => m.role === 'system');
  const chatMessages = request.messages.filter((m) => m.role !== 'system');

  if (provider === 'anthropic') {
    return {
      model: request.model,
      system: systemMsg?.content,
      messages: chatMessages,
      // Anthropic requires max_tokens; 1024 is an arbitrary fallback.
      max_tokens: request.max_tokens ?? 1024,
      temperature: request.temperature
    };
  }

  if (provider === 'gemini') {
    // Gemini takes the model name in the URL path, not in the body.
    return {
      system_instruction: systemMsg
        ? { parts: [{ text: systemMsg.content }] }
        : undefined,
      contents: chatMessages.map((m) => ({
        // Gemini calls the assistant role "model".
        role: m.role === 'assistant' ? 'model' : m.role,
        parts: [{ text: m.content }]
      })),
      generationConfig: {
        maxOutputTokens: request.max_tokens,
        temperature: request.temperature
      }
    };
  }

  return request; // Already OpenAI format
}
The Practical Solution
For most developers, building and maintaining a format translation layer is not worth the engineering effort. Instead, use the OpenAI format as your standard and access all providers through an OpenAI-compatible gateway such as claude4u.com. This gives you provider portability without having to manage multiple request, response, and streaming formats in your codebase. Write once, deploy everywhere: that is the promise of a unified API layer.