LLM API Format Comparison: Understanding Different API Standards

One of the most frustrating aspects of working with multiple AI providers is that each uses a different API format. Anthropic's Messages API, OpenAI's Chat Completions API, and Google's Gemini API all accomplish similar tasks, but with different request structures, response formats, and parameter names. This guide compares the major LLM API formats, shows how to work with each, and explains how compatibility layers hide most of the differences.

The Three Major API Formats

OpenAI Chat Completions Format

The Chat Completions format has become the de facto industry standard. Most third-party tools, libraries, and relay services accept it, which makes it the most portable choice.

// OpenAI Chat Completions — Request
POST /v1/chat/completions
{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is an API?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}

// OpenAI Chat Completions — Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "An API is..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}

Anthropic Messages Format

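Anthropic's Messages API differs from OpenAI's format in several ways. The system prompt is a separate top-level "system" field rather than a message, "max_tokens" is required, and the response returns "content" as an array of typed blocks with a "stop_reason", instead of a "choices" array with a "finish_reason".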

// Anthropic Messages — Request
POST /v1/messages
{
  "model": "claude-sonnet-4-20250514",
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "What is an API?"}
  ],
  "max_tokens": 1024,
  "temperature": 0.7
}

// Anthropic Messages — Response
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-20250514",
  "content": [
    {
      "type": "text",
      "text": "An API is..."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}

Google Gemini Format

Google's Gemini API uses yet another structure, with "parts" instead of simple content strings and "candidates" instead of "choices".

// Gemini — Request
POST /v1beta/models/gemini-2.5-pro:generateContent
{
  "system_instruction": {
    "parts": [{"text": "You are a helpful assistant."}]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "What is an API?"}]
    }
  ],
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024
  }
}

// Gemini — Response
{
  "candidates": [
    {
      "content": {
        "parts": [{"text": "An API is..."}],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 25,
    "candidatesTokenCount": 150,
    "totalTokenCount": 175
  }
}
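
The three response bodies above carry the same information under different paths. As a minimal sketch, assuming non-streaming, text-only responses shaped like the examples, a small helper can reduce them to one common shape. The function name normalizeResponse and the returned field names are hypothetical, chosen only for illustration.

```javascript
// Sketch: reduce each provider's response body to one common shape.
// `normalizeResponse` and its output field names are hypothetical.
function normalizeResponse(provider, body) {
  if (provider === 'openai') {
    const choice = body.choices[0];
    return {
      text: choice.message.content,
      finishReason: choice.finish_reason,
      inputTokens: body.usage.prompt_tokens,
      outputTokens: body.usage.completion_tokens
    };
  }
  if (provider === 'anthropic') {
    return {
      // content is an array of typed blocks; keep only the text blocks
      text: body.content
        .filter((block) => block.type === 'text')
        .map((block) => block.text)
        .join(''),
      finishReason: body.stop_reason,
      inputTokens: body.usage.input_tokens,
      outputTokens: body.usage.output_tokens
    };
  }
  if (provider === 'gemini') {
    const candidate = body.candidates[0];
    return {
      // parts without a text field contribute nothing
      text: candidate.content.parts.map((part) => part.text ?? '').join(''),
      finishReason: candidate.finishReason,
      inputTokens: body.usageMetadata.promptTokenCount,
      outputTokens: body.usageMetadata.candidatesTokenCount
    };
  }
  throw new Error(`Unknown provider: ${provider}`);
}
```

The finish reason is passed through untranslated, so callers still see "stop", "end_turn", or "STOP" depending on the provider. Mapping those onto a single vocabulary would be a separate step.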

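Key Differences at a Glance

Comparing the request and response examples above:

System prompt: a "system" role message (OpenAI); a top-level "system" field (Anthropic); a "system_instruction" object (Gemini).
Output limit: "max_tokens" (OpenAI, Anthropic); "generationConfig.maxOutputTokens" (Gemini).
Response text: choices[0].message.content (OpenAI); content[0].text (Anthropic); candidates[0].content.parts[0].text (Gemini).
Stop indicator: "finish_reason" (OpenAI); "stop_reason" (Anthropic); "finishReason" (Gemini).
Token usage: prompt_tokens / completion_tokens (OpenAI); input_tokens / output_tokens (Anthropic); promptTokenCount / candidatesTokenCount (Gemini).
Assistant role name: "assistant" (OpenAI, Anthropic); "model" (Gemini).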

Streaming Format Differences

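Streaming is particularly tricky to handle. All three providers stream server-sent events (SSE), but the payloads differ: OpenAI sends bare data lines carrying a "delta" object, Anthropic sends named events with a typed delta, and Gemini sends partial candidates in the same shape as a non-streaming response: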

// OpenAI streaming chunk
data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}

// Anthropic streaming events
event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}

// Gemini streaming chunk
data: {"candidates":[{"content":{"parts":[{"text":"Hello"}]}}]}
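
To make the chunk shapes above concrete, here is a minimal sketch of pulling the text out of a single SSE line for each provider. It assumes every "data:" line holds one complete JSON object, which a real client would have to guarantee by buffering network chunks and splitting on line breaks first. The function name extractStreamText is hypothetical.

```javascript
// Sketch: extract the text fragment (if any) from one SSE line.
// `extractStreamText` is a hypothetical helper, not part of any SDK.
function extractStreamText(provider, line) {
  // Ignore "event:" lines, comments, and blank separator lines.
  if (!line.startsWith('data:')) return '';
  const payload = line.slice('data:'.length).trim();
  // OpenAI ends its stream with a literal [DONE] sentinel instead of JSON.
  if (payload === '' || payload === '[DONE]') return '';

  const chunk = JSON.parse(payload);
  if (provider === 'openai') {
    return chunk.choices?.[0]?.delta?.content ?? '';
  }
  if (provider === 'anthropic') {
    // Only content_block_delta events with a text_delta carry text.
    const isTextDelta =
      chunk.type === 'content_block_delta' && chunk.delta?.type === 'text_delta';
    return isTextDelta ? chunk.delta.text : '';
  }
  if (provider === 'gemini') {
    const parts = chunk.candidates?.[0]?.content?.parts ?? [];
    return parts.map((part) => part.text ?? '').join('');
  }
  throw new Error(`Unknown provider: ${provider}`);
}
```

Fed the three sample lines above, the helper returns "Hello" in each case, and returns an empty string for Anthropic's "event:" line. Tool-call deltas and error events are deliberately left out of this sketch.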

Tool/Function Calling Differences

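Function calling (tool use) varies significantly between providers. OpenAI declares tools as "function" objects with a JSON Schema under "parameters" and returns calls in message.tool_calls, with the arguments encoded as a JSON string. Anthropic declares tools with an "input_schema" and returns "tool_use" content blocks whose "input" is already a parsed object. Gemini declares "functionDeclarations" inside "tools" and returns "functionCall" parts with an "args" object.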
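
As a rough illustration of the response side, the sketch below collects tool calls from each provider's response body into one list. It assumes the commonly documented shapes: message.tool_calls with string-encoded arguments for OpenAI, "tool_use" content blocks for Anthropic, and "functionCall" parts for Gemini. The name extractToolCalls and the output fields are hypothetical, and this sketch does not read a call id for Gemini.

```javascript
// Sketch: collect tool calls from a response body into a common list.
// `extractToolCalls` and the { id, name, args } shape are hypothetical.
function extractToolCalls(provider, body) {
  if (provider === 'openai') {
    const calls = body.choices[0].message.tool_calls ?? [];
    return calls.map((call) => ({
      id: call.id,
      name: call.function.name,
      // OpenAI encodes the arguments as a JSON string
      args: JSON.parse(call.function.arguments)
    }));
  }
  if (provider === 'anthropic') {
    return body.content
      .filter((block) => block.type === 'tool_use')
      .map((block) => ({ id: block.id, name: block.name, args: block.input }));
  }
  if (provider === 'gemini') {
    return body.candidates[0].content.parts
      .filter((part) => part.functionCall)
      .map((part) => ({
        id: null, // not read in this sketch
        name: part.functionCall.name,
        args: part.functionCall.args ?? {}
      }));
  }
  throw new Error(`Unknown provider: ${provider}`);
}
```

Sending the tool result back to the model also differs per provider and is not covered here.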

These format differences are a major reason developers use API relay services. A service like claude4u.com accepts requests in the standard OpenAI Chat Completions format and translates them to whatever format the target provider requires. This means you write your code once and switch between Claude, GPT, and Gemini by changing the model parameter.

Building a Format Translation Layer

If you need to support multiple providers directly, here is the pattern for a minimal translation layer:

// Translate an OpenAI-style request into the target provider's request body.
// Assumes every message's content is a plain string.
function fromOpenAIFormat(provider, request) {
  // Anthropic and Gemini both take the system prompt outside the message list.
  const systemText = request.messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const chatMessages = request.messages.filter((m) => m.role !== 'system');

  if (provider === 'anthropic') {
    return {
      model: request.model,
      system: systemText || undefined,
      messages: chatMessages,
      // max_tokens is required by the Messages API; 1024 is an arbitrary fallback
      max_tokens: request.max_tokens ?? 1024,
      temperature: request.temperature
    };
  }
  if (provider === 'gemini') {
    // For Gemini the model name goes in the URL path, not in the body.
    return {
      system_instruction: systemText
        ? { parts: [{ text: systemText }] }
        : undefined,
      contents: chatMessages.map((m) => ({
        role: m.role === 'assistant' ? 'model' : m.role,
        parts: [{ text: m.content }]
      })),
      generationConfig: {
        maxOutputTokens: request.max_tokens,
        temperature: request.temperature
      }
    };
  }
  return request; // Already OpenAI format
}

Format translation is more complex than it appears. Edge cases around multimodal content (images, files), tool calling responses, and streaming chunk parsing can introduce subtle bugs. Test thoroughly with each provider, especially for streaming and tool use scenarios.
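
Request bodies are not the only thing a translation layer has to adapt, since authentication headers differ as well. The sketch below shows the header names as commonly documented at the time of writing: a Bearer token for OpenAI, "x-api-key" plus an "anthropic-version" header for Anthropic, and "x-goog-api-key" for Gemini. The function name buildHeaders is hypothetical, and the version string should be checked against the providers' current documentation.

```javascript
// Sketch: per-provider HTTP headers for a JSON request.
// `buildHeaders` is a hypothetical helper; verify header names and the
// version value against each provider's current documentation.
function buildHeaders(provider, apiKey) {
  const headers = { 'content-type': 'application/json' };
  if (provider === 'openai') {
    headers['authorization'] = `Bearer ${apiKey}`;
  } else if (provider === 'anthropic') {
    headers['x-api-key'] = apiKey;
    headers['anthropic-version'] = '2023-06-01';
  } else if (provider === 'gemini') {
    headers['x-goog-api-key'] = apiKey;
  } else {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return headers;
}
```

Keeping the headers in one function next to the body translation makes it harder to send a correctly shaped body with the wrong credentials.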

The Practical Solution

For most developers, building and maintaining a format translation layer is not worth the engineering effort. Use the OpenAI format as your standard and reach all providers through an OpenAI-compatible gateway such as claude4u.com. This gives you provider portability without managing multiple API formats in your codebase. Write once, deploy everywhere: that is the promise of a unified API layer.
