Gemini API Error Troubleshooting

Gemini API Error Codes Guide: Troubleshooting 400, 403, 429, 500, and 503 Errors

When working with the Gemini API, encountering errors is inevitable. Understanding what each error code means and how to resolve it can save you hours of debugging. This comprehensive guide covers every common Gemini API error with explanations, causes, and solutions.

400 Bad Request — INVALID_ARGUMENT

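A 400 error means your request is malformed or contains invalid parameters. A common cause is a prompt that exceeds the model's context window, which you can catch by counting tokens before sending: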

# Common fix: validate your request before sending
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Count tokens before sending to avoid context length errors
token_count = client.models.count_tokens(
    model="gemini-2.5-flash",
    contents="Your prompt here"
)
print(f"Token count: {token_count.total_tokens}")

Always validate your input token count before sending large requests. The count_tokens endpoint is free and helps prevent unnecessary 400 errors caused by exceeding the context window.
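
Once you have a token count, the check itself is simple arithmetic. The sketch below is one way to express it; the helper name `fits_context`, the `reserve` parameter, and the limit value are illustrative assumptions, not part of the Gemini SDK, and the real input limit depends on the model you call.

```python
def fits_context(prompt_tokens: int, input_limit: int, reserve: int = 0) -> bool:
    """Return True if the prompt fits the input limit with `reserve` tokens to spare."""
    if prompt_tokens < 0 or input_limit <= 0 or reserve < 0:
        raise ValueError("token counts must be non-negative and the limit positive")
    return prompt_tokens + reserve <= input_limit


# Assumed limit for illustration only; look up the real value for your model.
ASSUMED_INPUT_LIMIT = 1_000_000

print(fits_context(12_000, ASSUMED_INPUT_LIMIT))
print(fits_context(999_500, ASSUMED_INPUT_LIMIT, reserve=1_000))
```

In practice you would pass `token_count.total_tokens` from the snippet above as `prompt_tokens` and skip or truncate the request when the function returns False.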

403 Forbidden — PERMISSION_DENIED

A 403 error means your API key does not have permission to access the requested resource. Start by confirming that the key can list the available models:

# Test your API key
curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY" | head -20

429 Too Many Requests — RESOURCE_EXHAUSTED

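A 429 error means you have exceeded your rate limit or quota. The Gemini API enforces limits along several dimensions, typically requests per minute, tokens per minute, and requests per day, and exceeding any one of them triggers this error.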

Solutions for 429 errors:

  1. Implement rate limiting in your client code to stay within quotas.
  2. Use exponential backoff when retrying after a 429.
  3. Enable billing to move from free tier limits to higher pay-as-you-go quotas.
  4. Request a quota increase through the Google Cloud Console if your pay-as-you-go limits are insufficient.
  5. Use a relay service that distributes requests across multiple API keys.
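
import time

def call_with_rate_limit(client, prompt, rpm_limit=25, max_retries=5):
    """Simple rate limiter for Gemini API calls with bounded retries on 429."""
    min_interval = 60.0 / rpm_limit
    retry_after = 5  # Initial wait in seconds; doubles after each 429

    for attempt in range(max_retries + 1):
        try:
            response = client.models.generate_content(
                model="gemini-2.5-flash",
                contents=prompt
            )
            time.sleep(min_interval)  # Enforce minimum interval between calls
            return response
        except Exception as e:
            # Matching on the message text is a heuristic; prefer the SDK's
            # typed errors if your version exposes a status code.
            if "429" not in str(e) or attempt == max_retries:
                raise
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            retry_after = min(retry_after * 2, 60)  # Exponential backoff, capped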
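
The function above sleeps after every call, even when the caller would have waited long enough anyway. A slightly more precise approach is to track when the next call is allowed and sleep only for the remaining time. The class below is a sketch of that idea; `MinIntervalLimiter` is a hypothetical name, and the injectable `clock` and `sleep` arguments exist only so the behavior can be tested without real waiting.

```python
import time


class MinIntervalLimiter:
    """Space calls at least 60/rpm_limit seconds apart."""

    def __init__(self, rpm_limit: int, clock=time.monotonic, sleep=time.sleep):
        if rpm_limit <= 0:
            raise ValueError("rpm_limit must be positive")
        self.min_interval = 60.0 / rpm_limit
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # None until the first call is made

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Create one limiter per quota you want to respect and call `limiter.wait()` immediately before each `client.models.generate_content(...)` call. This sketch is not thread-safe; concurrent callers would need a lock around `wait()`.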

500 Internal Server Error — INTERNAL

A 500 error indicates an unexpected problem on Google's servers. These errors are typically transient and resolve on their own, so retrying after a short delay is usually the first step.

If you receive consistent 500 errors for the same request, the issue may be with your specific input content rather than a server problem. Try modifying your prompt or input data to narrow down the cause.
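
One way to narrow down a problematic input is to split it into chunks and bisect. The sketch below assumes the failure is deterministic and caused by exactly one chunk, which will not always hold; `find_failing_chunk` and the `fails` callback are hypothetical names, and in real use `fails` would send the given chunks to the API and return True when the request produces a 500.

```python
from typing import Callable, Optional, Sequence


def find_failing_chunk(
    chunks: Sequence[str],
    fails: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Return the index of the single chunk that triggers the failure, or None.

    Assumes the failure is deterministic and caused by exactly one chunk.
    """
    if not chunks or not fails(chunks):
        return None
    lo, hi = 0, len(chunks)  # the culprit is somewhere in chunks[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(chunks[lo:mid]):
            hi = mid  # culprit is in the left half
        else:
            lo = mid  # otherwise it must be in the right half
    return lo
```

Each probe costs one API call, so bisecting n chunks takes roughly log2(n) requests after the initial check.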

503 Service Unavailable — MODEL_CAPACITY_EXHAUSTED

A 503 error means the service does not currently have enough capacity to process your request. Unlike a 429, which reflects your own rate limit or quota, a 503 reflects load on the service as a whole and can affect any user. It is most common during peak demand periods, especially for Gemini 2.5 Pro.
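
Because a 503 is tied to capacity for a particular model, one possible mitigation is to fall back to another model when the preferred one is unavailable. The following is a sketch under stated assumptions: `ServiceUnavailable` is a hypothetical stand-in for whatever exception your client raises on HTTP 503, and `call` is a function you supply that sends the prompt to the named model.

```python
from typing import Callable, Sequence


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 503."""


def generate_with_fallback(
    prompt: str,
    models: Sequence[str],
    call: Callable[[str, str], str],
) -> str:
    """Try each model in order, moving on only when one reports 503."""
    last_error = None
    for model in models:
        try:
            return call(model, prompt)
        except ServiceUnavailable as err:
            last_error = err  # remember the failure and try the next model
    raise RuntimeError(f"all models unavailable: {list(models)}") from last_error
```

For example, passing `["gemini-2.5-pro", "gemini-2.5-flash"]` tries Pro first and uses Flash only if Pro is unavailable. Whether a fallback model is acceptable depends on your quality requirements, so treat this as an option, not a default.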

Best Practices for Error Handling

Build resilient applications by following these patterns:

  1. Always implement retry logic with exponential backoff for 429, 500, and 503 errors.
  2. Log error details including the error code, message, and request metadata for debugging.
  3. Set reasonable timeouts to prevent hanging requests from consuming resources.
  4. Use circuit breakers to stop sending requests when error rates are high.
  5. Monitor error rates and set up alerts for unusual spikes.
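
The first item in the list can be sketched as a small retry helper with exponential backoff and jitter. Everything named here is an assumption for illustration: `ApiError` is a hypothetical exception carrying the HTTP status code (map your client's real error type onto it), and the base delay, cap, and attempt count are placeholder values to tune.

```python
import random
import time

RETRYABLE = {429, 500, 503}


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status code."""

    def __init__(self, code: int, message: str = ""):
        super().__init__(f"{code} {message}".strip())
        self.code = code


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Upper bound for the wait before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(func, max_attempts=5, sleep=time.sleep, rng=random.random):
    """Call func(); retry retryable ApiErrors with jittered exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except ApiError as err:
            # Give up on non-retryable codes or when attempts are exhausted.
            if err.code not in RETRYABLE or attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random fraction of the exponential bound.
            sleep(rng() * backoff_delay(attempt))
```

The random jitter spreads retries from many clients over time so they do not all hit the service at the same moment after an outage. Errors such as 400 and 403 are raised immediately, since retrying an invalid or unauthorized request will not change the outcome.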

Simplify Error Handling with a Relay Service

A relay service like claude4u.com can handle many of these errors on your behalf. It implements automatic retry logic, rate limit management across multiple API keys, and model failover, so your application sees fewer errors and needs less error handling code. The relay service also provides detailed error logging and analytics to help you identify patterns and optimize your usage.
