
v1.77.5-stable - MCP OAuth 2.0 Support

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM
Alexsander Hamir
Backend Performance Engineer

Deploy this version

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:v1.77.5-stable
```

Key Highlights

  • MCP OAuth 2.0 Support - Enhanced authentication for Model Context Protocol integrations
  • Scheduled Key Rotations - Automated key rotation capabilities for enhanced security
  • New Gemini 2.5 Flash & Flash-lite Models - Latest September 2025 preview models with improved pricing and features
  • Performance Improvements - 54% RPS improvement

Scheduled Key Rotations


This release adds support for scheduling virtual key rotations on the LiteLLM AI Gateway.

This is useful for Proxy Admins looking to enforce enterprise-grade security for use cases going through the LiteLLM AI Gateway.

From this release, you can enforce virtual keys to rotate on a schedule of your choice, e.g. every 15, 30, or 60 days.
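As a sketch, a rotation schedule can be requested when generating a virtual key via the proxy's /key/generate endpoint. The field name "rotation_interval" below is hypothetical, used only for illustration; check the key management docs for the exact parameter:

```python
import json
from urllib import request

LITELLM_PROXY_URL = "http://localhost:4000"  # your gateway
ADMIN_KEY = "sk-1234"  # master key placeholder

# "rotation_interval" is a hypothetical field name for illustration;
# see the /key/generate reference for the actual parameter.
payload = {
    "key_alias": "prod-app-key",
    "rotation_interval": "30d",  # rotate every 30 days
}

req = request.Request(
    f"{LITELLM_PROXY_URL}/key/generate",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {ADMIN_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# response = request.urlopen(req)  # uncomment against a running gateway
```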


Performance Improvements - 54% RPS Improvement


This release brings a 54% RPS improvement (1,040 → 1,602 RPS, aggregated) per instance.

The improvement comes from fixing O(n²) inefficiencies in the LiteLLM Router, primarily caused by repeated `in` membership checks against large lists inside loops.

Tests were run with a database-only setup (no cache hits).
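The class of fix is simple to illustrate. This is a minimal sketch (not the actual Router code): each `in` check against a list is O(n), so running it inside a loop over n deployments is O(n²); hoisting the list into a set makes each check O(1).

```python
# O(n^2): each `d["id"] in healthy_ids` scans the whole list.
def filter_deployments_slow(deployments, healthy_ids):
    return [d for d in deployments if d["id"] in healthy_ids]

# O(n): build a set once, then each membership check is O(1).
def filter_deployments_fast(deployments, healthy_ids):
    healthy = set(healthy_ids)
    return [d for d in deployments if d["id"] in healthy]

deployments = [{"id": i} for i in range(10_000)]
healthy_ids = list(range(0, 10_000, 2))  # every other deployment is healthy
assert filter_deployments_slow(deployments, healthy_ids) == \
       filter_deployments_fast(deployments, healthy_ids)
```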

Test Setup

All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up rate of 500 users per second. The environment was configured to stress the routing layer and eliminate caching as a variable.

System Specs

  • CPU: 8 vCPUs
  • Memory: 32 GB RAM

Configuration (config.yaml)

View the complete configuration: gist.github.com/AlexsanderHamir/config.yaml

Load Script (no_cache_hits.py)

View the complete load testing script: gist.github.com/AlexsanderHamir/no_cache_hits.py


New Models / Updated Models

New Model Support

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
| --- | --- | --- | --- | --- | --- |
| Gemini | gemini-2.5-flash-preview-09-2025 | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
| Gemini | gemini-2.5-flash-lite-preview-09-2025 | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
| Gemini | gemini-flash-latest | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
| Gemini | gemini-flash-lite-latest | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
| DeepSeek | deepseek-chat | 131K | $0.60 | $1.70 | Chat, function calling, caching |
| DeepSeek | deepseek-reasoner | 131K | $0.60 | $1.70 | Chat, reasoning |
| Bedrock | deepseek.v3-v1:0 | 164K | $0.58 | $1.68 | Chat, reasoning, function calling |
| Azure | azure/gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
| OpenAI | gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
| SambaNova | sambanova/DeepSeek-V3.1 | 33K | $3.00 | $4.50 | Chat, reasoning, function calling |
| SambaNova | sambanova/gpt-oss-120b | 131K | $3.00 | $4.50 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-coder-480b-a35b-v1:0 | 262K | $0.22 | $1.80 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-235b-a22b-2507-v1:0 | 262K | $0.22 | $0.88 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-coder-30b-a3b-v1:0 | 262K | $0.15 | $0.60 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-32b-v1:0 | 131K | $0.15 | $0.60 | Chat, reasoning, function calling |
| Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-instruct-maas | 262K | $0.15 | $1.20 | Chat, function calling |
| Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-thinking-maas | 262K | $0.15 | $1.20 | Chat, function calling |
| Vertex AI | vertex_ai/deepseek-ai/deepseek-v3.1-maas | 164K | $1.35 | $5.40 | Chat, reasoning, function calling |
| OpenRouter | openrouter/x-ai/grok-4-fast:free | 2M | $0.00 | $0.00 | Chat, reasoning, function calling |
| XAI | xai/grok-4-fast-reasoning | 2M | $0.20 | $0.50 | Chat, reasoning, function calling |
| XAI | xai/grok-4-fast-non-reasoning | 2M | $0.20 | $0.50 | Chat, function calling |
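To route to one of the new models through the proxy, a config.yaml entry follows the usual model_list shape. A sketch, assuming a GEMINI_API_KEY environment variable:

```yaml
model_list:
  - model_name: gemini-flash-preview
    litellm_params:
      model: gemini/gemini-2.5-flash-preview-09-2025
      api_key: os.environ/GEMINI_API_KEY
```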

Features

  • Gemini
    • Added Gemini 2.5 Flash and Flash-lite preview models (September 2025 release) with improved pricing - PR #14948
    • Added new Anthropic web fetch tool support - PR #14951
  • Anthropic
    • Updated Claude Sonnet 4 configs to reflect million-token context window pricing - PR #14639
    • Added supported text field to anthropic citation response - PR #14164
  • Bedrock
    • Added support for Qwen models family & Deepseek 3.1 to Amazon Bedrock - PR #14845
    • Support requestMetadata in Bedrock Converse API - PR #14570
  • Vertex AI
    • Added vertex_ai/qwen models and azure/gpt-5-codex - PR #14844
    • Update vertex ai qwen model pricing - PR #14828
    • Vertex AI Context Caching: use the Vertex AI v1 API instead of v1beta1 and accept the 'cachedContent' param - PR #14831
  • SambaNova
    • Add sambanova deepseek v3.1 and gpt-oss-120b - PR #14866
  • OpenAI
    • Fix inconsistent token configs for gpt-5 models - PR #14942
    • GPT-3.5-Turbo price updated - PR #14858
  • OpenRouter
    • Add gpt-5 and gpt-5-codex to OpenRouter cost map - PR #14879

Bug Fixes

  • Anthropic
    • Fix: Support claude code auth via subscription (anthropic) - PR #14821
    • Fix Anthropic streaming IDs - PR #14965
    • Revert incorrect changes to sonnet-4 max output tokens - PR #14933
  • OpenAI
    • Fix a bug where openai image edit silently ignores multiple images - PR #14893
  • VLLM
    • Fix: vLLM provider's rerank endpoint from /v1/rerank to /rerank - PR #14938

New Provider Support


LLM API Endpoints

Features

  • General
    • Add SDK support for additional headers - PR #14761
    • Add shared_session parameter for aiohttp ClientSession reuse - PR #14721

Bugs

  • General
    • Fix: Streaming tool call index assignment for multiple tool calls - PR #14587
    • Fix load credentials in token counter proxy - PR #14808

Management Endpoints / UI

Features

  • Proxy CLI Auth
    • Allow re-using cli auth token - PR #14780
    • Create a python method to login using litellm proxy - PR #14782
    • Fixes for LiteLLM Proxy CLI to Auth to Gateway - PR #14836

Virtual Keys

  • Initial support for scheduled key rotations - PR #14877
  • Allow scheduling key rotations when creating virtual keys - PR #14960

Models + Endpoints

  • Fix: added Oracle to provider's list - PR #14835

Bugs

  • SSO - Fix: SSO "Clear" button writes empty values instead of removing SSO config - PR #14826
  • Admin Settings - Remove useful links from admin settings - PR #14918
  • Management Routes - Add /user/list to management routes - PR #14868

Logging / Guardrail / Prompt Management Integrations

Features

Guardrails

  • LakeraAI v2 Guardrail - Ensure exception is raised correctly - PR #14867
  • Presidio Guardrail - Support custom entity types in Presidio guardrail with Union[PiiEntityType, str] - PR #14899
  • Noma Guardrail - Add noma guardrail provider to ui - PR #14415

Prompt Management

  • BitBucket Integration - Add BitBucket Integration for Prompt Management - PR #14882

Spend Tracking, Budgets and Rate Limiting

  • Service Tier Pricing - Add service_tier based pricing support for openai (BOTH Service & Priority Support) - PR #14796
  • Cost Tracking - Show input, output, tool call cost breakdown in StandardLoggingPayload - PR #14921
  • Parallel Request Limiter v3
    • Ensure Lua scripts can execute on redis cluster - PR #14968
    • Fix: get metadata info from both metadata and litellm_metadata fields - PR #14783
  • Priority Reservation - Fix: keys without priority metadata no longer receive higher priority than keys with explicit priority configurations - PR #14832
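Some background on the Redis Cluster Lua fix: a Lua script (EVAL) may only touch keys that live in the same hash slot, and Redis Cluster assigns slots by hashing each key name unless the key contains a hash tag, in which case only the substring inside the first `{...}` is hashed. A sketch of how a limiter can group its per-key counters with a hash tag (the key layout below is hypothetical, not the limiter's actual key names):

```python
# Redis Cluster hashes only the "{...}" hash tag when one is present, so keys
# sharing a tag land in the same slot and can be passed together to one EVAL.
def limiter_keys(api_key: str) -> list[str]:
    # Hypothetical key layout, for illustration only.
    tag = "{" + api_key + "}"
    return [f"{tag}:requests", f"{tag}:tokens", f"{tag}:last_reset"]

keys = limiter_keys("sk-1234")
# All three keys share the tag "{sk-1234}", so a cluster-safe Lua script can
# read and update them atomically in a single call.
```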

MCP Gateway

  • MCP Configuration - Enable custom fields in mcp_info configuration - PR #14794
  • MCP Tools - Remove server_name prefix from list_tools - PR #14720
  • OAuth Flow - Initial commit for v2 oauth flow - PR #14964

Performance / Loadbalancing / Reliability improvements

  • Memory Leak Fix - Fix InMemoryCache unbounded growth when TTLs are set - PR #14869
  • Cache Performance - Fix: cache root cause - PR #14827
  • Concurrency Fix - Fix concurrency/scaling when many Python threads do streaming using sync completions - PR #14816
  • Performance Optimization - Fix: reduce get_deployment cost to O(1) - PR #14967
  • Performance Optimization - Fix: remove slow string operation - PR #14955
  • DB Connection Management - Fix: DB connection state retries - PR #14925
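The memory-leak fix above addresses a common bug class: an in-memory cache whose TTL'd entries are never purged grows without bound. A minimal sketch of a bounded alternative (an illustration, not LiteLLM's actual InMemoryCache): purge expired entries on write and enforce a maximum size.

```python
import time

# Illustration of the bug class, not LiteLLM's InMemoryCache: staying bounded
# needs (a) purging expired entries and (b) a hard cap with eviction.
class BoundedTTLCache:
    def __init__(self, max_size=1024):
        self.max_size = max_size
        self.store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        now = time.monotonic()
        # Purge expired entries on write so TTL'd keys cannot pile up forever.
        self.store = {k: v for k, v in self.store.items() if v[1] > now}
        if len(self.store) >= self.max_size:
            # Evict the entry closest to expiry to respect the cap.
            self.store.pop(min(self.store, key=lambda k: self.store[k][1]))
        self.store[key] = (value, now + ttl)

    def get(self, key):
        item = self.store.get(key)
        if item is None or item[1] <= time.monotonic():
            self.store.pop(key, None)  # lazily drop expired entries on read
            return None
        return item[0]
```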

Documentation Updates

  • Provider Documentation - Fix docs for provider_specific_params.md - PR #14787
  • Model References - Update model references from gemini-pro to gemini-2.5-pro - PR #14775
  • Letta Guide - Add Letta Guide documentation - PR #14798
  • README - Make the README document clearer - PR #14860
  • Session Management - Update docs for session management availability - PR #14914
  • Cost Documentation - Add documentation for additional cost-related keys in custom pricing - PR #14949
  • Azure Passthrough - Add azure passthrough documentation - PR #14958
  • General Documentation - Doc updates sept 2025 - PR #14769
    • Clarified bridging between endpoints and mode in docs.
    • Added Vertex AI Gemini API configuration as an alternative in relevant guides. Linked AWS authentication info in the Bedrock guardrails documentation.
    • Added Cancel Response API usage with code snippets
    • Clarified that SSO (Single Sign-On) is free for up to 5 users.
    • Alphabetized sidebar, leaving quick start / intros at top of categories
    • Documented max_connections under cache_params.
    • Clarified IAM AssumeRole Policy requirements.
    • Added transform utilities example to Getting Started (showing request transformation).
    • Added references to models.litellm.ai as the full models list in various docs.
    • Added a code snippet for async_post_call_success_hook.
    • Removed broken links to callbacks management guide.
    • Reformatted and linked cookbooks and other relevant docs.
  • Documentation Corrections - Corrected docs updates sept 2025 - PR #14916

New Contributors

  • @uzaxirr made their first contribution in PR #14761
  • @xprilion made their first contribution in PR #14416
  • @CH-GAGANRAJ made their first contribution in PR #14779
  • @otaviofbrito made their first contribution in PR #14778
  • @danielmklein made their first contribution in PR #14639
  • @Jetemple made their first contribution in PR #14826
  • @akshoop made their first contribution in PR #14818
  • @hazyone made their first contribution in PR #14821
  • @leventov made their first contribution in PR #14816
  • @fabriciojoc made their first contribution in PR #10955
  • @onlylonly made their first contribution in PR #14845
  • @Copilot made their first contribution in PR #14869
  • @arsh72 made their first contribution in PR #14899
  • @berri-teddy made their first contribution in PR #14914
  • @vpbill made their first contribution in PR #14415
  • @kgritesh made their first contribution in PR #14893
  • @oytunkutrup1 made their first contribution in PR #14858
  • @nherment made their first contribution in PR #14933
  • @deepanshululla made their first contribution in PR #14974
  • @TeddyAmkie made their first contribution in PR #14758
  • @SmartManoj made their first contribution in PR #14775
  • @uc4w6c made their first contribution in PR #14720
  • @luizrennocosta made their first contribution in PR #14783
  • @AlexsanderHamir made their first contribution in PR #14827
  • @dharamendrak made their first contribution in PR #14721
  • @TomeHirata made their first contribution in PR #14164
  • @mrFranklin made their first contribution in PR #14860
  • @luisfucros made their first contribution in PR #14866
  • @huangyafei made their first contribution in PR #14879
  • @thiswillbeyourgithub made their first contribution in PR #14949
  • @Maximgitman made their first contribution in PR #14965
  • @subnet-dev made their first contribution in PR #14938
  • @22mSqRi made their first contribution in PR #14972

Full Changelog