[Image: GitHub Copilot auto model selection UI in VS Code, showing model routing options. Credit: GitHub]
by VibecodedThis

GitHub Copilot Made Four Model Changes in Three Days

From May 18 to May 20, GitHub added cheap cloud agent models, brought Gemini 3.5 Flash to IDEs, launched auto model selection in VS Code with a 10% discount, and then stripped all Gemini models from web chat.

Between May 18 and May 20, GitHub pushed four model-related changes to Copilot. They pull in almost opposite directions, which makes the week worth unpacking.

May 18: Cheaper models for cloud agent tasks

GitHub added two 0.33x cost-multiplier models to the Copilot cloud agent: Claude Haiku 4.5 and GPT-5.4-mini. These are for straightforward delegated tasks — routine changes that don’t need a full-capability model.

The math is simple: at a 0.33x multiplier, three cloud agent tasks consume about as many premium requests (3 × 0.33 = 0.99) as a single task on a 1x model. If you’re using the cloud agent to handle simple follow-ups, auto-fixes, and boilerplate changes, picking a smaller model preserves your quota for work that actually needs it.
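To make the quota arithmetic concrete, here is a minimal Python sketch. It assumes each cloud agent task is billed as one premium request times the model's multiplier, which is how the figures above read; the 300-request allowance and the function names are hypothetical, chosen only for illustration.

```python
from decimal import Decimal


def premium_requests_used(tasks: int, multiplier: str) -> Decimal:
    """Premium requests consumed by `tasks` tasks at a given multiplier.

    Assumption: one task costs exactly one request times the multiplier.
    """
    return tasks * Decimal(multiplier)


def tasks_within_budget(budget: int, multiplier: str) -> int:
    """Whole number of tasks that fit inside a premium request budget."""
    return int(Decimal(budget) / Decimal(multiplier))


# Hypothetical monthly allowance, not a figure from GitHub.
BUDGET = 300

print(premium_requests_used(3, "0.33"))     # 0.99
print(tasks_within_budget(BUDGET, "1"))     # 300
print(tasks_within_budget(BUDGET, "0.33"))  # 909
```

Under those assumptions, the same allowance stretches to roughly three times as many delegated tasks on a 0.33x model.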

May 19: Gemini 3.5 Flash added for IDE and agent use

One day later, Gemini 3.5 Flash became generally available across Copilot Pro, Pro+, Business, and Enterprise plans. GitHub describes it as offering near-Pro coding quality at Flash-tier speed, with strong tool use and high cache efficiency suited to fast, iterative agentic workflows.

The catch: it launches at a 14x premium request multiplier. GitHub notes the pricing is tentative and may change, but at launch it is one of the more expensive options in the roster.

Business and Enterprise administrators need to explicitly enable the Gemini 3.5 Flash policy in Copilot settings before users can access it.

May 20: Auto model selection in VS Code, with a 10% discount

Auto model selection arrived in VS Code the following day. When you pick “Auto” in the model picker, Copilot evaluates the task across “reasoning, code generation complexity, bug diagnosis difficulty, and tool orchestration needs” and routes to a model in real time.

Paid subscribers using Auto get a 10% discount on the model multiplier. A model with a 1x multiplier, for example, is billed at 0.9x when selected through Auto. GitHub says the system also optimizes cache usage, reducing token spend on repeated context.
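The discount is a straight 10% reduction of whatever multiplier the routed model carries. A small sketch, assuming the discount applies uniformly to any model Auto picks (only the 1x to 0.9x case is confirmed above; the 0.33x input and the function name are illustrative):

```python
from decimal import Decimal

AUTO_DISCOUNT = Decimal("0.10")  # the 10% Auto discount for paid subscribers


def auto_multiplier(base_multiplier: str) -> Decimal:
    """Effective multiplier when a model is selected through Auto.

    Assumption: the discount is applied uniformly to the base multiplier.
    """
    return (Decimal(base_multiplier) * (1 - AUTO_DISCOUNT)).normalize()


print(auto_multiplier("1"))     # 0.9
print(auto_multiplier("0.33"))  # 0.297
```

Which models Auto actually routes to is not specified here, so treat any multiplier other than 1x as an example input rather than a confirmed price.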

Transparency is built in: hover over a response to see which model was selected. Switching to a specific model manually still works at any point. No configuration is required to start using it.

May 20: All Gemini models removed from web chat

Also on May 20, GitHub removed all Gemini models from Copilot Chat on github.com. Two OpenAI models went with them: GPT-5.2 Codex and GPT-5.4 nano.

The official explanation: “While model choice is valuable, we are limiting the list of available models on github.com so that we can consistently ensure reliable responses.” OpenAI and Claude models remain available across all plans on the web.

The contrast with the day before is obvious. GitHub added Gemini 3.5 Flash to IDEs on May 19, then removed all Gemini models from web chat on May 20. GitHub’s explanation makes more sense when you treat these as different products: the IDE is a developer tool with wide model flexibility, while the web chat is a consumer-facing surface GitHub wants to keep consistent.

Going forward, GitHub says web chat will support “a more limited set of new model rollouts” as they work to ensure quality.

The net result for developers: if you need Gemini in Copilot, use the IDE or the API. The web chat is now OpenAI and Anthropic only.
