Every major AI model update, in one place. Updated through Feb 2026.
Anthropic releases Claude 3.7 Sonnet, featuring extended thinking mode that can reason for up to 128K tokens before responding. Sets new SOTA on coding and math benchmarks. Available on API immediately.
Read announcement โxAI announces Grok 3, trained on a 100K H100 cluster (10x Grok 2). Claims state-of-the-art on AIME and GPQA benchmarks. Available to Premium+ subscribers first.
Read announcement โGoogle releases Gemini 2.0 Flash as GA, featuring native tool use (search, code execution), multimodal live API, and 1M context window. Twice the speed of 1.5 Flash at same price.
Read announcement โOpenAI releases o3-mini, a compact reasoning model that competes with larger models on STEM tasks. Three effort levels: low, medium, high. Dramatically cheaper than o3 full.
Read announcement โxAI launches DeepSearch, a feature allowing Grok to iteratively search the web and reason across multiple sources before answering. Similar to Perplexity Deep Research.
Read announcement โClaude 3.5 Haiku is now GA after limited beta. Outperforms the previous Claude 3 Opus on most benchmarks at a fraction of the cost. Priced at $0.80/M input and $4/M output tokens.
Read announcement โGoogle prices Gemini 2.0 Flash at $0.075/M input and $0.30/M output โ 10x cheaper than GPT-4o. Free tier: 15 RPM, 1M TPM with no cost. Aggressive push for developer adoption.
Read announcement โMistral releases Small 3 (24B), trained with a novel distillation technique. Outperforms Llama 3.1 70B on most benchmarks at 3x fewer parameters. Apache 2.0 licensed.
Read announcement โOpenAI announces o3, achieving 87.5% on ARC-AGI (a test of general reasoning). Not yet released publicly, but marks a significant milestone. Safety evaluations ongoing.
Read announcement โGoogle announces Veo 2, a video generation model that produces 4K video with unprecedented quality and camera control. Available through VideoFX and coming to Gemini.
Read announcement โGoogle announces the Gemini 2.0 family, designed for an "agentic era" with native tool calling, streaming, and a new Live API for real-time multimodal interactions.
Read announcement โMeta releases Llama 3.3 70B, a fine-tuned instruction model that matches Llama 3.1 405B performance on most benchmarks at 6x less compute. Best open-source model per parameter.
Read announcement โCanvas is a new ChatGPT interface for working on long documents and code side-by-side with Claude. Supports inline edits, comments, and version history.
Read announcement โDeep Research uses Gemini 1.5 Pro to autonomously search the web, synthesize findings, and produce detailed research reports. Available to Gemini Advanced subscribers.
Read announcement โAnthropic releases Computer Use, allowing Claude to view screenshots and control mouse/keyboard on real computers. First LLM with native computer control. Available via API.
Read announcement โxAI launches its public API with OpenAI-compatible endpoints. Drop-in replacement for OpenAI in many apps. Priced at $2/M input, $10/M output for grok-2-1212.
Read announcement โSignificant upgrade to Claude 3.5 Sonnet with major improvements to coding, especially tool use, multi-step tasks, and code editing. Computer Use public beta launched alongside.
Read announcement โMistral launches Le Chat Pro, a premium subscription to their AI assistant with web search, image generation, and document analysis. Positioned as a privacy-first alternative for European users.
Read announcement โThe Realtime API enables sub-300ms voice conversations with GPT-4o, with function calling and VAD (voice activity detection) built in. Built for always-on voice agents.
Read announcement โMeta releases Llama 3.2 with vision capabilities (11B and 90B) and lightweight models (1B and 3B) optimized for mobile and edge deployment. First multimodal open-source Llama.
Read announcement โMistral releases Pixtral 12B, a multimodal model combining a 12B language model with a 400M vision encoder. Open-source, supports variable image resolutions.
Read announcement โPrompt caching allows reusing prompt prefixes across API calls. Up to 90% cost reduction and 85% latency improvement for repeated prompts. Available for Claude 3 family.
Read announcement โo1 is OpenAI's first model trained with reinforcement learning to reason before responding. Excels at complex math, science, and coding. Available to ChatGPT Plus and API users.
Read announcement โxAI releases Grok 2 and Grok 2 mini with significant improvements over Grok 1.5. Real-time X.com data access and new image generation via Aurora model. API available.
Read announcement โOpenAI launches Structured Outputs, guaranteeing model outputs match any JSON schema exactly. Eliminates parsing errors in production. Supported across GPT-4o and newer.
Read announcement โMistral releases Large 2 with 123B parameters, achieving GPT-4 level performance on coding and math. Supports 128K context, function calling, and 80+ languages.
Read announcement โMeta releases Llama 3.1, including a 405B parameter model that competes with GPT-4o and Claude 3.5 Sonnet. Fully open weights. Supports 128K context, 8 languages.
Read announcement โGPT-4o mini replaces GPT-3.5 Turbo as the go-to cheap model. Scores 82% on MMLU (higher than GPT-4 at launch). Priced at $0.15/M input, $0.60/M output.
Read announcement โAnthropic open-sources the Model Context Protocol, a universal standard for connecting LLMs to external tools, data, and services. Already has 40+ official server implementations.
Read announcement โFirst model in the Claude 3.5 family. Sets new benchmark records on coding (HumanEval), reasoning (GPQA), and vision tasks. 2x faster than Claude 3 Opus at lower cost.
Read announcement โGemini 1.5 Pro launches with a 1M token context window (2M in private preview). Processes entire codebases, hour-long videos, or thousands of documents in one call.
Read announcement โGPT-4o ("omni") natively processes text, audio, and images in a single model. Real-time voice conversations with emotion detection. 2x faster than GPT-4 Turbo, 50% cheaper.
Read announcement โMeta releases Llama 3 with 8B and 70B models, significantly outperforming Llama 2. Trained on 15T tokens, showing major improvements in reasoning, code, and instruction following.
Read announcement โOpenAI deprecates the original GPT-4 Turbo (gpt-4-turbo-preview and gpt-4-0125-preview). Customers redirected to GPT-4o, which outperforms at lower cost. Shutdown date: June 2025.
Read announcement โNew Files API allows storing documents server-side for repeated use across conversations. Citations feature returns precise document references with exact quotes for RAG workflows.
Read announcement โ