article

AI Dev Essentials #33: Gemini 3 Pro, Antigravity IDE, & GPT-5.1 Codex Max

Gemini 3 Pro & Antigravity IDE launch, OpenAI ships GPT-5.1-Codex-Max, and Cloudflare buys Replicate. Get the full breakdown in AI Dev Essentials #33.

Hey Everyone 👋,

John Lindquist here with the 33rd issue of AI Dev Essentials!

Google launched Gemini 3 Pro alongside their new Antigravity IDE, both representing significant advances in multimodal understanding and agentic development. OpenAI released GPT-5.1-Codex-Max with 24-hour autonomous task capabilities and ChatGPT for Teachers, while making group chats available in pilot regions. Claude models became available in Microsoft Foundry with structured outputs now in public beta. Cloudflare acquired Replicate to strengthen their AI infrastructure.

I've been spending the week diving deep into Anthropic's engineering guide on Code Execution with MCP and exploring their frontend design plugin to prepare for the upcoming workshop. I've also been putting Gemini 3 through its paces to determine if it earns a spot in my daily workflow.

🚨 Happening TOMORROW! 🚨 Last chance to sign up for the Claude Code Power User Workshop - European Timezone Editon 🌍: https://egghead.io/workshop/claude-code

🚀 Major Announcements

Google Launches Gemini 3 Pro with Record Benchmark Scores [Breaking]

Google DeepMind announced Gemini 3 Pro on November 18, 2025, achieving the highest benchmark scores for a publicly available model with 1501 on LMArena and introducing new capabilities in multimodal understanding and generative UI.

Model capabilities:

Record Performance: First model to exceed 1500 on LMArena benchmark
PhD-Level Reasoning: 37.5% on Humanity's Last Exam (without tools), 91.9% on GPQA Diamond
Best Multimodal Understanding: Leading performance across text, images, audio, and video
Generative UI: LLMs can generate complete user experiences including web pages, games, tools, and applications
Availability: Gemini app on desktop, mobile app, and mobile web with "Thinking" mode selection
Developer Access: Available in AI Studio and via Gemini API

Gemini 3 Deep Think variant:

Enhanced Reasoning: Outperforms Gemini 3 Pro on major benchmarks
Humanity's Last Exam: 41.0% (without tools)
GPQA Diamond: 93.8%
ARC-AGI: 45.1% with code execution
Availability: Coming to Google AI Ultra subscribers in the coming weeks

(Google Official Blog, Google Developers Blog, TechCrunch, CNBC)

This is really big. Gemini 3 has been absolutely crushing it designs and one-shotting incredible demos. I've had it build a 3d playable piano, recreate Wordle and other games, and create almost perfect designs of Slack and other UI-heavy apps. I've had much worse luck in longer programming tasks and I've heard evidence that it requires a lot of "hand holding" and specific instuctions, but maybe that's because it's still considered a preview? Either way, it's definitely a tool I will be using a lot for prototyping designs. Google is quickly integrating Gemini 3 into every product, even the main Google "AI Mode" which I highly recommend checking out. Things are really heating up.

Google Unveils Antigravity: Agent-First IDE Powered by Gemini 3 [New IDE]

Google launched Antigravity on November 18, 2025, a new agentic development platform featuring agent-first architecture with direct access to editor, terminal, and browser for autonomous execution.

Platform features:

Multi-Model Support: Gemini 3, Anthropic Claude Sonnet 4.5, and OpenAI GPT OSS models
Two Main Modes: Editor view for hands-on coding with agent sidebar; Manager view for mission control and async orchestration
Direct Tool Access: Agents can autonomously use editor, terminal, and browser without approval friction
Free Public Preview: Available for Mac, Windows, and Linux with generous Gemini 3 Pro rate limits
Download: antigravity.google

Strategic positioning:

Competes directly with Cursor, Windsurf, Claude Code, and other agentic IDEs
Built on VS Code fork similar to competitors but with Google's AI infrastructure
Focus on async agent orchestration for background task management

(TechCrunch, VentureBeat, The New Stack)

Apparently they forked Windsurf (they forgot to remove some references to "Cascade") and are attempting a Cursor clone of sorts. I've tried it a handful of times, but keep running into model limits and errors, so I don't really have a read on it. I've had much better luck using Gemini 3 in the Gemini CLI and Cursor. I'll check back in on AntiGravity once they sort out the launch hiccups.

OpenAI Releases GPT-5.1-Codex-Max with 24-Hour Task Capabilities [Model Update]

OpenAI announced GPT-5.1-Codex-Max on November 19, 2025, introducing the first model natively trained to operate across multiple context windows using compaction technology for autonomous work spanning 24+ hours.

Technical capabilities:

Compaction Technology: Natively trained to operate coherently over millions of tokens in a single task
Multi-Day Tasks: Internally tested completing tasks lasting 24+ hours including multi-step refactors, test-driven iteration, and autonomous debugging
Performance: 77.9% on SWE-Bench Verified at extra-high reasoning effort
Token Efficiency: Uses 30% fewer thinking tokens than GPT-5.1-Codex at medium reasoning effort
Windows Support: First OpenAI model trained to operate in Windows environments
Availability: Available in Codex (CLI, IDE extension, cloud, code review); API access coming soon

(OpenAI Blog, VentureBeat, The Decoder)

This has been out less than 24 hours, but it's apparently the best coding model available. I haven't had enough time to properly evaluate it, but I've heard nothing but good things.

GPT-5.1 Pro Spotted in the Wild [Model Update]

Reports are flooding in that OpenAI has begun rolling out GPT-5.1 Pro to Pro and Team users, with early access users calling it "easily the most capable and impressive model" to date.

Rollout details:

Availability: Spotted appearing for Pro and Team users (Nov 19)
Performance: Early reviews describe it as a "monster" with significant capability jumps
Capabilities: Matt Shumer's early review highlights it as the "most capable" model he's used
Status: Silent rollout, appearing in model selectors without a dedicated blog post yet

(TestingCatalog News, Matt Shumer Review)

GPT-5.1 Pro showed up in my account a few minutes ago. I packed up a bunch of files from my Script Kit project and dumped them into the model and came away pretty impressed. I imagine I'll be using this a lot for planning. I'm curious if/when it will hit APIs so we could script our own workflows.

Claude Models Available in Microsoft Foundry with $30B Azure Commitment [Model Update]

Anthropic announced on November 18, 2025, that Claude Sonnet 4.5, Haiku 4.5, and Opus 4.1 are now available in public preview in Microsoft Foundry, alongside a $30 billion Azure cloud capacity commitment and $15 billion combined investment from NVIDIA and Microsoft.

Partnership details:

Models Available: Claude Sonnet 4.5, Haiku 4.5, and Opus 4.1 in public preview
Azure Commitment: Anthropic agreed to spend $30 billion on Microsoft Azure cloud capacity
Strategic Investment: NVIDIA ($10 billion) and Microsoft ($5 billion) investing $15 billion combined in Anthropic
Full Capabilities: All models support code execution, web search/fetch, citations, vision, tool use, and prompt caching
MACC Eligible: Models eligible for Microsoft Azure Consumption Commitment agreements
Integration: Claude models can be used with Claude Code in Foundry environment

(Anthropic Newsroom, Microsoft Azure Blog, NVIDIA Blog)

It's good to see Anthropic making moves. Claude Code and Sonnet 4.5 are still my go-to models for writing code and just kicking off random tasks without having to give it a ton of direction. I'm really hoping to see an Opus 4.5 release soon with Google and OpenAI making big moves.

OpenAI Launches ChatGPT for Teachers Free Through June 2027 [Education]

OpenAI announced ChatGPT for Teachers on November 19, 2025, providing free access to GPT-5.1 Auto with unlimited messages for verified U.S. K-12 educators through June 2027.

Program details:

Free Access: Unlimited messages with GPT-5.1 Auto, search, file uploads, connectors, and image generation
Education Compliance: FERPA-compliant with education-grade privacy and security
Student Data Protection: Student data will not be used to train models
Immediate Availability: Available to all educators immediately upon verification
District Partnerships: Active with 150,000+ teachers and staff across Fairfax County Public Schools, Houston ISD, Fulton County Schools, Prince William County Public Schools
Purpose: Give teachers hands-on experience with AI tools to establish best practices

(OpenAI Blog, CNBC, Axios, Engadget)

Smart play by OpenAI. Education is such an excellent use case for AI, especially when it comes to customizing lessons for specific classrooms and students. I just hope the teachers realize they still need to review the output for accuracy (especially anything Math related). I'm curious if they'll have some sort of teacher onboarding experience to help them get started.

🛠️ Developer Tooling Updates

Cloudflare Acquires Replicate to Strengthen AI Infrastructure [Acquisition]

Cloudflare announced on November 17, 2025, an agreement to acquire Replicate, an AI model deployment platform hosting 50,000+ production-ready models, with completion expected in two months.

Acquisition details:

Model Catalog: Replicate hosts 50,000+ production-ready AI models
Timeline: Acquisition expected to complete in two months
Integration: Replicate's models will integrate with Cloudflare Workers AI
Brand: Replicate will continue as distinct brand with better speed and resources
Developer Impact: Promises more seamless AI cloud experience for developers

(Cloudflare Press Release, Business Wire, Replicate Blog)

Seems like a nice pairing. Although I think Cloudflare lost a lot of the good will from this acquisition thanks to the major outage this week...

Gemini CLI 0.16.0 Ships Major UI Overhaul [CLI]

Google released Gemini CLI v0.16.0 on November 15-18, 2025, featuring a complete UI rendering overhaul eliminating visual noise and adding mouse support.

UI improvements:

Eliminated Flickering: Visual noise and bouncing screens completely removed
Mouse Support: Click and navigate within input prompt
Fixed Cursor Position: No longer lost in long output streams
Scrollbar Improvements: Click-and-drag functionality added
Gemini 3 Pro Access: Support for paid Gemini API keys enabling Gemini 3 Pro in terminal
Continued Updates: v0.17.0-preview.0 released November 18 with additional polish

(GitHub Releases, Google Developers Blog, Google Developers Blog: Gemini 3 in CLI)

I've heard rumors that the Gemini CLI is making big moves. I'm really excited to see what they're up to. I just feel like all the other CLI tools are so far behind Claude Code when it comes to customization and scriptability.

Claude Structured Outputs Launch in Public Beta [Beta]

Anthropic released Structured Outputs in public beta on November 14, 2025, for Claude Sonnet 4.5 and Opus 4.1, guaranteeing JSON schema compliance and eliminating parsing errors.

Technical features:

Schema Compliance: Guarantees JSON schema compliance in API responses
Zero Parsing Errors: Eliminates parsing errors requiring retry logic and validation workarounds
Python & TypeScript: Supports both Pydantic and Zod schema definitions
Beta Header: Requires structured-outputs-2025-11-13 header
Model Support: Sonnet 4.5 and Opus 4.1 (Haiku 4.5 support coming soon)
Production Focus: Addresses production LLM reliability issues

(Claude Developer Platform Blog, Anthropic Release Notes)

We've all had a lot of hacks and workarounds in place do to the lack of structured outputs that we're finally going to be able to retire. Woo!

NotebookLM Adds Image Support for Research Sources

Google announced on November 13, 2025, that NotebookLM now supports images as source material, including photos, screenshots, infographics, handwritten notes, and whiteboards.

New capabilities:

Image Sources: Photos, screenshots, infographics, handwritten notes, whiteboards
Rollout: Available over subsequent weeks after announcement
iOS Support: v1.17.0 (released November 18) added image file support
Additional Updates: Part of larger expansion including Deep Research feature and Google Sheets/Word doc support

(Google Blog, TechCrunch, Digital Trends)

My biggest use case is screenshots from my phone. Such an awesome experinece to be able to capture anything and drop it into a long term memory bank.

🤖 AI Ecosystem Updates

OpenAI Pilots Group Chats in ChatGPT Across Four Regions

OpenAI launched a pilot for group chats in ChatGPT on November 13, 2025, enabling up to 20 participants to collaborate with ChatGPT in shared conversations.

Pilot details:

Pilot Regions: Japan, New Zealand, South Korea, and Taiwan
Participant Limit: Up to 20 participants per chat
Platform: Mobile and web
Availability: Free, Plus, Go, and Pro users
Model: Powered by GPT-5.1 Auto (routing system selecting best model automatically)
Features: Search, image generation, file uploads, dictation, emoji reactions, personalized images
Privacy: Group chats separate from private conversations; personal ChatGPT memory never shared

(OpenAI Blog, TechCrunch, Engadget, Axios)

Anthropic Partners with Rwanda and ALX for Chidi AI Learning Companion

Anthropic announced on November 18, 2025, a partnership with the Government of Rwanda and ALX Africa to deploy Chidi, a Socratic learning companion built on Claude, reaching 200,000+ students across Africa.

Partnership details:

Launch Date: November 4, 2025 (rollout start)
Reach: 200,000+ students through ALX (African tech training provider)
Government Partners: Rwanda's ICT & Innovation and Education ministries
Early Results: 1,100+ conversations, nearly 4,000 learning sessions; 9 out of 10 users reporting positive experiences
Phase 1: Teacher training for 2,000 teachers and civil servants in Rwanda
Cost Structure: Anthropic covers LLM/API costs; ALX provides training/implementation; Rwanda provides policy support and school access

(Anthropic Newsroom, APO Group, EdTech Innovation Hub)

Google Releases WeatherNext 2 with 8x Faster Forecasting

Google DeepMind introduced WeatherNext 2 on November 17, 2025, as the most advanced and efficient weather forecasting model using a new Functional Generative Network architecture.

Model capabilities:

8x Faster: 1-hour resolution forecasting
New Architecture: Functional Generative Network (FGN) approach
Superior Performance: Surpasses previous model on 99.9% of variables and lead times (0-15 days)
Scenario Generation: Hundreds of forecast scenarios in under 1 minute using single TPU
Integration: Now in Google Search, Gemini, Pixel Weather, Google Maps Platform

(Google Blog, 9to5Google, Android Authority, TechRadar)

⚡ Quick Updates

GPT-5.1 Models Available in Cursor

Three Models: GPT-5.1 for everyday tasks, GPT-5.1 Codex for ambitious coding, GPT-5.1 Codex Mini for cost-efficient changes
Availability: All models accessible in Cursor as of November 13
Integration: Full Cursor feature support including Composer and Agent

(Cursor)

Claude Code Frontend Design Plugin Released

Release Date: November 12, 2025
Purpose: Addresses "distributional convergence" (LLM tendency toward generic designs)
Four-Dimension Framework: Typography, Color & Theme, Motion, Backgrounds
Availability: Official Claude Code GitHub repository

(Claude Blog, GitHub)

Google Releases 54-Page AI Agent Production Guide

Release: November 2025
Authors: Alan Blount, Antonio Gulli, Shubham Saboo, Michael Zimmermann, Vladimir Vuskovic
Content: Five-level agent taxonomy, operational cycle framework, production standards
Access: Free download from Google Cloud

(Google Cloud)

✨ Workshop Spotlight

Claude Code Power User Workshop

European Timezone Editon 🌍 (🚨 TOMORROW! 🚨)

Date: Tomorrow, Friday November 21, 2025 Time: 2:00 PM - 7:00 PM (CET) Platform: Zoom

Pricing:

Regular Price: $375
egghead.io Pro Yearly Member: $275 - Become a Pro member →

What You'll Learn:

Learn the essential skills to ship reliable AI-generated code with confidence. This hands-on workshop covers everything from foundational prompting to advanced automation using the Claude Code SDK and custom integrations.

Core Skills:

Context Engineering: Control what context Claude sees for reliable, consistent results
TypeScript SDK: Script Claude programmatically to build custom workflows
Custom Hooks: Automate repetitive tasks with Claude Code hooks
Model Context Protocol: Integrate APIs securely to extend Claude's capabilities
Claude Code Skills: Build custom skills for Claude Code
Live Q&A: Get your specific questions answered by John Lindquist

👉 Register: https://egghead.io/workshop/claude-code

Read this far? Share "AI Dev Essentials" with a friend! - https://egghead.io/newsletters/ai-dev-essentials

John Lindquist

https://egghead.io