
AI Dev Essentials #19: GPT-5's Mixed Debut, GitHub CEO Exits, & Claude's 1M Token Window

AI Dev Essentials #19 dives into GPT-5’s rocky launch, Claude’s 1M token context, new AI tooling updates, and tips for boosting CLI workflows with AI.

AI Dev Essentials - Issue #19

Hey Everyone 👋,

John Lindquist here with the 19th issue of AI Dev Essentials!

I spent the past week building out an app to streamline "video prompting". I'll put out a call for beta testers next week 😇

Throughout the process, I've had really good experiences asking Claude Code to drive CLIs (gcloud, vercel, and GitHub) to accomplish tasks like setting up storage buckets, managing permissions, and auto-fixing failed GitHub Actions. These tasks used to mean painful days wasted digging through complicated settings pages. Thanks to AI's understanding of CLIs, their flags, and their options, it can run the right commands in one shot, verify they're working, and then send you the URLs to confirm. Honestly, using CLIs with natural language has taken so much of the weight and boredom out of chore work and allowed me to focus on the fun stuff. Has anyone else felt the same?

I'll be recording some videos based on this week's experiences tomorrow, so I should be posting some interesting videos to egghead soon. They just didn't quite make the cutoff for today's newsletter.

The rest of the time I spent polishing up materials for my Claude Code workshop next Friday. I'm really enjoying teaching all of these powerhouse workflows and excited to share the countless hours of work I've put into them with you.

These Claude Code workshops keep selling out, so grab your ticket now: https://egghead.io/workshop/claude-code

On to the news!

🚀 Major Announcements

GPT-5's Rocky Launch Leaves Developers Divided

OpenAI released GPT-5 on August 7, 2025, but the highly anticipated launch has been met with confusion and frustration as the model fails to live up to pre-release hype.

The reality check:

Preview vs. Production: Developers who tested preview models report significant degradation in production
Performance issues: Widespread complaints about speed, especially in IDEs like Cursor where it's "absolutely painful to use"
Routing confusion: Unclear how the model decides between instant responses and deeper reasoning
Mixed reception: Community feedback ranges from "underwhelming" to questioning if they're using it wrong
Integration problems: Slowness and inconsistent behavior across different platforms

What was promised vs. delivered:

Promise: Breakthrough reasoning with intelligent routing
Reality: Inconsistent performance that has developers reverting to Claude or GPT-4
Promise: Enhanced agentic capabilities for development
Reality: Speed issues make it impractical for real-time coding assistance

(OpenAI Official Blog, Cursor Community Forum)

My personal experience with GPT-5 started strong. Right at launch things felt like they were working really well, but then performance degraded over the week. I tried it both inside Cursor and via the Cursor CLI and had a very mixed experience. Sometimes it was great; other times it failed in ways that made it hard to build trust.

That said, with GPT-5 Pro through ChatGPT on the $200 per month plan, I have had incredible experiences solving difficult problems in video recording workflows, refactoring asynchronous code, and other complex tasks that require deep reasoning and a strong understanding of the code. I have seen it optimize code, add edge case handling, cover a wide range of tasks, and get it right the first time. It also creates clear plans that I can hand to other tools. I am using GPT-5 Pro constantly, and right now it is my go-to for planning.

Claude Code is still king for me when it comes to agent work, mostly because it's been the most reliable and because of its pacing and feedback. A lot of my GPT-5 sessions show twenty or more tool calls with no explanation, which doesn't provide much insight into what's happening. I hope they take notes from Claude Code on surfacing incremental updates and briefly explaining what's going on.

I get worried when GPT-5 seems like it's doing a lot of work without checking in. Until the visibility and reliability improve, I'll keep leaning on Claude Code for serious agent tasks. We'll see how both the Cursor CLI and Codex improve over time. It would take a massive turnaround for them to truly compete with the experience of Claude Code, especially when it comes to building true AI pipelines.

Claude Expands Context Window to 1 Million Tokens

Anthropic announced on August 12, 2025, that Claude Sonnet 4 now supports 1 million tokens of context on the Anthropic API, representing a 5x increase from the previous 200,000 token limit.

Technical details:

Context window: 1 million tokens (approximately 750,000 words)
Availability: API access for Sonnet 4
Use cases: Large codebases, extensive documentation, multi-file analysis
Performance: Maintained speed despite increased context

(Anthropic Official Announcement, TechCrunch Coverage)
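To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It uses only the ratio quoted above (1 million tokens is approximately 750,000 words). The helper names are hypothetical, and real tokenizers count code very differently from prose, so treat the result as an estimate rather than a measurement.

```typescript
// Figures taken from the announcement details above.
const NEW_CONTEXT_TOKENS = 1_000_000;
const OLD_CONTEXT_TOKENS = 200_000;

// Rough prose heuristic derived from "1M tokens ~ 750,000 words" (0.75).
const WORDS_PER_TOKEN = 750_000 / NEW_CONTEXT_TOKENS;

// Hypothetical helper: estimate a token count from a word count.
function estimateTokens(wordCount: number): number {
  return Math.ceil(wordCount / WORDS_PER_TOKEN);
}

// Hypothetical helper: would a document of this size fit in the window?
function fitsInContext(
  wordCount: number,
  limit: number = NEW_CONTEXT_TOKENS
): boolean {
  return estimateTokens(wordCount) <= limit;
}

console.log(NEW_CONTEXT_TOKENS / OLD_CONTEXT_TOKENS); // 5
console.log(estimateTokens(300_000)); // 400000
console.log(fitsInContext(300_000, OLD_CONTEXT_TOKENS)); // false
console.log(fitsInContext(300_000)); // true
```

Under this heuristic, a 300,000-word body of documentation would have overflowed the old 200,000 token limit but sits comfortably inside the new one.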

It is exciting to see more providers catch up with Gemini and offer a one million token context window. Early research suggests Anthropic is doing very well at understanding the full window and keeping responses coherent as conversations get longer.

This just shipped and it's not available in Claude Code yet (even if you try to hack it in, lol).

I am curious whether Gemini will counter with a ten million token context window. There are rumors they have it working internally, but it's not cost efficient to release yet. Given their recent moves, it feels like Gemini is sitting on something significant. I hope they release it to the public soon.

GitHub CEO Thomas Dohmke Departs to Found New Startup

After nearly four years as CEO, Thomas Dohmke announced on August 11, 2025, that he is leaving GitHub to become a startup founder again.

Impact and transition:

Legacy: Led GitHub through significant growth and AI integration era
GitHub Copilot: Oversaw development and expansion of AI coding assistant
New venture: Details of startup not yet announced
Transition: Will remain through end of 2025

(GitHub Official Blog, Tech Startups Coverage)

It is curious to see the GitHub CEO leave at this specific moment when many AI-enabled tooling companies are "accelerating" (sorry for the buzzword). It makes it smell like GitHub missed the AI boat and is trying to course correct, but I'm making massive assumptions here. I'm just surprised and curious if there's more to the story.

🛠️ Developer Tooling Updates

Gemini CLI Updates and VS Code Integration

Google released significant updates to Gemini CLI during the week of August 4, 2025, including deep VS Code integration and performance enhancements.

Updates include:

VS Code integration: Native integration with context-aware workflows (v0.1.20)
Performance: Faster response times and improved reliability
Guided Learning: New AI tutor feature using LearnLM and Gemini 2.5 Pro
Fine-tuning: Supervised fine-tuning available for Gemini 2.5 Flash

(Google Developers Blog, Google Guided Learning)

I love seeing the updates to the Gemini CLI. They might be moving a little slower than some others, but they are doing all the right things.

The biggest unlock everyone is waiting for is Gemini becoming a stronger agentic model that can keep up with the models from Anthropic. If Gemini releases a 2.6 or a 3.0 that lets Pro create plans and lets Flash execute those plans, and bundles that with a one million plus context window, we will have something special on our hands.

Chrome's Built-in AI: Prompt API Now Available

Google's Prompt API for Chrome enables free, local AI capabilities directly in the browser powered by Gemini Nano, available from Chrome 138 stable.

Technical specifications:

Local execution: All processing done locally with no data sent to external servers
Privacy-first: Supports offline usage with complete data privacy
Chrome Extensions: Available for extension developers from Chrome 138
Model: Powered by Gemini Nano
Use cases: Text generation, summarization, translation

(Chrome for Developers - Prompt API, Chrome for Developers - Built-in AI)

Many people would be surprised by how many AI tools are now built into Chrome. Both the developer tooling and the browser internals are moving toward running models directly in the browser. It feels like Chrome is doing important work that will shape what sites can do, where a site does not need to rely on external AI services for basic features.

On my Script Kit site I have a scenario where a user selection should present autocomplete-like options. That is one of the first things I am going to try with the Prompt API: let someone click a button and get a few suggested options. It does not need much intelligence. It just needs something small and fast. If I can offload that work to the browser, that is a huge win for owning the entire workflow rather than relying on providers to be up.
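Here is a minimal sketch of what that "click a button, get a few options" flow might look like. It is not code from the Script Kit site. The two interfaces are hand-written assumptions covering only the calls used here (`availability()`, `create()`, `prompt()`, and `destroy()`, as I understand the Chrome 138 Prompt API docs), and `suggestOptions` is a hypothetical helper name. The model object is passed in as a parameter so the function can be exercised outside the browser with a stand-in.

```typescript
// Minimal structural types for the parts of the Prompt API used below.
// These are assumptions written for this sketch, not official typings.
interface PromptSession {
  prompt(input: string): Promise<string>;
  destroy(): void;
}

interface LanguageModelLike {
  availability(): Promise<string>;
  create(): Promise<PromptSession>;
}

// Hypothetical helper: ask the on-device model for a few short options.
async function suggestOptions(
  model: LanguageModelLike,
  selection: string,
  count: number = 3
): Promise<string[]> {
  // Only proceed when the model is already downloaded and ready.
  if ((await model.availability()) !== "available") {
    return [];
  }

  const session = await model.create();
  try {
    const reply = await session.prompt(
      `Suggest ${count} short autocomplete options for: "${selection}". ` +
        "Reply with one option per line and nothing else."
    );
    // Keep non-empty lines only, capped at the requested count.
    return reply
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .slice(0, count);
  } finally {
    // Release the session's resources whether or not the prompt succeeded.
    session.destroy();
  }
}
```

In a browser that exposes the API, the idea would be to pass the global `LanguageModel` object as the first argument from a button's click handler. Returning an empty list when the model is not ready lets the page fall back to a static list or a hosted provider instead of failing.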

API providers do go down, have issues, and sometimes create poor user experiences. The more control we have over the AI when building our own tools, the better.

I also need to make lessons and tutorials on the AI developer tools inside Chrome DevTools. There is a lot to cover and they are doing incredible work.

💼 AI Ecosystem & Business Updates

MCP-UI Protocol Gains Momentum

The Model Context Protocol UI extension is gaining traction as an open RFC being considered for adoption into the MCP specification.

Development updates:

RFC stage: Open RFC under consideration, may change as proposal advances
Component layer: Enables AI agents to return fully interactive UI components
Community adoption: Growing ecosystem of MCP-UI compatible tools
Shopify development: Shopify actively developing and promoting the protocol

(Shopify Engineering, Block/Goose Blog)

This specification feels important for the future of how we interact with AI tools and AI conversations. The idea of just-in-time user interfaces that are rendered by MCPs could change everything. At the high end it might replace websites. A more moderate view is that it will give you rich user interfaces on mobile when you are chatting with GPT and Claude and the tools you use every day. Instead of only text, you get buttons and controls.

One of my favorite things to do with Claude on mobile, which can render a canvas, is to ask it to create interactive charts and interactive timelines. Instead of a static diagram you can get an animated diagram that you can drag and change. That experience feels like the future.

Bringing those kinds of experiences into the general flow of any tool I use feels like an ideal direction for where MCPs could go. I am cheering this project on.

Jules Agent Enhances Code Review Capabilities

Google introduced the Jules Critic feature on August 8, 2025. It acts as an internal peer reviewer integrated directly into the code generation process.

Key features:

Automated review: Flags potential bugs and inefficient code patterns
Adversarial review: Challenges proposed changes before developers see output
Integration: Works seamlessly with existing Jules workflow
Reliability: Significantly improves code robustness

(Google Developers Blog, Jules Changelog)

I have seen so many code review tools at this point that I am not surprised when another one appears. I built my own code review tools last year, so I know this is something I could build myself.

However talented I may or may not be, it is no longer an impressive achievement when teams at companies like Google build their own code review tools. It feels like a must-have feature now. It also feels odd that GitHub has code review, and then Cursor, CodeRabbit, and others are adding their own code review layers on top.

I am curious if anyone has stacked every code review tool on a single pull request. I would love to see a battle royale of code review tools reviewing nonsense code just for fun.

💻 Cursor Corner

GPT-5 Integration Tips and Best Practices

The Cursor community has been sharing optimal configurations for GPT-5 integration following its release.

Configuration recommendations:

Model selection: Use gpt-5 (full reasoning model) with Agent mode and MAX mode enabled
Performance notes: Users continue reporting significant slowness issues with GPT-5 in Cursor
Prompting differences: GPT-5 requires different strategies, focusing on clarity and autonomous behavior rather than explicit instruction
Cost considerations: Higher reasoning modes significantly increase costs through additional reasoning tokens

(OpenAI GPT-5 Documentation, GPT-5 Prompting Guide, Cursor GPT-5 Blog)
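The cost point is easy to underestimate, so here is a tiny sketch of the arithmetic. It assumes reasoning tokens are billed at the same rate as visible output tokens (check your provider's pricing page before relying on that), and the token counts are made-up numbers for illustration, not measurements from GPT-5. `outputCostMultiplier` is a hypothetical helper name.

```typescript
// Hypothetical helper: how much larger the billed output becomes once
// hidden reasoning tokens are counted alongside the visible answer.
// Assumption: reasoning tokens are billed at the output-token rate.
function outputCostMultiplier(
  visibleTokens: number,
  reasoningTokens: number
): number {
  if (visibleTokens <= 0) {
    throw new Error("visibleTokens must be positive");
  }
  return (visibleTokens + reasoningTokens) / visibleTokens;
}

// Made-up example: a 500-token answer preceded by 4,000 reasoning tokens.
console.log(outputCostMultiplier(500, 4_000)); // 9
// With no reasoning tokens, the bill is just the visible answer.
console.log(outputCostMultiplier(500, 0)); // 1
```

Because the result is a ratio, it holds whatever the per-token price is: in the made-up example, the same visible answer costs nine times as much on the output side.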

Community feedback:

Mixed initial reactions with improvement after configuration adjustments
Speed improvements noted after backend optimizations
Best results with high reasoning effort for complex tasks

As with any new model there are growing pains in the integration between the model and the IDEs. I do not usually expect this many growing pains on the model side, but this round feels different.

It will take some time before both the model and the IDE settle into an optimized workflow that most people are happy with. That likely means we need some patience for the tools to shine and for the model to fully represent what it is capable of.

If you are fumbling with Cursor right now trying to get GPT-5 to meet your expectations, you are not alone.

⚡ Quick Updates

Mistral Small 3.1 Release

New small-tier model with improved reasoning and code generation
Enhanced performance while maintaining efficiency
Available through major API providers

(Mistral AI Models)

Google AI Studio Redesign

New landing page and improved user interface
Better model selection and comparison tools
Enhanced documentation and examples

Fine-tuning OpenAI's GPT-OSS Locally

GPT-OSS models (120B and 20B) available under Apache 2.0 license
Community tutorials through Unsloth and Hugging Face
100% local execution maintaining privacy

(OpenAI GPT-OSS Announcement, Unsloth Documentation)

Higgsfield Platform Updates

Enhanced video generation with Seedance Pro unlimited access
Supports MiniMax Hailuo-02, Veo 3, and Seedance Pro models
Image-to-video conversion for creative workflows

(Higgsfield Official Site, Seedance Pro Launch)

Claude Chat History References

New on-demand recall of past conversations for Max, Team, and Enterprise subscribers
Privacy-focused approach requiring explicit user prompts
Improved context preservation across sessions

(Anthropic Announcement, Business Today)

✨ Workshop Spotlight

Claude Code Power User Workshop - Next Friday, August 22nd

Following our two sold-out workshops, the next Claude Code Power User workshop is scheduled for Friday, August 22nd at 9am PDT.

What we'll cover:

Advanced subagent orchestration patterns
Building AI pipelines with multiple Claude instances
Context engineering for complex projects
Real-world automation workflows
Live Q&A and troubleshooting

Register here: https://egghead.io/workshop/claude-code

Read this far? Share "AI Dev Essentials" with a friend! - https://egghead.io/newsletters/ai-dev-essentials

John Lindquist

https://egghead.io