AI Dev Essentials #30: Cursor 2.0, GitHub Agent HQ, and Cognition SWE-1.5

AI dev news: GitHub Agent HQ, Cursor 2.0, Cognition SWE-1.5, and OpenAI's $1.4T plan. Plus, get a new 6-lesson course on building Claude Code skills.

Hey Everyone 👋,

John Lindquist here with the 30th issue of AI Dev Essentials!

I was invited to attend GitHub Universe this week. They covered my conference and travel costs (thanks Ashley and Christina!). It was packed with AI/Agent announcements. The Agent HQ announcement positions GitHub as the central hub for any AI agent (Claude, Codex, Jules, Grok, etc.), and they're all available through a single subscription with Copilot Pro+. Cursor launched version 2.0 with their first proprietary Composer model and parallel execution. Cognition shipped SWE-1.5 at 950 tokens per second thanks to their partnership with Cerebras.

I've been focusing on building out Claude Code skills and plugins. In fact, I published a plugin for the programmable badge they handed out at GitHub Universe (https://github.com/johnlindquist/badger-2350-plugin). It was a good exercise in preparing a plugin w/ skills for working in a completely new environment to enable anyone to build apps without prior experience.

I'll be covering more about skills and plugins in my upcoming workshop. Sign up here: https://egghead.io/workshop/claude-code

🚨 Early Bird Pricing for the Workshop Ends Tomorrow! 🚨

🎓 New egghead.io Lessons This Week

I've been releasing a steady stream of new lessons covering all the aspects of Claude Code skills:

The Essential Guide to Claude Code Skills - 6 Lessons Published

I've been focused on launching a comprehensive new course teaching developers how to build custom skills for Claude Code. All six lessons are now live and available free as a community resource. Here's what we cover:

Lesson 1: Create Your First Claude Code Skill
Learn the core concepts of skill development by building a timestamp skill from scratch. Understand the required file structure, YAML frontmatter, and how to safely grant Claude access to specific shell commands.

Lesson 2: Control Claude Skills Output with References and Examples
Master precision in AI agent behavior by using detailed example files. Learn how to enforce specific file naming conventions and folder structures while keeping context usage efficient through lazy-loaded examples.

Lesson 3: Stacking Claude Skills to Create Complex Workflows
Discover how to build modular, composable skills that work together. Learn to define skill dependencies so Claude can autonomously orchestrate multi-step workflows by chaining multiple skills.

Lesson 4: Build Better Tools in Claude Skills with Scripts
Escape skill limitations by abstracting complex logic into TypeScript/Bun scripts. Learn how to bypass parser issues with shell redirects and achieve cross-platform compatibility through programmatic control.

Lesson 5: Secure Your Claude Skills with Custom PreToolUse Hooks
Implement fine-grained security controls using Claude Code hooks. Learn to programmatically validate commands and enforce strict security policies before tool execution.

Lesson 6: Claude Skills Compared to Slash Commands
Understand the key differences between Slash Commands (user-invoked) and Skills (agent-driven). Learn when to use each pattern to create more efficient and powerful AI-assisted workflows.
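
To give a feel for what Lesson 5 means by validating commands before tool execution, here is a minimal sketch of the decision logic a PreToolUse hook script might contain. It is not the code from the lesson. The payload fields (tool_name, tool_input.command), the "exit code 2 blocks the call" convention, and the deny-list patterns are all assumptions for illustration, so check them against the current Claude Code hooks documentation before relying on them.

```typescript
// Hypothetical PreToolUse validator. The payload fields (tool_name,
// tool_input.command) and the "exit code 2 blocks the call" convention are
// assumptions; check them against the current Claude Code hooks docs.

interface HookInput {
  tool_name: string;
  tool_input: { command?: string };
}

interface Decision {
  allow: boolean;
  reason: string;
}

// Illustrative deny-list only; a real policy would be project-specific.
const BLOCKED_PATTERNS: { pattern: RegExp; reason: string }[] = [
  // rm with both -r and -f in one flag group, in either order (e.g. -rf, -fr)
  { pattern: /\brm\s+-(?=\w*r)(?=\w*f)\w+/i, reason: "recursive force delete" },
  // any git push carrying a --force style flag (also catches --force-with-lease)
  { pattern: /\bgit\s+push\b.*--force/, reason: "force push" },
  // downloading a script and piping it straight into sh or bash
  { pattern: /\bcurl\b.*\|\s*(ba)?sh\b/, reason: "piping a download into a shell" },
];

// Pure decision function: easy to unit test without touching stdin.
function validate(input: HookInput): Decision {
  if (input.tool_name !== "Bash") {
    return { allow: true, reason: "not a shell command" };
  }
  const command = input.tool_input.command ?? "";
  for (const { pattern, reason } of BLOCKED_PATTERNS) {
    if (pattern.test(command)) {
      return { allow: false, reason: `blocked: ${reason}` };
    }
  }
  return { allow: true, reason: "no blocked pattern matched" };
}

// Maps a raw JSON payload to an exit code: 0 lets the tool run, 2 blocks it.
function runHook(rawJson: string): number {
  const decision = validate(JSON.parse(rawJson) as HookInput);
  if (!decision.allow) console.error(decision.reason);
  return decision.allow ? 0 : 2;
}

// Demo with in-memory payloads instead of stdin.
console.log(validate({ tool_name: "Bash", tool_input: { command: "ls -la" } }));
console.log(validate({ tool_name: "Bash", tool_input: { command: "rm -rf ./build" } }));
```

In a real hook you would read the payload from stdin and exit with the value runHook returns. Keeping the policy in a pure function means the rules can be tested without running Claude at all.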

Start the course →

🚀 Major Announcements

GitHub Universe 2025: Agent HQ Unifies AI Coding Agents

GitHub announced Agent HQ on October 28, 2025, at GitHub Universe 2025, creating a unified platform to orchestrate any AI agent from any provider through GitHub Copilot subscriptions.

Platform capabilities:

Multi-Provider Support: Integrate agents from Anthropic (Claude), OpenAI (Codex), Google, Cognition, and xAI through a single Copilot Pro+ subscription ($39/month)
Mission Control Interface: Centralized dashboard to assign, steer, and track multiple agent tasks simultaneously
Custom Agents: File-based configuration system (.github/agents/) for creating specialized agents with custom prompts, tools, and MCP servers
Immediate Availability: OpenAI Codex accessible in VS Code Insiders; other partner agents rolling out over coming months
Agent Handoffs: Copilot automatically routes tasks to appropriate custom agents based on context

This represents GitHub's strategic move to become the central hub for AI coding, regardless of which underlying agent provider developers prefer.

(GitHub Blog: Agent HQ, GitHub Changelog: Custom Agents, VentureBeat, CNBC)

I've had a lot of conversations around background agents vs. focused work recently. Everyone is trying to find that line of "when can I trust an Agent to complete a task without help?". The answer is different for every task, every model, and every user. So it's extremely difficult to generalize the marketing messaging for tools like Agent HQ. Often you spend so much time configuring the Agents that you could have just written the code yourself. Other times you're configuring an Agent that you'll never use again. It's becoming a delicate balancing act. There's also a strong argument that you need to set up your Agents now so that they're ready once the models improve enough to trust them. Regardless, these are important skills to have in your toolbelt and I strongly recommend putting in the work to get comfortable with them.

Cursor 2.0 Launches with Composer Model and Multi-Agent Orchestration

Cursor released version 2.0 on October 29, 2025, introducing Composer, their first proprietary coding model, along with infrastructure for running up to 8 agents in parallel on a single task.

Technical breakthroughs:

Composer Model: Cursor's first in-house coding model, achieving 4x speed improvement over similarly intelligent models
Sub-30 Second Turns: Most agent interactions complete in under 30 seconds
Parallel Agent Execution: Run up to 8 agents simultaneously using git worktrees to prevent file conflicts
Anyrun Orchestrator: Rust-based orchestration service managing cloud execution via AWS EC2/Firecracker
Browser for Agent: Generally available with DOM selection tools for visual debugging
Sandboxed Terminals: macOS sandboxing now generally available for safer code execution
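
To make the worktree idea concrete, here's a small sketch that plans one git worktree and branch per agent, so parallel agents never edit the same checkout. This is not Cursor's implementation. The branch names, the ../worktrees/ path layout, and the planWorktrees helper are hypothetical, and the only real command assumed is the standard `git worktree add -b <branch> <path> <commit-ish>`.

```typescript
// Sketch only: builds the git commands that would give each of N agents its
// own worktree and branch. Paths and branch names are hypothetical, not Cursor's.

interface WorktreePlan {
  agent: number;
  branch: string;
  path: string;
  command: string;
}

const MAX_AGENTS = 8; // the parallel limit quoted for Cursor 2.0

function planWorktrees(task: string, agents: number, baseRef: string = "HEAD"): WorktreePlan[] {
  if (!Number.isInteger(agents) || agents < 1 || agents > MAX_AGENTS) {
    throw new RangeError(`agents must be an integer between 1 and ${MAX_AGENTS}`);
  }
  // Turn the task title into a branch-safe slug, e.g. "Fix login bug" -> "fix-login-bug".
  const slug = task
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");

  return Array.from({ length: agents }, (_, i) => {
    const agent = i + 1;
    const branch = `agent/${slug}-${agent}`;
    const path = `../worktrees/${slug}-${agent}`;
    // Each agent gets a separate directory and branch, so file edits cannot collide.
    return { agent, branch, path, command: `git worktree add -b ${branch} ${path} ${baseRef}` };
  });
}

for (const plan of planWorktrees("Fix login bug", 3)) {
  console.log(plan.command);
}
```

Each agent's result ends up on its own branch, which is what lets you compare the attempts and merge only the one you like.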

(Cursor Blog: 2.0 Release, InfoWorld, VentureBeat, Bloomberg)

If you'd been using the "Cheetah" model the past month, that was an early preview of Cursor's latest Composer model. It's wonderfully fast and smart enough for most grunt work. I don't quite trust it nearly as much as I trust Sonnet 4.5 to fully work through and complete a task, but the speed offers up a tight feedback loop that helps you stay focused on a single task. I'll need to spend way more time with it to get a better feel for when I want to use it. I definitely trust it for simple chores and it's amazing watching it knock out a task in seconds.

Cognition Releases SWE-1.5: Near-SOTA Performance at 950 Tokens/Second

Cognition announced SWE-1.5 on October 29, 2025, a frontier-size agent model achieving near state-of-the-art coding performance while delivering unprecedented speed through partnership with Cerebras.

Performance highlights:

Frontier-Scale Model: Hundreds of billions of parameters trained on state-of-the-art GB200 NVL72 chip cluster
950 Tokens/Second: Partnered with Cerebras for blazing-fast inference (6x faster than Haiku 4.5, 13x faster than Sonnet 4.5)
Near-SOTA Accuracy: Achieves performance close to best coding models while maintaining speed advantage
First GB200 Production Model: May be first publicly available model trained on NVIDIA's latest-generation hardware
Integrated Stack: Reimagines entire stack as unified system (model + inference + agent harness)
Available in Windsurf: Now deployed in Cognition's Windsurf IDE
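
For a rough sense of what those multipliers mean in practice, here's a back-of-the-envelope sketch. It assumes "6x faster" and "13x faster" are plain ratios of output tokens per second, and it ignores time-to-first-token, network latency, and tool calls. The 2,000-token response size is an arbitrary example.

```typescript
// Back-of-the-envelope only: treats the quoted multipliers as ratios of
// output tokens per second and ignores latency, prompt processing, and tools.

const SWE_15_TOKENS_PER_SECOND = 950;

// Speedups quoted in the announcement, relative to each comparison model.
const QUOTED_SPEEDUPS: [string, number][] = [
  ["Haiku 4.5", 6],
  ["Sonnet 4.5", 13],
];

// Throughput implied for the comparison model by a given speedup.
function impliedTokensPerSecond(speedup: number): number {
  return SWE_15_TOKENS_PER_SECOND / speedup;
}

// Time spent purely on generating `tokens` output tokens.
function secondsToGenerate(tokens: number, tokensPerSecond: number): number {
  return tokens / tokensPerSecond;
}

const RESPONSE_TOKENS = 2000; // arbitrary example size

console.log(`SWE-1.5: ${secondsToGenerate(RESPONSE_TOKENS, SWE_15_TOKENS_PER_SECOND).toFixed(1)} s`);
for (const [model, speedup] of QUOTED_SPEEDUPS) {
  const tps = impliedTokensPerSecond(speedup);
  console.log(`${model}: ~${tps.toFixed(0)} tok/s, ${secondsToGenerate(RESPONSE_TOKENS, tps).toFixed(1)} s`);
}
```

Under those assumptions a 2,000-token response takes about 2 seconds on SWE-1.5 versus roughly 13 and 27 seconds on the comparison models, which is the difference between staying in flow and switching tabs.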

The release emphasizes Cognition's thesis that the future requires vertical integration across the entire stack, not just model improvements.

(Cognition Blog: SWE-1.5, Testing Catalog)

Not to be outdone, Windsurf released their own model later the same day. The speed comes from a partnership with Cerebras and their custom chips. I'd be curious how fast the model would be on the same hardware as Cursor's, but at the end of the day it doesn't really matter to us end-users. As long as they can keep the prices down, I'm all for competition. Honestly, I need to re-install Windsurf and try it out again (even though I'm extremely satisfied with Cursor).

Sam Altman Outlines OpenAI's Trillion-Dollar Vision for Automated AI Researchers

OpenAI CEO Sam Altman announced on October 28, 2025, ambitious timelines for automated AI research capabilities, alongside massive infrastructure commitments totaling $1.4 trillion in financial obligations.

Strategic roadmap:

September 2026: Launch "Automated AI Research Intern" to meaningfully accelerate research on hundreds of thousands of GPUs
March 2028: Deploy full "Automated AI Research" system capable of autonomous research reports and scientific discoveries
30 Gigawatts Committed: $1.4 trillion total cost of ownership for compute infrastructure over coming years
1 GW/Week Goal: Vision to build "AI factory" adding 1 gigawatt capacity weekly at ~$20 billion per gigawatt
Corporate Restructuring: Completed transition from nonprofit to public benefit corporation structure
$25 Billion Nonprofit Commitment: OpenAI Foundation dedicating $25 billion to health/disease research and AI resilience initiatives
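
The numbers above are easier to reason about with the arithmetic written out. This sketch uses only the figures quoted in the list. Note the assumption baked in: the $1.4 trillion is described as total cost of ownership while the ~$20 billion per gigawatt is a build-cost target, so the two per-gigawatt figures measure different things and are not directly comparable.

```typescript
// Arithmetic on the quoted figures only. All dollar values are in billions.
// Caveat: the $1.4T is total cost of ownership, while $20B/GW is a build-cost
// target, so the two per-gigawatt numbers are not like-for-like.

const COMMITTED_GW = 30;
const COMMITTED_TCO_USD_B = 1400;
const TARGET_GW_PER_WEEK = 1;
const TARGET_COST_PER_GW_USD_B = 20;
const WEEKS_PER_YEAR = 52;

// Total cost of ownership per committed gigawatt.
function tcoPerGigawatt(): number {
  return COMMITTED_TCO_USD_B / COMMITTED_GW;
}

// Yearly build spend if the 1 GW/week "AI factory" target were reached.
function annualBuildCost(): number {
  return TARGET_GW_PER_WEEK * WEEKS_PER_YEAR * TARGET_COST_PER_GW_USD_B;
}

// Weeks needed to add the committed capacity at the target build rate.
function weeksToBuildCommitted(): number {
  return COMMITTED_GW / TARGET_GW_PER_WEEK;
}

console.log(`TCO per committed GW: ~$${tcoPerGigawatt().toFixed(1)}B`);
console.log(`Annual build cost at target rate: $${annualBuildCost()}B`);
console.log(`Weeks to add ${COMMITTED_GW} GW at target rate: ${weeksToBuildCommitted()}`);
```

In other words, the current commitment works out to about $46.7 billion per gigawatt, and hitting the 1 GW/week goal would mean spending over $1 trillion a year on construction alone.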

Altman emphasized this represents a calculated bet on future model capabilities and revenue growth justifying the unprecedented infrastructure investment.

(TechCrunch, TechRadar, Technology.org, Business Standard)

$1.4 trillion is an absurd amount of money to bet on a specific vision of how AI scales. If I were to spend that kind of money on something, I'd certainly hope it was curing cancer and solving climate change. The Automated AI Research Intern sounds like an awesome project and I'm really cheering for them to succeed. That's not because I particularly care about OpenAI; I just want the world to be a better place.

OpenAI Releases gpt-oss-safeguard for Customizable AI Safety

OpenAI launched gpt-oss-safeguard on October 29, 2025, an open-weight safety toolkit enabling developers to build customizable content moderation systems using reasoning-based policy interpretation.

Safety innovation:

Open-Weight Models: Apache 2.0 licensed models (120B and 20B parameter versions) freely usable and modifiable
Reasoning-Based Approach: Interprets developer-provided policies at inference time rather than requiring retraining
Dynamic Policy Updates: Change safety policies on-the-fly without model retraining
Foundation: Based on OpenAI's internal "Safety Reasoner" protecting GPT-5 and Sora 2
Strong Performance: 120B version achieves 46.3% accuracy vs GPT-4o's 43.2% on multi-policy benchmarks
Launch Partners: Discord, SafetyKit, and ROOST (Robust Open Online Safety Tools)
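
Here's a sketch of what "policy at inference time" could look like from the caller's side: the policy is just text sent along with each request, so swapping it changes the behavior without retraining anything. The system/user message layout, the model id string, and both example policies are assumptions for illustration, and no network call is made. Check the model card for the actual prompt format.

```typescript
// Sketch: the policy travels with each request, so changing it needs no retraining.
// The system/user message layout and the model id are assumptions, not confirmed
// API details. Both policies below are made up for illustration.

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ModerationRequest {
  model: string;
  messages: ChatMessage[];
}

function buildModerationRequest(
  policy: string,
  content: string,
  model: string = "gpt-oss-safeguard-20b" // assumed identifier
): ModerationRequest {
  return {
    model,
    messages: [
      { role: "system", content: policy.trim() }, // developer-written policy
      { role: "user", content },                  // content to classify
    ],
  };
}

const forumPolicy = `
Label the content VIOLATES or SAFE.
Offering to sell game accounts VIOLATES. Discussing account security is SAFE.
`;

const marketplacePolicy = `
Label the content VIOLATES or SAFE.
Offering to sell game accounts is SAFE. Asking for a buyer's password VIOLATES.
`;

const post = "Selling my level 80 account, DM me.";

// Same content and same model; only the policy text differs between requests.
console.log(JSON.stringify(buildModerationRequest(forumPolicy, post), null, 2));
console.log(JSON.stringify(buildModerationRequest(marketplacePolicy, post), null, 2));
```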

The release represents OpenAI's strategy to open-source safety infrastructure while keeping frontier models proprietary.

(OpenAI Blog: gpt-oss-safeguard, OpenAI Technical Report, WinBuzzer, CNBC)

🛠️ Developer Tooling Updates

VS Code Integrates OpenAI Codex Through Agent Sessions

Visual Studio Code announced OpenAI Codex integration on October 28, 2025, bringing cloud-based AI coding agents directly into the editor through the new Agent Sessions view.

Integration features:

Agent Sessions View: Dedicated interface for managing local and cloud agent tasks within VS Code
Copilot Pro+ Powered: Requires GitHub Copilot Pro+ subscription ($39/month) for Codex access
VS Code Insiders: Currently available in preview build with general availability coming soon
Mission Control: Centralized dashboard for assigning and tracking multiple agent sessions
Seamless Handoff: Move between local editor work and cloud agent execution without context loss

The integration represents Microsoft's strategy to bring multi-agent orchestration directly into developers' primary workflow tools.

(GitHub Changelog: VS Code Upgrade, Visual Studio Code Release Notes, Visual Studio Magazine)

I honestly think this is Microsoft's game to lose. They have GitHub and VS Code. They have all the users and all the infrastructure. So it comes down to the UX and their partnerships with the model providers. I'm honestly a little surprised they're not announcing models of their own yet. From all of the various Microsoft + OpenAI partnerships, I suspect we'll see tighter integrations between their products moving forward, especially because OpenAI doesn't seem interested in forking VS Code like everyone else.

Stitch and Jules Integration Streamlines Design-to-Code Workflow

Google's Stitch team announced Jules integration in late October 2025, enabling seamless transitions from design mockups to working code in multiple frameworks.

Workflow capabilities:

Design Selection: Select screens in Stitch and click "Jules" button to initiate code generation
Repository Connection: Jules connects directly to GitHub repos for framework-specific code generation
Multi-Framework Support: Generate React, Swift, and other framework-specific implementations
Gemini 2.5 Pro Upgrade: Enhanced UI generation capabilities through latest Gemini model
Preview Access: Currently in preview with full tutorials coming soon

The integration represents Google's strategy to connect design tools with agentic code generation workflows.

(Google Developers Blog: Stitch Launch, Medium: Stitch and Gemini Integration)

Gemini is such an awesome model for design. And while I'd still consider these projects "Labs", they certainly have my attention and I can't wait to see what I can build with them.

Vercel Partners with Z.ai for Lowest-Cost GLM 4.6 Access

Vercel announced a partnership with Z.ai in late October 2025, offering GLM 4.6 through AI Gateway at highly competitive pricing.

Partnership details:

Aggressive Pricing: $0.45 per million input tokens, $1.80 per million output tokens
AI Gateway Integration: Available through Vercel's AI Gateway with model identifier zai/glm-4.6
Cache Support: $0.11 per million tokens for cache reads
Drop-In Replacement: Works as alternative to other coding models in existing Vercel applications
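
To see what those rates add up to, here's a small cost estimator using the three prices listed above. It assumes cached input tokens are billed at the cache-read rate instead of the normal input rate. The Usage shape and the example token counts are hypothetical.

```typescript
// Cost estimate from the quoted GLM 4.6 rates (USD per million tokens).
// Assumption: cached input tokens are billed at the cache-read rate
// instead of the regular input rate. Example usage numbers are made up.

const GLM_46_USD_PER_MILLION = {
  input: 0.45,
  output: 1.8,
  cacheRead: 0.11,
};

interface Usage {
  inputTokens: number;       // total input tokens, cached and uncached
  cachedInputTokens: number; // portion of inputTokens served from cache
  outputTokens: number;
}

function estimateCostUsd(usage: Usage): number {
  if (usage.cachedInputTokens > usage.inputTokens) {
    throw new RangeError("cachedInputTokens cannot exceed inputTokens");
  }
  const perToken = (usdPerMillion: number) => usdPerMillion / 1_000_000;
  const uncachedInput = usage.inputTokens - usage.cachedInputTokens;
  return (
    uncachedInput * perToken(GLM_46_USD_PER_MILLION.input) +
    usage.cachedInputTokens * perToken(GLM_46_USD_PER_MILLION.cacheRead) +
    usage.outputTokens * perToken(GLM_46_USD_PER_MILLION.output)
  );
}

// Example: a long agent session that re-reads the same context many times.
const session: Usage = {
  inputTokens: 2_000_000,
  cachedInputTokens: 1_500_000,
  outputTokens: 200_000,
};

console.log(`Estimated cost: $${estimateCostUsd(session).toFixed(2)}`);
```

For that example session the estimate is $0.75: $0.225 for uncached input, $0.165 for cache reads, and $0.36 for output.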

The partnership represents Vercel's strategy to offer diverse model options through their infrastructure layer.

(Vercel AI Gateway: GLM 4.6, Z.ai Blog: GLM-4.6)

Google Launches Pomelli Marketing Agent on Google Labs

Google released Pomelli on October 28, 2025, a specialized AI agent for marketing campaign generation, available through Google Labs in select regions.

Agent capabilities:

Business DNA Analysis: Analyzes brand identity, voice, and positioning from website and provided materials
Campaign Ideation: Generates marketing campaign concepts aligned with brand strategy
Asset Generation: Creates on-brand marketing materials including copy, visuals, and social content
Multi-Format Support: Produces assets for various channels (social media, email, ads, landing pages)
Regional Availability: US, Canada, Australia, New Zealand (English only initially)

The launch represents Google's strategy to create specialized agents for specific business functions beyond general-purpose AI assistants.

(Google Blog: Pomelli, Search Engine Journal)

This is a neat proof of concept. My first impressions were "fine": it's not something I would use today, but it has a ton of potential for the future.

🤖 AI Ecosystem Updates

Gemini CLI Adds Interactive Shell and Tool Calling

Google released Gemini CLI versions 0.9.0 and 0.10.0 in early October 2025, introducing interactive shell capabilities and intelligent tool usage without explicit commands.

CLI enhancements:

Interactive Shell (v0.9.0, Oct 6): Run vim, git rebase -i, and other interactive commands directly within CLI context
Intelligent Tool Calling (v0.10.0, Oct 13): Automatic shell tool usage without requiring "!" prefix
Alt Key Support: Keyboard shortcuts for improved terminal navigation
Telemetry Diff Stats: Track line changes and modifications across sessions
Pre-release Extensions: Install and test extension functionality before official releases

The updates represent Google's commitment to making Gemini CLI a full-featured development environment rather than just a chat interface.

(GitHub: Gemini CLI Releases, Gemini API Docs: Release Notes)

Addy Osmani Publishes Comprehensive Gemini CLI Tips Guide

Chrome engineering manager Addy Osmani published an extensive Gemini CLI tips and tricks guide on October 21, 2025, covering 30+ pro-level techniques for maximizing CLI productivity.

Guide coverage:

Checkpointing: Save and restore CLI session state
GEMINI.md Configuration: Project-specific agent instructions and conventions
MCP Server Integration: Connect Model Context Protocol servers for extended functionality
Extensions System: Install and manage Gemini CLI extensions
Advanced Workflows: Pattern libraries for common development tasks

The guide represents community-driven documentation helping developers level up their CLI usage.

(Addy Osmani Substack: Gemini CLI Tips, GitHub: Gemini CLI Tips Repository)

⚡ Quick Updates

Cursor 2.0 Ships Browser for Agent

Generally Available: DOM selection tools for visual debugging now in production
Visual Targeting: Click any UI element to have agent modify specific components
Screenshot Context: Agent can see and reference visual state of applications
Accessibility Integration: Leverages accessibility tree for element identification

(Cursor 2.0 Changelog)

OpenAI Sora Character Cameos Expand to Pets and Objects

October Updates: Expanded cameo system beyond human subjects
Pet Cameos: Drop pets into AI-generated scenes after one-time video/audio recording
Object Support: Include stuffed toys, personal items, and other non-human subjects
Trending UI: Real-time trending cameos across Sora community

(OpenAI: Sora 2, TechCrunch: Sora Update)

GitHub Universe 2025 Featured Developer Community

90s Toy Hacking: Keynote featured hacking beloved 90s toys with modern AI
Open Source Focus: Highlighted contributions from open source community to enterprise
Student Partnerships: Expanded partnerships with Hack Club and Major League Hacking
Developer-First: Emphasis on "the spark behind the magic - you, the developers"

(GitHub Universe Keynote Recap)

Gemini CLI Hits 1 Million+ Developers

Adoption Milestone: Over 1 million developers using Gemini CLI as of October 2025
Extensions Ecosystem: 22+ launch partner extensions including Atlassian, GitLab, Stripe
Open Standard: No Google approval required for publishing extensions
Public Repositories: Extensions hosted in public GitHub repositories

(Google Blog: Gemini CLI Extensions)

✨ Workshop Spotlight (🚨 Early Bird Pricing Ends Tomorrow! 🚨)

Claude Code Power User Workshop - November 7th

Date: November 7, 2025
Time: 9:00 AM - 2:00 PM (PDT)
Platform: Zoom

Pricing:

🔥 Early Bird: $300 (jumps to $375 tomorrow!)
egghead.io Pro Yearly Member: $225 - Become a Pro member →

What You'll Learn:

Master the essential skills to ship reliable AI-generated code with confidence. This hands-on workshop covers everything from foundational prompting to advanced automation using the Claude Code SDK and custom integrations.

Core Skills:

Context Engineering: Control what context Claude sees for reliable, consistent results
TypeScript SDK: Script Claude programmatically to build custom workflows
Custom Hooks: Automate repetitive tasks with Claude Code hooks
Model Context Protocol: Integrate APIs securely to extend Claude's capabilities
Claude Code Skills: Build custom skills for Claude Code
Live Q&A: Get your specific questions answered by John Lindquist

Register: https://egghead.io/workshop/claude-code

Read this far? Share "AI Dev Essentials" with a friend! - https://egghead.io/newsletters/ai-dev-essentials

John Lindquist

https://egghead.io