Gemini 3 Lands & The Model Wars Heat Up: Cahn's AI Canvas, Edition #30 (Nov 14 - Nov 21)

This week, Google dropped Gemini 3, OpenAI shipped GPT-5.1, Grok evolved, and the model wars shifted from hype to live benchmarks.

 The Big Conversation:
Gemini 3 Arrives—And It's Already Beating ChatGPT on Live Benchmarks (Nov 18)

Google announced Gemini 3 on November 18 and shipped it immediately across Search, the Gemini app, and developer tools, making this one of Google's most aggressive rollouts yet. Early benchmarks show Gemini 3 crushing GPT-4o on reasoning, multimodal understanding, and agentic/coding tasks. The catch: "Thinking" mode (advanced reasoning) is locked to Google AI Pro and Ultra subscribers in the U.S.; it is live in Search and available via API.

Meanwhile: OpenAI countered with GPT-5.1 (Instant & Thinking) and announced an AI-powered jobs platform launching mid-2026. Grok 4 (from xAI) adds advanced reasoning, coding, and real-time web search via DeepSearch—all competing for enterprise and consumer mindshare.

Why it matters:
The model landscape shifted from "which is better?" to "which do you need for what task?" Enterprises now have Gemini 3, GPT-5.1, and Grok 4 as credible alternatives, each with different speed/reasoning tradeoffs.

The tension:
Speed vs. depth: Instant models (Gemini 3 Pro, GPT-5.1 Instant) are faster but less accurate than Thinking variants. Which wins depends on use case—not on hype.

Read Here

What’s New this week

  1. OpenAI ships GPT-5.1 (Instant & Thinking). Faster inference, stronger reasoning; bundled with new AI jobs platform. Read More

  2. Grok 4 adds advanced reasoning & DeepSearch. Real-time web access, PhD-level reasoning, coding; trained on Colossus supercomputer. Read Here

  3. Jeff Bezos launches Project Prometheus with $6.2B. Physical AI and robotics focus; signals next frontier beyond software-only models. Read More

  4. Physical Intelligence raises $600M at $5.6B valuation. Co-founded by Sergey Levine; embodied AI momentum. Read Here

  5. Luma AI raises $900M led by Saudi firm Humain. Building a supercluster in Saudi Arabia; video AI goes global. Read Here

  6. Baidu ERNIE multimodal model outperforms GPT & Gemini. Three-billion-parameter inference, optimized for schematics/dashboards/video. Read Here

  7. Microsoft launches MAI-Voice-1 and MAI-1-preview. In-house speech model and foundation model on 15,000 H100 GPUs. Read Here

  8. Metropolis raises $500M for edge AI. Leading November's $3.5B+ startup funding wave. Read Here

  9. Anthropic and Microsoft pour billions into AI infrastructure. Data centers in Texas, New York, and Fairwater; continent-scale AI engines. Read Here

The Money Reality

  • November 2025 AI funding: $3.5B+ in first two weeks — On pace to exceed Q3's record

  • AI now 52.5% of global VC spend YTD — $192.7B invested across 2025 (Bloomberg)

Funding surge by category:

| Category | Total Raised | Key Trends |
| --- | --- | --- |
| AI Infrastructure | $1.035B | Hardware, security, edge computing |
| Healthcare & Biotech AI | $585M | Drug discovery, patient care automation |
| Enterprise AI Agents | $322M | Customer service, sales automation |
| Vertical SaaS & Workflow AI | $477M | Industry-specific applications |
| Generative AI Platforms | $251.7M | Content creation, code generation |

Startup Radar — who to watch (and how to help)

| Startup | One-liner (what it builds) | Stage | Why promising | What they need |
| --- | --- | --- | --- | --- |
| Physical Intelligence | Embodied AI for robotics | Series D | Clear founder pedigree, scaling fast | Enterprise robotics pilots |
| Luma AI | Video generation + compute infrastructure | Growth | Saudi backing = geopolitical play | Creator adoption, API integrations |
| Metropolis | Edge AI for physical spaces | Series B | Security + efficiency combo | Retail/logistics pilots |
| Baidu ERNIE | Multimodal models for schematic-heavy industries | Released | Benchmark beats, open-licensed | GPU access, enterprise partnerships |

The model wars are now real—Gemini 3, GPT-5.1, and Grok 4 are all credible. Pick based on latency, reasoning depth, and real-world performance—not marketing.

Cahn’s POV

Dev Corner: What to Build Next

  • Gemini 3 "Thinking" mode pilots: Test advanced reasoning on your complex workflows; compare latency vs. accuracy vs. GPT-5.1 Thinking

  • Multi-model routing: Build a router that selects Gemini 3 Instant for fast tasks, GPT-5.1 Thinking for deep reasoning, Grok 4 for real-time web context

  • Jobs platform experiments: OpenAI's jobs platform (mid-2026) signals enterprise demand; prep your agent workflows for HR/hiring automation

  • Edge + voice workflows: Combine MAI-Voice-1 with Metropolis edge AI for real-time ops dashboards in retail/logistics
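For the multi-model routing idea above, here is a minimal sketch of what the decision logic could look like. It is an illustration, not a production router: the model names are just the labels used in this edition (plain strings, not real API model identifiers), and the two task flags are hypothetical fields a caller would have to set.

```python
from dataclasses import dataclass

# Labels copied from the newsletter text. They are plain strings here,
# NOT real API model identifiers (assumption for illustration).
FAST_MODEL = "Gemini 3 Instant"
DEEP_MODEL = "GPT-5.1 Thinking"
REALTIME_MODEL = "Grok 4"


@dataclass
class Task:
    prompt: str
    needs_fresh_web_data: bool = False  # hypothetical flag set by the caller
    needs_deep_reasoning: bool = False  # hypothetical flag set by the caller


def route(task: Task) -> str:
    """Return the label of the model this task should be sent to."""
    # Check real-time context first: a well-reasoned answer built on
    # stale information is still wrong.
    if task.needs_fresh_web_data:
        return REALTIME_MODEL
    if task.needs_deep_reasoning:
        return DEEP_MODEL
    # Everything else goes to the fast, cheaper path.
    return FAST_MODEL


if __name__ == "__main__":
    tasks = [
        Task("Summarize this support ticket"),
        Task("Review this pull request for race conditions", needs_deep_reasoning=True),
        Task("What changed in the news today?", needs_fresh_web_data=True),
    ]
    for task in tasks:
        print(f"{route(task)} <- {task.prompt}")
```

In practice the flags would come from a classifier or from per-endpoint configuration, and the priority order is a judgment call you should revisit once you have your own latency and accuracy numbers.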

The Agentic Shift

Reasoning: Gemini 3 and GPT-5.1's "Thinking" modes are the new frontier—slow but accurate, perfect for code review, complex analysis, scientific work.
Real-time: Grok 4's DeepSearch integration and Google's live Search integration mean agents now have current web context built-in.
Robotics: Embodied AI is the new narrative—Physical Intelligence, Prometheus, and Boston Dynamics partnerships heating up.
Infrastructure: Anthropic and Microsoft's data center blitz signals a shift from "AI for apps" to "AI as infrastructure."

Voice as the New Interface
Microsoft's MAI-Voice-1 is low-latency and expressive—the new bar for conversational AI in phones, cars, and wearables. Grok 4's natural voice adds another vector.

Tools to Try

| Tool | What it does | Best For |
| --- | --- | --- |
| Soul Gen | Generating stylized characters, mascots, avatars, and fantasy visuals | Creativity-focused tool |
| Gensmo AI | Concept visuals, character design, thumbnails, and creative mockups | AI stylist and virtual try-on |
| BeMyEyes AI | AI-powered accessibility for people who are blind or have low vision | AI-powered virtual assistance |
| Luma Dream Machine | Video generation + infrastructure partnership | Creative production at scale |
| Metropolis | Edge AI for retail/warehouse/logistics ops | Real-time ops, physical spaces |

Want the full breakdown on model benchmarks, latency testing, and routing strategies? Email [email protected] for model comparison deep dives.

Breakthrough Tools


1. Google Gemini 3 Pro: State-of-the-art multimodal reasoning; "Thinking" mode unlocks PhD-level problem solving. Available live in Search, Gemini app, and Vertex AI. The most aggressive Google launch in the AI arms race.

2. OpenAI GPT-5.1 (Instant & Thinking): Dual variants let you trade speed for reasoning depth. Jobs platform (mid-2026) signals enterprise-first roadmap.

3. Grok 4: DeepSearch integration gives real-time web context; advanced reasoning trained on Colossus. The "current events" model in a market of "static knowledge" models.

4. Microsoft's In-House Stack (MAI-Voice-1 + MAI-1-preview): A bold "no dependency" move; 15,000 H100 GPUs training a bespoke foundation model signals Microsoft's long-term autonomy.

AI x Creativity Exclusive

  • Marble (Nov 12): Fei-Fei Li's bet that spatial intelligence is the next AI frontier. Unlike text-to-image or text-to-video, Marble generates entire navigable 3D worlds—editable, exportable, and ready for downstream use (VR, web, gaming engines). The hybrid editor lets you block out spaces like a designer while AI fills in details. This is the infrastructure layer for immersive AI.

  • Scribe v2 Realtime (Nov 12): ElevenLabs' answer to the speech-to-text race. Sub-150ms latency means real-time agents that feel human. Integrated directly into their Agents platform, so you can now wire voice → transcription → reasoning → synthesis in a single workflow.

  • Multi-Model Generative Fill (Adobe) (Nov 15): Adobe just unbundled creative tools. Instead of "Use Firefly," creators now route to Gemini for photorealism, FLUX.1 for art styles, or Firefly for branding consistency. This is the beginning of the end for single-model lock-in.
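The voice → transcription → reasoning → synthesis chain from the Scribe v2 item can be pictured as a list of stages, each feeding the next. The sketch below is only a shape for that workflow: every stage is a hypothetical stub that tags its input, strings stand in for audio, and nothing here calls the ElevenLabs API or any other real service.

```python
from typing import Callable, List

# A stage takes the previous stage's output and returns its own.
# Strings stand in for audio and text payloads (assumption for illustration).
Stage = Callable[[str], str]


def transcribe(audio: str) -> str:
    """Hypothetical stub for a speech-to-text call."""
    return f"transcript({audio})"


def reason(text: str) -> str:
    """Hypothetical stub for an LLM call that produces a reply."""
    return f"answer({text})"


def synthesize(text: str) -> str:
    """Hypothetical stub for a text-to-speech call."""
    return f"speech({text})"


def run_pipeline(payload: str, stages: List[Stage]) -> str:
    """Pass the payload through each stage in order."""
    for stage in stages:
        payload = stage(payload)
    return payload


if __name__ == "__main__":
    result = run_pipeline("hello.wav", [transcribe, reason, synthesize])
    print(result)  # speech(answer(transcript(hello.wav)))
```

Keeping each stage behind the same signature makes it easy to swap a vendor for one step, or to time each stage separately when you are chasing a latency budget.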

Two quick plays

Creators: Export Marble 3D worlds as 10-15s Reels with ElevenLabs voice narration—position as "AI co-creation" for 3x engagement on wellness/meditation content.

Founders: Build a vertical SaaS: voice → Marble 3D → Generative Fill → ElevenLabs tour = design concierge at scale (perfect for spa/wellness booking pilots).

Cahn’s AI Canvas

Not to Miss Events

  • The AI Summit New York (Dec 2025): Explore the power of AI as it reshapes industries, redefines possibilities, and propels you to the forefront of progress.

  • Microsoft Ignite: Deep dive on MAI stack and H100 utilization

  • Google Gemini Developer Summit: Best practices for Gemini 3 and Thinking mode integration

  • AWS re:Invent 2025 (Dec 1): BCG, a strategic alliance partner of AWS, is sending a delegation to share its latest thinking on the future of cloud-based tech and how to scale AI innovation.

  • Must-read: Nathan Lambert's essay on open models outpacing closed labs; Andrej Karpathy on AI and verifiable work

Fireside Chat

As creative work becomes faster and more automated with AI, what is the most irreplaceable element of human creativity that technology can never replicate, and how do we ensure we don’t lose it in the race for efficiency?

AI PUN

Toss between Apple and Google

Cahn’s POV

This week marked a fundamental shift: the model wars are no longer theoretical. Gemini 3, GPT-5.1, and Grok 4 are all live, all credible, and all competing on measurable benchmarks. Google's aggressive rollout to Search signals that they're fighting for relevance against ChatGPT's network effects. OpenAI's dual-variant strategy (Instant + Thinking) and jobs platform hint at long-term enterprise positioning. Elon's Grok 4 (with real-time web context) is the dark horse—it could disrupt search and news if adoption accelerates.

The real winner? Builders. You now have three distinct models to choose from, each with different strengths. The companies that wire multi-model routing into their workflows—fast for routine tasks, deep for complex reasoning, real-time for trending topics—will own the next cycle.

That's it for today!

See you Next Week,

Aditi and Swati — The humans behind Cahn’s AI Canvas.

📩 This week felt like AI stopped asking for permission and started filing expense reports. Agents got gutsier, platforms picked sides, and chat stopped being “just chat.”

If you're a creator, dev, or artist trying to figure out where you fit—you're not alone. This isn't doom and gloom. It's about knowing what's real, what's hype, and where the actual opportunities are.

Want our setup checklist for local AI agents + our best prompt pack? Reply "Loop."

Stay Creative. Stay Updated. Get in Touch : [email protected]

Edition #30 covered Nov 15-Nov 21, 2025. All news verified from mainstream sources with direct article links provided.

Disclaimer: The information presented in this newsletter is curated from public sources on the internet. All content is for informational purposes only.