Gemini 3 Lands & The Model Wars Heat Up Cahn's AI Canvas #Edition 30 (Nov 14 - Nov 21)
This week, Google dropped Gemini 3, OpenAI shipped GPT-5.1, Grok evolved, and the model wars shifted from hype to live benchmarks.

The Big Conversation:
Gemini 3 Arrives—And It's Already Beating ChatGPT on Live Benchmarks (Nov 18)
Google announced Gemini 3 on November 18, shipping it immediately across Search, Gemini app, and developer tools—marking one of Google's most aggressive rollouts yet. Early benchmarks show Gemini 3 crushing GPT-4o on reasoning, multimodal understanding, and agentic/coding tasks. The catch: "Thinking" mode (advanced reasoning) is locked to Google AI Pro and Ultra subscribers in the U.S., live in Search and available via API.
Meanwhile: OpenAI countered with GPT-5.1 (Instant & Thinking) and announced an AI-powered jobs platform launching mid-2026. Grok 4 (from xAI) adds advanced reasoning, coding, and real-time web search via DeepSearch—all competing for enterprise and consumer mindshare.
Why it matters:
The model landscape shifted from "which is better?" to "which do you need for what task?" Enterprises now have Gemini 3, GPT-5.1, and Grok 4 as credible alternatives, each with different speed/reasoning tradeoffs.
The tension:
Speed vs. depth: Instant models (Gemini 3 Pro, GPT-5.1 Instant) are faster but less accurate than Thinking variants. Which wins depends on use case—not on hype.
What’s New this week
OpenAI ships GPT-5.1 (Instant & Thinking). Faster inference, stronger reasoning; bundled with new AI jobs platform.
Grok 4 adds advanced reasoning & DeepSearch. Real-time web access, PhD-level reasoning, coding; trained on Colossus supercomputer.
Jeff Bezos launches Project Prometheus with $6.2B. Physical AI and robotics focus; signals next frontier beyond software-only models.
Physical Intelligence raises $600M at $5.6B valuation. Co-founded by Sergey Levine; embodied AI momentum.
Luma AI raises $900M led by Saudi firm Humain. Building a supercluster in Saudi Arabia; video AI goes global.
Baidu ERNIE multimodal model outperforms GPT & Gemini. Three-billion-parameter inference, optimized for schematics/dashboards/video.
Microsoft launches MAI-Voice-1 and MAI-1-preview. In-house speech model and foundation model on 15,000 H100 GPUs.
Metropolis raises $500M for edge AI. Leading November's $3.5B+ startup funding wave.
Anthropic and Microsoft pour billions into AI infrastructure. Data centers in Texas, New York, and Fairwater; continent-scale AI engines.
The Money Reality
November 2025 AI funding: $3.5B+ in first two weeks — On pace to exceed Q3's record
AI now 52.5% of global VC spend YTD — $192.7B invested across 2025 (Bloomberg)
Funding surge by category:
| Category | Total Raised | Key Trends |
|---|---|---|
| AI Infrastructure | $1.035B | Hardware, security, edge computing |
| Healthcare & Biotech AI | $585M | Drug discovery, patient care automation |
| Enterprise AI Agents | $322M | Customer service, sales automation |
| Vertical SaaS & Workflow AI | $477M | Industry-specific applications |
| Generative AI Platforms | $251.7M | Content creation, code generation |
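To put the category figures in proportion, here is a minimal Python sketch that totals the five rows above and prints each category's share. The dollar amounts are copied from the table (converted to millions); the variable names are ours. The five rows add up to about $2.67B, so the table presumably does not cover every deal behind the $3.5B+ headline figure.

```python
# Category totals from the table above, in millions of USD.
funding_musd = {
    "AI Infrastructure": 1035.0,
    "Healthcare & Biotech AI": 585.0,
    "Enterprise AI Agents": 322.0,
    "Vertical SaaS & Workflow AI": 477.0,
    "Generative AI Platforms": 251.7,
}

# Sum of the listed categories only (not the full $3.5B+ headline number).
total_musd = sum(funding_musd.values())

# Each category's share of the listed total.
shares = {name: amount / total_musd for name, amount in funding_musd.items()}

print(f"Listed categories total: ${total_musd / 1000:.2f}B")
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1%}")
```

On these numbers, AI Infrastructure is the largest single category among those listed.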
Startup Radar — who to watch (and how to help)
| Startup | One-liner (what it builds) | Stage | Why promising | What they need |
|---|---|---|---|---|
| | Embodied AI for robotics | Series D | Clear founder pedigree, scaling fast | Enterprise robotics pilots |
| | Video generation + compute infrastructure | Growth | Saudi backing = geopolitical play | Creator adoption, API integrations |
| | Edge AI for physical spaces | Series B | Security + efficiency combo | Retail/logistics pilots |
| | Multimodal models for schematic-heavy industries | Released | Benchmark beats, open-licensed | GPU access, enterprise partnerships |
The model wars are now real—Gemini 3, GPT-5.1, and Grok 4 are all credible. Pick based on latency, reasoning depth, and real-world performance—not marketing.
Dev Corner: What to Build Next
Gemini 3 "Thinking" mode pilots: Test advanced reasoning on your complex workflows; compare latency vs. accuracy vs. GPT-5.1 Thinking
Multi-model routing: Build a router that selects Gemini 3 Pro for fast tasks, GPT-5.1 Thinking for deep reasoning, and Grok 4 for real-time web context
Jobs platform experiments: OpenAI's jobs platform (mid-2026) signals enterprise demand; prep your agent workflows for HR/hiring automation
Edge + voice workflows: Combine MAI-Voice-1 with Metropolis edge AI for real-time ops dashboards in retail/logistics
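The multi-model routing idea above can be sketched as a simple rule-based function. This is a minimal illustration, not a production router: the `Task` fields, the `route` function, and the model label strings are all hypothetical names of ours (the labels are plain strings, not real API model IDs), and the precedence of "fresh web context first, then deep reasoning, else fast" is an assumption you would tune for your own workload.

```python
from dataclasses import dataclass

# Placeholder labels only; these are NOT real API model identifiers.
FAST_MODEL = "gemini-3-fast-tier"
DEEP_MODEL = "gpt-5.1-thinking"
LIVE_MODEL = "grok-4"


@dataclass
class Task:
    prompt: str
    needs_fresh_web: bool = False       # depends on current events or live data
    needs_deep_reasoning: bool = False  # multi-step analysis, code review, etc.


def route(task: Task) -> str:
    """Return a model label for the task using fixed rule precedence."""
    # Assumption: freshness wins over depth, because a deep answer built on
    # stale facts is still wrong.
    if task.needs_fresh_web:
        return LIVE_MODEL
    if task.needs_deep_reasoning:
        return DEEP_MODEL
    # Default: routine tasks go to the fast tier.
    return FAST_MODEL


if __name__ == "__main__":
    examples = [
        Task("Summarize this support ticket"),
        Task("Review this pull request for race conditions", needs_deep_reasoning=True),
        Task("What changed in AI funding this week?", needs_fresh_web=True),
    ]
    for t in examples:
        print(f"{route(t):20s} <- {t.prompt}")
```

In practice the two boolean flags would come from a classifier or from per-endpoint configuration rather than being set by hand, and you would log latency and accuracy per route to check that the split is paying off.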
The Agentic Shift
Reasoning: Gemini 3 and GPT-5.1's "Thinking" modes are the new frontier—slow but accurate, perfect for code review, complex analysis, scientific work.
Real-time: Grok 4's DeepSearch integration and Google's live Search integration mean agents now have current web context built-in.
Robotics: Embodied AI is the new narrative—Physical Intelligence, Prometheus, and Boston Dynamics partnerships heating up.
Infrastructure: Anthropic and Microsoft's data center blitz signals a shift from "AI for apps" to "AI as infrastructure."
Voice as the New Interface
Microsoft's MAI-Voice-1 is low-latency and expressive—the new bar for conversational AI in phones, cars, and wearables. Grok 4's natural voice adds another vector.
Tools to Try
| Tool | What it does | Best For |
|---|---|---|
| | Generating stylized characters, mascots, avatars, and fantasy visuals | Creativity-focused tool |
| | For concept visuals, character design, thumbnails, and creative mockups | AI Stylist and Virtual TryOn |
| | AI-powered accessibility for people who are blind or have low vision | AI-powered virtual assistance |
| | Video generation + infrastructure partnership | Creative production at scale |
| | Edge AI for retail/warehouse/logistics ops | Real-time ops, physical spaces |
Want the full breakdown on model benchmarks, latency testing, and routing strategies? Email [email protected] for model comparison deep dives.
Breakthrough Tools
1. Google Gemini 3 Pro: State-of-the-art multimodal reasoning; "Thinking" mode unlocks PhD-level problem solving. Available live in Search, Gemini app, and Vertex AI. The most aggressive Google launch in the AI arms race.
2. OpenAI GPT-5.1 (Instant & Thinking): Dual variants let you trade speed for reasoning depth. Jobs platform (mid-2026) signals enterprise-first roadmap.
3. Grok 4: DeepSearch integration gives real-time web context; advanced reasoning trained on Colossus. The "current events" model in a market of "static knowledge" models.
4. Microsoft's In-House Stack (MAI-Voice-1 + MAI-1-preview): A bold "no dependency" move; 15,000 H100 GPUs training a bespoke foundation model signals Microsoft's long-term autonomy.
AI x Creativity Exclusive
Marble (Nov 12): Fei-Fei Li's bet that spatial intelligence is the next AI frontier. Unlike text-to-image or text-to-video, Marble generates entire navigable 3D worlds—editable, exportable, and ready for downstream use (VR, web, gaming engines). The hybrid editor lets you block out spaces like a designer while AI fills in details. This is the infrastructure layer for immersive AI.
Scribe v2 Realtime (Nov 12): ElevenLabs' answer to the speech-to-text race. Sub-150ms latency means real-time agents that feel human. Integrated directly into their Agents platform, so you can now wire voice → transcription → reasoning → synthesis in a single workflow.
Multi-Model Generative Fill (Adobe) (Nov 15): Adobe just unbundled creative tools. Instead of "Use Firefly," creators now route to Gemini for photorealism, FLUX.1 for art styles, or Firefly for branding consistency. This is the beginning of the end for single-model lock-in.
Two quick plays
Creators: Export Marble 3D worlds as 10-15s Reels with ElevenLabs voice narration—position as "AI co-creation" for 3x engagement on wellness/meditation content.
Founders: Build a vertical SaaS: voice → Marble 3D → Generative Fill → ElevenLabs tour = design concierge at scale (perfect for spa/wellness booking pilots).
Cahn’s AI Canvas
Not to Miss Events
The AI Summit New York (Dec 2025): Explore the power of AI as it reshapes industries, redefines possibilities, and propels you to the forefront of progress.
Microsoft Ignite: Deep dive on the MAI stack and H100 utilization
Google Gemini Developer Summit: Best practices for Gemini 3 and Thinking mode integration
AWS re:Invent 2025 (Dec 1): BCG, a strategic alliance partner of AWS, will share its latest thinking on the future of cloud-based tech and how to scale AI innovation.
Must-read: Nathan Lambert's essay on open models outpacing closed labs; Andrej Karpathy on AI and verifiable work
Fireside Chat
As creative work becomes faster and more automated with AI, what is the most irreplaceable element of human creativity that technology can never replicate, and how do we ensure we don’t lose it in the race for efficiency?
AI PUN

Toss between Apple and Google
Cahn’s POV
This week marked a fundamental shift: the model wars are no longer theoretical. Gemini 3, GPT-5.1, and Grok 4 are all live, all credible, and all competing on measurable benchmarks. Google's aggressive rollout to Search signals that they're fighting for relevance against ChatGPT's network effects. OpenAI's dual-variant strategy (Instant + Thinking) and jobs platform hint at long-term enterprise positioning. Elon's Grok 4 (with real-time web context) is the dark horse—it could disrupt search and news if adoption accelerates.
The real winner? Builders. You now have three distinct models to choose from, each with different strengths. The companies that wire multi-model routing into their workflows—fast for routine tasks, deep for complex reasoning, real-time for trending topics—will own the next cycle.
That's it for today!
Before you go, we’d love to know what you thought of today's newsletter so we can improve the Cahn’s AI Canvas experience for you.
⭐️⭐️⭐️⭐️⭐️ Nailed it
⭐️⭐️⭐️ Average
⭐️ Fail
See you Next Week,
Aditi and Swati — The humans behind Cahn’s AI Canvas.
📩This week felt like AI stopped asking for permission and started filing expense reports. Agents got gutsier, platforms picked sides, and chat stopped being “just chat.”
If you're a creator, dev, or artist trying to figure out where you fit—you're not alone. This isn't doom and gloom. It's about knowing what's real, what's hype, and where the actual opportunities are.
Want our setup checklist for local AI agents + our best prompt pack? Reply "Loop."
Stay Creative. Stay Updated. Get in Touch : [email protected]
Edition #30 covered Nov 15-Nov 21, 2025. All news verified from mainstream sources with direct article links provided.
Disclaimer: The information presented in this newsletter is curated from public sources on the internet. All content is for informational purposes only.