The Big Conversation:
Gemini 3 Arrives—And It's Already Beating ChatGPT on Live Benchmarks (Nov 18)

Google announced Gemini 3 on November 18 and shipped it immediately across Search, the Gemini app, and developer tools, one of Google's most aggressive rollouts yet. Early benchmarks show Gemini 3 ahead of GPT-4o on reasoning, multimodal understanding, and agentic/coding tasks. The catch: "Thinking" mode (advanced reasoning) in Search is limited to Google AI Pro and Ultra subscribers in the U.S.; developers can reach the model via the API.

Meanwhile: OpenAI countered with GPT-5.1 (Instant & Thinking) and announced an AI-powered jobs platform launching mid-2026. Grok 4 (from xAI) adds advanced reasoning, coding, and real-time web search via DeepSearch—all competing for enterprise and consumer mindshare.

Why it matters:
The model landscape shifted from "which is better?" to "which do you need for what task?" Enterprises now have Gemini 3, GPT-5.1, and Grok 4 as credible alternatives, each with different speed/reasoning tradeoffs.

The tension:
Speed vs. depth: The faster default variants (Gemini 3 Pro, GPT-5.1 Instant) respond more quickly but tend to be less accurate on hard problems than the Thinking variants. Which wins depends on the use case, not on hype.


What’s New This Week

OpenAI ships GPT-5.1 (Instant & Thinking). Faster inference, stronger reasoning; announced alongside an AI jobs platform slated for mid-2026.
Grok 4 adds advanced reasoning & DeepSearch. Real-time web access, PhD-level reasoning, coding; trained on the Colossus supercomputer.
Jeff Bezos launches Project Prometheus with $6.2B. Physical AI and robotics focus; signals the next frontier beyond software-only models.
Physical Intelligence raises $600M at $5.6B valuation. Co-founded by Sergey Levine; embodied AI momentum.
Luma AI raises $900M led by Saudi firm Humain. Building a supercluster in Saudi Arabia; video AI goes global.
Baidu says its ERNIE multimodal model outperforms GPT & Gemini on selected benchmarks. Three billion active parameters at inference; optimized for schematics, dashboards, and video.
Microsoft launches MAI-Voice-1 and MAI-1-preview. In-house speech model, plus a foundation model trained on 15,000 H100 GPUs.
Metropolis raises $500M for edge AI. Leading November's $3.5B+ startup funding wave.
Anthropic and Microsoft pour billions into AI infrastructure. Anthropic plans data centers in Texas and New York; Microsoft builds out its Fairwater sites; continent-scale AI engines.

The Money Reality

November 2025 AI funding: $3.5B+ in first two weeks — On pace to exceed Q3's record
AI now 52.5% of global VC spend YTD — $192.7B invested across 2025 (Bloomberg)
Funding surge by category:

Category	Total Raised	Key Trends
AI Infrastructure	$1.035B	Hardware, security, edge computing
Healthcare & Biotech AI	$585M	Drug discovery, patient care automation
Enterprise AI Agents	$322M	Customer service, sales automation
Vertical SaaS & Workflow AI	$477M	Industry-specific applications
Generative AI Platforms	$251.7M	Content creation, code generation
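
As a rough check on the table above, the short Python sketch below totals the listed categories and computes each one's share. The dollar figures are copied from the table; the names `funding_musd` and `category_shares` are illustrative only. The listed categories need not add up to the $3.5B+ headline figure, since the table may not cover every deal.

```python
# Category totals in millions of USD, copied from the table above.
funding_musd = {
    "AI Infrastructure": 1035.0,
    "Healthcare & Biotech AI": 585.0,
    "Enterprise AI Agents": 322.0,
    "Vertical SaaS & Workflow AI": 477.0,
    "Generative AI Platforms": 251.7,
}


def category_shares(totals):
    """Return each category's share of the listed total, as a percentage."""
    grand_total = sum(totals.values())
    return {name: 100.0 * amount / grand_total for name, amount in totals.items()}


listed_total = sum(funding_musd.values())
shares = category_shares(funding_musd)

# Print categories from largest to smallest share.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<30} ${funding_musd[name]:>8,.1f}M  {pct:5.1f}%")
print(f"Listed categories total: ${listed_total / 1000:.2f}B")
```

On these figures, AI Infrastructure is the largest single category among those listed.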

Startup Radar — who to watch (and how to help)

Startup	One-liner (what it builds)	Stage	Why promising	What they need
Physical Intelligence	Embodied AI for robotics	Series D	Clear founder pedigree, scaling fast	Enterprise robotics pilots
Luma AI	Video generation + compute infrastructure	Growth	Saudi backing = geopolitical play	Creator adoption, API integrations
Metropolis	Edge AI for physical spaces	Series B	Security + efficiency combo	Retail/logistics pilots
Baidu ERNIE	Multimodal models for schematic-heavy industries	Released	Benchmark beats, open-licensed	GPU access, enterprise partnerships

❝

The model wars are now real—Gemini 3, GPT-5.1, and Grok 4 are all credible. Pick based on latency, reasoning depth, and real-world performance—not marketing.

Cahn’s POV

Dev Corner: What to Build Next

Gemini 3 "Thinking" mode pilots: Test advanced reasoning on your complex workflows; compare latency and accuracy against GPT-5.1 Thinking
Multi-model routing: Build a router that selects Gemini 3 Pro for fast tasks, GPT-5.1 Thinking for deep reasoning, Grok 4 for real-time web context
Jobs platform experiments: OpenAI's jobs platform (mid-2026) signals enterprise demand; prep your agent workflows for HR/hiring automation
Edge + voice workflows: Combine MAI-Voice-1 with Metropolis edge AI for real-time ops dashboards in retail/logistics
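
The multi-model routing idea in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production router: the model strings are plain labels rather than verified API identifiers, the `Task` fields and the 3,000 ms threshold are hypothetical, and no provider API is called.

```python
from dataclasses import dataclass

# Placeholder labels only; these are NOT verified API model identifiers.
FAST_MODEL = "gemini-3"
DEEP_MODEL = "gpt-5.1-thinking"
LIVE_MODEL = "grok-4"

# Hypothetical cutoff: below this latency budget, skip the slow reasoning model.
MIN_DEEP_LATENCY_MS = 3000


@dataclass
class Task:
    prompt: str
    needs_fresh_web_data: bool = False
    needs_deep_reasoning: bool = False
    latency_budget_ms: int = 5000


def route(task: Task) -> str:
    """Pick a model label for a task using simple, ordered rules."""
    # Real-time web context takes priority over everything else.
    if task.needs_fresh_web_data:
        return LIVE_MODEL
    # Deep reasoning only when the caller can afford the extra latency.
    if task.needs_deep_reasoning and task.latency_budget_ms >= MIN_DEEP_LATENCY_MS:
        return DEEP_MODEL
    # Default: the fast model for routine work.
    return FAST_MODEL


if __name__ == "__main__":
    examples = [
        Task("Summarize this support ticket"),
        Task("Review this pull request for race conditions", needs_deep_reasoning=True),
        Task("What changed in today's funding news?", needs_fresh_web_data=True),
    ]
    for task in examples:
        print(f"{route(task):<18} <- {task.prompt}")
```

Keeping the rules ordered and explicit makes the routing easy to audit; a real system would likely replace the boolean flags with a classifier and log latency and accuracy per model to tune the threshold.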

The Agentic Shift

Reasoning: Gemini 3 and GPT-5.1's "Thinking" modes are the new frontier—slow but accurate, perfect for code review, complex analysis, scientific work.
Real-time: Grok 4's DeepSearch integration and Google's live Search integration mean agents now have current web context built-in.
Robotics: Embodied AI is the new narrative—Physical Intelligence, Prometheus, and Boston Dynamics partnerships heating up.
Infrastructure: Anthropic and Microsoft's data center blitz signals a shift from "AI for apps" to "AI as infrastructure."

Voice as the New Interface
Microsoft's MAI-Voice-1 is low-latency and expressive—the new bar for conversational AI in phones, cars, and wearables. Grok 4's natural voice adds another vector.


Tools to Try

Tool	What it does	Best For
Soul Gen	Generating stylized characters, mascots, avatars, and fantasy visuals	Creativity-focused tool
Gensmo AI	Concept visuals, character design, thumbnails, and creative mockups	AI Stylist and Virtual TryOn
BeMyEyes AI	AI-powered accessibility for people who are blind or have low vision	AI-powered virtual assistance
Luma Dream Machine	Video generation + infrastructure partnership	Creative production at scale
Metropolis	Edge AI for retail/warehouse/logistics ops	Real-time ops, physical spaces

Want the full breakdown on model benchmarks, latency testing, and routing strategies? Email [email protected] for model comparison deep dives.

Breakthrough Tools


1. Google Gemini 3 Pro: State-of-the-art multimodal reasoning; "Thinking" mode unlocks PhD-level problem solving. Available live in Search, Gemini app, and Vertex AI. The most aggressive Google launch in the AI arms race.

2. OpenAI GPT-5.1 (Instant & Thinking): Dual variants let you trade speed for reasoning depth. Jobs platform (mid-2026) signals enterprise-first roadmap.

3. Grok 4: DeepSearch integration gives real-time web context; advanced reasoning trained on Colossus. The "current events" model in a market of "static knowledge" models.

4. Microsoft's In-House Stack (MAI-Voice-1 + MAI-1-preview): A bold "no dependency" move; 15,000 H100 GPUs training a bespoke foundation model signals Microsoft's long-term autonomy.

AI x Creativity Exclusive

Marble (Nov 12): Fei-Fei Li's bet that spatial intelligence is the next AI frontier. Unlike text-to-image or text-to-video, Marble generates entire navigable 3D worlds—editable, exportable, and ready for downstream use (VR, web, gaming engines). The hybrid editor lets you block out spaces like a designer while AI fills in details. This is the infrastructure layer for immersive AI.
Scribe v2 Realtime (Nov 12): ElevenLabs' answer to the speech-to-text race. Sub-150ms latency means real-time agents that feel human. Integrated directly into their Agents platform, so you can now wire voice → transcription → reasoning → synthesis in a single workflow.
Multi-Model Generative Fill (Adobe) (Nov 15): Adobe just unbundled creative tools. Instead of "Use Firefly," creators now route to Gemini for photorealism, FLUX.1 for art styles, or Firefly for branding consistency. This is the beginning of the end for single-model lock-in.

❝

Two quick plays

Creators: Export Marble 3D worlds as 10-15s Reels with ElevenLabs voice narration—position as "AI co-creation" for 3x engagement on wellness/meditation content.

Founders: Build a vertical SaaS: voice → Marble 3D → Generative Fill → ElevenLabs tour = design concierge at scale (perfect for spa/wellness booking pilots).

Cahn’s AI Canvas

Not to Miss Events

The AI Summit New York (Dec 2025): Explore how AI is reshaping industries and redefining what is possible.
Microsoft Ignite: Deep dive on the MAI stack and H100 utilization
Google Gemini Developer Summit: Best practices for Gemini 3 and Thinking mode integration
AWS re:Invent 2025 (Dec 1): BCG, a strategic alliance partner of AWS, is sending a delegation to share its latest thinking on the future of cloud-based tech and how to scale AI innovation.
Must-read: Nathan Lambert's essay on open models outpacing closed labs; Andrej Karpathy on AI and verifiable work

Fireside Chat

“As creative work becomes faster and more automated with AI, what is the most irreplaceable element of human creativity that technology can never replicate, and how do we ensure we don’t lose it in the race for efficiency?”

AI PUN

Toss between Apple and Google

Cahn’s POV

This week marked a fundamental shift: the model wars are no longer theoretical. Gemini 3, GPT-5.1, and Grok 4 are all live, all credible, and all competing on measurable benchmarks. Google's aggressive rollout to Search signals that they're fighting for relevance against ChatGPT's network effects. OpenAI's dual-variant strategy (Instant + Thinking) and jobs platform hint at long-term enterprise positioning. Elon's Grok 4 (with real-time web context) is the dark horse—it could disrupt search and news if adoption accelerates.

The real winner? Builders. You now have three distinct models to choose from, each with different strengths. The companies that wire multi-model routing into their workflows—fast for routine tasks, deep for complex reasoning, real-time for trending topics—will own the next cycle.

That's it for today!


See you Next Week,

Aditi and Swati — The humans behind Cahn’s AI Canvas.

📩This week felt like AI stopped asking for permission and started filing expense reports. Agents got gutsier, platforms picked sides, and chat stopped being “just chat.”

If you're a creator, dev, or artist trying to figure out where you fit—you're not alone. This isn't doom and gloom. It's about knowing what's real, what's hype, and where the actual opportunities are.

Want our setup checklist for local AI agents + our best prompt pack? Reply "Loop."

Stay Creative. Stay Updated. Get in Touch : [email protected]

Edition #30 covered Nov 15-Nov 21, 2025. All news verified from mainstream sources with direct article links provided.

Disclaimer: The information presented in this newsletter is curated from public sources on the internet. All content is for informational purposes only.

Gemini 3 Lands & The Model Wars Heat Up Cahn's AI Canvas #Edition 30 (Nov 14 - Nov 21)
