AI Just Split Into Three Incompatible Futures
In the span of four days in mid-March 2026, the AI ecosystem shipped three fundamentally incompatible visions of how the next generation of intelligent systems should work. Nobody's talking about the contradiction. They should be.
On March 11, Baidu launched DuClaw, a zero-deployment service for OpenClaw agents — agents that operate entirely through a web interface. Three days later, Garry Tan released gstack, an open-source system for deploying Claude Code agents directly into your code repositories, reducing latency from 3-5 seconds to 100-200 milliseconds by running a persistent headless Chromium daemon. On the same day, Stanford researchers released OpenJarvis, a local-first framework that moves computation entirely onto your device, keeping data private and eliminating cloud dependencies altogether.
And then there's Google's Groundsource project, which took 5 million news articles spanning two decades, fed them to Gemini to extract 2.6 million distinct flood events, and created a geo-tagged time series database that now powers flood predictions across 150 countries. That's not an agent framework. That's something else entirely — LLMs as a data extraction layer for the pre-AI era.
The industry is treating these as separate announcements. They're not. They're three competing bets about the fundamental architecture of AI systems, and they cannot all be right.
The Cloud-Native Vision: Agents as Integrations
The pitch is frictionless. You want an AI agent in your codebase? Install gstack. You want one managing your business processes? Use DuClaw. No infrastructure to set up. No model to download. No API keys to rotate. The agent plugs into your existing workflows — your GitHub repos, your browser sessions, your email, your CRM — and operates *through* them as if it were you, just faster and tireless.
gstack embodies this. It ships eight core commands: `/plan-ceo-review`, `/plan-eng-review`, `/review`, `/ship`, `/browse`, `/qa`, `/setup-browser-cookies`, `/retro`. The persistent Chromium daemon is the key insight — instead of spinning up a new browser for each request (a 3-5 second cold start), you keep one running and reuse it (100-200ms warm). That's not just faster. That's the difference between "useful for occasional tasks" and "useful for continuous work."
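The warm-reuse pattern behind that latency gap is simple to sketch. Here's a minimal Python illustration, with a hypothetical `BrowserSession` standing in for a headless Chromium instance — this is the general pattern, not gstack's actual implementation:

```python
from typing import Optional

class BrowserSession:
    """Stand-in for a headless Chromium instance (hypothetical).
    Constructing one models the expensive 3-5 second cold start."""
    def navigate(self, url: str) -> str:
        return f"loaded {url}"

class PersistentDaemon:
    """Hold one session open and hand it to every request,
    so only the first caller pays the launch cost."""
    def __init__(self) -> None:
        self._session: Optional[BrowserSession] = None

    def acquire(self) -> BrowserSession:
        if self._session is None:      # cold path: launch exactly once
            self._session = BrowserSession()
        return self._session           # warm path: near-instant reuse

daemon = PersistentDaemon()
first = daemon.acquire()    # pays the cold start
second = daemon.acquire()   # reuses the same live session
```

The design choice is the same one connection pools make: amortize an expensive setup across many requests, at the cost of keeping a long-lived process around.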
Baidu's DuClaw is even simpler. RMB 17.8/month (roughly $2.50 USD). Web interface. No setup. The roadmap mentions integrations with WeCom, DingTalk, and Feishu — the Chinese equivalents of Slack, Teams, and whatever your enterprise collaboration tool is. The message is clear: we'll handle the infrastructure. You just describe what you want done.
This architecture assumes something crucial: you trust the cloud provider with your data. Your code. Your browser sessions. Your email. Your business logic. The agent needs access to all of it to be useful, and that access flows through servers you don't control.
For many teams, that's fine. The convenience is worth it. But it's a choice, not a default.
The Local-First Vision: Agents as Devices
Stanford's OpenJarvis starts from a different assumption: you don't want to trust anyone with your data, and you shouldn't have to.
The framework is built on five primitives: Intelligence (the model), Engine (the reasoning loop), Agents (the task executors), Tools & Memory (what the agent can do and remember), and Learning (how it improves over time). Everything runs on your device. The model is local. The data never leaves your machine. There's no API call, no cloud dependency, no recurring cost.
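As a rough sketch, the five primitives could map onto a structure like the following. All names and signatures here are illustrative, not the actual OpenJarvis API, and a trivial stub stands in for the local model:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Intelligence:
    """The model primitive: text in, text out. Stubbed here."""
    generate: Callable[[str], str]

@dataclass
class ToolsAndMemory:
    """What the agent can do (tools) and what it remembers (memory)."""
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)

@dataclass
class Agent:
    """A task executor, driven by the Engine's reasoning loop."""
    name: str

class Engine:
    """The reasoning loop: consult the model, record the exchange.
    The Learning primitive (improving over time) is omitted here."""
    def __init__(self, model: Intelligence, state: ToolsAndMemory) -> None:
        self.model, self.state = model, state

    def run(self, agent: Agent, task: str) -> str:
        answer = self.model.generate(task)                              # Intelligence
        self.state.memory.append(f"{agent.name}: {task} -> {answer}")   # Memory
        return answer

# A toy "local model" that just upcases its prompt.
echo_model = Intelligence(generate=lambda prompt: prompt.upper())
engine = Engine(echo_model, ToolsAndMemory())
result = engine.run(Agent("local"), "hello")
```

Everything in this loop lives in one process on one machine — no network call appears anywhere, which is the whole point of the architecture.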
The efficiency gains are substantial. Stanford's research shows that local models can handle 88.7% of single-turn queries at interactive latencies. From 2023 to 2025, they achieved a 5.3x efficiency improvement — what required a data center three years ago now runs on your laptop. The framework is designed to be hardware-aware, so it scales from your phone to a workstation without rewriting.
The pitch here is different: "AI that never leaves your machine. Privacy by default. No vendor lock-in. No monthly fees."
It's also slower for complex tasks. If you need the reasoning power of a frontier model, you're waiting. And if you want the agent to integrate with cloud services — your email, your SaaS tools, your team's shared documents — you're managing that integration yourself. Local-first is powerful, but it's not frictionless.
The Data Pipeline Vision: LLMs as Extraction
Google's Groundsource is doing something neither cloud agents nor local-first agents do. It's not managing workflows or operating autonomously. It's mining unstructured human knowledge and converting it into structured scientific data.
The numbers are worth sitting with. 5 million news articles. 20+ years of archives. 2.6 million distinct flood events extracted and geo-tagged. A time series database that now powers flood predictions with 20-square-kilometer resolution across 150 countries.
This is the first time Google has deployed LLMs for scientific data extraction at this scale. Not for automation. Not for integration. For knowledge extraction. As Juliet Rothenberg from Google's Resilience team explained: "Because we're aggregating millions of reports, the Groundsource dataset actually helps rebalance the map. It enables us to extrapolate to other regions where there isn't as much information."
In other words: LLMs are good at reading human-written text and understanding what it means. We have centuries of human knowledge locked in unstructured text — news archives, historical documents, research papers, field reports. Before LLMs, that knowledge was trapped. Now it's extractable. That's not a workflow automation tool. That's a new layer of infrastructure for turning the pre-AI era's knowledge into datasets the AI era can use.
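The shape of such a pipeline is straightforward even though the extraction step is the hard part. Here's a toy Python version in which a regex stub stands in for the Gemini extraction call — the actual Groundsource prompts, schema, and code are not public in this article, so everything below is an assumption about the pattern, not the system:

```python
import re
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class FloodEvent:
    """One geo-taggable event recovered from unstructured prose."""
    location: str
    year: int

def extract_events(article: str) -> List[FloodEvent]:
    """Stub for the LLM step. In Groundsource this would be a
    model call that reads free-form news text; a regex
    approximates it here for illustration only."""
    pattern = r"floods? (?:hit|struck) (?P<loc>[A-Z][a-z]+) in (?P<year>\d{4})"
    return [FloodEvent(m["loc"], int(m["year"]))
            for m in re.finditer(pattern, article)]

def build_time_series(articles: List[str]) -> Dict[int, List[str]]:
    """Aggregate extracted events into a year-keyed series — the
    structured layer that downstream prediction models consume."""
    series: Dict[int, List[str]] = {}
    for article in articles:
        for event in extract_events(article):
            series.setdefault(event.year, []).append(event.location)
    return series

archive = [
    "Severe floods hit Chennai in 2015, displacing thousands.",
    "A flood struck Lagos in 2012 after record rainfall.",
]
series = build_time_series(archive)
```

The interesting property is that the second function never touches raw text: once the extraction layer exists, everything downstream works with clean structured records, which is what makes two decades of archives usable at once.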
The Contradiction
Here's the problem: these three visions are incompatible.
If you're betting on cloud-native agents, you're betting that convenience and integration matter more than privacy and vendor independence. You're building for speed and connectivity. You want the agent to reach everything.
If you're betting on local-first agents, you're betting the opposite. Privacy and independence matter more than integration. You're building for autonomy and offline-first operation. You want the agent to reach nothing outside your machine.
If you're betting on LLM-as-data-pipeline, you're not building agents at all. You're building the infrastructure layer that lets other systems *use* LLMs as a component. You're assuming LLMs will be embedded into existing workflows, not replacing them.
The industry is moving on all three fronts simultaneously. Anthropic is pushing cloud-native (Claude Code, browser use). Stanford and the open-source community are pushing local-first (OpenJarvis, Ollama, Llama.cpp). Google and other incumbents are pushing data extraction (Groundsource, but also every enterprise search tool and knowledge base system now incorporating LLM indexing).
Nobody's explicitly saying these are competing bets. They're treating them as parallel innovation tracks. But they're not. At scale, you have to choose.
The Safety Wildcard
There's a fourth tension lurking underneath. On Hacker News, a developer named zippolyon posted about gstack: "GStack is a brilliant setup for maximizing Claude Code's velocity. But if you are letting an agent run autonomously across your repos, velocity without constraints is terrifying. We recently had Case #001: a Claude Code agent got stuck in a 70-minute loop."
A 70-minute loop. Autonomous. Across repositories. That's the cloud-native vision playing out in real time — the agent has access and autonomy, which is exactly what makes it powerful and exactly what makes it dangerous.
Local-first solves this by limiting scope. Your agent can't get stuck in a loop across your entire company's infrastructure because it doesn't have access to it. But that's a feature only if you accept the privacy tradeoff.
Cloud-native solves this with better guardrails and monitoring. But that requires trusting the cloud provider's guardrails, and trusting that they're actually monitoring. As we covered in our analysis of why non-technical teams are now building core systems, the speed advantage of AI can outpace the safety infrastructure around it.
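What "guardrails" means in practice can be as simple as a budget wrapped around the agent loop. Here's a minimal, hypothetical watchdog — not gstack's or Anthropic's actual mechanism — showing the kind of constraint that would have cut a 70-minute loop short:

```python
import time
from typing import Callable, Optional

class BudgetExceeded(RuntimeError):
    """Raised when an agent run blows past its iteration or time budget."""

def run_with_budget(step: Callable[[], Optional[str]],
                    max_steps: int = 50,
                    max_seconds: float = 600.0) -> str:
    """Drive an agent step function until it returns a result,
    aborting if it exceeds an iteration or wall-clock budget."""
    start = time.monotonic()
    for i in range(max_steps):
        if time.monotonic() - start > max_seconds:
            raise BudgetExceeded(f"wall clock exceeded after {i} steps")
        result = step()
        if result is not None:      # the agent signals completion
            return result
    raise BudgetExceeded(f"no result after {max_steps} steps")

# A stuck agent: its step never produces a result.
try:
    run_with_budget(lambda: None, max_steps=5)
    tripped = False
except BudgetExceeded:
    tripped = True
```

Budgets like this are cheap and crude, which is the point: they bound the blast radius regardless of why the agent got stuck, and they work identically whether the loop runs in a cloud daemon or on your laptop.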
What This Means
The AI ecosystem is not converging on a single architecture. It's diverging into three. And the builders are moving so fast they haven't noticed they're building incompatible futures.
For small teams, this is actually an opportunity. You get to choose. Do you want maximum integration and speed (cloud-native)? Maximum privacy and independence (local-first)? Or do you want to use LLMs as a data layer for something completely different (data pipeline)?
For larger organizations, this is a problem. You'll probably end up using all three simultaneously — cloud agents for customer-facing automation, local agents for sensitive internal work, and LLM extraction for knowledge management. That's three different operational models, three different security models, three different vendor relationships.
The next 18 months will show which bet wins. But the fact that all three are shipping right now, with serious backing, suggests the answer might be: all of them. The AI infrastructure layer isn't consolidating. It's fragmenting into specialized tools for specialized jobs.
The builders know this. They're just not saying it out loud yet.