We’ve reached an inflection point in AI development. The scaling laws that once promised ever-more-capable models are showing diminishing returns. GPT-5, Claude, and Gemini represent remarkable achievements, but they’re hitting asymptotes that brute-force scaling can’t solve. The path to artificial general intelligence isn’t through training ever-larger language models—it’s through building engineered systems that combine models, memory, context, and deterministic workflows into something greater than the sum of their parts.
Let me be blunt: AGI is an engineering problem, not a model training problem.
The Plateauing Reality
The current generation of large language models has hit a wall that’s become increasingly obvious to anyone working with them daily. They’re impressive pattern matchers and text generators, but they remain fundamentally limited by their inability to maintain coherent context across sessions, their lack of persistent memory, and their stochastic nature that makes them unreliable for complex multi-step reasoning.
We’ve seen this movie before. Every technology wave follows the same trajectory: initial breakthrough, rapid scaling, then increasing marginal costs for decreasing marginal gains. The semiconductor industry hit this wall in the mid-2000s, when clock-speed scaling ran into the power wall. The solution then wasn’t to brute-force faster processors—it was to fundamentally rethink the architecture with multi-core designs.
AI is at the same inflection point. We need to stop asking “how do we make the model bigger?” and start asking “how do we make the system smarter?”
The Systems Approach to AGI
The human brain isn’t a single neural net—it’s a collection of specialized systems working in concert: memory formation, context management, logical reasoning, spatial navigation, language processing. Each system has evolved specific purposes, and they operate asynchronously with complex feedback loops between them.
True AGI requires us to engineer similar systems. Here’s what we actually need to build:
1. Context Management as Infrastructure
Current models have context windows measured in thousands, or at best a few million, of tokens. Human context extends across years of lived experience. The gap isn’t just quantitative—it’s qualitative. We need context management systems that can:
- Retrieve and filter relevant information on-demand using sophisticated retrieval systems
- Maintain coherent world models that persist across sessions and evolve with new information
- Bridge context gaps between different specialized knowledge domains
- Handle conflicting information with probabilistic weighting and uncertainty quantification
This requires moving beyond simple vector similarity searches to building operational knowledge graphs that can be updated, queried, and reasoned about in real-time. Our work on Context Engineering provides a foundation for these systems.
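To make this concrete, here is a minimal sketch of what an operational knowledge graph might look like in practice: facts carry a confidence score and a source, conflicting assertions can coexist, and queries return candidates ranked by credibility. The `Fact` and `ContextGraph` names are illustrative assumptions, not a reference to any existing library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Fact:
    """A single assertion in the graph, weighted by confidence."""
    subject: str
    predicate: str
    obj: str
    confidence: float          # 0.0 - 1.0, revised as evidence accumulates
    source: str
    updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ContextGraph:
    """Operational knowledge graph: queryable, updatable, conflict-aware."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], list[Fact]] = {}

    def assert_fact(self, fact: Fact) -> None:
        """Add a fact; conflicting values coexist, each with its own confidence."""
        self._facts.setdefault((fact.subject, fact.predicate), []).append(fact)

    def query(self, subject: str, predicate: str) -> list[Fact]:
        """Return candidate values ranked by confidence, most credible first."""
        candidates = self._facts.get((subject, predicate), [])
        return sorted(candidates, key=lambda f: f.confidence, reverse=True)


if __name__ == "__main__":
    graph = ContextGraph()
    graph.assert_fact(Fact("service-a", "owner", "platform-team", 0.9, "runbook"))
    graph.assert_fact(Fact("service-a", "owner", "infra-team", 0.4, "stale wiki page"))
    best = graph.query("service-a", "owner")[0]
    print(f"{best.obj} (confidence {best.confidence})")
```

The point of the sketch is the contract, not the storage: the same interface could sit on top of a graph database, with retrieval systems writing into it and reasoning components querying it in real time.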
2. Memory as a Service
LLMs don’t have memory—they fake it through elaborate prompt engineering and context stuffing. Real AGI needs memory systems that:
- Update beliefs when contradicted by new evidence
- Consolidate information across multiple experiences into general principles
- Forget irrelevant details without catastrophic forgetting
- Generate meta-knowledge about the reliability and source of stored information
This isn’t just database persistence—it’s building memory systems that evolve the way human memory does: strengthening with use, decaying with disuse, and reorganizing based on new understanding. The architectural patterns from software systems show us how to design such evolving structures.
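As a sketch of that behaviour, assume each memory trace has a strength that decays exponentially with disuse and is reinforced on every successful recall; the half-life and pruning threshold below are arbitrary placeholders, and a real consolidation step would do far more than keyword matching.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass
class MemoryTrace:
    content: str
    strength: float = 1.0
    last_access: float = field(default_factory=time.time)


class MemoryStore:
    """Memory that strengthens with use and decays with disuse."""

    def __init__(self, half_life_seconds: float = 86_400.0) -> None:
        self._half_life = half_life_seconds
        self._traces: list[MemoryTrace] = []

    def remember(self, content: str) -> None:
        self._traces.append(MemoryTrace(content))

    def _decayed_strength(self, trace: MemoryTrace, now: float) -> float:
        elapsed = now - trace.last_access
        return trace.strength * math.exp(-math.log(2) * elapsed / self._half_life)

    def recall(self, keyword: str) -> list[str]:
        """Retrieval reinforces matching traces; everything else keeps decaying."""
        now = time.time()
        hits = []
        for trace in self._traces:
            if keyword.lower() in trace.content.lower():
                trace.strength = self._decayed_strength(trace, now) + 1.0
                trace.last_access = now
                hits.append(trace)
        return [t.content for t in sorted(hits, key=lambda t: t.strength, reverse=True)]

    def forget_weak(self, threshold: float = 0.05) -> None:
        """Prune traces whose decayed strength has fallen below the threshold."""
        now = time.time()
        self._traces = [t for t in self._traces
                        if self._decayed_strength(t, now) >= threshold]
```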
3. Deterministic Workflows with Probabilistic Components
The real breakthrough in AGI will come from building deterministic frameworks that can incorporate probabilistic components when appropriate. Think of it like building a compiler: the overall flow is rigid and predictable, but individual steps can use heuristics and probabilistic optimization.
We need systems that can:
- Route problems to appropriate specialized solvers based on problem characteristics
- Execute multi-step workflows with rollback and recovery capabilities
- Validate outputs through deterministic checks before accepting probabilistic results
- Compose capabilities in predictable ways while maintaining the benefits of stochastic generation
Our research on deterministic vs. probabilistic systems demonstrates how we can build these hybrid architectures effectively. The key insight is that uncertainty should be a first-class concept in system design, not something we try to eliminate.
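One way to read that insight in code: the orchestration loop below is fully deterministic, while each step’s `run` function may be probabilistic (an LLM call, a sampled plan). A step’s output is committed only after a deterministic validator accepts it; otherwise the step retries, and on exhaustion the workflow stops at the last good state. This is a minimal sketch under those assumptions, not a definitive architecture.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]          # may be probabilistic (e.g. a model call)
    validate: Callable[[dict], bool]     # deterministic acceptance check
    max_attempts: int = 3


def run_workflow(steps: list[Step], state: dict) -> dict:
    """Deterministic orchestration of probabilistic components with rollback:
    only validated outputs are committed; failures never corrupt the state."""
    committed = dict(state)
    for step in steps:
        for _attempt in range(step.max_attempts):
            candidate = step.run(dict(committed))   # work on a copy of the state
            if step.validate(candidate):
                committed = candidate               # commit only validated output
                break
        else:
            # every attempt failed validation: nothing was committed for this
            # step, so the workflow effectively rolls back to the last good state
            raise RuntimeError(f"step '{step.name}' failed validation; rolled back")
    return committed
```

A draft-then-check pipeline is the obvious usage: a generation step proposes, a schema or unit-test validator disposes, and the compiler-like outer loop stays predictable throughout.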
4. Specialized Models as Modular Components
The future isn’t one model to rule them all—it’s hundreds or thousands of specialized models working together in orchestrated workflows. Language models remain excellent at linguistic tasks, but they’re terrible at:
- Symbolic manipulation and exact calculation
- Visual-spatial reasoning beyond basic pattern matching
- Temporal reasoning and planning complex sequences
- Intentional agent behavior with persistent goals
Instead of waiting for a breakthrough that makes language models good at everything, we should be building systems that:
- Route problems to models optimized for specific domains (thinking in agents demonstrates this approach; a minimal routing sketch follows this list)
- Combine outputs from different model types into coherent solutions
- Maintain compatibility while allowing individual components to evolve independently
- Handle failure gracefully when individual models underperform
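A minimal routing sketch, assuming nothing more than callables standing in for models: specialists register a predicate describing the problems they handle, and the router degrades to a general-purpose fallback when no specialist matches or a specialist fails. The wiring at the bottom is hypothetical.

```python
from typing import Callable, Protocol


class Solver(Protocol):
    def __call__(self, task: str) -> str: ...


class ModelRouter:
    """Route each task to the specialist registered for its domain,
    falling back gracefully when no specialist matches or one fails."""

    def __init__(self, fallback: Solver) -> None:
        self._routes: list[tuple[Callable[[str], bool], Solver]] = []
        self._fallback = fallback

    def register(self, matches: Callable[[str], bool], solver: Solver) -> None:
        self._routes.append((matches, solver))

    def solve(self, task: str) -> str:
        for matches, solver in self._routes:
            if matches(task):
                try:
                    return solver(task)
                except Exception:
                    break   # specialist failed; degrade to the fallback
        return self._fallback(task)


# Hypothetical wiring: a symbolic calculator for numeric tasks, a general model for the rest.
router = ModelRouter(fallback=lambda task: f"[general model] {task}")
router.register(lambda t: any(ch.isdigit() for ch in t), lambda t: f"[calculator] {t}")
print(router.solve("what is 17 * 23?"))
```

In a real system the predicate would be a learned or rule-based classifier and the solvers would be model endpoints, but the composition pattern is the same.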
The Engineering Challenge
This brings us to the core insight: building AGI is a distributed systems problem, not a machine learning problem. We’ve been fooled into thinking that because data center-scale training clusters are distributed systems, we’re already doing systems engineering. Nothing could be further from the truth.
The real engineering challenge is building:
- Fault-tolerant pipelines where component failures don’t cascade into system failures
- Monitoring and observability systems that can detect when model outputs are drifting or becoming unreliable
- Deployment systems that allow for rolling updates without breaking existing integrations
- Testing frameworks that can validate system behavior across thousands of model and parameter combinations
This is the kind of engineering challenge that requires decades of distributed systems experience, not just machine learning expertise. The solutions will come from infrastructure engineers who understand how to build reliable, scalable systems at the intersection of hardware, software, and AI models.
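To ground two of those requirements, here is one possible shape for them: a wrapper that keeps a single component’s failure from cascading, and a crude drift monitor that watches a sliding window of a single output statistic. Both are sketches; a production system would track far richer signals than response length.

```python
import statistics
from collections import deque
from typing import Callable


def with_fallback(primary: Callable[[str], str],
                  fallback: Callable[[str], str],
                  retries: int = 2) -> Callable[[str], str]:
    """Wrap a model call so one component's failure cannot cascade:
    retry the primary a bounded number of times, then degrade to the fallback."""
    def call(prompt: str) -> str:
        for _ in range(retries):
            try:
                return primary(prompt)
            except Exception:
                continue
        return fallback(prompt)
    return call


class DriftMonitor:
    """Track a simple output statistic (here: response length) over a sliding
    window and flag when recent behaviour departs from the baseline."""

    def __init__(self, window: int = 200, tolerance: float = 3.0) -> None:
        self._lengths: deque[int] = deque(maxlen=window)
        self._tolerance = tolerance

    def observe(self, response: str) -> bool:
        self._lengths.append(len(response))
        if len(self._lengths) < 30:
            return False  # not enough data yet to judge drift
        mean = statistics.mean(self._lengths)
        stdev = statistics.stdev(self._lengths) or 1.0
        return abs(len(response) - mean) > self._tolerance * stdev
```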
What We Should Be Building Instead
While everyone else is focused on scaling the next model, we should be building the infrastructure that makes general intelligence possible. Here’s my roadmap, with minimal interface sketches after the foundation and capability layers:
Phase 1: Foundation Layer
- Context Management Service: Persistent, queryable, versioned knowledge graphs with real-time updates
- Memory Service: Episodic and semantic memory systems with learned consolidation patterns
- Workflow Engine: Deterministic orchestration of probabilistic components with rollback capabilities
- Agent Coordination Layer: Multi-agent systems with negotiated consensus and conflict resolution
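A sketch of what these four services might look like as interfaces, using Python `Protocol` classes purely for illustration; the method names are assumptions, not a published API.

```python
from typing import Protocol


class ContextService(Protocol):
    """Persistent, queryable, versioned knowledge with real-time updates."""
    def assert_fact(self, subject: str, predicate: str, obj: str, confidence: float) -> None: ...
    def query(self, subject: str, predicate: str) -> list[str]: ...


class MemoryService(Protocol):
    """Episodic writes, semantic reads, and consolidation between them."""
    def record_episode(self, content: str) -> None: ...
    def recall(self, cue: str) -> list[str]: ...
    def consolidate(self) -> None: ...


class WorkflowEngine(Protocol):
    """Deterministic orchestration of (possibly probabilistic) steps."""
    def submit(self, workflow_id: str, payload: dict) -> str: ...
    def rollback(self, run_id: str) -> None: ...


class AgentCoordinator(Protocol):
    """Multi-agent consensus: collect proposals, resolve conflicts, decide."""
    def propose(self, agent_id: str, proposal: dict) -> None: ...
    def resolve(self) -> dict: ...
```

The value of pinning these down early is the same as in any distributed system: once the contracts are stable, the implementations behind them can be swapped and scaled independently.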
Phase 2: Capability Layer
- Specialized Model Controls: Fine-tuned models for specific reasoning domains with standardized interfaces
- Symbolic Reasoning Engine: Exact calculation and symbolic manipulation capabilities that work with probabilistic components
- Planning and Goal Management: Systems that can break complex objectives into executable sub-plans
- Cross-modal Integration: Systems that combine sensory inputs (text, vision, audio) into unified representations
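For the planning component, the essential move is decomposition: a complex objective becomes a tree of sub-goals whose leaves are primitive, executable steps. A minimal sketch, with a hypothetical release workflow as the example:

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    description: str
    subgoals: list["Goal"] = field(default_factory=list)

    def is_primitive(self) -> bool:
        return not self.subgoals


def to_plan(goal: Goal) -> list[str]:
    """Depth-first decomposition: a complex objective becomes an ordered
    list of primitive, executable steps."""
    if goal.is_primitive():
        return [goal.description]
    steps: list[str] = []
    for sub in goal.subgoals:
        steps.extend(to_plan(sub))
    return steps


release = Goal("ship the release", [
    Goal("prepare build", [Goal("run tests"), Goal("build artifacts")]),
    Goal("deploy", [Goal("roll out to staging"), Goal("promote to production")]),
])
print(to_plan(release))
# ['run tests', 'build artifacts', 'roll out to staging', 'promote to production']
```

In a full system the decomposition itself could be proposed by a model and then validated and executed by the deterministic workflow engine described above.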
Phase 3: Emergence Layer
This is where real AGI emerges—from the interaction of all these components working together, not from any single breakthrough model. The system’s capabilities will exceed those of its individual parts through emergent properties that arise from careful architectural design.
The Path Forward
The path to AGI isn’t through training a bigger transformer—it’s through building distributed systems that can orchestrate hundreds of specialized models, maintain coherent context across sessions, execute deterministic workflows around probabilistic components, and provide fault-tolerant operation at production scale.
This is fundamentally engineering work, requiring decades of experience building reliable distributed systems. The breakthroughs will come from infrastructure engineers who understand how to build context paths, memory systems, workflow orchestration, and model coordination at scale.
The race to AGI isn’t being won by the team with the biggest GPU cluster—it’s being won by the team that understands how to build reliable, engineered AI systems that can actually reason across domains while maintaining consistent behavior.
The models we have now are sufficient. The missing piece is the systems engineering that turns them into general intelligence.
We’ve been asking the wrong question. It’s not “how do we get to the next model breakthrough?” It’s “how do we build the system architecture that makes general intelligence inevitable with the models we already have?”
The answer is systems engineering. The future of AGI is architectural, not algorithmic.