We’ve reached an inflection point in AI development. The scaling laws that once promised ever-more-capable models are showing diminishing returns. GPT-5, Claude, and Gemini represent remarkable achievements, but they’re hitting asymptotes that brute-force scaling can’t solve. The path to artificial general intelligence isn’t through training ever-larger language models—it’s through building engineered systems that combine models, memory, context, and deterministic workflows into something greater than the sum of their parts.
Let me be blunt: AGI is an engineering problem, not a model training problem.
The Plateauing Reality
The current generation of large language models has hit a wall that’s become increasingly obvious to anyone working with them daily. They’re impressive pattern matchers and text generators, but they remain fundamentally limited by their inability to maintain coherent context across sessions, their lack of persistent memory, and their stochastic nature that makes them unreliable for complex multi-step reasoning.
We’ve seen this movie before. Every technology wave follows the same trajectory: initial breakthrough, rapid scaling, then increasing marginal costs for decreasing marginal gains. The semiconductor industry hit this wall in the mid-2000s, when clock-speed scaling ran into the power wall. The solution then wasn’t to brute-force faster processors—it was to fundamentally rethink the architecture with multi-core designs.
AI is at the same inflection point. We need to stop asking “how do we make the model bigger?” and start asking “how do we make the system smarter?”
The Systems Approach to AGI
The human brain isn’t a single neural net—it’s a collection of specialized systems working in concert: memory formation, context management, logical reasoning, spatial navigation, language processing. Each system has evolved specific purposes, and they operate asynchronously with complex feedback loops between them.
True AGI requires us to engineer similar systems. Here’s what we actually need to build:
1. Context Management as Infrastructure
Current models have context windows measured in thousands, or at best a few million, of tokens. Human context extends across years of lived experience. The gap isn’t just quantitative—it’s qualitative. We need context management systems that can:
- Retrieve and filter relevant information on-demand using sophisticated retrieval systems
- Maintain coherent world models that persist across sessions and evolve with new information
- Bridge context gaps between different specialized knowledge domains
- Handle conflicting information with probabilistic weighting and uncertainty quantification
This requires moving beyond simple vector similarity searches to building operational knowledge graphs that can be updated, queried, and reasoned about in real-time. Our work on Context Engineering provides a foundation for these systems.
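To make this concrete, here is a minimal sketch of what an operational knowledge graph might look like in practice: facts carry a confidence score and a source, conflicting assertions can coexist, and queries return candidates ranked by credibility. The `Fact` and `ContextGraph` names are illustrative assumptions, not a reference to any existing library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Fact:
    """A single assertion in the graph, weighted by confidence."""
    subject: str
    predicate: str
    obj: str
    confidence: float          # 0.0 - 1.0, revised as evidence accumulates
    source: str
    updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ContextGraph:
    """Operational knowledge graph: queryable, updatable, conflict-aware."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], list[Fact]] = {}

    def assert_fact(self, fact: Fact) -> None:
        """Add a fact; conflicting values coexist, each with its own confidence."""
        self._facts.setdefault((fact.subject, fact.predicate), []).append(fact)

    def query(self, subject: str, predicate: str) -> list[Fact]:
        """Return candidate values ranked by confidence, most credible first."""
        candidates = self._facts.get((subject, predicate), [])
        return sorted(candidates, key=lambda f: f.confidence, reverse=True)


if __name__ == "__main__":
    graph = ContextGraph()
    graph.assert_fact(Fact("service-a", "owner", "platform-team", 0.9, "runbook"))
    graph.assert_fact(Fact("service-a", "owner", "infra-team", 0.4, "stale wiki page"))
    best = graph.query("service-a", "owner")[0]
    print(f"{best.obj} (confidence {best.confidence})")
```

The point of the sketch is the contract, not the storage: the same interface could sit on top of a graph database, with retrieval systems writing into it and reasoning components querying it in real time.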
2. Memory as a Service
LLMs don’t have memory—they fake it through elaborate prompt engineering and context stuffing. Real AGI needs memory systems that:
- Update beliefs when contradicted by new evidence
- Consolidate information across multiple experiences into general principles
- Forget irrelevant details without catastrophic forgetting
- Generate meta-knowledge about the reliability and source of stored information
This isn’t just database persistence—it’s building memory systems that evolve the way human memory does: strengthening with use, decaying with disuse, and reorganizing based on new understanding. The architectural patterns from software systems show us how to design such evolving structures.
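As a sketch of that behaviour, assume each memory trace has a strength that decays exponentially with disuse and is reinforced on every successful recall; the half-life and pruning threshold below are arbitrary placeholders, and a real consolidation step would do far more than keyword matching.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass
class MemoryTrace:
    content: str
    strength: float = 1.0
    last_access: float = field(default_factory=time.time)


class MemoryStore:
    """Memory that strengthens with use and decays with disuse."""

    def __init__(self, half_life_seconds: float = 86_400.0) -> None:
        self._half_life = half_life_seconds
        self._traces: list[MemoryTrace] = []

    def remember(self, content: str) -> None:
        self._traces.append(MemoryTrace(content))

    def _decayed_strength(self, trace: MemoryTrace, now: float) -> float:
        elapsed = now - trace.last_access
        return trace.strength * math.exp(-math.log(2) * elapsed / self._half_life)

    def recall(self, keyword: str) -> list[str]:
        """Retrieval reinforces matching traces; everything else keeps decaying."""
        now = time.time()
        hits = []
        for trace in self._traces:
            if keyword.lower() in trace.content.lower():
                trace.strength = self._decayed_strength(trace, now) + 1.0
                trace.last_access = now
                hits.append(trace)
        return [t.content for t in sorted(hits, key=lambda t: t.strength, reverse=True)]

    def forget_weak(self, threshold: float = 0.05) -> None:
        """Prune traces whose decayed strength has fallen below the threshold."""
        now = time.time()
        self._traces = [t for t in self._traces
                        if self._decayed_strength(t, now) >= threshold]
```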
3. Deterministic Workflows with Probabilistic Components
The real breakthrough in AGI will come from building deterministic frameworks that can incorporate probabilistic components when appropriate. Think of it like building a compiler: the overall flow is rigid and predictable, but individual steps can use heuristics and probabilistic optimization.
We need systems that can:
- Route problems to appropriate specialized solvers based on problem characteristics
- Execute multi-step workflows with rollback and recovery capabilities
- Validate outputs through deterministic checks before accepting probabilistic results
- Compose capabilities in predictable ways while maintaining the benefits of stochastic generation
Our research on deterministic vs. probabilistic systems demonstrates how we can build these hybrid architectures effectively. The key insight is that uncertainty should be a first-class concept in system design, not something we try to eliminate.
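One way to read that insight in code: the orchestration loop below is fully deterministic, while each step’s `run` function may be probabilistic (an LLM call, a sampled plan). A step’s output is committed only after a deterministic validator accepts it; otherwise the step retries, and on exhaustion the workflow stops at the last good state. This is a minimal sketch under those assumptions, not a definitive architecture.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]          # may be probabilistic (e.g. a model call)
    validate: Callable[[dict], bool]     # deterministic acceptance check
    max_attempts: int = 3


def run_workflow(steps: list[Step], state: dict) -> dict:
    """Deterministic orchestration of probabilistic components with rollback:
    only validated outputs are committed; failures never corrupt the state."""
    committed = dict(state)
    for step in steps:
        for _attempt in range(step.max_attempts):
            candidate = step.run(dict(committed))   # work on a copy of the state
            if step.validate(candidate):
                committed = candidate               # commit only validated output
                break
        else:
            # every attempt failed validation: nothing was committed for this
            # step, so the workflow effectively rolls back to the last good state
            raise RuntimeError(f"step '{step.name}' failed validation; rolled back")
    return committed
```

A draft-then-check pipeline is the obvious usage: a generation step proposes, a schema or unit-test validator disposes, and the compiler-like outer loop stays predictable throughout.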
4. Specialized Models as Modular Components
The future isn’t one model to rule them all—it’s hundreds or thousands of specialized models working together in orchestrated workflows. Language models remain excellent at linguistic tasks, but they’re terrible at:
- Symbolic manipulation and exact calculation
- Visual-spatial reasoning beyond basic pattern matching
- Temporal reasoning and planning complex sequences
- Intentional agent behavior with persistent goals
Instead of waiting for a breakthrough that makes language models good at everything, we should be building systems that:
- Route problems to models optimized for specific domains (thinking in agents demonstrates this approach; a minimal routing sketch follows this list)
- Combine outputs from different model types into coherent solutions
- Maintain compatibility while allowing individual components to evolve independently
- Handle failure gracefully when individual models underperform
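A minimal routing sketch, assuming nothing more than callables standing in for models: specialists register a predicate describing the problems they handle, and the router degrades to a general-purpose fallback when no specialist matches or a specialist fails. The wiring at the bottom is hypothetical.

```python
from typing import Callable, Protocol


class Solver(Protocol):
    def __call__(self, task: str) -> str: ...


class ModelRouter:
    """Route each task to the specialist registered for its domain,
    falling back gracefully when no specialist matches or one fails."""

    def __init__(self, fallback: Solver) -> None:
        self._routes: list[tuple[Callable[[str], bool], Solver]] = []
        self._fallback = fallback

    def register(self, matches: Callable[[str], bool], solver: Solver) -> None:
        self._routes.append((matches, solver))

    def solve(self, task: str) -> str:
        for matches, solver in self._routes:
            if matches(task):
                try:
                    return solver(task)
                except Exception:
                    break   # specialist failed; degrade to the fallback
        return self._fallback(task)


# Hypothetical wiring: a symbolic calculator for numeric tasks, a general model for the rest.
router = ModelRouter(fallback=lambda task: f"[general model] {task}")
router.register(lambda t: any(ch.isdigit() for ch in t), lambda t: f"[calculator] {t}")
print(router.solve("what is 17 * 23?"))
```

In a real system the predicate would be a learned or rule-based classifier and the solvers would be model endpoints, but the composition pattern is the same.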
The Engineering Challenge
This brings us to the core insight: building AGI is a distributed systems problem, not a machine learning problem. We’ve been fooled into thinking that because data center-scale training clusters are distributed systems, we’re already doing systems engineering. Nothing could be further from the truth.
The real engineering challenge is building:
- Fault-tolerant pipelines where component failures don’t cascade into system failures
- Monitoring and observability systems that can detect when model outputs are drifting or becoming unreliable
- Deployment systems that allow for rolling updates without breaking existing integrations
- Testing frameworks that can validate system behavior across thousands of model and parameter combinations
This is the kind of engineering challenge that requires decades of distributed systems experience, not just machine learning expertise. The solutions will come from infrastructure engineers who understand how to build reliable, scalable systems at the intersection of hardware, software, and AI models.
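To ground two of those requirements, here is one possible shape for them: a wrapper that keeps a single component’s failure from cascading, and a crude drift monitor that watches a sliding window of a single output statistic. Both are sketches; a production system would track far richer signals than response length.

```python
import statistics
from collections import deque
from typing import Callable


def with_fallback(primary: Callable[[str], str],
                  fallback: Callable[[str], str],
                  retries: int = 2) -> Callable[[str], str]:
    """Wrap a model call so one component's failure cannot cascade:
    retry the primary a bounded number of times, then degrade to the fallback."""
    def call(prompt: str) -> str:
        for _ in range(retries):
            try:
                return primary(prompt)
            except Exception:
                continue
        return fallback(prompt)
    return call


class DriftMonitor:
    """Track a simple output statistic (here: response length) over a sliding
    window and flag when recent behaviour departs from the baseline."""

    def __init__(self, window: int = 200, tolerance: float = 3.0) -> None:
        self._lengths: deque[int] = deque(maxlen=window)
        self._tolerance = tolerance

    def observe(self, response: str) -> bool:
        self._lengths.append(len(response))
        if len(self._lengths) < 30:
            return False  # not enough data yet to judge drift
        mean = statistics.mean(self._lengths)
        stdev = statistics.stdev(self._lengths) or 1.0
        return abs(len(response) - mean) > self._tolerance * stdev
```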
What We Should Be Building Instead
While everyone else is focused on scaling the next model, we should be building the infrastructure that makes general intelligence possible. Here’s my roadmap, with minimal interface sketches after the foundation and capability layers:
Phase 1: Foundation Layer
- Context Management Service: Persistent, queryable, versioned knowledge graphs with real-time updates
- Memory Service: Episodic and semantic memory systems with learned consolidation patterns
- Workflow Engine: Deterministic orchestration of probabilistic components with rollback capabilities
- Agent Coordination Layer: Multi-agent systems with negotiated consensus and conflict resolution
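A sketch of what these four services might look like as interfaces, using Python `Protocol` classes purely for illustration; the method names are assumptions, not a published API.

```python
from typing import Protocol


class ContextService(Protocol):
    """Persistent, queryable, versioned knowledge with real-time updates."""
    def assert_fact(self, subject: str, predicate: str, obj: str, confidence: float) -> None: ...
    def query(self, subject: str, predicate: str) -> list[str]: ...


class MemoryService(Protocol):
    """Episodic writes, semantic reads, and consolidation between them."""
    def record_episode(self, content: str) -> None: ...
    def recall(self, cue: str) -> list[str]: ...
    def consolidate(self) -> None: ...


class WorkflowEngine(Protocol):
    """Deterministic orchestration of (possibly probabilistic) steps."""
    def submit(self, workflow_id: str, payload: dict) -> str: ...
    def rollback(self, run_id: str) -> None: ...


class AgentCoordinator(Protocol):
    """Multi-agent consensus: collect proposals, resolve conflicts, decide."""
    def propose(self, agent_id: str, proposal: dict) -> None: ...
    def resolve(self) -> dict: ...
```

The value of pinning these down early is the same as in any distributed system: once the contracts are stable, the implementations behind them can be swapped and scaled independently.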
Phase 2: Capability Layer
- Specialized Model Controls: Fine-tuned models for specific reasoning domains with standardized interfaces
- Symbolic Reasoning Engine: Exact calculation and symbolic manipulation capabilities that work with probabilistic components
- Planning and Goal Management: Systems that can break complex objectives into executable sub-plans
- Cross-modal Integration: Systems that combine sensory inputs (text, vision, audio) into unified representations
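For the planning component, the essential move is decomposition: a complex objective becomes a tree of sub-goals whose leaves are primitive, executable steps. A minimal sketch, with a hypothetical release workflow as the example:

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    description: str
    subgoals: list["Goal"] = field(default_factory=list)

    def is_primitive(self) -> bool:
        return not self.subgoals


def to_plan(goal: Goal) -> list[str]:
    """Depth-first decomposition: a complex objective becomes an ordered
    list of primitive, executable steps."""
    if goal.is_primitive():
        return [goal.description]
    steps: list[str] = []
    for sub in goal.subgoals:
        steps.extend(to_plan(sub))
    return steps


release = Goal("ship the release", [
    Goal("prepare build", [Goal("run tests"), Goal("build artifacts")]),
    Goal("deploy", [Goal("roll out to staging"), Goal("promote to production")]),
])
print(to_plan(release))
# ['run tests', 'build artifacts', 'roll out to staging', 'promote to production']
```

In a full system the decomposition itself could be proposed by a model and then validated and executed by the deterministic workflow engine described above.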
Phase 3: Emergence Layer
This is where real AGI emerges—from the interaction of all these components working together, not from any single breakthrough model. The system’s capabilities will exceed those of its individual parts through emergent properties that arise from careful architectural design.
The Path Forward
The path to AGI isn’t through training a bigger transformer—it’s through building distributed systems that can orchestrate hundreds of specialized models, maintain coherent context across sessions, execute deterministic workflows around probabilistic components, and provide fault-tolerant operation at production scale.
This is fundamentally engineering work, requiring decades of experience building reliable distributed systems. The breakthroughs will come from infrastructure engineers who understand how to build context paths, memory systems, workflow orchestration, and model coordination at scale.
The race to AGI isn’t being won by the team with the biggest GPU cluster—it’s being won by the team that understands how to build reliable, engineered AI systems that can actually reason across domains while maintaining consistent behavior.
The models we have now are sufficient. The missing piece is the systems engineering that turns them into general intelligence.
We’ve been asking the wrong question. It’s not “how do we get to the next model breakthrough?” It’s “how do we build the system architecture that makes general intelligence inevitable with the models we already have?”
The answer is systems engineering. The future of AGI is architectural, not algorithmic.