Most agents today are just loops with dreams.
They wrap an LLM call in a while-loop, pass some loosely defined tools, slap together a prompt, and hope it doesn’t fall apart. There’s no structured state, no typed interface, no control flow. It works until it doesn’t. And it never scales.
That’s fine for demos. But production systems don’t run on magic. They run on architecture.
So in this post, we’ll show you what real agentic architecture looks like. Not a wrapper. Not a framework. A blueprint. Built from experience in high-stakes automation, refined through our work on Arti, and grounded in what the best minds in the space are starting to articulate.
Why Architecture, Not Abstraction, Is the Missing Layer
If you want to build agentic software that actually runs in production — in a regulated business, a complex process, or a high-value user environment — you need more than a clever prompt and a handful of tools.
You need:
- Typed, composable context
- Structured execution flows
- Runtime state management
- Built-in support for human-AI collaboration
- Semantic memory
- Observability, evaluation, and replayability
You don’t get that from abstractions. You get that from systems.
The Core Components of Cognitive Agentic Architecture
Here’s what a real agentic system needs under the hood:
1. Context Layer
The system needs a structured, semantic, versioned representation of context. Not string concatenation. Not magical memory.
- Typed input schemas
- Scoped, composable prompt modules
- Ontology-driven interpretation (roles, intents, domain semantics)
- Short-term memory, long-term memory, and context overlays
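To make "typed, composable context" concrete, here is a minimal sketch in Python. All names (`ContextModule`, `AgentContext`, `overlay`) are hypothetical illustrations, not Arti's actual API: the point is that context is built from versioned, typed parts and assembled deterministically, never by ad-hoc string concatenation.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ContextModule:
    """A scoped, versioned piece of context (e.g. a prompt module)."""
    name: str
    version: int
    content: str

@dataclass(frozen=True)
class AgentContext:
    """Typed, composable context: modules are layered, not concatenated ad hoc."""
    user_intent: str
    modules: tuple = ()

    def overlay(self, module: ContextModule) -> "AgentContext":
        """Return a new context with `module` layered on top (replaces by name)."""
        kept = tuple(m for m in self.modules if m.name != module.name)
        return replace(self, modules=kept + (module,))

    def render(self) -> str:
        """Deterministic prompt assembly from typed parts."""
        return "\n\n".join(m.content for m in self.modules)

ctx = AgentContext(user_intent="refund request")
ctx = ctx.overlay(ContextModule("role", 1, "You are a support agent."))
ctx = ctx.overlay(ContextModule("policy", 2, "Refunds over $100 need approval."))
```

Because contexts are immutable values, an overlay produces a new context rather than mutating the old one, which is what makes them versionable and testable.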
2. Execution Layer
This is where decisions get made and behaviors run.
- Typed tools and interfaces
- Behavior routing via Mixture-of-Experts patterns
- Prompt overloading and dispatching
- Retry logic, interrupt/resume hooks
- Typed side effects and tool responses
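A sketch of what typed tools plus behavior routing can look like, under stated assumptions: the decorator, the `ToolResult` type, and the `ROUTES` dispatch table are illustrative inventions, standing in for a richer expert-routing mechanism. The key property is that every tool call returns a typed result, and retry or fallback logic can branch on `ok` instead of catching raw exceptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ToolResult:
    """Typed envelope for every tool side effect."""
    ok: bool
    value: object = None
    error: str = ""

def tool(fn: Callable) -> Callable:
    """Wrap a tool so it always returns a ToolResult, never a raw exception."""
    def wrapped(*args, **kwargs) -> ToolResult:
        try:
            return ToolResult(ok=True, value=fn(*args, **kwargs))
        except Exception as e:
            return ToolResult(ok=False, error=str(e))
    return wrapped

@tool
def lookup_order(order_id: str) -> dict:
    if not order_id.startswith("ORD-"):
        raise ValueError("malformed order id")
    return {"id": order_id, "status": "shipped"}

# Behavior routing: a minimal Mixture-of-Experts-style dispatch table.
ROUTES: Dict[str, Callable[..., ToolResult]] = {"order_lookup": lookup_order}

def dispatch(intent: str, **kwargs) -> ToolResult:
    handler = ROUTES.get(intent)
    if handler is None:
        return ToolResult(ok=False, error=f"no expert for intent {intent!r}")
    return handler(**kwargs)
```

In a real system the dispatch table would be populated by capability metadata rather than hand-written, but the contract is the same: intents route to typed handlers, and failures are values, not crashes.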
3. State Layer
Agents need internal state. World state. Temporal state. And it all has to be tracked, logged, and observable.
- Persistent, structured agent state
- Checkpoints and rewind capability
- Differentiation between model state and user/environment state
- Testable transitions and state contracts
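The checkpoint-and-rewind idea can be sketched in a few lines. This `AgentState` class is a hypothetical simplification (real persistence would go to durable storage, not an in-memory list), but it shows the two requirements above: agent state and environment state are tracked separately, and any transition can be rolled back to a known snapshot.

```python
import copy

class AgentState:
    """Persistent, structured state with checkpoints and rewind."""

    def __init__(self):
        self.data: dict = {}          # agent-internal (model-side) state
        self.env: dict = {}           # user/environment state, tracked separately
        self._checkpoints: list = []

    def checkpoint(self) -> int:
        """Snapshot both state spaces; returns a checkpoint id."""
        self._checkpoints.append((copy.deepcopy(self.data), copy.deepcopy(self.env)))
        return len(self._checkpoints) - 1

    def rewind(self, checkpoint_id: int) -> None:
        """Restore an earlier snapshot and discard later ones."""
        self.data, self.env = copy.deepcopy(self._checkpoints[checkpoint_id])
        del self._checkpoints[checkpoint_id + 1:]

state = AgentState()
state.data["step"] = 1
cp = state.checkpoint()
state.data["step"] = 2   # a transition that turns out to be wrong
state.rewind(cp)         # back to the checkpointed state
```

Because transitions go through explicit snapshots, they become testable: a state contract is just an assertion about what a transition is allowed to change.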
4. Collaboration Layer
Agents don’t run alone. Real systems have users, supervisors, editors, and fallbacks.
- Human-in-the-loop: interrupt, review, approve
- Human-on-the-loop: audit, modify, resume
- Role-specific UX hooks and callbacks
- Multi-party awareness in execution design
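A minimal sketch of the interrupt/review/approve pattern, assuming a queue-based gate (the `ApprovalGate` and `PendingAction` names are illustrative, not a real API). Sensitive actions are parked instead of executed; a human pass then decides which ones run, and execution resumes with the approved results.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class PendingAction:
    """An action the agent wants to take, held for human review."""
    description: str
    run: Callable[[], object]

class ApprovalGate:
    """Human-in-the-loop: interrupt at sensitive steps, review, then resume."""

    def __init__(self):
        self.queue: List[PendingAction] = []

    def request(self, action: PendingAction) -> None:
        self.queue.append(action)   # interrupt: park the action instead of running it

    def review(self, approve: Callable[[PendingAction], bool]) -> list:
        """Run approved actions, drop rejected ones, clear the queue."""
        results = [a.run() for a in self.queue if approve(a)]
        self.queue.clear()
        return results

gate = ApprovalGate()
gate.request(PendingAction("issue $500 refund", lambda: "refund issued"))
results = gate.review(lambda a: "refund" in a.description)
```

The same gate supports human-on-the-loop auditing: the queue is an inspectable record of what the agent intended before anyone approved it.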
5. Observability & Evaluation Layer
You can’t fix what you can’t see. And you can’t trust what you can’t measure.
- Traces of every execution step
- Input/output capture and replay
- Evaluation hooks and test harnesses
- Semantic metrics: relevance, hallucination, efficiency, recovery
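The trace-and-replay requirement can be sketched as a small recorder. This `Tracer` is a deliberately simple stand-in for a full observability stack: every step's name, inputs, and output are captured, so a run can be inspected, exported, and replayed without re-invoking the model or tools.

```python
import json
import time

class Tracer:
    """Capture every execution step so runs can be replayed and evaluated."""

    def __init__(self):
        self.steps: list = []

    def record(self, name: str, inputs: dict, output):
        """Log one step's inputs and output; pass the output through unchanged."""
        self.steps.append(
            {"step": name, "inputs": inputs, "output": output, "ts": time.time()}
        )
        return output

    def replay(self, step_name: str) -> list:
        """Return recorded outputs for a step, without re-running anything."""
        return [s["output"] for s in self.steps if s["step"] == step_name]

    def export(self) -> str:
        """Serialize the full trace for offline evaluation harnesses."""
        return json.dumps(self.steps)

tracer = Tracer()
tracer.record("classify_intent", {"text": "cancel my order"}, "cancellation")
```

Evaluation hooks then become functions over the exported trace: semantic metrics like relevance or hallucination rate are computed from recorded inputs and outputs, not from live runs.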
Where Most Frameworks (and Manifestos) Fall Short
We’ve tested them all. Here’s the pattern:
- LangGraph: brittle state, rigid graphs, hard to change dynamically
- Agents SDK: agent class soup, no control over context, no semantics
- AutoGen, CrewAI: abstraction over abstraction, no observability
And then there’s 12 Factor Agents.
Let’s talk about that.
12 Factor Agents: The Right Questions, Still Waiting for Execution
Dexter Horthy’s “12 Factor Agents” is one of the most thoughtful articulations of what matters in agentic design:
- Own your prompts
- Own your context window
- Make agents composable, restartable, inspectable
- Treat tools as structured interfaces
- Treat context and state as first-class citizens
We agree with almost everything.
But just as the original 12 Factor App manifesto didn’t give you microservices, CI/CD, or Kubernetes, 12FA gives you principles, not architecture.
It doesn’t explain:
- How to implement semantic memory
- How to route behavior across modular agents
- How to run tool calls with type safety
- How to evaluate and debug behavior at runtime
It’s the right floor. But it needs walls, a roof, and power.
How Arti Turns Principles Into Systems
Arti is not a framework. It’s a runtime agentic architecture that implements everything above:
- Typed, modular context objects
- Semantic memory with short-term and long-term separation
- Prompt modules that can be overloaded, versioned, and tested
- Tool contracts defined as code, not config
- Built-in support for human approval, interruption, feedback
- Full observability: traces, evaluations, introspection, rollback
We didn’t build this because it sounded cool. We built it because nothing else worked. And because our background is in automation, where systems have to run under pressure.
Agentic software will be no different.
Conclusion: Philosophy Needs Infrastructure
It’s encouraging to see groups like Anthropic, PydanticAI, and the 12 Factor Agents initiative moving in the same direction — away from hype and toward thoughtful software design.
Their work validates many of the principles we’ve built into Arti from day one: modularity, structure, semantic control, human collaboration, and rigorous system design.
But at the end of the day, principles need execution.
If you’re building an agent that needs to survive contact with reality, don’t start with a framework. Start with the blueprint.
Next up: A detailed look into how Arti implements each layer of the system — starting with the context and memory model.