We started building Arti like many teams do — with the idea that we needed an orchestration layer for intelligent software.
But our background isn’t in AI demos or LLM wrappers. It’s in industrial software automation — manufacturing, automotive, and systems where software controls real-world processes, robots, and operations. In those environments, failure is not a UX issue. It’s a breakdown with real consequences. And in that world, orchestrating behavior isn’t enough. You need to manage state. Surface decisions. Control complexity. You need production-grade agent systems.
So we approached the agentic problem from first principles. At first, orchestration seemed like the answer. But it quickly became clear:
Orchestration alone doesn’t scale. It creates a bottleneck at the control layer.
What we needed wasn’t another framework. We needed a system architecture: one that could support a Mixture of Experts approach, not in the model sense, but in the system-design sense:
- Modular agents with clear responsibilities
- Context-aware routing
- Typed, testable state
- Prompt behaviors that could be versioned, composed, and evolved
- Collaboration between humans and AI, not top-down command loops
We tested the available frameworks:
- LangGraph: declarative graphs with brittle state passing and painful rigidity.
- AutoGen, CrewAI, Agents SDK: abstractions over abstractions. Easy to start, impossible to trust at scale.
These weren’t frameworks. They were demo wrappers. None of them survived contact with real-world complexity.
And then we found two teams thinking like we were:
- Anthropic, emphasizing pattern clarity and composable simplicity
- PydanticAI, advocating minimal, type-safe, system-aware agent design
Neither of them builds “frameworks” either.
They’re building philosophies that map to real-world complexity without abstraction debt.
In this post, we’ll walk through what we respect in both — and how Arti expands those principles into something deployable, introspectable, and durable.
Because agentic systems aren’t just workflows.
They’re software.
And we build software like it matters.
This post is about what Anthropic and PydanticAI get right, and what’s needed to go further.
Anthropic: Patterns Over Frameworks
Anthropic’s guide to building effective agents is one of the most practical and insightful resources in the agent framework space.
Their core advice?
- Start simple.
- Don’t build agents unless you need them.
- Compose small, testable workflows using prompt chaining, routing, and tool calls.
- Use the LLM as a decision engine, not a black box.
And they’re right. For many use cases, that’s enough.
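To make that concrete, here’s a minimal sketch of routing plus prompt chaining in plain Python. `call_llm` is our stand-in for any model client; the pattern is the point, not the plumbing.

```python
# Minimal sketch of Anthropic-style routing plus prompt chaining.
# call_llm is a placeholder for whatever model client you use.

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (any provider's client works here)."""
    raise NotImplementedError

def route(user_request: str) -> str:
    """Use the LLM as a decision engine: it picks a handler, nothing more."""
    decision = call_llm(
        "Classify this request as 'billing', 'technical', or 'other'. "
        f"Reply with one word.\n\nRequest: {user_request}"
    )
    return decision.strip().lower()

def handle_technical(user_request: str) -> str:
    """Prompt chaining: each step is small, testable, and inspectable."""
    diagnosis = call_llm(f"Diagnose the likely cause of: {user_request}")
    return call_llm(f"Suggest a fix for this diagnosis: {diagnosis}")

def handle(user_request: str) -> str:
    handlers = {"technical": handle_technical}
    handler = handlers.get(route(user_request))
    return handler(user_request) if handler else call_llm(user_request)
```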
But here’s where things break down:
⚠️ The Limits of Pattern-Only Thinking:
- There’s no structured approach to long-term state.
- Tool calls are prompt-engineered, not typed or contracted.
- There’s no model for semantic memory or reusable prompt modules.
- The execution environment lacks runtime introspection, rollback, or monitoring.
- Their approach assumes the LLM is the agent, when in reality it should be just one part of a modular system.
In short: great software starts with patterns, but it scales with architecture.
PydanticAI: Typed, Explicit, Composable Agents
PydanticAI takes a unique approach that we deeply respect.
Where Anthropic is pattern-first, PydanticAI is software-first:
- Typed tools and agent inputs
- Explicit delegation and agent handoffs
- Avoidance of DAG fetishism
- Code that’s readable, testable, inspectable
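Here’s what that looks like in practice: a short sketch in PydanticAI’s style. Exact names like `output_type` and `tool_plain` vary across versions, so treat this as illustrative rather than canonical.

```python
# Sketch in the PydanticAI style: typed inputs, typed tools, typed output.
# Exact API names (e.g. output_type vs result_type) vary by version;
# treat this as illustrative rather than copy-paste ready.
from pydantic import BaseModel
from pydantic_ai import Agent

class OrderStatus(BaseModel):
    order_id: str
    status: str      # e.g. "shipped", "delayed"
    eta_days: int

agent = Agent(
    "openai:gpt-4o",
    output_type=OrderStatus,  # the result is validated, not free text
    system_prompt="Answer order questions using the lookup tool.",
)

@agent.tool_plain
def lookup_order(order_id: str) -> str:
    """A typed tool: the schema comes from the signature, not the prompt."""
    return f"Order {order_id}: shipped, arriving in 2 days"  # stub data

result = agent.run_sync("Where is order 1138?")
print(result.output)  # an OrderStatus instance, or a validation error
```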
Their best metaphor?
“Don’t use a nail gun unless you need one.”
That could be Arti’s motto.
But there are gaps here too:
❗ The Missing Layers:
- There’s no concept of semantic context or ontologies: state is structured, but not meaningfully enriched.
- There’s no built-in notion of collaborative interaction; everything is still “agent does X.”
- There’s minimal treatment of execution observability or runtime introspection.
- Prompt handling is typed, but not versioned, overloaded, or runtime-dispatched.
PydanticAI is the best foundation we’ve seen for agent logic as code — but not yet for agentic systems at scale.
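To make that last gap concrete, here’s a hedged sketch of what versioned, runtime-dispatched prompts could look like. The `PromptRegistry` name and dispatch rule are our own illustration, not Arti’s actual API.

```python
# Hypothetical sketch of versioned, runtime-dispatched prompts.
# Not Arti's actual API; names and dispatch policy are illustrative.
from dataclasses import dataclass, field

@dataclass
class PromptRegistry:
    """Stores prompt templates by (name, version), dispatched at runtime."""
    _prompts: dict[tuple[str, int], str] = field(default_factory=dict)

    def register(self, name: str, version: int, template: str) -> None:
        self._prompts[(name, version)] = template

    def get(self, name: str, version: int | None = None) -> str:
        if version is None:  # default to the latest registered version
            version = max(v for (n, v) in self._prompts if n == name)
        return self._prompts[(name, version)]

registry = PromptRegistry()
registry.register("diagnose", 1, "Diagnose the fault in: {input}")
registry.register("diagnose", 2, "List likely root causes, ranked, for: {input}")

# Callers pin a version for reproducibility, or take the latest by default.
prompt = registry.get("diagnose", version=2).format(input="conveyor stall")
```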
Where We Come From — and Why This Matters
We didn’t start in chatbots or demos. We built automation software for manufacturing, automotive, and industrial systems.
In our world:
- State isn’t ephemeral; it drives machines.
- Errors aren’t recoverable with retries; they cost real money.
- You don’t ship a product that sometimes works.
So when we look at agents, we don’t see magic. We see a control system that needs:
- Versioned, modular behavior
- Semantic context and state inspection
- Real-time observability and rollback
- Collaborative control models between human and AI
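Here’s a minimal sketch of what typed state with rollback means in practice. The names (`LineState`, `apply`) are hypothetical; this illustrates the requirement, not Arti’s implementation.

```python
# Minimal sketch of typed, inspectable state with snapshot/rollback.
# Hypothetical names; illustrates the requirement, not Arti's internals.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class LineState:
    """Typed execution state: every field is inspectable and testable."""
    station: str
    parts_done: int
    fault: str | None = None

history: list[LineState] = []

def apply(state: LineState, **changes) -> LineState:
    """Every transition is recorded, so any step can be replayed or undone."""
    history.append(state)
    return replace(state, **changes)

state = LineState(station="weld-3", parts_done=0)
state = apply(state, parts_done=1)
state = apply(state, fault="torch misalignment")

# Rollback ("time travel"): restore the last known-good snapshot.
state = history[-1]
assert state.fault is None  # back to the pre-fault snapshot
```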
We respect Anthropic’s clarity. We align with PydanticAI’s posture. But we build like systems engineers, not AI whisperers.
What Arti Adds — Without Breaking the Philosophy
Arti builds on the strengths of Anthropic and PydanticAI, extending their principles into a production-grade solution that meets the demands of real-world applications.
| Principle | Anthropic | PydanticAI | Arti |
|---|---|---|---|
| Simplicity first | ✅ | ✅ | ✅ but modularized |
| Typed tools | ❌ | ✅ | ✅ (enforced + testable) |
| Prompt modularity | ⚠️ (manual) | ✅ | ✅ (versioned + runtime dispatched) |
| Execution state | ⚠️ (LLM memory) | ✅ | ✅ (typed + semantic) |
| Human-AI loops | ⚠️ | ❌ | ✅ (interrupts, approvals, collaboration) |
| Observability | ⚠️ | ⚠️ | ✅ (first-class, not afterthought) |
| Rollback / Eval | ❌ | ❌ | ✅ (built-in evaluation + time travel) |
We’re not building a framework. We’re building an execution layer for intelligent, stateful, observable, modular software — where LLMs are a component, not a controller.
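As one illustration of the human-AI loop row above, here’s a hedged sketch of an approval gate. `require_approval` is our own hypothetical name, not a real framework call.

```python
# Hypothetical approval gate: the agent proposes, a human disposes.
# require_approval is an illustrative name, not a real framework API.

def require_approval(action: str, payload: dict) -> bool:
    """Block until a human approves or rejects the proposed action."""
    answer = input(f"Approve {action} {payload}? [y/N] ")
    return answer.strip().lower() == "y"

def execute_plan(steps: list[dict]) -> None:
    for step in steps:
        if step.get("risk") == "high" and not require_approval(step["action"], step):
            print(f"Skipped {step['action']}: human rejected")
            continue  # the run keeps going; the human stays in the loop
        print(f"Executing {step['action']}")

execute_plan([
    {"action": "adjust_speed", "risk": "low"},
    {"action": "halt_line", "risk": "high"},
])
```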
Closing: Patterns Are a Start. Systems Must Follow.
We agree with Anthropic:
“Add complexity only when it demonstrably improves outcomes.”
And we agree with PydanticAI:
“Don’t use more power than you can control.”
But eventually, even the cleanest pattern or best-typed agent will hit a wall — because agents aren’t demos anymore. They’re infrastructure.
Arti is what happens when you treat agents like real systems. Join us as we redefine the future of agentic architecture and build the foundation for the next generation of intelligent systems.
Next up:
Netflix built a foundation model for recommendation systems. We’re building one for cognition. What they’ve done shows us where agentic architecture is headed — and why DAGs, SDKs, and chat loops won’t get us there.