A Note on This Post: What follows is a vision—not a fully fleshed-out solution. Many pieces remain unsolved: authentication, logging, debugging, determinism, and more. I’m sharing this as a direction I believe we’re heading, not a blueprint ready for production. Think of it as an invitation to explore and refine these ideas together.
How Applications Work Today
If you’ve ever wondered what happens behind the scenes when you use a web application, here’s the typical architecture:
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   Browser   │ ───▶ │   Backend   │ ───▶ │  Database   │
│ (Frontend)  │ ◀─── │   Server    │ ◀─── │             │
└─────────────┘      └─────────────┘      └─────────────┘
 HTML                 Node.js              PostgreSQL
 CSS                  Python               MySQL
 JavaScript           Java                 MongoDB
 React/Vue/Angular    Express/Django/Rails
The frontend is what you see in your browser. It’s built with HTML, CSS, and typically a JavaScript framework like React, Vue, or Angular. These frameworks help developers manage complex user interfaces, handle state (like whether you’re logged in), and make API calls to fetch data.
The backend is where the business logic lives. It processes requests, talks to databases, handles authentication, and returns data. Developers write this in languages like Python, JavaScript (Node.js), or Java, using frameworks like Express, Django, or Rails.
The database stores your data persistently.
This architecture has served us well for decades. But it comes with costs.
The Hidden Cost of Complexity
Every layer in this stack introduces:
- Code to write and maintain: API routes, database queries, authentication logic, error handling
- Security vulnerabilities: Each endpoint is a potential attack vector
- Integration burden: Every external service needs SDK code, API handling, retry logic
- Framework overhead: React alone adds ~40KB to your bundle before you write a single line
And here’s the thing about frameworks: they exist to help humans manage complexity. React’s virtual DOM, state management libraries like Redux, ORMs like Prisma—these are all abstractions that make it easier for developers to reason about complex systems.
But what if the machine handling the complexity doesn’t need those abstractions?
The Security Advantage of Static Sites
Consider a purely static website—just HTML and CSS files. No JavaScript framework, no backend server, no database connection.
The attack surface shrinks dramatically:
| Traditional App | Static Site |
|---|---|
| SQL injection | Not possible (no database) |
| Authentication bypass | Not possible (no auth) |
| API exploitation | Not possible (no APIs) |
| Server misconfiguration | Not possible (no server) |
| Dependency vulnerabilities | Minimal (no runtime deps) |
Static sites can still face issues like DNS hijacking or CDN compromise, but compared to a full-stack application, there’s remarkably little to attack.
I use this approach for my own portfolio tracker on this site. Instead of building a backend that queries stock prices and exposes API endpoints, I have a GitHub Action that runs on a schedule:
GitHub Action (runs every few hours)
↓
Python script fetches stock prices
↓
Updates JSON file with current data
↓
Redeploys static site to GitHub Pages
↓
HTML reads from JSON, displays prices
No backend is exposed. The Python script runs in an isolated GitHub Action, does its job, and disappears. The user only ever sees static HTML and CSS.
For my use case—position trading over months, not day trading—I don’t need real-time prices. Updates every few hours are perfect.
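For the curious, here's a minimal sketch of what such a scheduled fetch script could look like. The API endpoint, ticker list, and output path are hypothetical placeholders, not the actual implementation:

```python
# fetch_prices.py (sketch): fetch quotes, write JSON, let the Action redeploy.
# The API URL, tickers, and output path below are hypothetical placeholders.
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

TICKERS = ["SONY", "AAPL"]  # hypothetical watchlist
API_URL = "https://example.com/quote?symbol={symbol}"  # placeholder endpoint

def fetch_price(symbol: str) -> float:
    """Fetch the latest price for one symbol from the placeholder API."""
    with urllib.request.urlopen(API_URL.format(symbol=symbol), timeout=10) as resp:
        return float(json.load(resp)["price"])

def main() -> None:
    Path("data").mkdir(exist_ok=True)
    snapshot = {
        "updated": datetime.now(timezone.utc).isoformat(),
        "prices": {symbol: fetch_price(symbol) for symbol in TICKERS},
    }
    # The static page reads this file; the GitHub Action commits and redeploys it.
    Path("data/prices.json").write_text(json.dumps(snapshot, indent=2))

if __name__ == "__main__":
    main()
```

Note the shape of the thing: no open port, no long-lived process. The script runs, writes a file, and exits.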
But this got me thinking: what if we could extend this pattern to more dynamic applications?
The Vision: LLM as Your Backend
Today, we use LLMs to write code:
Developer: "Write me code to connect to Postgres and fetch product specs"
↓
LLM: "Here's 50 lines of Python..."
↓
Developer: copies → pastes → tests → debugs → deploys
↓
Code exists in your codebase forever
↓
You maintain it forever
Every API change, security patch, or new requirement means updating that code.
Here’s a different approach:
User clicks "Get product specs"
↓
Request goes to LLM (the backend)
↓
LLM connects to database directly
↓
LLM returns data
↓
Static site displays result
No intermediate code to maintain. The LLM doesn’t write code for you to deploy—it is the backend that executes the logic.
Why write code at all if the LLM can execute the intent directly?
What This Vision Is NOT For
Before diving deeper, let me be clear about what this approach cannot handle today—and may never be the right fit for:
Authentication-Heavy Applications
If your app requires user authentication, you face a fundamental problem: how do you securely pass user identity to the LLM backend?
Sending credentials or session tokens via prompt payloads is problematic:
- Interception risk: Prompts could be logged, cached, or exposed
- No standard auth flow: OAuth, JWT, session cookies—none of these map cleanly to prompt-based communication
- Identity verification: How does the LLM verify the user is who they claim to be?
This remains an unsolved problem in the agentic backend model. Traditional backends handle auth through well-established, battle-tested patterns. We don’t have equivalents yet.
Financial Transactions and High-Stakes Operations
For applications like:
- Payment processing (Stripe, PayPal integrations)
- E-commerce platforms (eBay-like marketplaces)
- Banking and trading systems
- Healthcare record management
…you need absolute determinism and auditability. A payment endpoint must charge exactly $100.00 every time, not “usually $100.00.” More on this in the determinism section below.
Real-Time and High-Throughput Systems
Current LLM inference latency (roughly 500–2,000+ ms) makes this unsuitable for:
- Real-time collaboration (Google Docs-style)
- Gaming backends
- Live trading platforms
- Chat applications requiring instant responses
Large-Scale Platforms
AWS, Shopify, Netflix—these require:
- Millions of requests per second
- Sub-10ms latency
- Complex distributed systems
- Cost efficiency at massive scale
The economics don’t work. At $0.003 per 1K tokens, and assuming roughly 1,000 tokens per request, 10 million daily requests would run on the order of $30,000 per day—versus a few hundred dollars on traditional infrastructure.
When This DOES Make Sense
- Content sites with occasional dynamic features
- Internal tools and dashboards
- Prototypes and MVPs
- Personal projects (like my portfolio tracker)
- Applications where 500ms+ latency is acceptable
- Low-to-medium traffic scenarios
The Determinism Problem
Here’s the elephant in the room: LLMs are non-deterministic by design.
The same prompt can return different results. For business logic, this is potentially catastrophic:
Prompt: "Calculate total for cart: 3 items at $33.33 each"
Response 1: "$99.99"
Response 2: "$100.00"
Response 3: "$99.99"
Traditional code returns the same answer every time. LLMs might not.
It’s Not As Bad As It Sounds
Here’s the key insight: the LLM doesn’t have to do everything.
In a well-designed agentic backend, the LLM handles orchestration and intent parsing—deciding what to do. The actual work is delegated to deterministic components:
Skills pointing to scripts are deterministic:
skills:
  calculate_cart_total:
    type: script
    command: python scripts/cart_calculator.py
The Python script produces the same output every time. The LLM just decides to call it.
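For concreteness, here's a sketch of what that script might contain. The command-line format is an assumption, but it makes the determinism point tangible with the cart example from above:

```python
# scripts/cart_calculator.py (sketch): deterministic cart totals.
# Invocation format is an assumption: python scripts/cart_calculator.py 3 33.33
import sys
from decimal import Decimal, ROUND_HALF_UP

def cart_total(quantity: int, unit_price: Decimal) -> Decimal:
    """Same inputs, same output: 3 items at 33.33 is always exactly 99.99."""
    total = unit_price * quantity
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

if __name__ == "__main__":
    qty, price = int(sys.argv[1]), Decimal(sys.argv[2])
    # The LLM only decides to call this; the math itself never varies.
    print(cart_total(qty, price))
```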
MCP queries are deterministic: When the LLM uses MCP to query a database, the database returns the same data for the same query. The data retrieval is deterministic—only the LLM’s decision to query is variable.
The architecture pattern:
User Intent → LLM (parses intent, chooses action)
↓
Deterministic Execution
• Script via Skill
• Database via MCP
• API via MCP server
↓
Deterministic Output
The non-determinism is confined to the routing layer, not the execution layer.
Additional Mitigations
For the parts where LLM variability matters, these guardrails help:
1. Temperature = 0: Setting temperature to zero reduces (but doesn’t eliminate) variability. Most providers support this.
2. Structured Outputs: Force the LLM to return data in strict schemas (JSON with defined fields). This constrains the output space significantly.
3. Validation Layers: Every LLM response passes through validation before being used (a minimal sketch follows this list):
LLM Response → Schema Validation → Business Rule Check → Use
4. Idempotency Keys: For operations that must not repeat (payments, record creation), traditional idempotency patterns still apply—the LLM doesn’t change this requirement.
5. Deterministic Fallbacks: For critical calculations, don’t use the LLM at all. Use it for orchestration and intent parsing, but delegate math to deterministic code.
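As promised above, here's a minimal sketch of such a validation layer. The required fields and the price ceiling are illustrative assumptions; the point is that nothing the LLM returns reaches the application unchecked:

```python
# validate_response.py (sketch): schema check, then business rules, then use.
# The required fields and the MAX_TOTAL ceiling are illustrative assumptions.
import json

REQUIRED_FIELDS = {"product_id": int, "total": float}
MAX_TOTAL = 10_000.00  # example business rule: reject implausible totals

def validate(raw: str) -> dict:
    """Raise before the LLM's response is ever used if anything looks wrong."""
    data = json.loads(raw)  # raises ValueError if the LLM returned non-JSON
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"Field {field!r} missing or not {expected_type.__name__}")
    if not 0 < data["total"] <= MAX_TOTAL:
        raise ValueError(f"Total {data['total']} violates business rules")
    return data

print(validate('{"product_id": 12345, "total": 99.99}'))  # passes all checks
```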
Remaining Gaps
Even with these mitigations, challenges remain:
- What if the LLM misparses intent and calls the wrong skill?
- How do we ensure consistent routing for edge cases?
- What’s the testing strategy for probabilistic routing?
These are solvable problems, but they require new patterns we’re still developing.
Declarative Agentic Backend
This is the concept I’m calling a Declarative Agentic Backend. Instead of writing imperative code, you declare what you need, and an agentic framework handles execution.
What is MCP?
Before explaining the architecture, a quick note on MCP (Model Context Protocol). Launched by Anthropic in late 2024, MCP is essentially a standard protocol for connecting AI systems to external tools—databases, APIs, file systems, and more.
Think of it as USB for LLMs: a universal way for AI to plug into services without custom integration code for each one.
As of 2025-2026, MCP has gained adoption from OpenAI, Google DeepMind, and others, making it an emerging industry standard rather than a single company’s initiative.
The Three Pillars
1. Prompt-First Communication
All interaction happens through structured prompts. No REST conventions, no SDK calls—prompts are the interface.
## Request
Intent: get_product_specs
Product ID: 12345
## User Context
Session started: 2024-01-15 10:30 UTC
Journey: home → electronics → filters(brand=Sony) → product_detail
Previous queries: [price_comparison, reviews]
## Expected Output
Format: JSON
Include: specifications, availability, related_products
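To make "prompts are the interface" concrete, here's a sketch of how a client might submit that request to the single endpoint. The URL, payload shape, and field names are assumptions; no standard protocol exists for this yet:

```python
# send_intent.py (sketch): POST a prompt payload to the single endpoint.
# The URL and field names are assumptions, not a defined protocol.
import json
import urllib.request

ENDPOINT = "https://example.com/api/intent"  # hypothetical single endpoint

payload = {
    "intent": "get_product_specs",
    "product_id": 12345,
    "context": {
        "journey": ["home", "electronics", "filters(brand=Sony)", "product_detail"],
        "previous_queries": ["price_comparison", "reviews"],
    },
    "expected_output": {
        "format": "JSON",
        "include": ["specifications", "availability", "related_products"],
    },
}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request, timeout=30) as response:
    print(json.load(response))  # the orchestrator's structured reply
```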
2. Backend = Agentic Framework
You don’t code integrations. You compose agents. The framework provides pre-built agents for common tasks:
# backend.agents.yml
database:
  type: postgres-agent
  model: llama-3-8b
  connection: $DATABASE_URL
  capabilities: [read, write]
payments:
  type: stripe-agent
  model: llama-3-8b
  key: $STRIPE_KEY
storage:
  type: s3-agent
  model: llama-3-8b
  bucket: my-app-files
orchestrator:
  model: claude-sonnet
  routes:
    get_product: [database]
    process_payment: [payments, database]
    upload_file: [storage]
Each agent is pre-configured to handle its domain. The Postgres agent already knows SQL, connection pooling, and authentication. You just provide the connection string and intent.
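As a sketch of what the framework might do with that file, here's the routing step in miniature. The dispatch logic is hypothetical; only `yaml.safe_load` (from the PyYAML package) is a real call:

```python
# orchestrate.py (sketch): route an intent to its configured agents.
# The dispatch logic is hypothetical; a real framework would invoke
# pre-built agents (postgres-agent, stripe-agent, ...) at this point.
import yaml  # PyYAML

with open("backend.agents.yml") as f:
    config = yaml.safe_load(f)

def handle(intent: str, payload: dict) -> list[dict]:
    """Look up the intent's route and hand the payload to each agent in order."""
    dispatched = []
    for agent_name in config["orchestrator"]["routes"][intent]:
        agent_cfg = config[agent_name]
        dispatched.append({"agent": agent_name, "type": agent_cfg["type"], "input": payload})
    return dispatched

print(handle("process_payment", {"amount": 100.00}))
# -> the payments agent first, then the database agent, per the routes above
```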
3. Context-Aware by Default
Every request carries the user’s journey context. The backend doesn’t just know what the user wants—it understands why and where from.
## User Session Context
- Started: Product listing page
- Action: Filtered by "electronics"
- Action: Clicked "Widget Pro X"
- Current: Product detail page
- Request: Get specifications
This context enables the backend to provide intelligent responses, recommendations, and assistance that traditional backends simply can’t match.
The Architecture
Here’s how everything connects:
┌────────────────────────────────────────────────────────────────────┐
│ USER'S BROWSER │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Static HTML/CSS + Modern Browser Features │ │
│ │ Reads from: /data/session-{id}.json │ │
│ │ Minimal JS: triggers endpoint, swaps content │ │
│ └──────────────────────────┬───────────────────────────────────┘ │
└─────────────────────────────┼──────────────────────────────────────┘
│
┌──────────▼──────────┐
│ Single Endpoint │
│ (Prompt Payload) │
└──────────┬──────────┘
│
┌─────────────────────────────▼──────────────────────────────────────┐
│ DECLARATIVE AGENTIC BACKEND (Conduit) │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ ORCHESTRATOR (Claude) │ │
│ │ Reads prompt → Plans → Delegates │ │
│ └──────────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌──────────┬──────────┼──────────┬──────────┐ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌───────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │Postgres│ │ MySQL │ │ Stripe │ │ S3 │ │ MCP │ │
│ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ Bridge │ │
│ │(Llama)│ │(Llama) │ │(Llama) │ │(Llama) │ │ │ │
│ └───────┘ └────────┘ └────────┘ └────────┘ └────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ PRE-BUILT AGENT LIBRARY │ │
│ │ • db/postgres • db/mysql • db/mongo │ │
│ │ • cloud/aws-s3 • cloud/gcp • cloud/azure │ │
│ │ • payments/stripe • api/rest • api/graphql │ │
│ │ • util/transform • util/validate • util/format │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ UPDATE SESSION JSON → PARTIAL DEPLOY │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
Important clarification: This isn’t “no backend”—it’s no custom backend code. There’s still a server receiving requests; it’s just powered by LLM inference rather than hand-written business logic.
Model Tiering
Not every task needs a powerful model. The framework uses the right model for each job:
| Task | Model | Why |
|---|---|---|
| Orchestration, complex reasoning | Claude | Needs full context understanding |
| Database queries | Llama 3 (specialized) | Structured, predictable task |
| API calls | Small model or rule-based | Pattern matching |
| Response formatting | Small model | Template-driven |
You don’t need Claude Sonnet to query a database. A fine-tuned Llama 3 that speaks SQL fluently costs a fraction and executes faster.
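A sketch of how that tiering might look in code; the model names and task labels are illustrative, not real identifiers:

```python
# model_tiering.py (sketch): pick a model per task tier.
# Model names and task labels are illustrative assumptions.
TIERS = {
    "orchestration": "claude-sonnet",    # full-context reasoning
    "database_query": "llama-3-8b-sql",  # hypothetical SQL fine-tune
    "api_call": "small-model",           # pattern matching
    "formatting": "small-model",         # template-driven
}

def model_for(task: str) -> str:
    """Fall back to the strongest model only when the task is unclassified."""
    return TIERS.get(task, "claude-sonnet")

assert model_for("database_query") == "llama-3-8b-sql"
assert model_for("unknown_task") == "claude-sonnet"
```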
The Logging and Debugging Challenge
Traditional backends have mature logging and debugging:
- Stack traces
- Request/response logs
- Database query logs
- APM tools (DataDog, New Relic)
With an agentic backend, we have multiple moving parts that all need visibility:
What Needs Logging
1. Orchestrator Decisions
- What intent did it parse?
- Which agents did it choose to invoke?
- What was its reasoning?
2. MCP Interactions
- Which MCP servers were called?
- What tools were used?
- What data was passed and returned?
3. Skills and Scripts: Sometimes an MCP server doesn’t exist for your use case. You might have a skill that invokes a Python script, shell command, or custom API call:
skills:
  fetch_legacy_data:
    type: script
    command: python scripts/legacy_api.py
    args: [--format, json]
Where do these logs go? How do you correlate them with the orchestrator’s session? (One plausible direction is sketched after this list.)
4. On-the-Fly Code Execution: If the LLM generates and executes code at runtime, you need:
- The generated code itself (for audit)
- Execution output
- Error traces if it fails
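One plausible direction, sketched below: wrap every skill invocation so its output is captured and emitted as a structured log line tagged with the orchestrator's session ID. The log schema and the session-ID propagation are assumptions, not an existing standard:

```python
# run_skill.py (sketch): run a skill's script, emit correlated JSON logs.
# The log schema and session_id propagation are assumptions.
import json
import subprocess
import sys
from datetime import datetime, timezone

def run_skill(session_id: str, command: list[str]) -> str:
    """Execute a skill and log the attempt under the orchestrator's session ID."""
    result = subprocess.run(command, capture_output=True, text=True)
    log_line = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,         # correlates with the orchestrator's session
        "skill_command": command,
        "exit_code": result.returncode,
        "stderr": result.stderr[-2000:],  # keep log entries bounded
    }
    print(json.dumps(log_line), file=sys.stderr)  # shipped to a central aggregator
    result.check_returncode()  # surface failures to the orchestrator
    return result.stdout

# Usage, matching the skill definition above:
# run_skill("sess-42", ["python", "scripts/legacy_api.py", "--format", "json"])
```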
The Unsolved Problem
We need a centralized logging and debugging system that:
- Aggregates logs from all sources (orchestrator, MCP, skills, generated code)
- Correlates them by session/request ID
- Provides a timeline view of what happened
- Enables replay for debugging failed requests
This doesn’t exist as a turnkey solution today. Building observability into agentic systems is an open challenge.
Complexity: Shifted, Not Eliminated
Let me be honest: this architecture doesn’t eliminate complexity—it shifts it.
What you no longer maintain:
- API route handlers
- Database query code
- SDK integration code
- Most error handling boilerplate
What you now maintain:
- Agent configurations (YAML)
- Prompt templates and intents
- Schema definitions for structured outputs
- Validation logic for LLM responses
- Logging and monitoring infrastructure
- Skills/scripts for cases without MCP support
Is this less work? I believe so—if properly configured. The shift is from writing repetitive boilerplate to declaring intent and configuring agents. The former scales linearly with features; the latter can scale more efficiently once the foundation is solid.
But it’s not magic. There’s still work, just different work.
Leveraging the Modern Browser
One reason this architecture becomes more viable: browsers have evolved significantly. Many things that once required JavaScript frameworks are now native:
| Browser Feature | What It Replaces |
|---|---|
| CSS Container Queries | JavaScript responsive logic |
| CSS :has() selector | JavaScript parent selection |
| View Transitions API | React Router transitions |
| Service Workers | Framework caching strategies |
| IndexedDB | Redux persistence |
This matters because the less JavaScript your frontend needs, the more “static” it can be. A static frontend pairs naturally with an agentic backend—you’re not trying to coordinate two complex systems.
┌─────────────────────────────────────────────────────────────┐
│ BROWSER AS THE RUNTIME │
│ │
│ Service Worker │
│ → Caches static assets │
│ → Intercepts fetch to /data/*.json │
│ → Polls for updates │
│ │
│ IndexedDB │
│ → Stores user session locally │
│ → Syncs with server-side session JSON │
│ │
│ CSS (2024+) │
│ → Handles most "dynamic" UI without JS │
│ → :has(), container queries, nesting, layers │
│ │
│ Minimal Vanilla JS │
│ → Form validation, micro-interactions only │
│ → Triggers prompt endpoint when needed │
│ → Reads updated JSON, swaps HTML fragments │
│ │
└─────────────────────────────────────────────────────────────┘
The System Gets Smarter
Here’s something traditional backends can’t do: learn from context.
Every interaction adds to the conversation. The agentic backend accumulates understanding:
- First visit: Generic responses, standard recommendations
- After browsing: Knows your interests, can suggest relevant products
- After purchase history: Understands your preferences, budget range
- After support interactions: Knows your technical level, past issues
If you’re reading documentation and going back and forth between pages, the system notices. It can proactively offer help: “I see you’re looking at the authentication section repeatedly. Would you like me to explain OAuth flow step by step?”
This isn’t a feature you code. It emerges from the architecture. When your backend is an LLM with full context, intelligence comes free.
What Needs to Happen
This vision isn’t fully achievable today. Several pieces need to mature:
| Component | Current State | What’s Needed |
|---|---|---|
| MCP ecosystem | Growing rapidly (OpenAI, Google adopted) | More pre-built servers for common services |
| Authentication | No standard pattern | Secure auth flow for prompt-based systems |
| Logging/debugging | Fragmented | Centralized observability for agentic systems |
| Specialized agents | Possible with fine-tuning | Framework with pre-configured agents |
| Determinism | Mitigations exist | Better guarantees for critical operations |
| Trigger mechanism | Workarounds exist | Native GitHub Action ↔ static site bridge |
| Model latency | Hundreds of milliseconds | Faster inference for real-time interactions |
| Cost | Improving rapidly | Continued reduction for viability at scale |
But the trajectory is clear. Models are getting faster and cheaper. MCP is gaining adoption. The pieces are falling into place.
The Developer Experience Transformation
Building an app today:
- Choose frontend framework (React? Vue? Svelte?)
- Choose backend framework (Express? Django? Rails?)
- Choose ORM (Prisma? Sequelize?)
- Write authentication logic
- Write API routes
- Write database queries
- Write integration code for each service
- Handle errors everywhere
- Write tests
- Deploy frontend and backend separately
- Manage infrastructure
- Maintain everything forever
Building an app with a Declarative Agentic Backend:
- Write HTML/CSS (static)
- Configure agents (declarative YAML)
- Define intents and prompts
- Set up logging and validation
- Deploy static site
- Done
The custom code you would have written? The LLM executes equivalent logic at runtime. The integrations you would have maintained? Pre-configured agents handle them. The complexity you would have managed? The orchestrator deals with it.
It’s not zero work—but it’s different work that may scale better.
The Attack Surface Comparison
Traditional applications expose everything:
TRADITIONAL APP ATTACK SURFACE:
┌────────────────────────────────────────┐
│ Frontend (XSS vulnerabilities) │
│ API endpoints (injection, auth bugs) │
│ Backend code (logic flaws) │
│ Database (SQL injection) │
│ Dependencies (supply chain attacks) │
│ Server (misconfiguration, unpatched) │
│ Secrets (exposed environment vars) │
└────────────────────────────────────────┘
The Declarative Agentic Backend exposes less:
CONDUIT ATTACK SURFACE:
┌────────────────────────────────────────┐
│ Static files (CDN, nearly immutable) │
│ Single trigger endpoint (minimal) │
│ Prompt injection (new attack vector) │
│ Agent misconfiguration │
└────────────────────────────────────────┘
Each agent runs in an isolated environment:
→ Spins up, executes, terminates
→ No persistent attack surface
→ Credentials never exposed to frontend
Note: This introduces new attack vectors like prompt injection. The attack surface is different, not necessarily smaller in all dimensions.
A New Development Era
Looking at the evolution of web development:
| Era | Paradigm | Complexity Handler |
|---|---|---|
| 1990s | CGI/PHP | Humans write everything |
| 2000s | Frameworks (Rails, Django) | Convention over configuration |
| 2010s | SPAs + APIs (React + Node) | Frontend/backend separation |
| 2020s | Serverless + JAMstack | Functions + static |
| Future | Static + Agentic Backend | LLM handles complexity at runtime |
Frameworks like React solved real problems—but they solved them for human developers who needed help managing complexity. When the complexity handler is an LLM that can reason, plan, and execute, the abstraction layer becomes overhead.
This doesn’t mean frameworks disappear overnight. There will always be cases where you need 60fps interactions, real-time collaboration, or game-like responsiveness. But for the vast majority of applications—CRUD apps, dashboards, content sites, e-commerce—the traditional stack provides more machinery than the job requires.
Where This Is Heading
The pieces exist today, scattered:
- LLMs can reason about code and execute logic
- MCP provides a standard protocol for tool access
- GitHub Actions can run agentic workflows
- Static site generators deploy in seconds
- Browsers handle more without JavaScript every year
What’s missing is the unified framework that brings it all together. Something that lets you:
- Write static HTML/CSS
- Declare your agents in YAML
- Define intents as prompts
- Configure logging and observability
- Deploy and iterate
Call it Conduit, call it something else—the name matters less than the paradigm shift.
We’re moving toward a world where the backend isn’t code you maintain. It’s intelligence you configure.
Open Questions
This post raises more questions than it answers. Some I’m actively thinking about:
- How do we handle authentication securely in a prompt-based architecture?
- What does a production-grade logging system for agentic backends look like?
- How do we achieve sufficient determinism for financial operations?
- What’s the right boundary between “let the LLM handle it” and “use deterministic code”?
- How do we debug a system where the “code” is generated at runtime?
If you have thoughts on any of these, I’d love to hear them. This is a vision being refined, and the best ideas come from conversation.
This is still early. The gaps are real. But I believe the direction is clear—and worth exploring.