A Note on This Post: What follows is a vision—not a fully fleshed-out solution. Many pieces remain unsolved: authentication, logging, debugging, determinism, and more. I’m sharing this as a direction I believe we’re heading, not a blueprint ready for production. Think of it as an invitation to explore and refine these ideas together.
How Applications Work Today
If you’ve ever wondered what happens behind the scenes when you use a web application, here’s the typical architecture:
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   Browser   │ ───▶ │   Backend   │ ───▶ │  Database   │
│ (Frontend)  │ ◀─── │   Server    │ ◀─── │             │
└─────────────┘      └─────────────┘      └─────────────┘
 HTML                 Node.js              PostgreSQL
 CSS                  Python               MySQL
 JavaScript           Java                 MongoDB
 React/Vue/Angular    Express/Django/Rails
The frontend is what you see in your browser. It’s built with HTML, CSS, and typically a JavaScript framework like React, Vue, or Angular. These frameworks help developers manage complex user interfaces, handle state (like whether you’re logged in), and make API calls to fetch data.
The backend is where the business logic lives. It processes requests, talks to databases, handles authentication, and returns data. Developers write this in languages like Python, JavaScript (Node.js), or Java, using frameworks like Express, Django, or Rails.
The database stores your data persistently.
This architecture has served us well for decades. But it comes with costs.
The Hidden Cost of Complexity
Every layer in this stack introduces:
- Code to write and maintain: API routes, database queries, authentication logic, error handling
- Security vulnerabilities: Each endpoint is a potential attack vector
- Integration burden: Every external service needs SDK code, API handling, retry logic
- Framework overhead: React alone adds ~40KB to your bundle before you write a single line
And here’s the thing about frameworks: they exist to help humans manage complexity. React’s virtual DOM, state management libraries like Redux, ORMs like Prisma—these are all abstractions that make it easier for developers to reason about complex systems.
But what if the machine handling the complexity doesn’t need those abstractions?
The Security Advantage of Static Sites
Consider a purely static website—just HTML and CSS files. No JavaScript framework, no backend server, no database connection.
The attack surface shrinks dramatically:
| Traditional App | Static Site |
|---|---|
| SQL injection | Not possible (no database) |
| Authentication bypass | Not possible (no auth) |
| API exploitation | Not possible (no APIs) |
| Server misconfiguration | Not possible (no server) |
| Dependency vulnerabilities | Minimal (no runtime deps) |
Static sites can still face issues like DNS hijacking or CDN compromise, but compared to a full-stack application, there’s remarkably little to attack.
I use this approach for my own portfolio tracker on this site. Instead of building a backend that queries stock prices and exposes API endpoints, I have a GitHub Action that runs on a schedule:
GitHub Action (runs every few hours)
↓
Python script fetches stock prices
↓
Updates JSON file with current data
↓
Redeploys static site to GitHub Pages
↓
HTML reads from JSON, displays prices
No backend is exposed. The Python script runs in an isolated GitHub Action, does its job, and disappears. The user only ever sees static HTML and CSS.
For my use case—position trading over months, not day trading—I don’t need real-time prices. Updates every few hours are perfect.
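For the curious, here's a minimal sketch of what such a scheduled fetch script could look like. The API endpoint, ticker list, and output path are hypothetical placeholders, not the actual implementation:

```python
# fetch_prices.py (sketch): fetch quotes, write JSON, let the Action redeploy.
# The API URL, tickers, and output path below are hypothetical placeholders.
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

TICKERS = ["SONY", "AAPL"]  # hypothetical watchlist
API_URL = "https://example.com/quote?symbol={symbol}"  # placeholder endpoint

def fetch_price(symbol: str) -> float:
    """Fetch the latest price for one symbol from the placeholder API."""
    with urllib.request.urlopen(API_URL.format(symbol=symbol), timeout=10) as resp:
        return float(json.load(resp)["price"])

def main() -> None:
    Path("data").mkdir(exist_ok=True)
    snapshot = {
        "updated": datetime.now(timezone.utc).isoformat(),
        "prices": {symbol: fetch_price(symbol) for symbol in TICKERS},
    }
    # The static page reads this file; the GitHub Action commits and redeploys it.
    Path("data/prices.json").write_text(json.dumps(snapshot, indent=2))

if __name__ == "__main__":
    main()
```

Note the shape of the thing: no open port, no long-lived process. The script runs, writes a file, and exits.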
But this got me thinking: what if we could extend this pattern to more dynamic applications?
The Vision: LLM as Your Backend
Today, we use LLMs to write code:
Developer: "Write me code to connect to Postgres and fetch product specs"
↓
LLM: "Here's 50 lines of Python..."
↓
Developer: copies → pastes → tests → debugs → deploys
↓
Code exists in your codebase forever
↓
You maintain it forever
Every API change, security patch, or new requirement means updating that code.
Here’s a different approach:
User clicks "Get product specs"
↓
Request goes to LLM (the backend)
↓
LLM connects to database directly
↓
LLM returns data
↓
Static site displays result
No intermediate code to maintain. The LLM doesn’t write code for you to deploy—it is the backend that executes the logic.
Why write code at all if the LLM can execute the intent directly?
What This Vision Is NOT For
Before diving deeper, let me be clear about what this approach cannot handle today—and may never be the right fit for:
Authentication-Heavy Applications
If your app requires user authentication, you face a fundamental problem: how do you securely pass user identity to the LLM backend?
Sending credentials or session tokens via prompt payloads is problematic:
- Interception risk: Prompts could be logged, cached, or exposed
- No standard auth flow: OAuth, JWT, session cookies—none of these map cleanly to prompt-based communication
- Identity verification: How does the LLM verify the user is who they claim to be?
This remains an unsolved problem in the agentic backend model. Traditional backends handle auth through well-established, battle-tested patterns. We don’t have equivalents yet.
Financial Transactions and High-Stakes Operations
For applications like:
- Payment processing (Stripe, PayPal integrations)
- E-commerce platforms (eBay-like marketplaces)
- Banking and trading systems
- Healthcare record management
…you need absolute determinism and auditability. A payment endpoint must charge exactly $100.00 every time, not “usually $100.00.” More on this in the determinism section below.
Real-Time and High-Throughput Systems
Current LLM inference latency (roughly 500–2,000+ ms) makes this unsuitable for:
- Real-time collaboration (Google Docs-style)
- Gaming backends
- Live trading platforms
- Chat applications requiring instant responses
Large-Scale Platforms
AWS, Shopify, Netflix—these require:
- Millions of requests per second
- Sub-10ms latency
- Complex distributed systems
- Cost efficiency at massive scale
The economics don’t work. At $0.003 per 1K tokens, and assuming roughly 1,000 tokens per request, 10 million daily requests would run on the order of $30,000 per day—versus a few hundred dollars on traditional infrastructure.
When This DOES Make Sense
- Content sites with occasional dynamic features
- Internal tools and dashboards
- Prototypes and MVPs
- Personal projects (like my portfolio tracker)
- Applications where 500ms+ latency is acceptable
- Low-to-medium traffic scenarios
The Determinism Problem
Here’s the elephant in the room: LLMs are non-deterministic by design.
The same prompt can return different results. For business logic, this is potentially catastrophic:
Prompt: "Calculate total for cart: 3 items at $33.33 each"
Response 1: "$99.99"
Response 2: "$100.00"
Response 3: "$99.99"
Traditional code returns the same answer every time. LLMs might not.
It’s Not As Bad As It Sounds
Here’s the key insight: the LLM doesn’t have to do everything.
In a well-designed agentic backend, the LLM handles orchestration and intent parsing—deciding what to do. The actual work is delegated to deterministic components:
Skills pointing to scripts are deterministic:
skills:
  calculate_cart_total:
    type: script
    command: python scripts/cart_calculator.py
The Python script produces the same output every time. The LLM just decides to call it.
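For concreteness, here's a sketch of what that script might contain. The command-line format is an assumption, but it makes the determinism point tangible with the cart example from above:

```python
# scripts/cart_calculator.py (sketch): deterministic cart totals.
# Invocation format is an assumption: python scripts/cart_calculator.py 3 33.33
import sys
from decimal import Decimal, ROUND_HALF_UP

def cart_total(quantity: int, unit_price: Decimal) -> Decimal:
    """Same inputs, same output: 3 items at 33.33 is always exactly 99.99."""
    total = unit_price * quantity
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

if __name__ == "__main__":
    qty, price = int(sys.argv[1]), Decimal(sys.argv[2])
    # The LLM only decides to call this; the math itself never varies.
    print(cart_total(qty, price))
```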
MCP queries are deterministic: When the LLM uses MCP to query a database, the database returns the same data for the same query. The data retrieval is deterministic—only the LLM’s decision to query is variable.
The architecture pattern:
User Intent → LLM (parses intent, chooses action)
↓
Deterministic Execution
• Script via Skill
• Database via MCP
• API via MCP server
↓
Deterministic Output
The non-determinism is confined to the routing layer, not the execution layer.
Additional Mitigations
For the parts where LLM variability matters, these guardrails help:
1. Temperature = 0: Setting temperature to zero reduces (but doesn’t eliminate) variability. Most providers support this.
2. Structured Outputs: Force the LLM to return data in strict schemas (JSON with defined fields). This constrains the output space significantly.
3. Validation Layers: Every LLM response passes through validation before being used (a minimal sketch follows this list):
LLM Response → Schema Validation → Business Rule Check → Use
4. Idempotency Keys: For operations that must not repeat (payments, record creation), traditional idempotency patterns still apply—the LLM doesn’t change this requirement.
5. Deterministic Fallbacks: For critical calculations, don’t use the LLM at all. Use it for orchestration and intent parsing, but delegate math to deterministic code.
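As promised above, here's a minimal sketch of such a validation layer. The required fields and the price ceiling are illustrative assumptions; the point is that nothing the LLM returns reaches the application unchecked:

```python
# validate_response.py (sketch): schema check, then business rules, then use.
# The required fields and the MAX_TOTAL ceiling are illustrative assumptions.
import json

REQUIRED_FIELDS = {"product_id": int, "total": float}
MAX_TOTAL = 10_000.00  # example business rule: reject implausible totals

def validate(raw: str) -> dict:
    """Raise before the LLM's response is ever used if anything looks wrong."""
    data = json.loads(raw)  # raises ValueError if the LLM returned non-JSON
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"Field {field!r} missing or not {expected_type.__name__}")
    if not 0 < data["total"] <= MAX_TOTAL:
        raise ValueError(f"Total {data['total']} violates business rules")
    return data

print(validate('{"product_id": 12345, "total": 99.99}'))  # passes all checks
```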
Remaining Gaps
Even with these mitigations, challenges remain:
- What if the LLM misparses intent and calls the wrong skill?
- How do we ensure consistent routing for edge cases?
- What’s the testing strategy for probabilistic routing?
These are solvable problems, but they require new patterns we’re still developing.
Declarative Agentic Backend
This is the concept I’m calling a Declarative Agentic Backend. Instead of writing imperative code, you declare what you need, and an agentic framework handles execution.
What is MCP?
Before explaining the architecture, a quick note on MCP (Model Context Protocol). Launched by Anthropic in late 2024, MCP is essentially a standard protocol for connecting AI systems to external tools—databases, APIs, file systems, and more.
Think of it as USB for LLMs: a universal way for AI to plug into services without custom integration code for each one.
As of 2025-2026, MCP has gained adoption from OpenAI, Google DeepMind, and others, making it an emerging industry standard rather than a single company’s initiative.
The Three Pillars
1. Prompt-First Communication
All interaction happens through structured prompts. No REST conventions, no SDK calls—prompts are the interface.
## Request
Intent: get_product_specs
Product ID: 12345
## User Context
Session started: 2024-01-15 10:30 UTC
Journey: home → electronics → filters(brand=Sony) → product_detail
Previous queries: [price_comparison, reviews]
## Expected Output
Format: JSON
Include: specifications, availability, related_products
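To make "prompts are the interface" concrete, here's a sketch of how a client might submit that request to the single endpoint. The URL, payload shape, and field names are assumptions; no standard protocol exists for this yet:

```python
# send_intent.py (sketch): POST a prompt payload to the single endpoint.
# The URL and field names are assumptions, not a defined protocol.
import json
import urllib.request

ENDPOINT = "https://example.com/api/intent"  # hypothetical single endpoint

payload = {
    "intent": "get_product_specs",
    "product_id": 12345,
    "context": {
        "journey": ["home", "electronics", "filters(brand=Sony)", "product_detail"],
        "previous_queries": ["price_comparison", "reviews"],
    },
    "expected_output": {
        "format": "JSON",
        "include": ["specifications", "availability", "related_products"],
    },
}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request, timeout=30) as response:
    print(json.load(response))  # the orchestrator's structured reply
```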
2. Backend = Agentic Framework
You don’t code integrations. You compose agents. The framework provides pre-built agents for common tasks:
# backend.agents.yml
database:
  type: postgres-agent
  model: llama-3-8b
  connection: $DATABASE_URL
  capabilities: [read, write]
payments:
  type: stripe-agent
  model: llama-3-8b
  key: $STRIPE_KEY
storage:
  type: s3-agent
  model: llama-3-8b
  bucket: my-app-files
orchestrator:
  model: claude-sonnet
  routes:
    get_product: [database]
    process_payment: [payments, database]
    upload_file: [storage]
Each agent is pre-configured to handle its domain. The Postgres agent already knows SQL, connection pooling, and authentication. You just provide the connection string and intent.
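As a sketch of what the framework might do with that file, here's the routing step in miniature. The dispatch logic is hypothetical; only `yaml.safe_load` (from the PyYAML package) is a real call:

```python
# orchestrate.py (sketch): route an intent to its configured agents.
# The dispatch logic is hypothetical; a real framework would invoke
# pre-built agents (postgres-agent, stripe-agent, ...) at this point.
import yaml  # PyYAML

with open("backend.agents.yml") as f:
    config = yaml.safe_load(f)

def handle(intent: str, payload: dict) -> list[dict]:
    """Look up the intent's route and hand the payload to each agent in order."""
    dispatched = []
    for agent_name in config["orchestrator"]["routes"][intent]:
        agent_cfg = config[agent_name]
        dispatched.append({"agent": agent_name, "type": agent_cfg["type"], "input": payload})
    return dispatched

print(handle("process_payment", {"amount": 100.00}))
# -> the payments agent first, then the database agent, per the routes above
```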
3. Context-Aware by Default
Every request carries the user’s journey context. The backend doesn’t just know what the user wants—it understands why and where from.
## User Session Context
- Started: Product listing page
- Action: Filtered by "electronics"
- Action: Clicked "Widget Pro X"
- Current: Product detail page
- Request: Get specifications
This context enables the backend to provide intelligent responses, recommendations, and assistance that traditional backends simply can’t match.
The Architecture
Here’s how everything connects:
┌────────────────────────────────────────────────────────────────────┐
│ USER'S BROWSER │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Static HTML/CSS + Modern Browser Features │ │
│ │ Reads from: /data/session-{id}.json │ │
│ │ Minimal JS: triggers endpoint, swaps content │ │
│ └──────────────────────────┬───────────────────────────────────┘ │
└─────────────────────────────┼──────────────────────────────────────┘
│
┌──────────▼──────────┐
│ Single Endpoint │
│ (Prompt Payload) │
└──────────┬──────────┘
│
┌─────────────────────────────▼──────────────────────────────────────┐
│ DECLARATIVE AGENTIC BACKEND (Conduit) │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ ORCHESTRATOR (Claude) │ │
│ │ Reads prompt → Plans → Delegates │ │
│ └──────────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌──────────┬──────────┼──────────┬──────────┐ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌───────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │Postgres│ │ MySQL │ │ Stripe │ │ S3 │ │ MCP │ │
│ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ Bridge │ │
│ │(Llama)│ │(Llama) │ │(Llama) │ │(Llama) │ │ │ │
│ └───────┘ └────────┘ └────────┘ └────────┘ └────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ PRE-BUILT AGENT LIBRARY │ │
│ │ • db/postgres • db/mysql • db/mongo │ │
│ │ • cloud/aws-s3 • cloud/gcp • cloud/azure │ │
│ │ • payments/stripe • api/rest • api/graphql │ │
│ │ • util/transform • util/validate • util/format │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ UPDATE SESSION JSON → PARTIAL DEPLOY │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
Important clarification: This isn’t “no backend”—it’s no custom backend code. There’s still a server receiving requests; it’s just powered by LLM inference rather than hand-written business logic.
Model Tiering
Not every task needs a powerful model. The framework uses the right model for each job:
| Task | Model | Why |
|---|---|---|
| Orchestration, complex reasoning | Claude | Needs full context understanding |
| Database queries | Llama 3 (specialized) | Structured, predictable task |
| API calls | Small model or rule-based | Pattern matching |
| Response formatting | Small model | Template-driven |
You don’t need Claude Sonnet to query a database. A fine-tuned Llama 3 that speaks SQL fluently costs a fraction and executes faster.
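A sketch of how that tiering might look in code; the model names and task labels are illustrative, not real identifiers:

```python
# model_tiering.py (sketch): pick a model per task tier.
# Model names and task labels are illustrative assumptions.
TIERS = {
    "orchestration": "claude-sonnet",    # full-context reasoning
    "database_query": "llama-3-8b-sql",  # hypothetical SQL fine-tune
    "api_call": "small-model",           # pattern matching
    "formatting": "small-model",         # template-driven
}

def model_for(task: str) -> str:
    """Fall back to the strongest model only when the task is unclassified."""
    return TIERS.get(task, "claude-sonnet")

assert model_for("database_query") == "llama-3-8b-sql"
assert model_for("unknown_task") == "claude-sonnet"
```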
The Logging and Debugging Challenge
Traditional backends have mature logging and debugging:
- Stack traces
- Request/response logs
- Database query logs
- APM tools (DataDog, New Relic)
With an agentic backend, we have multiple moving parts that all need visibility:
What Needs Logging
1. Orchestrator Decisions
- What intent did it parse?
- Which agents did it choose to invoke?
- What was its reasoning?
2. MCP Interactions
- Which MCP servers were called?
- What tools were used?
- What data was passed and returned?
3. Skills and Scripts: Sometimes an MCP server doesn’t exist for your use case. You might have a skill that invokes a Python script, shell command, or custom API call:
skills:
  fetch_legacy_data:
    type: script
    command: python scripts/legacy_api.py
    args: [--format, json]
Where do these logs go? How do you correlate them with the orchestrator’s session? (One plausible direction is sketched after this list.)
4. On-the-Fly Code Execution: If the LLM generates and executes code at runtime, you need:
- The generated code itself (for audit)
- Execution output
- Error traces if it fails
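One plausible direction, sketched below: wrap every skill invocation so its output is captured and emitted as a structured log line tagged with the orchestrator's session ID. The log schema and the session-ID propagation are assumptions, not an existing standard:

```python
# run_skill.py (sketch): run a skill's script, emit correlated JSON logs.
# The log schema and session_id propagation are assumptions.
import json
import subprocess
import sys
from datetime import datetime, timezone

def run_skill(session_id: str, command: list[str]) -> str:
    """Execute a skill and log the attempt under the orchestrator's session ID."""
    result = subprocess.run(command, capture_output=True, text=True)
    log_line = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,         # correlates with the orchestrator's session
        "skill_command": command,
        "exit_code": result.returncode,
        "stderr": result.stderr[-2000:],  # keep log entries bounded
    }
    print(json.dumps(log_line), file=sys.stderr)  # shipped to a central aggregator
    result.check_returncode()  # surface failures to the orchestrator
    return result.stdout

# Usage, matching the skill definition above:
# run_skill("sess-42", ["python", "scripts/legacy_api.py", "--format", "json"])
```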
The Unsolved Problem
We need a centralized logging and debugging system that:
- Aggregates logs from all sources (orchestrator, MCP, skills, generated code)
- Correlates them by session/request ID
- Provides a timeline view of what happened
- Enables replay for debugging failed requests
This doesn’t exist as a turnkey solution today. Building observability into agentic systems is an open challenge.
Complexity: Shifted, Not Eliminated
Let me be honest: this architecture doesn’t eliminate complexity—it shifts it.
What you no longer maintain:
- API route handlers
- Database query code
- SDK integration code
- Most error handling boilerplate
What you now maintain:
- Agent configurations (YAML)
- Prompt templates and intents
- Schema definitions for structured outputs
- Validation logic for LLM responses
- Logging and monitoring infrastructure
- Skills/scripts for cases without MCP support
Is this less work? I believe so—if properly configured. The shift is from writing repetitive boilerplate to declaring intent and configuring agents. The former scales linearly with features; the latter can scale more efficiently once the foundation is solid.
But it’s not magic. There’s still work, just different work.
Leveraging the Modern Browser
One reason this architecture becomes more viable: browsers have evolved significantly. Many things that once required JavaScript frameworks are now native:
| Browser Feature | What It Replaces |
|---|---|
| CSS Container Queries | JavaScript responsive logic |
| CSS :has() selector | JavaScript parent selection |
| View Transitions API | React Router transitions |
| Service Workers | Framework caching strategies |
| IndexedDB | Redux persistence |
This matters because the less JavaScript your frontend needs, the more “static” it can be. A static frontend pairs naturally with an agentic backend—you’re not trying to coordinate two complex systems.
┌─────────────────────────────────────────────────────────────┐
│ BROWSER AS THE RUNTIME │
│ │
│ Service Worker │
│ → Caches static assets │
│ → Intercepts fetch to /data/*.json │
│ → Polls for updates │
│ │
│ IndexedDB │
│ → Stores user session locally │
│ → Syncs with server-side session JSON │
│ │
│ CSS (2024+) │
│ → Handles most "dynamic" UI without JS │
│ → :has(), container queries, nesting, layers │
│ │
│ Minimal Vanilla JS │
│ → Form validation, micro-interactions only │
│ → Triggers prompt endpoint when needed │
│ → Reads updated JSON, swaps HTML fragments │
│ │
└─────────────────────────────────────────────────────────────┘
The System Gets Smarter
Here’s something traditional backends can’t do: learn from context.
Every interaction adds to the conversation. The agentic backend accumulates understanding:
- First visit: Generic responses, standard recommendations
- After browsing: Knows your interests, can suggest relevant products
- After purchase history: Understands your preferences, budget range
- After support interactions: Knows your technical level, past issues
If you’re reading documentation and going back and forth between pages, the system notices. It can proactively offer help: “I see you’re looking at the authentication section repeatedly. Would you like me to explain OAuth flow step by step?”
This isn’t a feature you code. It emerges from the architecture. When your backend is an LLM with full context, intelligence comes free.
What Needs to Happen
This vision isn’t fully achievable today. Several pieces need to mature:
| Component | Current State | What’s Needed |
|---|---|---|
| MCP ecosystem | Growing rapidly (OpenAI, Google adopted) | More pre-built servers for common services |
| Authentication | No standard pattern | Secure auth flow for prompt-based systems |
| Logging/debugging | Fragmented | Centralized observability for agentic systems |
| Specialized agents | Possible with fine-tuning | Framework with pre-configured agents |
| Determinism | Mitigations exist | Better guarantees for critical operations |
| Trigger mechanism | Workarounds exist | Native GitHub Action ↔ static site bridge |
| Model latency | Hundreds of milliseconds | Faster inference for real-time interactions |
| Cost | Improving rapidly | Continued reduction for viability at scale |
But the trajectory is clear. Models are getting faster and cheaper. MCP is gaining adoption. The pieces are falling into place.
The Developer Experience Transformation
Building an app today:
- Choose frontend framework (React? Vue? Svelte?)
- Choose backend framework (Express? Django? Rails?)
- Choose ORM (Prisma? Sequelize?)
- Write authentication logic
- Write API routes
- Write database queries
- Write integration code for each service
- Handle errors everywhere
- Write tests
- Deploy frontend and backend separately
- Manage infrastructure
- Maintain everything forever
Building an app with a Declarative Agentic Backend:
- Write HTML/CSS (static)
- Configure agents (declarative YAML)
- Define intents and prompts
- Set up logging and validation
- Deploy static site
- Done
The custom code you would have written? The LLM executes equivalent logic at runtime. The integrations you would have maintained? Pre-configured agents handle them. The complexity you would have managed? The orchestrator deals with it.
It’s not zero work—but it’s different work that may scale better.
The Attack Surface Comparison
Traditional applications expose everything:
TRADITIONAL APP ATTACK SURFACE:
┌────────────────────────────────────────┐
│ Frontend (XSS vulnerabilities) │
│ API endpoints (injection, auth bugs) │
│ Backend code (logic flaws) │
│ Database (SQL injection) │
│ Dependencies (supply chain attacks) │
│ Server (misconfiguration, unpatched) │
│ Secrets (exposed environment vars) │
└────────────────────────────────────────┘
The Declarative Agentic Backend exposes less:
CONDUIT ATTACK SURFACE:
┌────────────────────────────────────────┐
│ Static files (CDN, nearly immutable) │
│ Single trigger endpoint (minimal) │
│ Prompt injection (new attack vector) │
│ Agent misconfiguration │
└────────────────────────────────────────┘
Each agent runs in an isolated environment:
→ Spins up, executes, terminates
→ No persistent attack surface
→ Credentials never exposed to frontend
Note: This introduces new attack vectors like prompt injection. The attack surface is different, not necessarily smaller in all dimensions.
A New Development Era
Looking at the evolution of web development:
| Era | Paradigm | Complexity Handler |
|---|---|---|
| 1990s | CGI/PHP | Humans write everything |
| 2000s | Frameworks (Rails, Django) | Convention over configuration |
| 2010s | SPAs + APIs (React + Node) | Frontend/backend separation |
| 2020s | Serverless + JAMstack | Functions + static |
| Future | Static + Agentic Backend | LLM handles complexity at runtime |
Frameworks like React solved real problems—but they solved them for human developers who needed help managing complexity. When the complexity handler is an LLM that can reason, plan, and execute, the abstraction layer becomes overhead.
This doesn’t mean frameworks disappear overnight. There will always be cases where you need 60fps interactions, real-time collaboration, or game-like responsiveness. But for the vast majority of applications—CRUD apps, dashboards, content sites, e-commerce—the traditional stack provides more machinery than the job requires.
Where This Is Heading
The pieces exist today, scattered:
- LLMs can reason about code and execute logic
- MCP provides a standard protocol for tool access
- GitHub Actions can run agentic workflows
- Static site generators deploy in seconds
- Browsers handle more without JavaScript every year
What’s missing is the unified framework that brings it all together. Something that lets you:
- Write static HTML/CSS
- Declare your agents in YAML
- Define intents as prompts
- Configure logging and observability
- Deploy and iterate
Call it Conduit, call it something else—the name matters less than the paradigm shift.
We’re moving toward a world where the backend isn’t code you maintain. It’s intelligence you configure.
Open Questions
This post raises more questions than it answers. Some I’m actively thinking about:
- How do we handle authentication securely in a prompt-based architecture?
- What does a production-grade logging system for agentic backends look like?
- How do we achieve sufficient determinism for financial operations?
- What’s the right boundary between “let the LLM handle it” and “use deterministic code”?
- How do we debug a system where the “code” is generated at runtime?
If you have thoughts on any of these, I’d love to hear them. This is a vision being refined, and the best ideas come from conversation.
This is still early. The gaps are real. But I believe the direction is clear—and worth exploring.