
The Complete API Architecture Guide: REST, GraphQL, gRPC, tRPC, WebSockets & SSE

A Production-Ready Comparison of Modern API Patterns, Their Trade-offs, and When to Use Each

Author
Sunil Khadka
Software Engineer
8 min read

Modern applications rely heavily on APIs, but there’s no single “best” way to build them. From REST and GraphQL to gRPC, WebSockets, and more, each approach comes with its own trade-offs.

In this guide, we’ll break down these API architectures, how they work, and when to use each in real-world systems.

But before diving in, it’s important to understand the foundation they all rely on: HTTP.

Understanding How the Web Actually Works (HTTP Explained Simply)

I used APIs every day without truly understanding what was happening under the hood. In that post, I break down HTTP, requests, responses, and how the web actually works, in a way that finally made things click for me. You can find it at sunil001.com.np.

If you’re not fully comfortable with how requests and responses work, I recommend starting there first; it’ll make everything in this guide much clearer.

REST: The Reliable Default

Best for: Public APIs, third-party integrations, anything where "it just needs to work everywhere"

REST is the Honda Civic of API architectures: not exciting, but it runs everywhere, everyone knows how to drive it, and mechanics (developers) are easy to find.

The core idea is simple: you have resources (like users, orders, products), and you interact with them using standard HTTP methods. GET to fetch, POST to create, PUT to update, DELETE to remove. Return JSON. Done.
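To make that concrete, here is a minimal sketch of REST semantics over an in-memory "users" resource. There is no framework here; `handle` is a hypothetical dispatcher mapping (method, path, body) to a status code and JSON payload, just to show how the HTTP verbs line up with actions:

```typescript
// Hypothetical sketch: REST verb semantics over an in-memory resource.
type User = { id: string; name: string };

const users = new Map<string, User>([["123", { id: "123", name: "Ada" }]]);

function handle(method: string, path: string, body?: Partial<User>) {
  const id = path.match(/^\/users\/(\w+)$/)?.[1];

  if (method === "GET" && id) {
    const user = users.get(id);
    return user ? { status: 200, json: user } : { status: 404, json: null };
  }
  if (method === "POST" && path === "/users" && body?.name) {
    const user: User = { id: String(users.size + 1), name: body.name };
    users.set(user.id, user);
    return { status: 201, json: user }; // created
  }
  if (method === "DELETE" && id) {
    return users.delete(id)
      ? { status: 204, json: null } // deleted, no content
      : { status: 404, json: null };
  }
  return { status: 405, json: null }; // method not allowed
}
```

A real service would put this behind an HTTP server, but the mapping of GET/POST/DELETE to fetch/create/remove is the entire mental model.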

Why it works: Every programming language has an HTTP client. Every developer understands GET and POST. You can debug REST APIs with nothing more than your browser or curl. Documentation tools like Swagger/OpenAPI are mature and widely supported.

Where it gets annoying: Ever built a complex dashboard and needed data from 12 different endpoints? That's the "N+1 requests" problem. You fetch a list of items, then need to hit separate endpoints for details on each one. Waterfall latency adds up fast. Plus, you often get back way more data than you need (over-fetching) or not quite enough (under-fetching), forcing more requests.

When to use it: Building something external developers will consume? REST. Need maximum compatibility across mobile apps, web apps, and random scripts? REST. Working with teams who know nothing about your stack? REST.


GraphQL: The "Give Me Exactly What I Asked For" Approach

Best for: Complex, interconnected data; supporting multiple client types (like web and mobile apps with different needs); aggregating data from multiple sources

GraphQL was Facebook's answer to REST's over-fetching problem. Instead of hitting /users/123 and getting whatever the server decides to send, you write a query describing exactly which fields you want:

query {
  user(id: 123) {
    name
    email
    orders {
      total
      items {
        name
        price
      }
    }
  }
}

One request. Precise data. No more hitting five endpoints to build a single view.

Why developers love it: The schema acts as living documentation. Tools like GraphiQL let you explore the API interactively; no more digging through outdated docs to see what fields are available. Frontend teams gain independence: they can request new data shapes without waiting for backend teams to build new endpoints.
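That "living documentation" is the schema itself. A sketch of what the SDL behind the query above might look like (the types and fields are hypothetical, inferred from the query):

```graphql
type User {
  id: ID!
  name: String!
  email: String!
  orders: [Order!]!
}

type Order {
  total: Float!
  items: [OrderItem!]!
}

type OrderItem {
  name: String!
  price: Float!
}

type Query {
  user(id: ID!): User
}
```

Tools like GraphiQL read this schema to power autocomplete and inline docs, which is why the exploration experience feels so different from a static REST spec.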

The catches: Caching is complicated. With REST, you can lean on HTTP caching (CDNs, browser cache, etc.). GraphQL usually POSTs to a single endpoint, so you lose all of that for free. You end up implementing client-side caching (Apollo Client, Relay), which adds complexity.

Also, without safeguards, users can write absurdly expensive queries that hammer your database. You need complexity analysis and depth limiting to prevent someone from requesting "a user's friends' friends' friends' posts... going 10 levels deep."
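Depth limiting can be surprisingly little code. Production servers analyze the parsed query AST (libraries like graphql-depth-limit do this), but a crude sketch of the idea, just counting brace nesting in the raw query string, shows the principle:

```typescript
// Crude sketch of query depth limiting: track brace nesting in the raw
// query text. Real servers walk the parsed AST instead, but the check
// is the same: reject before the resolvers ever touch the database.
function queryDepth(query: string): number {
  let depth = 0;
  let max = 0;
  for (const ch of query) {
    if (ch === "{") max = Math.max(max, ++depth);
    if (ch === "}") depth--;
  }
  return max;
}

const MAX_DEPTH = 5; // arbitrary limit for illustration

function rejectIfTooDeep(query: string): void {
  if (queryDepth(query) > MAX_DEPTH) {
    throw new Error(`Query depth ${queryDepth(query)} exceeds limit ${MAX_DEPTH}`);
  }
}
```

Complexity analysis works similarly, except each field gets a cost weight instead of counting levels.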

When to use it: Your data looks like a graph (lots of relationships). You have multiple clients needing different data shapes. You're aggregating microservices behind a unified API. GitHub's API v4 is GraphQL for exactly these reasons: repos connect to issues, PRs, comments, orgs, and users in complex webs.


gRPC: When Speed Actually Matters

Best for: Internal microservices, real-time AI/ML serving, high-throughput systems where milliseconds count.

gRPC comes from Google and trades human-readability for raw performance. Instead of JSON over HTTP/1.1, you use Protocol Buffers (binary format) over HTTP/2.

What this means practically: smaller payloads, persistent connections, and the ability to stream data in both directions efficiently. Your services chat with each other in compressed binary rather than verbose text.

The workflow: You define your service in a .proto file, then generate client and server code in whatever languages you're using. Type safety comes baked in: if the schema changes, your generated code won't compile, catching issues early.
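A sketch of what such a .proto definition might look like (the service and message names here are hypothetical):

```protobuf
syntax = "proto3";

package users;

service UserService {
  // Unary call: one request, one response.
  rpc GetUser (GetUserRequest) returns (User);
  // Server streaming: one request, a stream of responses.
  rpc WatchUser (GetUserRequest) returns (stream User);
}

message GetUserRequest {
  string id = 1;
}

message User {
  string id = 1;
  string name = 2;
  string email = 3;
}
```

Running protoc with the gRPC plugin for each target language turns this one file into typed clients and server stubs everywhere.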

The trade-offs: Browser support is... awkward. gRPC uses HTTP/2 features that browsers don't expose directly to JavaScript, so you need gRPC-Web with a translation proxy for frontend use. Debugging requires specialized tools rather than just looking at JSON in DevTools. Load balancers need to be HTTP/2 aware.

When to use it: Service-to-service communication inside your infrastructure. Real-time inference where 25ms vs 250ms actually impacts user experience. Systems handling massive throughput where JSON parsing overhead adds up. Netflix uses it for recommendation serving because at their scale, efficiency compounds.


tRPC: The TypeScript Developer's Dream

Best for: Full-stack TypeScript apps, internal tools, monorepos, moving fast with confidence

If you're all-in on TypeScript, tRPC removes the "API layer" mental overhead entirely. Your frontend imports router definitions directly from your backend. Change a server function's signature, and TypeScript immediately yells at every broken call site in your client code.

No code generation step. No OpenAPI specs to keep in sync. No runtime type validation libraries. Just TypeScript doing what TypeScript does best.

// On the server
const appRouter = router({
  user: {
    getById: procedure
      .input(z.object({ id: z.string() }))
      .query(({ input }) => {
        return db.user.findById(input.id);
      }),
  },
});
 
// On the client - fully typed, autocomplete works
const user = trpc.user.getById.useQuery({ id: "123" });

Why it's addictive: The feedback loop is instant. Refactor a database schema, follow the TypeScript errors, fix everything before deploying. New team members understand the API by exploring types in their IDE, not reading documentation.

The limitations: It's TypeScript-only. Building a mobile app in Swift or Kotlin? Out of luck. Need to expose your API to external developers? tRPC assumes tight coupling between client and server (that's the point), but it doesn't work for public APIs.

When to use it: You're in a TypeScript monorepo. You're building internal tools where velocity matters more than "proper" API boundaries. Your team is small and moves fast. Cal.com uses it for their scheduling platform for exactly this reason.


WebSockets: True Two-Way Communication

Best for: Multiplayer games, collaborative editing, trading platforms, anything where both sides need to talk simultaneously and frequently

WebSockets upgrade an HTTP connection to a persistent, full-duplex tunnel. Both client and server can push messages anytime without the overhead of opening new connections.

The classic example: Google Docs. Multiple people editing simultaneously, every keystroke propagates to all connected clients in near real-time. The server needs to push updates constantly, and clients need to send changes constantly. HTTP request-response would be absurd here.

Where people overuse it: "We need real-time updates for our dashboard!" Okay, but does the client need to send frequent messages back? If it's mostly server → client data (live metrics, notifications, activity feeds), you're paying the WebSockets complexity tax for bidirectional capability you don't use.

The complexity tax: Connection state management. Reconnection logic. Horizontal scaling requires sticky sessions or pub/sub backplanes so multiple servers can share connection state. Debugging frame-level issues. It's all solvable, but it's work.
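Reconnection logic alone illustrates the tax. A common approach is exponential backoff with jitter, so thousands of dropped clients don't all reconnect in the same instant. A minimal sketch (the constants are arbitrary):

```typescript
// Exponential backoff with full jitter for WebSocket reconnects.
// attempt 0 → up to 1s, attempt 1 → up to 2s, ... capped at 30s.
function reconnectDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling; // full jitter avoids thundering herds
}
```

On each `close` event you'd schedule a reconnect with `setTimeout(connect, reconnectDelayMs(attempt++))` and reset the attempt counter once a connection succeeds. And that's just one of the pieces; you still need heartbeats, message replay, and shared state across servers.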


Server-Sent Events (SSE): The Simpler "Real-Time"

Best for: Live dashboards, notifications, AI streaming, stock tickers, log streaming, any server-to-client push

SSE does one thing: server pushes text data to the client over a standard HTTP connection. That's it. No fancy bidirectional magic, just a long-lived HTTP response that the server writes to periodically.

Why it's underrated: It's HTTP. Your existing load balancers work. Your existing authentication middleware works. Debugging is just reading text streams. Browsers handle automatic reconnection and tracking the last received event ID (for resuming interrupted streams). Ten lines of code on both sides and you're streaming.

// Client side - dead simple
const source = new EventSource('/events');
source.onmessage = (event) => {
  console.log('New data:', event.data);
};
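The server side is barely more work, because the wire format is just text: optional `id:` and `event:` lines, one or more `data:` lines, terminated by a blank line. A sketch of that framing (the helper name is hypothetical; a real server writes these strings to a long-lived response and flushes):

```typescript
// Format one Server-Sent Event per the SSE wire protocol.
// Multi-line payloads become multiple "data:" lines; the browser's
// EventSource joins them back together.
function formatSSE(data: string, opts: { id?: string; event?: string } = {}): string {
  const lines: string[] = [];
  if (opts.id) lines.push(`id: ${opts.id}`);
  if (opts.event) lines.push(`event: ${opts.event}`);
  for (const chunk of data.split("\n")) lines.push(`data: ${chunk}`);
  return lines.join("\n") + "\n\n"; // blank line terminates the event
}
```

The `id:` line is what enables resumption: after a disconnect, the browser automatically sends a `Last-Event-ID` header so the server can replay what was missed.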

The OpenAI pattern: Send your prompt via regular POST, get the AI's response streamed back via SSE. Each token appears as it's generated rather than waiting for the complete response. Shopify's Black Friday live map used SSE to stream 323 billion events because it's operationally simple at massive scale.

When not to use it: You need frequent client-to-server messages (use WebSockets). You're sending binary data (SSE is text-only). You need to support Internet Explorer (irrelevant for most of us now, but worth noting).


Hybrid Architectures Are Normal

Real systems mix and match. Netflix uses gRPC between services for performance and REST for public APIs for compatibility. Shopify uses SSE for live updates and GraphQL for complex data fetching. OpenAI uses REST for the request and SSE for the streaming response.

Start with your constraints, not the technology. A solo developer building a side project doesn't need Kubernetes and gRPC. A fintech startup handling trades needs to care about latency in ways a content site doesn't.

The best engineers I've worked with aren't religious about any of these. They understand the trade-offs, pick what fits the situation, and stay flexible enough to evolve when requirements change.
