Backend Architecture

Handler and response: from business logic to bytes on the wire

Crux: The handler runs your logic, then the result is serialized, framed with a status and headers, and written to the socket. Serialization cost and keep-alive reuse decide how much that final stretch costs.

An endpoint that returns a list of orders gets 40% slower after a product launch. The query is unchanged and still returns in 8 ms. The slowdown is entirely in one line nobody profiles: turning the result into JSON. The list grew from 50 rows to 5,000, and JSON.stringify is now the most expensive thing the request does.

The handler is mostly waiting

Your handler runs the business logic, but for a typical request most of its wall-clock time is not computation — it is waiting on a database query, a cache lookup, or another service. That waiting is why async I/O matters (the next unit), and why a “3 ms handler” usually means 3 ms of CPU wrapped around tens of milliseconds of I/O wait. The handler’s job is to gather data and produce a result object; it does not yet touch the socket.
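To make the split between waiting and computing concrete, here is a minimal sketch that times the two parts of a handler separately. `fakeQuery` and `handleOrders` are hypothetical names, and the timer-based delay only stands in for a real database call.

```typescript
// Hypothetical stand-in for a database call: resolves after `delayMs`.
function fakeQuery(delayMs: number): Promise<number[]> {
  return new Promise((resolve) => setTimeout(() => resolve([1, 2, 3]), delayMs));
}

interface HandlerTiming {
  totalMs: number;   // wall-clock time for the whole handler
  waitMs: number;    // time spent awaiting I/O
  computeMs: number; // time spent in our own logic
  sum: number;       // the "result object" of this toy handler
}

async function handleOrders(queryDelayMs: number): Promise<HandlerTiming> {
  const start = performance.now();
  // I/O wait: the event loop is free to serve other requests here.
  const rows = await fakeQuery(queryDelayMs);
  const afterWait = performance.now();
  // The actual business logic: a trivial amount of CPU.
  const sum = rows.reduce((acc, n) => acc + n, 0);
  const end = performance.now();
  return {
    totalMs: end - start,
    waitMs: afterWait - start,
    computeMs: end - afterWait,
    sum,
  };
}

handleOrders(20).then((t) => {
  console.log(`total ${t.totalMs.toFixed(2)} ms, waiting ${t.waitMs.toFixed(2)} ms, computing ${t.computeMs.toFixed(2)} ms`);
});
```

Note that the handler returns a plain object. Nothing in it has touched the socket yet.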

Serialization is real work

Turning a result object into bytes is CPU the request always pays:

  • JSON: JSON.stringify is synchronous and scales with object size. On a busy event-loop server, serializing a large array blocks every other request for the duration. A 5,000-row payload can cost several milliseconds of pure CPU.
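  • Schema serializers (Fastify’s fast-json-stringify, Protobuf) precompile the shape and can run far faster than generic stringify, often 2–4×, which is why high-throughput APIs declare response schemas. MessagePack, often grouped with these, is a schemaless binary format: it produces compact payloads but does not precompile a shape.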
  • HTML/template rendering has the same property: it is CPU, and it is synchronous unless the framework streams it.
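The idea behind a schema serializer can be sketched by hand: if the shape is known ahead of time, the key names and their order are fixed and only the values need encoding. The code below is an illustration of that idea, not the API of fast-json-stringify or any other library. The `OrderRow` shape and the function names are assumptions.

```typescript
// Hypothetical response shape, fixed ahead of time.
interface OrderRow {
  id: number;
  status: string;
  total: number;
}

// Generic path: JSON.stringify discovers the shape of every object at runtime.
function serializeGeneric(orders: OrderRow[]): string {
  return JSON.stringify(orders);
}

// Shape-specific path: keys are baked into the code, only values are encoded.
// Assumes `id` and `total` are finite numbers (NaN or Infinity would yield invalid JSON).
function serializeOrderRow(o: OrderRow): string {
  return (
    '{"id":' + o.id +
    ',"status":' + JSON.stringify(o.status) + // strings still need escaping
    ',"total":' + o.total +
    "}"
  );
}

function serializeOrderRows(orders: OrderRow[]): string {
  return "[" + orders.map(serializeOrderRow).join(",") + "]";
}

const sampleRows: OrderRow[] = [
  { id: 1, status: "paid", total: 19.5 },
  { id: 2, status: 'on "hold"', total: 7 },
];
console.log(serializeOrderRows(sampleRows) === serializeGeneric(sampleRows));
```

Whether a hand-written version like this beats the runtime's built-in stringify depends on the engine and the payload, so measure before relying on it. Real schema serializers generate this kind of code from a declared schema.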

The cost is invisible until the object grows. Pagination is not just a UX choice; it is a serialization-cost control.
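As a rough illustration of pagination as a cost control, the sketch below caps how many rows ever reach the serializer. The names `paginate`, `makeOrders`, and the cap of 100 are assumptions for the example. In a real service the limit would usually be pushed into the query itself so the extra rows are never fetched.

```typescript
interface PagedOrder {
  id: number;
  status: string;
  total: number;
}

interface Page<T> {
  items: T[];
  page: number;
  pageSize: number;
  total: number;
}

const MAX_PAGE_SIZE = 100; // hypothetical upper bound on rows per response

function paginate<T>(rows: T[], page: number, pageSize: number): Page<T> {
  // Clamp inputs so a client cannot request an unbounded payload.
  const size = Math.min(Math.max(1, Math.floor(pageSize)), MAX_PAGE_SIZE);
  const current = Math.max(1, Math.floor(page));
  const start = (current - 1) * size;
  return {
    items: rows.slice(start, start + size),
    page: current,
    pageSize: size,
    total: rows.length,
  };
}

// Synthetic data standing in for the grown result set.
function makeOrders(n: number): PagedOrder[] {
  return Array.from({ length: n }, (_, i) => ({ id: i + 1, status: "paid", total: 10 + i }));
}

const allOrders = makeOrders(5000);
const fullBody = JSON.stringify(allOrders);
const pageBody = JSON.stringify(paginate(allOrders, 1, 50));
console.log(`full: ${fullBody.length} chars, one page: ${pageBody.length} chars`);
```

The serialized size, and with it the synchronous CPU spent per request, now depends on the page size rather than on how large the table has grown.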

Status, headers, and framing

Before the body goes out, the response needs a status line and headers. Two framing choices matter:

  • Content-Length: the server knows the full body size up front, sends it, and the client knows exactly when the response ends. For a dynamically generated body, this usually means buffering the whole body first.
  • Transfer-Encoding: chunked: the server streams without knowing the total size in advance (covered next lesson).
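The two framing choices can be seen on the wire. The sketch below builds both forms of an HTTP/1.1 response as strings, purely for illustration. A real server leaves this to its HTTP library, and the function names here are assumptions.

```typescript
const wireEncoder = new TextEncoder();

// Content-Length framing: the size in BYTES (not characters) must be known
// before the head is sent, so the whole body has to exist first.
function frameWithContentLength(body: string): string {
  const byteLength = wireEncoder.encode(body).length;
  return (
    "HTTP/1.1 200 OK\r\n" +
    "Content-Type: application/json\r\n" +
    `Content-Length: ${byteLength}\r\n` +
    "\r\n" +
    body
  );
}

// Chunked framing: each piece goes out as <size in hex>\r\n<data>\r\n,
// and a zero-length chunk marks the end of the body.
function frameChunked(pieces: string[]): string {
  const head =
    "HTTP/1.1 200 OK\r\n" +
    "Content-Type: application/json\r\n" +
    "Transfer-Encoding: chunked\r\n" +
    "\r\n";
  const chunks = pieces
    .filter((p) => p.length > 0) // an empty chunk would end the body early
    .map((p) => wireEncoder.encode(p).length.toString(16) + "\r\n" + p + "\r\n");
  return head + chunks.join("") + "0\r\n\r\n";
}

console.log(JSON.stringify(frameWithContentLength('{"n":"é"}')));
console.log(JSON.stringify(frameChunked(['[{"id":1},', '{"id":2}]'])));
```

The first form needs the finished body to compute the length. The second can emit each piece as soon as it is ready, at the price of the client not knowing the total size in advance.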

The status code is part of the contract, not decoration: 200 vs 201 (created), 204 (no body), 4xx (client must change the request), 5xx (server fault, safe to retry an idempotent call). Returning 200 with an error body inside is a common anti-pattern that breaks every client’s retry and monitoring logic.
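One way to keep the status code honest is to map each outcome of the business logic to a status in a single place, so an error can never leave as a 200. This is a minimal sketch. The `Outcome` type, `toResponse`, and `shouldRetry` are hypothetical names, not part of any framework.

```typescript
// Hypothetical result type produced by the business logic.
type Outcome =
  | { kind: "ok"; data: unknown }
  | { kind: "created"; data: unknown }
  | { kind: "deleted" }
  | { kind: "invalid"; message: string }
  | { kind: "not_found" }
  | { kind: "failure"; message: string };

interface HttpResponse {
  status: number;
  body?: string; // absent for 204
}

function toResponse(outcome: Outcome): HttpResponse {
  switch (outcome.kind) {
    case "ok":
      return { status: 200, body: JSON.stringify(outcome.data) };
    case "created":
      return { status: 201, body: JSON.stringify(outcome.data) };
    case "deleted":
      return { status: 204 }; // success with no body
    case "invalid":
      return { status: 400, body: JSON.stringify({ error: outcome.message }) };
    case "not_found":
      return { status: 404, body: JSON.stringify({ error: "not found" }) };
    case "failure":
      // Internal detail stays in server logs; the client only learns it was a server fault.
      return { status: 500, body: JSON.stringify({ error: "internal error" }) };
  }
}

// Client-side view of the contract: only retry server faults on idempotent calls.
function shouldRetry(status: number, idempotent: boolean): boolean {
  return idempotent && status >= 500 && status <= 599;
}

console.log(toResponse({ kind: "invalid", message: "quantity must be positive" }));
```

With the anti-pattern (200 plus an error body), `shouldRetry` would never fire and an error-rate alert keyed on 5xx would stay silent.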

| Choice | Cost | When it bites |
| --- | --- | --- |
| JSON.stringify (generic) | O(size), synchronous, blocks the loop | Large arrays after data growth |
| Schema serializer | 2–4× faster, precompiled | Worth it on hot, high-volume endpoints |
| Buffer then send (Content-Length) | Holds full body in memory | Large responses spike memory |
| Wrong status (200 on error) | Breaks client retries + alerts | Silent until an incident |

Keep-alive: paying the handshake once

Opening a TCP (and TLS) connection is expensive: the TCP handshake costs one round trip, and TLS adds at least one more on a fresh connection. HTTP keep-alive reuses one connection for many requests, so the handshake is amortized. Persistent connections are the default in HTTP/1.1; the explicit Connection: keep-alive header is the HTTP/1.0 way of asking for the same behavior. After the response drains, the connection returns to an idle pool on both sides until a keep-alive timeout closes it.

This is why connection reuse is a lifecycle concern: a client that opens a fresh connection per request pays the handshake every time, and a server with too short a keep-alive timeout forces reconnects under steady load. The response is not “done” when the handler returns — it is done when the bytes are drained and the connection is correctly kept or closed.
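A back-of-the-envelope model shows how much the handshake matters. The sketch assumes sequential requests, one round trip per request/response, and a handshake of two round trips (one for TCP, one for a full TLS 1.3 handshake). It ignores session resumption, HTTP/2 multiplexing, and server processing time, and the function name is hypothetical.

```typescript
// Total network latency for `requests` sequential calls, in milliseconds.
function totalLatencyMs(
  requests: number,
  rttMs: number,
  handshakeRtts: number,
  reuseConnection: boolean,
): number {
  if (requests <= 0) return 0;
  // With reuse the handshake is paid once; without it, once per request.
  const handshakes = reuseConnection ? 1 : requests;
  return handshakes * handshakeRtts * rttMs + requests * rttMs;
}

const fresh = totalLatencyMs(100, 50, 2, false);
const reused = totalLatencyMs(100, 50, 2, true);
console.log(`fresh connections: ${fresh} ms, keep-alive: ${reused} ms`);
```

Under these assumptions, 100 requests at a 50 ms round trip spend 15,000 ms on the network with a fresh connection each time and 5,100 ms with one reused connection. A keep-alive timeout that is too short pushes a client back toward the first number.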

Quiz

An endpoint slows down 40% after its result set grows from 50 to 5,000 rows, but the database query time is unchanged. What is the most likely cause?

Quiz

Why is returning HTTP 200 with an error object inside the body considered an anti-pattern?

Complete the analogy

Fill in the blank: HTTP keep-alive lets many requests reuse one connection so the expensive TCP/TLS _______ is paid once instead of per request.

Recall before you leave
  1. Why is most of a typical handler's wall-clock time not CPU, and why does serialization cost still matter?
  2. What do Content-Length and chunked transfer encoding each require of the server, and how does the status code function as a contract?
  3. What does HTTP keep-alive optimize, and why is connection reuse a lifecycle concern?

Recap

At the center of the onion the handler runs, but most of its time is I/O wait, not computation — it produces a result object without touching the socket. Turning that object into bytes is the serialization stage: synchronous CPU that scales with payload size, where generic JSON.stringify on large results blocks the event loop and schema serializers or pagination cut the cost. The response is then framed with a status line and headers, using Content-Length (buffer first) or chunked encoding, and the status code carries real contract meaning that clients, proxies, and alerts depend on. Finally, keep-alive amortizes the expensive TCP/TLS handshake across many requests on one connection. The response is only truly done when the bytes drain — and when the client reads slower than the server writes, that draining becomes the hard problem of the next lesson: backpressure.

