Backend Architecture BE · 01 · 04

Handler and response: from business logic to bytes on the wire

The handler runs your logic, then the result is serialized, framed with a status and headers, and written to the socket. Serialization cost and keep-alive reuse decide how much that final stretch costs.



An endpoint that returns a list of orders gets 40% slower after a product launch. The query is unchanged and still returns in 8 ms. The slowdown is entirely in one step nobody profiles: turning the result into JSON. The list grew from 50 rows to 5,000, and JSON.stringify is now the most expensive CPU work the request does.

The handler is mostly waiting

Your handler runs the business logic, but for a typical request most of its wall-clock time is not computation — it is waiting on a database query, a cache lookup, or another service. That waiting is why async I/O matters (the next unit), and why a “3 ms handler” usually means 3 ms of CPU wrapped around tens of milliseconds of I/O wait. The handler’s job is to gather data and produce a result object; it does not yet touch the socket.
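To make the split between waiting and computing concrete, here is a minimal sketch. `fakeQuery` and its 20 ms delay are hypothetical stand-ins for a real database call, and `listOrdersHandler` is an illustrative name, not part of any framework.

```typescript
interface OrderRow {
  id: number;
  total: number;
}

// Hypothetical stand-in for a database call: resolves after roughly 20 ms.
function fakeQuery(rowCount: number): Promise<OrderRow[]> {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve(Array.from({ length: rowCount }, (_, i) => ({ id: i + 1, total: (i + 1) * 10 })));
    }, 20);
  });
}

interface HandlerResult {
  body: { count: number; orders: OrderRow[] };
  waitMs: number; // wall-clock time spent awaiting the query
  cpuMs: number; // wall-clock time spent in the handler's own synchronous code
}

// The handler gathers data and returns a plain result object.
// It does not serialize anything and never touches the socket.
async function listOrdersHandler(): Promise<HandlerResult> {
  const t0 = performance.now();
  const rows = await fakeQuery(50); // the event loop is free to serve others here
  const t1 = performance.now();
  const orders = rows.filter((row) => row.total > 0); // the actual "business logic"
  const t2 = performance.now();
  return { body: { count: orders.length, orders }, waitMs: t1 - t0, cpuMs: t2 - t1 };
}
```

Logging `waitMs` next to `cpuMs` for a real handler is a cheap way to check whether a slow endpoint is waiting or computing.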

Serialization is real work

Here is the part most engineers miss until they profile it: the work is not done once the handler returns. Turning a result object into bytes costs CPU time that every request pays:

JSON — JSON.stringify is synchronous and scales with object size. On a busy event-loop server, serializing a large array blocks every other request for the duration. A 5,000-row payload can cost several milliseconds of pure CPU.
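Schema serializers (Fastify’s fast-json-stringify, Protobuf) know the shape ahead of time and run faster than generic stringify, often 2–4× depending on payload shape and size, which is why high-throughput APIs declare response schemas. MessagePack is often grouped with them, but it is a schemaless binary format: it shrinks the payload rather than precompiling a shape.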
HTML/template rendering has the same property: it is CPU, and it is synchronous unless the framework streams it.
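One way to see the size dependence is to time the call directly. The sketch below assumes a made-up `OrderRow` shape and generated test data; the absolute numbers will vary by machine and runtime, so no expected timings are shown.

```typescript
interface OrderRow {
  id: number;
  customer: string;
  totalCents: number;
  status: string;
}

// Hypothetical test data; real rows would come from the query.
function makeOrders(count: number): OrderRow[] {
  return Array.from({ length: count }, (_, i) => ({
    id: i + 1,
    customer: `customer-${i + 1}`,
    totalCents: 1999 + i,
    status: i % 2 === 0 ? "paid" : "pending",
  }));
}

// Time one synchronous JSON.stringify call. Nothing else on the event loop
// can run between the two timestamps.
function timeStringify(rows: OrderRow[]): { ms: number; chars: number } {
  const start = performance.now();
  const json = JSON.stringify(rows);
  return { ms: performance.now() - start, chars: json.length };
}

const small = timeStringify(makeOrders(50));
const large = timeStringify(makeOrders(5000));
console.log(`50 rows: ${small.chars} chars in ${small.ms.toFixed(3)} ms`);
console.log(`5,000 rows: ${large.chars} chars in ${large.ms.toFixed(3)} ms`);
```

A single run is noisy (JIT warm-up, garbage collection), so treat the output as a rough indication and repeat the measurement before drawing conclusions.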

The cost is invisible until the object grows. Pagination is not just a UX choice; it is a serialization-cost control.

Generic JSON.stringify is the default, but it is synchronous and O(payload size); a schema serializer trades an up-front schema for often 2–4× faster output on hot endpoints.
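The trade can be illustrated with a hand-written serializer for one fixed shape. This is only a sketch of the idea, not how fast-json-stringify or any other library is implemented; the `Order` type and both function names are assumptions made for the example.

```typescript
interface Order {
  id: number;
  customer: string;
  total: number;
  paid: boolean;
}

// Serializer specialized to the Order shape. The keys and their types are
// known up front, so there is no per-object key enumeration or type check.
// Only the string field needs escaping, delegated to JSON.stringify.
// Assumes numeric fields are finite (NaN and Infinity are not valid JSON).
function serializeOrder(o: Order): string {
  return (
    '{"id":' + o.id +
    ',"customer":' + JSON.stringify(o.customer) +
    ',"total":' + o.total +
    ',"paid":' + (o.paid ? "true" : "false") +
    "}"
  );
}

function serializeOrders(orders: readonly Order[]): string {
  return "[" + orders.map(serializeOrder).join(",") + "]";
}

const sample: Order[] = [
  { id: 1, customer: 'Ann "A" Lee', total: 19.5, paid: true },
  { id: 2, customer: "Bo\nChen", total: 0, paid: false },
];
console.log(serializeOrders(sample) === JSON.stringify(sample));
```

The up-front cost is that the serializer must be kept in sync with the response shape: a field missing from the schema is silently left out of the output.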

Status, headers, and framing

Before the body goes out, the response needs a status line and headers. Two framing choices matter:

Content-Length means the server knows the full body size up front and sends it, so the client knows exactly when the response ends. For a generated body, that requires buffering the whole body first.
Transfer-Encoding: chunked lets the server stream without knowing the total size in advance (covered in the next lesson).
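The two framings look like this on the wire. The sketch below builds raw HTTP/1.1 response text by hand purely for illustration; in practice the framework or HTTP library writes these bytes, and the function names here are made up.

```typescript
const encoder = new TextEncoder();

// Content-Length framing: the whole body must exist before the headers
// can be written, because its size in bytes goes into the header.
function frameWithContentLength(status: string, body: string): string {
  const byteLength = encoder.encode(body).length; // bytes, not characters
  return (
    `HTTP/1.1 ${status}\r\n` +
    "Content-Type: application/json\r\n" +
    `Content-Length: ${byteLength}\r\n` +
    "\r\n" +
    body
  );
}

// Chunked framing: each chunk is "<size in hex>\r\n<data>\r\n",
// and a zero-size chunk marks the end of the body.
function frameChunked(status: string, chunks: readonly string[]): string {
  const head =
    `HTTP/1.1 ${status}\r\n` +
    "Content-Type: application/json\r\n" +
    "Transfer-Encoding: chunked\r\n" +
    "\r\n";
  const body = chunks
    .filter((chunk) => chunk.length > 0) // an empty chunk would end the body early
    .map((chunk) => `${encoder.encode(chunk).length.toString(16)}\r\n${chunk}\r\n`)
    .join("");
  return head + body + "0\r\n\r\n";
}

console.log(frameWithContentLength("200 OK", '{"name":"é"}'));
console.log(frameChunked("200 OK", ["[1,", "2]"]));
```

Note that Content-Length counts bytes: the 12-character body above contains one two-byte character, so the header reads 13.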

The status code is part of the contract, not decoration: 200 vs 201 (created), 204 (no body), 4xx (client must change the request), 5xx (server fault, safe to retry an idempotent call). Returning 200 with an error body inside is a common anti-pattern that breaks every client’s retry and monitoring logic.
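A small sketch of treating the status as part of the contract: the handler's outcome is mapped to a status in one place, so a failure can never leave as a 200. The `Outcome` type and `toResponse` function are hypothetical names for this example, and the set of cases is deliberately small.

```typescript
type Outcome =
  | { kind: "ok"; data: unknown }
  | { kind: "created"; data: unknown }
  | { kind: "deleted" }
  | { kind: "invalid"; message: string }
  | { kind: "notFound" }
  | { kind: "failed"; message: string };

interface HttpResponse {
  status: number;
  body: string | null;
}

// One mapping from outcome to status code, so errors cannot be
// wrapped in a 200 by accident.
function toResponse(outcome: Outcome): HttpResponse {
  switch (outcome.kind) {
    case "ok":
      return { status: 200, body: JSON.stringify(outcome.data) };
    case "created":
      return { status: 201, body: JSON.stringify(outcome.data) };
    case "deleted":
      return { status: 204, body: null }; // 204 carries no body
    case "invalid":
      return { status: 400, body: JSON.stringify({ error: outcome.message }) };
    case "notFound":
      return { status: 404, body: JSON.stringify({ error: "not found" }) };
    case "failed":
      return { status: 500, body: JSON.stringify({ error: outcome.message }) };
  }
}

console.log(toResponse({ kind: "failed", message: "database unavailable" }).status);
```

Clients, proxies, and alerting can then branch on the status alone without parsing the body.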

Choice	Cost	When it bites
`JSON.stringify` (generic)	O(size), synchronous, blocks the loop	Large arrays after data growth
Schema serializer	2–4× faster, precompiled	Worth it on hot, high-volume endpoints
Buffer then send (Content-Length)	Holds full body in memory	Large responses spike memory
Wrong status (200 on error)	Breaks client retries + alerts	Silent until an incident

Keep-alive: paying the handshake once

Opening a TCP (and TLS) connection is expensive: the handshake costs one or more round trips before any request can be sent. HTTP keep-alive (persistent connections, the default in HTTP/1.1; HTTP/1.0 clients opt in with Connection: keep-alive) reuses one connection for many requests, so the handshake is amortized. After the response drains, the connection sits idle on both sides, ready for the next request, until a keep-alive timeout closes it.

This is why connection reuse is a lifecycle concern: a client that opens a fresh connection per request pays the handshake every time, and a server with too short a keep-alive timeout forces reconnects under steady load. The response is not “done” when the handler returns — it is done when the bytes are drained and the connection is correctly kept or closed.
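In Node.js, both sides of that reuse are configurable. The sketch below uses the built-in `node:http` module; the socket caps and the 65-second timeout are illustrative assumptions, not recommended values, and the server is created but never started.

```typescript
import * as http from "node:http";

// Client side: one shared Agent keeps sockets open and reuses them,
// instead of opening a fresh connection per request.
const keepAliveAgent = new http.Agent({
  keepAlive: true, // return sockets to the pool after each response
  maxSockets: 50, // assumed cap on concurrent sockets per host
  maxFreeSockets: 10, // assumed cap on idle sockets kept per host
});

// Server side: how long an idle connection is kept before it is closed.
const server = http.createServer((_req, res) => {
  const body = JSON.stringify({ ok: true });
  res.writeHead(200, {
    "Content-Type": "application/json",
    "Content-Length": Buffer.byteLength(body),
  });
  res.end(body);
});

// Illustrative value in milliseconds. Too short a timeout forces
// reconnects (and new handshakes) under steady load.
server.keepAliveTimeout = 65000;
```

A client opts in by passing `agent: keepAliveAgent` in the options to `http.request` or `http.get`; calling `keepAliveAgent.destroy()` closes the pooled sockets when they are no longer needed.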

Quiz

An endpoint slows down 40% after its result set grows from 50 to 5,000 rows, but the database query time is unchanged. What is the most likely cause?

Quiz

Why is returning HTTP 200 with an error object inside the body considered an anti-pattern?

Complete the analogy

Fill in the blank: HTTP keep-alive lets many requests reuse one connection so the expensive TCP/TLS _______ is paid once instead of per request.

The handler produces a result object (mostly I/O wait); serialization is synchronous CPU that scales with payload size; framing adds the status line and Content-Length; draining writes to a kept-alive socket. The response is done only when the bytes drain.

Recall before you leave

01. Why is most of a typical handler's wall-clock time not CPU, and why does serialization cost still matter?
02. What do Content-Length and chunked transfer encoding each require of the server, and how does the status code function as a contract?
03. What does HTTP keep-alive optimize, and why is connection reuse a lifecycle concern?

Recap

At the center of the onion the handler runs, but most of its time is I/O wait, not computation — it produces a result object without touching the socket. Turning that object into bytes is the serialization stage: synchronous CPU that scales with payload size, where generic JSON.stringify on large results blocks the event loop and schema serializers or pagination cut the cost. The response is then framed with a status line and headers, using Content-Length (buffer first) or chunked encoding, and the status code carries real contract meaning that clients, proxies, and alerts depend on. Finally, keep-alive amortizes the expensive TCP/TLS handshake across many requests on one connection. Now when you see a handler that is fast but the endpoint is slow, look at serialization cost and keep-alive settings before concluding the code is wrong. The response is only truly done when the bytes drain — and when the client reads slower than the server writes, that draining becomes the hard problem of the next lesson: backpressure.



