Build Your Own Web Server From Scratch in Node.JS
Learn Network Programming, HTTP, and WebSocket by Coding A Web Server

James Smith
2024-02-02
build-your-own.org
Contents

01. Introduction
    1.1 Why Code a Web Server?
    1.2 Build Your Own X From Scratch
    1.3 The Book
Network Programming
The first step is to make programs talk over a network. This is also called socket programming.
But socket programming is more than just gluing APIs together! It’s easy to end up with
half-working solutions if you skip the basics.
To communicate over a network, the data sent over it must conform
to a specific format called a "protocol". We'll learn how to create and implement network
protocols, using HTTP as the target.
HTTP in Detail
You probably already know something about HTTP, such as URLs, different methods like
GET and POST, response codes, and various headers. But have you ever thought that
you could create all the details from scratch, with your own code? It's not very complicated, and
it's rewarding.
Designed to guide you through your own web server implementation, the book follows a
step-by-step approach, and each chapter builds on the previous one.
4. Core applications.
5. WebSocket.
• Message-based designs.
• Concurrent programming.
– Blocking queues.
Producing code is the easiest part; you'll also need to debug the code to make it work,
so I have included some tips on this.
• What’s missing from the code? The gap between toys and the real thing, such as
optimizations and applications.
• Important concepts beyond coding, such as event loops and backpressure. These are
what you are likely to overlook.
• Design choices. Why does stuff have to work that way? You can learn from both the good
ones and the bad ones.
• Alternative routes. Where you can deviate from this book.
Code samples use TypeScript with type annotations for readability, but the differences
from JS are minor.
The code is mostly data structures + functions, avoiding fancy abstractions to maximize
clarity.
Book Series
This is part of the “Build Your Own X ” book series, which includes books on building your
own Redis[1] , database[2] , and compiler[3] .
https://build-your-own.org

[1] https://build-your-own.org/redis/
[2] https://build-your-own.org/database/
[3] https://build-your-own.org/compiler/
2.1 Overview
As you may already know, the HTTP protocol sits above the TCP protocol. How TCP
itself works in detail is not our concern; what we need to know about TCP is that it’s a
bidirectional channel for transmitting raw bytes — a carrier for other application protocols
such as HTTP or SSH.
Although each direction of a TCP connection can operate independently, many protocols
follow the request-response model. The client sends a request, then the server sends a
response, then the client might use the same connection for further requests and responses.
client            server
------            ------
| req1 |   ==>
           <==   | res1 |
| req2 |   ==>
           <==   | res2 |
...
nc example.com 80
The nc (netcat) command creates a TCP connection to the destination host and port, and
then attaches the connection to stdin and stdout. We can now start typing in the terminal
and the data will be transmitted:
GET / HTTP/1.0
Host: example.com
(empty line)
HTTP/1.0 200 OK
(response header fields omitted)

<!doctype html>
<!-- omitted -->
Making HTTP requests from the command line is very easy. Let’s take a look at the data
and try to figure out what it means.
The first line of the request — GET / HTTP/1.0 — contains the HTTP method GET, the URI /,
and the HTTP version 1.0. This is easy to figure out.
And the first line of the response — HTTP/1.0 200 OK — contains the HTTP version and the
response code 200.
Following the first line is a list of header fields in the format of Key: value. The request has
a single header field — Host — which contains the domain name.
The response contains many fields, and their functions are not as obvious as the Host field.
Many HTTP header fields are optional, and some are even useless. We will learn more
about this in later chapters.
The response header is followed by the payload, which in our case is an HTML document.
Payload and header are separated by an empty line. The GET request has no payload so it
ends with an empty line.
This is just a simple example that you can play with from the command line. We will
examine the HTTP protocol in more detail later.
An HTTP/1.0 connection handles only a single request and response; HTTP/1.1 fixed this and became
a practical protocol. You can try using the nc command to
send multiple requests on the same connection by simply changing HTTP/1.0 to HTTP/1.1.
This book will focus on HTTP/1.1, as it is still very popular and easy to understand. Even
software systems that have nothing to do with the Web have adopted HTTP as the basis of
their network protocol. When a backend developer talks about an “API”, they likely mean
an HTTP-based one, even for internal software services.
Why is HTTP so popular? One possible reason is that it can be used as a generic request-
response protocol; developers can rely on HTTP instead of inventing their own protocols.
This makes HTTP a good target for learning how to build network protocols.
There have been further developments since HTTP/1.1. HTTP/2, related to SPDY, is the
next iteration of HTTP. In addition to incremental refinements such as compressed headers,
it has 2 new capabilities.
• Server push, which is sending resources to the client before the client requests them.
• Multiplexing of multiple requests over a single TCP connection, which is an attempt
to address head-of-line blocking.
With these new features, HTTP/2 is no longer a simple request-response protocol. That's
why we start with HTTP/1.1: it's simple enough and easy to understand.
HTTP/3 is a much larger change than HTTP/2. It replaces TCP with UDP, so it needs
to replicate most of the functionality of TCP; this TCP alternative is called QUIC. The
motivations behind QUIC are userspace congestion control, multiplexing, and fixing head-of-line
blocking.
You can learn a lot by reading about these new technologies, but you may be overwhelmed
by these concepts and jargon. So let’s start with something small and simple: coding an
HTTP/1.1 server.
nc example.com 80 <request.txt
In practice, you may encounter some quirks of the nc command, such as not sending EOF,
or multiple versions of nc with incompatible flags. You can use the modern replacement
socat instead.
socat tcp:example.com:80 -
telnet example.com 80
You can also use an existing HTTP client instead of manually constructing the request data.
Try the curl command:
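For example (curl builds the request for you; add -v to see the request and response headers):

curl http://example.com/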
Most sites support HTTPS alongside plaintext HTTP. HTTPS adds an extra protocol layer
called “TLS” between HTTP and TCP. TLS is not plaintext, so you cannot use netcat to
test an HTTPS server. But TLS still provides a byte stream like TCP, so you just need to
replace netcat with a TLS client.
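For example, socat's OpenSSL address type can serve as such a TLS client (exact option
names vary between versions, and certificate verification options may be needed):

socat openssl:example.com:443 -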
Network protocols are divided into different layers, where a higher layer depends on the
lower layer, and each layer provides different capabilities.
        top
 /\   | App |   message or whatever
 ||   | TCP |   byte stream
 ||   | IP  |   packets
 ||   | ... |
        bottom
The layer below TCP is the IP layer. Each IP packet is a message with 3 components:

1. The sender's address.
2. The receiver's address.
3. The payload data.

Communication with a packet-based scheme is not easy. There are lots of problems for
applications to solve:

• Packets can be lost, duplicated, or delivered out of order.
• A single packet cannot carry an arbitrarily large message.

To make things simple, the next layer is added on top of IP packets. TCP provides:

• Reliable and ordered delivery of data.
• A byte stream instead of packets.
A byte stream is simply an ordered sequence of bytes. A protocol, rather than the application,
is used to make sense of these bytes. Protocols are like file formats, except that the total
length is unknown and the data is read in one pass.
UDP is on the same layer as TCP, but is still packet-based like the lower layer. UDP just
adds port numbers over IP packets.
A key difference between TCP and UDP is the boundary between reads and writes:

• UDP: Each read from a socket corresponds to a single write from the peer.
• TCP: No such correspondence! Data is a continuous flow of bytes.
Why is there no such correspondence in TCP? Trace the data through the following stages:

1. TCP send buffer: This is where data is stored before transmission. Multiple writes
are indistinguishable from a single write.
2. The data is encapsulated as one or more IP packets; IP packet boundaries have no relationship to
the original write boundaries.
3. TCP receive buffer: Data is available to applications as it arrives.
The No. 1 beginner trap in socket programming is “concatenating & splitting TCP packets”
because there is no such thing as “TCP packets”. Protocols are required to interpret TCP
data by imposing boundaries within the byte stream.
To help you understand the implications of the byte stream, let’s use the DNS protocol
(domain name to IP address lookup) as an example.
DNS runs on UDP, the client sends a single request message and the server responds with a
single response message. A DNS message is encapsulated in a UDP packet.
| IP header | IP payload                  |
            \.............................\
            | UDP header | UDP payload    |
                         \................\
                         | DNS message    |
Due to the drawbacks of packet-based protocols, e.g., the inability to use large messages,
DNS is also designed to run on TCP. But TCP knows nothing about “message”, so when
sending DNS messages over TCP, a 2-byte length field is prepended to each DNS message
so that the server or client can tell which part of the byte stream is which message. This
2-byte length field is the simplest example of an application protocol on top of TCP. This
protocol allows for multiple application messages (DNS) in a single TCP byte stream.
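As an illustration, here is a sketch of such length-prefixed framing in TypeScript; the
function names are ours, for illustration only:

// add message boundaries with a 2-byte big-endian length prefix,
// like DNS over TCP
function encodeLengthPrefixed(msg: Buffer): Buffer {
    console.assert(msg.length <= 0xffff);   // the length must fit in 2 bytes
    const header = Buffer.alloc(2);
    header.writeUInt16BE(msg.length, 0);
    return Buffer.concat([header, msg]);
}

// extract one message from the front of a buffered byte stream, if complete;
// returns the message and the remaining bytes
function cutLengthPrefixed(buf: Buffer): null|[Buffer, Buffer] {
    if (buf.length < 2) {
        return null;    // need more data for the length field
    }
    const len = buf.readUInt16BE(0);
    if (buf.length < 2 + len) {
        return null;    // the message is not complete yet
    }
    return [buf.subarray(2, 2 + len), buf.subarray(2 + len)];
}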
To establish a TCP connection, there should be a client and a server (ignoring the simul-
taneous case). The server waits for the client at a specific address (IP + port), this step is
called bind & listen. Then the client can connect to that address. The “connect” operation
involves a 3-step handshake (SYN, SYN-ACK, ACK), but this is not our concern because
the OS does it transparently. After the OS completes the handshake, the connection can be
accepted by the server.
Once established, the TCP connection can be used as a bi-directional byte stream, with 2
channels for each direction. Many protocols are request-response like HTTP/1.1, where a
peer is either sending a request/response or receiving a response/request. But TCP isn’t
restricted to this mode of communication. Each peer can send and receive at the same time
(e.g. WebSocket), this is called full-duplex communication.
A peer tells the other side that no more data will be sent with the FIN flag, then the other
side ACKs the FIN. The remote application is notified of the termination when reading
from the channel.
Each direction of channels can be terminated independently, so the other side also performs
the same handshake to fully close the connection.
When you create a TCP connection, the connection is managed by your operating system,
and you use the socket handle to refer to the connection in the socket API. In Linux, a
socket handle is simply a file descriptor (fd). In Node.js, socket handles are wrapped into JS
objects with methods on them.
Any OS handle must be closed by the application to terminate the underlying resource and
recycle the handle.
End of Transmission
Send and receive are also called read and write. For the write side, there are ways to tell the
peer that no more data will be sent.
• Closing a socket terminates a connection and causes the TCP FIN to be sent. Closing
a handle of any type also recycles the handle itself. (Once the handle is gone, you
cannot do anything with it.)
• You can also shutdown your side of the transmission (also send FIN) while still being
able to receive data from the peer; this is called a half-open connection, more on this
later.
For the read side, there are ways to know when the peer has ended the transmission (received
FIN). The end of transmission is often called the end of file (EOF).
• Listening socket:
  – bind & listen
  – accept
  – close
• Connection socket:
  – read
  – write
  – close
The next thing is the accept primitive for getting new connections. Unfortunately, there is
no accept() function that simply returns a connection.
Here we need some background knowledge about IO in JS: There are 2 styles of handling
IO in JS, the first style is using callbacks; you request something to be done and register a
callback with the runtime, and when the thing is done, the callback is invoked.
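A minimal sketch of the listening setup discussed here, using the standard net module
(the address and port are arbitrary choices):

import * as net from "net";

function newConn(socket: net.Socket): void {
    console.log('new connection', socket.remoteAddress, socket.remotePort);
    // omitted: read from and write to the connection ...
}

let server = net.createServer();
server.on('connection', newConn);
server.listen({host: '127.0.0.1', port: 1234});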
In the above code listing, server.on('connection', newConn) registers the callback function
newConn. The runtime will automatically perform the accept operation and invoke the callback
with the new connection as an argument of type net.Socket. This callback is registered once,
but will be called for each new connection.
The 'connection' argument is called an event, which is something you can register callbacks
on. There are other events on a listening socket. For example, there is the 'error' event,
which is invoked when an error occurs.
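Such a handler can be as simple as:

server.on('error', (err: Error) => { throw err; });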
Here we simply throw the exception and terminate the program. You can test this by
running 2 servers on the same address and port, the second server will fail.
As this book is not a manual, we will not list everything here. Read the Node.js documenta-
tion[1] to find out other potentially useful events.
Data received from the connection is also delivered via callbacks. The relevant events for
reading from a socket are the 'data' event and the 'end' event. The 'data' event is invoked
whenever data arrives from the peer, and the 'end' event is invoked when the peer has
ended the transmission.
socket.on('end', () => {
    // FIN received. The connection will be closed automatically.
    console.log('EOF.');
});
socket.on('data', (data: Buffer) => {
    console.log('data:', data);
    socket.write(data); // echo back the data.
    // actively close the connection if the data contains 'q'
    if (data.includes('q')) {
        console.log('closing.');
        socket.end();   // this will send FIN and close the connection.
    }
});
The socket.end() method ends the transmission and closes the socket. Here we call
socket.end() when the data contains the letter “q” so we can easily test this scenario.
[1] https://nodejs.org/api/net.html#class-netserver
When the transmission is ended from either side, the socket is automatically closed by the
runtime. There is also the 'error' event on net.Socket that reports IO errors. This event
also causes the runtime to close the socket.
Step 6: Test It
Here is the complete connection handler for our echo server; the listening setup is the same as before.
socket.on('end', () => {
    // FIN received. The connection will be closed automatically.
    console.log('EOF.');
});
socket.on('data', (data: Buffer) => {
    console.log('data:', data);
    socket.write(data); // echo back the data.
    if (data.includes('q')) {
        console.log('closing.');
        socket.end();   // send FIN and close the connection.
    }
});
Start the echo server by running node --enable-source-maps echo_server.js. And test it with
the nc or socat command.
Suppose peer A half-closes the connection to peer B (A sends FIN while B does not):

• A cannot send any more data, but can still receive from B.
• B gets EOF, but can still send to A.
Not many applications make use of this. Most applications treat EOF the same way as being
fully closed by the peer, and will also close the socket immediately.
The socket primitive for this is called shutdown[2]. Sockets in Node.js do not support half-
open by default, and are automatically closed when either side sends or receives EOF. To
support TCP half-open, an additional flag is required.
When the allowHalfOpen flag is enabled, you are responsible for closing the connection,
because socket.end() will no longer close the connection, but will only send EOF. Use
socket.destroy() to close the socket manually.
[2] https://man7.org/linux/man-pages/man2/shutdown.2.html
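For example:

const server = net.createServer({allowHalfOpen: true});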
// pseudo code!
while (running) {
    let events = wait_for_events();  // blocking
    for (let e of events) {
        do_something(e);             // may invoke callbacks
    }
}
The runtime polls for IO events from the OS, such as a new connection arriving, a socket
becoming ready to read, or a timer expiring. Then the runtime reacts to the events and
invokes the callbacks that the programmer registered earlier. This process repeats after all
events have been handled, thus it’s called the event loop.
The event loop is single-threaded; execution is either on the runtime code or on the JS code
(callbacks or the main program). This works because when a callback returns, or awaits,
control is back to the runtime, so the runtime can emit events and schedule other tasks.
This implies that any JS code is expected to finish in a short time because the event loop is
halted when executing JS code.
To help you understand the implication of the event loop, let’s now consider concurrency.
A server can have multiple connections simultaneously, and each connection can emit
events.
While an event handler is running, the single-threaded runtime cannot do anything for the
other connections until the handler returns. The longer you process an event, the longer
everything else is delayed.
It's vital to avoid staying in the event loop for too long. One way to cause such trouble is to
run CPU-intensive code. This can be solved by moving the CPU-intensive work out of the
event loop, such as into worker threads, or by splitting it into smaller steps.
These topics are beyond the scope of this book, and our primary concern is IO.
The OS provides both blocking mode and non-blocking mode for network IO.
• In blocking mode, the calling OS thread blocks until the result is ready.
• In non-blocking mode, the OS immediately returns if the result is not ready (or is
ready), and there is a way to be notified of readiness (for event loops).
The Node.JS runtime uses non-blocking mode because blocking mode is incompatible
with event-based concurrency. The only blocking operation in an event loop is polling the
OS for more events when there is nothing to do.
IO in Node.js is Asynchronous
Most Node.js library functions related to IO are either callback-based or promise-based.
Promises can be viewed as another way to manage callbacks. These are also described as
asynchronous, meaning that the result is delivered via a callback. These APIs do not block
the event loop because the JS code doesn’t wait for the result; instead, the JS code returns
to the runtime, and when the result is ready, the runtime invokes the callback to continue
your program.
The opposite is a synchronous API, which blocks the calling OS thread to wait for the
result. For example, let's take a look at the documentation of the fs module; file APIs are
available in all 3 styles[3].
// promise
filehandle.read([options]);
// callback
fs.read(fd[, options], callback);
// synchronous, do not use!
fs.readSync(fd, buffer[, options]);
The synchronous API is what you do NOT use in network applications, since it blocks the
event loop. It exists for some simple use cases (like scripting) that do not depend on the
event loop at all.
[3] https://nodejs.org/api/fs.html
A hypothetical promised-based API for the accept primitive looks like this:
// pseudo code!
while (running) {
    let socket = await server.accept();
    newConn(socket);    // no `await` on this
}
And the hypothetical API for the read and write primitive looks like this:
// pseudo code!
async function newConn(socket) {
    while (true) {
        let data = await socket.read();
        if (!data) {
            break;  // EOF
        }
        await socket.write(data);
    }
}
The above pseudo code appears to be synchronous, but does not block the event loop. The
advantage may not be clear at this point, since our program is very simple.
Some Node.js APIs, but not all of them, are available in both callback-based and promise-
based styles. However, with some effort, callback-based APIs can be converted to promise-
based ones, as we will see in the next chapter.
Using Callbacks
An example of a callback-based API. The application logic is continued in a callback
function.
function my_app() {
    do_something_cb((err, result) => {
        if (err) {
            // failure.
        } else {
            // success, use the result.
        }
    });
}
Creating Promises
An example of creating promises: converting a callback-based API to promise-based.
function do_something_promise<T>(): Promise<T> {
    return new Promise<T>((resolve, reject) => {
        do_something_cb((err, result) => {
            if (err) {
                reject(err);
            } else {
                resolve(result);
            }
        });
    });
}
Callbacks are unavoidable in JS. When creating a promise object, an executor callback is
passed as an argument to receive 2 more callbacks: resolve and reject.
You must call one of them when the result is available or the operation has failed. This
may happen outside the executor function, so you may need to store these callbacks some-
where.
While the JS code is waiting for a result, control is back in the runtime, so the
runtime can poll for events and invoke callbacks; that's the event loop we talked about
earlier!
Initially, the Promise type is just a way to manage callbacks. It allows chaining multiple
callbacks without too many nested functions. However, we will not bother with this use of
promises because of the addition of async functions.
Unlike normal functions, async functions can return to the runtime in the middle of
execution; this happens when you use the await statement on a promise. And when the
promise is settled, execution of the async function resumes with the result of the promise.
This is a superior coding experience because you can write sequential IO code in the same
function without being interrupted by callbacks.
Invoking an async function results in a promise that settles itself when the async function
returns or throws. You can await on it like a normal promise, but if you don’t, the async
function will still be scheduled by the runtime. This is similar to starting a thread in multi-
threaded programming. But all JS code shares a single OS thread, so a better word to use is
task.
The soRead function returns a promise which is resolved with socket data. It depends on 3
events: 'data', 'end', and 'error'.
To resolve or reject the promise from these events, the promise has to be stored somewhere.
We will create the TCPConn wrapper object for this purpose.
The promise’s resolve and reject callbacks are stored in the TCPConn.reader field.
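Here is a sketch of both, consistent with the description (the exact fields and checks are
our reconstruction):

// a promise-based wrapper around net.Socket
type TCPConn = {
    // the JS socket object
    socket: net.Socket;
    // from the 'error' event
    err: null|Error;
    // EOF, from the 'end' event
    ended: boolean;
    // the callbacks of the promise of the current read
    reader: null|{
        resolve: (value: Buffer) => void,
        reject: (reason: Error) => void,
    };
};

// returns an empty Buffer after EOF
function soRead(conn: TCPConn): Promise<Buffer> {
    console.assert(!conn.reader);   // no concurrent calls
    return new Promise((resolve, reject) => {
        // if the connection is not readable, complete the promise now
        if (conn.err) {
            reject(conn.err);
            return;
        }
        if (conn.ended) {
            resolve(Buffer.from(''));   // EOF
            return;
        }
        // save the promise callbacks
        conn.reader = {resolve: resolve, reject: reject};
        // and resume the 'data' event to fulfill the promise later
        conn.socket.resume();
    });
}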
Let’s try to implement the 'data' event now. Here we have a problem: the 'data' event is
emitted whenever data arrives, but the promise only exists when the program is reading
from the socket. So there must be a way to control when the 'data' event is ready to fire.
Since the 'data' event is paused until we read the socket, the socket should be paused by
default after it is created. There is a flag to do this.
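The flag is presumably the pauseOnConnect option of net.createServer, i.e.
net.createServer({pauseOnConnect: true}). With that assumed, soInit() might look
like this:

function soInit(socket: net.Socket): TCPConn {
    const conn: TCPConn = {
        socket: socket, err: null, ended: false, reader: null,
    };
    socket.on('data', (data: Buffer) => {
        console.assert(conn.reader);
        // pause the 'data' event until the next read
        conn.socket.pause();
        // fulfill the promise of the current read
        conn.reader!.resolve(data);
        conn.reader = null;
    });
    socket.on('end', () => {
        // this also fulfills the current read
        conn.ended = true;
        if (conn.reader) {
            conn.reader.resolve(Buffer.from(''));   // EOF
            conn.reader = null;
        }
    });
    socket.on('error', (err: Error) => {
        // errors are also delivered to the current read
        conn.err = err;
        if (conn.reader) {
            conn.reader.reject(err);
            conn.reader = null;
        }
    });
    return conn;
}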
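The companion soWrite() can be sketched with the completion callback of socket.write():

function soWrite(conn: TCPConn, data: Buffer): Promise<void> {
    console.assert(data.length > 0);
    return new Promise((resolve, reject) => {
        if (conn.err) {
            reject(conn.err);
            return;
        }
        conn.socket.write(data, (err?: Error|null) => {
            if (err) {
                reject(err);
            } else {
                resolve();
            }
        });
    });
}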
There is also the 'drain' event[1] in the Node.js documentation which can be used for this
task. Node.js libraries often give you multiple ways to do the same thing, you can just
choose one and ignore the others.
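The newConn callback now becomes an async function that wraps the connection handling;
a sketch:

async function newConn(socket: net.Socket): Promise<void> {
    console.log('new connection', socket.remoteAddress, socket.remotePort);
    try {
        await serveClient(socket);
    } catch (exc) {
        console.error('exception:', exc);
    } finally {
        socket.destroy();
    }
}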
We also wrapped our code in a try-catch block because the await statement can throw
exceptions when a promise is rejected. In production code you may want to actually handle
errors instead of using a catch-all exception handler.
[1] https://nodejs.org/api/net.html#event-drain
// echo server
async function serveClient(socket: net.Socket): Promise<void> {
    const conn: TCPConn = soInit(socket);
    while (true) {
        const data = await soRead(conn);
        if (data.length === 0) {
            console.log('end connection');
            break;
        }
        console.log('data', data);
        await soWrite(conn, data);
    }
}
The code to use the socket now becomes straightforward. There are no callbacks to interrupt
the application logic.
Note that the newConn async function is not awaited anywhere. It is simply invoked as a
callback of the listening socket. This means that multiple connections are handled concur-
rently.
type TCPListener = {
    socket: net.Socket;
    // ...
};
Our new echo server has a major difference — we now wait for socket.write() to complete.
But what does the "completion of the write" mean? And why do we have to wait for it?

The answer: socket.write() completes when the data is submitted to the OS. But this raises
a new question: why can't the data always be submitted to the OS immediately? This
question actually goes deeper than network programming itself.
Wherever there is asynchronous communication, there are queues or buffers that connect
producers to consumers. Queues and buffers in our physical world are bounded in size and
cannot hold an infinite amount of data. One problem with asynchronous communication is:
what happens when the producer produces faster than the consumer consumes? There
must be a mechanism to prevent the queue or buffer from overflowing. This mechanism
is often called backpressure in network applications.
TCP has such a mechanism built in, known as flow control:

• The consumer's TCP stack stores incoming data in a receive buffer for the application
to consume.
• The amount of data the producer's TCP stack can send is bounded by a window known
to the producer's TCP stack, and it will pause sending data when the window is full.
• The consumer's TCP stack manages the window; when the app drains from the
receive buffer, it moves the window forward and notifies the producer's TCP stack to
resume sending.
The effect of flow control: TCP can pause and resume transmission so that the consumer’s
receive buffer is bounded.
TCP flow control should not be confused with TCP congestion control, which also controls
the window.
This nice mechanism needs to be implemented not only in TCP, but also in applications.
Let’s focus on the producer side. The application produces data and submits it to the OS,
the data goes to the send buffer, and the TCP stack consumes from the send buffer and
transmits the data.
How does the OS prevent the send buffer from overflowing? Simple, the application cannot
write more data when the buffer is full. Now the application is responsible for throttling
itself from overproducing, because the data has to go somewhere, but memory is finite.
If the application is doing blocking IO, the call will block when the send buffer is full, so
backpressure is effortless. However, this is not the case when coding in JS with an event
loop.
We can now answer the question: why wait for writes to complete? Because while the
application is waiting, it cannot produce! socket.write() always "succeeds", even if the
runtime cannot submit more data to the OS due to a full send buffer; the data has to go
somewhere, so it goes into an unbounded internal queue in the runtime, which is a footgun
that can cause unbounded memory usage.
Taking our old echo server as an example, the server is both a producer and a consumer, as
is the client. If the client produces data faster than the client consumes the echoed data (or
the client does not consume any data at all), the server’s memory will grow indefinitely if
the server does not wait for writes to complete.
Backpressure should exist in any system that connects producers to consumers. A rule of
thumb is to look for unbounded queues in software systems, as they are a sign of the lack of
backpressure.
There is another reason to pause the 'data' event. In callback-based code, when the event
handler returns, the runtime can fire the next 'data' event if it is not paused. The problem is
that the completion of the event callback doesn’t mean the completion of the event handling
— the handling can continue with further callbacks. And the interleaved handling can cause
problems, considering that the data is an ordered sequence of bytes!
This situation is called a race condition, and is a class of problems related to concurrency. In
this situation, unwanted concurrency is introduced.
• If you stick to promises and async/await, it’s harder to create the kind of race conditions
described above because things happen in order.
• With callback-based code, it’s not only harder to figure out the order of code execu-
tion, it’s also harder to control the order. In short, callbacks are harder to read and
more error-prone to write.
• Backpressure is naturally present when using the promise-based style. This is similar
to coding with blocking IO (which you can't do in Node.js).
We have learned the basics of the socket API. Let’s move on to the next topic: protocol.
Our protocol consists of messages separated by '\n' (the newline character). The server
reads messages and sends back replies using the same protocol.
• If the client sends 'quit', reply with 'Bye.' and close the connection.
• Otherwise, echo the message back with the prefix 'Echo: '.
client            server
------            ------
msg1\n    ==>
          <==    Echo: msg1\n
msg2\n    ==>
          <==    Echo: msg2\n
quit\n    ==>
          <==    Bye.\n
A Buffer object in Node.JS is a fixed-size chunk of binary data. It cannot grow by appending
data. What we can do is to concatenate 2 buffers into a new buffer.
Each time you append new data by concatenation, the old data is copied. To amortize the
cost of copying, dynamic arrays are used:
// A dynamic-sized buffer
type DynBuf = {
    data: Buffer,
    length: number,
};
The syntax of the copy() method is src.copy(dst, dst_offset, src_offset). This is for ap-
pending data.
Buffer.alloc(cap) creates a new buffer of a given size. This is for resizing the buffer. The
new buffer has to grow exponentially so that the amortized cost is O(1). We’ll use power of
two series for the buffer capacity.
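A sketch of an append function using these primitives:

// append data to DynBuf
function bufPush(buf: DynBuf, data: Buffer): void {
    const newLen = buf.length + data.length;
    if (buf.data.length < newLen) {
        // grow the capacity by powers of two
        let cap = Math.max(buf.data.length, 32);
        while (cap < newLen) {
            cap *= 2;
        }
        const grown = Buffer.alloc(cap);
        buf.data.copy(grown, 0, 0);
        buf.data = grown;
    }
    data.copy(buf.data, buf.length, 0);
    buf.length = newLen;
}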
1. Parse and remove a complete message from the incoming byte stream.
2. Handle the message.
3. Send the response.
// inside serveClient(): the message loop
while (true) {
    // try to get 1 message from the buffer
    const msg = cutMessage(buf);
    if (!msg) {
        // need more data
        const data = await soRead(conn);
        bufPush(buf, data);
        if (data.length === 0) {
            return;     // EOF
        }
        // got some data, try it again.
        continue;
    }
    // omitted. process the message and send the response ...
} // loop for messages
A socket read is not related to any message boundary. What we do is append data to a buffer
until it contains a complete message.
The cutMessage() function tests if the message is complete using the delimiter '\n'.
Then it makes a copy of the message data, because it will be removed from the buffer.
buf.copyWithin(dst, src_start, src_end) copies data within a buffer, source and destination
can overlap, like memmove() in C. This way of handling buffers is not very optimal, more on
this later.
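A sketch of these two functions, following the conventions above:

// parse & remove a complete message from the incoming byte stream, if any
function cutMessage(buf: DynBuf): null|Buffer {
    // messages are separated by '\n'
    const idx = buf.data.subarray(0, buf.length).indexOf('\n');
    if (idx < 0) {
        return null;    // not complete
    }
    // make a copy of the message and move the remaining data to the front
    const msg = Buffer.from(buf.data.subarray(0, idx + 1));
    bufPop(buf, idx + 1);
    return msg;
}

// remove data from the front
function bufPop(buf: DynBuf, len: number): void {
    buf.data.copyWithin(0, len, buf.length);
    buf.length -= len;
}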
while (true) {
    // try to get 1 message from the buffer
    const msg = cutMessage(buf);
    if (!msg) {
        // omitted ...
        continue;
    }
    // omitted: handle the message ...
} // loop for messages
Our message echo server is now complete. Test it with the socat command.
Consider a typical modern web page that involves many scripts and style sheets. It takes
many HTTP requests to load the page, and each request increases the load time by at least
one roundtrip time (RTT). If we can send multiple requests at once, without waiting for
the responses one by one, the load time could be greatly reduced. On the server side, the
server shouldn’t tell the difference because a TCP connection is just a byte stream.
client            server
------            ------
|req1|    ==>
|req2|    ==>
|req3|    ==>     <==   |res1|
...               <==   |res2|
                  ...
This is called pipelined requests. It’s a common way to reduce RTTs in request-response
protocols.
This is why we kept the remaining buffer data around, because there can be more than 1
message in it.
While you can make pipelined requests to many well-implemented network servers, such
as Redis and NGINX, some less common implementations are problematic! Web browsers do
not use pipelined HTTP requests due to buggy servers; they may use multiple concurrent
connections instead.
But if you treat TCP data strictly as a continuous stream of bytes, pipelined messages should
be indistinguishable, because the parser doesn’t depend on the size of the buffered data, it
just consumes elements one by one.
So pipelined messages are a way to check the correctness of a protocol parser. If the server
treats a socket read as a “TCP packet”, it would easily fail.
You can test the pipelined scenario with the following command line.
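For example, sending 2 messages at once (the address and port are assumptions about your setup):

echo -e 'asdf\n1234' | socat tcp:127.0.0.1:1234 -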
The server will probably receive 2 messages in a single 'data' event, and our server handled
them correctly.
Deadlock by Pipelining
A caveat about pipelining requests: pipelining too many requests can lead to deadlocks;
because both the server and the client can be sending at the same time, and if both their
send buffers are full, it’s deadlocked as they are both stuck at sending and cannot drain the
buffer.
There is still O(n²) behavior in our buffer code: whenever we remove a message from the
buffer, we move the remaining data to the front. This can be triggered by pipelining many
small messages.
To fix this, the data movement has to be amortized. This can be done by deferring the data
movement. We can keep the remaining data temporarily in place until the wasted space in
the front reaches a threshold (such as 1/2 capacity).
This fix requires us to keep track of the beginning of the data, so you’ll need a new method
for retrieving the real data. The code is left as an exercise to you.
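One possible sketch, assuming a new start field added to DynBuf:

// DynBuf gains a field marking the beginning of the data:
// type DynBuf = { data: Buffer, length: number, start: number };

// remove data from the front, deferring the copy
function bufPop(buf: DynBuf, len: number): void {
    buf.start += len;
    // amortize the data movement: move only when enough space is wasted
    if (buf.start >= buf.data.length / 2) {
        buf.data.copyWithin(0, buf.start, buf.length);
        buf.length -= buf.start;
        buf.start = 0;
    }
}

// retrieve the real data
function bufData(buf: DynBuf): Buffer {
    return buf.data.subarray(buf.start, buf.length);
}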
Sometimes it’s possible to use a large enough buffer without resizing if the message size in
the protocol is limited to a reasonably small value.
For example, many HTTP implementations have a small limit on header size as there is no
legitimate use case for putting lots of data in the header. For these implementations, one
buffer allocation per connection is enough, and the buffer is also sufficient for reading the
payload if it doesn’t need to store the entire payload in memory.
This is not very relevant for Node.JS apps, but these advantages are desirable in environments
with manual memory management or constrained hardware.
6.2 Content-Length
HTTP semantics are mostly about interpreting header fields, which is described in RFC
9110[1] . Try reading it yourself.
[1] https://www.rfc-editor.org/rfc/rfc9110.html
The most important header fields are Content-Length and Transfer-Encoding, because they
determine the length of an HTTP message, which is the most important function of a
protocol.
Some ancient HTTP/1.0 software doesn't use Content-Length, so the body is just the rest of
the connection data; the parser reads the socket to EOF and that's the body. This is the
second way to determine the body length. It is problematic because you cannot tell
whether the connection was ended prematurely.
The third way is Transfer-Encoding: chunked, which allows the server to send the response
while generating it on the fly. This use case is called streaming. An example is displaying
real-time logs to the client without waiting for the process to finish.
The receiver parses the byte stream into chunks and consumes the data, until the special
chunk is received. Here is a concrete example:

4\r\nHTTP\r\n6\r\nserver\r\n0\r\n\r\n

It consists of 3 chunks:

• 4\r\nHTTP\r\n
• 6\r\nserver\r\n
• 0\r\n\r\n
You can easily guess how this works. Chunks start with the size of the data, and a 0-sized
chunk marks the end of the stream.
There are also special cases, such as the GET and HEAD methods and the 304 (Not Modified)
status code, which make HTTP not easy to implement.
Another ambiguity is the nonexistent payload body of a GET request: what if the
request includes Content-Length? Should the server ignore the field or forbid it? What
about Content-Length: 0?
Also, should the server or client even allow users to mess with the Content-Length and
Transfer-Encoding fields at all? There are many discussions on the Internet, and although the
RFC tried to enumerate the cases[3] , different implementations handle them differently.
An exercise for the reader: If you are designing a new protocol, how do you avoid ambiguities
like this?
[2] https://en.wikipedia.org/wiki/HTTP_request_smuggling
[3] https://www.rfc-editor.org/rfc/rfc9112#section-6.3
The HTTP message format is described in a language called BNF. Go to the “2. Message”
section in RFC 9112 and you will see things like this:
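The relevant rules, from RFC 9112:

HTTP-message = start-line CRLF
               *( field-line CRLF )
               CRLF
               [ message-body ]
start-line   = request-line / status-line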
This says: An HTTP message is either a request message or a response message. A message
starts with either a request line or a status line, followed by multiple header fields, then an
empty line, then the optional payload body. Lines are separated by CRLF, which is the
ASCII string '\r\n'. The BNF language is much more concise and less ambiguous than
English.
The header field name and value are separated by a colon, but the rules for field name and
value are defined in RFC 9110 instead.
field-name    = token
token         = 1*tchar
tchar         = "!" / "#" / "$" / "%" / "&" / "'" / "*"
              / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
              / DIGIT / ALPHA
                ; any VCHAR, except delimiters

OWS           = *( SP / HTAB )
                ; optional whitespace
field-value   = *field-content
field-content = field-vchar
                [ 1*( SP / HTAB / field-vchar ) field-vchar ]
field-vchar   = VCHAR / obs-text
obs-text      = %x80-FF
This is the general rule for field name and value. SP, HTAB, and VCHAR refer to space, tab, and
printable ASCII character, respectively. Some characters are forbidden in header fields,
especially CR and LF.
[4] https://www.rfc-editor.org/rfc/rfc9112.html
Some header fields have additional rules for interpretation, such as comma-separated values
or quoted strings. For now, we can just leave them as they are until we need them.
The HTTP specification is very large, and this chapter only covers the most important bits
of implementing an HTTP server, which we will do in the next chapter.
The 2 most important HTTP methods are GET and POST. Why do we need different HTTP
methods? Besides the obvious fact that a POST request can carry a payload where a GET cannot,
it is also a good idea to separate read-only operations from write operations. You can use GET
for read-only operations and POST for the rest.
The read-only methods are:

• GET.
• HEAD, like GET but without the response body.
• OPTIONS, rarely used, for identifying allowed request methods and CORS related things.
Cacheability
One reason for separating read-only operations from write operations is that read-only
operations are generally cacheable. On the other hand, it makes no sense to cache write
operations as they are state-changing.
However, the rules for cacheability are more complicated than different HTTP methods.
• GET and HEAD are considered to be cacheable methods. But OPTIONS is not, as it is for
special purposes.
• The status code also affects cacheability.
• Cache-Control header can affect cacheability.
• POST is usually not cacheable, unless[5] an obscure header field (Content-Location) is
used and certain cache directives are present.
• Different implementations have different cacheability rules.
[5] https://www.rfc-editor.org/rfc/rfc9110#section-9.3.3-5
Idempotence
But why add CRUD as HTTP methods? A forum user may also move a post to another
forum, should HTTP also include a MOVE method? Mirroring arbitrary English verbs is not a
good reason to define HTTP methods. One of the better reasons is to define the idempotence
of operations.
An idempotent operation is one that can be repeated with the same effect. This means that
you can safely retry the operation until it succeeds. For example, suppose you rm a file over SSH
and the connection breaks before you see the result, so the state of the file is unknown to
you; you can always blindly rm it again (if it's really the same file).
An idempotent operation over HTTP can still result in a different status code, just like the
return code of rm.
Idempotence in HTTP:

• GET and HEAD are safe (read-only), thus idempotent.
• PUT and DELETE are idempotent.
• POST and PATCH are not defined as idempotent.
Idempotence in browsers:
• If you submit a <form> via POST and then refresh the page, the browser will warn you
against resubmitting the potentially non-idempotent form.
• HTML forms are limited to GET and POST[6]. You need AJAX to use the idempotent
methods like PUT and DELETE.
[6] https://softwareengineering.stackexchange.com/a/211790
But this still doesn’t answer the puzzle of why there are so many verbs, because HTTP
could just add 1 more method for idempotent writes instead of 3 (PATCH, PUT, DELETE). In fact,
there may be no strong reason for apps to use them all.
Verb    Safe  Idempotent  Cacheable  <form>  CRUD    Req body  Res body
GET     Yes   Yes         Yes        Yes     read    No        Yes
HEAD    Yes   Yes         Yes        No      read    No        No
POST    No    No          No*        Yes     -       Yes       Yes
PATCH   No    No          No*        No      update  Yes       May
PUT     No    Yes         No         No      create  Yes       May
DELETE  No    Yes         No         No      delete  May       May
HTTP is designed in a way that you can send requests from telnet, so you can learn it by
poking around. However, textual protocols have downsides.
One downside is that human-readable formats are often less machine-readable, because they
are more flexible than necessary. Consider the ways HTTP payload length can be determined:
by the Content-Length field, by the chunked encoding, or by reading to EOF.
HTTP is a simple protocol, where simple means it’s easy to look at. Writing code for it is
not simple because there are too many rules for interpreting it, and the rules still leave you
with ambiguities.
Another downside is that dealing with text is a lot more work. To properly handle text
strings, you need to know their length first, which is often determined by delimiters. The
extra work of looking for delimiters is the cost of human-readable formats.
HTTP/2 is binary and more complex than HTTP/1.1, but parsing the protocol is still easier
because you don’t have to deal with elements of unknown length.
One problem with delimiters is that the data cannot contain the delimiter itself. Failure to
enforce this rule can lead to some injection exploits.
If a malicious client can trick a buggy server into emitting a header field value with CRLF
in it, and the header field is the last field, then the payload body starts with the part of the
field value that the attacker controls. This is called "HTTP response splitting[7]".
A proper HTTP server/client must forbid CRLF in header fields as there is no way to encode
them. However, this is not true for many generic data formats. For example, JSON uses
{}[],: to delimit elements, but a JSON string can contain arbitrary characters, so strings are
quoted to avoid ambiguity with delimiters. But the quotes themselves are also delimiters, so
escape sequences are needed to encode quotes.
This is why you need a JSON library to produce JSON instead of concatenating strings
together. And HTTP is less well defined and more complicated than JSON, so pay attention
to the specifications.
[7] https://en.wikipedia.org/wiki/HTTP_response_splitting
Some examples of formats that use length prefixes instead of delimiters:

• The chunked transfer encoding. Although the length itself is still delimited.
• The WebSocket frame format. No delimiters at all.
• HTTP/2. Frame-based.
• The MessagePack[8] serialization format. Some kind of binary JSON.
[8] https://github.com/msgpack/msgpack/blob/master/spec.md
// an HTTP response
type HTTPRes = {
    code: number,
    headers: Buffer[],
    body: BodyReader,
};
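The corresponding request type presumably looks like this (a sketch consistent with how it
is used later):

// an HTTP request
type HTTPReq = {
    method: string,
    uri: Buffer,
    version: string,
    headers: Buffer[],
};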
We use Buffer instead of string for the URI and header fields. Although HTTP is mostly
plaintext, there is no guarantee that URI and header fields must be ASCII or UTF-8 strings.
So we just leave them as bytes until we need to parse them.
The BodyReader type is the interface for reading data from the body payload.
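A sketch of this interface, based on how it is described in this chapter (a length of -1 marks
an unknown body length):

// an interface for reading data from the HTTP body
type BodyReader = {
    // the "Content-Length", -1 if unknown
    length: number,
    // read data; returns an empty Buffer after EOF
    read: () => Promise<Buffer>,
};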
The payload body can be arbitrarily long, it may not even fit in memory, thus we have
to use the read() function to read from it instead of a simple Buffer. The read() function
follows the convention of the soRead() function — the end of data is signaled by an empty
Buffer.
And when using chunked encoding, the length of the body is not known, which is another
reason why this interface is needed.
The HTTPError is a custom exception type defined by us. It is used to generate an error
response and close the connection. Note that this thing exists only to make our code
simpler by deferring the unhappy case of error handling. You probably don’t want to throw
exceptions around like this in production code.
In theory, there is no limit to the size of the header, but in practice there is. Because we are
going to parse and store the header in memory, and memory is finite.
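So we impose a limit of our choosing, e.g.:

// the maximum length of an HTTP header
const kMaxHeaderLen = 1024 * 8;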
// parse & remove a header from the beginning of the buffer if possible
function cutMessage(buf: DynBuf): null|HTTPReq {
    // the end of the header is marked by '\r\n\r\n'
    const idx = buf.data.subarray(0, buf.length).indexOf('\r\n\r\n');
    if (idx < 0) {
        if (buf.length >= kMaxHeaderLen) {
            throw new HTTPError(413, 'header is too large');
        }
        return null;    // need more data
    }
    // parse & remove the header
    const msg = parseHTTPReq(buf.data.subarray(0, idx + 4));
    bufPop(buf, idx + 4);
    return msg;
}
Parsing is also easier when we have the complete data. That’s another reason why we waited
for the full HTTP header before parsing anything.
The first line is simply 3 pieces separated by space. The rest of the lines are header fields.
Although we’re not trying to parse the header fields here, it’s still a good idea to do some
validations on them.
The splitLines(), parseRequestLine(), and validateHeader() functions are not very interesting,
so we will not show them here. You can easily code them yourself according to RFCs.
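A sketch of the parser using those helpers (splitLines is assumed to include the final empty
line of the header):

// parse an HTTP request header
function parseHTTPReq(data: Buffer): HTTPReq {
    // split the data into lines
    const lines: Buffer[] = splitLines(data);
    // the first line is `METHOD URI VERSION`
    const [method, uri, version] = parseRequestLine(lines[0]);
    // followed by header fields in the format of `Name: value`
    const headers: Buffer[] = [];
    for (let i = 1; i < lines.length - 1; i++) {
        const h = Buffer.from(lines[i]);    // copy
        if (!validateHeader(h)) {
            throw new HTTPError(400, 'bad field');
        }
        headers.push(h);
    }
    // the header ends with an empty line
    console.assert(lines[lines.length - 1].length === 0);
    return {method: method, uri: uri, version: version, headers: headers};
}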
// BodyReader from an HTTP request
// (the head of this function is reconstructed; fieldGet is described below)
function readerFromReq(conn: TCPConn, buf: DynBuf, req: HTTPReq): BodyReader {
    let bodyLen = -1;
    const contentLen = fieldGet(req.headers, 'Content-Length');
    if (contentLen) {
        bodyLen = parseInt(contentLen.toString('latin1'), 10);
        if (isNaN(bodyLen)) {
            throw new HTTPError(400, 'bad Content-Length.');
        }
    }
    const chunked = fieldGet(req.headers, 'Transfer-Encoding')
        ?.equals(Buffer.from('chunked')) || false;
    if (bodyLen >= 0) {
        // "Content-Length" is present
        return readerFromConnLength(conn, buf, bodyLen);
    } else if (chunked) {
        // chunked encoding
        throw new HTTPError(501, 'TODO');
    } else {
        // read the rest of the connection
        throw new HTTPError(501, 'TODO');
    }
}
Here we need to look at the Content-Length field and the Transfer-Encoding field. The
fieldGet() function is for looking up the field value by name. Note that field names are
case-insensitive. The implementation is left to the reader.
We will only implement the case where the Content-Length field is present, the other cases
are left for later chapters.
// BodyReader from a socket with a known length
function readerFromConnLength(conn: TCPConn, buf: DynBuf, remain: number): BodyReader {
    return {
        length: remain,
        read: async (): Promise<Buffer> => {
            if (remain === 0) {
                return Buffer.from('');     // done
            }
            if (buf.length === 0) {
                // try to get some data if there is none
                const data = await soRead(conn);
                bufPush(buf, data);
                if (data.length === 0) {
                    // expect more data!
                    throw new Error('Unexpected EOF from HTTP body');
                }
            }
            // consume data from the buffer
            const consume = Math.min(buf.length, remain);
            remain -= consume;
            const data = Buffer.from(buf.data.subarray(0, consume));
            bufPop(buf, consume);
            return data;
        },
    };
}
The readerFromConnLength() function returns a BodyReader that reads exactly the number of
bytes specified in the Content-Length field. Note that the data from the socket goes into the
buffer first, then we drain data from the buffer. This is because:
• There may be extra data in the buffer before we read from the socket.
• The last read may return more data than we need, so we need to put the extra data
back into the buffer.
The remain variable is a state captured by the read() function to keep track of the remaining
body length.
// a sample request handler
async function handleReq(req: HTTPReq, body: BodyReader): Promise<HTTPRes> {
    // act on the request URI
    let resp: BodyReader;
    switch (req.uri.toString('latin1')) {
    case '/echo':
        // http echo server
        resp = body;
        break;
    default:
        resp = readerFromMemory(Buffer.from('hello world.\n'));
        break;
    }
    return {
        code: 200,
        headers: [Buffer.from('Server: my_first_http_server')],
        body: resp,
    };
}
If the URI is '/echo', we simply set the response payload to the request payload. This
essentially creates an echo server in HTTP. You can test this by POSTing data with the curl
command.
The other sample response is a fixed string 'hello world.\n'. To do this, we must first create
the BodyReader object.
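A sketch of it, following the conventions above:

// BodyReader from in-memory data
function readerFromMemory(data: Buffer): BodyReader {
    let done = false;
    return {
        length: data.length,
        read: async (): Promise<Buffer> => {
            if (done) {
                return Buffer.from('');     // no more data
            } else {
                done = true;
                return data;
            }
        },
    };
}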
The read() function returns the full data on the first call and returns EOF after that. This is
useful for responding with something small that already fits in memory.
After handling the request, we can send the response header and the response body if there is
one. In this chapter, we will only deal with the payload body of known length; the chunked
encoding is left for later chapters. All we need to do is to add the Content-Length field.
The encodeHTTPResp() function encodes a response header into a byte buffer. The message
format is almost identical to the request message, except for the first line.
Encoding is much easier than parsing, so the implementation is left to the reader.
There is still work to be done after sending the response. We can provide some compatibility
for HTTP/1.0 clients by closing the connection immediately, since the connection cannot
be reused anyway.
And most importantly, before continuing the loop to the next request, we must make sure
that the request body is completely consumed, because the handler function may have
ignored the request body and left the parser at the wrong position.
7.2 Testing
The simplest test case is to make requests with curl. The server should greet you with “hello
world”. You can also POST data to the '/echo' path and the server should echo the data
back.
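For example, a test script along these lines (the address, port, and exact socat flags are
assumptions; they vary between setups and versions):

(echo 'GET / HTTP/1.1'; echo 'Host: 127.0.0.1'; echo; sleep 1;
 echo 'GET / HTTP/1.1'; echo 'Host: 127.0.0.1'; echo) | socat tcp:127.0.0.1:1234,crnl -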
Note the crnl option in the socat command, this is to make sure that lines end with CRLF
instead of just LF.
If you remove the sleep 1 in the above script, you will also be testing pipelined requests.
// Bad example!
await soWrite(conn, Buffer.from(`HTTP/1.1 ${msg.code} ${status}\r\n`));
for (const h of msg.headers) {
    await soWrite(conn, h);
    await soWrite(conn, Buffer.from('\r\n'));
}
await soWrite(conn, Buffer.from('\r\n'));
The problem with this is that it generates many small writes, causing TCP to send many
small packets. Not only does each packet have a relatively large space overhead, but more
computation is required to process more packets. People saw this optimization opportunity
and added a feature to the TCP stack known as “Nagle’s algorithm” — the TCP stack delays
transmission to allow the send buffer to accumulate data, so that multiple consecutive small
writes can be combined.
Premature Optimization
However, this is not a good optimization. Many newer network protocol designs, such as
TLS, have put a lot of effort into reducing RTTs because many performance problems are
latency problems. Adding delays to TCP to combine writes now looks like anti-optimization.
And the intended optimization goal can easily be achieved at the application level instead;
applications can simply combine small data themselves without delays.
Nagle’s algorithm is often enabled by default. This can be disabled using the noDelay flag in
Node.js.
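For example, either per socket or (in recent Node.js versions) for all connections:

socket.setNoDelay(true);                            // per socket
const server = net.createServer({noDelay: true});   // for all connections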
Instead of explicitly serializing data into a buffer, as we do with the response header, we
can also add a buffer to the TCPConn type and change the way it works.
In the new scheme, the soWrite() function is changed to append data to an internal buffer
in TCPConn, and the new soFlush() function is used to actually write the data. The buffer
size is limited, and the soWrite() function can also flush the buffer when it is full.
This style of IO is very popular and you may have seen it in other programming languages.
For example, stdio in C has a built-in buffer which is enabled by default, you must use
fflush() when appropriate.
type BufferedWriter = {
    write: (data: Buffer) => Promise<void>,
    flush: () => Promise<void>,
    // ...
};
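A sketch of such a writer over our TCPConn (the function name and capacity are our
choices for illustration):

function createBufferedWriter(conn: TCPConn, capacity = 4096): BufferedWriter {
    const buf: DynBuf = {data: Buffer.alloc(capacity), length: 0};
    const flush = async (): Promise<void> => {
        if (buf.length > 0) {
            await soWrite(conn, Buffer.from(buf.data.subarray(0, buf.length)));
            buf.length = 0;
        }
    };
    const write = async (data: Buffer): Promise<void> => {
        if (buf.length + data.length > buf.data.length) {
            await flush();  // make room first
        }
        if (data.length >= buf.data.length) {
            await soWrite(conn, data);  // too large for the buffer, write directly
        } else {
            data.copy(buf.data, buf.length, 0);
            buf.length += data.length;
        }
    };
    return {write: write, flush: flush};
}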
This is similar to the bufio.Writer in Golang. This scheme is more flexible than adding
buffering to the socket code, because the buffered wrapper is also applicable to other forms
of IO. And the Go standard library was designed with well-defined interfaces (io.Writer[1] ),
making the buffered writer a drop-in replacement for the unbuffered writer.
There are many more good ideas to steal from the Go standard library. One of them is that
the bufio.Writer is not just an io.Writer, but also exposes its internal buffer[2] so that you
can write to it directly! This can eliminate temporary buffers and extra data copies when
serializing data.
[1] https://pkg.go.dev/io#Writer
[2] https://pkg.go.dev/bufio#example-Writer.AvailableBuffer
chunked-body = *chunk
               last-chunk
               trailer-section
               CRLF
chunk        = chunk-size [ chunk-ext ] CRLF
               chunk-data CRLF
chunk-size   = 1*HEXDIG
last-chunk   = 1*("0") [ chunk-ext ] CRLF

4\r\nHTTP\r\n6\r\nserver\r\n0\r\n\r\n
• Each chunk starts with a hexadecimal number that ends with CRLF; this is the length
of the chunk data.
• The following chunk data also ends with CRLF; this CRLF serves no purpose other
than to make the protocol human-readable.
• A zero-length chunk marks the end of the HTTP body. We have seen this convention
often!
The chunk-ext is designed to extend the chunk format by adding additional key-value pairs,
but in practice there is no need for such extensions.
The trailer-section is designed to put some header fields after the HTTP body. You may
wonder why one would do that; it exists for some rare use cases. Some people like to put
application-specific stuff in HTTP header fields, such as the checksum of the data. The
problem is that when streaming data, the checksum is only known after the data has been
streamed, so a mechanism was designed to append some header fields after the last
chunk.
Many HTTP server or client implementations simply ignore these obscure features as they
have little use or value. On the other hand, it is unwise to design your application to use
any obscure features, even if they are backed by RFC, because you are more likely to run
into problems with clients and middleware. We will ignore these in our implementation as
well.
The interface from the last chapter was designed with chunked encoding in mind. What
we need to do is to implement the case where the length is -1 (unknown).
// send an HTTP response through the socket
// (the head of this function is reconstructed)
async function writeHTTPResp(conn: TCPConn, resp: HTTPRes): Promise<void> {
    if (resp.body.length < 0) {
        resp.headers.push(Buffer.from('Transfer-Encoding: chunked'));
    } else {
        resp.headers.push(Buffer.from(`Content-Length: ${resp.body.length}`));
    }
    // write the header
    await soWrite(conn, encodeHTTPResp(resp));
    // write the body
    const crlf = Buffer.from('\r\n');
    for (let last = false; !last; ) {
        let data = await resp.body.read();
        last = (data.length === 0);     // ended?
        if (resp.body.length < 0) {     // chunked encoding
            data = Buffer.concat([
                Buffer.from(data.length.toString(16)), crlf,
                data, crlf,
            ]);
        }
        if (data.length) {
            await soWrite(conn, data);
        }
    }
}
Note that the chunked message is created with Buffer.concat() before we send it to the
socket with a single write. This follows the advice from the last chapter on buffered IO.
8.3 JS Generators
The next step is to connect the application code that generates the response body (producer)
to the BodyReader interface (queue), which the writeHTTPResp() function (consumer) pulls
data from.
// pseudo code!
async function produce_response() {
    // ...
    await output(data1);    // consumed from a BodyReader
    // ...
    await output(data2);    // consumed from a BodyReader
    // ...
}
JS Generators as Producers
JavaScript generators are well suited for producer-consumer problems. The producer
generator uses the yield statement to pass data and control to the consumer. And when
the consumer pulls data again, execution resumes from the last yield statement. Below is a
sample response generator.
// count to 99
async function *countSheep(): BufferGenerator {
    for (let i = 0; i < 100; i++) {
        // sleep 1s, then output the counter
        await new Promise((resolve) => setTimeout(resolve, 1000));
        yield Buffer.from(`${i}\n`);
    }
}
The asterisk after the function keyword is the syntax for JS generators. You can think of
generators as functions with multiple returns (yield). The statement is called yield because
it yields control to the runtime, like a return in a normal function, or an await in an async function.
And there are both normal and async generators.
The AsyncGenerator TypeScript interface is used to specify the type of an async generator; its
3 type parameters are the yield type, the return type, and the type of the argument to next().
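The BufferGenerator type used here is presumably an alias like this:

// yields Buffers, returns nothing, and takes no argument from next()
type BufferGenerator = AsyncGenerator<Buffer, void, void>;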
Let’s add a new URI handler for the response generator. The readerFromGenerator() function
converts a JS generator into a BodyReader.
case '/sheep':
    resp = readerFromGenerator(countSheep());
    break;
To pull data from a generator, use the next() method. The next() method returns when
the generator yields or returns, which is differentiated by the done flag, and the data can be
retrieved from the value member.
The next() method can take an optional argument that is passed to the producer as the
result of the yield statement, which means that JS generators are bi-directional. Although
we don’t need this capability for now.
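A sketch of readerFromGenerator(), following the BodyReader conventions:

// BodyReader from a generator
function readerFromGenerator(gen: BufferGenerator): BodyReader {
    return {
        length: -1,     // unknown length, so chunked encoding is used
        read: async (): Promise<Buffer> => {
            const r = await gen.next();
            if (r.done) {
                return Buffer.from('');     // EOF
            } else {
                console.assert(r.value.length > 0); // an empty Buffer means EOF
                return r.value;
            }
        },
    };
}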
Test the /sheep URI with curl and you should see the counter output every 1s, which means
the chunked encoding is working. There is no limit to the total body length when using
chunked encoding, you can even generate an infinite byte stream.
// decode the chunked encoding and yield the data on the fly
async function* readChunks(conn: TCPConn, buf: DynBuf): BufferGenerator;
The readChunks() generator should behave just like the sample response generator we coded
before. So we can just use readerFromGenerator() to convert it into a BodyReader.
The readerFromConnEOF() function is for compatibility with HTTP/1.0 clients. We’ll skip
the code listing.
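A sketch of readChunks(), under the assumptions of the previous chapters; the line-length
limit, EOF checks, and the trailer section are omitted:

// decode the chunked encoding and yield the data on the fly
async function* readChunks(conn: TCPConn, buf: DynBuf): BufferGenerator {
    for (let last = false; !last; ) {
        // read the chunk-size line
        const idx = buf.data.subarray(0, buf.length).indexOf('\r\n');
        if (idx < 0) {
            bufPush(buf, await soRead(conn));   // need more data
            continue;
        }
        // parse the hexadecimal chunk size and remove the line
        let remain = parseInt(buf.data.subarray(0, idx).toString('latin1'), 16);
        bufPop(buf, idx + 2);
        last = (remain === 0);  // ended by a 0-sized chunk
        // read and yield the chunk data as it arrives
        while (remain > 0) {
            if (buf.length === 0) {
                bufPush(buf, await soRead(conn));
            }
            const consume = Math.min(remain, buf.length);
            const data = Buffer.from(buf.data.subarray(0, consume));
            bufPop(buf, consume);
            remain -= consume;
            yield data;
        }
        // omitted: consume the CRLF that follows the chunk data ...
    }
}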
This is similar to the header-body structure of HTTP itself. You wait for the header (chunk-
size) and find out the length of the body (chunk-data). The header is of variable length, so
we need to limit its size while parsing, just like what we did with the HTTP header.
Decoding the format is easy. What you need to pay attention to is the inner loop for reading
the chunk data. The code doesn't wait for the full chunk data to arrive; instead, it yields the
data whenever it arrives. Remember that the chunked encoding is still supposed to present
the application with a byte stream instead of messages. If we waited for the full chunk data,
we would have to store the whole chunk in memory and impose a maximum chunk size
limit.
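To test this, you can make curl upload from stdin, e.g. (assuming the server listens on
127.0.0.1:1234):

curl -T- http://127.0.0.1:1234/echo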
curl will read data from stdin till EOF, without knowing the request length before-
hand.
Adding print statements is a quick way to inspect what’s happening. However, there is
a cheaper way for network applications: Packet capture tools, such as tcpdump, ngrep, and
Wireshark, can intercept TCP data from the network. This is especially useful in production
where you cannot just edit the code.
0x0010: 7f00 0001 04d2 e9e0 42d3 3926 bb89 becb ........B.9&....
0x0020: 8018 0200 fe35 0000 0101 080a 8f86 a39d .....5..........
0x0030: 8f86 a39c 6865 6c6c 6f20 776f 726c 642e ....hello.world.
0x0040: 0a .
...
If you prefer GUIs, you can open the captured file with Wireshark, which, in addition to
showing the hex dump, can also …
• Filter packets with boolean expressions. This takes effect immediately, so you can do
it by trial and error.
• Highlight individual connections.
• Dissect the protocol, showing and highlighting each element and structure.
ngrep is like grep, but for sockets instead of files, hence its name.
It’s worth noting that all of these tools can write packets to a file in an interchangeable
format (pcap). You can capture packets once and analyze them later.
For example, the server could generate a stream of JSON messages, separated by new lines,
delivered via the chunked encoding. The client only needs to issue a single request, and the
messages can be read by the client without the extra latency introduced by polling.
client              server
------              ------
|header|   ==>
           <==   |header|
...
           <==   |JSON\n|
           <==   |JSON\n|
           <==   |JSON\n|
...
WebSocket is Message-Based
There are multiple motivations behind the design of WebSocket. One of them is the message-
oriented design as opposed to the byte stream. The one-JSON-per-line example sounds
simple enough, but you still have to split the byte stream into lines. The No. 1 mistake in
coding networked applications is the failure to understand the byte stream. WebSocket
comes with a built-in framing format that outputs messages instead of bytes, which saves
programmers from repeating rookie mistakes.
client               server
------               ------
|header|   ==>
           <==   |header|
...                  ...
|message|  ==>
|message|  ==>   <==   |message|
|message|  ==>
           <==   |message|
|message|  ==>
...                  ...
We’ll implement WebSocket in a later chapter, after we have explored other important
uses of HTTP.
We’ll start by implementing a basic file server, which will familiarize us with the File API.
1. Open a file.
2. stat the file to get the metadata such as the size.
3. Read the file data.
4. Close the file.
Like sockets, disk files are represented by opaque handles wrapped in JS objects. Here’s the
simplified TypeScript API definition we’ll use.
interface FileReadResult {
    bytesRead: number;
    buffer: Buffer;
}
interface FileReadOptions {
    buffer?: Buffer;
    offset?: number | null;
    length?: number | null;
    position?: number | null;
}
interface Stats {
    isFile(): boolean;
    isDirectory(): boolean;
    // ...
    size: number;
    // ...
}
interface FileHandle {
    read(options?: FileReadOptions): Promise<FileReadResult>;
    close(): Promise<void>;
    stat(): Promise<Stats>;
}
Our first step is to write the code to serve disk files. We will add a handler to serve files
from the current working directory.
For a production web server, there is additional work to ensure that the URI path is
really contained by the intended directory (URI normalization) and that the file is actually
accessible. We will skip this work.
Like sockets, disk files must be closed manually; the try-finally block is used to ensure this.
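A minimal sketch of the whole flow, assuming the HTTPRes type from the earlier chapters
and a hypothetical resp404() helper; readerFromStaticFile() is implemented next.

import * as fs from 'fs/promises';

async function serveStaticFile(path: string): Promise<HTTPRes> {
    let fp: null | fs.FileHandle = null;
    try {
        fp = await fs.open(path, 'r');          // 1. open the file
        const stat = await fp.stat();           // 2. stat for the size
        if (!stat.isFile()) {
            return resp404();                   // hypothetical helper
        }
        const reader = readerFromStaticFile(fp, stat.size); // 3. read later
        fp = null;      // ownership transferred to the BodyReader
        return {code: 200, headers: [], body: reader};
    } finally {
        await fp?.close();  // 4. close, unless ownership was transferred
    }
}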
There is also the fs.stat(path) function, which takes a path instead of a handle as an
argument. It can stat a file by its path without opening it first. However, it is preferable to
use stat on a file handle. This is because the path can refer to different files. If you check a
path with stat and then open it later, the path may be replaced by another file (by renaming
or deleting), which is a case of a race condition[1] . Opening a file first ensures that you are
working on the same file.
The read() function of the BodyReader is directly wired to the fp.read() method. However,
there are a few gotchas.
• The fp.read() method will automatically create a buffer if it's not supplied, but
surprisingly, it returns the buffer as is, without trimming it to the data size.
• When serving static files, we must make sure that the file we send matches the Content-
Length. A change in file size is unrecoverable; we can only close the connection in
this state.
[1]
https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
If the handler function successfully returns a BodyReader, the file is still open and must be
closed later. We’ll add a close() function to the BodyReader interface to do the cleanup.
The close() member is made optional, so we only need to modify the readerFromStaticFile()
function.
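A sketch of readerFromStaticFile() under these constraints; the size check and the buffer
trimming follow the gotchas above.

function readerFromStaticFile(fp: fs.FileHandle, size: number): BodyReader {
    let got = 0;    // bytes served so far
    return {
        length: size,
        read: async (): Promise<Buffer> => {
            const r = await fp.read();      // allocates a new buffer
            got += r.bytesRead;
            if (got > size || (r.bytesRead === 0 && got < size)) {
                // the file changed while serving it; unrecoverable
                throw new Error('file size mismatch, abandon the connection!');
            }
            // the buffer is returned as is; trim it to the data size
            return r.buffer.subarray(0, r.bytesRead);
        },
        close: async () => await fp.close(),
    };
}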
The code after the handler function call is wrapped in a try-finally block to ensure that the
cleanup function is called.
That’s the most rudimentary file server. Time to stop and test.
The ownership of a resource is the most important concept in manual resource management.
That is, who is responsible for terminating and cleaning up the resource. The owner of a
resource can be either a scope, such as a function that cleans up before it exits, or an object,
which cleans up the resource when the object itself is terminated.
Ownership can be static or dynamic. Dynamic means that the owner changes at run time.
No matter what you do, a resource always has exactly 1 owner.
All of these mechanisms, except Go’s defer, can also be used in a smaller scope than the
entire function.
References to a resource can be classified into owning and non-owning references. You
must maintain exactly 1 owning reference. The owning reference is attached to a scope
or object, which terminates the resource unless the reference is nullified by a transfer to
another owner.
The readerFromStaticFile() function in this chapter can never throw an exception, but if it
did, it should close the file because it owns the file. A problem with the caller code is that
the file reference is not set to null when an exception is thrown, so fp would be closed twice.
A more robust pattern is to use a try-finally block:
try {
    const reader: BodyReader = readerFromStaticFile(fp, size);
    return {code: 200, headers: [], body: reader};
} finally {
    fp = null; // transferred to the BodyReader
}
In the serveClient() function, after the handler returns the response object, there is an
example of the ownership chain.
| serveClient() | ==> | HTTPRes | ==> | BodyReader | ==> | file |
     function           object          object            resource
Study the source code to understand how each owner holds the reference and does the
cleanup.
// count to 99
async function *countSheep(): BufferGenerator {
    try {
        for (let i = 0; i < 100; i++) {
            // sleep 1s, then output the counter
            await new Promise((resolve) => setTimeout(resolve, 1000));
            yield Buffer.from(`${i}\n`);
        }
    } finally {
        console.log('cleanup!');
    }
}
You can test this case with the '/sheep' URI; if the client disconnects midway, the finally
block will not be executed. Fortunately, there is a way to fix this.
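The fix is to extend readerFromGenerator() with the close() function, sketched below: the
generator's return() method[2] resumes the generator as if the pending yield statement were
a return statement, which executes the finally block.

function readerFromGenerator(gen: BufferGenerator): BodyReader {
    return {
        length: -1,
        read: async (): Promise<Buffer> => {
            const r = await gen.next();
            return r.done ? Buffer.from('') : r.value;
        },
        close: async () => {
            // force the generator to return; its finally block runs
            await gen.return(undefined);
        },
    };
}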
There is also the throw() method, which generates an exception at the yield statement.
This will also execute the finally block. You can even use a special type of exception to
communicate to the generator that you intend to abort it midway, so that the generator can
deal with this particular case.
Python generators have similar methods[3] . And surprisingly[4] , the Python garbage collector
automatically executes the finally block for generators, although it is debatable whether
the GC should handle non-memory business.
The readerFromStaticFile() function allocates and returns a new buffer for each read. This
is not very efficient because buffer allocations have costs:
1. The costs from the memory allocator. For many implementations, this cost is constant
in most cases. In the best case, the allocator either returns an object from a free list or
allocates from a large chunk.
2. The cost of initializing with zeros. Scales linearly with buffer size.
Larger buffers are often desirable for potentially higher throughput due to the reduced
number of syscalls. However, allocating oversized buffers has a linear initialization cost,
which can be avoided by using uninitialized buffers in Node.JS.
[2]
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Generator/return
[3]
https://docs.python.org/3/reference/expressions.html#generator.close
[4]
https://docs.python.org/2.5/whatsnew/pep-342.html
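In Node.JS, the uninitialized allocation is Buffer.allocUnsafe():

const zeroed = Buffer.alloc(65536);         // zero-filled, O(n) initialization
const unsafe = Buffer.allocUnsafe(65536);   // uninitialized, may hold old data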
This function is called allocUnsafe because a buggy program has a greater chance of leaking
sensitive internal information to users. Uninitialized data bugs are also harder to debug due
to their unpredictability.
Uninitialized buffers are not easily obtainable in some languages, such as Go or Python.
What you can do is reuse the same buffer for multiple IO operations.
Pooled Buffers
Instead of allocating buffers locally, you can have a global pool of used buffers. Whenever
you are done with a buffer, you put it into that pool. And when you need a buffer, you try
to grab one from the pool first, and only create a new one if the pool is empty.
• Grabbing stuff from the pool can cost even less than the memory allocator.
• Different code paths all benefit from the pool. There are fewer real allocations overall.
An object pool is trivial to code. The Node.JS Buffer type uses a built-in pool[5] , but only
for small buffers. The Go standard library also includes an object pool[6] .
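A minimal pool sketch, assuming all pooled buffers share a single fixed size:

const BUF_SIZE = 65536;
const pool: Buffer[] = [];

function grabBuffer(): Buffer {
    // reuse a pooled buffer if possible; otherwise allocate a new one
    return pool.pop() ?? Buffer.allocUnsafe(BUF_SIZE);
}

function returnBuffer(buf: Buffer): void {
    pool.push(buf);     // done with the buffer; make it reusable
}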
[5]
https://nodejs.org/api/buffer.html#static-method-bufferallocunsafeslowsize
[6]
https://pkg.go.dev/sync#Pool
Let's apply the lesson from the resource management discussion: who owns the buffer? For
the code in this chapter, the buffer returned by the producer (BodyReader) is written to the
socket, and our code waits for the write to complete before invoking the next read(). So the
producer is the only place where the buffer is referenced, the consumer doesn't need to own
the buffer, and the buffer reuse is valid. However, this may not be true for more complicated
consumers!
Another case is when the consumer allocates the buffer. The lifetime of the buffer is clear as
long as the data is processed in place, so the consumer is the sole owner.
// pseudo code!
const buf = Buffer.allocUnsafe(65536); // reused for each read
while (more_data()) {
    const nbytes = await read_data(buf); // read from the producer
    await process(buf.subarray(0, nbytes));
    // `buf` is not used anymore, so it's safe to reuse.
}
In Golang, when pulling data from the io.Reader interface, the buffer is provided by the
consumer. This encourages buffer reuse and discourages unnecessary allocations by design.
range-unit is always bytes, range-set is a comma-separated list of ranges, and there are 2
possible formats for a single range:
1. An inclusive interval like 12-34, which means the range from byte offset 12 to byte
offset 34, inclusive.
2. A negative integer like -12, which means the last 12 bytes.
[1]
https://www.rfc-editor.org/rfc/rfc9110.html
• If the intersection is empty, the server responds with status code 416 (Range Not
Satisfiable).
• If any of the ranges are invalid, the server ignores the header field (and serves the full
file).
Once you start adding features to your HTTP server, you’ll find that most HTTP features
are optional, and optional features can make things more complicated than they seem.
A server may not support range requests at all, and even if it does, they are not applicable to
all requests. For example, it makes no sense to return a ranged response for dynamically
generated content, because the length of the content is not even known in advance, and the
server cannot tell if the request is out of range.
How does a client know if the server supports range requests? That’s the job of the 206
(Partial Content) status code, which indicates a partial response instead of a full one (200
OK).
As seen in the examples above, the effective range may be different from the requested
range, so the server needs to tell the client the exact range returned. That’s the job of the
Content-Range header field. It contains the effective range of the response, along with the
total content length.
This header field is also used for 416 (Range Not Satisfiable) responses to tell the client the
correct length of the content.
If the server supports the HEAD method, it should behave like a GET but without the response
body, returning the response header fields and the status code as if the request were being
fully served. The client can then check for Accept-Ranges or Content-Range without actually
receiving the content.
Remember that the way payload length is determined in HTTP is messy. The HEAD method
adds a special case for clients: The server may return the Content-Length, but it doesn’t tell
the payload length this time; the HEAD method has no response body, regardless of what the
header fields say!
GET / HTTP/1.1
Range: bytes=0-5,8-13
--SEPARATOR1234567
Content-Type: text/html; charset=UTF-8
Content-Range: bytes 0-5/1234
<html>
--SEPARATOR1234567
Content-Type: text/html; charset=UTF-8
Content-Range: bytes 8-13/1234
<head>
--SEPARATOR1234567--
The Content-Length header field still serves the same purpose, which is the length of the
payload (the multipart message), not the sum of the ranges.
Here we learned a new idea to separate data — random string delimiters. The delimiter
should be long enough so that the payload is unlikely to contain it. The maximum length is
70 bytes according to RFC 2046[2] . Still, delimiters are generally not a good idea.
Why not simply concatenate multiple ranges without this message encapsulation? This
could work if the Content-Range header field were able to convey a list of ranges, but it is
defined for a single range only.
While many computer problems are solved by an extra encapsulation, it’s also common to
see encapsulations that do nothing.
[2]
https://datatracker.ietf.org/doc/html/rfc2046
// range-spec   = int-range
//              / suffix-range
//              / other-range
// int-range    = first-pos "-" [ last-pos ]
// suffix-range = "-" suffix-length
type HTTPRange = [number, number|null] | number;
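A hypothetical sketch of parsing a single range-spec into this type, with suffix ranges
stored as negative numbers:

function parseRangeSpec(spec: string): null | HTTPRange {
    const m = /^(\d*)-(\d*)$/.exec(spec);
    if (!m || (!m[1] && !m[2])) {
        return null;                    // neither int-range nor suffix-range
    }
    if (!m[1]) {
        return -parseInt(m[2], 10);     // suffix-range like "-12"
    }
    const first = parseInt(m[1], 10);
    const last = m[2] ? parseInt(m[2], 10) : null;
    return [first, last];               // int-range like "12-34" or "12-"
}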
You can read from the desired file position using the position argument of the fp.read()
method. File IO APIs often include a seek() method to set the read/write position; this is
unnecessary in Node.JS because the position can be specified in the read() method, and the
underlying OS interface is a single syscall or API call in popular operating systems anyway.
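For example, to read from a given offset without seeking (assuming fp is a FileHandle):

// read up to 16 bytes starting at file offset 1024
const r = await fp.read({buffer: Buffer.allocUnsafe(16), position: 1024});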
The rest of the code is omitted. Remember to apply the discussion from the last chapter
when coding this.
try {
    return staticFileResp(req, fp, size);
} finally {
    fp = null; // transferred to staticFileResp()
}
1. Check the Range header field and compute the effective range.
• Return 416 (Range Not Satisfiable) if the requested range does not intersect
with the file.
2. Add the Content-Range header field that contains the effective range.
3. Respond with 206 (Partial Content).
This is the code path for a single-range response. We’ll skip the case for multiple ranges
because there is nothing more to learn.
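A hypothetical sketch of step 1, computing the effective range as a half-open interval;
null signals a 416 response:

function effectiveRange(r: HTTPRange, size: number): null | [number, number] {
    if (typeof r === 'number') {
        // suffix-range: the last -r bytes of the file
        const start = Math.max(0, size + r);
        return start < size ? [start, size] : null;
    }
    const [first, last] = r;
    if (first >= size) {
        return null;                // no intersection with the file
    }
    // last-pos is inclusive and optional
    const end = last === null ? size : Math.min(size, last + 1);
    return first < end ? [first, end] : null;
}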
The BodyReader is still closed regardless, because in our case the BodyReader still owns a
file resource. You can also handle the HEAD method in the request handler and close the file
earlier (and not create the BodyReader in the first place).
Validate by Timestamp
One way to do this is through the Last-Modified and If-Modified-Since header fields. This
works in the following steps:
1. The server returns the filesystem modification time of the file in the Last-Modified
header field.
2. The client caches the response along with the timestamp.
3. The client needs to validate the cached response before reusing it by sending the
request with the If-Modified-Since header field set to the cached timestamp.
• If the file has been modified on the server (indicated by a different timestamp),
the server returns a response as usual.
• If the server doesn’t support this validation method and ignores the header field,
it also returns as usual.
• If the timestamp is the same on the server, the server returns the status code
304 (Not Modified). This status code has no payload body; it just tells the client
to reuse the cache.
4. The client will reuse the cached response if the status code is 304, or use the normal
response and update its cache.
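A hypothetical exchange illustrating these steps:

GET /index.html HTTP/1.1

HTTP/1.1 200 OK
Last-Modified: Sat, 06 Jan 2024 01:23:45 GMT

(Later, the client revalidates its cached copy.)

GET /index.html HTTP/1.1
If-Modified-Since: Sat, 06 Jan 2024 01:23:45 GMT

HTTP/1.1 304 Not Modified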
Validate by ETag
The resolution of the timestamp is only 1 second, which is inadequate for some use cases,
and filesystem timestamps are not a reliable way to identify the content anyway. Another
way to identify the content is the ETag header field. This works the same way as
Last-Modified, except that its value is arbitrary: you can use a hash of the content, or keep
a version number that is updated whenever the content changes.
The ETag scheme will likely require a separate subsystem to keep track of the content,
whether it’s a hash value or a version number, so we won’t code this.
The server uses the Cache-Control header field to advise the client about caching. Its value is
a comma-separated list of directives. For example:
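Cache-Control: max-age=3600, must-revalidate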
The max-age=x directive specifies how long the cached item will live in seconds (time-to-
live, or TTL). However, the cache does not simply delete the item when it expires. It’s
complicated, so complicated that it has its own document: RFC 9111 — HTTP Caching[1] .
Cached items are described as fresh or stale depending on whether they are older than their
max-age. The fresh or stale state determines when the item should be validated. This also
depends on other cache directives, as shown in the table.
As seen from the table, the max-age directive alone doesn't give much control over the
lifetime of the cache, because the table uses the ambiguous words "maybe" and "should".
What's the difference between them?
[1]
https://datatracker.ietf.org/doc/html/rfc9111
A browser will normally reuse a fresh item, but it MAY revalidate the item even if it is
fresh; this can happen when the user hits the reload button.
But why is “Validate stale” a “should”? What’s the point of setting the TTL if the effect is
not even guaranteed? It turns out that there are 2 levels of guarantee.
1. Without must-revalidate, a stale item may still be reused; this can happen when the
server is unreachable.
2. With must-revalidate, the cache TTL is actually respected.
The no-cache directive is confusing: it does not mean "do not cache"; it's a shorthand for
max-age=0, must-revalidate, which means "cache, but always validate". It is used when the
latest data is desired.
Prevent Caching
To prevent caching altogether, use the no-store directive, which instructs the client not to
touch the cache at all. (This also won’t cause the existing cache entry to be deleted.)
The immutable directive is an optimization; it tells the client not to revalidate a fresh item,
even when reloading. This can be used when the URL is immutable, such as when the
resource is retrieved by its hash value. Not all browsers have implemented this, though.
Heuristic Caching
Even with these directives, the client has lots of freedom. Without real use cases, it’s
impossible to implement useful software just from specifications. The freedom of clients
is maximized when the server does not use Cache-Control at all. This is called heuristic
caching[2] in web browsers.
For example, a browser may use the Last-Modified time as a heuristic, using a fraction of the
last-modified-to-now duration as the cache TTL.
[2]
https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching#heuristic_caching
Non-Client Caching
So far, we have assumed that caching is done by a user-facing client, such as a browser. In
practice, a cache can also be provided elsewhere:
• By the server application. This is especially useful for dynamically generated content.
The app stores the generated content for itself to reuse. The goal is to save the latency
and/or the cost of content generation. The ETag scheme can also be easily used to aid
client-side caching, since the app controls the content.
• By a transparent proxy or middleware. The proxy acts as a client of the origin server,
serving requests on behalf of the real client. It may cache responses just like other
clients. This form of proxy can be deployed by the service developer, by the CDN,
or by the ISP (before HTTPS was common).
A shared cache risks serving one user's personalized content to another user. This problem
does not exist for a browser because it represents a single user. For caches shared by multiple
users, there are directives to explicitly mark the response as either public or private. The
private directive is for personalized content that should not be cached by proxies.
Cache-Control: private
There are additional rules for determining whether to cache or not. For example, the
Authorization header field normally prevents a proxy from caching the response. But the
public directive can override this rule and make the content cacheable.
There are many cloud providers that offer a CDN or caching service, some even offer
additional caching controls. Read their documentation to understand how things work in
practice. Some examples:
• Cloudflare[3]
• AWS CloudFront[4]
[3]
https://developers.cloudflare.com/cache/concepts/cache-control/
[4]
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
• Azure CDN[5]
• Google Cloud CDN[6]
[5]
https://learn.microsoft.com/en-us/azure/cdn/cdn-how-caching-works
[6]
https://cloud.google.com/cdn/docs/caching
HTTP compression, like other features, is optional, so the client has to explicitly request
that the server compress the response. This is done via the Accept-Encoding header field,
which contains a comma-separated list of compression methods. Commonly used methods
include:
• gzip
• deflate
• br: Brotli[1], newer but less widely supported.
The method may be followed by an optional weight to indicate its priority. See the RFC
for the exact syntax.
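For example, a client preferring gzip over Brotli might send:

Accept-Encoding: gzip;q=1.0, br;q=0.5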
If the server chooses one of the suggested compression methods, it will report back with
Content-Encoding.
Content-Encoding: gzip
More than one Content-Encoding can be applied. However, there is no practical use for this.
Some content types are already compressed, such as PNG, JPG, or ZIP. It’s best to exclude
them from compression, as they are not compressible.
[1]
https://www.rfc-editor.org/rfc/rfc7932.html
The purpose of Content-Length does not change with compression: it is the length of the
HTTP body (the compressed payload), not the length of the uncompressed data.
When compression is applied on the fly, the compressed size is not known in advance (unless
the entire output is buffered, which does not work for large content), so we need to also
use chunked encoding.
Compression is CPU intensive, so it is not always applied on the fly; a caching proxy can
cache and serve compressed responses (if the Accept-Encoding allows it).
If a caching proxy doesn’t care about compression, it must at least not mix responses from
different compression methods. That’s the purpose of the Vary header field, to inform
proxies that the response will vary if certain header fields are used.
For example, if Vary: Accept-Encoding is used, the cache item will be keyed by the value of
the request's Accept-Encoding, so the cache lookup will take the compression method into
account.
Compressed Uploads
Due to the nature of content negotiation, there is no standard way to compress the request
body. If a client does this with Content-Encoding, it is probably the application, rather than
the HTTP server, that decompresses the request.
Content-Encoding is a property of the content. Therefore, the Range header field works on the
compressed data, which is NOT the desired behavior.
[2]
https://bugs.chromium.org/p/chromium/issues/detail?id=94730
[3]
https://bugzilla.mozilla.org/show_bug.cgi?id=68517
A header field like Transfer-Encoding: gzip, chunked means that the payload is first gzipped
and then chunked. Unlike other obscure HTTP features, compression with Transfer-Encoding
is actually more practical and thus superior, but unfortunately implementations have gone
the other way.
The first type of zlib[4] API compresses a whole buffer in one call:
zlib.gzip(buffer, callback);
The second type produces output on the fly: you first create the stateful compressor object,
then you feed the compressor input, and the compressor produces output simultaneously.
The compressor has backpressure, so it won’t blow up memory if used properly. And you
must drain the output simultaneously to keep the input flowing.
The way to process data on the fly is with stateful processors that can read and write simulta-
neously, which is similar to what we do with a socket.
[4]
https://nodejs.org/api/zlib.html
A pipe automatically connects an output to an input, without the user writing the code to
move data between steps.
Abstract Pipes
We could use a pipe-like abstraction in our HTTP server:
[5]
https://nodejs.org/api/stream.html
Readable and Writable are the 2 basic abstractions. Both Duplex and Transform are a combina-
tion of Readable and Writable; how they differ is not relevant to us at this point.
Some built-in Node.JS modules already implement them; for example, a net.Socket is a
Duplex, and the zlib compressors are Transform streams.
You can also put one or more stream.Duplex between the source and destination.
function pipeline(
    src: stream.Readable,
    transform1: stream.Duplex,
    transform2: stream.Duplex,
    // more ...
    dst: stream.Writable,
): Promise<void>;
The gzipFilter() function is what we’ll implement next, which returns a wrapper of a
BodyReader to compress the data.
// pseudo code!
function gzipFilter(reader: BodyReader): BodyReader {
    const gz: stream.Duplex = zlib.createGzip();
    return {
        length: -1,
        read: async (): Promise<Buffer> => {
            const data = await reader.read();
            await write_input(gz, data);    // deadlock by backpressure!
            return await read_output(gz);   // deadlock by buffering!
        },
    };
}
1. The compressor implements backpressure, which means that if you write a large
piece of input, the code will block until the input is processed and drained. But in the
above pseudo code, the data is drained after the write is completed, which is a case of
deadlock! (We discussed a similar deadlock in the “Pipelined Requests” section.)
2. Most data compression methods inherently require buffering; 1 data input to the
compressor does not result in 1 compressed output. Without enough data, the
compressor won't generate output because it hasn't decided how to compress the
data. Reading from the compressor will get stuck when it is expecting more data!
In a Unix pipe, reading and writing happen in different processes, the pipe is fed and drained
simultaneously, so there are no deadlocks like in the above solution. That’s why I mentioned
that the pipe abstraction is less error-prone to use.
1. A stream.Readable has an internal queue, use the push() method to add data to the
queue, then the data is available for consumption.
2. When should you add data to the queue? You need to implement a read() callback,
which is invoked when the queue is empty and someone is consuming. This is
essential for maintaining backpressure.
3. Use the destroy() method to propagate an error to the consumer.
Note: The push() method uses null to mark the end of the stream, which is different from
what we are using (0-length buffer).
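A hypothetical sketch that adapts our BodyReader to a stream.Readable using these
mechanisms:

import * as stream from 'stream';

function body2stream(body: BodyReader): stream.Readable {
    const self: stream.Readable = new stream.Readable({
        read: () => {
            // invoked when the queue is empty and someone is consuming
            body.read().then((data: Buffer) => {
                // translate our 0-length EOF into the null EOF marker
                self.push(data.length > 0 ? data : null);
            }).catch((err: Error) => self.destroy(err));
        },
    });
    return self;
}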
The pipeline() function will move data for us, and it returns a promise to await on. We
must also catch any exceptions while awaiting the promise, and the caught error must
be propagated to the next stream; otherwise its consumer will hang forever.
In the gzipFilter() function, we cannot wait for the pipe to complete because we must
return a wrapped BodyReader to the caller, so the try-catch block is wrapped in an anonymous
async function call without await. This can be simplified by registering an error handling
callback to the promise.
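A sketch of starting the pipe inside gzipFilter(), assuming the body2stream() helper from
the sketch above:

import * as zlib from 'zlib';
import {pipeline} from 'stream/promises';

const gz: stream.Duplex = zlib.createGzip();
pipeline(body2stream(reader), gz).catch((err: Error) => {
    // propagate the error, otherwise the consumer hangs forever
    gz.destroy(err);
});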
Now that the pipe is running, we can read the output from the compressor. Remember that
a net.Socket is also a stream.Readable; the soRead() function can be used without modification
(except for type annotations) because various socket events such as 'data' and 'end', pause()
and resume() methods, are actually defined on stream.Readable.
But the stream.Readable interface already has a promise-based method for reading. Its
iterator() method returns an AsyncIterator, which you can use in a for await...of loop like
a generator. The AsyncIterator also has a next() method that is similar to a generator.
The returned BodyReader wrapper reads from the compressor using the new method. If not
for learning, we could have skipped the soRead() function.
Step 6: Test It
We’re almost done, let’s enable compression for all responses and test it.
gz.flush(callback);
But since we are using pipeline() instead of explicit IO, there is no place to insert the flush()
call. However, there is an option to make the compressor automatically flush every input.
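The option is the flush mode passed to the compressor; a sketch:

// flush the compressor on every input chunk
const gz = zlib.createGzip({flush: zlib.constants.Z_SYNC_FLUSH});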
Flushing the compressor frequently reduces the effectiveness of the compression. So it’s a
good idea to let applications control whether to flush or not. For example, we don’t need
to flush the compressor when serving static files because the data source is generated as fast
as the system can.
For applications that generate very small chunks of data, it may be better not to compress at
all, because small data likely won’t compress well with frequent flushing.
The way we maintain backpressure is by waiting for the queue or buffer to drain. From the
producer and consumer point of view, such a queue has 2 states: empty, where the consumer
must wait, and full, where the producer must wait.
However, some backpressure implementations are slightly more sophisticated: the queue
is limited by the number of bytes instead of just 1 write. This limit is often called a high
water mark. The queue has 3 states when using a high water mark for backpressure: empty,
where the consumer waits; below the high water mark, where the producer may keep
writing; and at or above it, where the producer should wait.
This is more like Unix pipes: a Unix pipe is a bounded buffer, and as long as it's not full, you
can add more bytes to it; you do not have to wait until it's completely drained.
The reason for using the high water mark instead of the one-item queue is that the queue
can combine multiple small writes. For example, here are some possible optimizations
when moving data from a queue to a socket:
• The queue could be a single buffer; whenever you write to it, the data is appended
to the buffer. This automatically combines multiple small writes, and all the data is
sent to the socket in one go.
• The queue stores individual buffer objects; multiple buffers can be combined into a
single larger buffer before being sent to the socket.
• The queue stores individual buffer objects; the writev()[6] syscall can be used to send
multiple buffers in one go without manually combining them.
All of these optimizations implement the semi-transparent buffering discussed earlier. See
also: stream.Writable._writev[7] .
When using the high water mark for backpressure, check the return value of Readable.push()
and Writable.write():
• They return false when the queue is full, so you can wait for it to drain.
• Otherwise, you can push more data into it.
We didn’t use their return values because we simply treated the high water mark as 0 and
drained the queue after each write.
Stream interfaces support, but do not mandate, backpressure: you can still push data into
them even if the queue is full, which is a footgun for beginners.
[6]
https://linux.die.net/man/2/writev
[7]
https://nodejs.org/api/stream.html#writable_writevchunks-callback
We have already discussed WebSockets in the "Dynamic Content" chapter. From a user's
point of view, a WebSocket is a bidirectional channel for exchanging messages rather than
bytes.
A WebSocket starts with an HTTP header; that's the only thing it has in common with
HTTP. The client uses the Upgrade: websocket header field to indicate that it intends to
create a WebSocket, and the server responds with this header and status code 101 to indicate
that the WebSocket is established. The rest of the connection is then taken over by the
WebSocket protocol.
An example, using the sample key and accept values from RFC 6455[1]:
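GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=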
If a server knows nothing about WebSocket, it will respond like a normal HTTP request.
This is one way to extend an existing protocol. It is also used to switch from HTTP/1.1 to
HTTP/2.
[1]
https://datatracker.ietf.org/doc/html/rfc6455
WebSocket Handshake
The handshake includes the Sec-WebSocket-Key and Sec-WebSocket-Accept header fields. You
may find their stated rationales confusing when you think about them, because the points
are weak:
1. The server already proved itself with the 101 status code.
2. If a non-WebSocket client wants to prevent users from creating WebSockets, it will
likely just forbid the Upgrade header field instead.
3. Proxies are unlikely to cache the 101 status code.
The Connection: Upgrade header field is used to instruct non-transparent proxies to consume
and remove the Upgrade header field so that a non-transparent proxy will not forward
WebSocket handshakes unless it understands the protocol.
This is one of the hop-by-hop[2] header fields. We do not care about this since we’re the
server, not a proxy.
[2]
https://datatracker.ietf.org/doc/html/rfc9110#field.connection
The rest of the connection after the handshake consists of a series of frames.
Note: The 0th bit is the most significant bit (MSB) in a byte.
1. The first byte contains the FIN flag and a 4-bit opcode.
2. The MASK bit in the second byte indicates the optional masking-key.
3. The next field is the payload length, which is encoded as a variable-length integer
starting with a 7-bit integer in the second byte.
The 4-byte random data mask is XORed with the payload data. The mask must be used
for client-to-server frames, but not for the other way around. The purpose of the XOR
mask is to prevent a type of cache poisoning attack[3] .
[3]
http://www.adambarth.com/papers/2011/huang-chen-barth-rescorla-jackson.pdf
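A hypothetical sketch of decoding a frame header from a buffer, assuming the fixed part
and any extended length are already buffered:

function parseFrameHeader(buf: Buffer) {
    const fin = (buf[0] & 0x80) !== 0;          // MSB of byte 0: FIN
    const opcode = buf[0] & 0x0f;               // low 4 bits: opcode
    const masked = (buf[1] & 0x80) !== 0;       // MASK bit
    let len = buf[1] & 0x7f;                    // 7-bit payload length
    let off = 2;
    if (len === 126) {
        len = buf.readUInt16BE(off);            // 16-bit extended length
        off += 2;
    } else if (len === 127) {
        len = Number(buf.readBigUInt64BE(off)); // 64-bit extended length
        off += 8;
    }
    const mask = masked ? buf.subarray(off, off + 4) : null;
    return {fin, opcode, len, mask, headerLen: off + (masked ? 4 : 0)};
}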
Types of Frames
• 0x08: A control message for graceful termination. The optional payload contains a
status code and a string message.
• 0x09 and 0x0A: Ping and pong, or heartbeats. For keeping the connection alive and
probing dead connections. Can be transparent to applications.
Fragmented Messages
WebSocket also supports sending messages without knowing their length in advance, like
the chunked transfer encoding in HTTP. A “logical” data message can be broken into
multiple frames, but the application sees the combined message.
• The first frame has opcode 0x01 or 0x02, while the remaining frames use 0x00.
• The last frame has the FIN flag set.
• Only data messages (opcode 0x01 and 0x02) can be fragmented.
• Frames of different messages cannot be interleaved, except for control messages.
An unfragmented message is encoded as a single frame with opcode 0x01 or 0x02 and the
FIN flag is set.
Like the design of chunked transfer encoding, you can stream arbitrarily large data as a
single message. But many implementations represent the message data as a single buffer, so
there is a limit to the message size.
// pseudo code!
async function send(conn, msg) {
    await write_header(conn, msg);
    await write_body(conn, msg);
}
In this example, writing 1 message involves 2 socket writes, and the code yields to the
runtime at each await statement. The problem is that while control is back at the runtime,
the runtime may schedule another task that is also sending a message, resulting in interleaved
socket writes.
This is called a race condition in concurrent programming. Note that this has nothing to do
with the single-threaded runtime or the number of CPU cores.
Atomic Operations
You may think that the solution is to use only 1 socket write for each message. However,
this solution is …
• Correct only if socket writes are atomic, which may not be guaranteed.
• Not always possible and not convenient. Consider that WebSocket messages can be
fragmented and used like streams.
Although operating systems try[4] to[5] implement concurrent writes as atomic operations,
no sane application will depend on this behavior, and a byte stream is not about messages
anyway.
[4]
https://man7.org/linux/man-pages/man2/write.2.html#BUGS
[5]
http://web.archive.org/web/20120623124411/http://www.almaden.ibm.com/cs/people/marksmith/se
ndmsg.html
// pseudo code!
async function send(mutex, msg) {
    await mutex.lock();
    try {
        // write to the socket
    } finally {
        mutex.unlock();
    }
}
A mutex is “mutual exclusion” because it disallows more than one task from entering the
locked state. In the case of concurrent access, one task holds the lock and the rest block on
the await statement, and releasing the lock will unblock one of the waiters. The mutex is
one of the synchronization primitives in many concurrent programming environments.
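A minimal promise-based mutex sketch (not the book's code) to make the idea concrete:

class Mutex {
    private locked = false;
    private waiters: (() => void)[] = [];

    async lock(): Promise<void> {
        if (this.locked) {
            // block until the current holder releases the lock
            await new Promise<void>((resolve) => this.waiters.push(() => resolve()));
        }
        this.locked = true;
    }

    unlock(): void {
        const next = this.waiters.shift();
        if (next) {
            next();                 // wake one waiter; the lock stays held
        } else {
            this.locked = false;
        }
    }
}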
A mutex is also a form of queue that doesn't pass data; it passes control instead.
A JS array is a queue that you can push and pop. But it is not useful for concurrent
programming. How do you, as a consumer, know when the queue is not empty? There
needs to be a mechanism that allows consumers to wait for producers.
This mechanism can be implemented in the queue! That’s why the above pseudo code
uses await to consume from the queue. This is called a blocking queue because consumers are
blocked when the queue is empty.
Also, backpressure requires that the queue be bounded in capacity. This is achieved by
having the producer block when the queue is full. In contrast to the Stream API in Node.JS,
this removes the footgun of not mandating backpressure.
A blocking queue not only passes data between producers and consumers, it also passes
control, which is similar to a mutex. In fact, while passing data is convenient, passing control
is more fundamental in concurrent programming.
A blocking queue is similar to a Unix pipe in terms of blocking behavior. Except for one
thing: a pipe can be closed!
In concurrent programming, sometimes you need to make tasks quit their jobs. For example,
if a WebSocket is closed while the application task is blocking on it (either reading or
writing), the task should be woken up (to get an EOF or an error) instead of hanging
forever.
If the task is a consumer of a queue, you can put special values into the queue to notify con-
sumers to quit. But for producers waiting on a queue, this is not so easy. Fortunately, we can
learn from Unix pipes — add a close() method to unblock all producers and consumers.
type Queue<T> = {
    pushBack(item: T): Promise<void>;   // throws if closed
    popFront(): Promise<null|T>;        // returns null if closed
    close(): void;
};
We can throw exceptions to producers and return nulls to consumers, mimicking closed
pipes.
A synchronization primitive is something used to block and unblock tasks (pass control).
Some traditional synchronization primitives include:
• Mutex.
• Semaphore.
• Condition variable.
• Event.
Many simple concurrency problems are solved by passing data, which is favored in Go. We
will demonstrate this with our WebSocket implementation.
Let’s consider a blocking queue with just push and pop, and no buffering capacity. The
queue should be usable with multiple producers and multiple consumers, which is more
versatile than generators.
type Queue<T> = {
    pushBack(item: T): Promise<void>;
    popFront(): Promise<T>;
};
Creating a promise results in the resolve callback that is used to fulfill it later. A producer
should either wait for a consumer or wake up a waiting consumer. Consumers just store
their own resolve callback to receive the data.
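A minimal sketch of this design, without buffering capacity and without the close()
method that is added next:

function createQueue<T>(): Queue<T> {
    // waiting producers, each holding an item and a wakeup callback
    const producers: {item: T, resolve: () => void}[] = [];
    // waiting consumers, each holding a callback to receive an item
    const consumers: ((item: T) => void)[] = [];
    return {
        pushBack: (item: T): Promise<void> => {
            const consumer = consumers.shift();
            if (consumer) {
                consumer(item);     // hand the item to a waiting consumer
                return Promise.resolve();
            }
            // no consumer waiting; block until one pops
            return new Promise((resolve) => producers.push({item, resolve: () => resolve()}));
        },
        popFront: (): Promise<T> => {
            const producer = producers.shift();
            if (producer) {
                producer.resolve(); // unblock the waiting producer
                return Promise.resolve(producer.item);
            }
            // no producer waiting; block until one pushes
            return new Promise((resolve) => consumers.push((item: T) => resolve(item)));
        },
    };
}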
type Queue<T> = {
    pushBack(item: T): Promise<void>;   // throws if closed
    popFront(): Promise<null|T>;        // returns null if closed
    close(): void;
};
Since we may throw exceptions at producers, the reject callback is also stored.
We will also remember the closed state. This state should be checked before pushing or
popping. And closing a queue unblocks everyone.
close: () => {
    // unblock any waiting producers or consumers
    closed = true;
    while (producers.length) {
        producers.shift()!.reject(new Error('queue closed.'));
    }
    while (consumers.length) {
        consumers.shift()!(null);
    }
},
While the read() function does not require the whole message to be stored in memory,
it does have a downside: the application must ensure that the message data is consumed,
otherwise the next frame won’t be parsed.
The WSServer is the application interface for sending and receiving data messages.
This is similar to what we do with a socket — read from it and write to it. Except that
concurrent reads and writes have valid use cases and must be supported.
You can also use callbacks to deliver messages instead of actively reading them. But this
makes the backpressure less obvious, because you’ll also need the pause() and resume()
methods. This is why we ditched callback-based IO early on.
These 2 tasks (one reading from the socket, one writing to it) run concurrently with the
application tasks. Two queues are used to connect the 2 tasks to the application (send and
receive). Another queue is used to connect wsServerSend() to the socket.
There can be multiple app tasks concurrently producing or consuming with a single Web-
Socket, but the tasks for reading and writing to the socket must be sequential. This is an
example of solving concurrency problems by passing data.
The finally() callback registered on the promise behaves like a try-finally block. This is
used to close queues so that tasks won’t hang in a queue forever.
Note: If you do not await an async function, it's vital to catch exceptions with a catch()
callback; otherwise your program will crash. In this case, we simply log and ignore the
exception, but sometimes you need to propagate errors.
We’ll use another queue to send frame data to the read() function of a WSMsg.
// create a WSMsg
const q = data = createQueue<Buffer>();
const msg: WSMsg = {
    // omitted ...
    read: async () => await q.popFront() || Buffer.from(''),
};
await qrecv.pushBack(msg);
// omitted. feed payload data into `q` ...
A data message can span multiple frames, so we need to store the queue outside the loop
and feed the queue as more frames arrive. Remember to make sure the queue is closed,
either when the FIN flag is set or when exiting the loop.
            Buffer.from('Connection: Upgrade'),
            Buffer.from(`Sec-WebSocket-Accept: ${wsKeyAccept(key)}`),
        ],
        body: resBody,
    };
}
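For reference, a sketch of wsKeyAccept(): per RFC 6455, the accept value is the base64-
encoded SHA-1 of the client key concatenated with a fixed GUID.

import * as crypto from 'crypto';

function wsKeyAccept(key: string): string {
    const magic = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';  // from RFC 6455
    return crypto.createHash('sha1').update(key + magic).digest('base64');
}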
Use a web browser to create WebSockets for testing. Remember to use packet capturing
tools. Wireshark can disassemble the protocol, so you can use it to …
• Learn what the correct format looks like if you failed to parse the frames.
• See what’s wrong with the frames you’ve generated.
The most important difference is that the browser API is callback-based. The browser
invokes the event handler[6] as messages arrive. This is similar to net.Socket in Node.JS,
except that there are no pause() and resume() methods, so backpressure is impossible for
receiving.
For sending messages, the send() method[7] just buffers the data and returns nothing. There
is a bufferedAmount attribute that reports the buffered size, so applications can still control
backpressure.
There is an alternative API design[8] that uses async/await for reading and writing, which is
the direction we took.
[6]
https://developer.mozilla.org/en-US/docs/Web/API/WebSocket/message_event
[7]
https://developer.mozilla.org/en-US/docs/Web/API/WebSocket/send
[8]
https://developer.chrome.com/docs/capabilities/web-apis/websocketstream
There are messaging protocols that run on top of WebSocket that implement backpressure,
such as RSocket[9] .
Applications that send large chunks of data over WebSockets should have a fragmentation
mechanism to keep the message size small.
• HTTP semantics:
– Generators.
– Streams.
– Blocking queues.
[9]
https://rsocket.io/about/protocol#flow-control