HTTP Messages
HTTP messages are how data is exchanged between a server and a client. There are two types of messages: requests sent by the client to trigger an action on the server, and responses, the answer from the server.
Web developers, or webmasters, rarely craft these textual HTTP messages themselves: software, a Web browser, proxy, or Web server, perform this action. They provide HTTP messages through config files (for proxies or servers), APIs (for browsers), or other interfaces.
HTTP requests, and responses, share similar structure and are composed of:
- A start-line describing the requests to be implemented, or its status of whether successful or a failure. This is always a single line.
- An optional set of HTTP headers specifying the request, or describing the body included in the message.
- A blank line indicating all meta-information for the request has been sent.
- An optional body containing data associated with the request (like content of an HTML form), or the document associated with a response. The presence of the body and its size is specified by the start-line and HTTP headers.
The start-line and HTTP headers of the HTTP message are collectively known as the head of the requests, and the part afterwards that contains its content is known as the body.
HTTP Requests
Request line
Note: The start-line is called the "request-line" in requests.
HTTP requests are messages sent by the client to initiate an action on the server. Their request-line contains three elements:
-
An HTTP method, a verb (like
GET
,PUT
orPOST
) or a noun (likeHEAD
orOPTIONS
), that describes the action to be performed. For example,GET
indicates that a resource should be fetched orPOST
means that data is pushed to the server (creating or modifying a resource, or generating a temporary document to send back). -
The request target, usually a URL, or the absolute path of the protocol, port, and domain are usually characterized by the request context. The format of this request target varies between different HTTP methods. It can be:
- An absolute path, ultimately followed by a
'?'
and query string. This is the most common form, known as the origin form, and is used withGET
,POST
,HEAD
, andOPTIONS
methods.POST / HTTP/1.1
GET /background.png HTTP/1.0
HEAD /test.html?query=alibaba HTTP/1.1
OPTIONS /any-page.html HTTP/1.0
-
A complete URL, known as the absolute form, is mostly used with
GET
when connected to a proxy.GET https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages HTTP/1.1
-
The authority component of a URL, consisting of the domain name and optionally the port (prefixed by a
':'
), is called the authority form. It is only used withCONNECT
when setting up an HTTP tunnel.CONNECT developer.mozilla.org:80 HTTP/1.1
-
The asterisk form, a simple asterisk (
'*'
) is used withOPTIONS
, representing the server as a whole.OPTIONS * HTTP/1.1
- An absolute path, ultimately followed by a
-
The HTTP version, which defines the structure of the remaining message, acting as an indicator of the expected version to use for the response.
Headers
HTTP headers from a request follow the same basic structure of an HTTP header: a case-insensitive string followed by a colon (':'
) and a value whose structure depends upon the header. The whole header, including the value, consists of one single line, which can be quite long.
Many different headers can appear in requests. They can be divided in several groups:
- General headers, like
Via
, apply to the message as a whole. - Request headers, like
User-Agent
orAccept
, modify the request by specifying it further (likeAccept-Language
), by giving context (likeReferer
), or by conditionally restricting it (likeIf-None-Match
). - Representation headers like
Content-Type
that describe the original format of the message data and any encoding applied (only present if the message has a body).
Body
The last part of a request is the body.
Not all requests have one: requests with a GET
HTTP method should only be used to request data and shouldn't contain a body.
Bodies can be broadly divided into two categories:
- Single-resource bodies, consisting of one single file, defined by the two headers:
Content-Type
andContent-Length
. - Multiple-resource bodies, consisting of a multipart body, each containing a different bit of information. This is typically associated with HTML Forms.
HTTP Responses
Status line
Note: The start-line is called the "status line" in responses.
The start line of an HTTP response, called the status line, contains the following information:
- The protocol version, usually
HTTP/1.1
, but can also beHTTP/1.0
. - A status code, indicating success or failure of the request. Common status codes are
200
,404
, or302
. - A status text. A brief, purely informational, textual description of the status code to help a human understand the HTTP message.
A typical status line looks like: HTTP/1.1 404 Not Found
.
Headers
HTTP headers for responses follow the same structure as any other header: a case-insensitive string followed by a colon (':'
) and a value whose structure depends upon the type of the header. The whole header, including its value, presents as a single line.
Many different headers can appear in responses. These can be divided into several groups:
- General headers, like
Via
, apply to the whole message. - Response headers, like
Vary
andAccept-Ranges
, give additional information about the server which doesn't fit in the status line. - Representation headers like
Content-Type
that describe the original format of the message data and any encoding applied (only present if the message has a body).
Body
The last part of a response is the body. Not all responses have one: responses with a status code that sufficiently answers the request without the need to include content (like 201 Created
or 204 No Content
) usually don't.
Bodies can be broadly divided into three categories:
- Single-resource bodies, consisting of a single file of known length, defined by the two headers:
Content-Type
andContent-Length
. - Single-resource bodies, consisting of a single file of unknown length, encoded by chunks with
Transfer-Encoding
set tochunked
. - Multiple-resource bodies, consisting of a multipart body, each containing a different section of information. These are relatively rare.
HTTP/2 Frames
HTTP/1.x messages have a few drawbacks for performance:
- Headers, unlike bodies, are uncompressed.
- Headers are often very similar from one message to the next one, yet still repeated across connections.
- Although HTTP/1.1 has pipelining, it's not activated by default in most browsers, and doesn't allow for multiplexing (i.e. sending requests concurrently). Several connections need opening on the same server to send requests concurrently; and warm TCP connections are more efficient than cold ones.
HTTP/2 introduces an extra step: it divides HTTP/1.x messages into frames which are embedded in a stream. Data and header frames are separated, which allows header compression. Several streams can be combined together, a process called multiplexing, allowing more efficient use of underlying TCP connections.
HTTP frames are now transparent to Web developers. This is an additional step in HTTP/2, between HTTP/1.1 messages and the underlying transport protocol. No changes are needed in the APIs used by Web developers to utilize HTTP frames; when available in both the browser and the server, HTTP/2 is switched on and used.
Conclusion
HTTP messages are the key in using HTTP; their structure is simple, and they are highly extensible. The HTTP/2 framing mechanism adds a new intermediate layer between the HTTP/1.x syntax and the underlying transport protocol, without fundamentally modifying it: building upon proven mechanisms.