Digital Garden
Computer Science
Distributed Systems
The Internet

The Internet

IETF

IETF stands for Interent Engineering Task Force which is task force/organization with the goal to make the internet work better. They define internet standards and regulate the internets architecture with so called RFC specifications (Request for comments). Some of the most well known RFCs are the following

HTTP Protocol

The HTTP protocol stands for Hypertext transfer protocol and is used to access static or dynamic data on another computer and is based on a reliable transport layer protocols like TCP and IP.

httpProtocol

Request

A HTTP request consists of a request line which holds the method of the request (GET, POST etc.), the url of the target and the version of the HTTP protocol to be used followed by a carriage return line feed (CR LF). You then have the headers which are key-value pairs separated by ':' and a CR LF at the end of each one. Last but not least you have the body which is a chunk of bytes

httpRequest

Methods

Idempotent means that multiple identical requests will have the same outcome. So it does not matter if a request is sent once or multiple times. The following HTTP methods are idempotent: GET, HEAD, OPTIONS, TRACE, PUT and DELETE.

httpRequestMethods

Headers

KeyDescriptionExample
Hostspecifies the host and port number of the server to which the request is being sentHost: developer.mozilla.org:8080
Acceptindicates which content types, expressed as MIME types, the client can understandAccept: text/html
Accept-Languageindicates the natural language and locale that the client prefersAccept-Language: de-CH
Accept-Encodingindicates the content encoding (usually a compression algorithm) that the client can understandAccept-Encoding: gzip
User-Agentlets servers and network peers identify the application, operating system, vendor, and/or version of the requesting user agentUser-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0)
Referercontains an absolute or partial address of the page that makes the requestReferer: https://example.com/ (opens in a new tab)
Connectioncontrols whether the network connection stays open after the current transaction finishesConnection: keep-alive, Connection: close
Cookiecontains stored HTTP cookies associated with the serverCookie: PHPSESSID=298zf09hf012fh2;
Content-Lengthindicates the size of the message body, in bytes, sent to the recipientContent-Length: 4
Content-Typeindicates the original media type of the resource (prior to any content encoding applied for sending)Content-Type: text/html; charset=UTF-8

Response

A HTTP reponse is built very similiar to a request but instead of a request line it has a status line which also holds the version of the HTTP protocol to be used followed by a status code and message. httpResponse

Codes

httpReponseCodes

Headers

KeyDescriptionExample
Content-Lengthindicates the size of the message body, in bytes, sent to the recipientContent-Length: 4
Content-Typeindicates the original media type of the resource (prior to any content encoding applied for sending)Content-Type: text/html; charset=UTF-8
Content-Encodinglists any encodings that have been applied to the representation (message payload), and in what orderContent-Encoding: gzip
Locationindicates the URL to redirect a page to. It only provides a meaning when served with a 3xx (redirection) or 201 (created) status responseLocation: /index.html
Datecontains the date and time at which the message originatedDate: Wed, 21 Oct 2015 07:28:00 GMT
Last-Modifiedcontains a date and time when the origin server believes the resource was last modifiedLast-Modified: Wed, 21 Oct 2015 07:28:00 GMT
Expirescontains the date/time after which the response is considered expiredExpires: Wed, 21 Oct 2015 07:28:00 GMT
Serverdescribes the software used by the origin server that handled the requestServer: Apache/2.4.1 (Unix)
Transfer-Encodingspecifies the form of encoding used to safely transfer the payload body to the userTransfer-Encoding: chunked
Cache-Controlcontrol caching in browsers and shared cachesCache-Control: no-cache

MIME Types

The Content-Type or MIME (Multipurpose Internet Mail Extensions) type specifies type of the body, like text/javascript or something else like audio, video, etc. being sent between client and server. MIME types are not limited to HTTP, they are used in many other locations.

Media-Type = type / subtype { “;” parameter }

Types: text / image / audio / video / application / message / multipart

Subtypes that start with x are non standard subtypes.

For example:

  • Media-Type: text/html;charset =ISO 8859 1
  • Media-Type: application/octet stream
  • Media-Type: image/jpeg

Enhancements

Over the years the HTTP protocol has been enhanced and newer versions have been released/specified.

Version 1.1

  • Network connection management

    • Persistent connections were introduced, meaning that several requests can be sent over the same connection
    • Pipelining was introduced, meaning you can send a new request before the previous ones have even been answered
  • Bandwidth optimization

    • Clients can request parts/ranges of documents for example to complete a previously interrupted request
  • Message transmission

    • Trailers were introduced, meaning message headers can be delivered at the end of the body which can also be similar to a checksum
    • Transfer encoding and content length. Clients reading a resposne need to know when they have reached the end. Servers can indicate the end of a message in four ways
      • Implied content length, for example certain response codes like 304 are defined to never have content, so the client can assume the response to terminate with a double CR LF
      • Content-length header, the length of the content is specified in the content-length attribute in bytes.
      • Chunked encoding, the content is broken down into a number of chunks each prefixed by its size in bytes, a zero size chunk then indicated the end of the message. For this to work the server must set the header to transfer-encoding : chunked.
  • Internet address conservation

Version 2.0

  • Binary, packet-based protocol. With the switch to 2.0 all HTTP messages are split and sent in clearly defined frames. This also means that chunked transfer encoding must not be used with HTTP/2.0. This switch provides more mechanisms for data streaming but also allows for more efficiency.
  • Multiple requests can now be sent in parallel over a single TCP connections.
  • HPACK, headers are compressed and cached on the server
  • Servers can push resources together with a requested resources for example a script or css file along with a HTML page.