The primary purpose of a Web server is to deliver a document on request to a client. The document may be text, an image file, or other type of file. The document is identified by a name called a URL (Uniform Resource Locator). If the server stores that particular URL (or can generate content for that URL), then it returns the document as the message reply.
http://pandonia/OS.html
ftp://services.canberra.edu.au/bin/ls
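As a rough illustration of how a URL breaks into the parts a server or client cares about, the sketch below runs Python's standard urllib.parse module over the two example URLs. The helper name describe is made up for this example.

```python
from urllib.parse import urlsplit

# The two example URLs from the text.
urls = [
    "http://pandonia/OS.html",
    "ftp://services.canberra.edu.au/bin/ls",
]

def describe(url):
    """Return (scheme, host, path) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path

for url in urls:
    scheme, host, path = describe(url)
    print(f"scheme={scheme} host={host} path={path}")
```

The scheme tells the client which protocol to speak, the host says which server to contact, and the path is what the server uses to locate or generate the document.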
There are 3 versions of HTTP covered here: 0.9, 1.0 and 1.1.
Each version must understand all earlier versions.
Request = Simple-Request
Simple-Request = "GET" SP Request-URI CRLF
Response = Simple-Response
Simple-Response = [Entity-Body]
HTTP 1.0 added much more information to the requests and responses. Rather than "grow" the 0.9 format, the old format was just left alongside the new one.
Request = Simple-Request | Full-Request
Simple-Request = "GET" SP Request-URI CRLF
Full-Request = Request-Line
    *(General-Header | Request-Header | Entity-Header)
    CRLF
    [Entity-Body]
A Simple-Request is an HTTP/0.9 request and must be replied to by a Simple-Response.
A Request-Line has the format
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
where
Method = "GET" | "HEAD" | "POST" | extension-method
e.g.
GET http://jan.newmarch.name/index.html HTTP/1.0
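The Request-Line grammar can be sketched as a small parser. This is only an illustration under simple assumptions: tokens are separated by exactly one space, and a two-token "GET uri" line is treated as an HTTP/0.9 Simple-Request. The function name parse_request_line is invented for this example.

```python
def parse_request_line(line):
    """Split a request line into (method, uri, version).

    A two-token line ("GET uri") is treated as an HTTP/0.9 Simple-Request
    and reported with the version string "HTTP/0.9".
    """
    tokens = line.rstrip("\r\n").split(" ")
    if len(tokens) == 2 and tokens[0] == "GET":
        return tokens[0], tokens[1], "HTTP/0.9"
    if len(tokens) == 3 and tokens[2].startswith("HTTP/"):
        return tokens[0], tokens[1], tokens[2]
    raise ValueError(f"malformed request line: {line!r}")

# A Full-Request line and a Simple-Request line.
print(parse_request_line("GET http://jan.newmarch.name/index.html HTTP/1.0\r\n"))
print(parse_request_line("GET /index.html\r\n"))
```

A server that sees the two-token form knows it must answer with a Simple-Response, that is, the entity body alone with no status line or headers.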
Response = Simple-Response | Full-Response
Simple-Response = [Entity-Body]
Full-Response = Status-Line
    *(General-Header | Response-Header | Entity-Header)
    CRLF
    [Entity-Body]
The Status-Line gives information about the fate of the request:
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
e.g.
HTTP/1.0 200 OK
The codes are
Status-Code = "200" ; OK
    | "201" ; Created
    | "202" ; Accepted
    | "204" ; No Content
    | "301" ; Moved permanently
    | "302" ; Moved temporarily
    | "304" ; Not modified
    | "400" ; Bad request
    | "401" ; Unauthorised
    | "403" ; Forbidden
    | "404" ; Not found
    | "500" ; Internal server error
    | "501" ; Not implemented
    | "502" ; Bad gateway
    | "503" ; Service unavailable
    | extension-code
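A client typically splits the Status-Line into its three fields and then branches on the code. The sketch below does that, assuming single-space separators and a non-empty Reason-Phrase. Grouping codes by their first digit follows the classes in the HTTP/1.0 specification rather than anything stated in the text, and the function names are made up for illustration.

```python
def parse_status_line(line):
    """Split a Status-Line into (version, code, reason)."""
    # maxsplit=2 keeps reason phrases such as "Not found" in one piece.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

def status_class(code):
    """Classify a status code by its first digit (classes as in RFC 1945)."""
    classes = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")

for raw in ("HTTP/1.0 200 OK\r\n", "HTTP/1.0 404 Not found\r\n"):
    version, code, reason = parse_status_line(raw)
    print(version, code, reason, "->", status_class(code))
```

Classifying by first digit lets a client handle an unrecognised extension-code sensibly, since it can still tell a client error from a server error.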
The Entity-Header contains useful information about the Entity-Body to follow:
Entity-Header = Allow | Content-Encoding | Content-Length | Content-Type | Expires | Last-Modified | extension-header
For example
HTTP/1.1 200 OK
Date: Fri, 29 Aug 2003 00:59:56 GMT
Server: Apache/2.0.40 (Unix)
Accept-Ranges: bytes
Content-Length: 1595
Connection: close
Content-Type: text/html; charset=ISO-8859-1
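To show how the head of such a response might be taken apart, here is a minimal sketch that separates the status line from the headers and stores the headers in a dictionary. It assumes CRLF line endings and one header per line (no continuation lines); parse_head is a made-up name.

```python
# The example response head from the text, with CRLF line endings and the
# empty line that separates the headers from the entity body.
RAW_RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Fri, 29 Aug 2003 00:59:56 GMT\r\n"
    "Server: Apache/2.0.40 (Unix)\r\n"
    "Accept-Ranges: bytes\r\n"
    "Content-Length: 1595\r\n"
    "Connection: close\r\n"
    "Content-Type: text/html; charset=ISO-8859-1\r\n"
    "\r\n"
)

def parse_head(raw):
    """Return (status_line, headers) for the head of a response."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only: values such as dates contain colons.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers

status, headers = parse_head(RAW_RESPONSE_HEAD)
print(status)
print(headers["content-type"], headers["content-length"])
```

Content-Length tells the client how many bytes of entity body to read after the empty line, and Content-Type tells it how to interpret them.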
GET http://www.w3.org/index.html HTTP/1.1
The set of requests has been expanded to
Accept Accept-Charset Accept-Encoding Accept-Language
Accept: audio/*; q=0.2, audio/basic
Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
Accept-Encoding: compress;q=0.5, gzip;q=1.0
Accept-Language: da, en-gb;q=0.8, en;q=0.7
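In the examples above, the q parameter is a relative preference weight between 0 and 1, and an item without a q value counts as 1 (this is the HTTP/1.1 definition, not something spelled out in the text). A server could rank the client's preferences roughly as follows; parse_accept is a hypothetical helper and ignores parameters other than q.

```python
def parse_accept(value):
    """Return [(item, q)] sorted by descending q; q defaults to 1.0."""
    prefs = []
    for part in value.split(","):
        fields = [f.strip() for f in part.split(";")]
        q = 1.0
        for param in fields[1:]:
            key, _, val = param.partition("=")
            if key.strip() == "q":
                q = float(val)
        prefs.append((fields[0], q))
    # sorted() is stable, so items with equal q keep their original order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)

print(parse_accept("da, en-gb;q=0.8, en;q=0.7"))
print(parse_accept("audio/*; q=0.2, audio/basic"))
```

For the Accept-Language example this ranks Danish first, then British English, then any other English.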
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994 ; ANSI C's asctime() format
HTTP-date = rfc1123-date | rfc850-date | asctime-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
rfc850-date = weekday "," SP date2 SP time SP "GMT"
asctime-date = wkday SP date3 SP time SP 4DIGIT
date1 = 2DIGIT SP month SP 4DIGIT ; day month year (e.g., 02 Jun 1982)
date2 = 2DIGIT "-" month "-" 2DIGIT ; day-month-year (e.g., 02-Jun-82)
date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) ; month day (e.g., Jun  2)
time = 2DIGIT ":" 2DIGIT ":" 2DIGIT ; 00:00:00 - 23:59:59
wkday = "Mon" | "Tue" | "Wed" | "Thu" | "Fri" | "Sat" | "Sun"
weekday = "Monday" | "Tuesday" | "Wednesday" | "Thursday" | "Friday" | "Saturday" | "Sunday"
month = "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"
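Since a receiver has to accept all three date formats, one simple approach is to try each in turn. The sketch below does this with datetime.strptime. It assumes an English or C locale (the %a, %A and %b directives are locale-dependent) and treats the asctime form as GMT; parse_http_date is an invented name.

```python
from datetime import datetime, timezone

# strptime patterns for the three formats, in the order of the grammar.
HTTP_DATE_FORMATS = [
    "%a, %d %b %Y %H:%M:%S GMT",  # rfc1123-date
    "%A, %d-%b-%y %H:%M:%S GMT",  # rfc850-date (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # asctime-date
]

def parse_http_date(text):
    """Parse any of the three HTTP date formats into an aware UTC datetime."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognised HTTP date: {text!r}")

for sample in (
    "Sun, 06 Nov 1994 08:49:37 GMT",
    "Sunday, 06-Nov-94 08:49:37 GMT",
    "Sun Nov  6 08:49:37 1994",
):
    print(parse_http_date(sample).isoformat())
```

All three samples describe the same instant, so each should parse to the same datetime value.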
This is not a very secure scheme. All the HTTP messages are sent in plain text. The user ID and password are not encrypted in any way.
"Normal" queries use GET. Strictly, if a request is "idempotent" it should use GET. Idempotent is used here to mean that the client is not asking for a state change in the server, and would expect a repeat request to return the same result (the HTTP specification calls such requests "safe"). This is the norm for static document requests:
GET http://localhost/index.html
GET should also be used for idempotent form requests. Again, these are ones that do not cause any (visible) change of state.
GET http://localhost/cgi-bin/test-cgi?name=jan
Parameters are passed after a '?', in the form vbl=value. Any problematic characters have to be escaped, e.g. a space is written as its ASCII value in hex as '%20' (or '+').
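The escaping rules can be tried out with Python's standard urllib.parse module. In this sketch the parameter value "jan newmarch" and the helper build_get_url are made up for illustration. urlencode writes a space as '+', while quote writes it as '%20'.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

def build_get_url(base, params):
    """Append form parameters to a base URL as a query string."""
    return base + "?" + urlencode(params)

url = build_get_url("http://localhost/cgi-bin/test-cgi", {"name": "jan newmarch"})
print(url)                            # space encoded as '+'
print(quote("jan newmarch"))          # space encoded as '%20'
print(parse_qs(urlsplit(url).query))  # the server side decodes it again
```

Either encoding decodes back to the same value on the server, which is why both forms appear in practice.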
GET URLs can become very long. They can also be a security leak, since the form data is visible in the URL and is often saved in bookmarks, log files, etc.
Note that a GET request that e.g. increases a count of logins to the server is still regarded as idempotent since it is not visible to the client.
Queries may be intended to result in state changes on the server, e.g. uploading a file, confirming a transaction, etc. These queries should use POST, and include form data in the content part of the message.
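To make the contrast with GET concrete, the following sketch builds the text of a POST request in which the form data travels in the message body instead of the URL. It only constructs the message and does not send it. The host, path and field values are placeholders, and build_post_request is a hypothetical helper.

```python
from urllib.parse import urlencode

def build_post_request(host, path, form):
    """Return the text of an HTTP/1.1 POST request carrying form data."""
    body = urlencode(form)  # same vbl=value encoding as a GET query string
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
        "",    # empty line ends the headers
        body,  # form data goes in the content part of the message
    ]
    return "\r\n".join(lines)

message = build_post_request("localhost", "/cgi-bin/test-cgi", {"name": "jan"})
print(message)
```

Because the parameters are no longer part of the URL, they do not end up in bookmarks or in the request line of typical access logs.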
SOAP (see later) is criticised for forcing use of POST even for idempotent queries.