Explaining HTTP Basics

Objectives

After completing this lesson, you will be able to:

  • Explain HTTP basics

HTTP Basics

The Hypertext Transfer Protocol

SOAP is a protocol for exchanging XML information set-based messages over a computer network.

Any transport protocol can be used to send SOAP messages, for example, FTP, SMTP, HTTP, or JMS. In practice, HTTP is usually used due to its compatibility with common network architectures (such as firewalls). Encrypted transmission of SOAP messages is also possible using HTTP(S).

The XML Information Set of the SOAP request is sent as XML to a given URL using HTTP(S) in the body of an HTTP POST request.
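As a minimal sketch of this step, the following Python snippet wraps a SOAP envelope in an HTTP POST request without sending it. The endpoint URL, the `GetStatus` operation, and the `urn:example:demo` namespace are hypothetical placeholders, and the `text/xml` content type and `SOAPAction` header assume SOAP 1.1.

```python
import urllib.request

# Hypothetical endpoint, used for illustration only.
ENDPOINT_URL = "https://example.com/soap/endpoint"

# Hypothetical operation (GetStatus) inside a SOAP 1.1 envelope.
SOAP_ENVELOPE = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetStatus xmlns="urn:example:demo"/>
  </soap:Body>
</soap:Envelope>"""


def build_soap_request(url: str, envelope: str) -> urllib.request.Request:
    """Place the SOAP envelope in the body of an HTTP POST request (nothing is sent)."""
    return urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"urn:example:demo#GetStatus"',
        },
        method="POST",
    )


request = build_soap_request(ENDPOINT_URL, SOAP_ENVELOPE)
print(request.get_method(), request.full_url)
```

Passing the request object to `urllib.request.urlopen` would actually transmit it; that call is left out here because the endpoint does not exist.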

HTTP is a stateless protocol: information from previous requests is lost. Session data can only be carried reliably at the application layer, by means of a session that is identified by a session identifier. Cookies in the header information can, however, be used to implement applications that keep state information (such as user entries or shopping carts).
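The idea of a session identifier carried in a cookie can be sketched as follows. This is only an illustration under simple assumptions: the cookie name `session_id`, the in-memory `SESSIONS` dictionary, and both helper functions are made up for this example.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Optional

# In-memory session store; a real application would keep this elsewhere.
SESSIONS: dict = {}


def start_session() -> str:
    """Create a session and return the value of a Set-Cookie header for it."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"cart": []}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def load_session(cookie_header: str) -> Optional[dict]:
    """Find the session that belongs to the Cookie header of a request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return SESSIONS.get(morsel.value) if morsel else None


# The server sets the cookie; the client echoes the name=value pair back.
set_cookie_value = start_session()
cookie_header = set_cookie_value.split(";")[0]
session = load_session(cookie_header)
session["cart"].append("book")
```

The protocol itself stays stateless: each request only carries the identifier, and the application looks up the stored state.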

HTTP Request - Response Cycle

The communication process essentially consists of an HTTP request, with which the client requests an object from the server, and an HTTP response, which the server sends back to the client after receiving the request. The response contains the information that the client requested. This communication between client and server is based on messages in text format.
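One request-response cycle can be tried out locally with the Python standard library. The sketch below starts a throwaway server on a free local port and sends a single GET request to it; the handler class, the path `/index.html`, and the response text are assumptions made for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a short plain-text body."""

    def do_GET(self):
        body = b"Hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request logging


def run_cycle():
    """Run one request-response cycle and return (status, reason, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/index.html")  # the HTTP request
        response = conn.getresponse()       # the HTTP response
        result = (response.status, response.reason, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, reason, body = run_cycle()
print(status, reason, body)
```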

TCP / IP Reference Model

Communication in computer networks is implemented using network protocols, which in practice are divided into functional layers. For the internet and the internet protocol family, the relevant structure is the TCP/IP reference model, which describes four layers that build on each other. The model is tailored to the internet protocols (mainly TCP and IP), which enable data exchange beyond the boundaries of local networks. It defines neither the access to a transmission medium nor the data transmission technology. Instead, the internet protocols are responsible for forwarding data packets over several point-to-point connections (hops) and, on this basis, for establishing connections between network participants across several hops.

Application Layer

The application layer includes all protocols that interact with the application programs and that use the network infrastructure to exchange application-specific data. The most widely known protocols include HTTP (Hypertext Transfer Protocol); FTP (File Transfer Protocol); SMTP (Simple Mail Transfer Protocol) for sending e-mails; POP3 (Post Office Protocol Version 3) for retrieving e-mails; and DNS (Domain Name System) for conversion between domain names and IP addresses.

Transport Layer

The transport layer is located above the internet layer. It is implemented by the Transmission Control Protocol (TCP). This layer monitors the transfer, requests any missing data packets again, and arranges the packets in the correct sequence. As a rule, the TCP and IP protocols are used together as TCP/IP. Because these protocols are located one above the other, this is often called a TCP/IP stack.

Internet Layer

The internet layer is located above the network access layer. From a software point of view, this layer is implemented by the Internet Protocol (IP). It bundles the data to be transferred into packets and assigns each packet a sender address and a recipient address. The data packets are forwarded to the network access layer below for transfer. On the receiving side, IP receives the data packets from the network and unpacks them. This makes the data transfer more convenient, as entire data packets can be exchanged. IP itself provides no mechanism to determine whether all data packets have arrived or in which sequence; this is the task of the transport layer above.

Network Access Layer

The network access layer is specified in the TCP/IP reference model, but does not contain protocols of the TCP/IP family. Rather, it is to be understood as a placeholder for various techniques for point-to-point data transmission. The internet protocols were developed with the aim of interconnecting different subnets. Therefore, the network access layer can be populated by protocols such as Ethernet, FDDI, PPP (point-to-point connection), or 802.11 (WLAN).
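The division of labor between an application protocol and TCP can be illustrated with a plain TCP socket. In this sketch, the application composes an HTTP message as text and hands it to TCP, which delivers the bytes complete and in order over the local loopback interface. The function name and the sample message are assumptions for the example, and no real HTTP server is involved.

```python
import socket


def tcp_round_trip(message: bytes) -> bytes:
    """Send bytes over a local TCP connection and return what arrives."""
    with socket.create_server(("127.0.0.1", 0)) as listener:  # port 0: any free port
        port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            server_side, _address = listener.accept()
            with server_side:
                client.sendall(message)            # hand the data to TCP
                client.shutdown(socket.SHUT_WR)    # signal the end of the data
                chunks = []
                while chunk := server_side.recv(4096):
                    chunks.append(chunk)           # TCP delivers the bytes in order
    return b"".join(chunks)


# The application-layer message: an HTTP request written as text.
http_request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
received = tcp_round_trip(http_request)
print(received.decode("ascii"))
```

The code never deals with packets, addresses of intermediate hops, or the transmission medium; those concerns are handled by the layers underneath the socket.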

Format of HTTP Messages

Communication between the client and server occurs through the exchange of messages that carry requests and responses. These messages consist mainly of the HTTP header and the actual data. The HTTP header contains control information and the requested URL. The header entries are arranged into four categories: general, request, response, and entity header entries. The general header entries can appear in both requests and responses. Entity headers describe the data part of the message. The entity body of the message contains the actual data that was requested. This could be an HTML document, for example.

Request Line / Status Line

A distinction must be made between the request from the client and the response from the server. In a request, the request line specifies which object (document or program) is to be accessed with which method. It also defines the protocol version to be used for the transfer (HTTP/1.0 or HTTP/1.1). HTTP defines a number of methods, including the well-known GET and POST methods. In a response from the server, the status line specifies the protocol version and the status code for the result of the request. This is followed by text that describes the status code.
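The structure of both lines can be made concrete with a small parser. The following is a simplified sketch; the type and function names are chosen for this example, and no validation beyond splitting on spaces is attempted.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str
    version: str


class StatusLine(NamedTuple):
    version: str
    status_code: int
    reason: str


def parse_request_line(line: str) -> RequestLine:
    """Split a request line into method, requested object, and protocol version."""
    method, target, version = line.strip().split(" ", 2)
    return RequestLine(method, target, version)


def parse_status_line(line: str) -> StatusLine:
    """Split a status line into protocol version, status code, and descriptive text."""
    # The descriptive text may contain spaces or be missing, so split at most twice.
    version, code, *rest = line.strip().split(" ", 2)
    return StatusLine(version, int(code), rest[0] if rest else "")


print(parse_request_line("GET /index.html HTTP/1.1\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```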

General Header

Every transmitted message (request or response) can contain the following general header fields: Cache-Control, Connection, Date, Pragma, Transfer-Encoding, Upgrade, and Via. The Date field contains the date and time at which the message was created.

Request Header / Response Header

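The request header transmits additional information about the request and the client, for example in the fields Accept, Authorization, Host, and User-Agent. The response header transmits additional information about the server. It uses the following fields: Age, Location, Proxy-Authenticate, Public, Retry-After, Server, Vary, Warning, and WWW-Authenticate.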

Entity Header

The entity header transmits information about the length of the document or about the last change made to it. If no entity body is defined, the following fields provide information about the resources without actually sending them in the entity body: Allow, Content-Base, Content-Encoding, Content-Language, Content-Length, Content-Location, Content-MD5, Content-Range, Content-Type, Expires, and Last-Modified.

Entity Body

The entity body is separated from the header by a blank line. The actual data from the message is placed in the body. This can be either the client's user data or the server's response.
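How a complete message breaks down into these parts can be sketched in a few lines of Python. The function below is a simplified illustration: its name and the sample response are assumptions, header names are lowercased for convenient lookup, and repeated header fields are not handled.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP message into start line, header fields, and entity body."""
    # The first blank line (CRLF CRLF) separates the header from the entity body.
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
start_line, headers, body = split_http_message(raw_response)
print(start_line)
print(headers)
print(body)
```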

HTTP Status Codes

The status code provides information about the result of the request. It consists of a three-digit figure with additional, optional text. The first digit of the status code defines the response class. The five existing response classes are as follows:

  • 1xx - Informational

    The response from the server is temporary. The server received the request and is processing it at present. For example: 100 Continue

  • 2xx - Successful

    The request was received, understood, and accepted by the server. For example:

    • 200 OK
    • 202 Accepted
  • 3xx - Redirection

    The request could not be processed in full. The server refers the client to another location that the client must contact in order to process the request successfully. For example: 301 Moved Permanently

  • 4xx - Client Error

    The request could not be processed by the server because of an error on the client side, such as a malformed request or missing authorization. For example:

    • 400 Bad Request
    • 401 Unauthorized
    • 403 Forbidden
  • 5xx - Server Error

    As a result of an error arising on the server, the server was unable to process the request. For example:

    • 500 Internal Server Error
    • 502 Bad Gateway
    • 503 Service Unavailable
  • Unofficial Codes

    There are a few unofficial status codes that are not defined in the HTTP standard. For example:

    • 218 This is fine

    • 103 Checkpoint

The status codes returned most commonly are 200 OK and 404 Not Found.
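Because the first digit determines the response class, a status code can be classified with simple integer division. The sketch below uses hypothetical helper names; the standard description is looked up in Python's `http.HTTPStatus` enumeration, which does not know unofficial codes such as 218.

```python
from http import HTTPStatus

RESPONSE_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def response_class(status_code: int) -> str:
    """Return the response class that the first digit of the code stands for."""
    if not 100 <= status_code <= 599:
        raise ValueError(f"not a valid HTTP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def describe(status_code: int) -> str:
    """Combine the code, its standard text (if any), and its response class."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "(no standard phrase)"  # for example, an unofficial code
    return f"{status_code} {phrase}: {response_class(status_code)}"


for code in (200, 404, 503, 218):
    print(describe(code))
```

A client that receives an unknown code can therefore still react sensibly by falling back to the response class.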

HTTP Methods

HTTP defines methods to indicate the desired action to be performed on the identified resource. The HTTP/1.0 specification defined the GET, HEAD, and POST methods, and the HTTP/1.1 specification added five new methods: OPTIONS, PUT, DELETE, TRACE, and CONNECT. Method names are case sensitive. This is in contrast to HTTP header field names, which are not case sensitive.
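The difference in case sensitivity can be shown in a short sketch. The set of method names and the helper function are assumptions for this example; `email.message.Message` from the Python standard library is used only because it looks up header fields without regard to case.

```python
from email.message import Message

KNOWN_METHODS = {"GET", "HEAD", "POST", "OPTIONS", "PUT", "DELETE", "TRACE", "CONNECT"}


def is_known_method(token: str) -> bool:
    """Method names are compared exactly, so "get" is not the GET method."""
    return token in KNOWN_METHODS


# Header field names, in contrast, match regardless of case.
headers = Message()
headers["Content-Type"] = "text/html"

print(is_known_method("GET"), is_known_method("get"))
print(headers["content-type"], headers["CONTENT-TYPE"])
```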

Important HTTP methods are:

  • GET

    The GET method requests a representation of the specified resource. Requests using GET should only retrieve data and should have no other effect. You can pass data to a server program using parameters. The parameters are appended to the URL and separated from it by a question mark.

  • POST

    The POST method is used to transfer data to a server program. The data is contained in the body of the client request.

  • PUT

    The PUT method tries to store the data transferred in the entity body under the specified URL on the server. An attempt is made on the server to generate a new object, which of course requires the appropriate permissions.

  • DELETE

    This method can be used to delete the data stored under the specified URL if the appropriate permissions are in place. DELETE and PUT are among the riskiest methods: if the server is not configured properly, data on the server can be manipulated.
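The difference between GET and POST in how they transfer data can be sketched with the Python standard library. Both requests below are only built, not sent; the URL and the parameter names are hypothetical, and the form encoding used for the POST body is one common choice among several.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server program, used for illustration only.
BASE_URL = "https://example.com/app/search"


def build_get(params: dict) -> Request:
    """GET: the parameters follow the URL after a question mark; there is no body."""
    return Request(f"{BASE_URL}?{urlencode(params)}", method="GET")


def build_post(params: dict) -> Request:
    """POST: the same data travels in the body of the request instead."""
    return Request(
        BASE_URL,
        data=urlencode(params).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


params = {"q": "http basics", "page": 2}
get_request = build_get(params)
post_request = build_post(params)
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

PUT and DELETE requests can be built the same way by changing the `method` argument; whether the server accepts them depends on its configuration and permissions.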
