At the heart of the World Wide Web lies the Hypertext Transfer Protocol (HTTP). It was created in the early 1990s and is one of the primary drivers of the internet’s success. Put simply, the function of HTTP is to transfer data requested by client computers from web servers across the world.
HTTP has ports reserved for its use. According to the Internet Assigned Numbers Authority (IANA), port 80 is reserved for HTTP traffic over both TCP and UDP, although in practice HTTP has traditionally run over TCP. For years the majority of web traffic was transferred over this port. However, there is another port, 443, also reserved for both TCP and UDP, which is used for secure traffic, specifically SSL (Secure Sockets Layer) traffic and its modern successor TLS (Transport Layer Security). SSL creates an encrypted tunnel using a public/private key exchange before any data is transmitted, and it’s generally considered the secure option for using the internet.
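As a rough illustration of how those reserved ports come into play, here is a minimal Python sketch that works out which port a client would connect to for a given address. The function name effective_port and the example.com addresses are made up for this example; only the port numbers 80 and 443 come from the text above.

```python
from urllib.parse import urlsplit

# Default ports for the two web schemes, as registered with IANA.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for this URL.

    An explicit port in the URL wins; otherwise the scheme's default is used.
    (Hypothetical helper, written only for this illustration.)
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


if __name__ == "__main__":
    for address in (
        "http://example.com/index.html",
        "https://example.com/",
        "http://example.com:8080/",
    ):
        print(address, "->", effective_port(address))
```

This is why you almost never type a port number into a browser: the scheme at the start of the address already implies one.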
When we’re considering how data is transferred over the internet, there’s an important point to remember. When you access a single resource, usually a text page written in HTML (Hypertext Markup Language), this doesn’t mean only one file is transmitted. The HTML in that page will usually reference additional resources from all sorts of locations, and the browser requests them without any explicit direction from the user. For example, an HTML page may load multiple images or videos embedded in the page. These additional files can be in lots of different formats including text, images, sound, video and even computer code.
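To make this more concrete, the following Python sketch scans a small HTML page and lists the extra resources a browser would go on to request. It is a simplified illustration, not how any real browser works: the class and function names, the sample page and the example.com and cdn.example.net addresses are all invented here, and only a handful of tags are checked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of embedded resources from a page (illustrative only)."""

    # (tag, attribute) pairs that usually point at an extra file to fetch.
    RESOURCE_ATTRS = {
        ("img", "src"),
        ("script", "src"),
        ("link", "href"),
        ("video", "src"),
        ("audio", "src"),
        ("source", "src"),
    }

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.RESOURCE_ATTRS:
                # Resolve relative references against the page's own address.
                self.resources.append(urljoin(self.base_url, value))


def embedded_resources(html: str, base_url: str) -> list:
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.resources


SAMPLE_PAGE = """<html><head><link rel="stylesheet" href="/style.css"></head>
<body><img src="logo.png">
<script src="https://cdn.example.net/app.js"></script></body></html>"""

if __name__ == "__main__":
    for url in embedded_resources(SAMPLE_PAGE, "http://example.com/news/index.html"):
        print(url)
```

Note that the third resource in the sample lives on a completely different server from the page itself, which is exactly the situation described above.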
The HTTP Transaction
Typically on the web, the HTTP client is a web browser running on a computer somewhere. However, as the web has developed, it is just as likely to be running on a smartphone, tablet or other mobile device. In many senses this is irrelevant: whatever the device, the client sends requests to the web server.
The request will normally come from the user entering an address, clicking on a link or otherwise accessing a specific web site. The browser then builds a request for this data and sends it to the web server referenced in the link. The web server receives the request, processes it and responds with the data. This matters because, as mentioned previously, many of the resources a page needs may exist on other servers or in remote locations, so each of them is fetched with its own request and response.
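The request and response described here are, at the HTTP/1.1 level, just lines of text. The Python sketch below builds a bare-bones GET request and picks apart the first line of a response. The helper names, the example.com host and the sample response are assumptions made for illustration; a real browser sends many more headers than this.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, resource, version
        f"Host: {host}",          # which site on the server we want
        "Connection: close",      # ask the server to close when finished
        "",                       # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(raw_response: bytes):
    """Split the first line of a response into version, status code and reason."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("ascii")
    parts = status_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason


if __name__ == "__main__":
    print(build_get_request("example.com", "/index.html").decode("ascii"))
    # A made-up response, standing in for what a server might send back.
    sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
    print(parse_status_line(sample))
```

Because everything here is readable text, anyone who can see the traffic can read it too, unless the connection is wrapped in SSL/TLS first.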
The type of web browser running on the client is in many respects irrelevant, although the majority of users will use Chrome, IE or Firefox. There are plenty of others, and there are even text-based browsers such as Lynx, although these are not widely used. To the web server it’s not important: the HTTP daemon, which sits listening on the specified TCP/IP port, will handle all requests in exactly the same way.
Incidentally, there are different web servers too, although the field is dominated by two main players: Apache (the Apache Software Foundation’s HTTP Server) and IIS (Microsoft Internet Information Services). These two are by far the most likely to be running any web site that you visit.
Which one is used depends largely on the operating system running on the web server. IIS is created and supported by Microsoft and is only available for Windows-based systems. Apache, however, can be installed on most popular operating systems including Linux, Unix and Windows. Because the source code for Apache is freely available, it can also be built for other platforms by anyone with sufficient knowledge.
What you should be aware of, though, is that the majority of non-SSL traffic is transmitted in clear text. Since the internet basically sits on a network of shared hardware, it’s not hard for people to intercept and read what you’re doing online. It’s one of the reasons why security programs like secure VPNs, which encrypt your whole internet connection (not just web browsing) and hide your location and identity too, are becoming so popular.
The Other Important Stage – Proxies
In many communications there is another step in the connection between client and server: the web proxy server. These proxies are often deployed in networks to handle all web traffic, effectively adding a barrier between client and server. In other instances they are used by the client in order to provide some privacy or control over the internet connection. Millions use them to hide their IP address or gain access to geo-restricted content by changing their digital location. In both instances, though, the web proxy receives requests from the client, forwards them to the web server and then relays the server’s responses back to the client.
There are numerous types of ‘proxies’ used for all sorts of purposes. Indeed, some web proxies are only used for network efficiency, acting as ‘caching proxies’ that store copies of data in order to minimize repeated requests to popular web sites.
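The caching idea can be sketched in a few lines of Python. This is a toy model under heavy assumptions: the CachingProxy class and its fetch_from_origin callback are invented for this example, nothing is sent over a network, and a real caching proxy would also respect expiry times and cache-related headers, which are ignored here.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy caching proxy: remembers responses so repeat requests skip the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        # fetch_from_origin stands in for "forward the request to the web server".
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            # Cache miss: forward to the origin server and keep a copy.
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        # Cache hit (or freshly stored copy): answer from the proxy itself.
        return self._cache[url]


def fake_origin(url: str) -> bytes:
    """Stand-in for a real web server, used only for this demonstration."""
    return b"body of " + url.encode("ascii")


if __name__ == "__main__":
    proxy = CachingProxy(fake_origin)
    for _ in range(3):
        proxy.get("http://example.com/popular-page")
    print("requests that reached the origin:", proxy.origin_requests)
```

Three clients asking for the same page cost the origin server only one request, which is the whole point of deploying a cache in front of popular sites.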
Some Important HTTP Security Issues
In the early days of the internet, security was never really considered that important. When people were mostly accessing open resources on publicly owned servers, there was never really an issue. This has, of course, changed significantly in the last couple of decades. Most of us now routinely conduct financial transactions online, often involving significant sums. From buying stuff on eBay to buying our car insurance, our web transactions now routinely include sensitive financial data too.
HTTP was simply not designed for security, merely for practicality. In fact, basic HTTP has no security whatsoever, and any secure transactions over this protocol rely on SSL tunnels wrapped around HTTP, a combination more commonly known as HTTPS. The security is not embedded in the protocol itself, but has merely been bolted on to provide some protection for the millions of financial and private transactions that take place online every day.
There are many issues, though, and these can often be leveraged by computer-savvy criminals with access to your data. Flaws in how HTTP transactions are handled are a common source of compromise, for example where web servers fail to check the lengths of the data they are sent. These are often referred to as buffer overflow attacks, and they can allow an attacker to execute commands with the privileges of the HTTP daemon.
If you’re interested in how these work, then one of the first famous exploits of this type is a good example. It’s called the Apache Chunked Encoding Vulnerability, from way back in 2002.
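To give a feel for the kind of length check involved, here is a small Python sketch that validates the size line of a chunk in HTTP’s chunked transfer encoding before any data would be read. It is not Apache’s code or its fix: the function name and the 1 MiB limit are assumptions chosen for this example, and Python itself is not vulnerable to buffer overflows in the way a C program is. The point is simply that a declared length should be checked before it is trusted.

```python
MAX_CHUNK_SIZE = 1 << 20  # hypothetical upper limit: 1 MiB per chunk
HEX_DIGITS = b"0123456789abcdefABCDEF"


def parse_chunk_size(size_line: bytes, limit: int = MAX_CHUNK_SIZE) -> int:
    """Parse and validate a chunk-size line such as b"1a\\r\\n".

    Raises ValueError instead of trusting a malformed or oversized length.
    """
    # Anything after ';' is a chunk extension and is ignored here.
    token = size_line.split(b";", 1)[0].strip()
    if not token or len(token) > 16:
        raise ValueError("missing or overlong chunk size")
    if any(byte not in HEX_DIGITS for byte in token):
        # Rejects signs, spaces and other non-hexadecimal characters.
        raise ValueError("chunk size is not a plain hexadecimal number")
    size = int(token.decode("ascii"), 16)
    if size > limit:
        raise ValueError("chunk size exceeds the configured limit")
    return size


if __name__ == "__main__":
    print(parse_chunk_size(b"1a\r\n"))
    for bad in (b"ffffffff\r\n", b"-1\r\n"):
        try:
            parse_chunk_size(bad)
        except ValueError as error:
            print(bad, "rejected:", error)
```

A server that skips checks like these ends up copying however much data the client claims to be sending, and that is where the trouble starts.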