"High Performance Web Sites" by Steve Souders
- A review

- Damodar Chetty

- Jan 21, 2008

 

Over the weekend, I had a chance to read Steve Souders's book on performance improvements for web applications. As an interesting departure from all the tomes that you see on tweaking server-side application architectures, JVM configurations, application server parameters, and Java API usage, this book focuses on that thin slice of the presentation tier that lives on the client.

The upshot is a book that is almost technology agnostic - it can be read by any web developer, whether you develop in ASP.NET, PHP, or even J2EE. Note, however, that the web server configuration tips are Apache-centric.

For those who can't wait to hear what I think (I'm sure there's at least one of you out there), I can safely categorize this as one of my most eye-opening reads of the past 12 months, and a must-read for any web application developer. Over the next few paragraphs, I'll try to explain why.

Making the Case for Front-End Engineering

The case for front-end engineering is made very forcefully by an interesting graphic (one that should be familiar to those who use Firebug's Net tab in Firefox). This graphic shows a timeline of the entire HTTP traffic spawned by what, to the user, is a single request for a page. When the requested page is retrieved and parsed, additional requests may be spawned to retrieve the components required by that page. A typical page will reference one or more images, stylesheets, and JavaScript files. Each of these components must be requested separately by the browser, and this is what causes the flurry of HTTP traffic.

An interesting statistic is that it often takes longer to download these ancillary components than it does to generate and return the original page requested. His empirical data shows that the main request comprises only 10% of the actual traffic.

This has the significant implication that even minor tweaks to the handling of these ancillary components can result in much larger improvements in end-user response time. It also implies that the traditional approach to performance tuning - which targets server-side application code - is probably not the most cost-effective approach.

In other words, it is crucial that you determine your own ratio of (time required to download the main page / time required for the complete response) for your web site. The lower this number, the more likely you are to find improvements by following the rules in this book. Of course, the higher this number, the more likely it is that you are bound by your server-side application code.
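
To make the arithmetic concrete (the numbers below are invented purely for illustration), suppose a Firebug-style waterfall chart shows the main HTML document arriving in 200 ms while the fully loaded page takes 2 seconds:

    // Hypothetical timings read off a waterfall chart (in milliseconds).
    var mainPageTime = 200;            // time to retrieve the HTML document itself
    var completeResponseTime = 2000;   // time until the page and all of its components have loaded

    var ratio = mainPageTime / completeResponseTime;   // 0.1

    // A ratio this low (10%) says the ancillary components dominate the response
    // time, which is exactly where the rules in this book can help.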

But regardless of what this number turns out to be, it's a wonderful measure that tells you something you probably should know about your web site.

 

However, a more compelling argument is that changes to the front end usually don't require as many resources as projects that involve rearchitecting the back-end software and/or hardware.

The Rules

A quick perusal of the table of contents shows that this book is organized around 14 rules that identify the best practices to be employed. I've summarized the first 10 of these rules below.

Rule 1: Make Fewer HTTP Requests

If a single page request spawns an HTTP request for each web component on that page, then logic dictates that the fewer components a page has, the faster it will load.

Simple techniques to do this without adversely affecting the user experience are to combine individual graphics into a single image, and then use either an image map or a CSS sprite to delineate the different sections of that image; or to combine separate CSS (or JavaScript) files into a single master file.
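
As a rough sketch of the second technique (the file names and the use of a Node.js build step are my own assumptions, not the book's), a trivial script can concatenate the individual files into one master file at build time:

    // combine.js - naive build step that concatenates several scripts into one.
    // The input file names are hypothetical; run with: node combine.js
    var fs = require('fs');

    var sources = ['menu.js', 'validation.js', 'analytics.js'];
    var combined = sources
      .map(function (name) { return fs.readFileSync(name, 'utf8'); })
      .join('\n;\n');   // defensive semicolon between files

    fs.writeFileSync('site-combined.js', combined);
    // The page now references a single script, site-combined.js,
    // instead of issuing three separate HTTP requests.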

 

Rule 2: Use a Content Delivery Network

This next rule addresses performance by moving fairly static web components (e.g., images, stylesheets, and scripts) closer to the end user, by leveraging Content Delivery Networks run by third parties such as Akamai to host your components. Note that this does not cover your application components - moving those closer to the user can turn into a major re-engineering exercise.

Rule 3: Add an Expires Header

This rule reaches out across the wire to make the client's cache work for you. The goal is to reduce the number of web component requests, after the first request, by having the client cache them locally.

You do this by marking components that change infrequently (in particular, images) as cacheable, and by setting their lifetime in the cache to a date far in the future. "Far" is a relative term, and depends on the item being cached. For a company logo, you might set this in terms of years, whereas hot topic images might expire in a day.

You can do this using either the HTTP/1.0 Expires header or the HTTP/1.1 Cache-Control header with the max-age directive - or, preferably, both. The advantage of the latter is that you can specify a relative cache lifetime (in seconds), rather than having to compute absolute dates and worry about clock synchronization.
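
Here is a minimal sketch of what setting both headers might look like if you did it in application code (a Node.js-style handler is assumed purely for illustration; in practice you would more likely let Apache's mod_expires do this for you):

    // Mark a static component as cacheable for one year.
    var http = require('http');

    http.createServer(function (req, res) {
      var oneYear = 365 * 24 * 60 * 60;   // seconds

      // HTTP/1.1: a relative lifetime, so clock skew doesn't matter.
      res.setHeader('Cache-Control', 'max-age=' + oneYear);

      // HTTP/1.0 fallback: an absolute, far-future expiration date.
      res.setHeader('Expires', new Date(Date.now() + oneYear * 1000).toUTCString());

      res.end('...component bytes...');
    }).listen(8080);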

Note that caching is not foolproof: users are known to clear their caches; full caches are automatically pruned to make space for new entries; and the user may not visit the page frequently enough to benefit from the cache.

There is a development cost to be paid for this rule, though: how to handle the case where a component has indeed changed since it was cached. A workable approach is to add a revision number to that component's file name. When a page requests the component using the new file name, the cached version no longer matches, and a fresh request is issued.
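
A tiny sketch of that renaming idea (the helper, path, and revision number are all hypothetical):

    // Embed a revision number in the component's file name, so a new release
    // naturally bypasses any previously cached copy.
    var REVISION = 118;   // bump whenever the component actually changes

    function versionedUrl(baseName, ext) {
      return '/static/' + baseName + '-r' + REVISION + '.' + ext;
    }

    // versionedUrl('site-combined', 'js') === '/static/site-combined-r118.js'
    // The stale '...-r117.js' cache entry is simply never requested again,
    // so no explicit cache invalidation is needed.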

Rule 4: Gzip Components

This rule reduces network traffic by compressing each component before it is transmitted over the wire, trading bytes on the network against the server CPU overhead incurred by the compression.

Servers must strive to gzip all text responses, including HTML documents, scripts, and stylesheets. PDF files and images are already compressed, and so do not benefit from additional compression.

When the server detects that a browser supports compression (via its Accept-Encoding request header), and the file matches the configured filter (e.g., by size and type), it compresses the file and sends back a Content-Encoding response header to indicate the type of compression used.
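
A bare-bones sketch of that negotiation (Node.js and its zlib module are my assumptions here; on a real site Apache's mod_deflate or mod_gzip would normally handle this):

    var http = require('http');
    var zlib = require('zlib');

    http.createServer(function (req, res) {
      var body = '<html>...a large, text-heavy page...</html>';
      var acceptEncoding = req.headers['accept-encoding'] || '';

      res.setHeader('Content-Type', 'text/html');

      if (acceptEncoding.indexOf('gzip') !== -1) {
        // The client advertised gzip support, so compress and say so.
        res.setHeader('Content-Encoding', 'gzip');
        res.end(zlib.gzipSync(body));
      } else {
        res.end(body);
      }
    }).listen(8080);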

As with the other rules, there are caveats galore - including proxy caching and browser incompatibilities - that need to be considered.

Rule 5: Put Stylesheets at the Top

The progress indicator of the web world is the incremental rendering of the page itself. The components of a page are painted in as they are downloaded, giving an indication that the page is alive and well - this is known as "progressive rendering".

Unfortunately, placing stylesheets at the bottom of a document prevents Internet Explorer from rendering any content until the stylesheet is downloaded (aka Blank White Screen). This is "by design", and is intended to avoid having to re-render elements if their styles change (aka Flash of Unstyled Content).

The interesting aspect here is that this does not delay the downloading of the components that come before the stylesheet. It simply suppresses visual cues to the user, giving the impression of slower performance.

Rule 6: Put Scripts at the Bottom

Unlike stylesheets, where progressive rendering is blocked until all stylesheets have been downloaded, scripts block progressive rendering only for content below the script. Hence you should move them as low in the page as possible.

Interestingly, parallel downloading is disabled while a script is being downloaded, even across different hostnames. All other components must wait until the script has been completely retrieved. This implies that placing scripts at the top of the page will (a) force all other downloads to wait, and (b) block progressive rendering for the entire page until the script is loaded. Placing scripts at the bottom of the page therefore results in better performance, both actual and perceived.

Rule 7: Avoid CSS Expressions

I haven't seen CSS expressions used much in the code bases I've worked with, and the example provided wasn't very compelling. However, Steve sounds the alarm on just how often such expressions are evaluated.

 

Rule 8: Make JavaScript and CSS External

Inlining JavaScript and stylesheets tends to be faster than using external files, since no additional HTTP requests are needed to retrieve them from the server. However, external files can be cached by the browser, reducing the number of requests needed on subsequent pages. The more often a visitor accesses your site in a session, or in a given period, the more likely they are to benefit from cached external components.

Ideally, we should inline scripts and CSS for the home page, but use external files for the secondary pages. You can achieve this by using the onload event in the home page: once the home page has completely loaded, you dynamically download the external components used by the secondary pages into the browser's cache. Because these scripts and styles are also loaded inline in the home page, your code must deal with double definitions in order for this to work.
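
A minimal sketch of that post-onload download (the component URLs are hypothetical):

    // On the home page, wait until everything has loaded, then quietly pull the
    // external files used by the secondary pages into the browser's cache.
    window.onload = function () {
      setTimeout(function () {
        var head = document.getElementsByTagName('head')[0];

        var script = document.createElement('script');
        script.src = '/static/secondary.js';      // assumed file name
        head.appendChild(script);

        var link = document.createElement('link');
        link.rel = 'stylesheet';
        link.type = 'text/css';
        link.href = '/static/secondary.css';      // assumed file name
        head.appendChild(link);
      }, 1000);   // small delay, so the prefetch never competes with the home page itself
    };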

Rule 9: Reduce DNS Lookups

On the Internet, Domain Name Servers resolve human-memorable host names into IP addresses. The browser must wait patiently, downloading nothing, until the lookup is complete. The response time depends on the DNS resolver, your proximity to it, the load on it, and your bandwidth.

DNS records are cached by both your browser and your operating system. When both of these caches are empty, one DNS lookup is required per unique host name on the retrieved page. In other words, reducing the number of unique host names reduces the number of lookups required. However, this has the negative side effect of reducing the parallelization of downloads. Steve's rule of thumb is to split your components across two unique host names.
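
As a sketch of what that split might look like (the host names and the hashing scheme are my own illustration, not the book's):

    // Deterministically assign each component to one of exactly two static hosts,
    // so the same component always comes from the same host (and stays cacheable)
    // while downloads proceed in parallel across both.
    var STATIC_HOSTS = ['static1.example.com', 'static2.example.com'];

    function hostFor(path) {
      var hash = 0;
      for (var i = 0; i < path.length; i++) {
        hash = (hash * 31 + path.charCodeAt(i)) % STATIC_HOSTS.length;
      }
      return STATIC_HOSTS[hash];
    }

    // e.g. 'http://' + hostFor('/img/logo.gif') + '/img/logo.gif'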

Rule 10: Minify JavaScript

This rule suggests that you reduce the size of your script files by minifying their contents, i.e., by eliminating unnecessary white space and comments. This is particularly useful in combination with gzipping your text files.
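
To illustrate the idea only (this is a deliberately naive sketch of my own; a real tool such as JSMin handles strings, regular expressions, and block comments correctly, and should be used instead):

    // Strip whole-line comments, blank lines, and leading/trailing white space.
    // Do not use this on production code - it is only meant to show what
    // minification removes.
    function naiveMinify(source) {
      return source
        .split('\n')
        .map(function (line) { return line.replace(/^\s+|\s+$/g, ''); })   // trim each line
        .filter(function (line) { return line.length > 0; })               // drop blank lines
        .filter(function (line) { return line.indexOf('//') !== 0; })      // drop whole-line comments
        .join('\n');
    }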

Interesting Factoids

1. You can often see a gap with no downloads right after the HTML page is retrieved. This period of quiet indicates that the browser is busy parsing the page's contents and retrieving components that may have been previously cached.

2. Even components that do not have a far-future Expires header are stored in the cache. On subsequent requests, the browser checks its cache and finds that the component has expired. It cannot use the stale component without first querying the server using a conditional GET request.

If the component hasn't changed, the server returns a "304 Not Modified" status code, which instructs the browser to use its cached copy; otherwise, it sends back the requested component. Using a far-future Expires header cuts down on these conditional GETs. I've sketched the server side of this exchange right after this list.

3. The HTTP/1.1 spec suggests that browsers download no more than two components in parallel per hostname. This means that distributing your components evenly across two hostnames allows four components to be downloaded in parallel - this is the sweet spot.

4. Firefox's network.http.max-persistent-connections-per-server setting, on the about:config page, lets you change the number of parallel downloads per server.
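
As promised above, here is a sketch of the server side of the conditional GET exchange from factoid 2 (a Node.js-style handler, assumed only for illustration):

    // Answer a conditional GET: if the browser's copy is still current, return
    // "304 Not Modified" with no body; otherwise return the full component.
    var http = require('http');
    var lastModified = new Date('2008-01-01T00:00:00Z');   // hypothetical timestamp

    http.createServer(function (req, res) {
      var since = req.headers['if-modified-since'];

      if (since && new Date(since) >= lastModified) {
        res.writeHead(304);   // browser may keep using its cached copy
        res.end();
      } else {
        res.writeHead(200, {
          'Last-Modified': lastModified.toUTCString(),
          'Content-Type': 'text/css'
        });
        res.end('/* ...the stylesheet itself... */');
      }
    }).listen(8080);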

Conclusion

This book was easy to browse, fairly concise, and full of good information - a truly unbeatable combination. It provides a nice tour of the performance-related aspects of the HTTP specification, while getting you into the right frame of mind for analyzing your own web sites. If you write or maintain web code, you'd be well served by reading this book.