Friday, August 31, 2007

Optimize your Web site performance to survive heavy traffic

1. Avoid accessing databases too often
On database-driven content sites, the slowest task is accessing the database to retrieve the content to display. If a site cannot serve its pages fast enough, many simultaneous user accesses force the Web server to create more processes to handle all the requests.
This is bad because it demands more server memory. Once the server RAM is exhausted, the machine starts swapping to virtual memory, making the server even slower, until it halts completely.

2. Cache Web pages
If the site needs to access the database to retrieve the content to display, what can we do to minimize the database accesses?
First, we need to focus on what kind of information the site retrieves from databases. The most common type of data is information used to build the HTML pages.
The fact is that the information in the database does not change very frequently. As a matter of fact, different users usually see the exact same HTML content when they access the same page.
It is not unusual to execute many SQL queries to retrieve all the data that is necessary to build a single HTML page. So, it is evident that it would be much more efficient if the sites could cache the HTML of each page after it is accessed for the first time.
So, what if the database content changes? Just run code that invalidates all the page caches that depend on the content of the changed database table rows. This forces the caches to be rebuilt on the next access to the site pages, so the pages are always up to date.
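
As an illustration, here is a minimal sketch of such a file-based page cache in PHP. The cache path is an assumption for the example, and render_page() stands in for whatever code runs the SQL queries and prints the HTML:

  <?php
  // Minimal file-based page cache (a sketch, assuming a writable
  // cache directory; render_page() is a hypothetical page builder).
  $cacheFile = '/var/cache/site/' . md5($_SERVER['REQUEST_URI']) . '.html';

  if (file_exists($cacheFile)) {
      readfile($cacheFile); // cache hit: no database access at all
      exit;
  }

  ob_start();
  render_page();            // runs the usual SQL queries and prints HTML
  $html = ob_get_clean();

  file_put_contents($cacheFile, $html); // store for subsequent requests
  echo $html;
  ?>

With this scheme, the code that updates the database only has to unlink() the affected cache files, and the next request rebuilds them.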

3. Avoid needless personalization
What about pages that appear differently to each user who accesses them? Use separate cache files depending on the user accessing the page.
However, it would be useless if the site used a separate cache file to store the HTML that each user sees: the benefit of reusing cached information would be lost.
To maximize the efficiency of this approach you should minimize the number of user profiles that may be used for each page context. Therefore, it is very important to avoid personalization as much as you can.
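
A sketch of the idea: instead of one cache file per user, key the cache on a small set of profile groups, so that many users still share each cached page. The group names and the is_logged_in() and get_user_group() helpers are hypothetical:

  <?php
  // Key the cache on a coarse profile group, not the individual user,
  // so cached pages remain shared across users in the same group.
  // is_logged_in() and get_user_group() are hypothetical helpers.
  $group = is_logged_in() ? get_user_group() : 'guest'; // e.g. 'guest', 'member'
  $cacheFile = '/var/cache/site/' . md5($group . '|' . $_SERVER['REQUEST_URI']) . '.html';
  ?>

With, say, three groups, each page context needs at most three cache files, no matter how many users the site has.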

4. Queue tasks that may take too long
Caching is great for avoiding repeated accesses to the same content stored in a database.
However, caching only applies to accesses that retrieve data from databases. Operations that update the database content do not benefit from caching, and when performed too frequently, database write accesses may cause server overload.
One solution is not to update the table in real time. Instead, create a similar table that acts as a queue. The queue table has no indexes, so inserting into it is cheap. Periodically, a background task started from cron moves the queue table data to the main table, as in the sketch below.
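
Here is a sketch of the technique, assuming a MySQL database accessed through PDO, with hypothetical hits and hits_queue tables for counting page accesses:

  <?php
  // During the request: a cheap INSERT into the unindexed queue table.
  $stmt = $pdo->prepare('INSERT INTO hits_queue (page_id) VALUES (?)');
  $stmt->execute(array($pageId));

  // In a script run periodically from cron: fold the queued rows into
  // the indexed main table in one pass, then empty the queue.
  $pdo->exec('INSERT INTO hits (page_id, hit_count)
              SELECT page_id, COUNT(*) FROM hits_queue GROUP BY page_id
              ON DUPLICATE KEY UPDATE hit_count = hit_count + VALUES(hit_count)');
  $pdo->exec('DELETE FROM hits_queue');
  ?>

Note that this flush is not atomic: rows inserted between the two cron statements would be lost, so a real implementation might, for example, rename the queue table before draining it.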

5. YSlow 13 optimization rules
There are many aspects to consider in making browsers interact with Web servers in an optimized way, so that pages take less time and fewer server resources to serve.
These issues are often not trivial to understand for people looking for quick and easy solutions to optimize their Web sites. Fortunately, there are tools that help you audit your sites and suggest what needs to be done to boost their performance.

YSlow is one of those Web site performance auditing tools. It was released a few weeks ago by the fine folks on Jeremy Zawodny's team at Yahoo!
http://developer.yahoo.com/yslow/

This is a wonderful tool that in a few seconds gives you an overview of how a page of your site is being served. It also suggests what can be done to optimize aspects that affect page loading speed and the consumption of Web server resources.
This tool is an extension for the Firefox browser. It works together with another very good extension named Firebug. So, to use YSlow, first you need to install Firebug.
http://www.getfirebug.com/

Once you have installed Firebug and YSlow, it is very easy to audit the performance of any Web page. Just load a Web page that you want to test and wait until it finishes loading. Then open the Firebug pane, click on the YSlow tab, and use the Performance button.

Immediately, YSlow starts collecting details about the current page. When it is done, it shows a list of 13 rules covering different aspects of page loading performance.
On the left side of the listing you see grades from A to F. Those grades express how your page performs on each of the 13 rules, A being the best and F being the worst. On the top of the listing you see your overall "Performance Grade" also from A to F, and a score between 0 and 100.
If you get a score of 100, congratulations, your page is perfect. Otherwise, there is performance tuning work to be done.

The Thirteen Simple Rules for Speeding Up Your Web Site are:
  1. Make Fewer HTTP Requests
  2. Use a Content Delivery Network
  3. Add an Expires Header
  4. Gzip Components
  5. Put Stylesheets at the Top
  6. Put Scripts at the Bottom
  7. Avoid CSS Expressions
  8. Make JavaScript and CSS External
  9. Reduce DNS Lookups
  10. Minify JavaScript
  11. Avoid Redirects
  12. Remove Duplicate Scripts
  13. Configure ETags
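
To give a flavor of rules 3 and 4, a PHP script can apply them to its own output with standard calls. (For static files you would normally configure the Web server instead, for example with Apache's mod_expires and mod_deflate.)

  <?php
  // A sketch of rules 3 and 4 for a dynamically generated resource:
  // a far-future Expires header and gzip compression of the output.
  header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 365 * 86400) . ' GMT');
  ob_start('ob_gzhandler'); // compresses only if the browser accepts gzip
  ?>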


6. Too much AJAX and external Javascript may kill your page performance
Using AJAX and cool Javascript libraries is definitely the latest fashion in Web application development. These are common features of the so-called Web 2.0 sites.
The problem is that when you use a Javascript library that has many interdependent components, sometimes using a simple component means loading a pile of separate Javascript files.

This makes the browser send a lot of requests to the Web server when it enters a page that needs many of those Javascript files. Not only may this cause excessive load on the server, but it also slows down the rendering of the page, even when the page HTML has already been fully loaded. The effect is similar to that of pages that use Flash movies that take a while to load.

Browsers cache Javascript files, but when a user accesses a page that needs many Javascript files for the first time, loading may take too long. It may give the impression that your site is much slower than it actually is.
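
One common mitigation, following rule 1 above, is a small combining script that serves several Javascript files in a single request. A sketch, where the file list and directory are assumptions for the example:

  <?php
  // js.php: serve several Javascript files in one HTTP request.
  // The file list and directory below are assumptions for the example.
  $files = array('prototype.js', 'effects.js', 'dragdrop.js');

  header('Content-Type: application/x-javascript');
  foreach ($files as $file) {
      readfile('/var/www/js/' . $file);
      echo "\n";
  }
  ?>

The page then includes a single <script src="js.php"></script> tag instead of one tag per file, and the combined output can be cached like any other page.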

More detailed information about rules for optimizing your Web site performance, and about defensive programming practices to survive Web site traffic peaks, can be found at www.phpclasses.org
