Wednesday, January 30, 2008

Tips for Writing High-performance SQL

These tips apply broadly when writing high-performance stored procedures. Unfortunately, unlike some tips, you can't simply apply most of them without first considering the nature and schema of the data you're querying.

  1. Avoid using cursors (as well as other looping structures) as much as possible. Cursors are inefficient, and database engines usually don't have the best loop implementations in terms of performance.

    On the database side, you can usually replace code involving cursors with aggregate SQL statements (SELECT, INSERT, and UPDATE) that use vector tables. All database engines are heavily optimized for aggregate statements, so even if a loop is unavoidable, it is always better to execute a few aggregate statements in a loop with a small number of iterations, than to create a cursor and execute simple statements over a large number of iterations.

    Even if initial performance tests, especially with a small amount of data, show cursors to be more efficient than a complex aggregate statement, it is worthwhile to try to optimize the operation by breaking it into smaller portions or using other approaches, unless you can guarantee that the data volume will stay small. Cursor approaches will not scale.
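
    As a sketch, the cursor-style loop described above can usually be collapsed into a single aggregate statement (the table and column names here are hypothetical):

```sql
-- Cursor approach (avoid): open a cursor over Customers and, for each row,
-- compute and update that customer's total with a separate query.
-- Set-based approach (prefer): one aggregate UPDATE does the same work.
UPDATE c
SET    c.LifetimeTotal = o.SumTotal
FROM   Customers AS c
JOIN  (SELECT CustomerId, SUM(Total) AS SumTotal
       FROM   Orders
       GROUP BY CustomerId) AS o
  ON   o.CustomerId = c.CustomerId;
```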

  2. Filter data wisely. One alternative to cursors is a fall-through approach: filter and aggregate the data in multiple steps via a set of intermediate data stores, which can be physical tables, temporary tables, or table variables. It is usually best to build an aggregate filter into the first statement so that the majority of the data is filtered out in one simple shot, leaving the subsequent steps to work on a much smaller data set. Then you can proceed with joining and filtering, making sure to keep the number of join permutations under control at all times.
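
    A hypothetical sketch of the fall-through approach (schema invented for illustration): the first step removes the bulk of the data with a cheap filter, and later steps join against the much smaller staging table:

```sql
-- Step 1: a cheap aggregate filter removes most rows in one shot.
SELECT OrderId, CustomerId, Total
INTO   #RecentOrders
FROM   Orders
WHERE  OrderDate >= DATEADD(month, -1, GETDATE());

-- Step 2: join and filter further against the smaller staging table.
SELECT r.CustomerId, SUM(r.Total) AS MonthTotal
FROM   #RecentOrders AS r
JOIN   Customers AS c ON c.CustomerId = r.CustomerId
WHERE  c.IsActive = 1
GROUP BY r.CustomerId;
```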

  3. It is usually more efficient to execute multiple statements with one condition than a single statement with multiple OR conditions when executing UPDATE and DELETE statements against permanent database tables that can be accessed by multiple users simultaneously. This tip is especially important from the scalability point of view; from the performance point of view the difference is usually marginal. The major reason for the tip is the locking of the database records and the lock escalations that occur behind the scenes.
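
    For example (table and status values hypothetical), the two forms below do the same logical work, but the second holds narrower locks for shorter periods:

```sql
-- One statement with OR conditions: locks rows for both predicates at
-- once, and is more likely to trigger lock escalation under load.
UPDATE Orders
SET    Status = 'Archived'
WHERE  Status = 'Shipped' OR Status = 'Cancelled';

-- Multiple single-condition statements: each acquires and releases its
-- locks separately, which scales better with concurrent users.
UPDATE Orders SET Status = 'Archived' WHERE Status = 'Shipped';
UPDATE Orders SET Status = 'Archived' WHERE Status = 'Cancelled';
```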

  4. Make wise distinctions between temp tables and table variables. Table variables are in-memory structures that may work anywhere from 2 to 100 times faster than temp tables. But keep in mind that access to table variables gets slower as the volume of data they contain grows, and at some point they will overflow the available memory, which kills performance. Therefore, use table variables only when their data content is guaranteed not to grow unpredictably; the breaking point is around several thousand records. For larger data volumes, I recommend temp tables with clustered indexes. Interestingly, I've found that a temp table with one clustered index is often faster than one with multiple simple indexes. In contrast, for physical tables, multiple simple indexes are often faster than one clustered index.
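
    A brief sketch of the two choices (names hypothetical):

```sql
-- Small, predictable row counts: a table variable.
DECLARE @Recent TABLE (OrderId INT PRIMARY KEY, Total MONEY);

-- Larger or unpredictable volumes: a temp table with a clustered index.
CREATE TABLE #AllOrders (OrderId INT, Total MONEY);
CREATE CLUSTERED INDEX IX_AllOrders ON #AllOrders (OrderId);
```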

  5. Make careful distinctions between hard rules and assumptions. This is more of a business design tip; it applies to code design in general rather than to performance and scalability design specifically. In real life, however, performance and scalability are usually the first things to suffer from improper design. When rules are implemented as assumptions, they usually cause unnecessary calculations to be performed, which hurts performance. When assumptions are implemented as rules, they tend to cause errors and algorithm failures, which usually require an urgent redesign. That redesign, in turn, is typically performed under business constraints and results in inefficient final algorithms, because bad design decisions are often corrected in a rush and without sufficient resources, sometimes under pressure from customers whose businesses are in a critical stage when the problems are uncovered but must continue operating during the fix.

  6. Pay attention to join order. Using the proper join order sometimes lets the database engine produce execution plans that perform the joins on an optimal number of records. Most database engines also support hard hints, but in most cases you should avoid them and let the database engine figure out the best way to do its job on its own.

  7. Be careful when joining complex views to other views and database tables in complex SELECT statements. When the database contains a significant amount of data, the SQL Server engine tends to recalculate the execution plan of the resulting statement, which often produces an inefficient plan and can kill performance. The most difficult part is that the engine's behavior is inconsistent in this respect, and depends heavily on the database size, indexes, foreign keys, and other database structures and constraints. The consistent workaround is to pre-select data from the view into a temp table with reasonable pre-filters, and then use that temp table in place of the underlying view.
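
    A sketch of the workaround (the view, filter, and schema are hypothetical):

```sql
-- Instead of joining the complex view directly in a large SELECT,
-- materialize a pre-filtered slice of it first...
SELECT CustomerId, Region, LifetimeTotal
INTO   #Summary
FROM   dbo.vwCustomerSummary
WHERE  Region = @Region;   -- a reasonable pre-filter

-- ...then join the small temp table in place of the view.
SELECT o.OrderId, s.LifetimeTotal
FROM   Orders AS o
JOIN   #Summary AS s ON s.CustomerId = o.CustomerId;
```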

  8. Create indexes on temp tables wisely. As mentioned in Tip 4, clustered indexes are usually the best in terms of performance for temp tables; however, there is a difference between creating the index before or after inserting data into the temp table. Creating the index before the insert complicates the insert, because the database engine must order the selection. For complex selections such as those mentioned in Tip 7, the extra ordering may overcomplicate the overall statement and drastically degrade the performance. On the other hand, creating the index after the insert forces the database engine to recalculate the execution plan of the stored procedure every time it is called. Therefore, the decision is always a trade-off and you should make it based on the relative costs of the two possibilities.
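
    The trade-off looks roughly like this (schema hypothetical):

```sql
-- Option A: create the index before the insert. The insert must now
-- maintain index order, which can overcomplicate a complex selection.
CREATE TABLE #T (Id INT, Val MONEY);
CREATE CLUSTERED INDEX IX_T ON #T (Id);
INSERT INTO #T (Id, Val) SELECT Id, Val FROM SourceTable;

-- Option B: create the index after the insert. The insert stays simple,
-- but the mid-procedure schema change forces a plan recompilation each
-- time the stored procedure runs.
CREATE TABLE #T2 (Id INT, Val MONEY);
INSERT INTO #T2 (Id, Val) SELECT Id, Val FROM SourceTable;
CREATE CLUSTERED INDEX IX_T2 ON #T2 (Id);
```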

  9. In general, try to avoid execution plan recalculation. One common cause of recalculation is a stored procedure that contains several execution paths that depend on the values of passed-in parameters. Whether avoiding recalculation is possible depends on the complexity of the stored procedure and on other circumstances, such as those described in Tip 8. When the engine does recalculate the execution plan, performance always suffers; however, recalculating the execution plan of a caller does not force recalculation for the called procedure (or view or function). Therefore, the workaround is to divide one stored procedure into multiple procedures (according to the passed-in parameters), and then call the children from the parent conditionally. Perform this subdivision very carefully, though, because it can become a maintenance nightmare; sometimes, however, it is the only way to achieve acceptable database performance and scalability.
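
    A minimal sketch of the parent/child subdivision (procedure names hypothetical):

```sql
-- Each child procedure keeps its own stable, separately cached plan;
-- the parent merely branches on the parameter.
CREATE PROCEDURE dbo.GetOrders
    @ByCustomer BIT,
    @Id         INT
AS
BEGIN
    IF @ByCustomer = 1
        EXEC dbo.GetOrdersByCustomer @Id;
    ELSE
        EXEC dbo.GetOrdersByProduct @Id;
END
```
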
Finally, although this isn't either a performance or a scalability tip, I urge you to format your stored procedure scripts legibly. It's best to agree on common practices such as clause order and formatting rules with your coworkers in advance. Not only does that help avoid errors, it also clearly shows the logical structure of the statements and often aids in figuring out faulty filters and joins.

This list of tips is certainly not exhaustive, but it probably covers the most important performance and scalability factors.

The original article can be read at the following link:
http://www.devx.com/dbzone/Article/33551/0/page/2

For an example solution (Sudoku Stored Procedure Script), see the following link:
http://www.devx.com/dbzone/Article/33551/0/page/3

Gene Pinski currently works as a Dev Lead at Pearson School Systems where he's responsible for several development areas of Chancery SMS in Burnaby, BC, Canada. His professional interests include developing business and web applications, and developing games. Find more about Gene from his web site.

Sunday, January 13, 2008

JavaScript Best Practices

Use Object Oriented JavaScript
• Provides better reusability of the code
• Enables your objects to be better organized
• Allows for dynamic loading of objects
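
A minimal sketch of the object-oriented style (the Account type here is invented for illustration):

```javascript
// Constructor function: state lives on each instance...
function Account(owner, balance) {
  this.owner = owner;
  this.balance = balance;
}

// ...while behavior is shared via the prototype.
Account.prototype.deposit = function (amount) {
  this.balance += amount;
  return this.balance;
};

var acct = new Account("Alice", 100);
acct.deposit(50); // acct.balance is now 150
```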

Use Object Hierarchies to Organize JavaScript Objects to Avoid Name Collision
• In JavaScript there is the potential for object names to collide. In the Java language, package names are used to prevent naming collisions
• JavaScript does not provide package names the way Java does; however, when writing components you can use objects and object hierarchies to organize related objects and prevent naming collisions
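
For example (the MYAPP namespace is hypothetical), a nested object hierarchy can stand in for packages:

```javascript
// Create the "package" objects only if they do not already exist,
// so multiple files can safely share the same hierarchy.
var MYAPP = MYAPP || {};
MYAPP.util = MYAPP.util || {};

// The function lives under MYAPP.util, not in the global namespace,
// so a global function named trim defined elsewhere cannot collide with it.
MYAPP.util.trim = function (s) {
  return s.replace(/^\s+|\s+$/g, "");
};

MYAPP.util.trim("  hello  "); // "hello"
```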

Use the prototype property
• Use it to define shared behavior and to extend objects
• The prototype property is a language feature of JavaScript. The property is available on all objects
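
A short sketch of both uses (the Point type is an invented example):

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Shared behavior: one function object serves every instance.
Point.prototype.distanceTo = function (other) {
  var dx = this.x - other.x;
  var dy = this.y - other.y;
  return Math.sqrt(dx * dx + dy * dy);
};

var a = new Point(0, 0);
var b = new Point(3, 4);
a.distanceTo(b); // 5

// Extending after the fact: existing instances pick up the new method.
Point.prototype.isOrigin = function () {
  return this.x === 0 && this.y === 0;
};
a.isOrigin(); // true
```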

Write reusable JavaScript
• JavaScript should not be tied to a specific component unless absolutely necessary
• Avoid hard-coding data in your functions when it can be parameterized

Object Literals
• Object literals are objects defined using braces ({}) that contain a set of comma-separated key/value pairs, much like a map in Java
• Example
{key1: "stringValue", key2: 2, key3: ['blue','green','yellow']}
• Object literals are very handy in that they can be used as arguments to a function
• Object literals should not be confused with JSON which has similar syntax
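
As an example of the function-argument use (drawBox and its options are invented here):

```javascript
// A single "options" object literal keeps the signature stable as the
// set of parameters grows, and makes call sites self-describing.
function drawBox(options) {
  var width = options.width || 100;   // defaults for omitted keys
  var height = options.height || 50;
  var colors = options.colors || [];
  return width + "x" + height + " [" + colors.join(",") + "]";
}

drawBox({ width: 200, colors: ["blue", "green"] }); // "200x50 [blue,green]"
```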

Load JavaScript On Demand
• If you have a large library or set of libraries you don't need to load everything when a page is loaded
• JavaScript may be loaded dynamically at runtime using a library such as JSAN, or manually, by using AJAX to fetch the JavaScript code and calling eval() on it

Separation of content, CSS, and JavaScript
• A rich web application user interface is made up of content (HTML/XHTML), styles (CSS), and behavior (JavaScript)
• Separating the CSS styles from the JavaScript is a practice which will make your code more manageable, easier to read, and easier to customize
• Place CSS and JavaScript code in separate files
• Optimize bandwidth usage by having the CSS and JavaScript files loaded only once

Reduce the size of JavaScript file
• Remove white space and shorten the names of variables and functions in a file
• While in development mode keep your scripts readable so that they may be debugged easier
• Consider compressing your JavaScript resources when you deploy your application
• If you use a 3rd-party JavaScript library, use the compressed version if one is provided.
Example compression tool: ShrinkSafe

To read the full article, download the PDF file from the following link:
http://www.javapassion.com/ajax/JavaScriptBestPractices.pdf