T-SQL Programming Guidelines
In this article, I share the 20 Programming Guidelines that I follow when working with T-SQL (MS SQL SERVER).
These guidelines have been developed over 10+ Years while working with a multi-terabyte database system.
1. Filter Condition – When joining physical tables in a query, there must be at least 1 filter condition (WHERE col1 = @par1) on any of the joining tables, to narrow down the result set. We must not get all the data from the table/tables, only get the data that is currently needed by the application.
2. Data Type Mismatch – When creating temporary objects in our stored procedures (# or @), it is important that the data types of the columns in the object, are the same as the data types of columns from which the data will be fetched. The same must be followed for local variables as well. Even for Table Joins, the columns on either side of the equality, must have matching datatypes, else it causes the Optimizer to convert implicitly, which affects performance negatively.
3. SET V/S SELECT –
We must use SET when:
Assigning a value to a variable from a query, as, if multiple values are returned from the query, SET throws an exception, whereas SELECT assigns the first returned value which may not be the right one.
We must use SELECT when:
Assigning values to multiple variables from a query. This should be used only when this assignment is not being made in a loop or multiple times in a stored procedure, as, if the current value becomes NULL, SELECT retains the previous value, which would be incorrect, in such cases use SET.
4. Placement of Filter Conditions in Outer Joins – While using an outer join, if the filter condition is placed in the ON Clause, it will apply the filter condition and then add the outer rows, whereas if the condition is placed in the WHERE Clause, the filter condition is applied after the outer rows have already been added. Hence it is important to show extreme precaution when using Outer Joins with filter conditions.
5. Temp Table v/s Table Variable – In our procedures, for temporary processing purposes if temporary objects have been created then the general rule is to use temporary tables (# tables) if we are going to store more than 50 rows in that object and Table Variables (@ tables) for lesser number of rows to ensure correct estimation in execution plans. A more generalized approach is for storing lookup data temporarily – use table variables and for transactional data – use temporary tables.
6. SCOPE_IDENTITY () V/S @@IDENTITY – When trying to fetch the last inserted identity value we must use SCOPE_IDENTITY () and not @@IDENTITY. Also when trying to find the number of rows inserted in the previous statement, we must use @@ROWCOUNT and not “Select Count () from”. @@RowCount must be immediately used after the statement for which we need to find the affected row count.
7. IN v/s EXISTS – It is recommended to use exists/not exists instead of in/not in, as it prevents scanning of the internally called table. IN can be used for static comparisons like – IN (1, 2, 3…).
8. Index Check While Joining Tables – When joining tables, the joining columns or the filtering column must have the relevant index. Also we must keep an eye on new tables/ columns created when reviewing code to verify if relevant indexes have been created for them according to use cases.
9. Scalar Valued Functions – We must avoid calling scalar valued functions in select statements, if the scalar valued function joins several high data tables internally, instead this should be converted to table valued or better in to a stored procedure. This is because Scalar valued function, operate row-by-agonizing-row instead of on a set.
10. Dynamic SQL and sp_executesql – We must avoid dynamic SQL as much as we can. If unavoidable, the dynamic SQL must be parameterized, this ensure, plan caching and reuse.
11. ANSI Joins – While joining tables in a statement, we must use ANSI standard joins (INNER JOIN or OUTER JOIN), and must not join the tables in WHERE clause (old Microsoft-style join). In other words, we must not list the tables in FROM clause separated by comma and must define the join conditions in WHERE clause, as the former will not be supported in future SQL Server Versions.
12. Insert with Column Names – In INSERT statements, we must list the column names being inserted, even when inserting values to all columns in that table.
13. Order By and Distinct – Use Order By and Distinct only if unavoidable. The order by columns must preferably be in the index definition. If a result has to be ordered then the final SELECT statement which retrieves the result set must be explicitly ordered using the ORDER BY clause. As must as possible, do the ordering at the application, as database sorting is a performance overhead.
14. XML Parameters – We must not use XML parameters. Use TVPs instead. Processing of XMLs consume more memory and time, also adds code complexity.
15. SARG predicates – The conditions within WHERE clause must be SARG (searchable argument). Manipulation of physical columns (ISNULL (Col1) = @par1), must be avoided, as that renders the query NON – SARG’able and such queries can then cannot use underlying indexes.
16. Table Hints and Index Hints – We must not use hints in queries as plans change according to data, and we must let sql server decide the indexes to be used.
17. Nested SP Calls – While calling another stored procedure within a stored procedure, we must use the @Param = Value format. This will avoid any wrong matching of parameters and values in case of any change in the parameter list in the procedure being called.
18. Avoid Recompile – While Declaring Temporary objects in our procedure, we must keep all declarations at the beginning of the procedure, else it may cause a SP recompile hence affecting performance.
19. Transaction Handling – Transactions should be handled at the application level, hence our procedures must not have transaction handling within them (Begin Tran…. Commit Tran…. Rollback Tran).
20. Naming Convention and SET NOCOUNT ON – If the object(Procedure, Function, View) is newly created then we can also check the below:
a. The stored procedure names must start with “usp" and functions with “udf".
b. Stored Procedures must have SET NOCOUNT ON in the beginning, this reduces unnecessary informatory messages from SQL Server.