8 Ways to Fine-tune your SQL Queries (for production databases)

‘Sanitization’ is the removal of malicious data from user input, such as form submissions or maybe more simply…

The cleaning of user input to avoid code-conflicts (duplicate ids for instance), security issues (xss codes etc), or other issues that might arise from non-standardized input & human error/deviance.

Define Business Requirements before Beginning

Write very specific requirements and confirm them with stakeholders. The performance of the production database is too critical to have unclear or ambiguous requirements. Make sure the requirements are as specific as possible, and confirm the requirements with all stakeholders before running the query.


Define SELECT Fields instead of SELECT *

Inefficient:

SELECT *
FROM Customers

Efficient:

SELECT FirstName, LastName, Address, City, State, Zip
FROM Customers

Select More Fields to Avoid SELECT DISTINCT

Inefficient and inaccurate:

SELECT DISTINCT FirstName, LastName, State
FROM Customers

Efficient and accurate:

SELECT FirstName, LastName, Address, City, State, Zip
FROM Customers

Create Joins with INNER JOIN Rather than WHERE

Some SQL developers prefer to make joins with WHERE clauses, such as the following:

SELECT Customers.CustomerID, Customers.Name, Sales.LastSaleDate
FROM Customers, Sales
WHERE Customers.CustomerID = Sales.CustomerID

To prevent creating a Cartesian Join, INNER JOIN should be used instead:

SELECT Customers.CustomerID, Customers.Name, Sales.LastSaleDate
FROM Customers
INNER JOIN Sales
ON Customers.CustomerID = Sales.CustomerID

Use WHERE instead of HAVING to Define Filters

Use Wildcards at the End of a Phrase Only
When searching plaintext data, such as cities or names, wildcards create the widest search possible. However, the widest search is also the most inefficient search.

When a leading wildcard is used, especially in combination with an ending wildcard, the database is tasked with searching all records for a match anywhere within the selected field.

Consider this query to pull cities beginning with ‘Char’:

SELECT City FROM Customers
WHERE City LIKE ‘%Char%’

This query will pull the expected results of Charleston, Charlotte and Charlton. However, it will also pull unexpected results, such as Cape Charles, Crab Orchard, and Richardson.

A more efficient query would be:

SELECT City FROM Customers
WHERE City LIKE ‘Char%’

Use LIMIT to Sample Query Results
Before running a query for the first time, ensure the results will be desirable and meaningful by using a LIMIT statement. (In some DBMS systems, the word TOP is used interchangeably with LIMIT.) The LIMIT statement returns only the number of records specified. Using a LIMIT statement prevents taxing the production database with a large query, only to find out the query needs editing or refinement.

Run Analytical Queries During Off-Peak Times

In order to minimize query impact on the production database, talk to a DBA about scheduling the query to run at an off-peak time. The query should run when concurrent users are at their lowest number, which is typically the middle of the night (3 – 5 AM).

The more of the following criteria your query has, the more likely of a candidate it should be to run at night:

Selecting from large tables (>1,000,000 records)
Cartesian Joins or CROSS JOINs
Looping statements
SELECT DISTINCT statements
Nested subqueries
Wildcard searches in long text or memo fields
Multiple schema queries

Bonus

Learn How to Create Indexes Properly
Only Retrieve the Data You Really Need