Chitika Add

Monday, September 12

Sequence of SQL statement processed -- basic knowledge to increase the performance of your script


Sequence of SQL statement processed

When a query is submitted to the database, it is executed in the following order:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
OR
DER BY clause

The following diagram will show a clear picture:



So why is it important to understand this?
When a query is executed, First all the tables and their join conditions are executed filtering out invalid references between them.

Then the WHERE clause is applied which again filters the records based on the condition given. Now you have handful of records which are GROUP-ed
And HAVING clause is applied on the result. As soon as it is completed, the columns mentioned are selected from the corresponding tables. And finally sorted using ORDER BY clause. So when a query is written it should be verified based on this order, otherwise it will lead wrong result sets.

Example:
Recently I came across the following problem.

The original query was
SELECT DISTINCT so.departure_airport, so.airport_name
FROM special_offers so, special_offers_groups_join sogj
WHERE so.resort_code = '000170' AND so.accomodation_code = '015397'
AND so.departure_date = '2008-01-13'
AND so.nights = '7'
AND sogj.group_id = '1'
AND sogj.special_offer_id = so.special_offer_id
ORDER BY price ASC

Table data:
departure_airport    airport_name                       price
'MAN'                      'Manchester Airport'           344
'MAN'                      'Manchester Airport'           288
'MAN'                      'Manchester Airport'            316
'BRS'                       'Bristol'                             289
'BRS'                       'Bristol'                             345
'BRS'                       'Bristol'                             317
'BHX'                      'Birmingham Airport'             343
'BHX'                      'Birmingham Airport'             287
'BHX'                      'Birmingham Airport'             315

Here the query is executed as follows:
All the records from both the tables are taken.

Conditions mentioned in the WHERE clause is applied fetching say 9 records.
Now the distinct of departure_airport and airport_name alone is taken since price is not in SELECT clause. This gives you three rows randomly selected (mostly in the order of insertion to the table!)
And the result is sorted based on price (incorrect price column due to distinct keyword).
So the output is not ordered properly based on price!

So we need to rewrite the query as follows:
All the records from both the tables should be taken.

Conditions mentioned in the WHERE clause needs to applied on the resulting resultset.

Group the resultset based on departure_airport and airport_name to get the minimum price in each group.

Departure_airport               airport_name                                   minprice
'BHX'                                '
Birmingham Airport'                       287
'BRS'                                 '
Bristol'                                       289
'MAN'                                '
Manchester Airport'                      288

Select the columns departure_airport, airport_name and min(so.price) from the resultset. And finally apply the ORDER By clause which orders the resultset based on the column min(so.price).

So the query should be:
SELECT so.departure_airport, so.airport_name, MIN(so.price) AS minprice
FROM special_offers so, special_offers_groups_join sogj
WHERE so.resort_code = '000170'
AND so.accomodation_code = '015397'
AND so.departure_date = '2008-01-13'
AND so.nights = '7'
AND sogj.group_id = '1'
AND sogj.special_offer_id = so.special_offer_id
GROUP BY so.departure_airport,so.airport_name
ORDER BY minprice;


Where clause and from clause for better performance: In SQL

Order of tables in the FROM clause should be:
1. Larger table
2. Smaller table
Order of conditions in WHERE clause should be:
2. Less selective
1. More selective
All Boolean conditions without built-in functions or sub queries are evaluated in reverse from the order they are found in the WHERE clause, with the last predicate being evaluated first.

No comments: