| Author: avtar 05 Sep 2008 | Member Level: Gold | Rating: Points: -20 |
its a CORRELATED SUBQUERY .. keep these points in mind, when you use a correlated sub-query
Correlated sub-query is a performance overhead to the database server and so, you have to use it only if it is required Avoid using Correlated subquery on large tables, as the inner query is evaluated for each row of the outer query let’s look at the query that captures the Nth maximum value: Select * From Employee E1 Where (N-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where E2.Salary > E1.Salary)
Where N is the level of Salary to be determined
lets explore it... create table and insert info.
Create table Employee ( Eid int, Name varchar(10), Salary money ) Go
Insert into Employee values (1,'harry',3500) Insert into Employee values (2,'jack',2500) Insert into Employee values (3,'john',2500) Insert into Employee values (4,'xavier',5500) Insert into Employee values (5,'steven',7500) Insert into Employee values (6,'susana',2400) Go
In the above example, the inner query uses a value of the outer query in its filter condition meaning; the inner query cannot be evaluated before evaluating the outer query. So each row in the outer query is evaluated first and the inner query is run for that row. Let’s look into the background process of this query, by substituting a value for N i.e. 4,(Idea is to find the 4th maximum salary):
Select * From Employee E1 Where (4-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where E2.Salary > E1.Salary)Since the outer query’s value is referred in the inner query, the operation is done row-by-row. Based on the sample data as shown above, the process starts with the following record:
Employee E1 ---------------------------------- Eid Name Salary 1 harry 3500 The salary of this record is substituted in the inner query and evaluated as:
Select Count(Distinct(E2.Salary)) From Employee E2 Where E2.Salary > 3500Above query returns 2 (as there are only 2 salaries greater than 3500). This value is substituted in the outer query and will be evaluated as:
Select * From Employee E1 Where (4-1) = (2)The "where" condition evaluates to FALSE and so, this record is NOT fetched in the result.
Next the SQL Engine processes the 2nd record which is:
Employee E1 ---------------------------------- Eid Name Salary 2 jack 2500 Now the inner query is evaluated as:
Select Count(Distinct(E2.Salary)) From Employee E2 Where E2.Salary > 2500This query returns a value of 3 (as there are 3 salaries greater than 2500). The value is substituted in the outer query and evaluated as:
Select * From Employee E1 Where (4-1) = (3)The "where" condition evaluates to TRUE and so, this record IS fetched in the result. This operation continues for all the remaining records. Finally the result shows these 2 records:
Eid Name Salary 2 jack 2500 3 john 2500The above query works in the same manner in Oracle and Sybase as well. Applying the same logic, to find out the first maximum salary the query would be:
Select * From Employee E1 Where (1-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where E2.Salary > E1.Salary)
If you are able to understand this functionality, you can workout various other queries in the same manner. The bottom line is, the query should be efficient and NOT resource hungry.
|
| Author: @@@ Hyderabadi Biryani @@@ 05 Sep 2008 | Member Level: Diamond | Rating: Points: 0 |
HI Avatar, This is pretty good explanation... Keep up the good work... Thanks -- Vj
|