Extracting a result subset

Posted 7 April, 2006 - 13:41

There are times when we are interested in fetching just a subset of a query's results. A number of approaches are available.

A Bad Approach Using ROWNUM

One approach is to use inline views and the pseudo-column ROWNUM. The following method cannot be relied upon to work for general values of m and n (figure 1.0):

Figure 1.0

SELECT * FROM
(
  SELECT *
  FROM emp
)
WHERE ROWNUM BETWEEN m AND n

What happens in the case where m > 1? The result is that no rows are retrieved at all. This is because the ROWNUM value is assigned to each row upon selection and before sorting takes place. For all rows with ROWNUM < m, the predicate ROWNUM BETWEEN m and n will always be false. Since rows 1 through m-1 cannot ever make it into the result set, there will never be a row m either, since that would have to be the first row (but m>1). We therefore have a contradiction and the result set will be empty for all values of m > 1. This method is really only appropriate for Top-N queries that lack a sorting clause. It is inappropriate for queries that need an ordering and it is also unsuitable for returning a subset of the results beginning at an arbitrary position.

A Correct Approach Using ROWNUM

The following approach (figure 1.1) uses a variation on the above theme to correctly return rows m through n of a query. Unlike the previous method, the query is free to make use of an ORDER BY clause.

Figure 1.1

SELECT
  empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM
(
  SELECT
    ROWNUM AS rank,
    empno, ename, job, mgr, hiredate, sal, comm, deptno
  FROM
  (
    SELECT
      empno, ename, job, mgr, hiredate, sal, comm, deptno
    FROM emp
    ORDER BY ename
  )
)
WHERE rank BETWEEN m AND n

This method uses two levels of inline view and already it's looking quite hairy! Now, imagine if the query was something a lot more sophisticated than SELECT * FROM emp ORDER BY ename and it doesn't take long to begin considering using a different approach.

A Far More Efficient Approach Using ROWNUM

The following method (figure 1.2) is a slight variation on the above. It comes into its own when filtering large result sets where it can yield significant performance gains over the previous approach.

Figure 1.2

SELECT
  empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM
(
  SELECT
    ROWNUM AS rank,
    empno, ename, job, mgr, hiredate, sal, comm, deptno
  FROM
  (
    SELECT
      empno, ename, job, mgr, hiredate, sal, comm, deptno
    FROM emp
    ORDER BY ename
  )
  WHERE ROWNUM <= n
)
WHERE rank >= m

An Approach Using Analytic Functions

ORACLE 8i introduced the concept of analytic functions. These functions answer queries that compute running totals, moving averages and Top-N queries to name a few applications. They are ideal for the problem at hand. Figure 1.3 provides an example of their use.

Figure 1.3

SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM
(
  SELECT
    ROW_NUMBER() OVER (ORDER BY ename) AS rank,
    empno, ename, job, mgr, hiredate, sal, comm, deptno
  FROM  emp
)
WHERE rank BETWEEN m AND n


This is a much more elegant approach than that using ROWNUM and, with luck, ORACLE can use its optimiser to get at the result in as inexpensive a manner as possible.

So there you have it - rows m through n of a result set delivered by the database server itself in a single SQL statement.