Top-n-analysis is used when you want to retrieve only the top number of records from a result set.

From oracle8i onwards the inner query can have an order by clause AND IS A REQUIREMENT IN THE SUB QUERY TO PERFORM the Top-N analysis. Top-N queries are useful in scenarios where the need is to display only the n top-most or the n bottom-most records from a table based on a condition. This result set can be used for further analysis.  For example, using Top-N analysis you can perform the following types of queries:

Top-n-analysis is used when you want to retrieve only the top number of records from a result set.

  • Top three earners in the company
  • Four most recent recruits in the company
  • Top two sales reps who have sold the maximum number of products
  • Top three products that have had maximum sales in the last six months

The high-level structure of a top-n analysis query is:

select [column_list], ROWNUM [ROWNUM_ALIAS] from (select [column_list] from table ORDER BY Top-N_column) where ROWNUM<=N;

Performing “Top-N” Analysis

Top-N queries use a consistent nested query structure with the elements described below:

  1. A subquery or an inline view to generate the sorted list of data. The subquery or the inline view includes the ORDER BY clause to ensure that the ranking is in the desired order. For results retrieving the largest values, a DESC parameter is needed.
  2. An outer query to limit the number of rows in the final result set. The outer query includes the following components:
  • The ROWNUM pseudocolumn, which assigns a sequential value starting with 1 to each of the rows returned from the subquery.
  • A WHERE clause, which specifies the n rows to be returned. The outer WHERE clause must use a < or <= operator.

The following is an example for performing

  •  TOP 3 earners names and their salaries.

select ROWNUM as RANK, ename,sal from(select ename,sal from emp ORDER BY sal DESC) WHERE ROWNUM<=3;

      RANK ENAME             SAL

---------- ---------- ----------

         1 KING             5000

         2 SCOTT            3000

         3 FORD             3000

  • The least 3 earners names and their salaries.

select ROWNUM as RANK, ename,sal from(select ename,sal from emp ORDER BY sal) WHERE ROWNUM<=3;

      RANK ENAME             SAL

---------- ---------- ----------

         1 SMITH             800

         2 JAMES             950

         3 ADAMS            1100

  • To find the top 4 senior most employees in the emp table.

select ROWNUM as SENIORS, SENRS.ename,SENRS.hiredate from (select ename,hiredate from emp order by hiredate)SENRS where ROWNUM<=4;

   SENIORS ENAME      HIREDATE

---------- ---------- ---------

         1 SMITH      17-DEC-80

         2 ALLEN      20-FEB-81

         3 WARD       22-FEB-81

         4 JONES      02-APR-81

  • To find 4 junior most employees in the emp table.

select ROWNUM as SENIORS, SENRS.ename,SENRS.hiredate from (select ename,hiredate from emp order by hiredate DESC)SENRS where ROWNUM<=4;

   JUNIORS ENAME      HIREDATE

---------- ---------- ---------

         1 ADAMS      23-MAY-87

         2 SCOTT      19-APR-87

         3 MILLER     23-JAN-82

         4 JAMES      03-DEC-81

SQL>select ROWNUM as SENIORS, SENRS.ename,SENRS.hiredate from (select ename,hiredate from emp order by hiredate)SENRS where ROWNUM<=4 order by rownum desc;

  SENIORS ENAME      HIREDATE

--------- ---------- ---------

        4 JONES      02-APR-81

        3 WARD       22-FEB-81

        2 ALLEN      20-FEB-81

        1 SMITH      17-DEC-80

It's common to run a query using only part of a dataset – for example, the top 100 salespeople in a company. In this article, we'll see how to use Oracle's Top-N query method to query just these rows.

Top-N queries retrieve a defined number of rows (top or bottom) from a result set. In other words, they find the best or worst of something – the ten best selling cars in a certain region, the five most popular routers, the 20 worst-performing stores, etc. These kinds of queries are mostly used in Business Intelligence, where it's important to track the performance of certain entities and thus find ways of increasing profits or cutting costs.

As database developer or BI developer, your role is to enable users to get results in a timely and dependable manner. In this article, we will explore some common ways of constructing Top-N queries in an Oracle database.

First, we need to lay the groundwork for our queries by familiarizing ourselves with some basic analytical functions. If you aren't familiar with SQL functions, I recommend using Vertabelo Academy, particularly the Standard SQL Functions Course.

Assuming you do have some background in SQL, let's get started by looking at the data in a simple sales table:

This table contains the following information:

  • id – A unique customer identifier in the system.
  • account – The customer's account number.
  • basket – Identifies the product group that was sold.
  • amount – The total number of products purchased.
  • sales date – The date of the sale.

Now let's query this table and see what we have inside:

SELECT * FROM sales ORDER BY amount DESC;

Here is the result:

Now suppose you want to query the top five sales, or the five biggest sales. Usually, people execute this query using the Oracle pseudocolumn ROWNUM. ROWNUM returns a number indicating the order that the row is selected from the table. Here's the code:

SELECT * FROM sales WHERE rownum <=>

But this is a mistake, as you can see from the result set:

This is because ROWNUM is executed before the ORDER BY statement. To get around this, you use the inline view (a SELECT statement within the FROM clause of another SELECT, as shown below.) This tells Oracle to sort the data first.

select * from (select * from sales order by amount desc) where rownum <=>

And we get the correct result:

However, there is one problem: how we define the best sale. Are the five best sales the first five distinct best? The five non-distinct best? Or the first five ranked sales? The definition of the best sales will come from the business side of things, but figuring out the solution is up to you.

What Is a Top-N Query?

Now that we understand the business problems and the data, let's move on to Top-N queries. I'll look at each of these query types:

  • Top-N with ROW_NUMBER
  • Top-N Distinct
  • Top-N with RANK

Top-N queries don't have a special command or keyword. They utilize an ordered inline view, select results from that view, and then limit the number of rows using ROWNUM.

Note for Oracle 12c users: Oracle 12c introduced a new clause, FETCH FIRST. This allows us to write Top-N queries by adding the FETCH FIRST clause at the end of a basic query. I've included examples of this method as well.

This is the classic Top-N. It is non-distinct, meaning that there may be gaps in the ranking if two or more values are the same. The ROW_NUMBER analytical function returns a unique row for each returned row. We can take this query further than we can one with ROWNUM because of the windowing aspect of ROW_NUMBER. To learn more about analytical functions in Oracle, start here.

The query is :

SELECT * FROM ( SELECT sales.*, ROW_NUMBER() OVER (ORDER BY amount DESC) AS amount_dense_rank FROM sales ) WHERE amount_dense_rank <=>

And the result we get is:

Oracle 12c Syntax

To get the same result using Oracle 12c, we simply write:

SELECT amount FROM sales ORDER BY amount DESC FETCH FIRST 3 ROWS ONLY;

And the result we get is:

Top-N Distinct

To get a Top-N with distinct results (i.e. no gaps in the ranked results), we use the DENSE_RANK analytical function. DENSE_RANK is similar to RANK (which we will discuss below), but it returns without any gaps in the results. We would use the Top-N Distinct query type when we want to see all the values and all the corresponding rows.

Here is an example of a Top-N query with DENSE_RANK:

SELECT * FROM ( SELECT sales.*, DENSE_RANK() OVER (ORDER BY amount DESC) AS amount_dense_rank FROM sales ) WHERE amount_dense_rank <=>

And the result:

We see that the same amounts are given the same rank – there are multiple 1s, 2s, and 3s – but there are no gaps. We will receive the three highest distinct values.

Top-N Non-Distinct with Ranked Values

A Top-N non-distinct query with ranked values will return an N number of the highest values and all the corresponding rows. To implement this, we use the RANK analytical function, which produces a sequential rank for each distinct value in the specified window.

Let's look at an example with three amounts:

SELECT * FROM ( SELECT sales.*, RANK() OVER (ORDER BY amount DESC) AS amount_rank FROM sales ) WHERE amount_rank <=>

And the results :

The RANK function returns the same number for non-distinct values and skips the number of distinct values to keep the rank consistent. If we had three sales that all had amounts of 100, there would be three "first-place" values with the amount of 100 and the next rank would be four.

Oracle 12c Syntax

You can achieve the same ranking results using Oracle 12c's new syntax feature:

SELECT amount FROM sales ORDER BY amount DESC FETCH FIRST 3 ROWS WITH TIES;

And we get the same result:

We've explored three simple methods of creating Top-N queries with analytical functions. And we've explored a new syntax available in the Oracle 12c database. While we've concentrated on finding the best performers, an inverse query that finds the worst performers is also possible. All you'd need to do is reverse the order of the analytical functions or inline views. Why not put in some practice with creating queries? I recommend the SQL Window Functions online course – the only interactive course for SQL analytical functions on the Internet. See what you can come up with!