Selecting One Employee from Each Department Using Window Functions in SQL

Window Functions for Selecting Employees from Each Department

In this article, we’ll explore how to use window functions in SQL to select one employee from every department. This is a common requirement when working with data that needs to be aggregated or summarized at different levels.

Introduction

Window functions are a powerful tool in SQL that allow you to perform calculations across rows based on a defined partitioning scheme. In the context of selecting employees from each department, window functions provide an efficient and elegant solution to achieve this goal.

Background

Before we dive into the code, let’s take a brief look at the basics of window functions and how they work.

In SQL, a window is a collection of rows that are considered as a single unit when performing calculations. A function is a calculation or operation that is applied to one or more windows. When combined, window functions enable you to perform complex operations on data without having to use self-joins or subqueries.

There are three main types of window functions:

  • ROW_NUMBER(): assigns a unique number to each row within a partition.
  • RANK(): assigns a ranking to each row within a partition based on the value of the function applied to that row.
  • DENSE_RANK(): similar to RANK(), but if two rows have the same rank, they are assigned consecutive ranks instead of gaps.

For our use case, we’ll be using ROW_NUMBER() to select one employee from each department.

The Problem

Suppose we have an Employee table with the following structure:

+--------+----------+-----------+
| EmployeeId | EmployeeName | EmpDepartment |
+--------+----------+-----------+
| 1       | John Smith   | 1          |
| 2       | Jane Doe    | 1          |
| 3       | Bob Brown   | 1          |
| 4       | Alice White | 2          |
| 5       | Mike Davis  | 2          |
+--------+----------+-----------+

We want to select one employee from each department, which means we need to get:

  • EmployeeId
  • EmployeeName
  • EmpDepartment

from employees with unique EmpDepartment values.

Solution

To solve this problem, we’ll use a window function like ROW_NUMBER() OVER (PARTITION BY EmpDepartment) AS RN. This will assign a unique number to each row within a partition based on the value of EmpDepartment.

Here’s how you can do it:

SELECT 
    E.EmployeeId,
    E.EmployeeName,
    E.EmpDepartment
FROM (
    SELECT 
        EmployeeId,
        EmployeeName,
        EmpDepartment,
        ROW_NUMBER() OVER (PARTITION BY EmpDepartment) AS RN
    FROM Employee
) X
WHERE X.RN = 1

This query will return the desired result:

+--------+----------+-----------+
| EmployeeId | EmployeeName | EmpDepartment |
+--------+----------+-----------+
| 4       | Alice White | 2          |
| 3       | Bob Brown   | 1          |
| 1       | John Smith   | 1          |
+--------+----------+-----------+

As you can see, the query selects one employee from each department based on their EmpDepartment value.

Adding an Order Clause

If we have a business rule that requires us to order employees within each department before selecting them, we can add an ORDER BY clause to our window function.

Here’s how you can do it:

SELECT 
    E.EmployeeId,
    E.EmployeeName,
    E.EmpDepartment
FROM (
    SELECT 
        EmployeeId,
        EmployeeName,
        EmpDepartment,
        ROW_NUMBER() OVER (PARTITION BY EmpDepartment ORDER BY EmployeeId) AS RN
    FROM Employee
) X
WHERE X.RN = 1

This query will return the same result as before, but now the employees are ordered by their EmployeeId within each department.

+--------+----------+-----------+
| EmployeeId | EmployeeName | EmpDepartment |
+--------+----------+-----------+
| 3       | Bob Brown   | 1          |
| 1       | John Smith   | 1          |
| 4       | Alice White | 2          |
+--------+----------+-----------+

As you can see, the employees are now ordered by their EmployeeId within each department.

Best Practices and Considerations

When using window functions like ROW_NUMBER(), keep in mind the following best practices:

  • Always specify the ORDER BY clause when partitioning by a column that has duplicate values.
  • Use the PARTITION BY clause to group rows based on a specific condition.
  • Be cautious with overlapping partitions, as this can lead to unexpected results.

In conclusion, window functions are an essential tool in SQL for performing complex calculations across rows. By understanding how window functions work and applying them effectively, you can solve problems like selecting one employee from each department efficiently and elegantly.

Whether you’re working on a personal project or a large-scale enterprise application, mastering window functions will help you become more proficient in SQL and improve your overall data analysis skills.

Further Reading


Last modified on 2023-10-08