Window functions are advanced SQL functions that allow us to perform complex calculations on a set of rows called a “window.” A window is a subset of rows from a larger result set, usually defined by some criteria such as an ORDER BY clause or a PARTITION BY clause. Window functions operate on this window, calculating results for each row within it. The results of window functions are not aggregated, but rather returned for each row in the result set.
Some common window functions in SQL include:
- ROW_NUMBER(): assigns a unique integer to each row in the window, starting from 1.
- RANK(): assigns a rank to each row within the window, with ties receiving the same rank and leaving gaps in the sequence.
- DENSE_RANK(): assigns a rank to each row within the window, with ties receiving the same rank but leaving no gaps in the sequence.
- LAG(): returns the value of a column from a previous row within the window.
- LEAD(): returns the value of a column from a subsequent row within the window.
- SUM(), AVG(), COUNT(), MIN(), MAX(): these are aggregate functions that can be used as window functions. When used as window functions, they calculate the aggregate value over the window.
- NTILE(): divides the window into a specified number of buckets and assigns a bucket number to each row.
Here’s an example of using the ROW_NUMBER() window function to assign a unique integer to each row in a result set:
SELECT name, score, ROW_NUMBER() OVER(ORDER BY score DESC) as rank
FROM students;
This query returns a result set with three columns: “name”, “score”, and “rank”. The “rank” column is calculated using the ROW_NUMBER() function, which assigns a unique integer to each row based on the score column in descending order.
We can also use the PARTITION BY clause to define windows based on a grouping criterion. Here’s an example of using the RANK() window function to rank students by score within each class:
SELECT class, name, score, RANK() OVER(PARTITION BY class ORDER BY score DESC) as rank
FROM students;
This query returns a result set with four columns: “class”, “name”, “score”, and “rank”. The “rank” column is calculated using the RANK() function, which assigns a rank to each row based on the score column within each class defined by the PARTITION BY clause.
Window functions can be very powerful when it comes to data analysis, especially when combined with other SQL concepts such as subqueries and joins. They provide a way to perform complex calculations on subsets of data, allowing us to extract valuable insights and patterns from our data.