As data engineers, we work with SQL daily, but how many of us truly understand the inner workings of a SQL query? Knowing the order of execution can significantly improve the way you write SQL queries. Let’s dive into the process with a practical example.
Example Query1:
Step 1: Gather the Data
Operation: FROM & JOIN
The query begins by fetching data from the specified tables and joining them based on a common key.
Step 2: Filter the Data
Operation: WHERE
Next, it applies a filter to narrow down the data based on a specified condition.
Step 3: Group the Data
Operation: GROUP BY
The data is then grouped by the desired attribute, enabling aggregate calculations.
Step 4: Filter the Grouped Data
Operation: HAVING
After grouping, the query applies a condition to filter the grouped data.
Step 5: Select the data columns
Operations: SELECT
Step 6: Order the Data
Operation: ORDER BY
The filtered results are sorted based on a specified column.
Step 7: Limit the Data
Operation: LIMIT
Finally, the query restricts the result set to a specific number of rows.
Understanding SQL Query Execution with CTEs and Subqueries:
Explanation and Order of Execution:
- FROM Clause:
- The Sales table data is read.
2. CTE SalesRank:
The ROW_NUMBER() window function is used to assign a rank to each sale per ProductID, ordered by SaleDate in descending order.
3. CTE ProductSales:
- WHERE s.Rank <= 3: Filters to include only the top 3 most recent sales for each product.
- GROUP BY s.ProductID: Groups the data by ProductID for aggregation.
- Aggregates data with COUNT and AVG to compute total sales and average sale amount per product.
4. Subquery in WHERE Clause:
- Calculates the average amount of all sales within the year 2024. This value compares against the average sale amount in the main query.
5. JOIN Clause:
- Joins the Products table with the ProductSales CTE on ProductID.
6. WHERE Clause:
- Filters the products where the average sale amount for recent sales is greater than the average sale amount calculated in the subquery.
7. SELECT Clause:
- Selects the desired columns from the joined tables.
8. ORDER BY Clause:
- Orders the final results by TotalSales in descending order.