The Pareto Principle, commonly known as the 80/20 rule, argues that 80% of the results are achieved with 20% of the effort. Joseph M. Juran introduced this idea, which was influenced by Vilfredo Pareto, an Italian economist who discovered that 20% of the people controlled 80% of the land in Italy. Over time, this approach has been widely implemented in a variety of fields, including business, engineering, and analytics. In interviews and real-world circumstances, recognizing the top 20% of contributors who drive the majority of results (80%) is a valuable talent.
Let’s start by looking at how to use this approach using SQL to determine the top 20% of products that account for 80% of overall sales.
Pareto Principle
Problem Statement
We have a dataset that includes sales data for several products. The goal is to find the products that account for 80% of overall sales. This will help us focus on the core goods that provide the majority of the money.
Database Setup
We start by creating a table for order and populating it with sample data:
CREATE TABLE orders (
product_id INT,
sales NUMERIC
);
INSERT INTO orders (product_id, sales)
VALUES
(1, 100),
(2, 200),
(3, 150),
(4, 50),
(5, 300),
(6, 250);
This dataset represents six products, each with its corresponding sales figure.
Query to Identify Top Contributors
To achieve our goal, we break the query into logical steps using Common Table Expressions (CTEs):
WITH product_by_sales AS (
SELECT
product_id,
SUM(sales) AS product_sales
FROM orders
GROUP BY product_id
),
ts AS (
SELECT
product_id,
product_sales,
SUM(product_sales) OVER (ORDER BY product_sales DESC) AS running_sales,
SUM(product_sales) OVER () AS total_sales,
(SUM(product_sales) OVER ()) * 0.8 AS eighty_percent
FROM product_by_sales
)
SELECT *
FROM ts
WHERE running_sales <= eighty_percent;
Step-by-Step Explanation
product_by_sales
CTE:- The first CTE groups the sales by
product_id
and calculates the total sales for each product.
- The first CTE groups the sales by
ts
CTE:- This adds the following columns to refine the data:
running_sales
: Cumulative sum of sales using a window function ordered byproduct_sales
in descending order.total_sales
: Total sales for all products.eighty_percent
: 80% of the total sales, calculated as(SUM(product_sales) OVER ()) * 0.8
.
- This adds the following columns to refine the data:
- Final Query:
- Filters the results to only show products with running_sales less than or equal to 80% of total sales.
- This helps identify the products contributing to the top 80% of sales.
Query Results
For the specified dataset, the query will return the products that account for 80% of total sales. This allows us to focus on the important goods that provide the majority of our revenue.
This method is a practical use of the Pareto Principle in sales analysis. Identifying the major items contributing to 80% of the income allows firms to:
- Optimize inventory management.
- Focus marketing efforts on top-performing products.
- Improve decision-making for resource allocation.
I hope you would have found this useful, if so, consider sharing it with your friends, thank you.
- Write a query to identify the employee(s) whose salary is closest to the average salary of the company.
- Display all months where sales exceeded the average monthly sales.
- Find the most common value (mode) in a specific column.
- Identify employees who have never received a performance review.
- Write a query to retrieve the first order placed by each customer.
Hi, I am Vishal Jaiswal, I have about a decade of experience of working in MNCs like Genpact, Savista, Ingenious. Currently i am working in EXL as a senior quality analyst. Using my writing skills i want to share the experience i have gained and help as many as i can.