1661 Average Time of Process per Machine: This is a really interesting question where we write a compact code of 1 line which is equivalent to multiline CTE code. Read more
Company: MAANG
Difficulty: Easy
Source: LeetCode
SQL Interview Question
Q. 1661 Average Time of Process per Machine
CREATE TABLE Activity (
machine_id INT,
process_id INT,
activity_type TEXT CHECK(activity_type IN ('start', 'end')),
timestamp FLOAT
);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (0, 0, 'start', 0.712);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (0, 0, 'end', 1.52);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (0, 1, 'start', 3.14);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (0, 1, 'end', 4.12);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (1, 0, 'start', 0.55);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (1, 0, 'end', 1.55);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (1, 1, 'start', 0.43);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (1, 1, 'end', 1.42);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (2, 0, 'start', 4.1);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (2, 0, 'end', 4.512);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (2, 1, 'start', 2.5);
INSERT INTO Activity (machine_id, process_id, activity_type, timestamp) VALUES (2, 1, 'end', 5.0);
View code on DB Fiddle
Solution
Solution1
WITH ProcessTimes AS (
SELECT
machine_id,
process_id,
MAX(CASE WHEN activity_type = 'end' THEN timestamp END) AS end_time,
MAX(CASE WHEN activity_type = 'start' THEN timestamp END) AS start_time
FROM Activity
GROUP BY machine_id, process_id
),
ProcessingDurations AS (
SELECT
machine_id,
(end_time - start_time) AS processing_time
FROM ProcessTimes
)
SELECT
machine_id,
ROUND(AVG(processing_time), 3) AS processing_time
FROM ProcessingDurations
GROUP BY machine_id
ORDER BY machine_id;
Explanation
To solve this problem using the CTE (Common Table Expression) method, we’ve broken the process into three steps:
Step 1: Extracting Start and End Times (CTE: ProcessTimes
)
The first thing we needed to do was transform the start
and end
timestamps from the row level to the column level for each machine_id
and process_id
. In the provided Activity
table, the start
and end
times appear as separate rows for the same process_id
. To perform calculations efficiently, these values need to appear in separate columns (start_time
and end_time
) for each process_id
.
To achieve this, we used a CASE statement inside the first CTE (ProcessTimes
):
- CASE Statement:
CASE
WHEN activity_type = 'start' THEN timestamp
ELSE NULL
END
This captures the timestamp
for start
and end
rows.
Using MAX
:
Wrapping the CASE
statements in MAX
allows us to satisfy SQL’s grouping rules when using GROUP BY
on machine_id
and process_id
:
MAX(CASE WHEN activity_type = 'start' THEN timestamp END) AS start_time,
MAX(CASE WHEN activity_type = 'end' THEN timestamp END) AS end_time
Does MAX
change the values? No, since the problem guarantees only one start
and end
timestamp per process_id
. MAX
simply ensures compliance with GROUP BY
. If multiple values existed, MAX
would pick the latest, which still works for duration calculations.
Resulting Output (ProcessTimes
):
machine_id | process_id | start_time | end_time |
0 | 0 | 0.712 | 1.520 |
0 | 0 | 3.140 | 4.120 |
1 | 0 | 0.550 | 1.550 |
Step 2: Calculating Process Duration (CTE: ProcessingDurations
)
In this step, we calculate the duration of each process by subtracting the start_time
from the end_time
:
SELECT
machine_id,
(end_time - start_time) AS processing_time
FROM ProcessTimes
This creates a table (ProcessingDurations
) where each row represents the duration of a single process for a specific machine. The result looks like this:
machine_id | processing_time |
---|---|
0 | 0.808 |
0 | 0.980 |
1 | 1.000 |
Step 3: Aggregating Average Processing Time
Finally, we calculate the average processing time for each machine. Using the results from ProcessingDurations
, we calculate the average as follows:
SELECT
machine_id,
ROUND(AVG(processing_time), 3) AS processing_time
FROM ProcessingDurations
GROUP BY machine_id
ORDER BY machine_id
AVG(processing_time)
: Calculates the average of all process durations for a machine.ROUND(..., 3)
: Rounds the result to 3 decimal places, as required by the problem.GROUP BY machine_id
: Groups the results by each machine.
Final Output
The query produces the following result:
machine_id | processing_time |
---|---|
0 | 0.894 |
1 | 0.995 |
2 | 1.456 |
This output satisfies the requirements of the problem.
Solution 2
SELECT
machine_id,
ROUND(SUM(CASE WHEN activity_type='start' THEN timestamp*-1 ELSE timestamp END)*1.0/ (SELECT COUNT(DISTINCT process_id)),3) AS processing_time
FROM
Activity
GROUP BY machine_id
Explanation
For this solution you need to understand the below part, where entire three CTEs are in one line itself, using a mathmetical logic.
SUM(CASE WHEN activity_type='start' THEN timestamp*-1 ELSE timestamp END) * 1.0 / (SELECT COUNT(DISTINCT process_id))
To understand this better let’s break it down further:
SUM(CASE WHEN activity_type=’start’ THEN timestamp*-1 ELSE timestamp END)
This part calculates the total time for each process. the logic used here is:
- If
activity_type = 'start'
:- The
timestamp
is multiplied by-1
. This effectively turns thestart
time into a negative value so it can be subtracted later.
- The
- If
activity_type = 'end'
:- The
timestamp
is added as-is (positive).
- The
This means for every process, the start
time and end
time are combined to produce the processing duration. For example:
- For
start = 0.712
andend = 1.52
, this will sum to:
(−0.712)+1.52=0.808 (duration of the process)
SUM
: The SUM
function adds up these values for all processes within the same machine (GROUP BY machine_id
is applied in the query). This gives the total processing time for the machine.
1.0
This part ensures that the division produces a floating-point result instead of an integer.
- Without
* 1.0
, SQL might perform integer division and truncate the decimal part, depending on the database. - Multiplying by
1.0
ensures the calculation is performed in floating-point arithmetic, preserving the precision.
/ (SELECT COUNT(DISTINCT process_id))
This calculates the average processing time by dividing the total processing time (from step 1) by the number of unique processes for the machine.
Putting It All Together
SELECT
machine_id,
ROUND(SUM(CASE WHEN activity_type='start' THEN timestamp*-1 ELSE timestamp END)*1.0/ (SELECT COUNT(DISTINCT process_id)),3) AS processing_time
FROM
Activity
GROUP BY machine_id
Final Output
The query produces the following result:
machine_id | processing_time |
---|---|
0 | 0.894 |
1 | 0.995 |
2 | 1.456 |
I hope this would have been helpful for you, consider sharing it with your friends. thank you.
- 1164 Product Price at a Given Date
- 1661 Average Time of Process per Machine
- Write a query to identify the employee(s) whose salary is closest to the average salary of the company.
- Display all months where sales exceeded the average monthly sales.
- Find the most common value (mode) in a specific column.
Hi, I am Vishal Jaiswal, I have about a decade of experience of working in MNCs like Genpact, Savista, Ingenious. Currently i am working in EXL as a senior quality analyst. Using my writing skills i want to share the experience i have gained and help as many as i can.