'Having trouble to display Average sales including the Region with no sales

Having trouble to display the output including the Region with no sales.

An insurance company maintains records of sales made by its employees. Each employee is assigned to a state. States are grouped under regions. The following tables contain the data:

TABLE regions
  id INTEGER PRIMARY KEY
  name VARCHAR(50) NOT NULL

TABLE states
  id INTEGER PRIMARY KEY
  name VARCHAR(50) NOT NULL
  regionId INTEGER NOT NULL REFERENCES regions(id)

TABLE employees
  id INTEGER PRIMARY KEY
  name VARCHAR(50) NOT NULL
  stateId INTEGER NOT NULL REFERENCES states(id)

TABLE sales
  id INTEGER PRIMARY KEY
  amount INTEGER NOT NULL
  employeeId INTEGER NOT NULL REFERENCES employees(id)  

Management requires a comparative region sales analysis report.

Write a query that returns:

The region name.
Average sales per employee for the region (Average sales = Total sales made for the region / Number of employees in the region).
The difference between the average sales of the region with the highest average sales, and the average sales per employee for the region (average sales to be calculated as explained above).

A region with no sales should be also returned. Use 0 for average sales per employee for such a region when calculating the 2nd and the 3rd column.

This is the code as far:

SELECT regions.name, 
(CASE WHEN sales.amount<>0 THEN SUM(sales.amount)/COUNT(employees.id)ELSE 0 end)
AS average_sale,
(CASE WHEN sales.amount<>0 THEN SUM(sales.amount)/COUNT(employees.id)-MAX(sales.amount) ELSE 0 end) 
AS Diff
FROM regions
JOIN states
ON states.regionId = regions.Id
JOIN employees
ON states.id=employees.stateId
JOIN sales
ON employees.id=sales.employeeId
GROUP BY regions.Id;


Solution 1:[1]

If you have reached this post from here, we need to have a query that will pass all the edge cases. The most important thing is: "Employees can have multiple sales."

Average sales per employee for the region (Average sales = Total sales made for the region / Number of employees in the region).

Remember we are getting multiple sales done by each employee. So we need to count the number of employees in the region accurately. A distinct count of employees will give us the correct value and the average sales per employee will be accurate as well.

Left join, so that we get NULL values for some employee sales, that gives us regions with no sales, which will be displayed as zero.

So let's give a try:

with SalesAvg as (
  select R.name as rgn, 
    CASE WHEN SUM(IFNULL(SL.amount,0)) = 0 THEN 0                    /*region with no sales returning 0*/
    ELSE  SUM(IFNULL(SL.amount,0)) / COUNT(DISTINCT E.id) END as average             
/*distinct employee count gives the correct value for number of employees in ther region.*/
  from regions R
    left join states S on R.id = S.regionId
    left join employees E on S.id = E.stateId
    left join sales SL on E.id = SL.employeeId
  group by R.Id, R.name
) 
select 
  rgn, 
  average,
  (select max(average) from SalesAvg)- average as difference        /*highest average sales -region average*/
from SalesAvg
group by rgn

Solution 2:[2]

with c as (
select sum(sales.amount)/count(employee.id) as average from regions region1
left join states state on region1.id = state.regionId
left join employees employee on state.id = employee.stateId
left join sales sales on sales.employeeId = employee.id 
group by region1.id)

select region.name, 
(case when sum(sales.amount)<>0 then sum(sales.amount)/count(employee.id) else 0 end) 
as average, 
(case when sum(sales.amount)<>0 then (select max(average) from c) -     
(sum(sales.amount)/count(employee.id))  
else (select max(average) from c) end) 
as difference
from regions region
left  join states state on region.id = state.regionId
left  join employees employee on state.id = employee.stateId
left  join sales sales on sales.employeeId = employee.id
group by region.id

Solution 3:[3]

select regions.name, avg(coalesce(sales.amount,0)) as average
from regions
left join states
on regions.id=states.regionId
left join employees 
on states.id=employees.stateId
left join sales
on employees.id=sales.employeeid
group by (regions.name)
order by avg(sales.amount)  desc

Solution 4:[4]



SELECT r.name, 
   COALESCE(SUM(COALESCE(s.amount, 0)), 0)/COALESCE(COUNT(e.Name), 0) AS avg_amount,
    ABS(COALESCE(SUM(COALESCE(s.amount, 0)), 0)/COALESCE(COUNT(e.Name), 0) -
(SELECT MAX(amount)
FROM
(SELECT COALESCE(SUM(s.amount), 0)/COALESCE(COUNT(e.Name), 0) AS amount
    FROM sales s
    JOIN employees e
    ON s.employeeid = e.Id
    JOIN states st
    ON st.id = e.stateid
    JOIN regions r
    ON r.id = st.regionid
    GROUP BY r.name))) AS diff
FROM  sales s
JOIN employees e
ON s.employeeid = e.Id
JOIN states st
ON st.id = e.stateid
JOIN regions r
ON r.id = st.regionid
GROUP BY r.name

Solution 5:[5]

WITH c AS (
  SELECT regions.name AS name,
  (CASE WHEN sales.amount <> 0
    THEN SUM(sales.amount) / COUNT(employees.id) 
   ELSE 0 END) AS average
  FROM regions 
    LEFT JOIN states ON regions.id = states.regionid
    LEFT JOIN employees ON states.id = employees.stateId 
    LEFT JOIN sales ON employees.id = sales.employeeid
GROUP BY regions.id)

SELECT name, 
    average,
    (SELECT MAX(average) FROM c) - average as difference
FROM c
GROUP BY name

Solution 6:[6]

with D as(
select regions.name as name, 
    case when sum(coalesce(sales.amount,0)) = 0 then 0
   else sum(coalesce(sales.amount,0))/count( distinct employees.id)

end as average from regions left join states on regions.id=states.regionId left join employees on states.id=employees.stateId left join sales on employees.id=sales.employeeid group by (regions.name) order by avg(sales.amount)) select name, average, (select max(average) from D)-average as difference from D;

Solution 7:[7]

with sales_per_region as 
(
    select regions.name , COALESCE(sum(amount)/count(employees.id),0) as average
    from regions
    left join states on regions.id = states.regionId
    left join employees on states.id = employees.stateId
    left join sales on employees.id = sales.employeeId
    group by 1
),
max_sales_per_region as 
(
  select max(average) as max_average_sales
  from sales_per_region
)
select name, average, max_average_sales-average as difference
from sales_per_region,max_sales_per_region

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 user1528775
Solution 3 Harry Barkis
Solution 4 Erica Liu
Solution 5 Abbos-Ali Mirkhanov
Solution 6
Solution 7 M_M