'Get cumulative sum based on unique item in MySQL

I'm using MySQL and I'm trying to write a stored procedure query that joins two tables and produces a particular column's running sum. Instead of the usual continuous running sum which is continuous, I would like the one that resets each time the item changes.

I hope what I'm requesting is clearer after my reproducible sample.

Table 1

CREATE TABLE `table1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `Date` date DEFAULT NULL,
  `Item` varchar(20) DEFAULT NULL,
  `Quantity` decimal(5,3) DEFAULT NULL,
  `Volume` decimal(20,2) DEFAULT NULL,
  PRIMARY KEY (`id`)
  );
  
  INSERT INTO table1 (Date, Item, Quantity, Volume)
  VALUES ('2022-04-25', 'Ball', 5, 30),
        ('2022-04-25', 'Balloon', 3, 14),
        ('2022-04-25', 'Bag', 2, 7),
        ('2022-04-24', 'Ball', 7, 20),
        ('2022-04-24', 'Balloon', 1, 9),
        ('2022-04-24', 'Bag', 4, 18),
        ('2022-04-23', 'Ball', 9, 53),
        ('2022-04-23', 'Balloon', 4, 25),
        ('2022-04-23', 'Bag', 11, 12),
        ('2022-04-22', 'Ball', 13, 8);

Table 2

CREATE TABLE `table2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `Date` date DEFAULT NULL,
  `Item` varchar(20) DEFAULT NULL,
  `Size (inches)` decimal(10,2) DEFAULT NULL,
  `density` decimal(10,2) DEFAULT NULL,
  PRIMARY KEY (`id`)
  );
  
  
  
  INSERT INTO table2 (Date, Item, `Size (inches)`, density)
  VALUES ('2022-04-25', 'Ball', 15, 20),
        ('2022-04-25', 'Balloon', 13, 34),
        ('2022-04-25', 'Bag', 12, 17),
        ('2022-04-24', 'Ball', 17, 50),
        ('2022-04-24', 'Balloon', 11, 19),
        ('2022-04-24', 'Bag', 14, 8),
        ('2022-04-23', 'Ball', 19, 3),
        ('2022-04-23', 'Balloon', 14, 5),
        ('2022-04-23', 'Bag', 31, 2),
        ('2022-04-22', 'Ball', 42, 18);

This is the stored procedure I have at the moment:

DELIMITER $$
CREATE DEFINER=`localhost`@`%` PROCEDURE `procedure1`()
BEGIN

DROP TABLE IF EXISTS `procedure_table`;

SET @running_total:=0;

CREATE TABLE `procedure_table` AS SELECT * FROM (
SELECT i.`Item`,
i.`Date`,
ROUND(i.`Volume`/i.`Size (inches)`,2) as `Volume/Size`,
ROUND(i.`Quantity`/i.`Volume`,2) as `Quantity x Volume`,
i.`Size (inches)` as `Size (inches)`,
i.`density` as density,
i.`Quantity`,
ROUND(i.Volume) as `Oil Volume`,
(@running_total := @running_total + IFNULL(i.`Volume`,0)) AS `Cumulative Volume`
FROM (SELECT `table1`.*,
`table2`.`Size (inches)`,
`table2`.`density`
  FROM `table1` 
  LEFT JOIN `table2`
  ON `table1`.Item = `table2`.`Item` 
  AND Month(`table1`.Date) = Month(`table2`.Date)
  AND Year(`table1`.Date) = Year(`table2`.Date)
  ORDER BY `table1`.Item,Date) as i) u;



END$$
DELIMITER ;

When I run this I get a table that looks like this:

enter image description here

When instead I'd what I want is:

enter image description here

I've tried the PARTITION BY function but haven't been able to get it to work in MySQL.

How do I get my desired output?

Edit - Output without problematic column

SELECT * FROM (
SELECT i.`Item`,
i.`Date`,
ROUND(i.`Volume`/i.`Size (inches)`,2) as `Volume/Size`,
ROUND(i.`Quantity`/i.`Volume`,2) as `Quantity x Volume`,
i.`Size (inches)` as `Size (inches)`,
i.`density` as density,
i.`Quantity`,
ROUND(i.Volume) as `Oil Volume`
FROM (SELECT `table1`.*,
`table2`.`Size (inches)`,
`table2`.`density`
  FROM `table1` 
  LEFT JOIN `table2`
  ON `table1`.Item = `table2`.`Item` 
  AND Month(`table1`.Date) = Month(`table2`.Date)
  AND Year(`table1`.Date) = Year(`table2`.Date)
  ORDER BY `table1`.Item,Date) as i) u 
  ORDER BY Item;


Solution 1:[1]

Output without problematic column

SELECT * 
FROM ( SELECT i.`Item`,
              i.`Date`,
              ROUND(i.`Volume`/i.`Size (inches)`,2) as `Volume/Size`,
              ROUND(i.`Quantity`/i.`Volume`,2) as `Quantity x Volume`,
              i.`Size (inches)` as `Size (inches)`,
              i.`density` as density,
              i.`Quantity`,
              ROUND(i.Volume) as `Oil Volume`
       FROM ( SELECT `table1`.*,
                     `table2`.`Size (inches)`,
                     `table2`.`density`
              FROM `table1` 
              LEFT JOIN `table2` ON `table1`.Item = `table2`.`Item` 
                                AND Month(`table1`.Date) = Month(`table2`.Date)
                                AND Year(`table1`.Date) = Year(`table2`.Date)
              ORDER BY `table1`.Item,Date
             ) as i
      ) u 
ORDER BY Item;

The ordering is not deterministic. Looking your desired output I see that the secondary sorting is performed by Date output column. I.e. for correct output rows ordering and cumulative sum calculation the the ORDER BY must be expanded to ORDER BY Item, `Date`;.

And the query will be:

SELECT *,
       @sum := CASE WHEN Item = @item
                    THEN @sum + ROUND(i.Volume)
                    ELSE ROUND(i.Volume)
                    END AS `cumulative sum`,
       @item := Item AS Item
FROM ( SELECT i.`Item`,
              i.`Date`,
              ROUND(i.`Volume`/i.`Size (inches)`,2) as `Volume/Size`,
              ROUND(i.`Quantity`/i.`Volume`,2) as `Quantity x Volume`,
              i.`Size (inches)` as `Size (inches)`,
              i.`density` as density,
              i.`Quantity`,
              ROUND(i.Volume) as `Oil Volume`
       FROM ( SELECT `table1`.*,
                     `table2`.`Size (inches)`,
                     `table2`.`density`
              FROM `table1` 
              LEFT JOIN `table2` ON `table1`.Item = `table2`.`Item` 
                                AND Month(`table1`.Date) = Month(`table2`.Date)
                                AND Year(`table1`.Date) = Year(`table2`.Date)
              ORDER BY `table1`.Item,Date
             ) as i
      ) u 
CROSS JOIN ( SELECT @item := '', @sum:=0 ) init_variables
ORDER BY Item, `Date`;

First additional column either adds current Oil Volume to previous one or takes current only depends on the fact does the item is the same like in previous row or not. Second additional column simply stores current Item value for to be used on the next row evaluation. These columns can be moved within the output fieldset, but their relative posession must be stored.

PS. If (Item, `Date`) values pair is not unique then the rows ordering is not definite again. In this case you must either group in the subquery providing this expression uniqueness or expand the ordering expression additionally.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Akina