How to running total using SQL?

Example Case#1:

Table or query "Instructor" like below, need to running total to sort instructor in order.

CNT Instructor
1 Lastname1, Firstname1
1 Lastname2, Firstname2
1 Lastname3, Firstname3
1 Lastname4, Firstname4
1 Lastname5, Firstname5

SQL: SELECT (SELECT SUM(b.CNT)
FROM Instructor b
WHERE b.Instructor <= a.Instructor) AS Sort, a.Instructor
FROM Instructor AS a
ORDER BY a.Instructor;

Result:

Sort Instructor
1 Lastname1, Firstname1
2 Lastname2, Firstname2
3 Lastname3, Firstname3
4 Lastname4, Firstname4
5 Lastname5, Firstname5

Example Case#2

Table or query "ag" like below, need to running total for Credit Earn for each student.

 
stunumterm stunum term CrAtt CrErn
123456789200000QF 123456789 200000QF
 
12
123456789201410QF 123456789 201410QF 12 12
123456789201501QF 123456789 201501QF 6 6
123456789201504QF 123456789 201504QF 6 6
123456789201507QF 123456789 201507QF 8 8
123456789201510QF 123456789 201510QF 7 7
123456789201601QF 123456789 201601QF 12 12
123456789201604QF 123456789 201604QF 5 5
123456789201607QF 123456789 201607QF 6 6
123456789201610QF 123456789 201610QF 12 12
123456789201701QF 123456789 201701QF 12 12
123456789201704QF 123456789 201704QF 8 8
123456789201707QF 123456789 201707QF 10 10
123456789201710QF 123456789 201710QF 6 6
123456789201801QF 123456789 201801QF 8 8
123456789201804QF 123456789 201804QF 15 15
123456789201807QF 123456789 201807QF 11 11
123456789201810QF 123456789 201810QF 13 13
123456789201901QF 123456789 201901QF 11 11
234567890201410QF 234567890 201410QF 10 10
234567890201501QF 234567890 201501QF 12 12
234567890201504QF 234567890 201504QF 9 9
234567890201507QF 234567890 201507QF 6 6

SQL: SELECT a.stunumterm, a.stunum, a.term, a.CrAtt, a.CrErn,
(select sum(CrErn) from ag as b where b.stunum =a.stunum and b.stunumterm<=a.stunumterm) AS SumCrErn
FROM AG AS a
ORDER BY a.stunumterm

Result:

stunumterm stunum term CrAtt CrErn SumCrErn
123456789200000QF 123456789 200000QF
 
12 12
123456789201410QF 123456789 201410QF 12 12 24
123456789201501QF 123456789 201501QF 6 6 30
123456789201504QF 123456789 201504QF 6 6 36
123456789201507QF 123456789 201507QF 8 8 44
123456789201510QF 123456789 201510QF 7 7 51
123456789201601QF 123456789 201601QF 12 12 63
123456789201604QF 123456789 201604QF 5 5 68
123456789201607QF 123456789 201607QF 6 6 74
123456789201610QF 123456789 201610QF 12 12 86
123456789201701QF 123456789 201701QF 12 12 98
123456789201704QF 123456789 201704QF 8 8 106
123456789201707QF 123456789 201707QF 10 10 116
123456789201710QF 123456789 201710QF 6 6 122
123456789201801QF 123456789 201801QF 8 8 130
123456789201804QF 123456789 201804QF 15 15 145
123456789201807QF 123456789 201807QF 11 11 156
123456789201810QF 123456789 201810QF 13 13 169
123456789201901QF 123456789 201901QF 11 11 180
234567890201410QF 234567890 201410QF 10 10 10
234567890201501QF 234567890 201501QF 12 12 22
234567890201504QF 234567890 201504QF 9 9 31
234567890201507QF 234567890 201507QF 6 6 37

Introduction

One typical question is, how to calculate running totals in SQL Server. There are several ways of doing it and this article tries to explain a few of them.

Test environment

First we need a table for the data. To keep things simple, let's create a table with just an auto incremented id and a value field.

 
--------------------------------------------------------------------
-- table for test 
--------------------------------------------------------------------
CREATE TABLE RunTotalTestData (
   id    int not null identity(1,1) primary key,
   value int not null
);

And populate it with some data:

 
--------------------------------------------------------------------
-- test data
--------------------------------------------------------------------
INSERT INTO RunTotalTestData (value) VALUES (1);
INSERT INTO RunTotalTestData (value) VALUES (2);
INSERT INTO RunTotalTestData (value) VALUES (4);
INSERT INTO RunTotalTestData (value) VALUES (7);
INSERT INTO RunTotalTestData (value) VALUES (9);
INSERT INTO RunTotalTestData (value) VALUES (12);
INSERT INTO RunTotalTestData (value) VALUES (13);
INSERT INTO RunTotalTestData (value) VALUES (16);
INSERT INTO RunTotalTestData (value) VALUES (22);
INSERT INTO RunTotalTestData (value) VALUES (42);
INSERT INTO RunTotalTestData (value) VALUES (57);
INSERT INTO RunTotalTestData (value) VALUES (58);
INSERT INTO RunTotalTestData (value) VALUES (59);
INSERT INTO RunTotalTestData (value) VALUES (60);

The scenario is to fetch a running total when the data is ordered ascending by the id field.

Correlated scalar query

One very traditional way is to use a correlated scalar query to fetch the running total so far. The query could look like:

 
--------------------------------------------------------------------
-- correlated scalar
--------------------------------------------------------------------
SELECT a.id, a.value, (SELECT SUM(b.value)
                       FROM RunTotalTestData b
                       WHERE b.id <= a.id)
FROM   RunTotalTestData a
ORDER BY a.id;

When this is run, the results are:

 
id   value   running total
--   -----   -------------
1    1       1
2    2       3
3    4       7
4    7       14
5    9       23
6    12      35
7    13      48
8    16      64
9    22      86
10   42      128
11   57      185
12   58      243
13   59      302
14   60      362

So there it was. Along with the actual row values, we have a running total. The scalar query simply fetches the sum of the value field from the rows where the ID is equal or less than the value of the current row. Let us look at the execution plan:

600_CorrelatedScalarPlan.jpg

What happens is that the database fetches all the rows from the table and using a nested loop, it again fetches the rows from which the sum is calculated. This can also be seen in the statistics:

 
Table 'RunTotalTestData'. Scan count 15, logical reads 30, physical reads 0...

Using join

Another variation is to use join. Now the query could look like:

 
--------------------------------------------------------------------
-- using join 
--------------------------------------------------------------------
SELECT a.id, a.value, SUM(b.Value)
FROM   RunTotalTestData a,
       RunTotalTestData b
WHERE b.id <= a.id
GROUP BY a.id, a.value
ORDER BY a.id;

The results are the same but the technique is a bit different. Instead of fetching the sum for each row, the sum is created by using a GROUP BY clause. The rows are cross joined restricting the join only to equal or smaller ID values in B. The plan:

600_SubsetJoinPlan.jpg

The plan looks somewhat different and what actually happens is that the table is read only twice. This can be seen more clearly with the statistics.

 
Table 'RunTotalTestData'. Scan count 2, logical reads 31...

The correlated scalar query has a calculated cost of 0.0087873 while the cost for the join version is 0.0087618. The difference isn't much but then again it has to be remembered that we're playing with extremely small amounts of data.

Using conditions

In real-life scenarios, restricting conditions are often used, so how are conditions applied to these queries. The basic rule is that the condition must be defined twice in both of these variations. Once for the rows to fetch and the second time for the rows from which the sum is calculated.

If we want to calculate the running total for odd value numbers, the correlated scalar version could look like the following:

 
--------------------------------------------------------------------
-- correlated scalar, subset
--------------------------------------------------------------------
SELECT a.id, a.value, (SELECT SUM(b.value)
                       FROM RunTotalTestData b
                       WHERE b.id <= a.id
                       AND b.value % 2 = 1)
FROM  RunTotalTestData a
WHERE a.value % 2 = 1
ORDER BY a.id;

The results are:

 
id   value   runningtotal
--   -----   ------------
1    1       1
4    7       8
5    9       17
7    13      30
11   57      87
13   59      146

And with the join version, it could be like:

 
--------------------------------------------------------------------
-- with join, subset
--------------------------------------------------------------------
SELECT a.id, a.value, SUM(b.Value)
FROM   RunTotalTestData a,
       RunTotalTestData b
WHERE b.id        <= a.id
AND   a.value % 2  = 1
AND   b.value % 2  = 1
GROUP BY a.id, a.value
ORDER BY a.id;

When actually having more conditions, it can be quite painful to maintain the conditions correctly. Especially if they are built dynamically.

Calculating running totals for partitions of data

If the running total needs to be calculated to different partitions of data, one way to do it is just to use more conditions in the joins. For example, if the running totals would be calculated for both odd and even numbers, the correlated scalar query could look like:

 
--------------------------------------------------------------------
-- correlated scalar, partitioning
--------------------------------------------------------------------
SELECT a.value%2, a.id, a.value, (SELECT SUM(b.value) 
                               FROM RunTotalTestData b
                               WHERE b.id <= a.id
                               AND b.value%2 = a.value%2)
FROM   RunTotalTestData a
ORDER BY a.value%2, a.id;

The results:

 
even   id   value   running total
----   --   -----   -------------
0      2    2       2
0      3    4       6
0      6    12      18
0      8    16      34
0      9    22      56
0      10   42      98
0      12   58      156
0      14   60      216
1      1    1       1
1      4    7       8
1      5    9       17
1      7    13      30
1      11   57      87
1      13   59      146

So now the partitioning condition is added to the WHERE clause of the scalar query. When using the join version, it could be similar to:

 
--------------------------------------------------------------------
-- with join, partitioning
--------------------------------------------------------------------
SELECT a.value%2, a.id, a.value, SUM(b.Value)
FROM   RunTotalTestData a,
       RunTotalTestData b
WHERE b.id      <= a.id
AND   b.value%2  = a.value%2
GROUP BY a.value%2, a.id, a.value
ORDER BY a.value%2, a.id;

With SQL Server 2012

SQL Server 2012 makes life much more simpler. With this version, it's possible to define an ORDER BY clause in the OVER clause.

So to get the running total for all rows, the query would look:

 
--------------------------------------------------------------------
-- Using OVER clause
--------------------------------------------------------------------
SELECT a.id, a.value, SUM(a.value) OVER (ORDER BY a.id)
FROM   RunTotalTestData a
ORDER BY a.id;

The syntax allows to define the ordering of the partition (which in this example includes all rows) and the summary is calculated in that order.

To define a condition for the data, it doesn't have to be repeated anymore. The running total for odd numbers would look like:

 
--------------------------------------------------------------------
-- Using OVER clause, subset
--------------------------------------------------------------------
SELECT a.id, a.value, SUM(a.value) OVER (ORDER BY a.id)
FROM   RunTotalTestData a
WHERE a.value % 2 = 1
ORDER BY a.id;

And finally, partitioning would be:

 
--------------------------------------------------------------------
-- Using OVER clause, partition
--------------------------------------------------------------------
SELECT a.value%2, a.id, a.value, SUM(a.value) OVER (PARTITION BY a.value%2 ORDER BY a.id)
FROM   RunTotalTestData a
ORDER BY a.value%2, a.id;

What about the plan? It's looking very different. For example, the simple running total for all rows looks like:

600_OverOrderBy.jpg

And the statistics:

 
Table 'Worktable'. Scan count 15, logical reads 85, physical reads 0...
Table 'RunTotalTestData'. Scan count 1, logical reads 2, physical reads 0...

Even though the scan count looks quite high at first glance, it isn't targeting the actual table but a worktable. The worktable is used to store intermediate results which are then read in order to create the calculated results.

The calculated cost for this query is now 0.0033428 while previously with the join version, it was 0.0087618. Quite an improvement.

 

SQL Server: Calculating Running Totals, Subtotals and Grand Total Without a Cursor

If you have ever had the need to show detailed data of individual transactions and also keep a running total, subtotals, and grand total columns at the same time, but were not exactly sure how to tackle the problem then this article might help. In this article I will show you a few different techniques for calculating and summing information on multiple rows without using a cursor. The techniques I will show you will just use a basic SELECT statement. Of course, the calculations of the running total, subtotals and grand total will be done using other SQL clauses and functions like SUM and CASE.

Sample Data Used by Examples

Prior to showing you my examples, I will first need to establish a set of test data, which all my examples will use. My test data will consist of an "Orders" table with the following format:

create table Orders
(OrderID int identity,
 OrderAmt Decimal(8,2),
 OrderDate SmallDatetime)

I've populated this test Orders table with the following set of records:

OrderID     OrderAmt   OrderDate                                              
----------- ---------- --------------------
1           10.50      2003-10-11 08:00:00
2           11.50      2003-10-11 10:00:00
3           1.25       2003-10-11 12:00:00
4           100.57     2003-10-12 09:00:00
5           19.99      2003-10-12 11:00:00
6           47.14      2003-10-13 10:00:00
7           10.08      2003-10-13 12:00:00
8           7.50       2003-10-13 19:00:00
9           9.50       2003-10-13 21:00:00

All my examples will be using this table to produce the running totals, sub totals, and grand total reports shown below. Basically this table contains a number of different orders that where created over time. Each order has an ID (OrderID) which uniquely identifies each record, an order amount (OrderAmt) that holds a decimal amount for the order, and a timestamp (OrderDate) that identifies when the order was placed.

Running Total On Each Record

This first example will display a simple method of calculating the running total of the OrderAmt. The calculated running total will be displayed along with each record in the Orders table. The "Running Total" column will be created with a simple SELECT statement and a correlated sub query. The correlated sub query is the part of the statement that does the heavy lifting to produce the running total.

select OrderId, OrderDate, O.OrderAmt
	,(select sum(OrderAmt) from Orders 
                          where OrderID <= O.OrderID)
	  'Running Total'
from Orders O

When I run this query against my Orders table I get the following results:

OrderId     OrderDate            OrderAmt   Running Total                            
----------- -------------------- ---------- ------------- 
1           2003-10-11 08:00:00  10.50      10.50
2           2003-10-11 10:00:00  11.50      22.00
3           2003-10-11 12:00:00  1.25       23.25
4           2003-10-12 09:00:00  100.57     123.82
5           2003-10-12 11:00:00  19.99      143.81
6           2003-10-13 10:00:00  47.14      190.95
7           2003-10-13 12:00:00  10.08      201.03
8           2003-10-13 19:00:00  7.50       208.53
9           2003-10-13 21:00:00  9.50       218.03

As you can see, there is a "Running Total" column that displays the running total along with the other column information associated with each Orders table record. This running total column is calculated, by summing up the OrderAmt for all Orders where the OrderID is less than or equal to the OrderID of the current ID being displayed.

Running Total for Each OrderDate

This example is similar to the one above, but I will calculate a running total for each record, but only if the OrderDate for the records are on the same date. Once the OrderDate is for a different day, then a new running total will be started and accumulated for the new day. Here is the code to accomplish this:

select O.OrderId,
    convert(char(10),O.OrderDate,101) as 'Order Date',
    O.OrderAmt, 
    (select sum(OrderAmt) from Orders 
                          where OrderID <= O.OrderID and 
                               convert(char(10),OrderDate,101)
                             = convert(char(10),O.OrderDate,101))
	                               'Running Total' 
from Orders O
  order by OrderID

Here are the results returned from the query using my sample Orders Table:

OrderId     Order Date OrderAmt   Running Total                            
----------- ---------- ---------- ---------------
1           10/11/2003 10.50      10.50
2           10/11/2003 11.50      22.00
3           10/11/2003 1.25       23.25
4           10/12/2003 100.57     100.57
5           10/12/2003 19.99      120.56
6           10/13/2003 47.14      47.14
7           10/13/2003 10.08      57.22
8           10/13/2003 7.50       64.72
9           10/13/2003 9.50       74.22

Note that the "Running Total" starts out with a value of 10.50, and then becomes 22.00, and finally becomes 23.25 for OrderID 3, since all these records have the same OrderDate (10/11/2003). But when OrderID 4 is displayed the running total is reset, and the running total starts over again. This is because OrderID 4 has a different date for its OrderDate, then OrderID 1, 2, and 3. Calculating this running total for each unique date is once again accomplished by using a correlated sub query, although an extra WHERE condition is required, which identified that the OrderDate's on different records need to be the same day. This WHERE condition is accomplished by using the CONVERT function to truncate the OrderDate into a MM/DD/YYYY format.

Running Totals With Subtotals and Grand totals

In this example, I will calculate a single sub totals for all Orders that were created on the same day and a Grand Total for all Orders. This will be done using a CASE clause in the SELECT statement. Here is my example.

select O.OrderID,convert(char(10),O.OrderDate,101) 'Order Date',O.OrderAmt, 
       case when OrderID = (select top 1 OrderId from Orders 
                           where convert(char(10),OrderDate,101)
                              = convert(char(10),O.OrderDate,101)
                          order by OrderID desc)
            then (select cast(sum(OrderAmt) as char(10)) 
                     from Orders
                     where OrderID <= O.OrderID
                        and convert(char(10),OrderDate,101)
                           = convert(char(10),O.OrderDate,101))
            else ' ' end as 'Sub Total',
       case when OrderID = (select top 1 OrderId from Orders 
                           order by OrderDate desc)
            then (select cast(sum(OrderAmt) as char(10)) 
                      from Orders) 
             else ' ' end as 'Grand Total'
from Orders O
  order by OrderID

Output from the SELECT statement looks like this:

OrderID     Order Date OrderAmt   Sub Total  Grand Total 
----------- ---------- ---------- ---------- ----------- 
1           10/11/2003 10.50                           
2           10/11/2003 11.50                           
3           10/11/2003 1.25       23.25                
4           10/12/2003 100.57                          
5           10/12/2003 19.99      120.56               
6           10/13/2003 47.14                           
7           10/13/2003 10.08                           
8           10/13/2003 7.50                            
9           10/13/2003 9.50       74.22      218.03

In this example the first CASE statement controls the printing of the "Sub Total' column. As you can see, the sub total is printed only on the last order of the day, which is determined by using a correlated sub query. The second CASE statement prints the "Grand Total", which is only printed along with the very last order. Each of these CASE statements uses the TOP clause to determine which OrderID is the correct order for which to print out the "Grand Total".

Conclusion

Hopefully these examples will help you understand different methods that can be used to calculate running totals, sub totals, and a grand total. As you can see you don't need to use a cursor to calculate these different totals. With the creative use of correlated sub queries and other SELECT clauses like CASE you can easily create all these different totals. Next time you need to calculate totals consider using one of these non-cursor based solutions.