The Logic Problem in the EndOfMonth function

2012-03-04 - General, TDD, tSQLt

Disclaimer: This post is not about the new SQL 2012 EOMONTH() function.

Introduction

This post is a follow-up on an article I wrote for the August 2012 issue of SQL Server Pro: Getting Started with Test Driven Design. You can find the full list of tests from that article at the end of this post.

In the SQL Server Pro article we developed the EndOfMonth function using Test Driven Design. That function is supposed to return the last day of the month that contains the passed in DATETIME value. At the end of the article the function seemed functional but there is a problem in the calculation that we still need to work out.

Identifying the Problem

To find the last day of the month the function subtracts n days from the passed in date were n is the "day" DATEPART. That calculation finds the end of the previous month. Then the function adds one month to that result to calculate the end of the correct month. So if we pass in 6/17/2012 the first step will be to subtract 17 days which will result in 5/31/2012. The second step then adds one month to this intermediate result which results correctly in 6/30/2012.

If we however start with 7/19/2012, subtracting 19 days will result in 6/30/2012. Adding 1 month to that date will result in 7/30/2012. That is clearly not the correct answer.

First Things First

At the end of the SQL Server Pro article I left you with a failing test case. Following TDD it is a best practice to not leave failing tests hanging, but instead always fix the code as soon as possible. So before we can address the problem we identified above, we need to make that test pass.

The test asked for the time part to be removed from the final result. Here is the test again:

[sql] CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns only datetime values with 00:00:00 time part] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
DECLARE @Expected DATETIME;
--Act
SET @Actual = dbo.EndOfMonth('2012-01-31 14:15:16.17');
--Assert
SET @Expected = '2012-01-31 00:00:00';
EXEC tSQLt.AssertEquals @Expected,@Actual;
END;
[/sql]

To make this test pass, we just need to cast the passed in value to the DATE data type before using it in the calculation:

[sql] CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
DECLARE @Date DATE = CAST(@DTime AS DATE);
SET @EndOfMonth = DATEADD(MONTH,1,DATEADD(DAY,-DATEPART(DAY,@Date),@Date));
RETURN @EndOfMonth;
END;
[/sql]

As always, after every change we run all test cases to make sure nothing unsuspected broke.

Fixing the Bug

The first thing in addressing a bug is to write a test that proves that the current functionality deviates from the specification. Let's take the example we used above for this test:

[sql] CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns the last day of the month if previous month is shorter] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
DECLARE @Expected DATETIME;
--Act
SET @Actual = dbo.EndOfMonth('2012-07-19');
--Assert
SET @Expected = '2012-07-31';
EXEC tSQLt.AssertEquals @Expected,@Actual;
END;
[/sql]

As expected, this test fails because our function returns 7/30 instead of 7/31. To fix this problem, we need to change the order of the two calculation steps. The function has to first add one month to the passed in date and then subtract the correct number of days to get to the end of the correct month. For the above example that would mean to first add a month to 7/19 which results in 8/19. Then we subtract 19 days resulting in the correct value of 7/31. Implementing that change shouldn't be too hard:

[sql] CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
DECLARE @Date DATE = CAST(@DTime AS DATE);
SET @EndOfMonth = DATEADD(DAY,-DATEPART(DAY,@Date),DATEADD(MONTH,1,@Date));
RETURN @EndOfMonth;
END;
[/sql]

That makes the new test case pass. But two other ones are now failing:

Two Failing Test after Logic Change

The reason for this new failure is that adding one month to a date like 1/31/2012 can lead to a date with a different "day" DATEPART, 2/29/2012 in this case. However, our function is using the "day" DATEPART of the passed in date for the following subtraction step. — Good thing we had those tests.

To fix this we just need to make sure that the DATEPART calculation also uses the date value after adding 1 month instead of the passed in value:

[sql] CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
DECLARE @Date DATE = CAST(DATEADD(MONTH,1,@DTime) AS DATE);
SET @EndOfMonth = DATEADD(DAY,-DATEPART(DAY,@Date),@Date);
RETURN @EndOfMonth;
END;
[/sql]

With that change in place, all tests are now passing.

Test Case Heuristics

One of the most common questions asked by people new to TDD is: "How do I know which tests to write?" And as you have seen in the above example, it is important to get this right.

There are a few techniques that can help with the test case selection, grouped together under the term test case heuristics.
The definition of the word "heurism" is "knowledge derived from empirical study and practical adoption of experience" (http://en.wiktionary.org/wiki/heurism). Test case heuristics have names like Boundary Condition, 0-1-Some and All Pairs. Depending on the logic you are trying to test, Test case heuristics provide guidance on which values to target. The "0-1-Some" heuristic for example helps with writing tests for code that deals with collections of objects or loops. If you are test driving code that interacts with the result of a table valued function you want to write test cases for at least the three circumstances in which that function returns 0 rows, 1 row or a few rows. Knowing and following the suggestions made by those test case heuristics can help to avoid most of the problems that tend to be common in code that was not test driven.

The links in above paragraph are from a distributed article series in which Dennis Lloyd Jr. is currently documenting the heuristics and how they apply to database code. You can find the complete list in his Test Case Heuristics post.

Dates

In the example of the EndOfMonth function we are dealing with dates and with boundaries. The boundary heuristic tells us that for every boundary in input values that changes the output value we have to write three tests: One that tests the boundary value itself, one just off to the side of it and one just off to the other side of the boundary. One of the boundaries for the EndOfMonth function is 5/31. For every date that is close enough to this boundary and that is <= 5/31 the function returns 5/31, for close enough dates > 5/31 the function returns 6/30. So we want to write tests for the value 5/30, 5/31 and 6/1.

Within a year we have 12 such boundaries. Those boundaries fall into five groups based on the number of days in the months that are separated by the boundary: (31/28), (28/31), (31/30), (30/31), (31/31). We want to make sure to write enough tests to cover those boundaries. We also have to check a boundary that is in between two years and we need to make sure that our function can handle leap years. Finally we should have at least one test that uses a date somewhere in the middle of a month.

Because the logic in our function is not concerned with the year of the passed in date, we do not need to test multiple different years. You should base your decisions about which tests to write on your knowledge about the underlying logic. Try to keep the number of tests small, but make sure you cover all the bases, based on the test case heuristics.

One thing to avoid is complex logic in your tests. A test should test one idea; it should be short and concise. If you start using IF statements or loops in a test take a step back and try to break it up into multiple tests. Otherwise your tests will become hard to read and understand which in turn increases the likelihood for a defect to hide in there.

Conclusion

Test case heuristics can be used to guide you in the selection of necessary tests. Following test case heuristics will uncover most defects that otherwise might stay hidden. They can also help to reduce the number of tests to write by guiding you to the right test situations.

This post provided a short introduction into test case heuristics. I hope it piqued your curiosity. To learn more make sure to follow the Test Case Heuristics series mentioned above. For more material about Test Driven Design in general you can go to the tSQLt.org website. If you have questions or comments, feel free to leave them in the comment area below.


Below you will find the complete list of tests covered in this article and also each intermediate variant of the EndOfMonth function. This list includes the tests that were part of the SQL Server Pro article. These are not the only tests I would recommend for the EndOfMonth function. Rather, they were picked to show the importance of following the heuristics to select your test cases.

[sql] EXEC tSQLt.NewTestClass 'DateFunctionTests';
GO
CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns NULL if @DTime is NULL] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
--Act
SET @Actual = dbo.EndOfMonth(NULL);
--Assert
EXEC tSQLt.AssertEquals NULL,@Actual;
END;
GO
--EXEC tSQLt.RunAll;
GO
IF OBJECT_ID('dbo.EndOfMonth') IS NOT NULL DROP FUNCTION dbo.EndOfMonth;
GO
CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
RETURN NULL;
END;
GO
CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns the passed in date if it is the last day of a month] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
DECLARE @Expected DATETIME;
--Act
SET @Actual = dbo.EndOfMonth('2012-01-31');
--Assert
SET @Expected = '2012-01-31';
EXEC tSQLt.AssertEquals @Expected,@Actual;
END;
GO
IF OBJECT_ID('dbo.EndOfMonth') IS NOT NULL DROP FUNCTION dbo.EndOfMonth;
GO
CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
RETURN @DTime;
END;
GO
CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns the last day of the month of the passed in parameter] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
DECLARE @Expected DATETIME;
--Act
SET @Actual = dbo.EndOfMonth('2012-01-10');
--Assert
SET @Expected = '2012-01-31';
EXEC tSQLt.AssertEquals @Expected,@Actual;
END;
GO
IF OBJECT_ID('dbo.EndOfMonth') IS NOT NULL DROP FUNCTION dbo.EndOfMonth;
GO
CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
SET @EndOfMonth = DATEADD(MONTH,1,DATEADD(DAY,-DATEPART(DAY,@DTime),@DTime));
RETURN @EndOfMonth;
END;
GO
CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns only datetime values with 00:00:00 time part] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
DECLARE @Expected DATETIME;
--Act
SET @Actual = dbo.EndOfMonth('2012-01-31 14:15:16.17');
--Assert
SET @Expected = '2012-01-31 00:00:00';
EXEC tSQLt.AssertEquals @Expected,@Actual;
END;
GO
IF OBJECT_ID('dbo.EndOfMonth') IS NOT NULL DROP FUNCTION dbo.EndOfMonth;
GO
CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
DECLARE @Date DATE = CAST(@DTime AS DATE);
SET @EndOfMonth = DATEADD(MONTH,1,DATEADD(DAY,-DATEPART(DAY,@Date),@Date));
RETURN @EndOfMonth;
END;
GO
CREATE PROCEDURE DateFunctionTests.[test dbo.EndOfMonth returns the last day of the month if previous month is shorter] AS
BEGIN
--Assemble
DECLARE @Actual DATETIME;
DECLARE @Expected DATETIME;
--Act
SET @Actual = dbo.EndOfMonth('2012-07-19');
--Assert
SET @Expected = '2012-07-31';
EXEC tSQLt.AssertEquals @Expected,@Actual;
END;
GO
IF OBJECT_ID('dbo.EndOfMonth') IS NOT NULL DROP FUNCTION dbo.EndOfMonth;
GO
CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
DECLARE @Date DATE = CAST(@DTime AS DATE);
SET @EndOfMonth = DATEADD(DAY,-DATEPART(DAY,@Date),DATEADD(MONTH,1,@Date));
RETURN @EndOfMonth;
END;
GO
IF OBJECT_ID('dbo.EndOfMonth') IS NOT NULL DROP FUNCTION dbo.EndOfMonth;
GO
CREATE FUNCTION dbo.EndOfMonth(@DTime DATETIME)
RETURNS DATETIME
AS
BEGIN
DECLARE @EndOfMonth DATETIME;
DECLARE @Date DATE = CAST(DATEADD(MONTH,1,@DTime) AS DATE);
SET @EndOfMonth = DATEADD(DAY,-DATEPART(DAY,@Date),@Date);
RETURN @EndOfMonth;
END;
GO
[/sql]

Categories: General, TDD, tSQLt

Leave a Reply