Unit Tests and Defensive Programming

Overview, Objectives, and Key Terms

In Lecture 7, errors were identified by debugging. In this lesson, we look at ways to ensure bugs don’t exist in the first place by using unit tests and defensive programming techniques. Although the extent to which these tools should be used depends quite a bit on the magnitude of the program being developed, even their limited use should help you write better code.

Objectives

By the end of this lesson, you should be able to

  • Write programs and functions that use assert statements to ensure correctness of inputs and/or outputs.
  • Write unit tests for testing individual functions.

Key Terms

  • defensive programming
  • assert
  • test-driven development
  • unittest
  • unittest.TestCase
  • unittest.main

Be on the Defensive

Now that we’ve learned how to write functions and put them in modules for use by you and others, the probability that things go wrong is higher. When you write a function with inputs, those inputs generally have an acceptable range of types and values.

Often, use of incorrect types will result an exception. Recall our function mean_abs_error from Lecture 16. If we pass that a single value, or anything else that is not a sequential type, we get an error that pretty clearly shows what is wrong:

In [1]:
from error_metrics import mean_abs_error
mean_abs_error(0.1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-e3bca46ac261> in <module>()
      1 from error_metrics import mean_abs_error
----> 2 mean_abs_error(0.1)

~/Research/me400_notes/source/lectures/error_metrics.py in mean_abs_error(e)
      2     """Mean, absolute error."""
      3     v = 0
----> 4     for i in range(len(e)) :
      5         v += abs(e[i])
      6     return v/len(e)

TypeError: object of type 'float' has no len()

That means that whoever called the function made an error: the function requires a sequential type like list. A compiled language would rarely allow that; Python allows it to be written, but scolds us when executed.

However, the chastising automated by run-time errors cannot account for all errors made by the caller, perhaps due to typos or a misunderstanding of the function (that’s why we document things, right?). A possible solution is to use defensive programming, which, sort of cheekily, is the art of making our programs “idiot proof.” We stop the user before they get too far (by checking inputs, outputs, and intermediate values). Most often, it turns out, the idiot is us, so any tool to help us from misusing our own code is a good one!

As an example (and one inspired by this article), suppose we need to write a function that accepts a sequence of numbers and maps them to the range 0 to 1. Such normalization is quite common in data analysis when data ranges can span orders of magnitude while the algorithms used to analyze such data are tuned to smaller ranges. Here’s a first attempt:

In [2]:
def normalize(a):
    """Normalize the elements of a to the range [0 1]"""
    b = []
    for i in a:
        b.append(i/max(a))
    return b
In [3]:
normalize([1,2,3])
Out[3]:
[0.3333333333333333, 0.6666666666666666, 1.0]

That works: all three numbers are in the correct range. How about

In [4]:
normalize([-1,2,3])
Out[4]:
[-0.3333333333333333, 0.6666666666666666, 1.0]

Whoops. Something went wrong here. That first value is negative. The problem is simple, and the solution depends on our original intent. Currently, for normalize to do what we say it does requires that the values in a are greater than or equal to zero. Is that what we intended? If not, we have a bug to fix, and the unit tests to be covered below will help us do that. However, if we really do want to process non-negative numbers only, then we need to (1) make sure the user knows that limitation, and (2) provide a way to terminate the execution before the user is surprised by the wrong result. Here’s the solution, and it uses a new assert statement:

In [5]:
def normalize_defensive(a):
    """Normalize the elements of a to the range [0 1].  Elements of a should be >= 0."""
    b = []
    for i in a:
        assert i >= 0, 'Elements of a must be >= 0!'
        b.append(i/max(a))
    return b
In [6]:
normalize_defensive([1, 2, 3])
Out[6]:
[0.3333333333333333, 0.6666666666666666, 1.0]
In [7]:
normalize_defensive([-1, 2, 3])
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-7-77598fd7f16e> in <module>()
----> 1 normalize_defensive([-1, 2, 3])

<ipython-input-5-26edd58ca715> in normalize_defensive(a)
      3     b = []
      4     for i in a:
----> 5         assert i >= 0, 'Elements of a must be >= 0!'
      6         b.append(i/max(a))
      7     return b

AssertionError: Elements of a must be >= 0!

We’ve updated the docstring and added the assert statement. The general syntax for assert is

assert condition, message

where condition must have a bool equivalent, and message is a str value to print if the assertion fails. One can skip the message, e.g.,

In [8]:
assert 1 == 0
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-8-34d369bab8db> in <module>()
----> 1 assert 1 == 0

AssertionError:

However, including a message helps a user know what went wrong and how to address it.

Note: Use assert statements to ensure (1) functions get proper inputs, (2) functions provide proper outputs, and (3) variables have correct values throughout program execution.

You should find that assert statements are very useful as you develop longer Python programs, more complex functions, etc. By asserting that values are correct (or within a range, of a given type, etc.) during midexecution, you can catch bugs much sooner than you might otherwise be able to do.

Exercise: Modify the functions of error_metrics so that only sequential types are allowed as inputs.

Hint: Look up the function hasattr.

Testing, 1-2-3

We already saw in Lecture 16 how to include an executable block in a module (i.e., the if __name__ == "__main__" trick). That’s a pretty common place to put quick tests and demonstrations of a module. Sometimes, though, it is useful to write code specific for testing the logical correctness of your functions. Enter unit tests.

Let’s go back to our error-metric functions. Each one has a precise mathematical definition, and for a given input, we can compute the expected output. For example, given e = [0.1, -0.2, 0.3], the mean, absolute error is 0.2. However, remember that we’re dealing with a floating-point number system that uses 1’s and 0’s, and it turns out 0.1, 0.2, and 0.3 cannot be exactly represented (go try representing 0.1 as a sum of powers of two!). Hence, when we expect the result to be 0.2, we should hope, at best, for it to satisfy something like

\[|result - expected| < \tau\]

for some tolerance \(\tau\). Here, we’ll set that tolerance to \(10^{-14}\) to account not only for the 0.2 not-being-in-the-system issue, but also to account for effects of rounding (since 0.1, 0.2, and 0.3, and their sum must be rounded along the way).

unittest

To automate this sort of test, we can use the built-in unittest module. Its use is straightforward to show by example:

In [ ]:
import unittest
from math import sqrt

from error_metrics import mean_abs_error, rms_error, max_abs_error

class TestErrorMetrics(unittest.TestCase) :
    """The class name is TestErrorMetrics and should indicate in some way
       what is being tested."""

    def test_mean_abs_error(self) :
        e = [0.1, -0.2, 0.3]
        # assertAlmostEqual comparse two values (here,
        # mean_abs_error(e) and 0.2) and ensures they are the same
        # out to 14 decimal places (i.e., 10^-14)
        self.assertAlmostEqual(mean_abs_error(e), 0.2, places=14)

    def test_rms_error(self) :
        e = [0.1, -0.2, 0.3]
        self.assertAlmostEqual(rms_error(e), sqrt(0.14), places=14)

    def test_max_abs_error(self) :
        e = [0.1, -0.2, 0.3]
        self.assertAlmostEqual(max_abs_error(e), 0.3, places=14)

These contents can be located in a new file, e.g., test_error_metrics.py. To run the tests, there are several options. First, the following two lines can be added to the end of the file:

if __name__ == '__main__':
    unittest.main()

Then, test_error_metrics.py can be executed using python test_error_metrics.py or from within an environment like Spyder. (This approach is probably how you’ve been using the homework testers all along!)

In this Jupyter notebook, a different approach is needed:

In [ ]:
test = TestErrorMetrics()
suite = unittest.TestLoader().loadTestsFromModule(test)
unittest.TextTestRunner().run(suite)

Of course, these unit tests are introducing some new syntax: what is class? What is self? Although object-oriented programming is outside our scope, a class is a user-defined type that can include attributes (values) and functions (a bit like modules), and the self variable is an object or instance of the class. A simple example: int is actually a class, and the variable a = 1 is an object of the int class:

In [ ]:
a = 1
a.__class__

That said, you are not responsible for understanding any details of classes. The true objective here is that you can adapt the example unit tests shown above for use in testing your own programs.

Now, in the tests above, just one function was used assertAlmostEqual, which is great for comparing two float values that may differ up to some tolerance (or, here, number of decimal places). The following are some others:

  • assertEqual(a, b) (are a and b equal?)
  • assertTrue(condition) (is condition true?)
  • assertGreater(a, b) (is a > b?)
  • assertIn(item, seq) (is item in seq?)

There are others, to be found by dir(unittest.TestCase) and understood using help.

Test-Driven Development

Armed with unittest, may I propose a philosophy for solving programming problems?

Note: Before implementing a function, a module, or an entire program, write the tests. Then, at every step of your development process, you can establish immediately whether your implementation is doing what you think it should be doing.

This is philosophy is the basis for test-driven development (and, as you may note, has been encouraged all along with the homework testing files). Although test-driven development (or TDD) is a bonafide software development strategy, the basic concept of writing the test, failing the test, and then writing the code to pass the test is the important takeaway.

The challenge, of course, is knowing what the tests should be. Only after careful consideration of the requirements (whether from a homework problem statement, your boss, or a customer) can you begin to define the tests. Over time, these tests might change (understanding improves) or grow in number (scope creep is real). The important thing is to define tests sufficient in number and breadth so that every line of code can be shown to do what it is supposed to do.

Exercise: Implement unit tests for a linear search function that given a sorted sequence a returns the first location of v in the array. If v is not found, the location returned should be the (last) location of the element nearest to but smaller than v. The goal here is to test for normal cases and edge cases, those particular cases that may break otherwise reasonable logic (think empty sequences, unordered sequences, sequences of just one element, v smaller than the smallest value in a, etc.) Then, implement that linear search function to pass the tests.

Exercise: So the same for binary search. Hint: Do you actually need to write new tests?

Exercise: Write unit tests for a module with functions that analyze strings. Specifically, write the tests for three functions: (1) is_palindrome(s) (that checks whether s is a palindrome), (2) longest_palindrome(s) that returns the largest palindrome in s, and (3) longest_common_sequence(s1, s2) that returns the longest common substring shared by s1 and s2.

Further Reading

None at this time.