Key Points
What is unit testing?
- Testing is formalising what you already do informally when you verify your code
- Verification confirms that code implements a model correctly; validation confirms the model describes nature correctly — unit testing addresses verification
- Unit tests check that a single function produces the expected output for a given input
- Integration tests make sure that code units work together properly.
- Regression tests ensure that everything works the same today as it did yesterday.
Organizing code to enable unit testing
- Tests live in their own file and are compiled separately from the code under test
- A function is easy to test if it takes all its inputs as parameters and returns its output as a value
- Global state, side effects, hidden dependencies, and mixed concerns make functions harder to test and harder to reason about
- Writing testable code and writing maintainable code are largely the same discipline
- Refactoring untested code safely requires characterising its existing behaviour first — which requires tests you do not yet have
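
The contrast between hidden state and explicit inputs can be sketched as follows. This is a minimal illustration, not code from the lesson: the names g_scale, scaled_hidden, and scaled are hypothetical.

```cpp
#include <cassert>

// Hypothetical example, not taken from the lesson code.

// Harder to test: the result depends on a global that any other code may change.
double g_scale = 1.0;
double scaled_hidden(double x) { return x * g_scale; }

// Easier to test: every input is a parameter and the output is the return value.
double scaled(double x, double scale) { return x * scale; }

// A test of scaled() needs no setup and does not depend on what ran before it.
void test_scaled() {
    assert(scaled(2.0, 3.0) == 6.0);
    assert(scaled(0.0, 5.0) == 0.0);
}
```

A test of scaled_hidden() would first have to put g_scale into a known state and would silently depend on every other piece of code that touches it.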
Unit testing with assert()
- Documentation and testing are symbiotic:
  - Documentation records our expectations of the code’s behaviour.
  - Tests encode the verification of this behaviour in test cases.
- assert(expression) aborts the program if expression is false; silence means the test passed.
- Failure of an assertion results in an error message and program termination, providing a clear test failure condition.
- A failing assert() tells you that something went wrong and where in the code, but not directly how.
- Manual compilation of multiple test files does not scale.
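
As a minimal sketch of what an assert()-based test can look like, assuming a hypothetical function add() that is not part of the lesson code:

```cpp
#include <cassert>

// Hypothetical function under test.
int add(int a, int b) { return a + b; }

// Each assert() is one check. If any expression is false, the program
// aborts and reports the file, line, and failing expression.
void test_add() {
    assert(add(2, 3) == 5);
    assert(add(-1, 1) == 0);
    assert(add(0, 0) == 0);
}
```

Calling test_add() from main() and seeing no output means every check passed. A failure reports the expression and its location, but not the value that add() actually returned.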
Integrating tests into a build system
- A build system like CMake ensures tests are always compiled against the current code before they are run.
- CTest is a test runner — it does not care how tests are written, only whether the executable exits cleanly
- Tests you have to run manually are tests you will forget to run — automation removes that risk
- Keeping the barrier to running tests low is as important as writing the tests themselves
- assert() is disabled when NDEBUG is defined, and CMake's default Release flags typically define it, so every assert()-based check silently disappears from a release build
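
The effect of NDEBUG can be observed directly. The sketch below defines NDEBUG by hand to imitate what a release configuration usually does through a compiler flag; the function name check_was_evaluated is hypothetical.

```cpp
#define NDEBUG       // imitates a release build, which usually passes -DNDEBUG
#include <cassert>

// With NDEBUG defined, assert(expr) expands to nothing,
// so the expression inside it is never evaluated.
bool check_was_evaluated() {
    bool evaluated = false;
    assert((evaluated = true));  // compiled out entirely under NDEBUG
    return evaluated;            // still false
}

// Re-enable assert() for anything that follows in this file. The standard
// allows <cassert> to be included again to pick up the new NDEBUG state.
#undef NDEBUG
#include <cassert>
```

A test program built this way would exit cleanly no matter what the code under test does, which is why an assert()-based suite should be built in a configuration that leaves NDEBUG undefined.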
Introducing GoogleTest
- assert() gives you an abort; GoogleTest tells you which test failed, what the actual value was, and what the expected value was
- GoogleTest integrates with CMake/CTest so your existing build workflow does not change
- A failing TEST does not prevent further TESTs from running.
- EXPECT_* continues after a failure; ASSERT_* stops the current TEST. Use ASSERT_* when continuing would be meaningless.
Testing with floating point numbers
- Floating point arithmetic is not exact — two calculations that are mathematically equal may not be numerically equal.
- EXPECT_EQ is appropriate for floating point only when the value is exactly representable.
- EXPECT_DOUBLE_EQ and EXPECT_FLOAT_EQ check that two floating point numbers are within 4 units in the last place (ULPs) of each other.
- EXPECT_NEAR(a, b, tol) checks that |a - b| <= tol; the tolerance should reflect the expected numerical error from the specific type of calculation, not be chosen arbitrarily.
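
The points above can be illustrated without GoogleTest. The helper within_tolerance() below is a hypothetical, hand-written stand-in for the idea behind EXPECT_NEAR, and the example assumes IEEE 754 double precision.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hand-written stand-in for the idea behind EXPECT_NEAR: |a - b| <= tol.
bool within_tolerance(double a, double b, double tol) {
    return std::fabs(a - b) <= tol;
}

void test_floating_point() {
    const double sum = 0.1 + 0.2;

    // Mathematically equal, numerically not: neither 0.1 nor 0.2 is
    // exactly representable in binary floating point.
    assert(sum != 0.3);

    // One addition of values below 1 is off by at most a few machine
    // epsilons, so the tolerance is derived from that, not picked at random.
    const double tol = 4 * std::numeric_limits<double>::epsilon();
    assert(within_tolerance(sum, 0.3, tol));

    // 0.5, 0.25, and 0.75 are exactly representable, so == is safe here.
    assert(0.5 + 0.25 == 0.75);
}
```

A longer calculation accumulates more rounding error, so its tolerance would need to be derived from the number and kind of operations involved.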
Testing exceptional behaviour
- Testing what your code refuses to do is as important as testing what it does
- A function’s error handling is part of its specification and should be documented and tested like any other behaviour.
- Error conditions often mark the boundary conditions where bugs most commonly live, making them vital to test.
- EXPECT_THROW checks both that an exception was raised and that it was the right type; the type is part of the function’s specification.
- With invariant_mass() now fully tested, we have seen a nearly complete range of GoogleTest assertion types; the remaining episodes apply these tools to more complex code.
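
What "raised, and of the right type" means can be sketched in plain C++. The function checked_sqrt() and the helper throws() are hypothetical; throws() only mimics the kind of check EXPECT_THROW performs and is not GoogleTest's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical function whose specification says: negative input is an error.
double checked_sqrt(double x) {
    if (x < 0.0) throw std::invalid_argument("checked_sqrt: negative input");
    return std::sqrt(x);
}

// Returns true only if calling f() throws an exception of type E.
template <typename E, typename F>
bool throws(F f) {
    try {
        f();
    } catch (const E&) {
        return true;   // right type
    } catch (...) {
        return false;  // something was thrown, but not what was specified
    }
    return false;      // nothing was thrown
}

void test_checked_sqrt() {
    assert(checked_sqrt(4.0) == 2.0);
    assert(checked_sqrt(0.0) == 0.0);  // boundary: zero is still valid
    assert(throws<std::invalid_argument>([] { checked_sqrt(-1.0); }));
    assert(!throws<std::out_of_range>([] { checked_sqrt(-1.0); }));
}
```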
Testing stateful classes
- A half-open interval [x_min, x_max) has been chosen for the bins; this is a decision with testable consequences.
- There’s a distinction between n_entries() and in-range fills: overflow and underflow are counted but excluded from bin_counts() and mean().
- Note the author has defined an unweighted mean!
- The name HistogramConstruction groups all construction-related tests in a clear suite.
- We see a use case for EXPECT_NO_THROW: valid inputs should not throw!
- GoogleTest’s Matchers from its GMock component help to write tests more easily and expressively when dealing with more complex assertions.
- A stateful class is testable if its state is explicit and controlled through a well-defined interface — the difficulty arises from global state, not from state itself.
- Reading the specification before writing tests is not optional — it determines what the tests should assert and makes any ambiguities obvious.
- Each test case should verify one behaviour — if a test needs “and” in its name it is probably two tests
- GoogleTest provides helpers in GMock for more complex checks.
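
The testable consequences of the half-open interval can be sketched with a simplified stand-in for the lesson's Histogram. Only n_entries() and bin_counts() are named in the points above; the constructor arguments, fill(), and the exception type are assumptions made for this illustration, and mean() is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Simplified stand-in for the lesson's Histogram. The constructor arguments,
// fill(), and the exception type are assumptions made for this sketch.
class Histogram {
public:
    Histogram(std::size_t n_bins, double x_min, double x_max)
        : x_min_(x_min), x_max_(x_max), counts_(n_bins, 0) {
        if (n_bins == 0 || !(x_min < x_max)) {
            throw std::invalid_argument("Histogram: invalid binning");
        }
    }

    void fill(double x) {
        ++n_entries_;  // every fill is counted, in range or not
        if (!(x >= x_min_ && x < x_max_)) return;  // underflow, overflow, or NaN
        const double fraction = (x - x_min_) / (x_max_ - x_min_);
        std::size_t bin = static_cast<std::size_t>(
            fraction * static_cast<double>(counts_.size()));
        if (bin >= counts_.size()) bin = counts_.size() - 1;  // rounding guard
        ++counts_[bin];
    }

    std::size_t n_entries() const { return n_entries_; }
    const std::vector<std::size_t>& bin_counts() const { return counts_; }

private:
    double x_min_;
    double x_max_;
    std::vector<std::size_t> counts_;
    std::size_t n_entries_ = 0;
};

// One behaviour per test: the lower edge belongs to the first bin.
void test_lower_edge_is_in_first_bin() {
    Histogram h(2, 0.0, 2.0);  // bins [0, 1) and [1, 2)
    h.fill(0.0);
    assert((h.bin_counts() == std::vector<std::size_t>{1, 0}));
}

// One behaviour per test: the upper edge is counted as an entry but not binned.
void test_upper_edge_is_overflow() {
    Histogram h(2, 0.0, 2.0);
    h.fill(2.0);
    assert(h.n_entries() == 1);
    assert((h.bin_counts() == std::vector<std::size_t>{0, 0}));
}
```

Each test builds its own Histogram, so the state being checked is explicit and no test depends on another.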
Test fixtures
- A fixture eliminates repeated setup code and makes the intended starting state of each test using that fixture explicit.
- SetUp() runs before every individual test — each test starts from a clean, identical state regardless of what other tests do
- Fixtures do not change what is being tested, only how the starting state is prepared
- Construction tests belong outside the fixture — the fixture assumes construction succeeds and tests behaviour from that point
Code coverage
- Coverage measures which lines and branches were executed during testing — not whether they were tested correctly
- A line shown as covered means it ran; it does not mean the result was checked or that the test would catch a bug there
- Branch coverage is more informative than line coverage — a line can execute without all its branches being taken
- Coverage is necessary for thoroughness but not sufficient: uncovered code is certainly untested, while covered code may still be checked poorly or not at all
- The coverage report is most useful as a guide to where tests are missing, not as a measure of test quality
- Beware of diminishing returns
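
The gap between "executed" and "tested" can be sketched with a hypothetical one-line function. How a given coverage tool reports the conditional expression depends on the tool and compiler settings, so the comments describe typical behaviour only.

```cpp
#include <cassert>

// Hypothetical function with two branches on a single line.
int clamp_to_zero(int x) { return x < 0 ? 0 : x; }

// Executes the line, so line coverage marks it as covered,
// yet nothing about the result is checked.
void runs_but_checks_nothing() {
    clamp_to_zero(5);
}

// Checks the result, but only for the non-negative branch: the line is
// covered while a branch report would typically show one branch not taken.
void test_positive_input_is_unchanged() {
    assert(clamp_to_zero(5) == 5);
}

// Exercises and checks the remaining branch.
void test_negative_input_becomes_zero() {
    assert(clamp_to_zero(-3) == 0);
}
```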
Code sanitizers
- Unit tests check that your code does what you intended; sanitizers check for errors your intentions did not anticipate
- A test suite that is green and fully covered can still contain memory errors and undefined behaviour
- AddressSanitizer detects out-of-bounds memory access and use-after-free at runtime — errors that produce no compiler warning and may crash only rarely in production
- Sanitizers diagnose bugs that already exist; a well-chosen test prevents their reintroduction
- No single tool is sufficient — unit tests, coverage measurement, and sanitizers answer different questions and catch different bugs; together they give you the best practical assurance that your code is correct
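
As an illustration of the kind of bug a sanitizer is designed to catch, consider the hypothetical functions below. The buggy version has undefined behaviour, so it is shown but never called here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Buggy: the loop condition uses <=, so it reads one element past the end.
// This is undefined behaviour. It compiles without error and may appear to
// work, but a build with -fsanitize=address typically reports a
// heap-buffer-overflow the first time the function runs.
int sum_buggy(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i <= v.size(); ++i) total += v[i];
    return total;
}

// Fixed: the loop stops at the last valid index.
int sum_fixed(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// A test written alongside the fix keeps the bug from coming back.
void test_sum() {
    assert(sum_fixed({1, 2, 3}) == 6);
    assert(sum_fixed({}) == 0);
}
```

Running the existing test suite in a build compiled with -fsanitize=address -g (supported by GCC and Clang) lets the sanitizer inspect every code path the tests already exercise.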