This is Part 4 — the final post in the data-contracts series. Part 1 introduced the problem. Part 2 covered the metaclass, validation engine, and registry. Part 3 covered the diff engine, rename heuristic, and notification bus. This post is entirely about testing — the part most tutorials skip.
Why testing this framework is interesting
Most Python testing tutorials show you how to test a function that takes inputs and returns outputs. That’s the easy case. This framework has three properties that make testing genuinely harder:
- Global state. The SchemaRegistry is a class-level dictionary shared across the entire process. Schemas registered in one test are still there when the next one runs.
- Code that runs at definition time. The metaclass fires when Python executes a class statement, not when you call a function. Most test setups don’t account for this.
- Heuristics with edge cases. The rename heuristic makes a judgment call rather than computing a single obviously correct answer. It has ambiguous cases where it should stay silent, and you need tests that verify the absence of a false positive.
Each of these needs a different testing strategy. Let’s go through them all.
Part 1: conftest.py — the foundation of everything
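In pytest, conftest.py is a special file that usually lives at the root of your test directory. Pytest discovers and loads it automatically during collection, before any tests run, and its fixtures are available to every test file in that directory and its subdirectories without an import. It’s the right place to put fixtures that multiple test files share.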
Here is the full conftest.py for this project:
# tests/conftest.py
import pytest

from data_contracts.registry import SchemaRegistry


@pytest.fixture(autouse=True)
def clear_registry():
    SchemaRegistry.clear()  # clean slate BEFORE the test
    yield                   # test runs here
    SchemaRegistry.clear()  # clean up AFTER, just in case


@pytest.fixture
def trade_v1_fields():
    return {
        "symbol": str,
        "price": float,
        "volume": int,
    }


@pytest.fixture
def trade_v2_fields():
    return {
        "symbol": str,
        "close_price": float,  # renamed from price
        "volume": int,
        "timestamp": str,      # added
    }


@pytest.fixture
def valid_trade_row():
    return {"symbol": "AAPL", "price": 182.5, "volume": 1000}


@pytest.fixture
def invalid_trade_row():
    return {"symbol": "AAPL", "price": "not-a-float"}
Let’s go through each part.
The autouse fixture — your most important line of code
autouse=True tells pytest to run this fixture for every single test automatically — without any test having to ask for it by name.
The yield is the key. Everything before it is setup. Everything after it is teardown. It behaves like a try/finally block wrapped around every test — except you write it once and never think about it again.
Here is what happens without it:
# ❌ Without autouse: tests bleed into each other
def test_one():
    class TradeSchema(ContractBase):
        symbol: str

    assert "TradeSchema" in SchemaRegistry.list_schemas()  # passes


def test_two():
    # TradeSchema is STILL registered from test_one
    assert SchemaRegistry.list_schemas() == []  # FAILS
The failure in test_two has nothing to do with test_two itself. It depends on whether test_one ran first. That is one of the most painful kinds of test failure to diagnose — it passes in isolation and fails in the full suite, or vice versa.
With autouse:
# ✅ With autouse: each test runs in a clean room
def test_one():
    # registry cleared before this test
    class TradeSchema(ContractBase):
        symbol: str

    assert "TradeSchema" in SchemaRegistry.list_schemas()  # passes


def test_two():
    # registry cleared again before THIS test
    assert SchemaRegistry.list_schemas() == []  # passes, fresh start
Test order is now completely irrelevant. Each test gets a clean slate.
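The leak and its fix can be reproduced without pytest at all. The sketch below is a mental model only: ToyRegistry is a hypothetical stand-in for SchemaRegistry, and clean_registry is a context manager that plays the role of the yield fixture (setup before the yield, teardown after it).

```python
from contextlib import contextmanager


class ToyRegistry:
    """Hypothetical stand-in for SchemaRegistry: one dict shared process-wide."""

    _schemas = {}

    @classmethod
    def register(cls, name, fields):
        cls._schemas[name] = fields

    @classmethod
    def list_schemas(cls):
        return list(cls._schemas)

    @classmethod
    def clear(cls):
        cls._schemas.clear()


@contextmanager
def clean_registry():
    """Mimics the yield fixture: clear, hand over control, clear again."""
    ToyRegistry.clear()
    try:
        yield
    finally:
        # Runs even if the body raises, like fixture teardown after a failed test.
        ToyRegistry.clear()


def fake_test_one():
    ToyRegistry.register("TradeSchema", {"symbol": str})
    return ToyRegistry.list_schemas()


def fake_test_two():
    return ToyRegistry.list_schemas()


# Without isolation, state from the first "test" is visible in the second.
fake_test_one()
leaked = fake_test_two()

# With isolation, each "test" starts from an empty registry.
with clean_registry():
    first = fake_test_one()
with clean_registry():
    second = fake_test_two()

print(leaked, first, second)
```

The first call pair shows the schema leaking across "tests"; the second pair shows each one starting empty, regardless of which runs first.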
The data fixtures — define once, use everywhere
The other fixtures in conftest.py — trade_v1_fields, valid_trade_row, and so on — are shared test data. Instead of defining the same dict in a dozen different test functions, you define it once here and inject it by name wherever it’s needed.
pytest’s fixture injection works by matching parameter names. If your test function has a parameter called valid_trade_row, pytest looks for a fixture with that name and passes it in automatically:
# pytest sees "valid_trade_row" parameter → injects the fixture
def test_valid_data_passes(valid_trade_row):
    class TradeSchema(ContractBase):
        symbol: str
        price: float
        volume: int

    result = TradeSchema.validate(valid_trade_row)
    assert result.is_valid
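If name-based injection feels like magic, a rough emulation may help. This is not how pytest is implemented internally; it is a small sketch, with a hypothetical FIXTURES table and call_with_fixtures helper, that only shows the idea of matching parameter names to fixture names.

```python
import inspect

# Hypothetical fixture table: fixture name -> factory function.
FIXTURES = {
    "valid_trade_row": lambda: {"symbol": "AAPL", "price": 182.5, "volume": 1000},
    "invalid_trade_row": lambda: {"symbol": "AAPL", "price": "not-a-float"},
}


def call_with_fixtures(test_func):
    """Look up each parameter name in FIXTURES and pass the built value in."""
    kwargs = {}
    for name in inspect.signature(test_func).parameters:
        if name not in FIXTURES:
            raise LookupError(f"fixture {name!r} not found")
        kwargs[name] = FIXTURES[name]()  # a fresh value for every call
    return test_func(**kwargs)


def row_field_count(valid_trade_row):
    return len(valid_trade_row)


def row_is_missing_volume(invalid_trade_row):
    return "volume" not in invalid_trade_row


print(call_with_fixtures(row_field_count))        # 3
print(call_with_fixtures(row_is_missing_volume))  # True
```

Building a fresh value per call mirrors pytest's default function scope: a test that mutates its row cannot affect the next test that asks for the same fixture.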
Part 2: Testing the metaclass
Testing a metaclass feels intimidating because it doesn’t behave like a normal function. You can’t call it directly and check the output. The metaclass runs as a side effect of defining a class.
The trick is to embrace that. You test the observable outcomes of defining a class — not the metaclass machinery itself.
Think of it like testing a smoke alarm. You don’t test the internal circuit. You light a small flame, and check that the alarm sounds. The observable outcome is what matters.
The four observable outcomes of the metaclass are: the schema is registered, the fields are correct, the version is correct, and validate() is attached and callable.
# tests/test_contracts.py
from data_contracts.contracts import ContractBase, contract
from data_contracts.registry import SchemaRegistry


class TestContractBase:
    def test_schema_auto_registers_on_definition(self):
        # the act of defining the class IS the test trigger
        class MySchema(ContractBase):
            name: str
            age: int

        assert "MySchema" in SchemaRegistry.list_schemas()

    def test_fields_extracted_correctly(self):
        class OrderSchema(ContractBase):
            order_id: str
            amount: float

        sv = SchemaRegistry.get_latest("OrderSchema")
        assert sv is not None
        assert sv.fields == {"order_id": str, "amount": float}

    def test_version_defaults_to_1_0_0(self):
        class NoVersionSchema(ContractBase):
            x: int

        sv = SchemaRegistry.get_latest("NoVersionSchema")
        assert sv.version == "1.0.0"

    def test_explicit_version_used(self):
        class VersionedSchema(ContractBase):
            __version__ = "3.0.0"
            x: int

        sv = SchemaRegistry.get_latest("VersionedSchema")
        assert sv.version == "3.0.0"

    def test_validate_method_attached_and_callable(self):
        class PriceSchema(ContractBase):
            symbol: str
            price: float

        assert hasattr(PriceSchema, "validate")
        assert callable(PriceSchema.validate)

    def test_private_fields_excluded_from_schema(self):
        class PrivateSchema(ContractBase):
            symbol: str
            _internal: str = ""

        sv = SchemaRegistry.get_latest("PrivateSchema")
        assert "_internal" not in sv.fields
        assert "symbol" in sv.fields

    def test_contractbase_itself_not_registered(self):
        # ContractBase has a guard clause, so it should never appear in the registry
        assert "ContractBase" not in SchemaRegistry.list_schemas()
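Notice the last test: test_contractbase_itself_not_registered. It targets the guard clause from Part 2 of this series, the if name == "ContractBase": return cls line. Testing for the absence of something is just as important as testing for its presence. One caveat: ContractBase is defined at import time, and the autouse fixture clears the registry before every test, so this assertion on its own can pass even with the guard removed. One way to exercise the guard directly is to reload the contracts module inside the test (for example with importlib.reload) and assert right after that.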
Part 3: Testing the validation engine
The validation engine is the easiest part to test because validate_data() is a pure function — same input always gives the same output, no side effects. But there are more edge cases than you’d expect, and each one deserves its own test.
# tests/test_validation.py
from data_contracts.validation import Severity, validate_data

SCHEMA = {"symbol": str, "price": float, "volume": int}


class TestValidData:
    def test_exact_match_passes(self):
        result = validate_data(
            SCHEMA, {"symbol": "AAPL", "price": 182.5, "volume": 1000}
        )
        assert result.is_valid
        assert result.errors == []
        assert result.warnings == []

    def test_int_fills_float_field(self):
        # 182 (int) should pass for a float field: numeric coercion
        result = validate_data(
            SCHEMA, {"symbol": "AAPL", "price": 182, "volume": 1000}
        )
        assert result.is_valid


class TestMissingFields:
    def test_single_missing_field(self):
        result = validate_data(SCHEMA, {"symbol": "AAPL", "price": 100.0})
        assert not result.is_valid
        assert any("volume" in e.message for e in result.errors)

    def test_all_fields_missing(self):
        result = validate_data(SCHEMA, {})
        assert not result.is_valid
        assert len(result.errors) == 3  # one per field


class TestWrongTypes:
    def test_string_for_float_fails(self):
        result = validate_data(
            SCHEMA, {"symbol": "AAPL", "price": "one-eighty", "volume": 100}
        )
        assert not result.is_valid
        assert any(e.field_name == "price" for e in result.errors)

    def test_float_for_int_fails(self):
        # the reverse coercion does NOT apply: float doesn't fill an int field
        result = validate_data(
            SCHEMA, {"symbol": "AAPL", "price": 100.0, "volume": 99.5}
        )
        assert not result.is_valid


class TestExtraFields:
    def test_extra_field_is_warning_not_error(self):
        result = validate_data(
            SCHEMA,
            {"symbol": "AAPL", "price": 100.0, "volume": 50, "new_field": "x"}
        )
        assert result.is_valid  # still valid
        assert len(result.warnings) == 1
        assert result.warnings[0].severity == Severity.WARNING
        assert len(result.errors) == 0
The test_float_for_int_fails test is worth pausing on. The coercion only goes one way: int → float is allowed, but float → int is not. A field declared as volume: int receiving 99.5 should fail — you’d silently lose the decimal part if that value were cast. Testing both directions of the coercion rule is how you document and enforce the decision.
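A one-way coercion rule like this can be expressed in a few lines. The function below is a hypothetical sketch, not the framework's validate_data; in particular, the explicit rejection of bool is an assumption added here, because bool is a subclass of int in Python and would otherwise slip into numeric fields.

```python
def fits(value, expected_type):
    """Hypothetical type check: int may fill a float field, never the reverse."""
    if isinstance(value, bool) and expected_type is not bool:
        # Assumption of this sketch: True/False should not pass as numbers.
        return False
    if expected_type is float:
        # Widening: an int is accepted where a float is declared.
        return isinstance(value, (int, float))
    # Everything else must be an instance of the declared type.
    return isinstance(value, expected_type)


checks = {
    "int into float": fits(182, float),
    "float into int": fits(99.5, int),
    "str into float": fits("one-eighty", float),
    "int into int": fits(1000, int),
    "bool into int": fits(True, int),
}
for label, outcome in checks.items():
    print(f"{label}: {outcome}")
```

Only the float branch widens; there is no matching branch for int, which is exactly the asymmetry the two coercion tests pin down.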
Part 4: Testing the diff engine with @pytest.mark.parametrize
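The diff engine has four change types, each of which needs testing. Three of them (field added, field removed, type changed) follow the same pattern: build two schemas, diff them, check the classification. You could write three separate test functions. But there’s a cleaner way: @pytest.mark.parametrize, which runs the same test logic against multiple inputs automatically. The fourth change type, the rename, gets its own section below.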
Think of parametrize as a loop over test cases — except each iteration is reported as a separate test in the output, with its own pass/fail status. If one case fails, the others still run.
# tests/test_diff.py
import pytest

from data_contracts.diff import ChangeType, SchemaDiff
from data_contracts.registry import SchemaVersion


def sv(name, version, fields):
    # helper to build a SchemaVersion without importing dataclass details
    return SchemaVersion(name=name, version=version, fields=fields)


@pytest.mark.parametrize("old_f, new_f, expected_type, is_breaking", [
    (
        {"x": str},
        {"x": str, "y": int},
        ChangeType.FIELD_ADDED,
        False,  # adding a field is safe
    ),
    (
        {"x": str, "y": int},
        {"x": str},
        ChangeType.FIELD_REMOVED,
        True,   # removing a field is breaking
    ),
    (
        {"x": int},
        {"x": str},
        ChangeType.TYPE_CHANGED,
        True,   # changing a type is breaking
    ),
])
def test_change_type_classification(old_f, new_f, expected_type, is_breaking):
    report = SchemaDiff(
        sv("S", "1.0", old_f),
        sv("S", "2.0", new_f),
    ).generate_report()
    # the None default turns a missing change into a clean assertion failure
    # instead of a StopIteration raised by next()
    change = next(
        (c for c in report.changes if c.change_type == expected_type),
        None,
    )
    assert change is not None
    assert report.is_breaking == is_breaking
This single parametrized test runs three times and produces three separate entries in your test output — one per case. Add a new change type in the future? Add one more tuple to the list. Zero new test functions needed.
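The "loop with separate reporting" idea can be emulated in plain Python. This is only a mental model, not pytest's machinery: classify is a hypothetical toy diff, and run_cases records each case's outcome instead of stopping at the first failure. The last case carries a deliberately wrong expectation to show that the others still run.

```python
def classify(old_fields, new_fields):
    """Toy diff (hypothetical): return a list of (change_type, is_breaking) pairs."""
    changes = []
    changes += [("FIELD_ADDED", False) for _ in new_fields.keys() - old_fields.keys()]
    changes += [("FIELD_REMOVED", True) for _ in old_fields.keys() - new_fields.keys()]
    changes += [
        ("TYPE_CHANGED", True)
        for name in old_fields.keys() & new_fields.keys()
        if old_fields[name] is not new_fields[name]
    ]
    return changes


CASES = [
    # (case id, old fields, new fields, expected change type, expected breaking flag)
    ("added", {"x": str}, {"x": str, "y": int}, "FIELD_ADDED", False),
    ("removed", {"x": str, "y": int}, {"x": str}, "FIELD_REMOVED", True),
    ("type_changed", {"x": int}, {"x": str}, "TYPE_CHANGED", True),
    # Deliberately wrong expectation: adding a field is not breaking.
    ("deliberately_wrong", {"x": str}, {"x": str, "y": int}, "FIELD_ADDED", True),
]


def run_cases(cases):
    """Run every case and record its own outcome, like one report line per case."""
    results = {}
    for case_id, old_f, new_f, expected_type, is_breaking in cases:
        try:
            assert (expected_type, is_breaking) in classify(old_f, new_f)
            results[case_id] = "PASSED"
        except AssertionError:
            results[case_id] = "FAILED"
    return results


results = run_cases(CASES)
for case_id, outcome in results.items():
    print(f"{outcome} {case_id}")
```

A plain for loop with a bare assert would have stopped at the first failing case; collecting one outcome per case is the behaviour parametrize gives you for free.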
Part 5: Testing the rename heuristic edge cases
The rename heuristic is the part of the framework most likely to produce a false positive — a rename detection that’s wrong. So this is the section that needs the most careful edge case coverage.
There are four cases to test: the happy path (rename correctly detected), the ambiguous case (two fields of the same type, so the heuristic should stay silent), the identical case (no changes, so nothing should fire at all), and the one-to-many case (one field removed, two fields of the same type added).
class TestRenameHeuristic:
    def test_rename_detected_when_type_matches(self):
        # clear case: one float removed, one float added = rename
        report = SchemaDiff(
            sv("S", "1.0", {"price": float, "symbol": str}),
            sv("S", "2.0", {"close_price": float, "symbol": str}),
        ).generate_report()
        rename = next(
            c for c in report.changes
            if c.change_type == ChangeType.FIELD_RENAMED
        )
        assert rename.old_value == "price"
        assert rename.new_value == "close_price"
        assert rename.is_breaking

    def test_ambiguous_types_produce_no_rename(self):
        # two floats removed, two floats added: heuristic must stay silent
        report = SchemaDiff(
            sv("S", "1.0", {"bid": float, "ask": float}),
            sv("S", "2.0", {"bid_price": float, "ask_price": float}),
        ).generate_report()
        change_types = {c.change_type for c in report.changes}
        assert ChangeType.FIELD_RENAMED not in change_types

    def test_identical_schemas_produce_no_changes(self):
        fields = {"symbol": str, "price": float}
        report = SchemaDiff(
            sv("S", "1.0", fields),
            sv("S", "2.0", fields),
        ).generate_report()
        assert len(report.changes) == 0
        assert not report.is_breaking

    def test_one_field_not_matched_to_two_renames(self):
        # one float removed, two floats added:
        # "price" has two rename candidates, so the heuristic must not pick one
        report = SchemaDiff(
            sv("S", "1.0", {"price": float}),
            sv("S", "2.0", {"close_price": float, "last_price": float}),
        ).generate_report()
        # ambiguous, so no rename fires
        change_types = {c.change_type for c in report.changes}
        assert ChangeType.FIELD_RENAMED not in change_types
        # both new fields should show as FIELD_ADDED
        added = [c for c in report.changes
                 if c.change_type == ChangeType.FIELD_ADDED]
        assert len(added) == 2
The fourth test — test_one_field_not_matched_to_two_renames — is the most important one. One field removed, two added of the same type. The heuristic sees two candidates for the rename and stays silent. Both additions fall through as FIELD_ADDED. This is the edge case that would produce a confident wrong answer if the len(candidates) == 1 guard wasn’t there.
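For readers who want to see why the guard matters, here is a minimal sketch of a type-based rename heuristic. It is not the framework's diff engine: detect_changes and its tuple output are hypothetical, type changes are ignored, and the extra check that the removed field has no same-typed rival is an assumption of this sketch that goes beyond the len(candidates) == 1 guard described above.

```python
def detect_changes(old_fields, new_fields):
    """Hypothetical diff: return (change_type, old_name, new_name) tuples."""
    removed = {n: t for n, t in old_fields.items() if n not in new_fields}
    added = {n: t for n, t in new_fields.items() if n not in old_fields}

    renames = []
    for old_name, old_type in removed.items():
        candidates = [n for n, t in added.items() if t is old_type]
        rivals = [n for n, t in removed.items() if t is old_type]
        # Only a one-to-one match is unambiguous; anything else stays silent.
        if len(candidates) == 1 and len(rivals) == 1:
            renames.append((old_name, candidates[0]))

    renamed_old = {old for old, _ in renames}
    renamed_new = {new for _, new in renames}
    changes = [("FIELD_RENAMED", old, new) for old, new in renames]
    changes += [("FIELD_REMOVED", n, None) for n in removed if n not in renamed_old]
    changes += [("FIELD_ADDED", None, n) for n in added if n not in renamed_new]
    return changes


clear_case = detect_changes(
    {"price": float, "symbol": str},
    {"close_price": float, "symbol": str},
)
ambiguous = detect_changes(
    {"bid": float, "ask": float},
    {"bid_price": float, "ask_price": float},
)
one_to_many = detect_changes(
    {"price": float},
    {"close_price": float, "last_price": float},
)

print(clear_case)
print([kind for kind, _, _ in ambiguous])
print([kind for kind, _, _ in one_to_many])
```

In the two ambiguous inputs every field falls through as a plain removal or addition. Relaxing the condition to "take the first candidate" would turn both into confident, wrong renames.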
Running the full suite
# from the repo root
python -m pytest -v
# with coverage report
python -m pytest --cov=data_contracts --cov-report=term-missing
Abridged output (the full run reports all 26 tests):
PASSED tests/test_contracts.py::TestContractBase::test_schema_auto_registers_on_definition
PASSED tests/test_contracts.py::TestContractBase::test_fields_extracted_correctly
PASSED tests/test_contracts.py::TestContractBase::test_private_fields_excluded_from_schema
PASSED tests/test_contracts.py::TestContractBase::test_contractbase_itself_not_registered
PASSED tests/test_validation.py::TestValidData::test_exact_match_passes
PASSED tests/test_validation.py::TestValidData::test_int_fills_float_field
PASSED tests/test_validation.py::TestWrongTypes::test_float_for_int_fails
PASSED tests/test_validation.py::TestExtraFields::test_extra_field_is_warning_not_error
PASSED tests/test_diff.py::test_change_type_classification[FIELD_ADDED]
PASSED tests/test_diff.py::test_change_type_classification[FIELD_REMOVED]
PASSED tests/test_diff.py::test_change_type_classification[TYPE_CHANGED]
PASSED tests/test_diff.py::TestRenameHeuristic::test_rename_detected_when_type_matches
PASSED tests/test_diff.py::TestRenameHeuristic::test_ambiguous_types_produce_no_rename
PASSED tests/test_diff.py::TestRenameHeuristic::test_one_field_not_matched_to_two_renames
26 passed in 0.24s
What good test coverage tells a senior engineer
Test coverage percentage is the least interesting thing about a test suite. What senior engineers actually look for when they read your tests is different:
- Are edge cases tested? Not just the happy path. The ambiguous rename test and the float → int coercion test are more valuable than three more happy-path tests.
- Does each test have exactly one reason to fail? A test that checks five things at once hides which one broke. Each test here asserts one specific behaviour.
- Are tests independent? The autouse fixture guarantees this. Any test can run in any order and will always see a clean starting state.
- Do the test names read like documentation? test_ambiguous_types_produce_no_rename tells you what the system does without reading the body. Tests are executable documentation.
Try it yourself
git clone https://github.com/devminda/data-contracts
cd data-contracts
pip install -e ".[dev]"
python -m pytest -v
All 26 tests, full coverage report, and the parametrized diff tests showing as individual entries in the output. Clone the repo, run the suite, then try breaking a test intentionally — remove the len(candidates) == 1 guard and watch test_ambiguous_types_produce_no_rename fail. That’s the fastest way to understand why it’s there.
That wraps up the series. We went from a silent API field rename breaking 50 pipelines, to a fully tested framework that detects it automatically, classifies the change, and notifies every affected consumer. If you’ve followed all four posts and built something on top of it — or adapted it for your own data platform — I’d love to hear about it in the comments.