
I've led data science teams long enough to know the dirty secret of the field: most models never make it to production. Not because the math is wrong, but because nobody designed the path from notebook to live system.
The notebook is a prototype, not a product
A notebook proves a model can work. Production asks whether it works on Tuesday at 3am when the upstream data is late and half the features are null. Those are different questions.
Treat data like code
That means versioned datasets, reproducible pipelines, and tests on the data itself. If a feature distribution shifts, I want an alert, not a quiet accuracy collapse three weeks later.
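    import pandas as pd

    def validate(features: pd.DataFrame) -> pd.DataFrame:
        # Ages must be plausible; missing ages are judged by the null-rate check below.
        assert features["age"].dropna().between(0, 120).all(), "age out of range"
        # No single column may be more than 5% null.
        assert features.isnull().mean().max() < 0.05, "null rate too high"
        return features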
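
Range and null checks catch broken rows, but not the distribution shift I mentioned above. Here is a rough sketch of one way to turn that into an alert, using the Population Stability Index (PSI) between a reference sample and a fresh one. The function names, the ten equal-width bins, and the 0.2 threshold are my assumptions for illustration, not a standard you should copy blindly; tune them per feature.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference sample

    def shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def drift_alert(reference: Sequence[float], current: Sequence[float],
                threshold: float = 0.2) -> bool:
    """True when the shift exceeds the (assumed) alerting threshold."""
    return psi(reference, current) > threshold
```

In practice I would run something like this per feature on every batch, with the reference sample frozen at training time, and route a True result to whoever is on call.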
Serve simple, monitor everything
A logistic regression you can monitor beats a deep net you can't explain. Start simple, measure real-world impact, and only add complexity when the data demands it.
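
To make "monitorable" concrete, here is a minimal sketch of a logistic regression scorer that watches its own output. Everything in it is hypothetical: the class name `MonitoredScorer`, the hand-set weights, the window size, and the tolerance are placeholders I picked for illustration, and a real system would likely ship these numbers to a metrics service rather than keep them in memory.

```python
import math
from collections import deque


class MonitoredScorer:
    """Logistic regression scorer that tracks its recent predictions."""

    def __init__(self, weights, bias, baseline_rate, window=1000, tolerance=0.1):
        self.weights = weights              # feature name -> coefficient
        self.bias = bias
        self.baseline_rate = baseline_rate  # mean score observed during validation
        self.tolerance = tolerance          # allowed drift of the rolling mean
        self.recent = deque(maxlen=window)  # rolling window of recent scores

    def predict(self, row):
        z = self.bias + sum(w * row[name] for name, w in self.weights.items())
        # Numerically stable sigmoid: never exponentiates a large positive number.
        if z >= 0:
            score = 1.0 / (1.0 + math.exp(-z))
        else:
            e = math.exp(z)
            score = e / (1.0 + e)
        self.recent.append(score)
        return score

    def explain(self, row):
        # Each feature's additive contribution to the log-odds.
        return {name: w * row[name] for name, w in self.weights.items()}

    def healthy(self):
        # Healthy while the rolling mean score stays near the validation baseline.
        if not self.recent:
            return True
        mean = sum(self.recent) / len(self.recent)
        return abs(mean - self.baseline_rate) <= self.tolerance
```

The point of the design is that both questions, "why did it say that?" and "is it still behaving like it did in validation?", have a one-line answer. That is much harder to get from a deep net.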
The model is maybe 20% of the work. The other 80% is the boring plumbing that keeps it honest. That plumbing is the actual product.


