Data Leakage: How Data Collection Impacts the Decisions We Make and Vice Versa

I wrote this post on Cardinal Path’s blog. There is a lot to consider when building a model, and data leakage is one of the easiest things to overlook. Data leakage occurs when the data you are using to train a machine learning algorithm happens to include unexpected information related to what you are trying to predict, allowing the model or algorithm to make unrealistically […]

I really want to use data science! Where should I start?

Economist Robin Hanson’s most viral tweet says: “Good CS expert says: Most firms that think they want advanced AI/ML really just need linear regression on cleaned-up data.” I’d take this even further and say that most firms that think they want linear regression really just need good data visualization on trustworthy data. And those that […]

Attribution and Goodhart’s Law

Check out this post I wrote on Cardinal Path’s blog! This is one of my favourite blog posts. Goodhart’s Law seems to pop up everywhere, which is why you need to think about how you measure your goals and targets and make sure they don’t skew incentives in an undesirable direction. From the post: “When a measure […]