I wrote this post on Cardinal Path’s blog. There is a lot to consider when building a model, including data leakage. Data leakage occurs when the data you use to train a machine learning algorithm includes unexpected information related to what you are trying to predict, allowing the model or algorithm to make unrealistically […]
Tag Archives: methodology
When all you have is a hammer, everything looks like a nail: choosing the right tool for the job
Check out this post I wrote on Cardinal Path’s blog about finding the right tool for the job: A few weeks ago, a coworker asked me for some help with a data cleaning task. I consider him to be one of the best Tableau users in our office, and someone I frequent […]
I really want to use data science! Where should I start?
Economist Robin Hanson’s most viral tweet says: “Good CS expert says: Most firms that think they want advanced AI/ML really just need linear regression on cleaned-up data.” I’d take this even further and say that most firms that think they want linear regression really just need good data visualization on trustworthy data. And those that […]
Attribution and Goodhart’s Law
Check out this post I wrote on Cardinal Path’s blog! This is one of my favourite blog posts. Goodhart’s Law seems to pop up everywhere! That’s why you need to think about how you measure your goals and targets and make sure they don’t skew incentives in an undesirable direction. From the post: “When a measure […]