People are the biggest obstacle to becoming a data-driven organisation

Our data team of three is tasked with pushing our business to be more data-driven, from encouraging management to make strategic decisions with hard data to building real-time feedback loops within our products themselves. We have put the infrastructure in place to capture and analyse streams of high-granularity data using open source tools like Kafka, Hadoop, and Cascalog. Now the hard part is to convince the rest of the company that counted numbers on an Excel spreadsheet are insufficient, so that they can incorporate this new high-frequency, high-granularity information into their everyday decision-making.

This came to light for me when I tried to revamp our Attribution Modelling, which is a fancy marketing term for how referrals are credited and has nothing to do with mathematical modelling. Many companies traditionally use a last-referral-takes-the-cake approach. Say a customer first comes in via an email referral, leaves the site, then comes back via an AdWords ad and ultimately makes a purchase: the AdWords ad gets credited with all of the revenue. This last-referral method is a lazy way of doing things. Lead-generating referrals get no credit, and return-on-investment values are biased towards closing-stage marketing efforts. Granted, doing anything else requires a global view of the customer journey, which is a non-trivial matter.

However, that is exactly what we can now do easily. So we took a couple of days to spike out a fair-share attribution model in which every touch point a customer used gets a share of the pie. The intention was to provide a holistic view of referrer efficacy to feed into our higher-level models. But seeing how unrealistic the old system is, I thought I might as well open the new model up to everyone else.
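To make the difference concrete, here is a minimal Python sketch of the two approaches. It is an illustration only, not our production code: the function names are made up, and the equal split in `fair_share` is an assumption, since a fair-share model could just as well weight touch points by position or recency.

```python
from collections import defaultdict


def last_referral(journey, revenue):
    """Give all revenue to the final touch point before the purchase."""
    return {journey[-1]: revenue}


def fair_share(journey, revenue):
    """Split revenue equally across every touch point in the journey.

    Equal weighting is an assumption made for this sketch.
    """
    credit = defaultdict(float)
    share = revenue / len(journey)
    for channel in journey:
        credit[channel] += share
    return dict(credit)


# The email-then-AdWords journey described above, with a 100.0 purchase.
journey = ["email", "adwords"]
print(last_referral(journey, 100.0))  # {'adwords': 100.0}
print(fair_share(journey, 100.0))     # {'email': 50.0, 'adwords': 50.0}
```

Under last-referral the email campaign that found the customer looks worthless; under the equal split it gets half the credit.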

Opening a can of worms: that was what it felt like. I found that things like financial reports and people's bonuses are tied to the old attribution system. Even though it is obvious to everyone that the old system is unrealistic, changing it would require changing the work processes of multiple individuals across the company. You know what they say about people's habits? Habits are hard to change.

Seeing that this is a fundamental obstacle to our company's competitiveness, I've taken a break from pushing our data architecture to build and evangelise internal data products. I've been working closely with our business people one-on-one to identify ways of using data, staging data views from Hadoop/Cascalog back into good old MySQL, and helping them generate actionable data with SQL to make their lives easier.

One such view borrows a technique from my algorithmic trading. By applying a breakout strategy (which I wrote entirely in SQL, good fun) to the revenue and cost time-series data of our PPC and SEO campaigns, our PPC/SEO manager now has a customisable screener for abnormalities in any of our thousands of continuous marketing campaigns. It works very much like a stock screener for breakouts. This used to be a subjective process based on a spreadsheet of numbers and expert opinions. Now it's data-driven.
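For readers unfamiliar with breakouts, the idea can be sketched in a few lines. The real screener is written in SQL; what follows is a Python illustration of one common variant of the rule, where a point is flagged when it moves outside the high/low range of the preceding window. The function name, the window length, and the sample numbers are all made up for the example.

```python
def breakouts(series, window=20):
    """Return (index, direction) for each point that breaks out of the
    high/low range of the preceding `window` points."""
    signals = []
    for i in range(window, len(series)):
        prior = series[i - window:i]
        if series[i] > max(prior):
            signals.append((i, "up"))
        elif series[i] < min(prior):
            signals.append((i, "down"))
    return signals


# Hypothetical daily revenue for one campaign.
daily_revenue = [100, 102, 98, 101, 99, 130, 100, 99, 60]
print(breakouts(daily_revenue, window=5))  # [(5, 'up'), (8, 'down')]
```

Run over every campaign, a rule like this turns thousands of time series into a short list of the ones worth a human look, and the window is the knob that makes the screener customisable.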

There's a saying that a business is its people. To become a data-driven organisation, you need data-driven people. I never realised the significance of this until now.

My talk on bootstrapping data science in a company