Harnessing Big Data for SEM: How to Steal a March on the Competition
08 September 2016 // David Fothergill
The term 'big data' has been in the mainstream consciousness for a few years now, and its profile is raised all the time by successes such as the impact of IBM Watson since its 'Jeopardy!' win, and Google's AlphaGo.
In our industry, we can look to the implementation of RankBrain as evidence of the impact of innovations enabled by big data.
Before that though, let's take a step back and define what we are actually talking about when using the term 'big data', which at times can be a little bit fuzzy.
A general definition would likely refer to 'the 3 Vs' - i.e. data which has one or more of the following characteristics:
Volume: data sets too large to store and analyse with traditional tooling
Velocity: data generated and collected at high speed
Variety: a mix of structured, semi-structured and unstructured formats
These concepts are all pretty well illustrated by CERN's Large Hadron Collider, which gathers petabytes upon petabytes of (quite literally) high-velocity data - but what's the relevance for the SEM industry?
Well, it's quite likely that you technically don't have big data in relative terms, but you almost certainly have access to large data sets with untapped value in terms of improving revenue and ROI potential.
To underline the importance of staying competitive, a Gartner report shows that for 47% of businesses, a key objective of investment in data projects is to achieve 'more targeted marketing'.
It could be that your current analysis environments fall apart when several million rows are thrown at them. That's not big data, but it is still an interesting, non-trivial data problem with material benefit if you solve it.
To do so, you can lean on practices and technology from the big data sphere - whether that concerns storage, processing, building data pipelines, or generally getting 'on tap' access to important metrics.
Getting SEM Value From Data
Some practical, industry-specific examples of getting value might include:
Better Attribution Modelling: this could mean capturing granular clickstream data for actionable visitor-level attribution, or using probabilistic methods to track users across devices (something not achievable by standard cookie tracking alone). This can lead to better decisions for both efficiency and growth.
Integrating hit-level data with CRM for highly-segmented and actionable Cohort and Lifetime Value (LTV) insight.
Crunching social, url/link and content analysis data together to create a content marketing campaign that has a high likelihood of success.
Utilising Machine Learning for the benefits of predictive analysis, personalisation and improved efficiency at scale.
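To make the attribution point above concrete, here is a minimal sketch of a position-based (U-shaped) attribution model in plain Python. The channel names, revenue figures and 40/20/40 weighting are illustrative assumptions, not taken from the article - a real visitor-level model would run over clickstream data at far greater scale.

```python
from collections import defaultdict

def position_based_attribution(paths, first_weight=0.4, last_weight=0.4):
    """Credit channels in each conversion path using a U-shaped model:
    40% to the first touch, 40% to the last, and the remainder split
    evenly across the middle touches. Weights are illustrative defaults."""
    credit = defaultdict(float)
    for path, revenue in paths:
        if len(path) == 1:
            credit[path[0]] += revenue  # single-touch path gets full credit
            continue
        credit[path[0]] += revenue * first_weight
        credit[path[-1]] += revenue * last_weight
        leftover = revenue * (1 - first_weight - last_weight)
        middle = path[1:-1]
        if middle:
            for channel in middle:
                credit[channel] += leftover / len(middle)
        else:
            # two-touch path: split the middle share between first and last
            credit[path[0]] += leftover / 2
            credit[path[-1]] += leftover / 2
    return dict(credit)

# Hypothetical conversion paths: (ordered channel touches, revenue)
paths = [
    (["ppc", "email", "organic"], 100.0),
    (["organic"], 50.0),
    (["ppc", "organic"], 80.0),
]
print(position_based_attribution(paths))
```

Because total credit always sums to total revenue, the output is directly comparable with last-click reports, making it easy to see which channels last-click under- or over-values.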
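Similarly, the cohort/LTV idea can be sketched in a few lines: assign each customer to the cohort of their first-order month, then compare average lifetime revenue per customer across cohorts. The data shape and field names here are assumptions for illustration - in practice this would join hit-level analytics data with CRM records.

```python
from collections import defaultdict

def cohort_ltv(orders):
    """orders: iterable of (customer_id, month 'YYYY-MM', revenue).
    Returns average lifetime revenue per customer, keyed by the
    cohort month in which each customer first purchased."""
    # Find each customer's first order month (sort by month, keep first seen)
    first_month = {}
    for cust, month, _ in sorted(orders, key=lambda o: o[1]):
        first_month.setdefault(cust, month)
    revenue = defaultdict(float)
    members = defaultdict(set)
    for cust, month, amount in orders:
        cohort = first_month[cust]
        revenue[cohort] += amount
        members[cohort].add(cust)
    return {c: revenue[c] / len(members[c]) for c in revenue}

# Hypothetical CRM extract
orders = [
    ("a", "2016-01", 30.0), ("a", "2016-03", 20.0),
    ("b", "2016-01", 10.0),
    ("c", "2016-02", 40.0),
]
print(cohort_ltv(orders))
```

Segmenting the same calculation by acquisition channel is what turns this from a reporting exercise into an actionable bid- and budget-setting input.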
How You Can Benefit From The Big Data Revolution
In short, innovations in technology and an increased interest in the field of 'data science' present opportunities for projects of all sizes. A couple of examples to illustrate this are:
Storage and computation
The barrier to entry for scalable cloud storage and computation power is so low these days that if you are considering a large-scale data project, budget is no longer a limiting factor (no need for a six-figure fee for a Solutions Architect to get you up and running).
Services such as AWS and Google Cloud Platform mean you can get all the storage you need, plus powerful clusters for processing the data, at low cost. Your initial investment is therefore not spread across both technology and those responsible for delivery, so you can potentially get up and running using your existing team or agencies.
Commoditization of big data technologies
Similarly to the above, some of the key technologies that underpin big data analysis have been commoditised and are available as affordable SaaS offerings.
For example, Hadoop and Apache Spark require quite a bit of domain-specific IT support to manage independently, but services such as Databricks and MapR are leading the way by providing low-cost access to self-service environments, with all the DevOps requirements handled on your behalf.
Similarly, machine learning and AI services such as IBM Watson allow you to gain access to predictive and descriptive models without requiring a data scientist to be involved.
As an example of combining the above points, one project we work on involves ingesting more than 10 million rows of clickstream data per day via an Amazon S3 bucket, then moving it through a pipeline where analysis is done in Apache Spark (serviced by Databricks). The cost? Less than £100/month.
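To illustrate the kind of roll-up such a pipeline performs, here is a minimal sketch in plain Python of aggregating raw clickstream rows into distinct-session counts per channel. The column names and sample data are assumptions for illustration; in the actual pipeline the equivalent transformation would run in Spark across millions of rows.

```python
import csv
import io
from collections import Counter

def sessions_per_channel(clickstream_csv):
    """Count distinct sessions per marketing channel from raw
    clickstream rows (session_id, channel, url). Columns are
    hypothetical - real exports will differ."""
    seen = set()
    counts = Counter()
    for row in csv.DictReader(io.StringIO(clickstream_csv)):
        key = (row["session_id"], row["channel"])
        if key not in seen:  # count each session once per channel
            seen.add(key)
            counts[row["channel"]] += 1
    return counts

# Hypothetical raw export: multiple hits per session
sample = """session_id,channel,url
s1,ppc,/home
s1,ppc,/product
s2,organic,/home
s3,ppc,/home
"""
print(sessions_per_channel(sample))
```

The point is that the logic itself is simple; what the big data stack adds is the ability to run it continuously over data volumes a single machine could not hold.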
That's not to say that cost is the be-all and end-all, but this low barrier to entry brings numerous benefits. In the past, many data projects have stalled in the early stages due to uncertainty and indecision, so being able to take a more agile, incremental approach hugely increases the chance of getting value and vastly reduces the risk.
Your Next Steps?
With the above in mind, consider the types of data-led initiatives that have perhaps previously seemed out of reach for reasons of resource (be it monetary or personnel). If you are to stay ahead of the pack in your field, what is it that you need to give serious thought to making happen in the next 12 months? How can you harness the opportunities the big data ecosystem's maturity provides you with?
Is your data-driven priority:
Understanding your customers better via improved profiling and segmentation? (for better personalisation, content marketing, targeted programmatic buys etc)
Making more intelligent budget and prioritisation decisions by getting a handle on attribution, drilling right down to the individual customer level?
Enhancing your BI by providing quicker access to more insightful reporting?
Embracing some core 'data science' concepts such as statistical analysis or machine learning? (for more effective, efficient processes)