By William Storey

Google BigQuery is a data warehouse and querying engine that can be used to access and analyse huge quantities of data regarding user attributes and behavior on site. This blog explains some of the coolest and most useful things that you can do with this tool at your disposal.


What is BigQuery?

As we know, Google Analytics is a very useful tool for reporting on the performance of our websites, with many visually compelling reports available to us. However, the raw data that goes into making these reports holds a wealth of information that we can use to really enhance our analysis.

Roll on BigQuery! This Google service (available only to GA360 clients) allows us to tap into the data sources that are used to create GA itself, so we can create our own tables and reports without the usual constraints of sampling and pre-aggregation. This blog will detail three innovative ideas for how you can use BigQuery to further enhance your understanding of how your website and customers work.

1. Mapping full user journeys

Google Analytics has several reports related to mapping the user journey through your site (e.g. the User Flow, Behaviour Flow and Goal Flow reports). The problem with these is that, irritatingly, they are often built on a sample of sessions, meaning that we can’t necessarily trust the validity of what we’re seeing.

Behaviour flow

However, with GA360 alongside a BigQuery connector, you can pull all the data required at a completely unsampled level. This allows us to map out full user journeys, including:

  • How the user arrived at your site (channel/device used, city their session originated from)
  • The order of what the users did on-site (pages visited, events/goals they completed)
  • The parts of a site that a user is likely to visit next, including the drop-out rates
  • The typical length of time between each interaction on-site

This information is clearly very useful as it allows us to see which areas of our site are putting users off, which elements are the most popular, whether any of our pages are redundant and what it is that makes users make the decisions they do.
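
To make this a bit more concrete, here is a minimal Python sketch of the kind of processing involved once the hit-level data has been pulled out of BigQuery. Everything in it is an assumption for illustration: the rows, the field layout (session ID, seconds since the session started, page path) and the function names are made up, and the real export schema is richer than this.

```python
from collections import Counter, defaultdict

# Hypothetical hit-level rows: (session_id, seconds_since_session_start, page_path).
# The values are invented purely for illustration.
hits = [
    ("s1", 0, "/home"), ("s1", 30, "/blog"), ("s1", 90, "/product"),
    ("s2", 0, "/home"), ("s2", 20, "/product"),
    ("s3", 0, "/home"), ("s3", 45, "/blog"),
]

def build_journeys(rows):
    """Group hits by session and order each session's hits by time."""
    sessions = defaultdict(list)
    for session_id, seconds, page in rows:
        sessions[session_id].append((seconds, page))
    return {sid: sorted(steps) for sid, steps in sessions.items()}

def transition_stats(journeys):
    """Count page-to-page moves (plus exits) and collect the time gaps."""
    moves = Counter()
    gaps = defaultdict(list)
    for steps in journeys.values():
        for (t0, page), (t1, next_page) in zip(steps, steps[1:]):
            moves[(page, next_page)] += 1
            gaps[(page, next_page)].append(t1 - t0)
        # The last page of a session counts as a drop-out from that page.
        moves[(steps[-1][1], "(exit)")] += 1
    return moves, gaps

def next_step_rates(moves, page):
    """Share of views of `page` that were followed by each next step."""
    total = sum(n for (src, _), n in moves.items() if src == page)
    return {dst: n / total for (src, dst), n in moves.items() if src == page}

journeys = build_journeys(hits)
moves, gaps = transition_stats(journeys)
home_rates = next_step_rates(moves, "/home")
blog_rates = next_step_rates(moves, "/blog")
```

Here `home_rates` gives the share of home page views that led to each next page, the `"(exit)"` entry in `blog_rates` is the drop-out rate from the blog, and `gaps` holds the seconds between each pair of interactions.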

If you want to find out how to customise these reports and visualise them in an easy-to-read way, check out our very own Joey Heighway’s blog here!

2. Enhanced customer modelling

If you cast your minds back, you may remember that I have previously written a blog on the advantages of using statistics to model user behaviour. For those of you who missed this, check it out here!

The blog explains how we can use a regression model with aggregated daily data from GA to determine what factors make a user likely to carry out a certain event. A classic example would be to see whether the number of desktop users we got to our site today affects the number of conversions we see today.
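
As a rough illustration of that daily-level approach, the sketch below fits a straight line by ordinary least squares to a handful of made-up daily figures. The numbers and the `fit_line` helper are assumptions for the example only, and a real model would normally include more than one explanatory variable.

```python
# Hypothetical daily aggregates: (desktop_users, conversions). Invented values.
daily = [(100, 11), (150, 18), (200, 21), (250, 28), (300, 32)]

def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(daily)
# `slope` estimates the extra conversions associated with one more desktop user.
```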

Regression visualisation

Using BigQuery, we don’t need to be constrained by this aggregated daily data and can plunge to new depths of precision and granularity by extracting data at the individual user/session level. This allows us to:

  • Increase the overall accuracy of our customer models owing to the significantly larger number of data-points being analysed
  • Determine how the pages a user visited or the actions they completed affect that user’s likelihood of converting (this is useful for deciding which A/B tests are worth carrying out on your site)
  • Identify demographics that are more likely to purchase particular products (for use in creating Audiences for site personalisation)
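
As a descriptive first pass (not a fitted model), session-level data lets us compare conversion rates between sessions that did and did not include a given page. The rows and helper names below are hypothetical and exist only to show the shape of the calculation.

```python
# Hypothetical session-level rows: (session_id, pages_viewed, converted).
sessions = [
    ("s1", {"/home", "/blog", "/product"}, True),
    ("s2", {"/home", "/product"}, False),
    ("s3", {"/home", "/blog"}, True),
    ("s4", {"/home"}, False),
    ("s5", {"/home", "/blog", "/product"}, False),
    ("s6", {"/home", "/product"}, False),
]

def conversion_rate(rows):
    """Share of sessions that converted, or None if there are no sessions."""
    if not rows:
        return None
    return sum(1 for _, _, converted in rows if converted) / len(rows)

def rate_by_page(rows, page):
    """Conversion rates for sessions that did and did not view `page`."""
    saw = [r for r in rows if page in r[1]]
    did_not = [r for r in rows if page not in r[1]]
    return conversion_rate(saw), conversion_rate(did_not)

blog_rate, no_blog_rate = rate_by_page(sessions, "/blog")
```

A large gap between the two rates is only a prompt for further investigation, since it says nothing about cause; it is the sort of signal that might suggest where an A/B test is worth running.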

Finally, we can use BigQuery to widen the range of data we analyse by linking GA data to the DoubleClick Data Transfer files, giving us a dataset that spans a larger portion of our media activity. For an even bigger picture, we could also link up to back-end CRM data, giving us a holistic dataset for a significantly enhanced, client-specific analysis.

3. Advanced segmentation

Creating segments in GA is great, but sometimes we are held back by limitations in the segment builder, meaning we can’t analyse the specific group of users we’re interested in. For example, what if we want to build a segment for everyone in London? This is easy, but suppose we also want to restrict this segment to users who’ve read the blog but have not yet purchased any items. This then becomes much more complicated, as demonstrated in the Venn diagram below:

Complex segment

The issue here is that we can’t use the simple include and exclude filters available in the segment builder, as we don’t want to exclude everyone who has purchased an item: we only want to exclude purchasers from within the group of London users who have read the blog.

We can solve this problem using (you guessed it) BigQuery! It is possible to create a list of users matching these criteria in BigQuery and set an identifier against them. This identifier can then be pushed back into GA as a custom dimension, which we can then use in our segments. Et voilà! We now have this complicated segment waiting in GA for us to analyse to our heart’s content.
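
The selection logic itself is simple once the data is at user level. Here is a minimal Python sketch under stated assumptions: the user rows, field names and segment label are all hypothetical, and the step of uploading the identifier to GA as a custom dimension is not shown.

```python
# Hypothetical per-user attributes, as they might look after querying BigQuery.
users = [
    {"user_id": "u1", "city": "London", "read_blog": True, "purchased": False},
    {"user_id": "u2", "city": "London", "read_blog": True, "purchased": True},
    {"user_id": "u3", "city": "London", "read_blog": False, "purchased": False},
    {"user_id": "u4", "city": "Leeds", "read_blog": True, "purchased": False},
]

def tag_segment(rows, label="london_blog_no_purchase"):
    """Map each user in London who read the blog but has not purchased to a label."""
    return {
        u["user_id"]: label
        for u in rows
        if u["city"] == "London" and u["read_blog"] and not u["purchased"]
    }

segment = tag_segment(users)
```

The resulting mapping of user ID to label is the identifier that would then be sent back to GA.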

So there we have it!

We've now covered three very handy things that you can do with BigQuery right now. Of course, we have only scratched the surface of the world of Big Data analysis in this blog, so keep your eyes peeled for further updates from me and the team on new and exciting techniques available to us with the help of BigQuery.

If you have found any of this blog’s content interesting and you would like some help with implementation, please feel free to get in touch today!
