Monday, May 9, 2011

Google Analytics Fast-Access Mode = FAIL-Access Mode

GA is a great service.  Google gives away a service that other companies charge thousands of dollars for. Besides that, they have a wonderful API that you can use to write useful data applications.  My main complaint over the years has been the use of sampling.  In general, it may be good enough when looking at high-level visitor data, but my experience is that sampling dramatically distorts calculations related to goals and e-commerce data, as well as data for traffic sources that send relatively few visitors.
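To put a rough number on why small traffic sources suffer most, here is a minimal sketch. It assumes simple random sampling of visits, which is my simplification and not GA's documented method, and the function name and the 10,000-visit sample size are made up for illustration. Under that assumption, the relative error of a scaled-up visit count grows quickly as a source's share of traffic shrinks.

```python
import math

def relative_standard_error(share, sample_size):
    """Relative standard error of a visit count estimated from a simple
    random sample, for a source that makes up `share` of all visits.

    Assumes simple random sampling (an assumption, not GA's actual method)
    and ignores the finite-population correction.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return math.sqrt((1 - share) / (sample_size * share))

# Hypothetical sample of 10,000 visits; compare large and small sources.
for share in (0.5, 0.05, 0.005):
    rse = relative_standard_error(share, 10_000)
    print(f"share of traffic={share:.1%}  relative error ~ {rse:.1%}")
```

The same sample that pins down a source with half the traffic to about one percent leaves a source with half a percent of the traffic far less certain, which matches what I see with goals and small referrers.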

Recently, GA has changed how it uses sampling. It appears that they have started using it in more cases.  Reports that previously would never have triggered sampling at this volume of visits are now using "fail fast-access" mode.  That is Google's euphemism for "this report was based on sampled data and may be totally wrong".  On top of that, it looks like there may be a bug in the sampling method.

Take a look at this report using normal data for a site I monitor:
Normal graph (you can see Easter's impact) of Google (paid and organic) traffic coming to the site
Everything above looks fine, showing a fairly typical traffic pattern, with the exception of Easter week and an extra holiday in Great Britain (it must have been special hat day in the UK).

Where did the traffic go?

Now look at the next two images which are the same report as above but with advanced segments applied:

Non-Paid Search Traffic: This should show the Google organic traffic the site received.
Non Paid Search Traffic....I didn't think that I only bought PPC???
Paid Search Traffic: This should show the Google PPC traffic the site received.
Data Dave is sad and confused.  If Google the search engine is sending me this traffic, what is it if not PPC or Non-Paid?

What happened here?

OK, wait.  What if I look at the same report showing all three segments at the same time? Perhaps it will start to make sense (or not):

Nope.  If anything, the results look even weirder and more distorted than when looking at the segments individually.  There sure seems to be a big-time bug in the sampling trigger and calculation system of GA.

In conclusion:

  • With no segments applied, the site data looks normal....however:
  • With segments applied, 1 month of relatively little data triggered sampling ("fast-access"). This never used to happen for relatively little traffic in GA.
  • In both sampled reports, GA shows Google sending the site 0 visits during the month of April.
  • Where GA showed visits, it was very incorrect (multiple standard deviations away from the actual observed data)
    • With sampling applied, in the first week of May, GA reports that Google PPC sent way more traffic than all of Google combined.
    • Non-Paid alone surpassed All Visits traffic for a couple of days as well.
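The last two points describe a check anyone can run on exported report data: a segment should never report more visits than the unsegmented total for the same day. A minimal sketch follows, with a hypothetical function name and made-up numbers (these are not the figures from my reports):

```python
def find_inconsistent_days(all_visits, segments):
    """Return (day, segment_name, segment_visits, total_visits) for every
    day on which a segment reports more visits than the unsegmented total."""
    problems = []
    for day, total in sorted(all_visits.items()):
        for name, series in segments.items():
            visits = series.get(day, 0)
            if visits > total:
                problems.append((day, name, visits, total))
    return problems

# Made-up daily visit counts, for illustration only.
all_visits = {"2011-05-01": 120, "2011-05-02": 130, "2011-05-03": 125}
segments = {
    "paid": {"2011-05-01": 40, "2011-05-02": 180, "2011-05-03": 35},
    "non_paid": {"2011-05-01": 75, "2011-05-02": 90, "2011-05-03": 140},
}

for problem in find_inconsistent_days(all_visits, segments):
    print(problem)
```

Any row this prints is a day where the segmented report cannot be right, no matter how the sampling was done.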
Data integrity is the foundation of any analytics / decision support tool.  We cannot improve our websites if we cannot rely on and believe the data on which we base our decisions.  Luckily, I am not alone in seeing this latest turn for the worse with GA and sampling.

Hopefully, they will fix it soon.
