15 Ways Your Google Analytics Might Be Broken, Part 2: Checking the Interface

Paul Koch, Former Data & Analytics Director

Article Categories: #Strategy, #Data & Analytics

In Part 1 of this post, we showed ways that you can check your technical setup to make sure it’s not giving you unexpected data.  This second part shows ways that you can check your interface -- without ever needing to look at the code -- to make sure all your data is reporting as expected.  Let’s get to it!

7)  Different parts of your site track into different web properties.  

Your overview screen may list several web property ID numbers (UA-XXXXX-1, UA-XXXXX-2, etc.).

If different parts of your site track to different web property IDs, your data may be unnecessarily fragmented.  In the past, it was a good practice to use separate ID numbers to isolate certain types of data (such as e-commerce), but this practice is no longer necessary.  If a user’s visit spans pages with multiple web property IDs, then each part of that visit is fragmented into a specific profile -- incompletely capturing the whole picture and throwing off the bounce rate, along with other metrics.  We almost always encourage our clients to use a single web property ID across their entire site, then apply “view filters” to isolate specific subsets of data.
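If you do want to peek at the code side of this, here is a minimal sketch for spotting fragmentation. It assumes you have already saved the HTML source of a few representative pages; the function names and the sample pages are hypothetical, and the pattern only matches classic UA-style IDs.

```javascript
// Hypothetical helper: list the distinct web property IDs found in a page's HTML source.
function findPropertyIds(html) {
  const matches = html.match(/UA-\d+-\d+/g) || [];
  return [...new Set(matches)].sort();
}

// Hypothetical input: an object mapping page paths to their HTML source.
function propertyIdsByPage(pages) {
  const report = {};
  for (const [path, html] of Object.entries(pages)) {
    report[path] = findPropertyIds(html);
  }
  return report;
}

// Made-up sample pages, for illustration only.
const samplePages = {
  '/': "<script>_gaq.push(['_setAccount', 'UA-12345-1']);</script>",
  '/store': "<script>_gaq.push(['_setAccount', 'UA-12345-2']);</script>",
};

const report = propertyIdsByPage(samplePages);
const allIds = new Set(Object.values(report).flat());

console.log(report);
console.log(allIds.size > 1 ? 'More than one property ID in use' : 'Single property ID');
```

If more than one ID turns up, that is a prompt to ask why, not proof of a problem; some organizations still split properties on purpose.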

8)  Unexpected domains send data to GA.

In your reports, go to Audience > Technology > Network, then click on “Hostname” as the primary dimension directly above the table.  Do domains appear that you weren’t expecting?  I’m surprised how often a separate site has been counting toward an organization’s visits and pageviews without their knowing.  

It’s normal to see translate.googleusercontent.com (Google Translate), webcache.googleusercontent.com (Google’s cached versions of a page), and web.archive.org (The Wayback Machine).  Every once in a while, Viget finds another company showing up in our own data that has “borrowed” our site template but neglected to remove our GA tracking code.
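If your hostname report is long, a small script can do the first pass for you. This is a rough sketch that assumes you have exported the report into an array of rows; the row shape, the function name, and the sample data are hypothetical.

```javascript
// Hostnames the article treats as normal to see in the report.
const NORMAL_HOSTS = [
  'translate.googleusercontent.com',
  'webcache.googleusercontent.com',
  'web.archive.org',
];

// Hypothetical helper: keep only rows whose hostname is neither your own
// domain (or one of its subdomains) nor one of the normal hosts above.
function unexpectedHostnames(rows, ownDomain) {
  return rows.filter(({ hostname }) => {
    const host = hostname.toLowerCase();
    const isOwn = host === ownDomain || host.endsWith('.' + ownDomain);
    return !isOwn && !NORMAL_HOSTS.includes(host);
  });
}

// Made-up export rows, for illustration only.
const hostnameRows = [
  { hostname: 'www.yoursite.com', visits: 9000 },
  { hostname: 'web.archive.org', visits: 4 },
  { hostname: 'borrowed-template.example', visits: 310 },
];

console.log(unexpectedHostnames(hostnameRows, 'yoursite.com'));
```

Anything the function returns deserves a closer look in the interface.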

9)  Too few domains send data to GA.  

Go to Google and search: site:domain.com, then add negative matches for all known subdomains appearing in your hostname report (-sub.domain.com -sub2.domain.com, etc.).  Often, subdomains will appear in the search results that have never had tracking on them.  Some analytics services offer web crawlers to check for the presence of code on a page, but we’ve found Google to be a much more effective crawler!
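With more than a handful of subdomains, typing that query by hand gets tedious. Here is a small sketch that builds it from a list, using the same query format described above; the function names are hypothetical.

```javascript
// Hypothetical helper: build the search query from a domain and the
// subdomains already present in your hostname report.
function buildSiteQuery(domain, knownSubdomains) {
  const exclusions = knownSubdomains.map((host) => '-' + host);
  return ['site:' + domain, ...exclusions].join(' ');
}

// Hypothetical helper: turn the query into a Google search URL.
function buildSearchUrl(query) {
  return 'https://www.google.com/search?q=' + encodeURIComponent(query);
}

const query = buildSiteQuery('domain.com', ['sub.domain.com', 'sub2.domain.com']);
console.log(query); // site:domain.com -sub.domain.com -sub2.domain.com
console.log(buildSearchUrl(query));
```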

10)  Referrals from your own site appear in your traffic sources report.

If your domain appears as a referring site to itself, something is likely wrong. It could be the result of untagged pages (#9), iFrame issues, or inconsistent tracking settings between certain pages.  As a first step, go to your Acquisition > All Referrals report and click on your domain.  Then, add a secondary dimension of "landing page." Check whether movement between these two pages -- from referral path to landing page -- might have a setup that could break visits.  
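If you export that referral report, the same check can be sketched in a few lines. The row fields (`source`, `referralPath`, `landingPage`, `visits`) and the function name are assumptions about how you structure the export, not a GA format.

```javascript
// Hypothetical helper: list self-referral hops, busiest first, so you can
// inspect what happens between the referral path and the landing page.
function selfReferralPairs(rows, ownDomain) {
  return rows
    .filter(({ source }) => source === ownDomain || source.endsWith('.' + ownDomain))
    .sort((a, b) => b.visits - a.visits)
    .map((row) => `${row.source}${row.referralPath} -> ${row.landingPage} (${row.visits} visits)`);
}

// Made-up export rows, for illustration only.
const referralRows = [
  { source: 'shop.yoursite.com', referralPath: '/login', landingPage: '/account', visits: 45 },
  { source: 'news.example.org', referralPath: '/story', landingPage: '/', visits: 300 },
  { source: 'yoursite.com', referralPath: '/cart', landingPage: '/checkout', visits: 120 },
];

console.log(selfReferralPairs(referralRows, 'yoursite.com'));
// [ 'yoursite.com/cart -> /checkout (120 visits)',
//   'shop.yoursite.com/login -> /account (45 visits)' ]
```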

11)  Web performance monitors or other bots inflate traffic.

One of the quickest ways to check if this is happening is the Cities report.  Go to Audience > Geo > Location, then change the Primary Dimension at the top of the table to “City.” You may see a not-that-big city in your top 10 list, or you might need to dig a little more.  Choose the “comparison to average” visualization (the second from the right), select “bounce rate” as the metric of comparison, and expand your table to more than 10 rows.

Performance monitors often check just a single page and, therefore, show a bounce rate of close to 100%.  If you see any odd city-related data, dig into it more and consider adding an “exclude filter.”  
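The same eyeball test can be roughed out against an export of the Cities report. In this sketch the row shape, the function name, and both thresholds (at least 50 visits, bounce rate of 95% or higher) are assumptions to tune for your own traffic.

```javascript
// Hypothetical helper: return cities with enough traffic to matter and a
// bounce rate close to 100%, the pattern a single-page monitor tends to leave.
function suspiciousCities(rows, { minVisits = 50, minBounceRate = 0.95 } = {}) {
  return rows
    .filter((row) => row.visits >= minVisits && row.bounceRate >= minBounceRate)
    .map((row) => row.city);
}

// Made-up export rows, for illustration only (bounceRate is a 0-1 fraction).
const cityRows = [
  { city: 'Washington', visits: 4200, bounceRate: 0.48 },
  { city: 'Ashburn', visits: 1900, bounceRate: 0.99 },
  { city: 'Boulder', visits: 12, bounceRate: 1.0 },
];

console.log(suspiciousCities(cityRows)); // [ 'Ashburn' ]
```

A flagged city is only a lead. Check the service provider and landing pages for that city before adding an exclude filter.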

12)  Filters aren’t being used or are being improperly applied.  

In the Cities report, do you see the city where your office is (or one very nearby) as a top traffic-driver?  If so, you may not be properly excluding your own organization’s IP addresses from the data.  If certain employees set your website as their homepage, they can especially inflate the data by sending a visit to GA every time they go online.  You can remedy this by applying a filter through Admin > All Filters > +New Filter and then selecting “traffic from the IP addresses” as the dimension to exclude.
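If you have several office IP addresses, one option is a single custom exclude filter with a regular expression as its pattern. Here is a sketch for building that pattern, assuming your filter's pattern field accepts a regular expression; the function name is hypothetical and the addresses are documentation-range placeholders.

```javascript
// Hypothetical helper: build one anchored pattern matching any listed IPv4 address.
// Dots are escaped so they match a literal "." rather than any character.
function ipFilterPattern(ips) {
  const escaped = ips.map((ip) => ip.replace(/\./g, '\\.'));
  return '^(' + escaped.join('|') + ')$';
}

// Placeholder addresses, for illustration only.
const pattern = ipFilterPattern(['203.0.113.7', '198.51.100.24']);
console.log(pattern); // ^(203\.0\.113\.7|198\.51\.100\.24)$

// Sanity check before pasting the pattern into the filter:
const regex = new RegExp(pattern);
console.log(regex.test('203.0.113.7'));  // true
console.log(regex.test('203.0.113.70')); // false
```

The anchors matter: without `^` and `$`, the pattern for 203.0.113.7 would also match 203.0.113.70.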

13)  Pages have exceptionally high exit rates.  

This data can be a signal of broken subdomain or cross-domain tracking, or of tracking that goes into a different web property ID.  Go to your Behavior > Site Content > All Pages report and follow the same “comparison to average” steps as in #11, this time comparing the Exit Rate to the site average.  When we’re doing an analytics audit, we often see high exit rates on login pages, which signifies that the logged-in state isn’t appropriately tracked.

If you’re unsure whether tracking breaks as you move from one page to another, here’s an easy way to check.  Visit your site with utm parameters identifying yourself, such as yoursite.com?utm_source=you&utm_medium=you.  Then go to the Real Time > Traffic Sources report and click on the utm_source or utm_medium parameter you defined.

This will apply a filter to other Real Time reports you view, such as the Content report.  A blue box at the top of the page will indicate that the filter is still applied.

Go to the Real Time Content report.  Then, navigate around your site.  You should see the “Active Page” update each time you load a new page.  If, at any point, the active page fails to update, it likely means that GA is broken between that page and the previous one.  As a result, the traffic source doesn’t persist, and the view of that page is not tied to the “you” source anymore.  
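If you run this test across many starting pages, a tiny helper can add the identifying utm parameters for you. This sketch uses the standard `URL` API; the function name and the example URLs are hypothetical.

```javascript
// Hypothetical helper: add utm_source and utm_medium to any page URL,
// preserving whatever query string the URL already has.
function tagUrl(pageUrl, source, medium) {
  const url = new URL(pageUrl);
  url.searchParams.set('utm_source', source);
  url.searchParams.set('utm_medium', medium);
  return url.toString();
}

console.log(tagUrl('https://yoursite.com/', 'you', 'you'));
// https://yoursite.com/?utm_source=you&utm_medium=you

console.log(tagUrl('https://yoursite.com/page?x=1', 'you', 'you'));
// https://yoursite.com/page?x=1&utm_source=you&utm_medium=you
```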

14)  Funnels are broken.  

People make two common mistakes when setting up funnels.  First, they think that the funnel constraints affect the total goals reported (they don’t -- just the Funnel Visualization report).  

Second, they think that funnels will help them isolate different potential visitor paths.  As Luna Metrics has long documented, Google’s backfilling of data makes these funnels unreliable.  Funnels should only be used for fixed processes during which a user can take no other path, such as a checkout process.  If the funnel measures anything other than a fixed process, the data will be recorded, but inaccurately.

15)  If your site has e-commerce, data doesn’t align closely with your backend data.  

If using e-commerce, we recommend isolating a certain date range and QAing each backend transaction against your GA data.  Make sure each backend transaction ID has a corresponding line in GA with the appropriate quantity, revenue, tax, and shipping.  If certain transactions don’t appear, try to replicate them to see if a technical reason might prevent them from recording.  GA data will almost never match up 1:1 with backend data, but it should be close.  Two of the most common reasons why it isn’t close include:

A parameter in the e-commerce code includes an apostrophe in a populated value.  For example, here’s a method to track a purchased item on a confirmation page:

_gaq.push(['_addItem',
  '1234', // transaction ID - required
  'ABCD', // SKU/code - required
  'Hockey Jersey', // product name
  'Mens', // category or variation
  '89.99', // unit price - required
  '1' // quantity - required
]);

In the example above, everything tracks correctly.  The entire transaction, however, will break if the apostrophe is not removed from ‘Men’s’, because the unescaped apostrophe ends the string early and causes a JavaScript syntax error, as in this example:

_gaq.push(['_addItem',
  '1234', // transaction ID - required
  'ABCD', // SKU/code - required
  'Hockey Jersey', // product name
  'Men's', // category or variation
  '89.99', // unit price - required
  '1' // quantity - required
]);

When multiple quantities of the same item are ordered, the GA e-commerce code has been set up to use multiple _addItem methods instead of updating the quantity.  For example, the purchase of two of the same item should be tracked as:

_gaq.push(['_addItem',
  '1234', // transaction ID - required
  'ABCD', // SKU/code - required
  'Hockey Jersey', // product name
  'Mens', // category or variation
  '89.99', // unit price - required
  '2' // quantity - required
]);

rather than:

_gaq.push(['_addItem',
  '1234', // transaction ID - required
  'ABCD', // SKU/code - required
  'Hockey Jersey', // product name
  'Mens', // category or variation
  '89.99', // unit price - required
  '1' // quantity - required
]);
_gaq.push(['_addItem',
  '1234', // transaction ID - required
  'ABCD', // SKU/code - required
  'Hockey Jersey', // product name
  'Mens', // category or variation
  '89.99', // unit price - required
  '1' // quantity - required
]);

If two _addItem methods use an identical SKU, only the last one defined will be tracked, meaning that you’ll only record one of the items instead of both.
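Both pitfalls can be guarded against wherever your confirmation page's tracking snippet is generated. The sketch below is one possible approach, not the only one: it merges line items that share a SKU, then uses `JSON.stringify` to quote each value so an apostrophe can't end a string early. The item shape and function names are hypothetical.

```javascript
// Hypothetical helper: combine line items that share a SKU by summing quantities.
function mergeBySku(items) {
  const merged = new Map();
  for (const item of items) {
    const existing = merged.get(item.sku);
    if (existing) {
      existing.quantity += item.quantity;
    } else {
      merged.set(item.sku, { ...item }); // copy so the input isn't mutated
    }
  }
  return [...merged.values()];
}

// Hypothetical helper: emit one _addItem call with safely quoted values.
function addItemSnippet(transactionId, item) {
  const args = [
    '_addItem',
    transactionId,
    item.sku,
    item.name,
    item.category,
    item.price,
    String(item.quantity),
  ];
  return '_gaq.push(' + JSON.stringify(args) + ');';
}

// Made-up order: the same jersey appears as two separate line items.
const lineItems = [
  { sku: 'ABCD', name: 'Hockey Jersey', category: "Men's", price: '89.99', quantity: 1 },
  { sku: 'ABCD', name: 'Hockey Jersey', category: "Men's", price: '89.99', quantity: 1 },
];

for (const item of mergeBySku(lineItems)) {
  console.log(addItemSnippet('1234', item));
}
// _gaq.push(["_addItem","1234","ABCD","Hockey Jersey","Men's","89.99","2"]);
```

Double-quoted output keeps the apostrophe in “Men's” intact. If the snippet is written into an inline script tag, values containing HTML-significant characters would need extra handling that this sketch doesn't cover.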

If any of these measurement issues apply to you, hopefully this post has sparked you to figure out ways you can solve them, rather than just shrugging away data oddities.

To use two cliches in a single sentence: ignorance is bliss, but knowledge is power.  Would you rather be acting on incorrect data, or taking steps to correct it?

Have you found any other common pitfalls?  Need help with solutions?  Feel free to comment or get in touch with us.
