A few weeks ago I was approached by WhyCommunicate, a company run by a friend of mine, who were hoping to use the newly released Google Analytics API to provide clients with analytics information in their existing extranet. I loved the ambition of it, and I hadn't worked with an API before, so I thought this was a great venture for me and went for it.

You can see the finished result over here in the portfolio, but what I'm going to detail in this post is a little about how it works, using examples from the work I completed. Since the existing system was built in PHP, I was a tad cheeky and immediately searched for a PHP class to take care of all the cURL functions and XML responses, and luckily I found this beauty by Electric Toolbox. Effectively, this tutorial covers the API using PHP and this class, but to be honest, if you've used an API before, Google's developer section for this API provides ample alternative documentation. The basic platform remains the same, regardless of the medium.

Logging In And Profile ID

First things first: in order to access the analytics data for the intended profile, you need to log in using an API request. This basically involves sending your Google Account login e-mail and password. Using the PHP class, this was a simple function called login() which took those parameters. Here is an adaptation of the code used in this function from the class:


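$ch = curl_init("https://www.google.com/accounts/ClientLogin");
curl_setopt($ch, CURLOPT_POST, true);
// Return the response body from curl_exec() instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$data = array(
    'accountType' => 'GOOGLE',
    'Email' => $email,
    'Passwd' => $password,
    'service' => 'analytics',
    'source' => ''
);

curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
$output = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

// On success, pull the authorization token out of the response body
$auth = '';
if ($info['http_code'] == 200) {
    preg_match('/Auth=(.*)/', $output, $matches);
    if (isset($matches[1])) {
        $auth = $matches[1];
    }
}

// Later requests open a new cURL handle and send the token as a header
// ($feed_url is a placeholder for whichever feed URL is being requested)
$ch = curl_init($feed_url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Authorization: GoogleLogin auth=$auth"));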

Once the login request is processed, we are given an authorization key to use when requesting data ($auth). This gets set as a cURL option on later requests, such as the requests for data, for session security purposes.
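The regex above only picks out the Auth line. As a rough sketch, assuming the response body is plain text with one key=value pair per line (SID, LSID and Auth), a small helper could split the whole body into an array instead. The function name parse_client_login() and the token values are made up for illustration and are not part of the class.

```php
<?php
// Hypothetical helper (not part of the Electric Toolbox class): splits a
// ClientLogin-style response body into a key => value array.
function parse_client_login($body) {
    $tokens = array();
    foreach (preg_split('/\r\n|\r|\n/', trim($body)) as $line) {
        // Split on the first "=" only, since token values may contain "="
        $parts = explode('=', $line, 2);
        if (count($parts) === 2) {
            $tokens[$parts[0]] = $parts[1];
        }
    }
    return $tokens;
}

// Made-up sample body, for illustration only
$sample = "SID=aaa111\nLSID=bbb222\nAuth=ccc333==\n";
$tokens = parse_client_login($sample);
$auth   = isset($tokens['Auth']) ? $tokens['Auth'] : '';
```

If the Auth line is missing, $auth simply stays empty, which mirrors the behaviour of the snippet above.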

Once you get a successful login, you can start requesting data. First you need to find your profile ID for the website you want the data for, which can be found as the id parameter in the URL when you click View Report (a more complete walkthrough can be found here). From here, you're ready to start requesting the data.

Dimension and Metrics

There are a few options that can be set when making a request:

  • Dimensions: Dimensions are the areas you wish to request data (i.e. metrics) from, such as city, country, browser, and traffic source. Here is a full list of Dimensions, and there really are some ridiculous ones.
  • Metrics: Metrics are the actual items of data you can request, such as visits, pageviews, bounce rate, time on the site. A full list is available here.

These are the two most vital options, which dictate exactly what data you are going to get, e.g. visits for the last 30 days per browser. There are a number of illegal combinations, but to be honest there are enough valid combinations to get around this. Other options include sort, start and end date, number of results, and offset, but the best way to explain them is to show an example from the dashboard I made.
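To make the options concrete, here is a minimal sketch of how they might be assembled into a request URL for the data feed used later in this post. The helper name build_feed_url() and its argument order are assumptions for illustration, not the class's actual interface, and the sort option is left out for brevity.

```php
<?php
// Hypothetical helper: assembles a data feed URL from the request options.
function build_feed_url($profile_id, array $metrics, array $dimensions, $start, $end, $max = 10, $offset = 1) {
    $params = array(
        'ids'         => 'ga:' . $profile_id,
        'dimensions'  => implode(',', $dimensions),
        'metrics'     => implode(',', $metrics),
        'start-date'  => $start,
        'end-date'    => $end,
        'max-results' => $max,
        'start-index' => $offset,
    );
    // http_build_query() percent-encodes ":" and ",", so the URL looks
    // different from a hand-written one but carries the same values
    return 'https://www.google.com/analytics/feeds/data?' . http_build_query($params);
}

// Visits per browser, using the placeholder profile ID from this post
$url = build_feed_url('012345', array('ga:visits'), array('ga:browser'), '2009-06-01', '2009-07-01');
```

Passing an empty array for the dimensions produces an empty dimensions= parameter.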

An Example: Statistics and Visits Graph for June

What was needed here was 5 key pieces of information for the month of June, i.e. June 1st 2009 to July 1st 2009, and then the number of visits for each date of these 30 days. The five different pieces of information I needed were:

  • Total Page Views
  • Total Visits
  • # of Unique Visitors
  • Page Views per Visit
  • Time spent on site per Visit

Let's take a look at the URL the class executed in order to get the data:

https://www.google.com/analytics/feeds/data?ids=ga:012345&dimensions=&metrics=ga:pageviews,ga:visitors,ga:visits,ga:timeOnSite&sort=-ga:pageviews,ga:visitors,ga:visits,ga:timeOnSite&start-date=2009-06-01&end-date=2009-07-01&max-results=10&start-index=1

As you can see there are plenty of parameters here, mostly default values set by the class, so I've highlighted the key ones to look through:

  • ids=ga:012345: I've changed the value here for obvious reasons, but this is the ID retrieved earlier from the View Report page. Technically, if you have a valid login, you can put whatever ID you want in here, but you will only get data back if the ID belongs to a profile on your account.
  • dimensions=: Since we want data for the entire site over the full 30 days, we don't need a dimension.
  • metrics=ga:pageviews,ga:visitors,ga:visits,ga:timeOnSite: I've requested four metrics here, separated by commas. You can see how these correlate with the 5 pieces of information. The first three are straight values, for example, ga:visits matches Total Visits. As for the last two, these are simple calculations, such as ga:pageviews / ga:visits for Page Views per Visit.
  • start-date=2009-06-01&end-date=2009-07-01: And here we designate the two dates we want the data for.
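The two calculated figures can be sketched as follows. The totals are made-up numbers standing in for what the feed returns, the helper per_visit() is hypothetical, and ga:timeOnSite is assumed to be a total in seconds.

```php
<?php
// Made-up totals standing in for the four metrics returned by the feed
$totals = array(
    'ga:pageviews'  => 1200,
    'ga:visitors'   => 300,
    'ga:visits'     => 400,
    'ga:timeOnSite' => 36000, // assumed to be seconds
);

// Hypothetical helper: divides a total by the visit count, guarding
// against division by zero when there were no visits at all
function per_visit($total, $visits) {
    return $visits > 0 ? $total / $visits : 0;
}

$pages_per_visit   = per_visit($totals['ga:pageviews'], $totals['ga:visits']);
$seconds_per_visit = per_visit($totals['ga:timeOnSite'], $totals['ga:visits']);

// Format the average visit length as minutes:seconds for display
$time_per_visit = gmdate('i:s', (int) round($seconds_per_visit));
```

With these sample numbers that works out to 3 page views per visit and an average visit of 01:30.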

That's the statistics done, but what about the graph? Well, it's visits plotted against date, so the metric is already settled: ga:visits. As for the date, since we want the metric broken down by a specific category, we have to use a dimension. Let's take a look at the URL executed:

https://www.google.com/analytics/feeds/data?ids=ga:379365&dimensions=ga:date&metrics=ga:visits&sort=-ga:visits&start-date=2009-06-01&end-date=2009-07-01&max-results=31&start-index=1

  • dimensions=ga:date: The dimension is set to ga:date, so this means that for each date between the start-date and end-date parameters, the feed will contain the metric chosen, in this case ga:visits. There is a caveat though, which I will discuss later.
  • max-results=31: This is also important. Since we want 30 days' worth of data, we need to set the max-results parameter to this or higher. I tend to set it to 31 so any month can be covered. This also matters because there are limits to how much data you can request within a certain time frame (the documentation covers this).
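Rather than hard-coding 31, the value could be worked out from the dates themselves. This is only a sketch, assuming both the start and end dates are included in the range; days_in_range() is a hypothetical helper.

```php
<?php
// Hypothetical helper: counts the dates from $start to $end, both included.
function days_in_range($start, $end) {
    $utc  = new DateTimeZone('UTC');
    $from = new DateTime($start, $utc);
    $to   = new DateTime($end, $utc);
    // diff() gives the number of whole days between the two dates,
    // so add one to count the end date as well
    return $from->diff($to)->days + 1;
}

$max_results = days_in_range('2009-06-01', '2009-07-01');
```

For the dates used in this example that comes to 31, since July 1st is counted along with the 30 days of June.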

These requests, using cURL functions e.g. curl_init and curl_exec, would return an XML response, which in turn got converted into a multidimensional PHP array. So for this graph, the array might be in the format:


$visits_1 = $results["20090601"]["ga:visits"];
// $visits_1 contains the number of visits for the 1st of June 2009

Note that the dates returned are always in the format yyyymmdd.
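To give an idea of the conversion step, here is a minimal sketch using PHP's DOM extension rather than the class's own parsing code. The XML is a simplified, made-up stand-in for the real feed, assuming each entry carries dxp:dimension and dxp:metric elements with name and value attributes; feed_to_array() is a hypothetical helper.

```php
<?php
// Simplified, made-up feed; a real response carries many more elements
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dxp="http://schemas.google.com/analytics/2009">
  <entry>
    <dxp:dimension name="ga:date" value="20090601"/>
    <dxp:metric name="ga:visits" value="42"/>
  </entry>
  <entry>
    <dxp:dimension name="ga:date" value="20090602"/>
    <dxp:metric name="ga:visits" value="17"/>
  </entry>
</feed>
XML;

// Hypothetical helper: turns feed entries into $results[dimension][metric]
function feed_to_array($xml) {
    $ns  = 'http://schemas.google.com/analytics/2009';
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $results = array();
    foreach ($doc->getElementsByTagName('entry') as $entry) {
        $dimension = $entry->getElementsByTagNameNS($ns, 'dimension')->item(0);
        if ($dimension === null) {
            continue; // no dimension to key this entry by
        }
        $key = $dimension->getAttribute('value');
        foreach ($entry->getElementsByTagNameNS($ns, 'metric') as $metric) {
            $results[$key][$metric->getAttribute('name')] = (int) $metric->getAttribute('value');
        }
    }
    return $results;
}

$results  = feed_to_array($xml);
$visits_1 = $results['20090601']['ga:visits'];
```

This only keys on the first dimension of each entry, which is enough for the ga:date graph but would need extending for requests with several dimensions.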

Once I had this array, it was simply a matter of setting up a loop to generate the string I needed to make a line graph using the graph framework.

Conclusions

As you can probably tell by the haphazardness of this post, there's a fair few tasks involved with this API, and this post has just covered the basics. If you're interested in using the API, I'd honestly say it's best just to try it yourself, perhaps using a class like I did. Google provides plenty of documentation, so this should be your first source of reference, but simply put, allow room for trial and error.

One thing I should mention which I came across (the caveat I mentioned earlier) is that if there is ever a 0 result for a metric with the ga:date dimension, the API will simply skip that date. The problem I had was that in a 30 day stretch there were three days with no visits, so I ended up getting 27 results, and since I was using a foreach loop, only 27 points. So I'd advise using a while or for loop with a conditional to force a date to exist with 0 visits.
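One way that workaround could look is sketched below. It walks every date in the range and falls back to 0 when the feed left a date out. The helper fill_missing_dates() and the sample data are made up for illustration, and the array layout follows the format shown earlier.

```php
<?php
// Hypothetical helper: returns one entry per date in the range, using 0
// for any date the feed left out.
function fill_missing_dates(array $results, $start, $end) {
    $utc    = new DateTimeZone('UTC');
    $day    = new DateTime($start, $utc);
    $last   = new DateTime($end, $utc);
    $filled = array();
    while ($day <= $last) {
        $key = $day->format('Ymd'); // same yyyymmdd format as the feed
        $filled[$key] = isset($results[$key]['ga:visits'])
            ? (int) $results[$key]['ga:visits']
            : 0;
        $day->modify('+1 day');
    }
    return $filled;
}

// Made-up data with the 2nd of June missing
$results = array(
    '20090601' => array('ga:visits' => 42),
    '20090603' => array('ga:visits' => 17),
);
$visits = fill_missing_dates($results, '2009-06-01', '2009-06-03');
```

Here $visits ends up with three entries (42, 0 and 17), so the graph gets a point for every date.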

Hope this article has provided clarification or knowledge in some way. Feel free to leave any questions and I'll be sure to answer them!

1 Response to Working with Google Analytics API

Craig Van Sant

October 11th, 2009 at 12:06 am

I had just about given up when I found this post. Not only did I finally “get it” but I was even able to put it together with a dynamic google line chart.

Thanks!
