A useful indicator of SEO success is the number of unique keyphrases that send traffic to a website. An increase in this number generally reflects increased trust in the site by search engines.
Google Analytics can show you the total number of unique organic keyphrases at a glance, on the Traffic Sources ⇒ Keywords page. (Make sure you select 'non-paid' to exclude any CPC campaigns.)
This post will show you how to break that down to a more useful level of granularity and help you to create a table such as the following:
We'll aim to categorise traffic into three buckets: 'branded', 'head terms' and 'mid-long tail terms'. (In reality, we'll actually calculate the first two, and the third one will be 'everything that is left'.)
As we often can't export enough keywords from Google Analytics to do the analysis offline, we will have to use 'Advanced Segments' to do this. This means that we can only group together 'branded terms' and 'head terms' in ways that we can explain through AND and OR statements.
The process for doing this goes like this:
- Plan to create advanced segments that define each group of keywords you want to track
- Define rules using 'AND' & 'OR' statements that describe which keywords should be in each group
- Apply these groups each month, one at a time, to the previous month's data, in order to reveal the number of unique keywords.
The next sections may have particular appeal to the more 'techie' readers (or just those people feeling brave) - so do feel free to just skip down to the end to see screen-shots of these segments applied to the keywords report, if the nitty-gritty isn't your cup of tea.
Creating the 'Branded Terms' Segment
If you've not really implemented Advanced Segments before, I suggest starting with Google Analytics' help pages on the topic, but also having a play with the feature, to see how it works. (Really, do have a play. I'm going to assume you at least have understood what most of the main buttons do, and that's a great way to find out.)
Planning the Segment
Let's use a fictional company, TechNet, who make a product called the Vox9000. Their segment for 'branded terms' will include anything that mentions these terms.
Define the Rules, Create the Segment
To create the segment for branded terms, begin by clicking 'Advanced Segments' ⇒ 'Create new custom segment'.
In the first 'dimension or metric' space, add a 'Medium' block (found under 'Dimensions') and set Condition to 'Matches exactly' and Value to 'organic'. Then hit 'and' to add another section. Place a 'Keywords' block here, with Condition as 'Matches regular expression' and a value that is all your branded terms, separated by the pipe character: |
(NB: the pipe acts as an 'OR' in these regular expressions.)
As an example, TechNet (which people often search for with a space, as 'Tech Net'), which makes a product called 'Vox9000' (sometimes searched for as 'Vox 9000'), would use the following string here: technet|tech net|vox9000|vox 9000
Give the segment a name, and save it.
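To see which keyphrases the branded expression would catch before saving the segment, you can try it offline. The following is a minimal sketch in Python, not part of Google Analytics itself. It assumes that 'Matches regular expression' finds the pattern anywhere in the keyword and ignores case; the sample keyphrases are invented for illustration.

```python
import re

# Branded pattern from the post. IGNORECASE is an assumption about how
# Google Analytics compares keywords.
BRANDED = re.compile(r"technet|tech net|vox9000|vox 9000", re.IGNORECASE)


def is_branded(keyphrase: str) -> bool:
    """Return True if the keyphrase mentions any branded term."""
    # search() looks for the pattern anywhere in the phrase, which is the
    # assumed behaviour of an unanchored 'Matches regular expression' rule.
    return BRANDED.search(keyphrase) is not None


# Hypothetical sample keyphrases, invented for illustration.
samples = ["tech net laptops", "buy vox 9000", "laptops in philadelphia"]
branded = [k for k in samples if is_branded(k)]
print(branded)  # ['tech net laptops', 'buy vox 9000']
```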
Creating the 'Head Terms' Segment
Planning the Segment
The next segment - the head terms - is a bit more complicated, and you'll see why it's important for us to specify rules that will define the head keyphrases.
Let's imagine that TechNet sells laptops and notebooks in Philadelphia and Baltimore. (Head terms will therefore be those such as 'notebooks' or 'laptops in philadelphia'.)
In this example, the rules to define head terms might be:
- the phrase can't mention any branded terms
- it must mention one of their product groups (laptop, notebook)
- it can have at most two words of 3+ characters (this allows for some short linking words, such as a, in, at, etcetera)
- it can have a maximum of four words in total.
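Before turning these rules into segment conditions, it can help to write them down as a plain check. The sketch below is one possible reading of the four rules in Python, not the Google Analytics implementation. The function name and sample phrases are hypothetical, and words are assumed to be separated by spaces.

```python
import re

BRANDED = re.compile(r"technet|tech net|vox9000|vox 9000", re.IGNORECASE)
PRODUCT = re.compile(r"laptop|notebook", re.IGNORECASE)


def is_head_term(keyphrase: str) -> bool:
    """Apply the four head-term rules to a single keyphrase."""
    words = keyphrase.split()
    long_words = [w for w in words if len(w) >= 3]
    return (
        BRANDED.search(keyphrase) is None         # no branded terms
        and PRODUCT.search(keyphrase) is not None  # mentions a product group
        and len(long_words) <= 2                   # at most two words of 3+ characters
        and len(words) <= 4                        # at most four words in total
    )


print(is_head_term("laptops in philadelphia"))        # True
print(is_head_term("cheap laptops in philadelphia"))  # False: three long words
```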
Define the Rules, Create the Segment
The last two rules can be the trickiest to implement, so we'll look at these first. Two insights help us solve these requirements:
Insight 1: Combining the two rules, and using S and L to indicate short words (1 or 2 characters) and long words (3+ characters) we see that the only twenty possible structures for keyphrases are: L, LS, SL, LL, LSS, SLS, SSL, LLS, LSL, SLL, LSSS, SLSS, SSLS, SSSL, LLSS, LSLS, LSSL, SLLS, SLSL, SSLL
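If you would rather not list the structures by hand, they can be enumerated. This is a small Python sketch; it assumes that every structure needs at least one long word (the product-group term is itself longer than two characters), which is why all-short phrases such as 'SS' are left out, as they are in the list above.

```python
from itertools import product


def keyphrase_structures(max_words=4, max_long=2):
    """List every S/L structure with 1..max_long long words and up to max_words words."""
    structures = []
    for n_words in range(1, max_words + 1):
        for combo in product("LS", repeat=n_words):
            n_long = combo.count("L")
            if 1 <= n_long <= max_long:
                structures.append("".join(combo))
    return structures


structures = keyphrase_structures()
print(len(structures))  # 20
```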
Insight 2: The regular expression \b[^ ]{3,50}\b matches a word of between 3 and 50 characters. It's also necessary to know that ^ matches at the beginning of the keyphrase, and $ matches at the end. (Seriously, they do. Start by going through the examples at this site if you want to know why that's the case.)
We're now in a position to take the list of combinations from 'Insight 1' and replace 'S' with \b[^ ]{1,2}\b (matching words of one or two characters) and 'L' with \b[^ ]{3,50}\b, putting spaces in between, wrapping in parentheses, and matching at beginning and end. Missed that? OK, here are examples of some of the resulting statements:
L becomes ^(\b[^ ]{3,50}\b)$
SL becomes ^(\b[^ ]{1,2}\b \b[^ ]{3,50}\b)$
LSL becomes ^(\b[^ ]{3,50}\b \b[^ ]{1,2}\b \b[^ ]{3,50}\b)$
etc.
You should join the twenty created expressions together using a pipe character, to create the resulting, massive, expression. To save space, I won't post the whole expression here.
NB: There seems to be a limit to the number of parts to an expression that you can put into Google Analytics, so I tend to break this up into two parts - say, those matching on three or fewer words, and those matching four - and put them as 'OR' alternatives in one section. I've done that below to demonstrate.
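Typing out twenty expressions by hand is error-prone, so you may prefer to generate them. The following Python sketch builds each expression from its S/L structure, then joins them into the two parts described above (three or fewer words, and four words). The helper names are hypothetical, and Python's regex engine is assumed to behave like Google Analytics' for these patterns.

```python
import re

SHORT = r"\b[^ ]{1,2}\b"   # 'S': a word of one or two characters
LONG = r"\b[^ ]{3,50}\b"   # 'L': a word of 3 to 50 characters

# The twenty structures from 'Insight 1'.
STRUCTURES = (
    "L LS SL LL LSS SLS SSL LLS LSL SLL "
    "LSSS SLSS SSLS SSSL LLSS LSLS LSSL SLLS SLSL SSLL"
).split()


def structure_to_regex(structure: str) -> str:
    """Turn e.g. 'SL' into an anchored expression for that structure."""
    words = [LONG if letter == "L" else SHORT for letter in structure]
    return "^(" + " ".join(words) + ")$"


def join_structures(structures) -> str:
    """Join the per-structure expressions with the pipe ('OR') character."""
    return "|".join(structure_to_regex(s) for s in structures)


# Two parts, to stay under the apparent limit on expression size.
up_to_three_words = join_structures(s for s in STRUCTURES if len(s) <= 3)
four_words = join_structures(s for s in STRUCTURES if len(s) == 4)


def matches_structure(keyphrase: str) -> bool:
    """True if the keyphrase fits any of the twenty structures."""
    return bool(re.search(up_to_three_words, keyphrase)
                or re.search(four_words, keyphrase))


print(structure_to_regex("SL"))  # ^(\b[^ ]{1,2}\b \b[^ ]{3,50}\b)$
```

The two strings, up_to_three_words and four_words, are what you would paste into the two 'OR' conditions of the segment.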
The resultant segment rules for 'Head Terms' look like this:
- Dimension: Medium, Condition: Matches exactly, Value: organic
- AND
- Dimension: Keyword, Condition: Does not match regular expression, Value: technet|tech net|vox9000|vox 9000
- AND
- Dimension: Keyword, Condition: Matches regular expression, Value: (the joined expressions for phrases of three or fewer words)
- OR
- Dimension: Keyword, Condition: Matches regular expression, Value: (the joined expressions for phrases of four words)
- AND
- Dimension: Keyword, Condition: Matches regular expression, Value: laptop|notebook
Collecting the numbers
With our two Advanced Segments defined, we can head back to the 'keywords' page and set the date range to the last month.
We can apply each custom segment in turn, in order to collect the following numbers for September:
- Total keyphrases: 64,278
- Branded keyphrases: 393
- Head keyphrases: 2,835
- Other keyphrases: 61,050 (calculated from the previous three numbers)
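The 'other' figure is simply what remains once the two segments are subtracted from the total. As a quick check of the arithmetic, here is a small Python sketch using the September numbers above. It relies on the branded and head segments not overlapping, which holds here because the head segment excludes branded terms.

```python
def other_keyphrases(total: int, branded: int, head: int) -> int:
    """Mid-long tail count: everything that is neither branded nor a head term."""
    return total - branded - head


# September figures from the list above.
other = other_keyphrases(total=64_278, branded=393, head=2_835)
print(other)  # 61050
```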
You can now put these numbers in a spreadsheet in order to chart the change in number of unique keyphrases as months go by.
You can use these basic techniques to create and report on even more well-defined segments of keyphrases (for example, you could group keyphrases by competitiveness, department, intent, etc.). If there are particular steps here that require more explanation, or you're looking for more ideas about how to apply this to your SEO reporting structure, drop a comment below.
Interesting concept.
You should be able to match your 20 combinations of 'S' and 'L' words using the following regex:
^(\b[^ ]{1,2}\b|\b[^ ]{3,50}\b)( (\b[^ ]{1,2}\b|\b[^ ]{3,50}\b))?( (\b[^ ]{1,2}\b|\b[^ ]{3,50}\b))?( (\b[^ ]{1,2}\b|\b[^ ]{3,50}\b))?$
You could also reduce the size of your initial keyword match using the '?' quantifier:
tech ?net|vox ?9000
The '?' basically says the character or group preceding it is optional. So in the example above, 'tech ?net' matches 'technet' or 'tech net'.
And in the first example with the word combinations, the last 3 combinations of 'S' or 'L' are optional, while either an 'S' or an 'L' word is required at the start.
I'm sure there's some more efficient regex you could use for the 20 word combos but at least that means you need 1 less dimension in your Advanced Segment.
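The two patterns in this comment can be tried out offline. The following is a minimal sketch using Python's re module, which is assumed to behave like Google Analytics' regex engine for these patterns; the sample phrases are invented. It suggests that the compact pattern accepts any phrase of one to four words, so it covers all twenty structures but does not by itself enforce the limit of two long words.

```python
import re

# The compact word-structure pattern from the comment, split over
# several lines for readability.
COMPACT = re.compile(
    r"^(\b[^ ]{1,2}\b|\b[^ ]{3,50}\b)"
    r"( (\b[^ ]{1,2}\b|\b[^ ]{3,50}\b))?"
    r"( (\b[^ ]{1,2}\b|\b[^ ]{3,50}\b))?"
    r"( (\b[^ ]{1,2}\b|\b[^ ]{3,50}\b))?$"
)

# The shortened branded pattern, using '?' for the optional space.
BRAND = re.compile(r"tech ?net|vox ?9000")

# Hypothetical sample phrases.
print(bool(COMPACT.search("laptops in philadelphia")))        # True  (LSL)
print(bool(COMPACT.search("cheap laptops in philadelphia")))  # True  (LLSL, three long words)
print(bool(COMPACT.search("cheap laptops for sale in philadelphia")))  # False (six words)
print(bool(BRAND.search("tech net")), bool(BRAND.search("technet")))   # True True
```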
Hi JD80, welcome to SEOMoz and thanks for the comment.
You're quite right that my expressions could be more efficient. They can be a bit slow to apply to large reports at the moment, so I'll try simplifying and see how much difference it makes.
I love the concept here and the value you can get from this segmentation. Particularly when search traffic is rising or falling, it's so essential to know where you're losing it - from the head, branded terms or tail. Great walkthrough, Rob!
The branded search terms report is one of the most powerful reports that web analytics can produce and should be part of everyone's reporting arsenal.
Why?
Because it speaks directly to the 'traditional' marketers.
Having empirical data on brand performance is something that all consumer marketers salivate for!
If you have any offline marketers, brand marketers, trade marketers etc. they can make use of the branded search report as a measure of brand awareness and to gauge the performance of offline marketing campaigns.
For example, if you launched a new TV or radio ad, monitor your brand searches over time - did they increase or decrease?
If your competitor launched a new product, did it erode your brand presence or have no visible impact? (tip: look for people searching for your brand and the competitor's brand at the same time: "compare technet vox9000 cyberworld abc2000")
The branded search report is also useful to gauge the performance of your SEO: How much of your SEO traffic is actually people searching for your term? A high percentage of branded searches could either mean you have excellent brand awareness or more ominously, your SEO needs work as people are not finding you for generic terms.
Great advice Rob - we normally use Omniture for our branded search reporting, so it's nice to know the GA folk don't miss out either :)
Rocking! Can't wait to try this one.
Awesome post Rob. Now I don't have to tear out my remaining hair trying to figure out the RegExs.
Wouldn't it be a better idea to dive deeper into the 'other' keyphrases? Those make up around 95% of the search volume in your example data...
Great post! We're seeing some odd results - in all segments (even all traffic) there are a number of keywords at the end that show up as traffic-sending keywords but in fact show a result of 0. Should we be counting these? What are these and why are they showing up? In addition, in the branded segment, there seem to be a few keywords (less than 1%) that 'slip' through the filter - I have no idea how. Coincidentally, these keywords that slip through are all reporting 0 as well. Anyone else experience this? Check at the very end of the keywords report on any given segment.
Thanks!
Ah yes, those mystery keyphrases!
I think they are explained by the first situation that Will mentions here.
I would count these keywords - you rank for them, and they sent traffic, so they ought to be in the total.
Thanks! Now that accounts for the zero keywords, but why would they show up when they're not 'meant to'? Meaning it seems as if the filter is working perfectly but there are a few that don't match the filter criteria and still 'slip through'. For example, if your company is called Smith Widgets and you filter for all organic traffic whose keywords contain the word 'smith' (branded keywords), there are still a few keywords in that segment slipping through, like 'great widgets' or 'widget sites', etc. I read Will's post quoting the analytics guru who said the zeros could represent a user who, within the same session (29 mins without clearing cookies), searched for another keyword and again landed on your site.
Couldn't this then represent someone who either first searched for a branded keyword (smith widgets), arrived on the site, then went back to google and typed a more generic head term (great widgets), and then arrived on your site again? Just trying to understand why in the heck these non-branded terms are showing up with zeros in the branded segment and how to treat them.
Thanks !
I had no idea you could group keywords together like this. Great advice!
Great walkthrough, like the way you bucket them. Definitely will help to segment keywords rather than just drowning in too many of them. As always, regex rules!
Awesome post. Recently went to a Google Analytics conference in Chicago...I wish they gave more practical tips like this.
Thanks for the step by step advice! Very useful!
I may not be the only one who thought of this...but I was able to copy/paste the list of word combinations into Notepad, placing a space between each S and L and entering each combo on its own line. Then use Find/Replace to replace the S's and L's with the appropriate regular expression. This really sped up the process of creating my regular expression!
Hope that helps someone.
These equations are innovative.
Regex and pivot tables are the way forward!
Another alternative is to use the "&limit=50000" trick - it bypasses the GA 500 keyword limit and allows you to export 50,000 at a time. You can get the first 50k, then the next 50k and so forth until you have them all, then categorise/play around with them in Excel.
RobOusbey's post is a better solution though, and would save more time in the long run.
Great post Rob, and outstanding instruction for a moron like me. Already set up and now waiting for the next in what I sense might be a long and excellent series of posts on Analytics... which is much needed.
Rob, thank you so much for this detailed post and for making it all so easy to understand. I went through your instructions on one of the corporate accounts I manage and was amazed at the results. I then copied the segments onto other accounts and tailored them accordingly and wow! incredible, beautiful data, I have been looking at GA data for about 4 hours.
This post is, without any doubt, one of the most useful posts I have personally come across in SEOMoz. I now need to try and see if I can emulate it all in Webtrends. Thank you again
Great post, this will help me in my day-to-day work. My customers like keyword analysis because they want to know the ROI attached to brand keywords, generic keywords, long tail...
I think this kind of stuff is so useful for this goal.
Anyway, I have a question: is it possible to know the gender (male or female) of visitors who use a particular kind of keyword?
Thanks for the comment. Yes - you can see the gender (and other demographic segments) of people who use particular keywords, using the Microsoft Advertising Intelligence tool. (You will need Excel, and might need an MS Advertising account.) You can then put in your keyphrases, and get out a wealth of data that Microsoft has collected, including gender data such as this:
https://imgur.com/VL7N6.png
Incredible post. I love the regular expression stuff :-)
This is a great idea.
I'm going to go set this up right now.
Hi Rob,
I really love the concepts here, especially from the side of segmenting branded and non-branded terms as this is something that tends to distort monthly search stats for clients. I am already planning on implementing this for most sites by the end of the month.
I tried to create a segment following your instructions for the “Creating the Branded Terms Segment” and am not sure if I followed your instructions correctly. I took a screen capture of this segment. If you have some time, can you verify if this is correct or if I botched it?
Also, I noticed that you used “laptop|notebook“ for the “Head Terms” segment. I am curious how you would address segmenting alternate and misspelled versions of these keywords such as:
Thanks for any clarification and the awesome post.
You can use regular expressions to accommodate for the common misspellings or addition of spaces in the terms....like this:
lap[- ]*top|not[-e ]*bo?ok
just off the top of my head...that should work for your variations. I was able to perfect my regex skills yesterday by looking at LunaMetrics blog post on using regular expressions with Google Analytics.
cnoble, Thanks for the insight. I'm still working my way through wildcard matching.
Now that's an interesting note, I was aware of these regex options, just haven't looked at it from this (misspelling) perspective.
It is a shame that you can't do this on an "account" basis: if you have access to 100-odd clients' analytics, one segment applies to them all ... Excellent post though, definitely something I will be setting up for some of the more dedicated clients.
Once again SEOMoz finds and posts the most amazing information. I love having you guys out finding these awesome posts where the research has been done properly for a solution such as this.
Keep up the great work!
The segmenting idea is great. It really allows for a better understanding of an SEO campaign's success.
There was one thing I'm hoping you can clarify. You are segmenting the head based on the number of words in the keyphrase and whether it mentions a product group keyword. Is there a reason you did this instead of segmenting based on keyword volume?
That's what I call really drilling down!
It's great to see such a detailed set of instructions. Thank you.
In the days before advanced segments were introduced, I used the same approach with extra profiles within GA.
I would use a custom advanced filter to match the contents of the keyword field and then output a word like 'Brand' or 'Head', overwriting the original content.
The down-side of this approach is that this "Bucketed Keywords" profile no longer contained the original keywords, so you would have to switch back to a plain profile to see them. Filters also only work going forward, they do not act on existing data, unlike advanced segments. So this approach only works for future months.
But this approach might still have a use for people who are interested in having the information easily available in all reports. At the moment there are still places where advanced segments don't work - notably funnel reports.
Tim
Great post. Going to spend some time working on all the regex.
Would it be at all useful (more or less) to divide the number of search terms by the traffic sent (ie- sent 2,000 visits via 500 search terms = 25%) and compare that month to month? Or even branded/head phrases as a percentage of total phrases month to month?
I work in an industry that is heavily seasonal, and when I applied this method (works great, btw, thank you), the trend simply seemed to follow that the reason there were more or less search terms was because there were greater or fewer visits and searches overall.
I guess that changes in that ratio (essentially 'average traffic per head keyphrase') might just match the seasonality of 'average searches per keyphrase'.
However, you could perhaps use it to spot keyphrase groups that are relatively less valuable to be competing on as time goes on.
One of the most informative posts I have ever read on this blog. Am going to try this out. Thanks a lot Rob!
Really very nice information. I am a regular reader of SEOmoz and I have already subscribed to this blog.
Hi,
Great. I am going to use it now.