Despite keywords being slightly out of fashion, thanks to the whole (not provided) debacle, a large part of an SEO's work still revolves around discovering opportunity and filling it with content that can rank.
When you are focusing on smaller groups of terms, there are plenty of tools to help; the Moz Keyword Difficulty Tool being a great example.
These tools work by checking the top results for a given keyword and looking at various strength metrics, giving you a snapshot of how tough that keyword will be to rank for.
The problem, though, is that these tools operate on the fly and generally only let you check a small number of keywords at a time. The Moz tool, for instance, limits you to 20 keywords.
But I need to check 100,000 keywords!
By the end of this tutorial you will be able to visualize keyword difficulty data in a couple of ways, either by keyword:
Or by keyword type:
Or by category of keyword, sliced by specific position in the results:
So what do we need to do?
All keyword difficulty tools work in the same way when you break them down.
They look at ranking factors for each result in a keyword set, and sort them. It's that simple.
The only thing we need to do is work out how to perform each step at scale:
Step 1: Get URLs
My preference for scraping Google is Advanced Web Ranking, which can pull the ranking results for large sets of keywords.
Quite a few companies offer this as a hosted service (including Moz), but the problem with that approach is that costs spiral out of control when you are looking at hundreds of thousands of keywords.
Once you have added your keyword set, run a ranking report of the top 10 results for the search engine of your choice. Once it's complete you should see a screen something like this:
The next step is to get this data out of Advanced Web Ranking and into Excel, using a "Top Sites" report in CSV format (the format is important! Any other format makes manipulating the data much tougher):
This presents us with a list of keywords, positions, and result URLs:
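For illustration only (the exact columns depend on your report settings), the layout I'm assuming for the rest of this post has keywords in column A, positions in column B and result URLs in column C, roughly like this:
Keyword, Position, URL
blue widgets, 1, https://www.example.com/blue-widgets/
blue widgets, 2, https://blog.example.org/widgets/blue/
brown widgets, 1, https://www.example.net/brown-widgets/
If your export looks different, just adjust the cell references in the formulas that follow.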
So now we can start harvesting some SEO data on each one of those results!
My preference is to use the fantastic Niels Bosma Excel plugin and the MajesticSEO API to access their Citation Flow metric.
Equally, though, you could use the SEOgadget Excel tool alongside the Moz API. I haven't tested that combination as thoroughly, but it should give you pretty similar results if you are more used to those tools.
Step 2: Analyze results
Now that we have a nice set of the top 10 results for your keyword list, it's time to start pulling in SEO metrics for each of them and building some actionable data!
My preference is to use the Niels Bosma Excel plugin, as it's super easy and quick to pull the data you need directly into Excel, where you can start analyzing the information and building charts.
If you haven't already done so, you should start by downloading and installing the plugin available here (note: It's for Windows only, so if you are a Mac user like me, you'll need to use Parallels or another virtual machine).
In the column adjacent to your list of URLs you simply need to use the formula:
=MajesticSEOIndexItemInfo(C2,"CitationFlow","fresh",TRUE)
This formula gives you the CitationFlow number for the URL in cell C2. Obviously, if your sheet is formatted differently, you'll need to update the cell reference.
Once you see the CitationFlow appear in that cell, just copy it down to fill the entire list. If you have lots of keywords, right now would be a great time to go grab a coffee, as this can take some time depending on your connection and the number of results you want.
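If you want a second strength signal alongside it, the same call should work for Majestic's other URL metrics - I'm assuming here that the plugin accepts any metric name the API exposes, just as it does CitationFlow - for example, TrustFlow:
=MajesticSEOIndexItemInfo(C2,"TrustFlow","fresh",TRUE)
Copy that down the same way if you want both columns.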
Now you should be looking at a list something like this:
Which allows us to start doing some pretty incredible keyword research!
Step 3: Find opportunity
The first thing that you probably want to do is look at individual keywords and find the ranking opportunity in those. This is trivially easy to do as long as you are familiar with Excel pivot tables.
For a simple look, just create a pivot of the average Citation Flow for each keyword; the pivot table wizard will look something like this:
Of course, you can now visualize the data by creating a simple chart. Apply the above data to a standard bar chart and you will begin to see the kind of actionable picture we can build:
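If you would rather skip the pivot table, a plain formula gives you the same number - a minimal sketch, assuming keywords sit in column A, CitationFlow in column D, and the keyword you want to summarize in cell F2:
=AVERAGEIF($A:$A,F2,$D:$D)
That returns the average CitationFlow of the results captured for the keyword in F2, which is exactly the figure the pivot table charts.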
This is just the beginning, though! If you create a pivot chart across a large dataset and look at the average Citation Flow for each position, you can see interesting patterns develop.
This example looks at a dataset of 52,000 keywords, taking the average score of the sites appearing in each position of the top 10 results:
As you can see, across a large dataset there is a really nice degradation of strength through the top 10 results - a real vindication that the data we are looking at is rational, and a good indicator of how strong you need to be to rank a given page (provided the content is sufficient and focused enough).
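You can reproduce that positional view without a pivot chart, too - again a sketch, assuming positions are in column B and CitationFlow in column D, with the position number (1 to 10) in cell F2:
=AVERAGEIF($B:$B,F2,$D:$D)
Fill that down against a column of positions 1 to 10 and you have the data behind the chart above.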
You really want to slice the data into categories at this stage, to identify the areas of quickest opportunity and focus on building content and links toward the areas where you are likely to earn traffic.
The chart below compares three categories of keywords, sorted by the average Citation Flow of the results in each category:
From this we can see that, of the three keyword categories, we are likely to rank higher up the page for keywords in the "brown widgets" category. Having said that, we are also able to rank lower down the page in the "blue widgets" category, so if that category has significantly more traffic it might prove a better investment of your time and energy.
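To build that comparison, tag each row with its keyword category and average by both category and position - a minimal sketch, assuming the category label sits in a new column E, positions in column B and CitationFlow in column D:
=AVERAGEIFS($D:$D,$E:$E,"blue widgets",$B:$B,1)
That gives the average CitationFlow of the position 1 results in the "blue widgets" category; swap the category text and position number (or point them at cells) to fill out the full comparison.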
There you go!
We have created a homebrew keyword difficulty tool, capable of analyzing hundreds of thousands of URLs to mine for opportunity and guide your content and linkbuilding strategies!
There is so much you can do with this data if you put your mind to it.
True, scraping Google's results is, strictly speaking, against their Terms of Service, but they have a habit of using our data, so let's turn the tables on them for a change!
Thanks for the mention, Martin - I *strongly recommend* you give a few alternative methods a try.
Firstly, your approach with SEO Tools for Excel makes a single API call to Majestic for each row. If you're making single calls, you'll be up all night hoping you didn't crash Excel.
It's worth learning the proper API functions with batch requests. We solved that problem in SEOgadget for Excel with API call batching. It works with Majestic, Ahrefs and Mozscape, and is reliable up to 20,000 lines of data per batch. The GetIndexItemInfo call (equivalent to the URL Metrics call from Moz) is massively powerful and can return a lot of data very quickly in a single call, even in Excel.
Something like this would work:
=majesticAPI_toRange("majesticmetrics","GetIndexItemInfo","fresh",[yourtablerange])
("get URL metrics data for all rows of URLs found in [yourtablerange] from the fresh index")
Here's a helpful video and here's an introduction with a link to the manual :D
AWR - in the US, use GetSTAT. The UI is good and the hyperlocal granularity is impressive - local ranking flux is really big at a state level, but that's impossible to get via AWR. Their API is excellent, wicked fast and easy to parse.
Cheers,
Hey Richard, thanks for the comment. Any chance you could build a nice wizard-based UI like Niels' for the next release? I think that would push market adoption a lot further :)
Yeah, *maybe*. Wizards don't tend to correlate well to lateral thinking, though.
Our stuff is far more representative of the API call constructions offered by the tool vendors - it's aimed at people who need something quite a lot faster and more powerful (arrays, large data, etc.), who already have a basic grasp of syntax and query construction, and who can see the ideas behind unique combinations of calls across API services. Those same people tend to use these features for prototyping or to have an informed conversation with a developer.
It might not be for everyone, mind you - and I'm absolutely fine with it being that way.
Here's the manual, tell me what you think: https://seogadget.com/wp-content/uploads/2013/10/SEOgadget-for-Excel-Getting-Started.pdf - does that look like it needs a wizard?
Again, good post, because there's food for thought in here for sure.
PS: I think you should be including the data from Grepwords in here (again - via API) - they have reams of Google search volume data, a data point that now eludes most tool providers. Another wicked fast API :D
You know what, when I was playing about briefly with your plugin I signed up for grepwords and bought one of the premium plans, and then never went back to it.
I totally should play with that more and use its data in my iterations of the above (which include a load more data points as well, but the above was only really meant to be a starting point).
I'll dig further into your plugin over the next few days now that I have more time on my hands, and do a more in-depth follow-up post with some download files etc. on my blog ;)
+1 to STAT.
Great post Martin. Would be worth making a tool for this methinks. ;]
+1 thanks for the new tool!
It always scares me when a tool doesn't release pricing. Anyone know of a ballpark range for GetStat?
Richard
I tried to do as you suggested with SEOgadget and use the batch call to the Moz API like this:
=MOZ_URLMETRICS_toRange("Table7",D3:D26265,"Cols=16384")
I do get a response, but it is "An error occurred: The remote server returned an error: (503) Server Unavailable".
I guess this could have something to do with a rate limit on Moz's side. Any ideas, Richard?
Great post Martin - keyword rankings have been something we have been keeping a very close eye on. Whilst Google's (not provided) update has made things harder to track, I agree that it has done us all a favor in a way, since we have to undertake a lot more in-depth research now, leading to a better overall understanding of our situations.
I still suspect they will release this data in time, but for some kind of a fee.
Regards,
PK Group
Could not agree more! We will become better (more rounded) marketers because of (not provided). Then, when we no longer rely so heavily on keyword data, they will bring it back as a paid feature.
Trevor Stewart
I currently use Moz as an initial keyword tool and it works pretty well. Starting to see a bunch of results that I didn't see when using analytics. A little more approachable and transparent but you get what you pay for.
I think both are right - we have to focus on keywords (more on rankings here, I think), just like PK Group says - but really it's not so much a keyword view as a "simple" ranking view.
Trevor Stewart says that we will become better (more rounded) marketers - and that's true, but it doesn't mean we have to lose the SERP ranking focus.
SEO is always a very SERP-focused world - no matter whether we are currently looking at keywords, topics, authorship, or social media in general.
Both are right - but "keyword rankings"? It is just SERPs, so of course we have to watch our visibility.
A very good post.
Very cool insights, Martin!
I think you're quite right about investing in the keywords you are most likely to rank for, instead of investing in the highest-volume search terms that larger sites will most likely rank higher for. I think people often attempt to rank for the most heavily searched terms because that's what the data SEEMS to be telling them.
From your example, we can see blue widgets may well have the higher search volume, but as you said, the higher-ranking sites will win the audience who search for that term, not a smaller company. Definitely some smart advice to keep in mind.
Thanks again for sharing!
Interesting post! I like the approach, but is the quality of the average Citation Flow score significantly better than estimating difficulty based on data provided by the Keyword Planner? Also, how do you balance this against opportunity?
I cannot thank you enough for this post. Seriously, I've been trying to make something like this for a long time. This is extremely helpful. Thanks!!
Hey Martin, I always give first priority to keyword research because it is the key to success, and I use a lot of keyword research tools for it, but you have really given us something amazing. I hope this will bring our SEO work to a higher level. Thanks, bro.
Great post Martin - thanks for the information about these tools. Rather than checking keywords manually, it is a better approach to follow this procedure and check keywords in bulk, which saves a lot of time.
Great article, thank you for sharing, Martin.
Re the STAT tool: I have been using it for almost two years. It is a REALLY good tool - I finally have everything in one place.
For example, you can manage huge volumes of terms, track their rankings every single day (across different search engines), pull their average search volumes, AND MOST IMPORTANTLY you can TAG each of your terms (e.g. by landing page type, keyword group, or keyword difficulty - whatever you need).
When I started working on my current client, I tagged 16k terms, so now I can see how the rankings improve per tag group (you can watch historical graphs of those rankings, etc.). Tagging is definitely very helpful for checking whether an algorithm update affected your site - it is much easier to find which pages were impacted, so you can dig for the reasons behind it more quickly. I find it pretty amazing! If you are a control freak (which I don't find helpful in the SEO profession ;), this tool definitely helps you stay calm in times of algorithm updates :)
Finding the right keywords for a website is the most important task in SEO, because the whole foundation is built upon the focus keywords, and those keywords will help the site rank high in search results. Your guide to creating your own mass keyword difficulty tool is superb and very useful for all SEOs.
Not to mention that everything is always changing a ton!
Thanks, this is very useful and well explained!
Brilliant Article Martin.
I would like to do the same for 1 lakh (100,000) keywords. Can you answer the following questions?
1. Which version of Advanced Web Ranking will support this task? I have the Standard version.
2. Which Trusted Proxies plan should I take?
3. Which Moz API plan is required to make this job easier?
Thanks in advance.
This is interesting, but I don't see the value in the charts unless I'm targeting 10 keywords and want to look at a pretty chart with the DA of the top 10 sites that rank for each keyword... Why not use filters instead? Sometimes a pretty chart can over-simplify the data and abstract away context (I'd say there are a lot more factors at play here, but we're relying on Moz / Majestic to cram them all into one integer).
Thanks for sharing a new method of optimizing mass keywords performance.
Thanks for sharing this useful information. These kinds of keyword tools are really effective in SEO for finding large numbers of keywords accurately, though sometimes we lose that accuracy. Rather than checking keywords manually, it is a better approach to follow this procedure and check keywords in bulk, which saves time.
That said, I prefer the Google keyword tool for a static website; for dynamic websites it's better to follow this procedure.
[link removed by editor]
First, this is a fantastic article. Like others, I've been trying to string something like this together, and having this as a guide is really helpful. Now on to the "but wait..." part. I have a file with 20k rows, and I simply can't execute =MajesticSEOIndexItemInfo(C2,"CitationFlow","fresh",TRUE) on that many rows - Excel fails. Nor can I get the SEOgadget version to finish executing using _toRange. I can get both of these to work if I do them in batches of 1,000 or so, but that obviously defeats the purpose of a mass keyword analysis - I'd have to run the batch command manually 20 times. Imagine if I had 100-200k rows like the article suggests. Any ideas or pointers on how to execute these commands would be greatly appreciated.
Thank you Martin,
Keywords are the most important thing in SEO, and you have provided an excellent keyword tool. I will also recommend this to my other SEO friends.
Hey Martin, great post! 100% practical =)
My five cents on it: use Scrapebox's Page Authority addon to get PA for all pages. It uses Moz's API, which can speed things up quite a lot.
Cheers!
At the moment Scrapebox has quite a conservative ten-second delay between single URL API calls, and doesn't differentiate between free and paid API holders. It's OK if you only have a few to run, or plenty of time to run them in, but if you're scaling this there are faster ways to grab the information. It does give you Domain Authority and MozRank as well, though.
Thanks for the heads up Martin ;)
Great article, I've been running AWR internally for a long time but relying on the Moz KW Difficulty tool too heavily for projects which have outgrown it. Thanks for talking about something which should probably have been more obvious to me given the tools available, much appreciated!
Thanks, Martin, for sharing such a detailed post with us about keyword difficulty tools. I'll definitely use these tools as soon as possible.
Thanks Again
Hello Martin. First, I salute you for sharing such a fantastic post with us. You have really broken the ice - when Google Analytics started returning "not provided" for keywords, a lot of rumors spread like wildfire. Now we have a solution.
Regards
Hey Kandarp, thanks for the nice words - indeed, this approach came about as a workaround for some of the not provided issues, but it's grown far beyond that now in my own implementation. The current issue with Google has forced us all to be more inventive and innovative - I think they've done us a favour!
Nice post Martin, great idea. I do agree, though, that keywords are no longer flavour of the month and that the keyword research stage of a project should no longer carry the weight it once did. Instead of just looking at competition vs. search volume, it's becoming much more important to look at relevance and context and use groups of related keywords instead of just one or two per page. Still, this is a great idea and definitely still has a place in SEO... for now! ;-)
Hey Emma - thanks for the kind words! I agree that this kind of approach is becoming more necessary these days, to be fair though, it would have been a good idea to do this whether keywords in GA were provided or not :)
M
Great point Martin! I wonder if they would have removed keyword data if we had taken other avenues in the first place?
Very interesting viewpoint, Martin. It's a battle I often find myself having with clients; they want to rank for vanity phrases with a new domain, weak backlink profile and very little in the way of content. When I suggest targeting phrases that are far more specific and relevant to their business they look at me like I am some sort of alien!
Martin, great stuff!! Keyword research is still very important, and I find it becoming more and more challenging. This gives us a workaround for gathering the data we need.
Thanks for the heads up, Martin. Also, cool insight on the analysis part. It all depends on how we plan and execute our marketing campaigns. Well-planned research and execution lead to a better user experience, and that ultimately leads to better marketing. Thanks for sharing the useful tools.
Congrats, Martin, on this post, which reminds me of some very actionable talks from your freelance consultant era (a sign of your return to independence, maybe?).
Just one note about AWR. It is an amazing tool that has recently renewed itself in a few ways. Of all the options, I'd suggest using the install-on-the-server one (do you agree?).
Moreover, if you monitor thousands or more keywords, pay attention to the IP range you use for scraping Google SERPs, independently of the tool you decide to use, because Google tends to block calls from scrapers after a while. Maybe it would be useful, Martin, if you could recommend a few good IP/proxy service providers in that sense.
Ciao!!
Hey Gianluca - thanks for the comment! :)
Personally, I run instances of the desktop version of AWR in virtual machines - I've got disk images I can just mount that contain the OS along with the base installed software, so I can roll out instances as required. It's a nice, scalable way of doing it. The server version is a must IF you have a number of people wanting direct access to the reporting suite at one time - in my case I don't need that (it's just me!), so I am more than happy with (multiple) instances of the desktop version.
Regarding IP ranges - I'll get round to posting something one day, but I can't give away all my secrets ;)
My standard advice, however, is to use trustedproxies dot com - it's not cheap, but they do work. Every other proxy supplier I tested had more reliability issues.
Great post Martin! Thanks for the tip about running multiple instances of AWR - the thought hadn't crossed my mind before. You're right, that is a very scalable way of doing it.
Thanks Gianluca and Martin for mentioning AWR.
I second the vote for Trusted Proxies. Our new AWR Cloud service is entirely based on proxies from Trusted Proxies and that says a lot about them.
As far as AWR Desktop is concerned, I think Richard Baxter is the absolute master in using multiple proxy servers with AWR. Maybe he can give us some good advice here.
My only suggestion is to avoid using entire C classes of IP addresses. Use a few IPs from different C classes for maximum protection. And always, ALWAYS use the "be nice to search engine" option. It's better to be safe than sorry. Alternatively, if you hate dealing with proxy servers, forget everything about them and use AWR Cloud instead.
I've also heard great things about GetSTAT but never had the chance to test it. Maybe Rob (@STATrob) can give us some details why his service is so awesome. Rob?
Awesome isn't something that can be explained. It must be experienced. ;)
Hi Rob,
I would like to do the same as Martin did. Can you help me with this? Could you message me the cost estimates?
Hey Phil. Folks, whenever I have something I don't understand about AWR, Phil emails me. Maybe it's really him or maybe his customer service team is just brilliant. Either way, my issues always get solved in record time. Phil, I'm a virtual walking billboard for AWR. ;-)
Thanks Jenny! I really appreciate it.
Hi Philip,
Which version of Advanced Web Ranking (desktop) will support this job? I have the Standard version.
Love AWR. And I recommend trustedproxies.com as an add-on - I've been using them for over a year with no major problems. If an IP address gets blocked, they give me another one, for a flat fee per month based on the number of proxies I need. Also, AWR does have the capability to get location-specific results. We use it all the time, and it's pretty accurate. Great post, can't wait to try this.
Jenny, can you tell me which Trusted Proxies plan is good for checking 10,000 keywords a day?
The tool seems really useful and accurate. Once I've used it and checked it for my website, I'll be more comfortable talking about it. I will use it, compare it with others, and surely come back with a detailed analysis. :)
Great Post Martin! I was already looking for something like this for the keywords.
Thanks for the post, Martin. This is good stuff. Before reading this post, I wasn't very familiar with AWR - I just saw it mentioned in a few tweets/blog posts, but didn't fully understand how it worked or more importantly, why it's superior to other rank tracking software. All in all, lots of really valuable takeaways here. Thanks again for sharing the insight :)