And so it ends. By the time this post goes live, Yahoo Site Explorer will be gone. Let’s take a moment to silently reflect on the passing of a once great SEO tool.
Thankfully, there are a heap of Yahoo Site Explorer alternatives, with arguably more powerful features available than Y!SE ever had. Today, we’re going to take some fresh link data from your favourite link information mining tool of choice and supplement the hell out of it with even more data. Yey – let’s build a better Yahoo Site Explorer replacement.
A Special Thank You
This (long overdue) post wouldn’t be possible without the assistance of one of the SEO industry’s most unsung heroes – Niels Bosma. He’s the genius behind SEO Tools for Excel, which has opened up another level of SEO data analysis for Excel geeks all over the globe.
I’ve written about Niel’s amazing work over at SEOgadget before, so check out these links for a primer on the basics or an introduction on how to find lost links and get SEOmoz API data into Excel. When you’re done, we’re good to take it to the next level.
Here's the finished product, click the image for a massive, full screen image of this awe-inspiring spreadheet.
#1 - How Many Likes, Google+’s and Tweets Were Received to My Linking URL?
Let’s start nice and easy with a count of the number of Likes, Google+’s and Tweets received by a URL. What’s not to love about a page that received a lot of social love? These two queries will churn happily through your link data until you’ve got more social than you can shake a stick at.
Retrieve the Google+ count for a URL:
=GooglePlusCount()
Get the number of Tweets to a URL:
=TwitterCount()
Get the number of Facebook Likes to a URL:
=FaceBookLikes()
#2 - Are My Links Live and Accessible to Search Engines?
When you’re looking at link data, you’re looking at an internet that has been and gone. At least, you’re looking into the past – and we all know that link decay is an everyday part of the evolution of the internet. To take a super accurate snapshot of your link data, you really need to know if your link is still live.
On that note, check out this clever little formula:
=IF(XPathOnUrl(C2,"//a[contains(@href,'seogadget.co.uk')]")="","NOT FOUND","FOUND")
Translated, it means, “If you found a href link on this page with seogadget.co.uk in all or part of the href, say ‘FOUND’. If the response to that query was a blank cell, say ‘NOT FOUND’.” So you know, the SEO tools XPathOnUrl function returns nothing when no result is found.
#3 - Did Google Even Cache that Link?
In his post on automating SEO, Russ pointed out that not all of your backlinks may have been indexed by Google, and that you should identify them and link to them to get them discovered. That’s a very nice idea Russ! Russ’s solution was excellent, but required some fancy scripting work.
Assuming you’re not tracking new referrers with snazzy custom filters in Google Analytics, here’s an easy way to do it with Linkstant and the =HttpStatus function.
First, grab all of the new referring URLs. I do that with Scraper for Chrome. Export the URLs and then in a new Excel tab, put this URL in to cell A1:
https://webcache.googleusercontent.com/search?gcx=w&sourceid=chrome&ie=UTF-8&q=cache:
A cache: request will respond with a 404 if the URL is not cached. So, a simple concatenate, followed by a "=HTTPstatus" will give you a list of URLs that Google has cached.
This is probably not the best way, but it works just the way you'd expect it to, most of the time:
=CONCATENATE($A$1,[@URL])
Where $A$1 is our cache request URL.
Next, use this function to get the http status of the URL:
=HttpStatus([@Column1])
Grabbing the HTTP status of your URL list will give you a list of results like this:
#4 - Get Search Volume Data for Your Inbound Anchor Text
An interesting way to identify links that might be a little above the radar, penalty-potential wise is to look at the search volume for the inbound anchor text used in the link. I mean, if you’ve got a lot of massively overcooked, highly competitive anchors from PageRank 0 sites, you’ve got a problem.
If you’ve got an Adwords API key, then it’s a piece of cake to use the Adwords API Extension for Excel – simply take a copy of all anchor text in the data, copy it to a separate table, de-duplicate it and run this array formula:
=arrayGetAdWordStats(KW,"EXACT","GB","WEB")
Then, do a VLOOKUP back in your main table and you’ll have search volumes for every anchor text used in your inbound links.
#5 - Extract the Domain From the Linked to URL
In my link data I really like to know if there are any potential problems with the domain I’m getting links from. PageRank 0 links, with extremely competitive anchor text could spell trouble, or at least some less than savvy link purchases. We're spending a lot of our time lately cleaning up this sort of thing, and this method makes it a whole lot easier.
Check out this formula as a very simple way to extract the characters up to, but not including the first trailing slash in a URL (assumes there's a "https://" at the beginning of the URL):
=MID([@URL],8,FIND("/",[@URL],8)-8)
If some of your links are from homepages (which often they are), simply add this extension to display the full URL, should there be no trailing slash in the URL:
=IFERROR(MID([@URL],8,FIND("/",[@URL],8)-8),MID([@URL],8,LEN([@URL])))
#6 - Get PageRank for the Linking URL and Domain
Yes, you heard that right. The old school link auditor in me can’t shy away from the fact that while PageRank is pretty useless as an overall proxy to rankings, it will come in handy if you’re trying to get a sense of the overall quality of the backlinks of a website. Like I mentioned above, a lot of PageRank 0 links from cruddy sites, with highly competitive inbound anchor text might be something you should make yourself aware of.
Here’s how:
=GooglePageRank()
What Could You Build?
There are a few more tricks left that you should go and explore in SEO Tools. I also happen to know there’s an SEO Tools v3.0 coming very soon, and it will kick ass! Though I’m really grateful for Yahoo Site Explorer, I’m not going to miss it. It’s sort of like an Overture Keyword Tool situation. When that disappeared, there was outrage, now, silence.
Have fun rolling your own tools and, as always I'd love to hear how you're getting on! - follow SEOgadget on Google+
hi Mate, very cool review of OSE data.
Will be looking to incorporate some of these into our analysis.
But yeah now YSE has left us atleast we still have
- Majestic SEO
- Open Site Explorer
- Blekko
- Alexa
- Google
& many more ;)
RIP Y!SE, and HELLO robust data set. Thanks again, Richard.
Thank you James,
is there a special way or command to see backlinks in Blekko? never used it.
Think I'll take a look at Blekko and see what i can scrape. If you have any data you'd like to get hold of, let me know in the comments.
You can use alot of the same syntax's as you do on Goolge search, pulls up some interesting stuff that is for sure. + you can use some more Blekko style ones too example:
domain + /urlseo
domain + /links
domain+ /links /rank
Wow! Never used Blekko before - how good is that domain + /urlseo command? Thanks for posting those James - really useful :0)
*rolls sleeves up and heads to Excel*
Did you mentioned Alexa 8/ ?
Alexa does have its own link index, but it is not that great only valuable for a very quick overview from my opinion. If you want a more in depth data set you would never use it lol.
Ah!... ok, now I feel better ;)
That been said I have seen a few sites which I will not name using alexa link data as a main sorce. Not something I would advise on that is for sure. But for some people they like the quick and easy numbers it pulls up even if they are very wrong hehe
I hate you Richard!!!! Every time I start feeling good enough with Excel, you come with a post like this one and make me feel like a total newbie.
I start feeling the Monopoly (return to Start) syndrome!
Jokes apart, Richard, another amazing post!
Hehe! I'll take that as a compliment my friend! This reminds me of something Avinash said at Mozcon - his "give it up tip" was:
"Learn something new at least once a week"
You'll always be on top of your game if you can deliver that promise to yourself. Let me know when you're done and I'll go learn something new for you!
Realy great post, meany thanks!
Good Post :)) And i surprise for my web reference in your screen capture ;-)) Thanks :)
My site is https://carrero.es
I'm in awe of this post and the spreadsheet - it's amazing!
Thanks Jenni!
Thanks RichardBaxterseo for this post.. For me.. I have been always chasing the excel tools for my SEO purpose, and almost on every 10th day I get to see new Excel Tool ..Honestly I have a question to ask from you!
Where do you get all these stuff ? or else in other words..I want to make my skills little bit good in Excel tools, please provide me some resources to learn from... That's a humble request from my side to you sir..
Waiting for your reply...
Good post lost of great info I am very new to all this and have learned so much from reading awesome posts and comments on the web like this one. I learn so much from what people comment also I have build my own site it is https://www.thebestscentstore.com and have another site https://buyascentonline.scentsy.us I have been working very hard on them still have a long way to go and love that there is such wonderful info out there to share with people so they are able to learn and frow their blogs sites whatever is needed. So thanks again this is a site I will be keeping up with hope everyone has an awesome day and thanks again to everyone who comment and took the time to post this on here.
I'm using the =IF(XPathOnUrl(C2,"//a[contains(@href,'mywebsite.com')]")="","NOT FOUND","FOUND") to check if links are still live and it works brilliantly!
I'm noticing however, that if the link contains an image rather than anchor text it also returns "Not Found".
Anyone know of a way around that?
Mat, absolutely true)) I used this formula too (=IF(XPathOnUrl(C2,"//a[contains(@href,'mywebsite.com')]")="","NOT FOUND","FOUND") )
thanks for sharing this nice info...its awesome
This has been awesome week for new tools and methods.
Love this.
Great work Richard.
Thanks for sharing this Very good Extra of posts like this is learning something new about excel every time too!
Who needs football when I can spend the whole weekend building one of these? Thanks for the in-depth tutorial - very helpful content.
The possibilities with Excel keep amazing me, can't wait to try this all out. That being said, i'll definitely miss Yahoo's Site Explorer. It was a quick and easy way to check for backlinks that other tools might have missed.
Thanks for sharing!
Hey Martijn - yeah, something Y!SE was definitely good at was relatively fast levels of indexation of new links. A blog post from a few days ago may have links and Y!SE was a good way to see them. Alas!
One of the things that makes the people in SEO so cool...the bigger the fundamental shift in the landscape, the cooler the tools that emerge!
Thanks for sharing this ... double bonus of posts like this is learning something new about excel every time too!
Sha
Wow!!!
I think I need to brush up my excel skills. Thats awesome!!
You have saved me many hours of work by pointing me to your SEOmozAPI to excel tutorial. I've been trying to get data to excel in order "roll my own" in-depth keyword competition analysis sheet.
Also been trying to cook up a function to extract a domain name from a url. Another half-hour saved.
Thanks! I'm inspired! How am I ever going to get some sleep?
Hello Sir!
Here, you might like this:
______________________
Function CheckPath(URL As String) As String
URL = Replace(URL, "https://", "")
URL = Replace(URL, "www.", "")
URL = Replace(URL, Left(URL, InStr(URL, "/") - 1), "")
CheckPath = URL
End Function
______________________
To check the path behind the domain/tld.
And here is to get the domain + tld:
______________________
Function CleanDomain(URL As String) As String
Dim length, number As Integer
URL = Replace(URL, "https://", "")
URL = Replace(URL, "www.", "")
number = InStr(URL, "/")
length = Len(URL)
URL = Replace(URL, Right(URL, length - number + 1), "")
CleanDomain = URL
End Function
_____________________
I have been using these two little snips for a long time and they have been proven valuable to me :) -Though you might want to include the https://; which I have been too lazy for so far...
Very nice. I like this method. I figured out your method then realized that the SEOTools excel add-in has a function in called "URLproperty". All you need do is give it a url and then enter "domain" as the property. Cleans it up nicely.
thanks for the tip. Now I have 3 ways to extract the domain.
Quite often I use several different computers; so internally I have created my own plug-ins. That's why I use my own plug-in too internally. And I am such a geek, I have saved this one on my phone as a txt file, so I can create a quick plug-in anywhere without creating a 'whole' plug-in in their excel.
Some functions are just worth having somewhere, creates a good feeling of geekiness!
Does DA and PA Seomoz parameters have all it takes to replace YSE?
I'll really appreciate your inputs!
Just my 2 cents here, Y!SE never had a value metric like OSE has. Y!SE was really more about the top 1000 links, rather than attemting to assign a specific value to each link. I suppose from that perspective they're quite different!
Great Rich - I'm always blown away by the amount of work put into your excel articles. Another great example on how excel can be such a valuable tool for search/online professional and not only for account balance (summing the quid)..Excel makes my job easy :)
Thank you for this post. We have been looking at ways of doing exactl this. Great work Richard
Would be great to have some of the metrics as an expandable section on SEOmoz toolbar so that you can see data on the fly...
I'm signing up for Linkstant now, we'll see what happens..
Richard,
Awesome and hats off to Niels! The only thing I am yet to figure out, is how to scrape google serps from Excel. I would like to be able to pull in page 1 urls for a given keyword. Have you figured out a way of doing this yet?