Tonight, our 33rd Linkscape update launched. You'll find new link data in OpenSiteExplorer, the mozBar, Linkscape classic, the Link Intersect Tool and the SEOmoz API.
E.g. this post from mid-October now has link data in OSE
This update represents data crawled from the middle of October to the start of November - processing was unfortunately delayed by Amazon (weirdly, EC2 didn't have machines available due to a "pre-holiday rush"). We're aware of these issues and will be taking precautions to make sure December's index update goes smoothly.
Stats for this index:
- 40,605,301,071 (40 Billion) Pages
- 425,695,258 (425 Million) Subdomains
- 103,776,906 (103 Million) Root Domains
- 395,851,127,399 (395 Billion) Links
- 2.10% of All Links are Nofollowed (up 0.06% from October)
- 56.99% are internal (down from 57.15% in October)
- 43.01% are external (up from 42.85% in October)
- 5.88% of pages have rel=canonical (up from 5.42% in October)
- 62.28 links/page on average (down from 62.35 in October)
I'm also excited to say we've got the beginnings of a WordPress plugin powered by Linkscape now available. There are just a few features today, but we'd like your help to tell us what would be valuable, useful and interesting to have in the WordPress tool.
Still in alpha stages, but showing links + top pages in the admin panel
The plugin was built using just the FREE SEOmoz API. This is a very early version, and there may still be some bugs, but if you have suggestions or feature ideas, please leave them in the comments!
You just said the magic frickin words. Wordpress Plugin. Yoink!
+1 (Even to the "Yoink!")
Re the Wordpress plugin, when publishing a post it would be great if it could suggest top pages with similar KWs so that you would have a quick list of places to add internal links into the copy.
Very cool data.. I would have figured a much bigger number on the internal vs. external links. Thanks...
I must agree- almost half of all links are external, to me that is an amazingly high number.
EDIT: Judging from the position of the bullet points, this only refers to nofollowed-links, right? Sorry, I should take a closer look before commenting, next time. Nevertheless, I still believe stats for all links would be cool.
What do you guys think, is it due to small pages, maybe created years ago, that have a lot of content on one page and hardly any navigation? Or is it maybe due to giant link-list pages which have hundreds or thousands of links going out and dilute the picture?
Would be interesting to see some more details on the distribution here. Though I must say, I don't really know what I would use the data for...
For a brand new site - these updates are like oasis! Finally some link data to work with :)
Thanks, but WHEN will there be historical records of previous link numbers? It's time.
+1
Asking the same, though I believe you are working on it and that it is just a little bit more complicated than creating a WP plugin :).
Anyway, even though I'd love that addon to Linkscape data, I prefer to be patient and wait for a shiny, working historical function rather than have a buggy one just for the sake of having it.
Agreed - it's more than past time. That's a feature we've been trying to get for the last 1.5 years, but unfortunately, I was just in a meeting on the product roadmap yesterday, and March is the earliest date we'll have it. The problem isn't the existence of the data (we have every stored index for the last 10 or so, in a format that's usable). The issue is making updates and upgrades to the way Linkscape processes and retrieves data to enable this to work in a scalable fashion. Earlier this year, we thought we had a solution, but it would have slowed down our API to a crawl (and cost us far too much on the Amazon cloud). The engineering work to make it happen is in progress, and we've moved 4 of our 10 engineering staff onto the project, but it's just a ton of work.
When datasets get into the hundreds of billions of items and they need to be compared, things just get really hard. You have my apologies and my promise this is a big priority for us (and has been for a while). It's just more complicated, challenging and time intensive than any of us expected.
Tried to add the WordPress plugin, but once it was installed and the API and secret keys were added, I was told the API was invalid... Anyone else having this issue?
Double check to make sure you don't have spaces in your API key. :) That was someone else's issue. Once they got rid of the spaces, everything was A-OK!
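For anyone pasting keys into their own code against the API, the "no spaces" advice above can be sketched like this. It is a minimal illustration, not part of the plugin: the function name `clean_credential` and the sample key are made up for the example.

```python
def clean_credential(value: str) -> str:
    """Remove every whitespace character (spaces, tabs, newlines) from a pasted key."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces back together drops all of it.
    return "".join(value.split())


# Hypothetical key copied with stray spaces and a trailing newline.
pasted_key = "  member-abc 123 \n"
cleaned = clean_credential(pasted_key)
print(repr(cleaned))
```

Cleaning the value once, right where it is read from the settings form, avoids having to debug a confusing "invalid credentials" message later.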
damn - I would love to use the Wordpress plugin, or API in general - But I keep getting the message that "API credentials are invalid".
Any other suggestion, other than previously mentioned spaces?
I am really looking forward to playing with the new WP plugin!
So this wordpress api is free for the first 1 million links? Then I guess it just cuts off?
Or will there be a big charge on the CC bill???
According to this page, the first million internal/external/anchor text rows are free, then some charges come in... Does this mean combined on one account? Say I have 4 ecommerce websites; it's very easy to go over this limit with anchors included.
https://apiwiki.seomoz.org/w/page/13991147/SEOmoz-API-Pricing
Great questions!
You get up to 1 million rows of data each month per API key. So if you're using the same API key for each site, you'll reach your limit sooner.
When you hit your limit, you won't be automatically charged, but we will suspend your access.
If you like what you see and you want even more data, you should think about becoming a Site Intelligence API customer. There are no limits and you can get a lot more data. :)
Thanks for your excellent questions!
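To make the arithmetic in the reply above concrete, here is a rough sketch of how several sites sharing one API key draw down the same monthly allowance. The site names and usage figures are invented for illustration; only the 1 million rows per key per month figure comes from the reply.

```python
MONTHLY_ROW_LIMIT = 1_000_000  # rows per API key per month, as stated in the reply


def rows_remaining(rows_used_per_site: dict) -> int:
    """Rows left this month when every site shares a single API key."""
    used = sum(rows_used_per_site.values())
    # Access is suspended at the limit, so remaining rows never go below zero.
    return max(MONTHLY_ROW_LIMIT - used, 0)


# Hypothetical usage for four ecommerce sites on one key.
usage = {
    "shop-a": 300_000,
    "shop-b": 250_000,
    "shop-c": 200_000,
    "shop-d": 150_000,
}
remaining = rows_remaining(usage)
print(f"Rows left this month: {remaining:,}")
```

With these made-up numbers the four sites together have used 900,000 rows, leaving 100,000 before access would be suspended.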
Great news on the wp plugin. Gonna have sometime in the next hour to fire it up and give it a whirl.
Great to see my new sites up there. These crawls are so helpful! Thanks
I've got a question about the sites crawled.
Early this year, for a domain of ours, there were almost 2000 external links found. Now almost at the end of the year, this number has dropped by nearly 500. That's a lot! (well in my opinion it is).
This could be due to websites going offline or just removing the link. But I think it's just too much, since it would mean at least 1 link a day.
So my question is: how do you guys decide if a domain should be crawled or not? (Because it's more likely you haven't crawled this link than that the link was taken down, right?) Can it be that the number of crawled domains has decreased a lot, and that a large chunk of my 500-link drop was lost within that decrease?
We're similar to Google, in that we crawl the web roughly in descending order of mozRank (our version of PageRank - very close to the original formula published at Stanford). We do put caps on very large sites, and we drop spam/junk as we find it. It's possible that your links came from sites/pages that no longer exist, that we've dropped from our index due to spamminess/low quality, or that sit very deep in large site architectures we're no longer reaching.
If you have specific pages/sites you think we should be crawling/counting but aren't, please do drop us a line. There's a big effort going on to improve depth, quality and freshness of crawl, and examples like that can help the team get better.
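As a toy illustration of the idea described above (visit pages in descending score order, with a cap per site), here is a small sketch. It is not Linkscape's crawler: the scores, URLs, `crawl_order` function and `per_site_cap` parameter are all assumptions made for the example.

```python
import heapq
from collections import Counter
from urllib.parse import urlparse


def crawl_order(scored_urls: dict, per_site_cap: int) -> list:
    """Return URLs from highest to lowest score, skipping pages past a site's cap."""
    # heapq is a min-heap, so negate each score to pop the highest score first.
    heap = [(-score, url) for url, score in scored_urls.items()]
    heapq.heapify(heap)

    taken_per_site = Counter()
    order = []
    while heap:
        _, url = heapq.heappop(heap)
        site = urlparse(url).netloc
        if taken_per_site[site] >= per_site_cap:
            continue  # this site has already used up its allowance
        taken_per_site[site] += 1
        order.append(url)
    return order


# Hypothetical scores standing in for mozRank.
scores = {
    "http://big.example/a": 6.1,
    "http://big.example/b": 5.9,
    "http://big.example/c": 5.5,
    "http://small.example/": 4.2,
}
order = crawl_order(scores, per_site_cap=2)
print(order)
```

In this sketch the third page on the large site is skipped even though it outscores the small site's page, which is one way links from deep pages on big sites could fall out of an index.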
Ha ha, I just let my programmers out of the tech department after I had them build me a tool for our WordPress site using your API. Guess I should have waited. Of course ours isn't pretty like that. All in all, I have to say SEOmoz is the king.
Impressive... comparing to the statistics of the previous update:
- Pages: 41,219,038,886 (41 Billion)
- Subdomains: 436,693,488 (436 Million)
- Root Domains: 99,649,652 (99 Million)
- Links: 402,521,240,277 (402 Billion)
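The change between the two indexes can be worked out directly from the figures quoted in the post and in the comment above. This is just a quick sketch of the arithmetic; the dictionary layout and function name are my own.

```python
# Figures for the previous update (from the comment) and this update (from the post).
previous = {
    "Pages": 41_219_038_886,
    "Subdomains": 436_693_488,
    "Root Domains": 99_649_652,
    "Links": 402_521_240_277,
}
current = {
    "Pages": 40_605_301_071,
    "Subdomains": 425_695_258,
    "Root Domains": 103_776_906,
    "Links": 395_851_127_399,
}


def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100


changes = {name: percent_change(previous[name], current[name]) for name in previous}
for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

Pages, subdomains and links all shrank slightly against the previous index, while root domains grew by roughly 4%.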
Thumbs up for the WordPress plugin. Most of our smaller sites run on WordPress, so it will definitely be useful. Our larger sites, however, run on Expression Engine.
Anyone else that would like to see an Expression Engine extension?
Give me an "E"... "E"... Give me an "x"... "x"...
Hi, when the Linkscape data is updated, do the root domain and page numbers in the toolbar update automatically?
Yes - whenever there's an update, we re-calculate all the metrics (mozRank, Domain mozRank, etc.) and these all feed into the machine-learning system that produces Page and Domain Authority numbers.
Yes, the toolbar should reflect the current Linkscape data.
Hmm...when I try to run Linkscape reports it tells me I have existing ones from Oct and the data hasn't updated. FYI.
Just to add my two penn'orth
Would be good to see a joomla plugin as well as a Wordpress plugin.
Nice plugin. Top pages not working on WordPress 3.1-alpha (Error connecting to SEOmoz. Please check your API credentia), could be something I did wrong. I think this plugin can be pretty good.
I get the same error message using wordpress 3.0.1
Hm. Double check that there are no spaces in your API keys. That's got a couple other people screwed up.
:)
Wow ... I switch jobs and pop back into SEOmoz after a month ... now look at Ms. Bird, the active blogging Esquire. NICE!
Same problem as above. Incoming links show just fine, but top pages shows "Error connecting to SEOmoz. Please check your API credentials." even though they use the same key. No spaces in API key. Other hints?
We made a communication mistake. Only customers of the Site Intelligence API have access to Top pages data at this time.
The Site Intelligence API is a completely separate product from PRO (except at the Premier level). Thus, even if you're a PRO member, you won't see Top Pages information in the wordpress Plugin.
Sorry for the complete communication fail there. :(
Checking my sites and my clients' sites for updated Linkscape data... and, yes, WordPress & Plugin is a lovely combination :)
Question: do you also plan to create add-ons for other common CMSs? I think that for eCommerce platforms especially, it would be great to have access to Linkscape data directly from the Admin Panel.
Ha! Great idea! We would love for someone to build this off of our API. :) Any takers out there???
I would love to see a drupal module. If only there were more hours in the day, our team would be all over this. I guess it will just be added to our internal (or eternal depending on how you look at it) to-do list.
Awesome! Excited to try out the WordPress Plugin :)
Great post! I'm looking forward to using the plugin =) Spelling error in the last sentence "Idaes"?
Great work on the Wordpress Plugin!
Very useful addition.
Very cool, just wanted to let you know about one minor issue. The link to your API page to retrieve the API information is broken.
https://www.seomoz.org/api The " ' " at the end is throwing the 404
well done on the update. Amazon running out of instances!!! Sign of the times.
Woo, new data!
Thanks for the update