Late last week, Eric Enge of Stone Temple (and a co-author of mine on The Art of SEO) published a fascinating interview with Google's head of Webspam, Matt Cutts. I think the whole of the SEO community can agree that Matt taking time for these types of interviews is phenomenal and I can only hope he does more of them in the future. Understanding more about Google's positions, their technology and their goals will benefit website creators and marketers dramatically.
The interview itself is certainly worth a read, but as one mozzer noted to me during the email string on the subject, "I'm embarrassed to say I couldn't make it all the way through." Fair enough; and that's why I'm presenting Matt's primary points in graphical, cartoon format. I've also included some ad-libbing, interpretation and fun in these. Only the bits surrounded by quotes were actually taken directly from Matt's words, so please do keep in mind that this is my opinion of what Matt means (along with the occasional editorial).
#1 - There is No Hard Indexation Cap; But Indexation Has Limits
#2 - Duplicate Content Might Hurt Your Indexation
#3 - Lots of Qualifiers on Whether Affiliate Links Count
#4 - 301 Redirects Pass Some, But Not All of a Page's Link Juice
#5 - Low Quality, Non-Unique Pages Might Drop Your Indexation
#6 - Faceted Navigation and PageRank Sculpting are Thorny Issues
Personally, I liked how much Eric pushed Matt with scenarios that would require some advanced methods of showing faceted navigation to users but not search engines. However, I also understand that Matt needs to take a position that's right for 95% of site owners 95% of the time or risk creating a new "PR sculpting" issue.
One other item that really stood out and got me excited was this response:
Matt Cutts: (with regard to links in ads) Our stance has not changed on that, and in fact we might put out a call for people to report more about link spam in the coming months. We have some new tools and technology coming online with ways to tackle that. We might put out a call for some feedback on different types of link spam sometime down the road.
That sounds really good - a huge frustration for the SEO world has been the fact that so many SEOs perceive their competitors to be outranking them with black/gray hat linking techniques and feel they must engage as well in order to stay competitive. Shutting this down, or making SEOs feel that Google is taking consistent action when obvious manipulation is reported, would go a long way toward quelling this thorny problem.
My last recommendation is that you check out Eric's 29 Tidbits from my Interview with Matt Cutts; a post that summarizes a lot of the critical information and takeaways quite neatly.
To end, I thought I'd add the four questions I wish Eric would have asked Matt (maybe next time!):
- With Google's new recognition of internal anchor links and listings of those URLs in the search results, is it still safe to link to internal anchors on pages and trust that the link juice will flow to the page as a whole, or are content blocks inside individual pages now being treated as unique entities?
- With the handling of nofollow changing and Google crawling/executing Javascript, what's the best way to link to a document on the web so human visitors can access it but search engines cannot WITHOUT wasting link juice/PageRank (robots.txt, for example, couldn't do this) or cloaking?
- Does Google now (or will you in the future) consider the sharing/linking activities happening on Twitter, Facebook, etc. to have any impact on the overall link graph of the web (assuming we're talking only about those links that don't make their way onto standard web documents)?
- When people ask the question, "why is my competitor ranking so well with low quality/manipulative links?" you often reply that they should be careful in presuming that Google hasn't already discounted the value of spammy links and the competitor is actually ranking on the basis of quality link sources. This creates an environment where marketers are constantly trying to discern which links pass value and which don't - could you give advice for relatively savvy, experienced SEOs to help them make those determinations so they can pursue the right links and stop paying spammers for the wrong ones?
If you've got thoughts to share, outstanding questions about the interview or my amateur drawings, or things you wish Eric had asked Matt, feel free to post them below.
I think there was another important point that Matt Cutts made.
"What we try to do is merge pages, rather than dropping them completely. If you link to three pages that are duplicates, a search engine might be able to realize that those three pages are duplicates and transfer the incoming link juice to those merged pages."
So Google will choose one page from all the duplicate versions to index, and it may give that page the credit from the other duplicate versions.
*My last comment was cannibalised by your system...
I agree, this is indeed quite revealing, although I think it may not work in every case we would wish it to. For example, Google may recognize exact duplicates caused by something like a ?gclid= parameter, but it cannot reliably recognize duplicate products or listings that differ only in their sorting parameters. So it is still necessary to use canonicals.
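To make the ?gclid= example above concrete, here's a minimal Python sketch of the kind of normalization involved: strip known tracking parameters so the variant URLs collapse onto one canonical form. The parameter list and URLs are just illustrative; a real audit would use whatever parameters your own site appends.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters that only track campaigns/clicks and don't change page content.
# This list is illustrative -- adjust it to whatever your own site appends.
TRACKING_PARAMS = {"gclid", "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}

def normalize_url(url):
    """Collapse tracking-parameter variants of a URL onto one canonical form."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
            if k.lower() not in TRACKING_PARAMS]
    return urlunsplit((scheme, netloc.lower(), path, urlencode(kept), ""))

# Both variants below normalize to the same URL, so a simple report can
# flag them as duplicates that need a canonical tag (or a redirect).
print(normalize_url("http://example.com/red-widget?gclid=abc123"))
print(normalize_url("http://example.com/red-widget"))
```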
Yeah, it is absolutely important to prevent duplicate content issues.
It is risky to let Google make these decisions on your behalf, and it won't always be kind enough to transfer the link juice.
Matt Cutts' statement indicates that Google is clearly aware of this issue - most webmasters don't have a clue what a duplication problem is - and that it now has the ability to detect duplication and give a page back the link juice it deserves.
It won't affect most SEOs, since we implement 301 redirects or the canonical tag to prevent this from happening, but it's definitely good news for those who don't know about duplicate content and suffer from it.
Yes, I found this part the most interesting, and one of the best examples of the complexities of SEO...little is truly black or white (not a hat reference), cut and dried, but a complex overlapping of obstacles and impact.
On one hand you have the promise of consolidated PageRank across duplicates (assuming Google correctly identifies and associates them with the desired URL) set against diminished crawl equity due to URL bloat.
Unfortunately, I think Matt really downplayed the diminished-crawl-equity concern by downplaying the crawl limits...there are limits, whether they are hard and fast numbers, algorithmically calculated, or simply a matter of space and time.
From my own research across a number of sites, the amount of "uniquely crawled" content within a month's time may be much less than people would expect. The homepage, top-level pages, and some highly popular pages receive a lot of repeat visits, followed typically by a very small percentage of pages that may receive 1-3 repeat visits a month...often leaving the bulk of a site's pages unseen within 30 days.
For me, eliminating or greatly reducing duplicate content through whatever means will still be priority #1, even if it means giving up on potential PageRank consolidation that Google might do on its own.
More than anything, this interview illustrated the need to test and measure the impact. How Google reacts to tactics may differ from site to site.
Clients however, who often want a clearly defined answer and expectation of the outcomes, won't be thrilled with the reality that "your mileage may vary."
Hi Identity,
About crawl limits, there is indeed a limit - even the universe has a limit, I guess (maybe it is still expanding, who knows :D).
But from my personal experience, this limit is far larger than I expected. (Below is the story; it's a bit long.)
I have a small client. About a month ago, they had around 200-300 pages indexed in Google, and I found out (using Xenu) that they have a "deliver to towns" module that generates tons of links (around 30K). These pages generally duplicate each other except for the town names, and they have very limited content, which means they are not important to Google at all.
I asked them to get rid of this module; they agreed to reduce the level from town to county, and about two weeks ago they redirected all of these town links to an important page (which I didn't expect).
This week, when I checked how many pages were indexed, I saw an unbelievable number - 15,000+.
This is a very small website, and Google was able to index 15,000 more pages almost immediately, even though 98% of them are duplicate versions. So I guess it's quite hard to reach the crawl limit.
Also, Matt Cutts once mentioned (I cannot remember where, maybe his blog) that it doesn't require a huge amount of resources to index pages at all; Google's data centres mainly work on the ranking algorithms.
Take a look at the crawl frequency on the site's pages over time.
Once they are established into the site, does their crawl frequency stay the same? Being low value, probably not, but more importantly, how does it impact the crawl frequency of the more important pages...do you give up there to gain here?
This is also important to understand with regard to site changes and optimizations and giving them enough time to truly be crawled and for their impact, good or bad, to be realized.
Further to this point, if you could identify the crawl frequency of your pages or page types (i.e. blog pages vs product pages, etc.) you can determine which pages are "important" in Google's eyes and make decisions on where to further your SEO efforts.
Just b/c a page is crawled and indexed today doesn't mean it will stay there in perpetuity. Crawl frequency is a good indicator of quality.
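If you want to measure crawl frequency rather than guess at it, a rough sketch along these lines can help. It assumes a combined-format access log at a hypothetical path, and it trusts the user-agent string rather than verifying Googlebot by reverse DNS (which a real audit should do); the output shows which URLs get revisited and which go untouched for weeks.

```python
import re
from collections import Counter

# Minimal combined-log-format parser: request path and user agent only.
LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"')

def googlebot_hits(log_path):
    """Count Googlebot requests per URL path in an access log."""
    hits = Counter()
    with open(log_path) as log:
        for line in log:
            m = LINE.search(line)
            if m and "Googlebot" in m.group("ua"):
                hits[m.group("path")] += 1
    return hits

# Hypothetical log path; sort by hit count to see which pages Google
# revisits most and which it barely touches in the period covered.
for path, count in googlebot_hits("access.log").most_common(20):
    print(count, path)
```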
nice illustrations
Nice illustrations rand! :D
Thanks for the excellent and clear illustrations... :) I hope your additional questions to Matt are answered at some point in the future. Very useful info!
I love the Twitter / Facebook question. From a technical standpoint, I would think that Google would be able to isolate those networks and apply similar authority / spam metrics to their pages as they do to the "traditional" web (whatever that is ;) ).
If they can determine the worth of pages across the Internet, developing trust metrics within Twitter would seem to me quite elementary, and Facebook only slightly more complicated. People share so much in these two places especially: even in the short term, they could be useful for discovery.
One other question I'd like asked regards dynamic parameters. We all still tell people to use static URLs and avoid multiple parameters, but is this still necessary? I'm well aware that AMP.com.au's URL for home loans is horrifying (https://www.amp.com.au/wps/portal/au/AMPAUCategory3C?vigurl=%2Fvgn-ext-templating%2Fv%2Findex.jsp%3Fvgnextoid%3Deb00ae205f711210VgnVCM10000081c0a8c0RCRD)
... and the site resolves with https. And embraces unnecessary virtual subfolders. But how much junk in a URL is too much in 2010?
I believe that, regarding the Twitter / Facebook question, Google will also have some kind of "PR" applied to the links, depending on:
- who posts the link: is it someone spamming Twitter, or posting on a "normal" basis? So some kind of authority will play a role
- the content around the link, even if 140 characters won't make it easy: is the tweet relevant to users?
- reputation: how many followers are reading that tweet? I guess that nobody under 15,000 would be taken into consideration
We all know that Google is figuring out our social media environment, and therefore our social media influence. So Google will be using this SMI to give weight to the links.
From my experience, Google has certainly gotten better over the years with dynamic parameters. I've seen some pretty horrendous URLs getting indexed fine...of course, those are also often the exception.
Working with a lot of ecommerce clients, you get to experience your fair share of URL blech. My general push is to try to get them down to 3 parameters or fewer, and even then, to avoid overly complex or sessionID-looking parameters. When sites get over 3 parameters is when I really see hit-or-miss indexation.
It's often more complex than that, though, since these sites often have huge levels of duplication due to parameter ordering, etc.
Agree it would be nice to get some more direct feedback from the engines. I think Yahoo's, and now Google's, ability to let webmasters inform them of parameters that can be dropped is about helping to reduce duplication, but it also illustrates how complex these URL constructs really are.
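As a rough illustration of both points in this sub-thread (the 3-parameter cutoff is anecdotal observation, not an official threshold), a quick Python audit can flag parameter-heavy URLs and group the ones that differ only in parameter order:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode
from collections import defaultdict

def audit_urls(urls, max_params=3):
    """Flag parameter-heavy URLs and group URLs that differ only in parameter order."""
    heavy = []
    groups = defaultdict(list)
    for url in urls:
        scheme, netloc, path, query, _ = urlsplit(url)
        params = parse_qsl(query, keep_blank_values=True)
        if len(params) > max_params:
            heavy.append(url)
        # Sorting the parameters gives one key per "same page, reordered query".
        key = (netloc, path, urlencode(sorted(params)))
        groups[key].append(url)
    dupes = [group for group in groups.values() if len(group) > 1]
    return heavy, dupes

heavy, dupes = audit_urls([
    "http://example.com/p?color=red&size=m",
    "http://example.com/p?size=m&color=red",
    "http://example.com/p?a=1&b=2&c=3&d=4",
])
print("Over the parameter cutoff:", heavy)
print("Ordering duplicates:", dupes)
```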
Good point about parameters and duplicate content.
On a related note, parameters can cause search reputation management issues due to malicious modification and linking / indexing :D. Does anyone remember in 2008 when we realised that we could change some big ecommerce site's image parameters to show very different images with product descriptions? I don't think it was Wal-Mart (perhaps K-Mart).
We could essentially create the online equivalent of this. Hilarity.
Ha! Nice.
Even keyword-friendly looking sites aren't immune. A number of sites have implemented, either on their own or through 3rd party CMS, keyword-rich URLs with a unique identifier.
So "this-is-my-page-1234" or "this-is-my-page/1234" allows the keywords to be changed to hearts content and the 1234 is really the identifier for the page.
Always fun showing clients the potential issue of this with a screengrab example of a dummied URL, like "this-is-not-the-product-you-are-looking-for-1234".
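A minimal sketch of the usual fix for that spoofed-slug issue: look the canonical slug up by the numeric ID and 301 any request whose slug doesn't match. The lookup table and paths here are hypothetical stand-ins for a real database query.

```python
import re

# Hypothetical lookup: in a real site this would come from the database.
CANONICAL_SLUGS = {1234: "this-is-my-page"}

SLUG_ID = re.compile(r"^/(?P<slug>.+)-(?P<id>\d+)$")

def resolve(path):
    """Return a status and the canonical path; 301 when the slug doesn't match the ID."""
    m = SLUG_ID.match(path)
    if not m:
        return ("404", None)
    page_id = int(m.group("id"))
    canonical = CANONICAL_SLUGS.get(page_id)
    if canonical is None:
        return ("404", None)
    canonical_path = f"/{canonical}-{page_id}"
    if m.group("slug") != canonical:
        return ("301", canonical_path)   # spoofed slug -> redirect to the real one
    return ("200", canonical_path)

print(resolve("/this-is-my-page-1234"))                               # ('200', ...)
print(resolve("/this-is-not-the-product-you-are-looking-for-1234"))   # ('301', ...)
```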
Or the other example where the title and/or heading for the page pulls in cues from the URL too. Which could totally lead to that spoofed but potentially negative viral impact.
Oh, I know :D I am currently looking at a large British site that, at its worst, is creating 9 versions of every page, the majority of which resolve on static URLs.
I also just got done dealing with a site that did just this with unique identifiers. Absolutely anything would pull the file from the database, so long as the number existed in the URL.
Nine versus infinity!
Is it possible to block more than 15 parameters? I am working on a site that has a lot of parameters that should be blocked. Webmaster Tools suggests blocking more than 15 parameters - would blocking more than 15 actually work?
Very interesting summary of the interview!
For me, the 301 not passing all the link juice is a new one, although it always did make some sense to me.
And, as you've mentioned, I'm glad Google will do some work on the link spam front... that will make life a lot easier :)
A whole lot wiser now!
This sort of summary is great when you don't have tons of time each day to read about SEO because you're busy doing it!
Two thumbs up for the brevity and concentration of good/useful information.
Very nice illustrations and excellent follow-up questions for the next round (at SMX perhaps?). There's not much about mobile, video and RTS, all hot topics for the year and all with search ranking criteria we'd all love Google to discuss. Additionally, I think advertisers would be willing to pay hard cash for an ORM/spam representative at Google they could reach out to. Call it Prepaid Google. I know at least 10 brands that would pay $2k+ per month for such a service. Thoughts?
Thanks for the great post and illustrations!
Rand, you should definitely continue to do these. I could not stop laughing, yet acknowledging the points being made.
I would definitely have liked an answer to Rand's second question.
With the handling of nofollow changing and Google crawling/executing Javascript, what's the best way to link to a document on the web so human visitors can access it but search engines cannot WITHOUT wasting link juice/PageRank (robots.txt, for example, couldn't do this) or cloaking?
With nofollow being debunked as a usable solution, what seems to be best practice these days amongst the seomoz readers?
Great Piece Rand. Thanks for this.
I really enjoyed those cartoons! Just great and funny.
Terrific graphical illustration of the interview. When we talk about the canonicalization issue, what percentage of copied content is considered duplicate? Do duplicate titles / meta descriptions impact indexation and juice flow just like the body content?
Thanks for posting this, and illustrating it for the visual learners out there. I read and watched Cutts' post yesterday, but you guys really laid it out there. Thanks!
The illustrations are wonderful, clear, and easy to follow. Thanks for breaking this down.
Thanks for this simplified version of the interview. It was easier for me to follow what Matt had to say!
Note to fellow SEOs: You should definitely get yourself a copy of The Art of SEO! This is the book that got me on the path to success in SEO and SEM, where other books failed!
This is why I visit this site everyday. You guys rock!
Nice work on these funny and clear illustrations. Also very good content about Google SEO - easy to understand and useful!
Thanks!
Balazs
I really appreciate the cartoon slides! Having to learn SEO on the fly and without a budget has been really tough for me, but I really liked reading this post since it feels so beginner-friendly and straightforward. I will have to come back and reread a few times to get all the "meat" from it, but that's mostly because I'm still such a newbie.
A question from someone still learning the lingo... "Affiliate links" are links on a website that point to a different website? And this is not the same as a reciprocal link, correct? I want to make sure I understand this
Please continue the cartoons!
Affiliate links are a type of paid link; that is the key factor in this discussion. For example, you create a blog to discuss a certain niche topic and then put tons of affiliate links on it to drive traffic to your affiliates - does link juice get passed through those links? And should it?
These links generally create revenue for the blog owner if someone clicks on the link or if someone completes an action (purchase) after clicking on the link.
Bottom line is that it is a paid link.
Thank you! I understand that now!
I'd have to add that this is an awesome summary. The illustrations make it easy to understand the key points, although I will read through the interview itself in more depth when I feel like I have a better attention span. :)
That was great and explains a lot.
very informative stuff through clear and understandable illustrations....
thanks for the info
Love the cartoons, makes it far easier for people to understand and digest all of Eric's information. It is good that Google are quite open with their algorithms and changes including semantics and LDA. Really interesting article displayed very effectively!
Google Expert
Older post, but still a lot of relevant content.
Really cool illustration, I LOVE it!
Awesome, very clear and descriptive. One thing is highlighted again ... Pages should be useful for the user (don't create them just for the bots)
Good questions answered, great questions left unanswered at the bottom!
Agree with the comment about Twitter. Does this mean that Google will be looking more towards sites such as Facebook, Twitter, etc.?
The last question is the one that I have been asking for a few years now. I'm almost starting to believe that Google does not discount black hat techniques that were used before it began addressing them.
Lastly, I am glad to see that Google is going to move forward on tackling spam. Personally, I would like to see a forum-type area within SEOmoz or a similar site where you can discuss spam, and then all the users who agree that it is spam can report it in their Webmaster Tools.
I also noticed Twitter was mentioned. I am curious whether it was just random that he mentioned Twitter of all the social media sites, or if that is a hint that Twitter may eventually carry more weight than the others.
I liked that everything is done with beautiful and clear pictures. It's all explained well.
Nice one Rand.
A picture does indeed paint a thousand words!
Thanks for sharing, very nice article. Also, by any chance do you know when Google's next update is?
The canonical tag is great, but listen to what Matt said carefully: first try to fix your site architecture, because the canonical tag might not help your crawl efficiency - the search engine must still visit each URL to see the tag. On the other hand, it might help in the long run because the search engine doesn't need to keep revisiting those URLs.
Parameter handling, on the other hand, would help crawl efficiency and indexing because search engines won't visit the URLs that contain a parameter you are blocking.
Read what Vanessa Fox has to say about the use of the canonical tag and crawl efficiency: https://searchengineland.com/google-lets-you-tell-them-which-url-parameters-to-ignore-25925
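A toy comparison of the two approaches described above (the URL set is made up, and real crawlers are obviously more sophisticated): with only a canonical tag, every parameter variant still has to be fetched before the tag is even seen, while telling the engine to ignore a parameter shrinks the set of distinct URLs it needs to fetch at all.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def crawl_workload(urls, ignored_params=()):
    """Distinct URLs a crawler must fetch once the ignored parameters are dropped."""
    seen = set()
    for url in urls:
        scheme, netloc, path, query, _ = urlsplit(url)
        kept = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
                if k not in ignored_params]
        seen.add(urlunsplit((scheme, netloc, path, urlencode(sorted(kept)), "")))
    return len(seen)

# 3 sort orders x 10 pages of the same listing = 30 crawlable variants.
urls = [f"http://example.com/widgets?sort={s}&page={p}"
        for s in ("price", "name", "newest") for p in range(1, 11)]

# Canonical tag alone: all 30 variants still have to be fetched to see the tag.
print("canonical tag only:", crawl_workload(urls))              # 30
# Telling the engine to ignore 'sort': only the 10 paginated URLs remain distinct.
print("ignore 'sort' param:", crawl_workload(urls, ("sort",)))  # 10
```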
Hahaha ...
Rand, this is so funny. Spaminator Matt and spaceboy Rand :D
And btw, it's quite easy to understand the point. Looking forward to more of these slides :-)
Martin
"We might put out a call for some feedback on different types of link spam sometime down the road.
That sounds really good - a huge frustration for the SEO world has been the fact that so many SEOs perceive their competitors to be outranking them with black/gray hat linking techniques and feel they must engage as well is order to stay competitive."
I'm totally on the fence with this one Rand. Part of me likes that there is a way to (potentially) remove competitors from the top SERP positions if they are there due to a bunch of black hat methods.
The other part of me doesn't like that I'd be a Google snitch. I suppose it's the old-school-days code of conduct rearing its ugly head: "never squeal, even if it means you sit through hours of class punishment."
I don't have an answer, it just leaves me feeling...ambivalent.
These are perfect for explaining technical SEO details to non-SEOs. Not naming any names...just sayin'. ;)
Nice one Rand - we are all waiting, looking, reading and semi-guessing at what is around the corner, but with this article we can get 90% of it right in anticipation of their latest update.
What's it called - Caffeine, or is it Iced Tea?!
Chris
SEO Top Page
Great post Rand! Would you consider this Google page https://www.google.com/urchin/usac.html as breaking the rules when it comes to PR sculpting? (turn js off and take a look).
Will need to grab a LARGE cup of coffee and sit down to read this...
Why don't more places utilise video? I'd much rather kick back, headphones in, and just listen to this rather than having to read tiny text on a screen...
Thanks for this.
It is something that is getting a lot of my time at the moment!
Just purchased a copy of your book (I didn't even know it had been released already!). Great illustrations on the topics covered in the interview.
the book is fantastic !
Which book did you purchase? Can you give a link?
As per usual, great post Rand.
Excellent and clear illustrations, thanks...!
I think the behavior in your cartoon #4 - 301 redirects removing some % of link juice - is a mistake on Google's part.
I would wager that most websites are initially built with lots of SEO and webmaster best-practice mistakes. Many sites start small and, as they grow, come back and fix things like their URLs.
This attrition via 301 encourages people to live with their poorly set up URLs in order to preserve the link juice they have worked for years to acquire.
Google says: if you change your PR4 page from www.example.com/shopid3234-cat32-prod1235523 to www.example.com/23-red-widget, we may reward your keywords in the URL, but we will take a portion of your PR in exchange for making us think more.
If Google wants to target people who buy up 10 existing domains just to 301 them to their site to make it stronger, then go for it - but that's a totally different problem than internal URL changes for better usability.
Unfortunately, people abuse this with internal links, too. For example, you could take a very popular article or blog post and 301 it to a less popular one targeting more lucrative keywords (I'm not saying you should, but you could). I think this is Google's control valve on 301s, in a sense. Not a perfect solution, but the best they've got for now.
Then it would be nice if you could tell Google ahead of time that you're restructuring the URLs of your whole site, or a clean section of it, giving you the chance not to get penalized.
The one-off 301s would be the most likely to be abused, but I'm not sure how you would do this for a large portion of your site.
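To see why the 301 dilution discussed above worries people, here's a purely illustrative model. Google has never published how much value a 301 keeps, so the retention factor below is an assumption, not a real figure; the point is only that whatever the loss is, it compounds with every extra hop, which is why a single redirect straight to the final URL is preferred over chains.

```python
# Purely illustrative: Google has never published how much PageRank a 301 keeps.
# The 0.85 here is an assumed per-hop retention factor, not a real figure.
ASSUMED_RETENTION_PER_301 = 0.85

def juice_after_redirects(initial_value, hops):
    """Link value left after chaining `hops` 301 redirects, under the assumption above."""
    return initial_value * (ASSUMED_RETENTION_PER_301 ** hops)

for hops in range(4):
    print(f"{hops} redirect hop(s): {juice_after_redirects(100, hops):.1f}% of the original value")
```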
I'd just like to point out that Rand apparently did all of these illustrations between about 8pm and 1am.
Technically, 10pm and 1am :-)
Your boss should give you a raise or more vacation time!! =)
Neither of those things are likely to happen in the near future, sadly, but I'm a pretty happy guy in general :-)
I'll fess up that I'm another one who had trouble making it through the interview transcript.
Great illustrations - thanks for breaking it down for us ADHD / "Visual Learner" types :)!
My favorite panel is for point #5. So a low quality page has few links and few shares. Does that mean sharing activity is another indicator to Google that "real users like this page"? But can't the tweets and Diggs and the rest be manipulated? And what if you have a lot of blog posts that aren't tweeted, etc.? Does that bring down the overall SEO quality of your site?
Nice visualization rand! You always make things more understandable ;)
This graphical presentation of Matt's primary points has great added value. No problem reading it from start to end. Thanks a lot.
Unfortunately, my level of English is very low, so I take in the pictures through intuition and basic vocabulary. A plain-text translator copes with the post, but the text inside the figures is a mystery to it. It would be good to provide a text list of the graphic materials.
I'm going to assist here and translate - I think our friend Xstroy is saying that those readers who use translation software to read the blog miss out on illustrations that contain blocks of text (because the translators can't read that image-based text). A fair point and food for thought.
Is that Dr. a doctorate in translation? Well done Pete!
Nice post. I did a similar post on my blog, but it's much funnier in terms of the animations.
It's Google, SouthPark style ; )
ImJonTucker(dot)com/google-marketing
it's always distilled down quite a bit, as my audience is small business owners and not pro SEO's, but it may be a good way to explain SEO to the non-SEO.
Thanks for this. I have been trying to figure out a simple way to "show" my management team the key points from this interview. I knew if I just had them read it they would be tuned out rather quickly.
Way to make these items identified by Mr. Cutts and co. much easier to digest :)
Thanks!
Ideally, keep all your pages within 2 hops of the home page where possible. If you have a WordPress blog with 100 categories and each category has 100 posts, you can have a site with 10k pages - and for each category page, keep 100 links to the individual posts. I think this is the cleanest way to build sites of 10k pages or less. You just need to plan the categories well. And yes, do not keep category links on all individual pages - just have the category links on the home page. This will ensure a linear link juice flow. Plus, use plug-ins such as SEO Smart Links to make links (with appropriate anchor text) flow from relevant pages, and you have a killer site architecture.
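A quick check of the arithmetic in that comment (the 100x100 figures are the commenter's example, not a recommendation): with the home page linking to every category and each category linking to all of its posts, the page count within two hops is just the product of the two branching factors, plus the hub pages themselves.

```python
def pages_within_two_hops(categories, posts_per_category):
    """Home page -> category pages -> post pages, as in the comment above."""
    hop_1 = categories                       # category pages linked from the home page
    hop_2 = categories * posts_per_category  # posts linked from each category page
    return 1 + hop_1 + hop_2                 # +1 for the home page itself

# 100 categories x 100 posts each ~= the 10k-page site described above.
print(pages_within_two_hops(100, 100))  # 10101
```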
Rand, I like the last part of your presentation, #6. Amazon employs white hat cloaking and both Amazon and the engines win, so what is the problem if Google comes out and says: get rid of duplicate content this way?
>Matt Cutts: (with regard to links in ads) Our stance has not changed on that, and in fact we might put out a call for people to report more about link spam in the coming months. We have some new tools and technology coming online with ways to tackle that. We might put out a call for some feedback on different types of link spam sometime down the road.
Ha - so is that their new technology? Getting people to report stuff? Nothing new then - as usual :D
> That sounds really good - a huge frustration for the SEO world has been the fact that so many SEOs perceive their competitors to be outranking them with black/gray hat linking techniques and feel they must engage as well in order to stay competitive. Shutting this down or making SEOs feel that Google is taking consistent action when obvious manipulation is reported would go a long way to quelling this thorny problem.
Well, duh - SPAM = Sites Positioned Above Mine. Frustration, what frustration? You do not rank for pharmacy terms without forum spam and comment spam etc., and there are other markets where other types of links are required to rank. You just gotta do what you gotta do. Besides, I've seen Google overlook so many really bad things being reported - I mean things outlawed by their webmaster guidelines, not just things somebody subjectively considers "manipulation".
Thanks, mate, for the wonderful descriptive illustrations. They've been a great help. Now I understand slightly better what should be done to make Googlebot visit my site.