This is a follow-up to a post I wrote a few months ago that goes over some of the basics of why server log files are a critical part of your technical SEO toolkit. In this post, I provide more detail around formatting the data in Excel in order to find and analyze Googlebot crawl optimization opportunities.
Before digging into the logs, it’s important to understand the basics of how Googlebot crawls your site. There are three basic factors that Googlebot considers. First is which pages should be crawled. This is determined by factors such as the number of backlinks that point to a page, the internal link structure of the site, the number and strength of the internal links that point to that page, and other internal signals like sitemaps.
Next, Googlebot determines how many pages to crawl. This is commonly referred to as the "crawl budget." Factors that are most likely considered when allocating crawl budget are domain authority and trust, performance, load time, and clean crawl paths (Googlebot getting stuck in your endless faceted search loop costs them money). For much more detail on crawl budget, check out Ian Lurie’s post on the subject.
Finally, the rate of the crawl — how frequently Googlebot comes back — is determined by how often the site is updated, the domain authority, and the freshness of citations, social mentions, and links.
Now, let's take a look at how Googlebot is crawling Moz.com (NOTE: the data I am analyzing is from SEOmoz.org prior to our site migration to Moz.com. Several of the potential issues that I point out below are now solved. Wahoo!). The first step is getting the log data into a workable format. I explained in detail how to do this in my last server log post. However, this time make sure to include the parameters with the URLs so we can analyze funky crawl paths. Just make sure the box below is unchecked when importing your log file.
The first thing that we want to look at is where on the site Googlebot is spending its time and dedicating the most resources. Now that you have exported your log file to a .csv file, you’ll need to do a bit of formatting and cleaning of the data.
1. Save the file with an Excel extension, for example .xlsx.
2. Remove all the columns except for Page/File, Response Code, and User Agent. It should look something like this (formatted as a table, which can be done by selecting your data and pressing Ctrl+L):
3. Isolate Googlebot from other spiders by creating a new column and writing a formula that searches for "Googlebot" in the cells of the third (User Agent) column. Sample formulas for steps 3 through 5 are sketched after this list.
4. Scrub the Page/File column down to the top-level directory so we can later run a pivot table and see which sections Google is crawling the most.
5. Since we left the parameters on the URLs in order to check crawl paths, we'll want to remove them here so that those URLs are grouped into the correct top-level directories in the pivot table analysis. The URL parameter always starts with "?", so that is what we want to search for in Excel. This is a little tricky because Excel uses the question mark character as a wildcard. To indicate to Excel that the question mark is literal, use a preceding tilde, like this: "~?"
6. The data can now be analyzed in a pivot table (data > pivot table). The number associated with the directory is the total number of times Googlebot requested a file in the timeframe of the log, in this case a day.
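To make steps 3 through 5 concrete, here is a rough formula sketch. It assumes the Page/File values are in column A and the User Agent strings are in column C, with data starting in row 2; the column letters are only for illustration, so adjust them to match your own sheet.

- Googlebot flag (step 3), in a new column D: =IF(ISNUMBER(SEARCH("Googlebot",C2)),"Googlebot","Other")
- Parameter removed (step 5), in a new column E: =IF(ISNUMBER(SEARCH("~?",A2)),LEFT(A2,SEARCH("~?",A2)-1),A2)
- Top-level directory (step 4), in a new column F, built from the stripped URL: =IF(ISNUMBER(FIND("/",E2,2)),LEFT(E2,FIND("/",E2,2)),E2)

Fill each formula down the table, then build the pivot table on the directory column and filter it to the rows flagged as Googlebot.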
Is Google allocating crawl budget properly? We can dive deeper into several different pieces of data here:
- Over 70% of Google's crawl budget focuses on three sections, while over 50% goes towards /qa/ and /users/. Moz should look at search referral data from Google Analytics to see how much organic search value these sections provide. If it is disproportionately low, crawl management tactics or on-page optimization improvements should be considered.
- Another potential insight from this data is that /page-strength/, a URL used for posting data for a Moz tool, is being crawled nearly 1,000 times. These crawls are most likely triggered from external links pointing to the results of the Moz tool. The recommendation would be to exclude this directory using robots.txt (a sample rule is sketched below).
- On the other end of the spectrum, it is important to understand the directories that are rarely being crawled. Are there sections being under-crawled? Let’s look at a few of Moz’s:
In this example, the directory /webinars pops out as not getting enough Google attention. In fact, only the top directory is being crawled, while the actual Webinar content pages are being skipped.
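For reference, the /page-strength/ exclusion recommended above is a one-line addition to robots.txt. This is only a sketch of the syntax, not a copy of Moz's actual file:

User-agent: *
Disallow: /page-strength/

As the comments below point out, blocking a crawled URL also means giving up whatever value the links pointing at it would pass, so weigh that trade-off before disallowing anything.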
These are just a few examples of crawl resource issues that can be found in server logs. A few additional issues to look for include:
- Are spiders crawling pages that are excluded by robots.txt?
- Are spiders crawling pages that should be excluded by robots.txt?
- Are certain sections consuming too much bandwidth? What is the ratio of the number of pages crawled in a section to the amount of bandwidth required? (A formula sketch follows this list.)
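If your log export also includes a bytes-transferred column, that bandwidth ratio can be answered with the same kind of formulas used above. This is a sketch that assumes the top-level directory sits in column F and the response size in bytes in column G; the ranges and the "/qa/" example are placeholders.

- Requests for a section: =COUNTIF($F$2:$F$50000,"/qa/")
- Bandwidth for a section: =SUMIF($F$2:$F$50000,"/qa/",$G$2:$G$50000)
- Bytes per crawled page: =SUMIF($F$2:$F$50000,"/qa/",$G$2:$G$50000)/COUNTIF($F$2:$F$50000,"/qa/")

Sections with an unusually high bytes-per-request figure are the ones consuming crawl resources out of proportion to the number of pages they return.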
As a bonus, I have done a screencast of the above process for formatting and analyzing the Googlebot crawl.
In my next post on analyzing log files, I will explain in more detail how to identify duplicate content and look for trends over time. Feel free to share your thoughts and questions in the comments below!
Nice.
To go with this, AJ Kohn just published a killer post on Crawl Budget Optimisation here:
https://www.blindfiveyearold.com/crawl-optimization
Very detailed and informative post. These helpful related post links really help in understanding the topic better.
Excellent post... I'm sending this over to our tech guy right now so he can implement some of these strategies into his plans.
Great post, hey quick question, you said:
"Are spiders crawling pages that are excluded by robots.txt?"
We've actually seen this a lot; what would you recommend if they are? Often before an update, we'll see Google crawling through many of our site sections that have been disallowed by our robots.txt file.
Thanks!
Thank you for sharing this how-to! I'll be adding this to my to-do list.
Hey Tim,
What time frame do you usually review log files in?
Thanks in advance.
Great analysis, but do you *need* log files for this? Couldn't you pull internal and external linking data with Screaming Frog and come to similar conclusions?
Also, why would you block the page strength tool by robots.txt? With so many external links, wouldn't it stand to reason there would be a high number of navigational searches to that page as well?
Hey phantom, your log files are going to tell you the exact pages that Gbot is crawling; Screaming Frog will give you a good idea of link structure.
Good point on /page-strength/. It currently redirects to /tools/, so no point in throwing away those links.
Wow!! It is just amazing to see these strategies. I will definitely implement some of them. Thanks for sharing such useful data. A most interesting post to read and to bookmark. :)
Hi Tim,
The bonus screencast is awesome! I really appreciate the illustration with the text box guides. This step-by-step analysis is vital for seeing an otherwise invisible element that strongly influences site performance.
Thanks for sharing this! Learnt a lot!
I have a question about your "page-strength" example. You say this:
"Another potential insight from this data is that /page-strength/, a URL used for posting data for a Moz tool, is being crawled nearly 1,000 times. These crawls are most likely triggered from external links pointing to the results of the Moz tool. The recommendation would be to exclude this directory using robots.txt."
So you are asking Google not to crawl those sites anymore. If you have links pointing to that section of the site, wouldn't this devalue them?
So how can I get the spider to look at the sections that it's missing?
1) "Disallow" crawling of sections that aren't important - hopefully the spider will spend less time there and then have more "crawl budget" for the sections it's missing
2) Make sure the sections it's missing are being submitted in sitemaps
3) If you're already doing 1 and 2 and it's still not crawling the pages you prefer, look at how your site is laid out and consider making the links to those sections more prevalent and higher up the page. You can try to build external links to those sections as well.
Please correct me, and also forgive me if I am wrong, but by disallowing a page that is important (even if it is less important than one that gets crawled more), are we not doing that page an injustice?
Instead of disallowing, why not try to gain some more good-quality links for the less-crawled but more important pages, as you mentioned in point 3? I think backlinks also play an important role in crawling.
I completely agree with your 2nd and 3rd points.
A website's architecture should be such that your most important pages are easily accessible from all pages. The pages should also be easily crawlable and should not redirect the spider.
One more important thing that has not been suggested is PAGE LOAD TIME. Try to make pages load as fast as possible, not only the less-crawled pages but all pages. By doing this we give spiders extra time so that they can crawl more pages than they do now.
Here are some excellent posts on how and why page load time is important.
15 Tips to Speed Up Your Website
Site Speed - Are You Fast? Does it Matter for SEO?
Regards
Sasha
Wow, timresnik:
This is one of the most interesting posts I've read in a while, and one that doesn't circle around the same SEO topics. Just a question: could I get a penalty for this "Googlebot over-optimization"? :) lol, just kidding. Thx again for sharing this awesome info.
Great post! Question Re. under crawled pages: Do you think Google bots respect sitemap priority settings in this respect? In other words, would increasing the priority in a sitemap xml file for certain directories have any noticeable effect on crawl rate?
Hey, I don't think Google gives much importance to priority in the sitemap; I haven't seen sitemap priority make a difference lately. Secondly, increasing the priority of a less noticeable directory doesn't affect the crawl rate of that particular directory.
Hey Tim, thanks for this. This is a great step-by-step process.
You mention:
"In this example, the directory /webinars pops out as not getting enough Google attention. In fact, only the top directory is being crawled, while the actual Webinar content pages are being skipped."
You say at the beginning of this post that many of the issues you found have been fixed with the migration. Was this issue fixed? Can you tell us why this directory was not being crawled as often as you would like? And, how did Moz fix the issue?
Thanks!
Hey George, thanks for the question. We're pretty sure this has to do with the content improving and the freshness of the section. Prior to relaunch the section was over 6 months stale.
What Can You Do To Get the GoogleBots' Favor?
You absolutely have to keep away from "black hat SEO" strategies. No doubt they appear logical if you examine the general mechanism of Googlebot.
What do you think of Googlebot's role in optimizing your site?
Do you have any techniques or tools for optimizing SEO on mobile?
I love me a good technical post. You helped me discover that Google is crawling a lot of links that aren't that relevant and not crawling some directories that are. (Interesting to see that the bot will follow form buttons to a contact page from product pages around the site but barely make it to some of the folders that get linked on every page but only from the footer.) I'm adjusting our robots.txt and thinking about new strategies for the site layout, too!
Good post, I was missing out on analyzing logs. I have downloaded WebLog Expert after going through your post.
Also, I'm eager to check out your next post on how to identify duplicate content.
Hi Tim, thanks for the great article. The only thing I would recommend is to use Data > Text to Columns instead of the formulas. Especially for this table, Text to Columns might work best.