Recently I was working with a client and decided to do a little experiment, because sometimes I have nothing better to do. I wanted to know whether changing the crawl rate in Google’s Webmaster Tools really made a difference. Part of me felt Google just put it there to make people feel like Google would come to their site more often, but part of me wanted to trust Google.
I quickly wrote a script that would store the user agent info, page visited, and date visited in a database. I let it run for 200 days: 55 days on “Normal,” 90 days on “Faster,” and the last 55 on “Normal” again. I also tracked the time between when a new blog post was published and when Google first visited that post.
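The logging can be as simple as a few lines included at the top of every page. The sketch below is illustrative only, not the exact script I used; the table and column names are made up for the example:

```php
<?php
// Illustrative sketch of the kind of logging used for this experiment
// (not the exact script). Include it from a shared header on every page.

// Hypothetical table:
//   CREATE TABLE crawler_visits (
//     id INT AUTO_INCREMENT PRIMARY KEY,
//     user_agent VARCHAR(255),
//     page VARCHAR(255),
//     visited_at DATETIME
//   );

$pdo = new PDO('mysql:host=localhost;dbname=crawl_log', 'user', 'pass');

$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$page      = isset($_SERVER['REQUEST_URI'])     ? $_SERVER['REQUEST_URI']     : '';

// Only log Googlebot hits, since that is what the experiment tracked.
if (stripos($userAgent, 'Googlebot') !== false) {
    $stmt = $pdo->prepare(
        'INSERT INTO crawler_visits (user_agent, page, visited_at) VALUES (?, ?, ?)'
    );
    $stmt->execute(array($userAgent, $page, date('Y-m-d H:i:s')));
}
```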
Here is some background information before we get into the data I collected. The website had 45 pages cached in Google and was over a year old when I began this experiment. New blog posts were added two to three times a week, adding fresh content to the site. The site ranks well for its brand and a few other keywords related to its field.
During the course of the experiment, over 80 new blog posts were added to the site. While the crawl rate was set at “Normal,” it took Google an average of 3.4 days to first visit a new post. When the crawl rate was set at “Faster,” it took Google an average of 2.9 days. That was not much of an improvement in my book, but interesting nevertheless. Note: this is not the time when the post first appeared cached in the index, but when the bot first visited the page itself.
While the crawl rate was set to “Normal,” Google visited the site an average of 6 times a day and favored visiting on Tuesdays, Fridays, and Sundays. When the crawl rate was set to “Faster,” Google visited an average of 7.5 times a day and favored Wednesdays, Thursdays, and Saturdays. Over 90 days it visited 135 more pages than if I had left it set to “Normal.”
If you look at the graph above you will see there is a slight improvement while the crawl rate is set to “Faster,” but only for a portion of the 90 days. After seeing the results, I have come to the conclusion that changing the crawl rate in Google’s Webmaster Tools makes no substantial difference to this site.
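For what it's worth, the per-day averages and the weekday breakdown above can be pulled straight out of a log table like the one sketched earlier with a couple of queries. Again, this is only an illustration using the hypothetical crawler_visits table and placeholder dates, not the exact analysis I ran:

```php
<?php
// Illustrative queries against the hypothetical crawler_visits table:
// average Googlebot visits per day over a period, and which weekdays
// the bot favors. The date range below is a placeholder.

$pdo = new PDO('mysql:host=localhost;dbname=crawl_log', 'user', 'pass');

// Average visits per day over a given period (e.g. the 90 "Faster" days).
$avg = $pdo->query(
    "SELECT COUNT(*) / COUNT(DISTINCT DATE(visited_at)) AS visits_per_day
     FROM crawler_visits
     WHERE visited_at BETWEEN '2008-01-01' AND '2008-03-31'"
)->fetch(PDO::FETCH_ASSOC);

// Breakdown of visits by weekday, busiest day first.
$byDay = $pdo->query(
    "SELECT DAYNAME(visited_at) AS weekday, COUNT(*) AS visits
     FROM crawler_visits
     GROUP BY weekday
     ORDER BY visits DESC"
)->fetchAll(PDO::FETCH_ASSOC);

print_r($avg);
print_r($byDay);
```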
Great research. I think I will have a play around with this myself. It may be worth setting this to fast when a site is new, and then reducing it after all the pages are crawled and indexed.
thanks
Good point. Perhaps useful on newer sites, or while you are still adding a lot of content in the initial stages? Well-documented article indeed. Thank you.
I can't understand why you would say "No Thanks" to a 15% increase in crawl rate. That's pretty substantial and it doesn't cost you a thing, so why not?
Other than that - great post. Informative and nicely documented.
The letdown for me was that the 15% increase really only held for a minor part of the time it was set to "Faster." I was also hoping to see a larger increase, but I was glad to see it did increase, just not as much as I thought it would.
Nevertheless, it appears that this 'medicine' has some positive effect and no negative effect. It seems to me that while one might be disappointed with the volume and length/consistency of any additional results, there is no harm in setting the crawl rate to take advantage of whatever you can get from it.
I'd love to see whether the Tues-Wed-Thurs pattern continues over time. That would be a valuable piece of information to have about when to set deadlines for site updates.
As several people pointed out above, "crawl rate" should not be conflated with "crawl frequency". This setting is all about bandwidth and how much load your server can handle, not about telling us to crawl your site more or less often.
I notice from your graphs that the max # of Googlebot visits per day was higher when you had the crawl rate set to "Faster", which would make sense given my understanding of what the setting is intended for.
Since there aren't any labels on your third graph, I'm not really sure what it's supposed to represent...
TheDame, I've added labels to the last graph so you can understand it. Basically, the horizontal numbers are the different blog posts and the vertical numbers are how long it took Google to crawl that page. Sorry for the mess-up, and thanks for the comments.
That's quite interesting that Googlebot actually visits the site more often. I always thought that it just reads more pages in the same amount of time, as I imagined the bot has particular time slots for visiting pages. If you take a look at the Crawl stats you'll see that in periods when downloading a page took longer (which can be due to high traffic on your page, slow internet connections, etc.), the number of crawled pages was also lower than in periods when downloading a page was faster - so I thought that Google just sends more requests within the same time.
If you only had 40 pages in Google's cache, I don't think the number of crawled pages per visit is significant enough ...
As I read your post, you came to the conclusion that it made no substantial difference to the site ... but it changed in a positive way (according to your graphics), right? So maybe the impact would be greater/better if you had a website with more pages indexed (say 500,000 or so) ... or what do you think? Or would you generally recommend not switching to faster?
Great post by the way :-)
I agree with you, Michael. This was a smaller site and I would like to see the impact and data from a site with a much larger index.
This is exactly what I was thinking. This feature seems to be targeted towards sites with many more indexed pages. I'd love to see the same test done for a site with tens of thousands of indexed pages.
Great job.
I have tinkered with this a little myself, although I have to admit it was not nearly as well documented as what you have done here.
My finding, or at least my take, is that it depends on your category and your content as to when the Googlebot cycles through your site and with what frequency.
Also, it does not cycle through all the sites in a vertical at the same rate.
Although I like what you have done, I would hesitate to say it would have the same effect in every case.
Excellent work.
Thumbs up for running an experiment and showing us your data - I wish we all did more of that. I think crawl-rate has become pretty smart: Google has a sense of when you update content and how often and uses that and your overall "importance" to adjust. I've been amazed to see blog posts pop up in search next day or even same day in some cases, even though my blog is relatively small. Meanwhile, a client site with much more traffic and authority gets spidered less often because it's (1) updated less often and (2) deeper (more pages = more time to crawl).
Awesome research and thank you so much for sharing!
I would also be interested to see how the size of the site affects this. 15% is an increase, but without really crunching the numbers I would agree it's not really statistically significant.
Yeah, I'm with Gab on this one... surely if the crawl rate increases (albeit marginally), more amendments / changes / updates are therefore indexed by the bot.
I'm switching to faster even for my smaller sites...
Cheers for the research - really helpful, thanks!
Last night I was browsing through my website settings in Google's Webmaster Tools and was considering changing my Google crawl rate from normal to fast, but decided to do a bit more research before I made that change. That's when I came across your article. I read through all the replies and am glad to have found this goldmine of information. After reading everyone's comments I decided to go back to Webmaster Tools and change it to "fast." But at some point between last night and this morning Google took away my ability to change the rate. Now it says "Your site has been assigned special crawl rate settings. You will not be able to change the crawl rate."
That's some great work! Interesting stuff that I have not had time to test myself.
You kept saying the increase was insignificant, and at the end you say "changing the crawl rate in Google’s Webmaster Tools makes no substantial difference to this site." Surely having 135 more pages visited over a 90-day period was a benefit to you? In any case, nice post!
Solid research - thanks for sharing.
A faster crawl rate will allow for new content and other site updates to be picked up on a shorter cycle. I think there's a benefit there. I'd be interested to see the effect on a major website, like NYTimes.com. They're already being crawled multiple times per day. They're competing with multiple other news sources to rank for breaking news stories. If this allows them to "turn up the dial" and be indexed faster than the next guy, that would definitely translate into $$.
Interesting. Not long ago Hamlet Batista proposed that a page's crawl rate could become a metric for perceived importance (an alternative to PageRank). This data would suggest a wide margin of error in that regard.
I've always wondered if changing that setting ever did anything. It almost looks like the Googlebot takes a little vacation on Sunday, Monday, and Tuesday before doing its real work in the middle of the week. Interesting nevertheless!
Hi,
Great article. Can you please let me know how you identified that Googlebot had crawled your site?
We are trying to identify how frequently Google crawls our website, Leprechaun Salon Software.
Your help is highly appreciated.
Thanks,
Thanks so much for this post. After a super frustrating month of Google ignoring a site, even after link building and submitting a sitemap, I was about to set a faster crawl rate to make sure it doesn't happen again, but now I won't.
I always got the impression that Google would level the crawl rates out in the end anyway, so if you set it to faster and then it turned out you didn't need faster, the crawl rate would go back to normal.
Did you see any slowing of the crawl rate over the 90 days you had it set to faster, or was it faster when you set it back to normal after having had it set to faster?
Yep, that's right. But if Googlebot could crawl your site faster and thinks crawling faster would maybe exceed your server's resources, it kindly asks you. Take this quote from Webmaster Tools.
If there's nothing left to crawl, Googlebot won't try to crawl around foolishly :-)
I had the same thought: if there isn't much to crawl (more pages) and the content isn't refreshed often, what's the point in asking the bot to visit more frequently, apart from seeing what comes out of the experiment...? I would have thought it is worse, since the bot would come across the same information repeatedly and so negatively inform the Google algorithms about your site. Am I right here, or does a higher visit frequency with the same content not affect how Google perceives/ranks your site?
Definitely a good experiment, chenry.
After the speed was set back to "normal," the bot seemed to be at the site more than it was at the start of the project. This could be for a number of reasons, but I figured the fresh new content and the fact that the site was growing were the main reasons for the increase.
It would be interesting to see what happens when this is tried with a somewhat more leveled-off or stagnant site, as opposed to a frequently updated/growing one. I agree with your general thesis here, but there are still some instances where I think a crawl bump can benefit a client's site. After a major architecture change, for instance?
I too pondered the crawl rate. It is pretty cool to see this much work put into a visual aid to help enhance the point.
Hi,
Great effort, and sharing this invaluable research is excellent. This might be a little off topic, but what is your experience with Google crawling a brand new website? How long does it normally take, and based on what factors? And what can be done to expedite the process, if possible?
Your feedback will be very much appreciated!
I had the same feeling, but thank you for your article and the charts that confirm it.
Great job, congrats!
Alright, that's really strange. Just a couple of days after dealing a bit more thoroughly with the Webmaster Tools crawl rate, they changed it ... at least for me. Now I can set the exact crawl rate (requests per second and seconds between requests) ... did they change it for everybody, or is it just a change that went with my granted beta account for beta analytics features?
They changed it a few days ago for me also! Time for some more research!
Great post, Chenry. Very valuable for a novice like me. Thumbs up to you for this research.
Hi guys, I was unhappy with my crawl rates. It's been 2 months since I uploaded my website; at the beginning it was going well, then it stopped for a long time, so I went to look for solutions and found the option to increase the crawl rate, but I wanted to search around first and found this post. Even though you said it was not a great thing for you, I decided to give it a shot. Now, I don't know if Googlebot was already crawling my website while I was changing and increasing the crawl rate, but right when I changed it I had 4 more pages indexed and 15 more pages submitted to be indexed (sorry for any bad English). To be more precise, I had 460 pages submitted and 162 indexed; right after increasing the crawl rate I looked again and I now have 475 submitted and 166 indexed. I find it very strange, as I know Google is usually very slow to do this kind of thing, but let's see the results over a longer time, and I'll come back to tell you guys more.
Thank you very much for saving me the time and answering the question I had just asked myself and then googled! 200 days is dedication!! WOW
Do you happen to have that script available? I would like to do a similar experiment. My main site is on my own DL360 server by itself, so it should be able to withstand Googlebot speeding through my pages.
I would like to see if I notice a difference as well.
thanks
This data would be far more useful if checked on a site that had far more than 7 Googlebot visits per day. That seems like a pretty irrelevant crawl level to start with, and the "faster" option is probably intended for larger sites with much more content to be crawled.
I know this is an old post, but I just experienced first-hand what happens when the crawl rate is set too high. We have a client site on a dedicated server with 50k articles, and when we submitted a new sitemap with 44k links in it, it killed the server over the next few days.
So we reset the crawl rate from 1 second per link to 15 seconds to stop that. Be careful with this one!
We also have problems with Google's crawl rate and large customer websites. If left to itself, Google will hammer a site all day and all night, pushing bandwidth way over an ISP-imposed 2Gb limit. We have to set Google's crawl rate to the very lowest. The problem is, Google doesn't leave it like that; it only lets you set it for 3 months, then it reverts to hammering the server again. So every 3 months we have to go back in and reset it. If we are not around to do this, the customer gets billed for the bandwidth overruns. Google, in its arrogant, Microsoftesque way, refuses to accept this is a problem.
Google could solve this crawl rate problem by honoring the robots.txt "Crawl-delay" directive, which it pointedly states it ignores. Attempts to get an answer from Google about this are pointedly ignored.
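For reference, the directive in question is just a line in robots.txt; the number is typically read as the number of seconds to wait between requests. Google has said it ignores Crawl-delay (the Webmaster Tools setting is its replacement), though some other major crawlers, such as Bing, have honored it. A minimal example:

```
User-agent: *
Crawl-delay: 10
```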
chenry,
While I commend your desire to test and document Googlebot behavior, I think your blog post title is misleading and could lead to people not requesting an increase in crawl rate when it would benefit their sites.
I have tested this feature on two sites with 50M+ pages (Eventful.com & Yellowpages.com) and we saw substantial increases in Googlebot's crawl rate on both of these sites, subsequently leading to the indexed pages count increasing more rapidly than it had been before.
Wish I could share the #'s with you, but what I am allowed to tell you is that the changes were significant in both instances and occurred within 48 hours of Google confirming the request. At Eventful.com the change was isolated from on-site content / architectural changes, so it would be highly unlikely that the increased crawl rate / index rate was unrelated to this change. As for Yellowpages.com, the change coincided with several on-site architectural changes, so I cannot say definitively how much changing this setting contributed as opposed to the other changes I made.
I hope these comments are helpful for those of you managing large content sites!
John Cole
John,
Thanks for the late comment. About a month after this post was published, Webmaster Tools changed the way the crawl rate setting worked. This may have led to the different results you saw in your test. Not to mention this post was completed over a year ago, and many things have changed.
I do wish you could share more information but understand your clients may not like that.
Thanks for the comment though!
I tested my Android app site a few days back, both by forcing Google to crawl it and through the normal way as well. I don't know why, but when I choose the faster way, my content (new posts) doesn't rank that well in Google; on the other hand, with normal they rank well.
Anybody else having the same problem?
This is an interesting piece of research. Like the individual before me, I wondered if changing the setting did anything.
Although, I have to admit, I am a little curious whether, if we reduced the crawl rate to its lowest minimum and compared it with the normal crawl rate, any of the statistics above would change.
Especially the crawl rate on the mentioned days.
Interesting. I've been afraid to fool with that setting. I would love to see the script you wrote though. What did you write it in?
Thank you for sharing your experiment.
But when Google doesn't think our site warrants faster crawling, Google won't activate the faster option.
Making the crawl rate faster does not have much impact on normal sites that add 50 to 100 new pages per day, but if a site adds more than 1,000 new pages per day it will have an impact.
I have worked on many big portals and small sites too, and I have found that if you increase the crawl rate on a portal it will affect the indexing of the site.
Great post Chenry!
My only suggestion would be to make your closing paragraph stronger; your conclusion left me hanging somewhat on why you feel that a faster crawl rate is a no-go for you.
Great article.
I myself wondered if making the change would actually make that much of a difference. I see it doesn't make a considerable enough change to make it worthwhile.
Thanks for the great article!
Very interesting post, thanks.
What do you guys use to monitor Google's crawl only?
We use a custom script written in PHP that stores all of Google's visits in a database.
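If it helps, the core of it is just checking the request. Matching the user agent string alone can be spoofed, so Google's documented recommendation is to verify with a reverse DNS lookup (hostname ending in googlebot.com) followed by a forward lookup. The sketch below is illustrative only, not our exact script:

```php
<?php
// Illustrative sketch: verify a request really comes from Googlebot
// before logging it, using Google's recommended DNS double-check.

function is_googlebot($ip, $userAgent)
{
    if (stripos($userAgent, 'Googlebot') === false) {
        return false;
    }

    // Reverse lookup: the hostname should end in googlebot.com or google.com.
    $host = gethostbyaddr($ip);
    if ($host === false || !preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }

    // Forward lookup: the hostname should resolve back to the same IP.
    return gethostbyname($host) === $ip;
}

// Example usage inside a page or logging script:
if (is_googlebot($_SERVER['REMOTE_ADDR'], $_SERVER['HTTP_USER_AGENT'])) {
    // record the visit in the database, as described above
}
```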
Hi Chenry, Great, thank you for the info.
Using 'faster' instead of 'normal' crawling once helped me get an old website crawled for a client; it had previously gone uncrawled even after adding new content.
We used to have the Faster option, but we haven't had it for the last 5 months.
We wonder why the option was removed.
We still have about 300 pages on the website.
Googlebot uses a variety of algorithms and information about your site to determine how much to crawl it, and how fast. For the vast majority of sites the "Normal" rate gets us all the coverage we need from your site. However, if we feel that there's more content on your site than we're able to crawl at our current rate, or that we're missing things that we'd like to be able to crawl, we'll give you the option of selecting a faster crawl rate.
Like I said, "Normal" is fine for most sites, and it doesn't mean that we're missing any part of your site. And if Googlebot decides it could use more bandwidth in the future, the 'Faster' option will become available to you again.
It is true that adding a site to Google Webmaster Tools makes a difference, as I have realized too. A few months ago I added some of my main sites to Google Webmaster Tools, and after one month I saw that those sites were ranking higher than the ones that were not added.
Mandeep
I am at Webmaster Tools as I write, debating whether I should change the crawl rate, as I plan on adding more content to one of my sites each week. I decided to search online and came across your post. Insightful, and it let me know what to expect. Thanks.
Awesome Research and Awesome Post!
Hi guys, I'm new to all this. I recently signed up for an affiliate program from cashgil and I'm finding ways to promote my new website. Where should I start? Sorry if I'm at the wrong place.
NO!
that's not what it is!
Many sites run programs and back-end operations to generate their pages. When Google crawled them, pulling 100+ pages/minute as if they were static HTML pages, it took the sites down through overload.
People complained, and Google responded by making the page rate configurable.
Crawl speed has nothing to do with how often Google visits; it controls how quickly it pulls your pages per minute. If your site is on a fast server and is all static HTML, you can set the page rate high and they will go through it fast. If your server is slow and can only handle so much load because of your programming, you have to set the crawl rate low. Then Google won't take down your site.
What is the factor that makes Google allow us to change to the faster crawl rate? To this day I can't change my crawl rate to faster, but Google still indexes my site fast; on average, 4-5 minutes after I post, Google has already indexed it.
So this study is based on only one site?
Kenneth,
This was only done with one site. I'm working on finding more sites so I can expand my research. I'll get back to everyone once I collect more data.
Goodie! I assume the results will be posted here on seomoz to keep us updated! :) Interesting theory though.