Like anyone else, I fall into habits, good and bad. Recently, while working on a client's website, I created a Sitemap and submitted it to the search engines, like I always do. That got me wondering whether this really helps the site and what effect submitting a Sitemap actually has.
I approached one of my clients who has a semi-popular blog and uses WordPress with the Google XML Sitemaps Generator plugin. I asked for permission to install my tracking script on their site to track the whereabouts of the bots. For those of you who don't know, the Google XML Sitemaps Generator creates a new Sitemap every time you create or edit a post in WordPress and submits it to the major search engines.
My client is good about posting new content to their blog, usually around two or three posts a week. The script I installed on their website was written in PHP and tracked every time a bot accessed the Sitemap, every time the Sitemap was submitted, and every page the bots crawled on the website. The script stored this information in a MySQL database along with a timestamp, IP address, and user agent. I also modified the Sitemap generator to record a timestamp every time the Sitemap was submitted to the search engines.
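That script isn't published with the post, but for anyone curious, a minimal sketch of the idea in PHP might look something like this (the table name, database credentials, and bot list are my own illustrative choices, not the script actually used):

<?php
// Hypothetical bot logger: record crawler hits with URL, user agent, IP, and time.
$db = new mysqli('localhost', 'tracker', 'secret', 'bot_tracking');

$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$ip        = $_SERVER['REMOTE_ADDR'] ?? '';
$url       = $_SERVER['REQUEST_URI'] ?? '';

// Only log hits from the major search-engine crawlers.
if (preg_match('/Googlebot|Slurp|msnbot/i', $userAgent)) {
    $stmt = $db->prepare(
        'INSERT INTO bot_hits (url, user_agent, ip, hit_time) VALUES (?, ?, ?, NOW())'
    );
    $stmt->bind_param('sss', $url, $userAgent, $ip);
    $stmt->execute();
}

Included on each tracked page, a logger along these lines captures everything needed to calculate the crawl-delay averages discussed below.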
On to the data!
The experiment was to see if submitting a Sitemap to Google and Yahoo would decrease the time it took the search engines to crawl and index a new post. The results for this blog were amazing! When a Sitemap was submitted, the average time it took the bot to visit the new post was 14 minutes for Google and 245 minutes for Yahoo. When no Sitemap was submitted and the bot had to crawl to the post, it took 1,375 minutes for Google and 1,773 for Yahoo. The averages were calculated across 12 different posts: 6 with a Sitemap submitted and 6 without.
After calculating the data, I thought there had to be a mistake. I went to a few of my own sites (GR Web Designs and Grand Haven Football), quickly created new posts, and submitted a Sitemap to Google and Yahoo. I checked my tracking script 30 minutes later: Google had already been there and the new posts were indexed. Yahoo followed shortly after.
Seeing how long it took the bots to crawl without a Sitemap, I figured there was a problem with the structure of the website and the bots couldn't crawl to the new pages. But when I and a few others looked into the site's crawlability, we found no problems. I also found that the bots accessed the pages containing the links to the new posts, but didn't go on to crawl the posts themselves until much later.
While doing research for this post, I found Rand's post titled "My Advice on Google Sitemaps - Verify, but Don't Submit," and I was perplexed. Why would Rand tell me not to submit my Sitemap when I got such great results from doing so? After rereading the post, I realized he was more interested in getting the valuable crawl data. Granted, I'm using WordPress and know that all my pages are crawlable, so why wouldn't I submit the Sitemap, especially if I'm going to get results like the ones above?
For sites like the one in this experiment, whose owners know there are no issues with the natural crawl, I would suggest submitting a Sitemap, because it leads to faster crawling and faster inclusion in the indexes. If you are unsure whether your link structure is sound, I would suggest that you do NOT submit a Sitemap; this will help you determine whether or not you have problems. For everyone out there whose website has a great link structure, why not help things along and submit a Sitemap to Google and Yahoo today?
I would love to hear what the SEOmoz community has to say about their use of Sitemaps. Remember, this experiment was only completed on one site and I might do further study on the use of Sitemaps if I get a good response from all of you.
Chenry - Love this post, and your hard work in experimenting and documenting the results. I think it's truly great stuff.
With regards to my post - I actually no longer hold that opinion, so I'm going back to edit the piece. I've actually seen the results of Sitemaps be so positive for so many clients that despite the loss of visibility into architectural issues, I tell everyone that 99% of the time, you should be submitting as one of the first actions in your SEO campaigns.
Rand, glad to hear that you have changed your opinion on Sitemaps. Hope you are enjoying the new year!
Good post but...Google has always used feeds to index stuff quickly for blogs?
I get stuff indexed in under 5mins from an RSS ping with blogs.
How conclusive this is for real websites with millions of pages is what would be really useful.
Sure, you might get stuff indexed quicker - But those pages are still not going to rank for anything if your architecture is crap.
Well if you read towards the end of the post, you will see that I stated that this is for sites that "know their site has no issues with the natural crawl." As other users have stated this will help if you need a post to be crawled quickly because it is time sensitive.
OK fair enough comment. I will add then - there is nothing new here.
This has been happening for a long while, and I guess I was referring to your comments about being surprised at Rand's post. These results just show quicker indexing for blogs with a feed, not the points addressed in Rand's post.
But I appreciate the research and statistics you have put together and shared.
I think this goes beyond just a site with a blog and a feed. Any Sitemap that is submitted should see the same results, regardless of whether it was produced by a blog or created manually by the owner.
This may have been happening for a long time, but if you look at most of the literature out there, you will see very mixed opinions on submitting Sitemaps. Hopefully this will show people that submitting a Sitemap will help them and their site in the long run.
Chenry, testing on single blog posts is very different from testing thousands or millions of pages on a normal, everyday website. Or a big site or URL change, for example!
The consensus on submitting Sitemaps is mixed due to other reasons.
There is way more to it - https://www.smallbusinesssem.com/xml-sitemaps-the-most-overrated-seo-tactic-ever/1193/
Sitemaps STILL won't help a lot of websites one bit 99% of the time, so it's incorrect to suggest that this test shows otherwise.
"I tell everyone that 99% of the time, you should be submitting as one of the first actions in your SEO campaigns." - Rand
OK,..
It's gospel obviously.
Or maybe we should wait to hear the reasons why, consider and evaluate it - rather than quoting points of view from someone else that they are yet to even expand upon.
You're right, I mean who is Rand really, does he even really know anything about the internet and SEO!?
Chenry, slightly embarrassing now...
I like Rand and on the whole he provides good advice. I met him at PubCon this year.
But there are many SEOs who are equally knowledgeable and say just the opposite. You haven't even heard why he recommends Sitemaps yet. LOL.
Rand I admire your fanbase.
I hear where you are coming from, but quoting Matt's post isn't the best option here. To start with, he explains that the Sitemap should not be used to fix crawl problems; this has been mentioned in the post.
Second, he gives an example of when getting rid of the XML Sitemap helped get the site crawled quicker; however, even he admits the test isn't conclusive. There could have simply been an issue with the serving script or a range of other issues that caused this.
On the other hand, I do believe that WP blogs get indexed quicker; however, the test above is on the same platform at different times, with the Sitemap active and without. The difference is quite visible.
The question you ask that I find most important is: will it work on a non-blog platform?
Hey Rishil,
I am not the one quoting...
I gave Matt's blog post as an example of the opposing views, of which there are lots more on both sides.
Blogs do get indexed quicker from feeds; there is no argument there, it's old news. There is even a nice ping service you can submit your feed to: https://blogsearch.google.com/ping
The above merely supports what we already know.
I think you hit the nail on the head with what I am getting at, right at the end in your last sentence.
Well, I am glad I haven't mistaken the main crux of your argument.
Of course I agree that this isn't "news". However, it is, I'm guessing, the first practical demonstration of a well-known "fact".
Which is why I proposed it for the main blog. Many SEO bloggers, even the respected ones, speak about what's good, what isn't, etc., while quoting their own experiences. It's just nice to see a valid, repeatable and verifiable test :)
This kind of data encourages others to replicate tests on their own sites, which helps keep things fresh...
I've got a number of blogs; some are quite popular and authoritative, others aren't. I never use XML Sitemaps, only a plain HTML sitemap linked from every page, along with a related-posts plugin and a few others to help with a flat architecture.
98% of the time Google hits the page well under 5 minutes after the post is made, and Yahoo usually under 10 minutes. This is tracked with Patrick's Crawl Rate Tracker from BlogStorm.co.uk.
Actually, with your average of 14 minutes, my posts are already in Google, and generally on the first 2 pages for some long tails, by then.
This is just done with pings.
I'd say you are seeing the extended wait without submitting the Sitemap due to historical data: Google is used to the "Post > Sitemap Submit" pattern, so you got vastly different results when you changed that.
Good post. Please see if you can find some blogs with similar authority and post frequency that have never had XML Sitemaps and just pinged Blogsearch, to do some more testing as a comparison.
Our website has several hundred thousand pages and originally had serious site-structure issues, with only a few thousand pages in Google's index.
Since a complete redesign and hundreds of 301 redirects (sad but true), we've been gaining about 5,000 newly indexed pages in Google per week.
This is primarily due to a great new architecture and a series of Sitemaps (as the limit is 50,000 URLs per XML Sitemap).
I'll have to write this up in YouMOZ, but in short, I'd agree that sitemaps are a huge help in getting indexed.
But, as you know, getting indexed is only half the game...
As for ranking, they have little to no effect in my opinion.
Chenry isn't talking about how to rank high; it's about indexing. So it's got nothing to do with ranking.
In fact, how can our web pages rank well if they're not even indexed???
It's reassuring to hear that Rand. After reading your post I had stopped submitting my sitemaps to SEs. Will start again :)
Great research, no doubt. But it's missing something rather crucial: any effect on visits. In other words, does more regular, deeper indexing = more visitors (for a site whose pages would get indexed without the Sitemap anyway)? We might think so, but if we are getting all empirical (and we are), we can't assume it will be so.
It should be simple enough for you to plot traffic on the same graph as your crawl time.
I will include that in my next research. I think any page that gets indexed is going to help get more visitors, especially if your post is relevant. We will see though.
Great analysis. In all our testing we have seen that submitting a Sitemap helps indexing if you have a lot of pages on your site. But it won't necessarily help traffic unless your pages target long-tail terms that don't need link juice to rank well. With us, our primary traffic generation comes from long-tail terms, so it helps. But if you plan on going for mid-tail or head terms, submitting a Sitemap will only get you indexed (but there is no guarantee, as Google says) and probably won't drive traffic until you build links to the page.
Nice job and good test.
Perhaps harder to do for those using a script/plugin without modifying, but creating a sitemap index file and multiple sitemaps can provide additional benefits.
Take the typical ecommerce site with various layers.
How you slice and dice it may vary depending on your site, but when you can isolate different sections into separate Sitemaps, you can use Google's Webmaster Tools to get a little better insight from the reporting provided: X number of Sitemaps submitted, Y number of URLs (total and within each Sitemap submitted), and Z number of URLs indexed.
Now, this needs to be taken with a grain of salt and not treated as gospel, but it can help identify issues such as page-load problems or pages that appear so duplicated they aren't even being accepted.
I generally like to see at least 80-90% of those URLs counted as indexed. If I see 50%, then I want to dig in and try to understand why. Maybe you see high indexation in some areas but not others... without this kind of segmentation, you have one big haystack to dig through. But now I might be able to see that the subcategory pages have low indexation and, in researching, might see that there is a lot of duplication in title tags, or maybe those URLs are heavily parameter-filled.
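To make the segmentation idea concrete, a Sitemap index for a hypothetical ecommerce site (the example.com URLs and file names are made up, not from any site in this thread) could look like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap-categories.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-subcategories.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-products.xml</loc>
  </sitemap>
</sitemapindex>

Webmaster Tools then reports submitted and indexed URL counts per child Sitemap, which is where the section-by-section insight comes from.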
Nice work and excellent food for thought.
The only issue I am wondering about is the difference between submitting an XML Sitemap, submitting a text sitemap, and using the ping function. Submitting a Sitemap and pinging both send a signal to the search engines. So I'd be interested in the difference between the three.
I think you would find that they will all be similar, except for submitting no Sitemap, i.e. sending no signal at all.
XML Sitemaps are definitely the best kind -- they are the only ones that allow you to submit all of the meta data that you want (assuming you want to submit it). By using the XML format from the start (even if you're only submitting URLs), you can add meta data later on without changing anything on the search engine side.
Pinging is the way to go once you update your Sitemap file. We pick them up regularly, but if you want to get the latency down, sending a ping is best.
Finally, regarding meta data (last modification date, priority, change frequency): please don't submit "default" data like "1.0" for the priority of all URLs. Either give us something useful or don't include the meta data. If at all possible, I'd absolutely recommend using the last modification date (and if you have a blog, that does not mean the post date -- comments are modifications worthy of being indexed too).
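As an illustration of that advice (the URL below is hypothetical), a single entry with a meaningful last-modification date and no boilerplate priority or change frequency would look like this; once the file is updated, a ping such as Google's http://www.google.com/ping?sitemap=<your URL-encoded Sitemap URL> lets the engines know to fetch it:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/blog/some-post/</loc>
    <!-- reflects the latest comment, not just the original post date -->
    <lastmod>2009-01-07T15:30:00+00:00</lastmod>
  </url>
</urlset>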
Thanks for the clarification John.
For those who don't know, John is currently a "Webmaster Trends Analyst at Google Zürich".
https://johnmu.com/
This is what I was thinking: blogs have a ping function. I don't even have a sitemap on my blog, yet one time I got an article indexed in 4 minutes.
https://www.hi-tech-it.com/wp-content/uploads/2008/09/atora-beef-suet.JPG - proof, all I did was ping Google.
Yes, indeed, I have a client website that got indexed without submitting any Sitemap or having links from any other website.
But for site analytics I need to submit a Sitemap.
Great post and excellent documentation.
I agree that your site will get crawled quicker and more pages indexed if you submit an XML Sitemap. Out of courtesy, the major bots don't crawl your entire site at one time, but come back again and again to lessen the load on your server. Giving them the new URLs is very important to accelerate the process.
On the other side (there always seems to be another angle, doesn't there), to gain the maximum PageRank value from link sculpting, the bots need to crawl your entire site. If they just index individual pages, there is no synergy within your site, and you might as well just have 1,000 individual sites instead of one site with 1,000 different pages.
My usual solution (subject to change at any time) for a site that is already indexed is to submit new pages ASAP, along with any entry pages. The search engines get the new content and are encouraged to crawl from your most important pages again.
Wow, really great experiment Chenry.
I've never used that plugin myself, so I'm not terribly sure exactly how it works, but I'd be a little concerned about having it submit Sitemaps automatically for me. I create XML Sitemaps for all of my sites, but I like to take a lot of care in selecting which URLs are included*, and what I want their priority to be.
* um, we have some duplicate content issues due to a lovely IIS server**, and I make sure that only the version of the file I want indexed is in the sitemap (ie. default.aspx vs. Default.aspx).
** yes, I know there are re-write solutions out there for IIS, we just haven't gotten anything up and running yet.
@saffyre9, it's not too bad of a plugin; check it out. You can control which pages are included in your Sitemap so it doesn't get too out of control. Plus, if you know your PHP, you can edit it to make it do what you want.
Great Data
I have found that using Sitemaps with very time-sensitive blogs has helped a LOT. When you are trying to rank for events that happened in the last few hours, the quick indexing is nice.
This works especially well for niches that are NOT covered by the major news networks and Reuters, since those posts naturally rank well for news and urgency.
The flaw in the research is that the author's blog, which was also tested, has FeedBurner on the same page, which also triggers Google to come crawl new content, usually about every 30 minutes, which was the time reported.
Even submitting a sitemap for a low PR site, such as the test subject, doesn't mean it's going to be crawled quickly (if at all) per Google's Webmaster Central.
IMO it was feedburner and not sitemaps that caused the crawl, but who can tell?
The only way to know for sure is to disable the RSS feed and the ping to Feedburner and run the test again.
Only my site uses FeedBurner. The client's site, where the 12 posts were published, does NOT have FeedBurner.
Which site does Google deem more important, your client's site or yours?
It's possible the sitemap caused the client to get crawled quicker yet the feedburner caused the same effect on your site.
Your theory is possible but the research is flawed until you disconnect feedburner and rss feeds from your site and run an additional test.
If the results are the same, then I'll agree with your conclusion. ;)
The research was completed only on the client site and not on my site. The only reason my site was mentioned was because I was surprised by the data and wanted to see if it would work on my site also. This post was only meant to focus on the fact that using a Sitemap increases the speed at which the search engines come to new content on your site. Whether the use of FeedBurner does the same would have to be researched separately.
"The research was completed only on the client site and not on my site. "
Then why say you did it on your site too?
"I went to my site (GR Web Designs, shameless plug) and quickly created a new post and submitted a Sitemap to Google and Yahoo. I checked my tracking script 30 minutes later and Google had already been there and the new post was indexed. Yahoo followed shortly after Google did also. "
I'm sure I read it properly.
The use of feedburner definitely does the same thing because I've been using feedburner to rank new content quickly before sitemaps ever existed.
Usually takes Google about 30 minutes to pick up new content via feedburner
You forgot this: "After calculating the data, I thought there had to be a mistake." Then I made one post on my own site to see if the same thing would happen, just to verify the information I had from the 12 previous posts on my client's site.
Hey Bill - I agree that FeedBurner also increases the crawl rate, but note that in the test shown, two distinct charts are displayed: one with the XML Sitemap and one without, "with all other things held equal".
Hence the FeedBurner effect would have been in play in both charts even if it was on the site.
Nice post, love the graphs. Makes for a good report to a client, and you can't go wrong with pictures...
As you have seen, if you have a very active blog, with 2-5 new articles posted each Monday, Googlebot seems to understand this pattern. My client's blog set a record with hits around 55 seconds after each post, and the bot would come back and crawl related old articles in the following days.
The interesting thing about this was that the blog wasn't a massive traffic magnet, but it seemed to always have consistent, slow growth in visitors that matched these crazy Monday posts.
The blog was set up with WordPress.org, installed on the server, and the Google XML Sitemaps plugin was installed. It is a very important point: if you have proper Sitemaps set up, Google loves you.
Other ideas are to tweak elements of the article once it has been crawled by the spider to get a new hit: change the title, maybe add a relevant image or two, and tweak a few paragraphs, then see how the spider responds to the update. Why do you think news websites put out a basic outline and slowly provide updates as the story progresses? They understand that people (and spiders) will return, and may even pick old articles back up and post them to the front page of Google News....
Main blog stuff. Period.
Edit for Sphinn Link. https://sphinn.com/story/95448
Great post and documentation! I've always wondered what the effect of sitemap submission was and now I have an idea. I agree with rishil, this is Main Blog Stuff!
I myself have also had a good experience with sitemaps. I also use the Google XML Sitemap Generator for my blog, https://blog.brenelz.com.
It's now almost at the point where, as soon as I submit a blog post, it shows up in Google. I don't think this would be possible without the use of Sitemaps.
-Brenelz
> I don't think this would be possible without the use of sitemaps.
Indeed it is! My personal blog in Danish is a good example. It's actually not that well linked (and just a PR 5 site), and I get everything indexed within minutes, sometimes seconds. I can say the same for the majority of other sites I work with.
There is one important factor missing in this analysis: Do they rank?
In my experience a well linked website or blog, with no internal linking issues will get indexed within minutes of adding new content. My own record, for a post on my blog, was 13 seconds from the time I hit "post" to the time that post was number one in Google for the main keyword. Without Sitemaps!
With Sitemaps, some sites may get indexed faster, but that is only because their linking is too weak, and therefore they won't rank.
So why do you want to get indexed fast if you do not rank? To help Google brag about a bigger, fresher index? I don't think that's my job ...
Sorry, my best advice to clients, and everyone else, is still NOT to use XML Sitemaps (except for a few rare occasions). You simply get much better results by creating sites that naturally get indexed fast, because such sites rank for the indexed pages too. And the fact is, if you work your site and linking correctly, you CAN get your pages indexed in seconds or minutes without XML Sitemaps.
On top of this comes the issue already brought up here: most clients will sooner or later run into internal link-structure problems, even if they do not have them now, and when they do, not having the Sitemaps cluttering everything up makes my job of finding the problem so much easier (read: cheaper for the client).
Mikkel, thanks for talking some sense here.
I gave up. lol
erm... LOL? To you and Mikkel.
You want to get indexed so pages can build up age, attract links, etc.? There are three separate issues:
Amen to that!
I certainly respect your expertise, Mikkel, but I've got to disagree on this one. Discovery can be a big issue for many sites, especially complex sites or sites that have recently undergone major changes. Sure, slapping an XML sitemap on top of a lousy architecture is a poor excuse for doing things right, but what's wrong with using a well-targeted XML sitemap AND a solid architecture?
I work with some highly dynamic sites where concurrently improving our architecture and creating a targeted XML sitemap that focuses on the level we want to attract visitors to has created some very positive results. It doesn't directly affect ranking, but: (1) So what? You can't get ranked until you get discovered, and (2) By targeting certain pages, we have indirectly boosted ranking.
> but what's wrong with using a well-targeted XML sitemap AND a solid architecture?
Several things:
1) In my experience you can get indexed just as fast without Sitemaps if you have sufficient links. If you do, there is no problem, and there is no reason to fix a problem that isn't there :)
2) If, at any point, the site does run into internal linking issues (and most large dynamic websites and commerce sites do, in fact), then having the Sitemaps on will make those issues more time-consuming and expensive to discover. I don't see any reason to add additional costs to the work.
3) I believe that pages search engines find naturally tend to rank better than pages we "force" on them. I have had that same belief, and experience, ever since the "good old" submit-a-site forms.
So all in all: you are trying to fix a problem (indexing) that shouldn't be there if you do your SEO job well enough (including getting the links you need!) :)
I think there are definitely cases where you're right, and people certainly abuse XML sitemaps, but I've found cases where they can be complementary.
For example, search-within-search has gotten to be an issue on a couple of client sites. The sites are highly search driven, but I want Google to focus on the product pages, not the search results pages. Of course, I've built in solid architecture for browsing/spidering (category pages, etc.), but at some point, if I over-engineer for SEO, I'm creating a bad architecture for users. In one case, I've used the XML sitemap to target the product pages, while letting spiders take their natural course (controlled by noindex, etc.). The combination helped clean up the client's search index and ultimately made the product pages rank better. This caused more and more visitors to link directly to those product pages, which led to even better ranking, and on and on.
Used well, I think the XML sitemap can do exactly what it's designed to do - be a signal to the engines of what you feel is most important on a site. Yes, your architecture should also do that, but what's unique about the XML map is that it speaks only to spiders, without affecting visitors, and I think that fact can be used to your advantage.
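As a hypothetical sketch of that combination (the robots tag below is standard; the particular split between search-result pages and product pages is my illustration, not Dr. Pete's exact setup): internal search-result pages carry a robots meta tag that keeps them out of the index while still letting spiders follow their links, and the XML Sitemap lists only the product-page URLs you want the engines to concentrate on.

<!-- Placed in the <head> of internal search-result pages -->
<meta name="robots" content="noindex,follow">

The Sitemap itself then contains only the product URLs, in the standard urlset format, so the only pages you explicitly hand to the engines are the ones you actually want ranking.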
I agree with Dr Pete - it's clear the article doesn't declare that sitemaps affect ranking and does take a moment to say don't do it if you have architecture problems.
Getting pages indexed will mean a shorter delay waiting to discover how a new page ranks, so you can begin to address the ranking/keyword/linking/content issues and, thanks to the Sitemap's last-modified field, see a quicker result from any changes made to content and keywords.
Nice post - I missed it in YOUmoz originally. It's always been my observation that XML sitemaps were a boost to discovery, but it's great to see some data.
Mikkel, I agree with your theory but I think it's a little bit cocky to be honest. Sure, if your site has a million backlinks and tons of fresh content Google will come back every 10 seconds and all of your pages will rank instantly. Only problem is that it takes time to get to that level...and not all sites (especially big corporate sites) are so eager to go changing up their site structure. Truth is there aren't that many sites out there that are as awesome as yours. Sometimes a sitemap is a good band aid to indexing problems that might otherwise take a long time to fix.
> Sometimes a sitemap is a good band aid
If there is an infection under the band aid then it certainly doesn't help. In fact it can make it worse :)
Having paying customers write excellent articles for you for free. SEOmoz has the best of both worlds. :-)
Oh so true! I wish I was SEOmoz and had others write great content for me!
I certainly agree - I've been using Sitemaps for four years now.
However, I've also noticed that sometimes Google picks up a new post instantly, within a very few minutes. I've always suspected that running AdSense must have something to do with that. It would make sense for them to tie into info gained from AdSense. I don't know that they actually do, but when something shows up on page 1 just two minutes after I posted it, how else could it have gotten there?
Thanks Chenry for posting your findings. While this may not meet everyone's scientific standards, it was better than nothing at all, and is appreciated by the community, or at least some of us :)
Great post and glad to see it was promoted to the main blog! Excellent documentation and highly relevant to me especially as I've just set up a new WordPress blog with the XML Sitemaps Generator plugin. Thanks and keep up the good work chenry!
This was really informative, and it's great to see a measurement of elapsed time versus submit time.
However, my main concern with sitemaps (when we've implemented them in the past) isn't in terms of when google indexes but rather how many. Most of the sites I work with are usually relatively large e-comm sites with thousands of products, and every time we submitted a site map our number of pages indexed would actually go down; and this wasn't because of duplicate content or weird little appendices, actual unique product pages would simply disappear. We'd take sitemaps off and after a few "unassisted" crawls they would come back again. Obviously pages indexed isn't some magical number that automatically improves your success, but it did worry me to see normal "good" pages slip out of the index. In some cases it would go from >10,000 to <1,000 (with about 12,000 unique products).
I don't know if this is still the case, since this happened about half a year ago and I haven't revisited it for these sites. Has anyone else had this experience? Is it just growing pains for sitemaps or (as I suspect) sitemaps being more useful for content that updates on a near daily basis?
That’s interesting. I know there is a limit of 50,000 URLs and a size limit of 10MB uncompressed. I've heard that it can be better for your site if you have multiple sitemaps with only about 1,000 URLs in each one. Might be interesting to try something like that.
We don't care how you split up the URLs in a Sitemap file. Put 10 in a file or put 50,000 in a file, whatever suits your workflow best.
I've got 15 sitemaps and I feel it has helped with indexing.
Sadly, yes!
Whilst our e-commerce site isn't quite of the same magnitude as the ones you work with (ours is almost 3,000 products), since submitting our Sitemap the number of indexed pages has plummeted to just over 250 (whilst Webmaster Tools reports Googlebot crawls an average of 1,500 pages per day; it's worth noting I slightly increased the crawl rate too, to see what would happen).
As an experiment I've deleted the Sitemap from WM Tools and blocked it with robots.txt; maybe the results will turn into a YOUmoz post at some point...
excellent post, really great documentation!
Great post and you have the research to back it up.
So, if you have a site other than a blog, should you submit your Sitemap every time content is changed, even if it's not necessarily time-sensitive? Thanks, again, for the insight.
Westgate,
If your website has no crawl issues, I wouldn't mess with submitting a new Sitemap. I would update the "LastChange" (last-modification date) on the Sitemap to reflect the new date, if it is used. Search engines will check your Sitemap often to see if there are any new changes. My guess is your page will be crawled before that.
Submit XML Versus Ping
For Google, I prefer re-submitting the xml sitemap over pinging when we update our site.
Both are equivalent to Google, I don't know how the other engines handle it though.
If they are equivalent to Google, why are the results different for our website?
Perhaps because it is smaller?
I'll go back and look, but it is definitely better when we re-submit through Google Sitemaps.
If you see any differences those would be from the usual fluctuations. We treat both re-submitting a Sitemap file and pinging them as the same thing on our side.
so true..
I had a malfunctioning sitemap in use for my CMS, over which I didn't have much control. The automatically generated sitemap included duplicated pages and left some content items out. But I didn't give it much thought and left it that way, thinking it shouldn't really matter because my site doesn't have many pages, and all were logically interlinked.
But since I fixed the problem with a Sitemap that includes exactly what I want indexed (and excludes duplicates), every new page gets indexed 'in no time'. I haven't got the tools to measure exactly, but the bot comes by almost daily and indexes what the Sitemap says :-)
Hi Casey
Could you tell us what the error on the site was?
Was it a structural issue, where pages ended up isolated or was it a deeper issue?
Regards
I've found similar Google crawling behavior with WordPress blogs. While Sitemaps are made to address potential crawl problems, it's a lot more productive to have an active one submitted regularly.
Cool post, as one of my clients is asking me a lot of questions about Sitemaps, and unfortunately she doesn't have the right knowledge at all about what a Sitemap actually is and does. This post is extremely beneficial to a lot of people because it not only helps identify bad site designs that the search engines can't read, but also covers well-designed sites and gets them indexed much faster. Great post, well researched and well explained.
If I could give a gold star I would ha!
Interesting. I've just started working on this site for a client and found similar data
https://www.debbiegrattan.com
They too don't have a sitemap.
Obviously Google goes straight to the sitemap, whereas it appears Yahoo is totally lost.
Is there another way to submit sitemaps except through webmaster tools?
I had a large media site that was ranking "OK." I made a dynamic Sitemap and submitted it to Webmaster Tools... shortly after, the site was deindexed!!
I have a question: what is the best way to develop a Sitemap for dynamic sites? Is there software or a script available, or do I need to keep entering URLs whenever I add new pages?
Good article; we will be using this. Meanwhile, I notice that your two graphs are actually quite misleading (and not in your favor). Your y-axis scales don't match, so the magnitude of change between data samples is only perceptible by reading the scale. You would have a much clearer (and more honest) graphic if you plotted before and after lines on the same plane … or at least used the same scale on both graphs. If this didn't make your argument weaker on its face, I might call B.S., but as it is you're only hurting yourself.
Nevertheless, good stuff. Thank you!
Sorry to join the party so late - this is an excellent YOUmoz post (best one I've read for ages) - nice analysis. My experience with SEOgadget definitely backs this up. When you update your sitemap and ping Google, pages can be in the index (and visible in the SERPS) within as little as 15 minutes. Seriously!! That's a long cry from the days of yesteryear. Sphunn...
Original Sphinn link at https://sphinn.com/story/95448
Great to see this posted to the main blog. Whilst maybe not groundbreaking, it's scientific and gives us info on blogs and Sitemaps. Personally, I think Sitemaps do help out and are well worth having done accurately.
Great post mate.
A couple of people have stated that there's nothing new here, but through ignorance I've not used Sitemaps on a few sites, and I can now see from your experiment that I definitely should and that it's well worth it, especially when there's a simple WordPress plugin available.
Well researched and executed!
I understand the benefits of being indexed sooner. I see the value of this for time sensitive, news breaking content.
Does it increase your search rankings or is it merely that new content gets ranked and indexed sooner?
I am just trying to understand the specific benefits. Thank you.
Each site will see different benefits. For my client's site, they have many competitors posting on the same subject, so getting their post crawled and indexed before the competition makes a huge difference to them.
For my own personal blog, I like to know that my content is getting in the index quickly so I can catch some of those long tailed searches that my posts are focused on.
Google Analytics asks for it so I give it.
@Data Entry Services
What did you mean, Google Analytics asks for it? I work in Google Analytics every day and haven't seen a request for a Sitemap, but it would certainly be a compelling argument and a sensible approach; it could help guide new features of Analytics based on "recent updates".
Chenry, this post is a credit to the SEOmoz community. Every site is different and your principles may not apply to all sites, fair enough, but the results of your experiment are valuable learning points for everyone and definitely a baseline for further work. As far as I am concerned, my perceptions about Sitemaps have now changed. I am off to experiment with them. Thank you.
Very nice post. I agree that it would be a great idea to try similar experiments on other types of sites such as e-commerce sites with a few thousand products and see if the faster indexing leads to more sales.
Also would be nice to hear what tools people generally use to create their sitemaps.
This was really awesome and inspiring at the same time. I knew Sitemaps are very important for crawling, but had no idea they'd be this important.
Nice article, that's amazing that the Sitemap had such a profound effect on crawl time.
As a small, lesser ranked site owner, I agree that Google, at least, treats smaller sites differently. Not in a bad way--we have to work our way up.
Thanks for the concrete data. My gut knew the Sitemap had an impact, but being an engineer I needed proof. Thanks for going the extra mile!
Jocelyn
www.mozakdesign.com
Friend, you of all people should know that if you are going to do a piece of research, you really need to tie down the variables as tightly as possible. A graph ain't research, particularly in this SEO world.
I am sure that Sitemaps are good for quick indexing, but you can't draw the conclusions you make from your study. I have seen rankings go down after submitting a Sitemap as well as up. You'd need at least 20 sites in a controlled experimental condition.
I have read through other posts and am sure that you are making the search engines work harder by sharing basic marketing and optimization strategies with visitors. For that I thank you; at this time they do a terrible job.
I'm also glad to see that you are writing nothing about the strategies I've been employing the last 15-years.
:)
What a great post and conversation. For a newbie nosepicker like me, it was great to follow the conversation. I have a lot of small business sites that have a tiny number of pages. Some of my clients do have WP as a CMS or as a blog. I have consistently used the Google XML sitemaps plugin and the All in one SEO plugin to help with the indexing and keywords, etc.
Once I installed these, I noticed a slight increase in pageviews and unique visitors. But the main point isn't that... it's that, for someone like me in a small agency trying to do what I can for my small-business clients, submitting Sitemaps to the search engines is one of the first steps I need to take.
I guess a question I would have for those who do not think Sitemaps are very important would be: what should my first step be if it isn't submitting an XML Sitemap to the search engines?
I build all my sites with web-standards-compliant CSS and add keyword-specific title tags, meta tags, and descriptions. I think a lot of times it comes down to "How good is your content?"
Anyway, good post and good conversation.