I am the SEO director of Vitals, a comprehensive healthcare provider information and review site. There are over one million health professionals in the United States alone, which means a lot of categorization and pagination is necessary to organize all the providers by name, city, and specialty. Our pagination strategy has changed several times to keep in step with Google's latest recommendations, and later in this article I will present a history of those recommendations.
Pagination can occur in many formats. First is article pagination, where a single article spans two or more pages. Next there is gallery pagination, where every item in a gallery has its own page. There is also forum pagination, where threads can span many pages. Category pagination is when listings span several pages; these lists can be products or anything else that can be placed in categories. A newer form is infinite scroll pagination, where data is pre-fetched from a subsequent page and appended directly to the user’s current page as they scroll down.
You should note every pagination type on your site and discern which of the pagination options shown below works best for your situation. You should also determine which pages in the series would provide additional value if surfaced in the Google index. An article spanning several pages should allow Google to read and index the keywords from the entire article. Likewise, on lists of products, you would want the search engines to have a crawlable path to all your product listings. A component page in a paginated series can be valuable as: 1) a component page with good content that completes the series, and 2) a crawlable path to reach the individual items' content.
The best time to deal with pagination structure is during the design process; this avoids having to re-code or restructure after launch.
“Do what’s good for the user”
Product managers are often resistant to changing existing pagination, citing Matt Cutts: “do what’s good for the user, not for search engines”. That is certainly the top priority, but I would add one crucial element: “do what’s valuable to searchers trying to find your business”; otherwise, there’s no value to your content. You also need to code the pages properly so the search engines can act as the intermediary between your users and your quality content. Over the last few years, Google has laid out instructions in its Webmaster forums on how paginated pages should be structured and coded.
What problems can pagination cause?
- Crawl Hog
Google will crawl all the paginated pages if you let it. However, Googlebot allocates limited crawl bandwidth to each site. You don't want the crawler tied up in paginated pages, especially if those pages add no Google indexation value over page one. Increasing the number of categories, or the number of items per page, can decrease the depth of the pagination.
- Page Juice Dilution
Incorrect implementation can dilute link equity across the paginated pages, which in turn prevents that equity from transferring to the pages they link to.
- Duplication
Paginated pages are vulnerable to duplication filtering by the search engines. Coding them correctly lets the search engines know that they are pagination pages, so they will not be flagged as duplicates.
- Thin Content
Many paginated pages do not have a significant amount of quality content on them, and the Panda algorithm can penalize an entire site if it finds too much low-quality content. Thankfully, Google has given us relatively clear guidelines on best practices for pagination. Here are some pertinent excerpts from Google’s recommendations.
A brief history on Google’s recommendations:
https://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
9/15/2011 Google Webmaster Central
Here are three options for a series:
- Leave whatever you have exactly as-is. Paginated content exists throughout the web and we’ll continue to strive to give searchers the best result, regardless of the page’s rel=”next”/rel=”prev” HTML markup—or lack thereof.
- If you have a view-all page, or are considering a view-all page, see our post on View-all in search results.
- Hint to Google the relationship between the component URLs of your series with rel=”next” and rel=”prev”. This helps us more accurately index your content and serve to users the most relevant page (commonly the first page). Implementation details below.
A few points to mention:
- The first page only contains rel=”next” and no rel=”prev” markup.
- Pages two to the second-to-last page should be doubly-linked with both rel=”next” and rel=”prev” markup.
- The last page only contains markup for rel=”prev”, not rel=”next”.
rel=”next” and rel=”prev” values can be either relative or absolute URLs (as allowed by the <link> tag). And, if you include a <base> link in your document, relative paths will resolve according to the base URL.
- rel=”next” and rel=”prev” only need to be declared within the <head> section, not within the document <body>.
- We allow rel=”previous” as a syntactic variant of rel=”prev” links.
- rel="next" and rel="previous" on the one hand and rel="canonical" on the other constitute independent concepts.
- rel=”prev” and rel=”next” act as hints to Google, not absolute directives.
- When implemented incorrectly, such as omitting an expected rel="prev" or rel="next" designation in the series, we'll continue to index the page(s), and rely on our own heuristics to understand your content.
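To illustrate the markup points above, here is a minimal sketch of the <head> tags for a hypothetical three-page series (the example.com URLs are placeholders, not from Google's post):

```html
<!-- Page 1 (https://www.example.com/category?page=1): rel="next" only, no rel="prev" -->
<link rel="next" href="https://www.example.com/category?page=2" />

<!-- Page 2 (https://www.example.com/category?page=2): doubly-linked with both tags -->
<link rel="prev" href="https://www.example.com/category?page=1" />
<link rel="next" href="https://www.example.com/category?page=3" />

<!-- Page 3, the last page (https://www.example.com/category?page=3): rel="prev" only -->
<link rel="prev" href="https://www.example.com/category?page=2" />
```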
https://productforums.google.com/forum/#!topic/webmasters/YbXqwoyooGM
10/19/2011 – Maile Ohye in Google Forums
If you've marked page 2 to n of your paginated series as "noindex, follow" to keep low quality content from affecting users and/or your site's rankings, that's fine, you can additionally include rel="next" and rel="prev." Noindex and rel="next"/"prev" are entirely independent annotations.
This means that if you add rel="next" and rel="prev" to noindex'd pages, it still signals to Google that the noindex'd pages are components of the series (though the noindex'd pages will not be returned in search results). This configuration is totally possible (and we'll honor it), but the benefit is mostly theoretical.
If you believe the user experience on page 2 to n provides little value -- so much so that you've already marked these pages as noindex -- then to ensure that these low-quality pages aren't returned to users and/or considered in ranking updates such as Panda, even if you choose to add rel="next" and rel="prev," you may want to consider keeping the noindex (or "noindex, follow").
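As an illustrative sketch (mine, not from the forum post), the <head> of a noindex'd page 2 carrying both annotations might look like this:

```html
<!-- Page 2 of a hypothetical series: kept out of the index, yet still
     declared as a component of the series via rel="prev"/rel="next" -->
<meta name="robots" content="noindex, follow" />
<link rel="prev" href="https://www.example.com/category?page=1" />
<link rel="next" href="https://www.example.com/category?page=3" />
```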
https://googlewebmastercentral.blogspot.com/2012/03/video-about-pagination-with-relnext-and.html
03/01/2012 – Maile Ohye on the Google Webmaster Central Blog
"Does rel=next/prev also work as a signal for only one page of the series (page 1 in most cases?) to be included in the search index? Or would noindex tags need to be present on page 2 and on?"
When you implement rel="next" and rel="prev" on component pages of a series, we'll then consolidate the indexing properties from the component pages and attempt to direct users to the most relevant page/URL. This is typically the first page. There's no need to mark page 2 to n of the series with noindex unless you're sure that you don't want those pages to appear in search results.
03/12/2012 - Maile Ohye in YouTube Video
https://googlewebmastercentral.blogspot.com/2014/02/faceted-navigation-best-and-5-of-worst.html
02/12/2014 – Maile Ohye in Google Webmaster Central
Best practices for new faceted navigation implementations or redesigns
New sites that are considering implementing faceted navigation have several options to optimize the “crawl space” (the totality of URLs on your site known to Googlebot) for unique content pages, reduce crawling of duplicative pages, and consolidate indexing signals.
- Option 1: internal links
Make all unnecessary URL links rel=”nofollow”. This option minimizes the crawler’s discovery of unnecessary URLs and therefore reduces the potentially explosive crawl space (URLs known to the crawler) that can occur with faceted navigation. rel=”nofollow” doesn’t prevent the unnecessary URLs from being crawled (only a robots.txt disallow prevents crawling). By allowing them to be crawled, however, you can consolidate indexing signals from the unnecessary URLs with a searcher-valuable URL by adding rel=”canonical” from the unnecessary URL to a superset URL.
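A minimal sketch of both pieces (my illustration, assuming a hypothetical price-filtered URL):

```html
<!-- On the category page: the unnecessary faceted URL is linked with rel="nofollow" -->
<a href="/gummy-candies?price=5-10" rel="nofollow">Gummy candies, $5 to $10</a>

<!-- In the <head> of the unnecessary URL itself: consolidate its indexing
     signals into the superset category URL -->
<link rel="canonical" href="https://www.example.com/gummy-candies" />
```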
- Option 2: Robots.txt disallow
For URLs with unnecessary parameters, include a /filtering/ directory that will be robots.txt disallow’d. This lets all search engines freely crawl good content but prevents crawling of the unwanted URLs. For instance, if my valuable parameters were item, category, and taste, and my unnecessary parameters were session-id and price, I might have a URL such as the one reconstructed below.
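The example URL was cut off in the excerpt; the sketch below is my illustration of the idea rather than Google's exact example:

```
# Unnecessary-parameter URLs live under /filtering/, e.g. (illustrative):
#   https://www.example.com/filtering/category=gummy-candies&session-id=123&price=5-10

# robots.txt: good content stays freely crawlable, the filtering directory is blocked
User-agent: *
Disallow: /filtering/
```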
- Option 3: Separate hosts
If you’re not using a CDN (sites using CDNs don’t have this flexibility easily available in Webmaster Tools), consider placing any URLs with unnecessary parameters on a separate host -- for example, creating a main host www.example.com and a secondary host, www2.example.com. On the secondary host (www2), set the crawl rate in Webmaster Tools to “low” while keeping the main host’s crawl rate as high as possible. This allows fuller crawling of the main host URLs and reduces Googlebot’s focus on your unnecessary URLs.
- Be sure there remains at least one click path to all items on the main host.
- If you’d like to consolidate indexing signals, consider adding rel=”canonical” from the secondary host to a superset URL on the main host.
- Improve indexing of individual content pages with rel=”canonical” to the preferred version of a page. rel=”canonical” can be used across hostnames or domains.
- Improve indexing of paginated content (such as page=1 and page=2 of the category “gummy candies”) by either:
- Adding rel=”canonical” from individual component pages in the series to the category’s “view-all” page (e.g. page=1, page=2, and page=3 of “gummy candies” with rel=”canonical” to category=gummy-candies&page=all), while making sure that it’s still a good searcher experience (e.g., the page loads quickly).
- Using pagination markup with rel=”next” and rel=”prev” to consolidate indexing properties, such as links, from the component pages/URLs to the series as a whole.
- Include only canonical URLs in Sitemaps.
- Configure Webmaster Tools URL Parameters if you have a strong understanding of the URL parameter behavior on your site (make sure that there is still a clear click path to each individual item/article). For instance, with URL Parameters in Webmaster Tools, you can list the parameter name, the parameter's effect on the page content, and how you’d like Googlebot to crawl URLs containing the parameter.
Note: URL "Parameter Handling" in Webmaster Tools allows the site owner to provide information about the site’s parameters and recommendations for Googlebot’s behavior.
Let's analyze Google’s advice:
Option 1: The View All Page
Google clearly favors the View-All page option when the page loads quickly and users can easily find what they are looking for. This means that all items in a paginated series should be listed on the View-All page, and all the paginated pages should have canonical tags referencing the View-All page. The paginated pages in this scenario are there to garner more page views and to make the lists more manageable for a user to read; the View-All page is primarily for the search engines.
Coding Instruction for the View-All Option:
- Create a single View-All page with all of the content from the paginated pages within a single series of pagination.
- Once you have created the View-All page, place a rel="canonical" tag in the head section of each paginated component page, referencing the View-All page, as sketched after this list (example: <link rel="canonical" href="https://www.example.com/view-all" />). This tells Google to treat each page in the paginated series as a segment of the View-All page, so queries will return the View-All page rather than a segment page from the pagination chain.
- In Google Webmaster Tools Parameter Handling, set the paginated page parameter to “Paginates” and allow Google to crawl every URL.
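Putting the first two steps together, a minimal sketch (assuming a hypothetical /view-all URL) looks like this:

```html
<!-- In the <head> of each component page, e.g. https://www.example.com/category?page=2 -->
<link rel="canonical" href="https://www.example.com/view-all" />

<!-- The View-All page itself would typically carry a self-referencing canonical -->
<!-- In the <head> of https://www.example.com/view-all -->
<link rel="canonical" href="https://www.example.com/view-all" />
```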
View-All Option Works Well:
- If your pagination does not have so many links or images that the View-All page takes considerable time to load. Five seconds is already stretching the limit for many users, especially on mobile devices. Google's stated preference for this option suggests it has found the option most beneficial. If your View-All pages are too large, it's time to think about breaking your pagination down to more manageable levels.
- If you don’t mind that the View-All page is the only one that is allowed to be indexed in the search engines. This can undermine the main purpose of your pagination, which was to get more page views, as you want users to scroll through the navigation in manageable chunks of data.
Option 2: Block Pagination Beyond Page One
In some instances, you may want to structure your website so that the search engines do not access the paginated series beyond the first page. This means that every product must have internal links from a first page of listings. This can be difficult to structure, but I have seen some sites use this method successfully. It ensures that the crawler will not needlessly crawl unimportant pages and that only your first, main representative page will be indexed. Be cautious with this option, as it prevents search engines from indexing content in the rest of an article or from finding any products listed after the first page. If you need to stuff in additional categorization to link every product URL or article from a first page, this option can have the unintended consequence of a poor user experience, and Google will certainly take notice of that.
Coding Instruction for the Blocking Pagination Option:
- Place a rel="nofollow" attribute on all links to the paginated pages (see the sketch after this list).
- Since the paginated pages will not be crawled, any link equity flowing to those links is lost rather than transferred. To limit this loss, limit the number of pagination links shown on the first page.
- In Google Webmaster Tools, under the Parameter Handling section, set the paginated page parameter to “Paginates” and Google's crawl setting to “No URLs”. This setting requires extreme caution, as parameters can be shared across various sections of the website and changes may have negative unintended consequences. If you are not confident and comfortable with these settings, leave the setting at "Let Googlebot decide".
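As sketched below (hypothetical URLs), each pagination link on page one carries the nofollow attribute:

```html
<!-- On page one: links into the paginated series are tagged rel="nofollow",
     so crawlers are not invited to follow them and no equity is passed -->
<a href="https://www.example.com/category?page=2" rel="nofollow">2</a>
<a href="https://www.example.com/category?page=3" rel="nofollow">3</a>
```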
Blocking Pagination works well if:
- Other pages on the site do not pass link equity to the paginated pages.
- All pages on the site are linked internally on pages the search engines are allowed to crawl and the links are allowed to pass link equity.
Option 3: Implement Pagination Relationships
This option requires the use of rel="next" and rel="prev" tags, which establish the relationship between all pages in a paginated series. This coding relationship protects the paginated pages from being seen as duplicates. The robots "noindex,follow" tag can be implemented on the paginated pages if you believe there is absolutely no purpose for them to surface in the Google index. This method ensures that link equity will not be wasted. The downside is that if you have excessive pagination, the crawlers may get caught up crawling the paginated pages and miss key areas of your site.
Coding Instruction for the Relationship Option:
- Implement the rel="prev" and rel="next" tags to indicate the sequence of paginated pages (a markup sketch follows this list).
- Each page in this paginated series can have the same title tag, meta description and H1 tags. However, if you are allowing the paginated pages to get indexed, you may choose to have targeted keywords in all these tags instead.
- All pages should have the canonical tag set to its own URL and not to the first page. If the URLs have a tracking ID or extra parameters, the canonical tag may need extra consideration.
- If you don’t want the paginated pages to get indexed, set a robots meta tag to “noindex,follow” in the head section of every page in the paginated series, excluding the first page. I will refer to this as Option 3B in the table below.
- In Google Webmaster Tools Parameter Handling, set the paginated page parameter to “Paginates” and allow Google to crawl every URL.
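Putting these pieces together, the <head> of a hypothetical page 2 under this option might look like the sketch below (the noindex line applies only to the Option 3B variant):

```html
<!-- Page 2: self-referencing canonical plus series markup -->
<link rel="canonical" href="https://www.example.com/category?page=2" />
<link rel="prev" href="https://www.example.com/category?page=1" />
<link rel="next" href="https://www.example.com/category?page=3" />

<!-- Option 3B only: keep the page out of the index while still passing link equity -->
<meta name="robots" content="noindex, follow" />
```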
The Relationship Option works well if:
- It can be implemented correctly. This extra coding can be challenging for some sites.
- You don’t have excessive pagination and the crawlers are not having trouble crawling your entire site.
A single site can use one or all of the options shown above. Each pagination template on your site should be reviewed thoroughly to see which option makes sense, and you may choose different options for different content sections. I checked a selection of competitor sites, and they all use Option 2 (block pagination) or Option 3 (paginated series). I again want to stress the major challenges with Option 2: it requires a perfect implementation to work correctly, so the safer choices are Option 1 (View-All page) or Option 3 (paginated series). I would surmise that although Google is promoting Option 1 (View-All page), most webmasters have not figured out how to fit the View-All page into their user experience and therefore will not implement it. However, if Google is promoting the View-All option, I am sure Google has discovered that it is the option searchers prefer, so webmasters may sometimes need to cast aside their own business objectives.
NYTimes and Zocdoc use Option 2 and block all pagination pages from being crawled and indexed. The other sites all use Option 3, with Vitals setting the robots tag on the pagination pages to “nofollow,index”. Avvo’s strategy is a combination of Option 1, with the canonical tag set to the primary page, and Option 2, with the links to pagination tagged as “nofollow”. It is advisable not to mix up or combine the various strategies, or you risk sending the wrong signals to the search engines.
Major Pagination Challenges with All Options:
- Pay close attention to the crawler settings in Webmaster tools and also to your log files. Make sure Google is properly crawling all intended areas of the site.
- Make sure the parameter handling, robots.txt file, robots tags, anchor tag settings (follow or nofollow), and canonical tags all complement each other and are implemented correctly. This is where most sites misconfigure their pagination.
- If your pagination is JavaScript-driven, make sure users can still access the pagination with JavaScript disabled. More importantly, the crawlers may not be able to crawl the paginated pages if they are reachable only through JavaScript.
- Endless pagination is a major concern. If your last pagination page is https://www.example.com/page4, then requests for page5 should return a 404, and page4 should not have a rel=”next” pointing to page 5. This sounds obvious, but it is a common issue that can cause the crawlers to get bogged down and stuck in your pagination.
- Include only crawler-accessible canonical URLs in your XML and HTML sitemaps. URLs that are blocked by robots.txt, carry a “noindex” robots tag, have non-self canonicals, or redirect should not be included. Only the first URL in a paginated series using “next” and “prev” should be included in the sitemaps.
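For instance, a minimal XML sitemap for a series marked up with "next"/"prev" would list only the first page (URL hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Only page one of the series is listed; pages 2..n are omitted -->
  <url>
    <loc>https://www.example.com/category?page=1</loc>
  </url>
</urlset>
```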
Pagination is complicated. I hope this article provides enough insight for you to plan a proper strategy and give the search engines logical paths and quality content. These methods will allow the search engines to crawl efficiently, resulting in strong rankings for your site content.
Hey there Ladies and Gents,
Thanks for such a great post, and all the comments.
I just wanted to chime in with my question, as I am having some issues. I will only look at one issue in this post.
The site blog is hubblehq.com/blog
We are currently using "next" and "prev" tags to indicate the sequential order to the search engines.
If you Google site:hubblehq.com/blog you will see that all 'category' pages are being indexed, which I am not really after. Ideally I would want to see only the first page indexed, as the content on the actual ?page=X URLs is not important. The blog posts are of course important, and that is what I want to get indexed.
Is the answer simply to amend this to <meta ng-if="meta" name="robots" content="noindex, follow" class="ng-scope">?
Does doing the above risk the blog posts themselves no longer being indexed?
Thanks. :)
A complete guide for pagination. Nice article, thanks!
Does that mean Google loves to see a view-all page instead of paginated pages?
Great to see my article is still getting attention years later. The view-all page is certainly ideal in most situations, and where there are a lot of items to view, you can program the page for continuous scrolling with pagination.
B S Tanwar
Thanks for the very useful information that any SEO needs.
Thanks for this informative and comprehensive post, Irving.
I hope you can answer this question for me:
We are using the 3rd method you outlined, the "Relationships" method.
You write, "Each page in this paginated series can have the same title tag, meta description and H1 tags." This is exactly how we implemented it. However, in GWMT ("HTML Improvements") Google is reporting these as duplicate meta descriptions. I also noticed that Rand (in this ancient post) warns that it is not recommended to use the same meta descriptions.
We're considering deleting the meta description from all pages except for the first. I would love to hear your advice.
It's great to hear from a landsman!
Google Webmaster has this to say about "HTML Improvements"
The HTML Improvements page shows you potential issues Google found when crawling and indexing your site. We recommend that you review this report regularly to identify changes that potentially increase your rankings in Google search results pages while providing a better experience for your readers.
The same rings true for other notices in Google WMT as well. Since you know the reason why you are getting the notice, I wouldn't be concerned at all. Rand's suggestion predated Google's recommendations.
I assume you're allowing these pages to be indexed, which in essence means you feel there is some unique content value on each paginated page; as such, a unique title and meta description tag should be considered as well.
Thanks for the swift reply, Irving! I'm a little nervous ignoring Google's message, but if you say so... Do you think it can hurt to take out the meta description, or that it's simply a waste of time?
After reading your piece, I think we should consider making subsequent pages "noindex", since we actually only want the first page to appear in the SERPS.
If you are making them "noindex,follow", the warnings will go away on their own. There is no harm in removing the meta description tag on your paginated pages.
Hi Irving
First of all thanks for insights on the topic.
I have a question about sites which are already indexed and may have a lot of paginated pages indexed.
Just blocking the indexing may create problems with "Unusual activity" warnings from Google.
What would you suggest?
Thanks
H.S.
We use next/prev on paginated pages, but unfortunately Google Webmaster Tools flagged duplicate meta titles and meta descriptions. Example: /example/?p=3, ?p=4, etc. all have the same meta details. Do you have a solution? Thanks
Totally agree. Moz is good, but I think they need to adjust their algorithm.
Thanks, Irving, for a detailed post on Google's Webmaster recommendations for pagination.
@Irving. Great piece of information. You can resolve pagination issues using different methods.
Either you can use rel="prev" and rel="next" or rel="canonical"
I prefer rel="canonical"
Irving Weiss,
Do you think the post needs some editing, as Google has recently published a post on rendering pages? The link is here. https://googlewebmastercentral.blogspot.in/2014/05/...
Hardik,
Thanks for your feedback. While I couldn't publicize this in the article, a very well-known Googler proofread and made some modifications to my article right before it was published. One of the major modifications was the suggestion to downplay the second option, which blocks the crawlers from accessing the pagination files. The Googler did not feel that most sites would implement this correctly, and without full pagination the Google crawler would not be able to access all the pages on their sites. The article was already kind of lengthy, and I didn't even approach the deeper complexities of faceted navigation and the types of code used. The article you linked about Google accessing JS pages is very noteworthy, and every site will need to evaluate the impact on its own. One common point holds for JS, for pagination, and for any page: do what's good for the user.
Thanks
Irving
Hi,
Very nice article... but I have a question about the possibility of having an ecommerce page with pagination and sorting.
For example... /mysite/shoes paginated with /mysite/shoes?page=1 and /mysite/shoes?page=2, etc.
but also I could have /mysite/shoes?page=1&order=1 and /mysite/shoes?page=2&order=1
In this case, how can I mix the "canonical" and "pagination" tags?
Maybe it would be right to set /mysite/shoes?page=1 with "next" to /mysite/shoes?page=2, without a canonical,
and /mysite/shoes?page=1&order=1 with a "canonical" to /mysite/shoes?page=1, with or without next tags?
What do you think?
Thanks for the compliment. The best advice for faceted navigation can be found on the Google Webmaster Central blog: https://googlewebmastercentral.blogspot.com/2014/02/faceted-navigation-best-and-5-of-worst.html. You should prevent the crawlers from accessing pages with the &order=1 parameter. In most circumstances, there is no benefit in having the engines crawl faceted navigation, and it can even be detrimental. The order parameter should be blocked in robots.txt and in WMT parameter handling. The canonical tags aren't relevant if the pages are blocked, but I would set them to the first page of the navigation (/mysite/shoes?page=1).
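A sketch of the robots.txt piece, using the order parameter from your example (Googlebot honors the * wildcard):

```
# Block any URL containing the order parameter
User-agent: *
Disallow: /*order=
```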
I agree, this is really good info for any site that has to deal with the issue of making sure all of their pages are crawlable while ensuring they are compliant with Google's guidelines. I have found that many sites are not intentionally going against the rules, but rather are unaware of the Panda issues that could cause traffic problems simply because they are unintentionally allowing too many spammy pages to be indexed. Nice job Irving.
Whoa, Irving, this looks great! I only took a brief glance through it at the moment, but I am excited to come back to it this evening! This is such a common issue that I have come across recently. I think this will be a great resource. Thank you!
Thank you Irving for providing us very useful info for any site.
Thank you for this information
So I've had rel=prev/rel=next set up on my ecommerce site for months. For some reason, when you search site:domain.com/categoryname in Google, the SERPs return all the pagination pages. I've noticed that my competitors with rel=prev/rel=next only have the one version in the SERPs.
Any ideas on what I may be doing wrong?
Google rep John Mueller says not to noindex canonicalized pages, btw. Were you aware of this and just think he's wrong?
"You're just entering a somewhat undefined state, which is why we don't recommend it. If you say page A is equivalent to B (rel=canonical), and then set page A to be noindex, what does that say about page B? Sure, it can still work, but anytime you give conflicting signals like this you have to consider that they may be interpreted differently in the future (or by other search engines)."
https://plus.google.com/u/0/+JohnBritsios/posts/4m16FC2kgTj
That's 100% correct when using the canonical as a redirect. However, if the canonical tag is a selfie (self-referencing), you can safely set the page to noindex without fear that it sends a conflicting signal.
I agree. Otherwise the millions and millions of Yoast SEO plugin users would be screwed.
There can be many reasons.
Take for example this site.
Their first page of pagination is: https://www.plumbersstock.com/category/6/utility-room/
which has a canonical tag to another version of their page:
<link rel="canonical" href="https://www.plumbersstock.com/category/6/utility-room/?page=1" />
and all paginated pages link to this second version. The pagination must be set up correctly in order for the prev and next tags to work properly.
Also, I would advise this site to drop the explicit robots "index,follow" tag on the paginated pages and let Google decide on its own, or to set them to noindex, since their secondary pages do not provide any special index value over page one.
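For reference, a consistent first-page <head> for this site would self-canonicalize instead of pointing at the parameterized duplicate (the ?page=2 URL is my assumption about how the series continues):

```html
<!-- In the <head> of https://www.plumbersstock.com/category/6/utility-room/ -->
<link rel="canonical" href="https://www.plumbersstock.com/category/6/utility-room/" />
<link rel="next" href="https://www.plumbersstock.com/category/6/utility-room/?page=2" />
```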
Thank you Irving,
You are providing us very useful info for any site.
Thanks,
Aum InfoTech
I've been using Moz for local listings and I found it very helpful. It tells you your position across the web directly and provides reliable analytics.
Totally agreed. Moz Local is one of the most advanced local listing platforms available to online businesses.