Welcome to the newest installment of our educational Next Level series! In our last post, Brian Childs offered up a beginner-level workflow to help discover your competitor's backlinks. Today, we're welcoming back Next Level veteran Jo Cameron to show you how to find low-quality pages on your site and decide their new fate. Read on and level up!
With an almost endless succession of Google updates shaking up the search results, it’s pretty clear that substandard content just won’t cut it.
I know, I know — we can’t all keep up with the latest algorithm updates. We’ve got businesses to run, clients to impress, and a strong social media presence to maintain. After all, you haven’t seen a huge drop in your traffic. It’s probably OK, right?
So what’s with the nagging sensation down in the pit of your stomach? It’s not just that giant chili taco you had earlier. Maybe it’s that feeling that your content might be treading on thin ice. Maybe you watched Rand’s recent Whiteboard Friday (How to Determine if a Page is "Low Quality" in Google's Eyes) and just don’t know where to start.
In this edition of Next Level, I’ll show you how to start identifying your low-quality pages in a few simple steps with Moz Pro's Site Crawl. Once identified, you can decide whether to merge, shine up, or remove the content.
A quick recap of algorithm updates
The latest big fluctuations in the search results were said to be caused by King Fred: enemy of low-quality pages and champion of the people’s right to find and enjoy content of value.
Fred took the fight to affiliate sites, and low-value commercial sites were also affected.
The good news is that even if this isn’t directed at you, and you haven’t taken a hit yourself, you can still learn from this update to improve your site. After all, why not stay on the right side of the biggest index of online content in the known universe? You’ll come away with a good idea of what content is working for your site, and you may just take a ride to the top of the SERPs. Knowledge is power, after all.
Be a Pro
It’s best if we just accept that Google updates are ongoing; they happen all.the.time. But with a site audit tool in your toolkit like Moz Pro's Site Crawl, they don’t have to keep you up at night. Our shiny new Rogerbot crawler is the new kid on the block, and it’s hungry to crawl your pages.
If you haven’t given it a try, sign up for a free 30-day trial:
Set up your Moz Pro campaign — it takes 5 minutes tops — and Rogerbot will be unleashed upon your site like a caffeinated spider.
Rogerbot hops from page to page following links to analyze your website. As it hops along, it builds a beautiful database of your pages and flags issues you can use to find those laggards. What a hero!
First stop: Thin content
Site Crawl > Content Issues > Thin Content
Thin content could be damaging your site. If it’s deemed to be malicious, it could even result in a penalty. Things like zero-value pages stuffed with ads, or spammy doorway pages — little traps set to funnel visitors to other pages — are bad news.
First off, let’s find those pages. Moz Pro Site Crawl will flag a page as "thin content" if it has fewer than 50 words (excluding navigation and ads).
Now is a good time to familiarize yourself with Google’s Quality Guidelines. Think long and hard about whether you may be doing this, intentionally or accidentally.
You’re probably not straight-up spamming people, but you could do better and you know it. Our mantra is (repeat after me): “Does this add value for my visitors?” Well, does it?
Ok, you can stop chanting now.
For most of us, thin content is less of a penalty threat and more of an opportunity. By finding pages with thin content, you have the opportunity to figure out if they're doing enough to serve your visitors. Pile on some Google Analytics data and start making decisions about improvements that can be made.
Using moz.com as an example, I’ve found 3 pages with thin content. Ta-da!
I’m not too concerned about the login page or the password reset page. I am, however, interested to see how the local search page is performing. Maybe we can find an opportunity to help people who land on this page.
Go ahead and export your thin content pages from Moz Pro to CSV.
We can then grab some data from Google Analytics to get an idea of how well this page is performing. You may want to compare monthly data to spot any trends, or compare similar pages to see where improvements can be made.
I am by no means a Google Analytics expert, but I know how to get what I want. Most of the time that is, except when I have to Google it, which is probably every second week.
Firstly: Behavior > Site Content > All Pages > Paste in your URL
- Pageviews - The number of times that page has been viewed, even if it’s a repeat view.
- Avg. Time on Page - How long people spend on your page.
- Bounce Rate - The percentage of single-page sessions with no interaction.
For my example page, Bounce Rate is very interesting. This page lives to be interacted with. Its only joy in life is allowing people to search for a local business in the UK, US, or Canada. It is not an informational page at all. It doesn’t provide a contact phone number or an answer to a query that may explain away a high bounce rate.
I’m going to add Pageviews and Bounce Rate to a spreadsheet so I can track them over time.
I’ll also add some keywords that I want that page to rank for to my Moz Pro Rankings. That way I can make sure I’m targeting searcher intent and driving organic traffic that is likely to convert.
I’ll also know if I’m being outranked by my competitors. How dare they, right?
As we've found with this local page, not all thin content is bad content. Another example may be if you have a landing page with an awesome video that's adding value and is performing consistently well. In this case, hold off on making sweeping changes. Track the data you’re interested in; from there, you can look at making small changes and track the impact, or split test some ideas. Either way, you want to make informed, data-driven decisions.
Action to take for tracking thin content pages
Export to CSV so you can track how these pages are performing alongside GA data. Make incremental changes and track the results.
Second stop: Duplicate title tags
Site Crawl > Content Issues > Duplicate Title Tags
Title tags show up in the search results to give human searchers a taste of what your content is about. They also help search engines understand and categorize your content. Without question, you want these to be well considered, relevant to your content, and unique.
Moz Pro Site Crawl flags any pages with matching title tags for your perusal.
Duplicate title tags are unlikely to get your site penalized, unless you’ve masterminded an army of pages that target irrelevant keywords and provide zero value. Once again, for most of us, it’s a good way to find a missed opportunity.
Digging around your duplicate title tags is a lucky dip of wonder. You may find pages with repeated content that you want to merge, or redundant pages that may be confusing your visitors, or maybe just pages for which you haven’t spent the time crafting unique title tags.
Take this opportunity to review your title tags, make them interesting, and always make them relevant. Because I’m a Whiteboard Friday friend, I can’t not link to this title tag hack video. Turn off Netflix for 10 minutes and enjoy.
Pro tip: To view the other duplicate pages, make sure you click on the little triangle icon to open that up like an accordion.
Hey now, what’s this? Filed away under duplicate title tags I’ve found these cheeky pages.
These are the contact forms we have in place to contact our help team. Yes, me included — hi!
I’ve got some inside info for you all. We’re actually in the process of redesigning our Help Hub, and these tool-specific pages definitely need a rethink. For now, I’m going to summon the powerful and mysterious rel=canonical tag.
This tells search engines that all those other pages are copies of the one true page to rule them all. Search engines like this, they understand it, and they bow down to honor the original source, as well they should. Visitors can still access these pages, and they won’t ever know they've hit a page with an original source elsewhere. How very magical.
Action to take for duplicate title tags on similar pages
Use the rel=canonical tag to tell search engines that https://moz.com/help/contact is the original source.
Review visitor behavior and perform user testing on the Help Hub. We’ll use this information to make a plan for redirecting those pages to one main page and adding a tool type drop-down.
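In case it’s useful, here’s roughly what that first action looks like in practice. This is a minimal sketch of a canonical tag placed in the <head> of each duplicate contact form page; the tool-specific duplicate URL in the comment is hypothetical:

```html
<!-- Placed in the <head> of each duplicate contact page,
     e.g. a hypothetical tool-specific URL like https://moz.com/help/contact/pro -->
<link rel="canonical" href="https://moz.com/help/contact" />
```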
More duplicate titles within my subfolder-specific campaign
Because at Moz we’ve got a heck of a lot of pages, I’ve got another Moz Pro campaign set up to track the URL moz.com/blog. I find this handy if I want to look at issues on just one section of my site at a time.
You just have to enter your subfolder when you set up your campaign, which limits the crawl to that section.
Just remember we won’t crawl any pages outside of the subfolder. Make sure you have an all-encompassing, all-access campaign set up for the root domain as well.
Not enough allowance to create a subfolder-specific campaign? You can filter by URL from within your existing campaign.
In my Moz Blog campaign, I stumbled across these little fellows:
https://moz.com/blog/whiteboard-friday-how-to-get-an-seo-job
https://moz.com/blog/whiteboard-friday-how-to-get-an-seo-job-10504
This is a classic case of new content usurping the old content. Instead of telling search engines, “Yeah, so I’ve got a few pages and they’re kind of the same, but this one is the one true page,” like we did with the rel=canonical tag before, this time I’ll use the big cousin of the rel=canonical, the queen of content canonicalization, the 301 redirect.
All the power (that lovely link equity) is sent to the page you’re redirecting to, along with all the actual human visitors.
Action to take for duplicate title tags with outdated/updated content
Check the traffic and authority for both pages, then add a 301 redirect from one to the other. Consolidate and rule.
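If your site runs on Apache, for instance, the redirect itself can be a one-liner in your .htaccess file. A minimal sketch, assuming the suffixed URL is the outdated one being retired:

```apache
# Permanently (301) redirect the outdated duplicate to the winning page,
# passing its link equity and visitors along.
# Assumes an Apache server with mod_alias enabled; adjust for your setup.
Redirect 301 /blog/whiteboard-friday-how-to-get-an-seo-job-10504 https://moz.com/blog/whiteboard-friday-how-to-get-an-seo-job
```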
It’s also a good opportunity to refresh the content and check whether it's... what? I can’t hear you — adding value to my visitors! You got it.
Third stop: Duplicate content
Site Crawl > Content Issues > Duplicate Content
When the code and content on one page look the same as the code and content on another page of your site, it will be flagged as "Duplicate Content." Our crawler flags any pages with 90% or more overlapping content or code as duplicates.
Officially, in the wise words of Google, duplicate content doesn’t incur a penalty. However, it can be filtered out of the index, so still not great.
Having said that, the trick is in the fine print. One bot’s duplicate content is another bot’s thin content, and thin content can get you penalized. Let me refer you back to our old friend, the Quality Guidelines.
Are you doing one of these things intentionally or accidentally? Do you want me to make you chant again?
If you’re being hounded by duplicate content issues and don’t know where to start, then we’ve got more information on duplicate content on our Learning Center.
I’ve found some pages that clearly have different content on them, so why are these duplicate?
So friends, what we have here is thin content that’s being flagged as duplicate.
There is basically not enough content on these pages for bots to distinguish them from one another. Remember that our crawler looks at all the page code, as well as the copy that humans see.
You may find this frustrating at first: “Like, why are they duplicates?? They're different, gosh darn it!” But once you pass through all 7 stages of duplicate content and arrive at acceptance, you’ll see the opportunity you have here. Why not pop those topics on your content schedule? Why not use the “queen” again and 301 redirect them to a similar resource, combining the power of both? Or maybe, just maybe, you could use them in a blog post about duplicate content — just like I have.
Action to take for duplicate pages with different content
Before you make any hasty decisions, check the traffic to these pages. Maybe dig a bit deeper and track conversions and bounce rate, as well. Check out our workflow for thin content earlier in this post and do the same for these pages.
From there you can figure out if you want to rework content to add value or redirect pages to another resource.
There's an awesome video in the ever-impressive Whiteboard Friday series that talks about republishing. Seriously, you’ll kick yourself if you don’t watch it.
Broken URLs and duplicate content
Another dive into Duplicate Content has turned up two Help Hub URLs that point to the same page.
These are no good to man or beast. They are especially no good for our analytics — blurgh, data confusion! No good for our crawl budget — blurgh, extra useless page! User experience? Blurgh, nope, no good for that either.
Action to take for messed-up URLs causing duplicate content
Zap this time-waster with a 301 redirect. For me this is an easy decision: add a 301 to the long, messed-up URL with a PA of 1, no discussion. I love our new Learning Center so much that I’m going to link to it again so you can learn more about redirection and build your SEO knowledge.
It’s the most handy place to check if you get stuck with any of the concepts I’ve talked about today.
Wrapping up
While it may feel scary at first to have your content flagged as having issues, the real takeaway here is that these are actually neatly organized opportunities.
With a bit of tenacity and some extra data from Google Analytics, you can start to understand the best way to fix your content and make your site easier to use (and more powerful in the process).
If you get stuck, just remember our chant: "Does this add value for my visitors?” Your content has to be for your human visitors, so think about them and their journey. And most importantly: be good to yourself and use a tool like Moz Pro that compiles potential issues into an easily digestible catalogue.
Enjoy your chili taco and your good night’s sleep!
Hey Jo,
Great tips to share.
Duplicate titles are indeed very confusing for both users and crawlers.
I would also add that sometimes we have everything in place (original content, relevant title, meta tags, images, etc.), but if the information isn't presented in a readable format, that can lead to a high bounce rate.
We recently optimized a similar website with good content, but just because the content was not presented in a proper format, they had a high bounce rate. We changed the content format by adding sub-headings, pointers, relevant images/graphics, relevant/similar articles, etc. As a result, we saw a decline in the bounce rate from nearly 80% to 50% within weeks on those pages - and that too, without adding anything new to the website's content.
Thanks
It's great to hear about the real tangible results you've had from reviewing your content format. Give people the content they want with a good user experience and they'll stick around :]
Thanks for the post, Jo. Opportunities are always there to improve past performance, and these steps are a great way to do exactly that.
The only thing that always bothers me in the Moz site audit is seeing the thin content issue flagged for pages that don't require content. Each time I need to ignore them, and I don't know how Google treats pages that don't need much content and so count as thin.
Great point Shalu, thanks for the feedback!
The tool is designed to flag these issues so you can decide whether there’s an opportunity to improve your content.
I know that sometimes you just know you’re not going to make any changes to that page, whether it’s not a priority, or you have no intention of making changes. In this case you can use the Ignore Issue option in the tool. This prevents the same issues from being reported the next time we crawl your site.
The other thing you can do is export your thin content pages to CSV (for triple checking), and then use Ignore Issue Type to ignore all thin content issues in a few clicks :]
I’ve got a guide for ignoring issues here https://moz.com/help/guides/moz-pro-overview/site-crawl/issues
I hope this helps!
Hi Jo, thanks for the post, some great points in there.
I think it's important not to isolate too many stats and remember what the goal of the page is (as Rand mentioned in a recent WBF). For example, on a product or service page, if people are just browsing for a price it will likely garner a high bounce rate and low time on the site. Of course you could argue that your site and content should wow them enough to keep them there, but in today's price-driven economy that's not always possible :)
Yes exactly. It's important to consider the purpose of the page, and then look at how people are behaving on that page to make sure it's serving its purpose.
Sometimes a 'wow' is just being able to find the relevant information quickly and easily :)
Hello Jo
You have convinced me that I need to try at least the 30-day trial and really dig in to see what my pages, especially the low-quality pages, are doing. The first content I put up is somewhere between hot mess and flaming dumpster fire, so I have been wanting to re-up it for some time. There are also the "great ideas" I had that went absolutely nowhere. Ugh, where to start. I guess that's where getting a real site assessment and some no-kidding data would be invaluable. So much SEO to do and so little time. I really appreciate Moz writers like you who teach and inspire me. Super cool that you offered a second trial period too.
I'm loving the idea of a hot mess to flaming dumpster rating scale :D
Let us know how you get on with the trial and digging into those pages, hopefully you can start to prioritize your issues and move out of the dumpster and over to the puppies and rainbows rating scale.
This is a useful way to absorb the guides and manuals. I'm always reading the blog posts and the Q&A, but I don't have the motivation to read all the manuals. I always prefer trial and error, and this is a great way to do it.
Glad to help Roman!
Time for me to subscribe to Moz products. So much insight I am missing in my SEO analysis.
Thanks for the great post!
Get on it Jean-Christophe! :D
Hi Jo,
Great points about thin content and low quality pages. Often, I find with my larger clients that what you tell search engines not to index is just as important as what you want them to index. Many websites would be well served to start using the meta robots "noindex" tag on low value pages with thin or duplicate content.
My criteria for the "noindex" tag is usually if a page has thin content, has not gotten any traffic from organic search over the past few months and has no backlinks. Ideally every page on your website should be valuable enough to garner organic traffic.
I'm really interested to hear of other strategies for dealing with thin content, I know there are probably numerous ways to tackle them. I like that you check traffic and links first to avoid self-sabotage.
NoIndex is a good move for pages you still want people to find through your site, but not through organic search.
Also remember not to disallow noindex pages in your robots.txt, or googlebot will be going in blind and may still index them :]
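For anyone implementing this, the noindex directive itself is just one meta tag in the page's <head>. A minimal sketch:

```html
<!-- In the <head> of the low-value page you want kept out of the index.
     "follow" still lets crawlers pass link equity through the page.
     Don't disallow this page in robots.txt, or googlebot will never
     fetch it and never see this directive. -->
<meta name="robots" content="noindex, follow" />
```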
Hey Jo,
My concern is: if you don't want your thin content to be discovered organically but still want it to be findable on your site, won't it increase the bounce rate, won't it give a poor user experience (since it is thin content), and won't it serve little purpose for the reader?
Please guide me on this.
Hi Shilpa,
Great question!
This really depends on the purpose of the page. A high bounce rate doesn't always mean that the user had a bad experience. One often used example is if your page has a phone number, or business contact details, it may be expected that the visitor will find the information and leave the computer to call or visit. So I would recommend thinking about the purpose of your page to make sure you're providing the best experience.
I hope that helps!
Nice tip. I'll try it on my website!
With on-page SEO, I've found Moz Pro's Site Crawl, which you recommend, to be an excellent tool for duplicate/thin content issues. I find most of my clients have some serious issues with duplicate content, and I often have to explain why it's a huge problem with Google. This post is great information and really breaks everything down perfectly. I'll keep it on file to show my clients. Thanks Jo!
Hi Jo, really great post, thanks. I wasn't even aware of the "content issues" option with "thin content" in the Moz campaigns section. I immediately found some pages on our site that can be improved a lot (like you said, we want to add value for our visitors). So thanks again for these valuable insights.
Hi Jo
If a page doesn't appeal to the customer, it will struggle with search engines too, and vice versa. Great tool for gauging the quality of a page.
Thank you very much for the information
Hi Jo, thanks for the brief insight into another fantastic Moz product. Enticing...
"I can’t not link to this title tag hack video."
I think you added a comma at the end of your link :)
Thanks for the heads up! :) We got that all fixed now -- cheers!
Spurious comma in your link to the 7 Title Tag hacks post there - link 404's!
All fixed -- thanks for letting us know! :)
Great post Jo,
Should I be worried about duplicate content created from tags or categories? WordPress creates archive pages for tags and categories, which can create duplicate content. This shouldn't inherently flag the site, right? Usually this is an issue with new launches; however, as the site grows, it's easier to organize things to prevent duplicate content.
Hi Donald!
That's a great question!
I would recommend noindexing tags using a WordPress plugin. Dan Shure goes into more detail on WordPress setup for SEO and how to treat tags and categories here https://moz.com/blog/setup-wordpress-for-seo-succe...
Dan Shure also offers more help in this thread on our Q&A https://moz.com/community/q/wordpress-tags-duplica...
A quick point on noindexed pages - make sure that you don't disallow googlebot from these pages in your robots.txt or it won't see the noindex.
In terms of the Moz crawler, rogerbot, you can either disallow rogerbot from the noindex pages if you don't want them showing up in your Site Crawl in Moz Pro, or you can use the 'issue-ignore' function to stop them showing up in future.
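As a rough sketch, assuming your tag archives live under /tag/ (swap in your own path), a robots.txt like this keeps rogerbot out while leaving googlebot free to crawl the pages and see the noindex:

```
# Keep rogerbot out of the noindexed tag archives so they
# don't show up in Site Crawl reports.
User-agent: rogerbot
Disallow: /tag/

# Leave googlebot unrestricted so it can still fetch the pages
# and see the noindex directive.
User-agent: googlebot
Disallow:
```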
I hope that helps!
Helped me discover duplicate content more than once. Something you won't find in Search Console (Google shows you only duplicate titles and meta-descriptions there).
Hey!
I am not sure when I should use a canonical tag versus a 301 redirect. I am new to this, so I think the best way to learn will be trial and error.
Thank you for the article!
Great article. There are 2 types of content: unique content, and copied or badly written content. We must focus on good-quality content.
After reading this article, I have decided to try the free trial, learn how to use it 100%, and see if it is for me. I am sure it will be. Thanks for explaining this so well and for sharing your knowledge with us. :)
Great tutorial with a great writing style. I think these simple optimizations are often overlooked! I also loved the way you engage people by offering a second trial period :)