Read about the 5 new upgrades we've added to Site Crawl (as of September 2017) here!
When you're faced with the many thousands of potential issues a large site can have, where do you start? This is the question we tried to tackle when we rebuilt Site Crawl. The answer depends almost entirely on your site and can require deep knowledge of its history and goals, but I'd like to outline a process that can help you cut through the noise and get started.
Simplistic can be dangerous
Previously, we at Moz tried to label every issue as either high, medium, or low priority. This simplistic approach can be appealing, even comforting, and you may be wondering why we moved away from it. This was a very conscious decision, and it boils down to a couple of problems.
First, prioritization depends a lot on your intent. Misinterpreting your intent can lead to bad advice that ranges from confusing to outright catastrophic. Let's say, for example, that we hired a brand-new SEO at Moz and they saw the following issue count pop up:
Almost 35,000 NOINDEX tags?! WHAT ABOUT THE CHILDREN?!!
If that new SEO then rushed to remove those tags, they'd be doing a lot of damage, not realizing that the vast majority of those directives are intentional. We can make our systems smarter, but they can't read your mind, so we want to be cautious about false alarms.
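If you're ever unsure whether a NOINDEX is intentional, it's worth auditing where each directive comes from before removing anything. Here's a minimal Python sketch (the URL list is a placeholder, and this is an illustration rather than anything Site Crawl does under the hood) that reports whether NOINDEX arrives via an X-Robots-Tag header or a meta robots tag:

```python
# A minimal sketch for auditing NOINDEX directives before touching them.
# The URLs are placeholders; swap in pages flagged by your crawl.
import requests
from bs4 import BeautifulSoup

urls = ["https://example.com/page-1", "https://example.com/page-2"]

for url in urls:
    resp = requests.get(url, timeout=10)
    sources = []
    # NOINDEX can come from an HTTP response header...
    if "noindex" in resp.headers.get("X-Robots-Tag", "").lower():
        sources.append("X-Robots-Tag header")
    # ...or from a meta robots tag in the HTML.
    soup = BeautifulSoup(resp.text, "html.parser")
    meta = soup.find("meta", attrs={"name": "robots"})
    if meta and "noindex" in meta.get("content", "").lower():
        sources.append("meta robots tag")
    if sources:
        print(f"{url}: NOINDEX via {', '.join(sources)} -- verify intent before removing")
```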
Second, bucketing issues by priority doesn't do much to help you understand the nature of those problems or how to go about fixing them. We now categorize Site Crawl issues into one of five descriptive types:
- Critical Crawler Issues
- Crawler Warnings
- Redirect Issues
- Metadata Issues
- Content Issues
Categorizing by type allows you to be more tactical. The issues in our new "Redirect" category, for example, are going to have much more in common, which means they potentially have common fixes. Ultimately, helping you find problems is just step one. We want to do a better job at helping you fix them.
1. Start with Critical Crawler Issues
That's not to say everything is subjective. Some problems block crawlers (not just ours, but search engines) from getting to your pages at all. We've grouped these "Critical Crawler Issues" into our first category, and they currently include 5XX errors, 4XX errors, and redirects to 4XX. If you have a sudden uptick in 5XX errors, you need to know, and almost no one intentionally redirects to a 404.
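To make those three categories concrete, here's roughly what a check for them looks like in Python. This is a hand-rolled sketch with placeholder URLs, not how our crawler actually works:

```python
# Illustrative check for the three critical issue types named above:
# 5XX errors, 4XX errors, and redirects that land on a 4XX.
import requests

urls = ["https://example.com/", "https://example.com/old-page"]

for url in urls:
    resp = requests.get(url, allow_redirects=True, timeout=10)
    if resp.status_code >= 500:
        print(f"CRITICAL: {url} returned {resp.status_code} (server error)")
    elif resp.status_code >= 400:
        if resp.history:  # we were redirected before hitting the error
            print(f"CRITICAL: {url} redirects to a {resp.status_code}")
        else:
            print(f"CRITICAL: {url} returned {resp.status_code}")
```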
You'll see Critical Crawler Issues highlighted throughout the Site Crawl interface:
Look for the red alert icon to spot critical issues quickly. Address these problems first. If a page can't be crawled, then every other crawler issue is moot.
2. Balance issues with prevalence
When it comes to solving your technical SEO issues, we also have to balance severity with quantity. Knowing nothing else about your site, I would say that a 404 error is probably worth addressing before duplicate content — but what if you have eleven 404s and 17,843 duplicate pages? Your priorities suddenly look very different.
At the bottom of the Site Crawl home, check out "Moz Recommends Fixing":
We've already done some of the math for you, weighting urgency by how prevalent the issue is. This does require some assumptions about prioritization, but if your time is limited, we hope it at least gives you a quick starting point to solve a couple of critical issues.
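The exact math is beyond the scope of this post, but as a back-of-the-envelope illustration, think of it as severity weighted by prevalence. The weights and counts below are invented for the example, echoing the 404-vs-duplicate scenario above:

```python
# Back-of-the-envelope prioritization: weight each issue type's
# (invented) severity by how many pages it affects. These numbers are
# illustrative only -- not Moz's actual formula.
severity = {"5xx_error": 10, "404_error": 8, "duplicate_content": 3}
counts = {"5xx_error": 2, "404_error": 11, "duplicate_content": 17843}

scores = {issue: severity[issue] * counts[issue] for issue in severity}
for issue, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{issue}: {score:,}")

# With these counts, duplicate content outranks the 404s despite its
# lower per-page severity -- the point of balancing severity with quantity.
```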
3. Solve multi-page issues
There's another advantage to tackling issues with high counts. In many cases, you might be able to solve issues on hundreds (or even thousands) of pages with a single fix. This is where a more tactical approach can save you a lot of time and money.
Let's say, for example, that I want to dig into my 916 pages on Moz.com missing meta descriptions. I immediately notice that some of these pages are blog post categories. So, I filter by URL:
I can quickly see that these pages account for 392 of my missing descriptions — a whopping 43% of them. If I'm concerned about this problem, then it's likely that I could solve it with a fairly simple CMS change, wiping out hundreds of issues with a few lines of code.
In the near future, we hope to do some of this analysis for you, but if filtering isn't doing the job, you can also export any list of issues to CSV. Then, pivot and filter to your heart's content.
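For example, here's a rough sketch of that pivot-and-filter workflow in Python with pandas. The file name and column names ("URL", "Issue Type") are assumptions about the export layout, not a documented schema:

```python
# Filter an exported crawl CSV down to one issue type, then measure how
# much of it a single URL pattern accounts for.
import pandas as pd

df = pd.read_csv("site_crawl_export.csv")  # hypothetical export file

# Isolate missing meta descriptions, then count how many fall under
# one URL pattern (here, blog category pages).
missing = df[df["Issue Type"] == "Missing Meta Description"]
category_pages = missing[missing["URL"].str.contains("/blog/category/", na=False)]

share = len(category_pages) / len(missing)
print(f"{len(category_pages)} of {len(missing)} missing descriptions "
      f"({share:.0%}) are category pages")
```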
4. Dive into pages by PA & crawl depth
If you can't easily spot clear patterns, or if you've solved some of those big issues, what next? Fixing thousands of problems one URL at a time is only worthwhile if you know those URLs are important.
Fortunately, you can now sort by Page Authority (PA) and Crawl Depth in Site Crawl. PA is our own internal metric of ranking ability (primarily powered by link equity), and Crawl Depth is the distance of a page from the homepage:
Here, I can see that there's a redirect chain in one of our MozBar URLs, which is a very high-authority page. That's probably one worth fixing, even if it isn't part of an obvious, larger group.
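If you'd rather do this from the CSV export, the same prioritization is a short sort in pandas. Again, the column names ("URL", "Page Authority", "Crawl Depth") are assumptions for illustration:

```python
# Surface the highest-authority, shallowest pages first, since fixes
# there tend to matter most. Column names are assumed, not documented.
import pandas as pd

df = pd.read_csv("site_crawl_export.csv")
prioritized = df.sort_values(
    by=["Page Authority", "Crawl Depth"],
    ascending=[False, True],  # high PA first, then pages closest to the homepage
)
print(prioritized[["URL", "Page Authority", "Crawl Depth"]].head(10))
```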
5. Watch for spikes in new issues
Finally, as time goes on, you'll also want to be alert to new issues, especially if they appear in large numbers. This could indicate a sudden and potentially damaging change. Site Crawl now makes tracking new issues easy, including alert icons, graphs, and a quick summary of new issues by category:
Any crawl is going to uncover some new pages (the content machine never rests), but if you're suddenly seeing hundreds of new issues of a single type, it's important to dig in quickly and make sure nothing's wrong. In a perfect world, the SEO team would always know what changes other people and teams made to the site, but we all know it's not a perfect world.
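If you want a rough version of this check outside the app, one approach (a sketch, with hypothetical export files and column names) is to diff two crawl exports and count what's new by issue type:

```python
# Diff this week's issue URLs against last week's, grouped by type.
# Assumes two hypothetical exports with "URL" and "Issue Type" columns.
import pandas as pd

last_week = pd.read_csv("crawl_last_week.csv")
this_week = pd.read_csv("crawl_this_week.csv")

key_cols = ["URL", "Issue Type"]
merged = this_week.merge(last_week[key_cols], on=key_cols, how="left", indicator=True)
new_issues = merged[merged["_merge"] == "left_only"]

# Hundreds of new issues of a single type is the red flag to dig into.
print(new_issues["Issue Type"].value_counts())
```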
I hope this gives you at least a few ideas for how to quickly dive into your site's technical SEO issues. If you're an existing customer, you already have access to Moz's new Site Crawl and all of the features discussed in this post.
"Almost 35,000 NOINDEX tags?! WHAT ABOUT THE CHILDREN?!!"
Update: My children were, thankfully, unharmed by the NOINDEX tags.
The new layout looks really nice. Great work Moz team! I think site crawl/technical site fixes can be one of the biggest things holding a site back, and what your team tackled definitely makes it easier to focus on the high priority issues. I still use Screaming Frog to run one-off checks, but this looks great for ongoing site monitoring.
I've been working with the new crawl all day, and it's immensely more useful than the old one! It does a much better job of filtering the issues into different categories, which makes them easier to fix.
Very glad to hear it! We're hoping to make the actionable insights even stronger over the coming months.
Thanks, love these posts that really 'crawl' into the Moz toolset's inner workings.
Thanks for the update. It's very specific and explains why and what to do next - something that we've found difficult to explain to clients without nicely arranged data. The update on new crawl issues is the icing on the cake. We're tracking multiple sites, and it's difficult to find time daily to check each one by one. :)
Moz always comes up with a solution for exactly what I need to fix my website's errors... In fact, it's just what I was looking for.
Is it a coincidence? Dunno...
But very helpful, Peter... thanks a ton!
This was a great article that 'filled in' some areas of SEO and site crawling that I didn't previously know about. Thanks!
Thank you for posting this! As someone who is still learning SEO tactics, this gives me a guideline on how to proceed when the site crawl results are...terrifying. I usually use Raven and Yoast for crawling purposes but now I want to try Moz!
Yeah, great UI, easy to navigate. Love the "How to fix it" section with the pictures! Cheers, Martin
You are a brave man indeed for letting an SEO see raw data, and braver still for having him work on the back end personally to fix issues. Perhaps it's all part of a devious plot where you leverage the company to install your working backup data once the suits crash the whole thing.
I was not aware of the new changes. Spotted 301 issues and got the list by clicking on the portion of the pie graph drawn for 3xx status codes. Just brilliant... great UI!
Thank you very much for the article; it's really interesting.
Is there a way to customize the order of the crawl? Thanks!
How do you solve 5XX errors? Are they dangerous?
That's a bit complex -- if they're persistent (it's possible to just have a temporary outage) then yes, they're a big problem. They're blocking crawlers from seeing those pages. There are a lot of reasons for 500-series errors, but that's generally happening on the server level, so it can get into the specifics of your platform/OS.
Thanks for sharing. All the ideas are great, and they're really helpful when these types of crawl issues happen. I'll be following the guidelines.
Really helpful guide to the new site crawl! Thanks for this.
This is very helpful, thank you very much for this post.
This article was a reminder that we need to keep our websites top notch! Thank You for this post and looking foward to more posts like this!
Thank you for providing such a wonderful rundown of the new features; I really needed this.
Thank you so much once again. Keep it up!
Very interesting article about crawling your website.