The Penguin update sent a strong message: not knowing SEO basics is going to be dangerous going forward. You have to have the basics down or you could be at risk. Penguin is a signal from Google that these updates are going to continue at a rapid pace, and Google doesn't care what color your hat is; it's all about relevance. You need to look at every seemingly viable "SEO strategy" through this lens. What you don't know can hurt you. It's not that what you are doing is wrong or bad; the reality is that the march towards relevance is coming faster than ever before. Google doesn't care what used to work. It is determined to provide relevance, and that means big changes are the new normal.
All that said, doing great SEO is an achievable goal. Make sure you are taking these steps.
1. Understand your link profile
This is essential knowledge post-Penguin. The biggest risk factor is a combination of lots of low quality links with targeted anchor text. There seems to be some evidence of a new 60% threshold for matching anchor text, but don't forget about the future; I recommend at most 2 rankings-focused anchor texts out of every 10 links. The key metrics I look at for this are:
- Anchor text distribution
- The link type distribution (for example, article, comment, directory, etc.)
- Domain Authority and Page Authority distributions
The goal here is to find out what is currently going on and where you should be going. Compare your site with the examples below.
Tools for this:
For anchor text, Open Site Explorer gives you an immediate snapshot of what's going on, while MajesticSEO and Excel can be better for digging into some of the really spammy links.
Great Excel templates for DA/PA analysis
For link type analysis I use Link Detective but it seems to be down at the moment (please come back!).
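To make the anchor text check concrete, here is a minimal Python sketch of a distribution audit. It assumes you have a list of anchor texts exported from a link tool (for example an Open Site Explorer CSV); the example data and the 20% flag threshold (2 out of 10, as suggested above) are illustrative, not published Google numbers.

```python
from collections import Counter

def anchor_distribution(anchors):
    """Return each anchor text's share of the total link profile."""
    counts = Counter(a.strip().lower() for a in anchors)
    total = sum(counts.values())
    return {text: n / total for text, n in counts.items()}

def flag_heavy_anchors(distribution, threshold=0.2):
    """Flag anchors above the threshold share (2 out of 10, per the post).
    Brand-name anchors that trip this check are usually fine; it's the
    keyword-rich ones that carry risk."""
    return {text: share for text, share in distribution.items()
            if share > threshold}

# Illustrative data; in practice, read the anchor column from a link export.
anchors = ["cheap widgets"] * 6 + ["Example Brand"] * 3 + ["click here"]
dist = anchor_distribution(anchors)
print(flag_heavy_anchors(dist))  # "cheap widgets" at 0.6 is well past safe
```

The same Counter approach extends naturally to the link type and DA/PA distributions once you bucket those columns.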
2. Learn what makes a good link
Great links:
- Come from respected brands, sites, people and organizations
- Exist on pages that lots of other sites link to
- Provide value to the user
- Are within the content of the page
- Aren't replicated many times over on the linking site
Those are lofty requirements but there is a lot of evidence that these high value links are really the main drivers of a domain's link authority. At the 1:00 mark Matt Cutts talks about how many links are actually ignored by Google:
That's not to say there isn't wiggle room, but the direction of the future is quite clear: you have no control over how Google or Bing values your links, and there's plenty of evidence that sometimes they get it wrong. The beauty of getting great links is that they aren't just helping you rank; they are VALUABLE assets for your business, SEO value aside. At Distilled this was one of the primary ways we built our business. It's powerful stuff.
3. Map out your crawl path
This is a simple goal but it can be very difficult for larger sites. If your crawl path is really complex and hard to figure out, then it's going to be hard for Google to crawl. There are few bigger wins in SEO than getting content that wasn't previously being indexed out there working for you.
Sitemaps, unfortunately, can only help you so much in terms of getting things indexed. Furthermore, putting the most important pages higher up in the crawl path lets you prioritize which pages get passed the most link authority.
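As a sketch of what "mapping the crawl path" means in practice, the snippet below does a breadth-first walk over a toy internal-link graph and reports each page's click depth from the homepage. The URLs are made up for illustration; real input would come from a crawler export.

```python
from collections import deque

def click_depths(links, start="/"):
    """Breadth-first walk from the homepage; returns each reachable
    page's click depth. Pages missing from the result are orphaned,
    i.e. unreachable by crawling from the start page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Toy internal-link graph (page -> pages it links to).
site = {
    "/": ["/category/", "/about/"],
    "/category/": ["/category/widget-1/", "/category/widget-2/"],
    "/category/widget-2/": ["/category/widget-2/reviews/"],
}
depths = click_depths(site)
print(depths["/category/widget-2/reviews/"])  # 3 clicks from the homepage
```

Pages that never appear in the result, or that sit many clicks deep, are the ones most likely to go unindexed or receive little link authority.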
4. Know about every page type and noindex the low value ones
I have never consulted on a website that didn't have duplicate or thin content somewhere. The real issue is not that duplicate content always causes problems or a penalty, but rather that if you don't understand the structure of your website, you don't know what *could* be wrong. Certainty is a powerful thing; knowing that you can confidently invest in your website is very important.
So how do you do it?
A great place to start is to use Google to break apart the different sections of your site:
- Start with a site search in Google
- Now add on to the search removing one folder or subdomain at a time
- Compare the number you get to the number of pages you expect in that section, and dig deeper if the number seems high
Note: The number of indexed pages that Google features here can be extremely inaccurate; the core idea is to reveal areas for further investigation. As you go through these searches go deeper into the results with inflated numbers. Duplicate and thin content will often show up after the first 100 results.
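The folder-by-folder comparison above can be semi-automated. This sketch counts URLs per top-level section in your own sitemap or crawl export, giving you the "expected" numbers to hold up against Google's (often inaccurate) indexed counts. The example.com URL list is a placeholder.

```python
from collections import Counter
from urllib.parse import urlparse

def pages_per_section(urls):
    """Count URLs per top-level folder so each section's expected size
    can be compared against a site: search scoped to that folder."""
    sections = Counter()
    for url in urls:
        parts = [p for p in urlparse(url).path.split("/") if p]
        sections[parts[0] if parts else "(root)"] += 1
    return sections

# Placeholder URLs; in practice, load these from sitemap.xml or a crawl.
urls = [
    "https://example.com/",
    "https://example.com/blog/penguin-recovery",
    "https://example.com/blog/link-audits",
    "https://example.com/products/widget",
]
counts = pages_per_section(urls)
print(counts)  # if Google reports far more /blog/ pages than 2, dig deeper
```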
5. Almost never change your URLs
It's extremely common to change URLs for reasons like a new design, a new content management system, new software, new apps... But this does serious damage, and even if you manage it perfectly, the 301 redirects cut a small portion of the value of EVERY single link to the page. And no one handles it perfectly. One of my favorite pieces of software, Balsamiq, has several thousand links and 500+ linking root domains pointed at 404s and blank pages. Balsamiq is so awesome that they rank for their head terms anyway, but until you are Balsamiq cool you might need those links.
If you are worried that you have really bad URLs that could be causing problems Dr. Pete has already done a comprehensive analysis of when you should consider changing them. And then you only do it once.
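A small worked example of the decay argument: Google has confirmed a "natural PageRank decay" through 301s but has never published the rate, so the 15% per-hop loss below is only a common community assumption (echoing the 0.85 damping factor from the original PageRank paper). It is used here just to show how chained redirects compound the loss.

```python
def value_after_redirects(hops, retention=0.85):
    """Estimated fraction of link value surviving a chain of 301 hops.
    The 0.85 retention is an ASSUMPTION, not a published Google figure."""
    return retention ** hops

for hops in range(4):
    print(f"{hops} hop(s): {value_after_redirects(hops):.3f}")
# Two chained redirects (old URL -> newer URL -> newest URL) already cost
# roughly a quarter of the value under this assumption.
```

Whatever the true rate, the compounding is the point: every extra URL migration stacks another hop onto every existing link.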
6. Set up SEO monitoring
This is an often overlooked step in the process. As we talked about before, if your content isn't up and indexed, any SEO work is going to go to waste. Will Critchlow has already done a great job outlining how to monitor your website:
- Watch for traffic drops with Google Analytics custom alerts
- Monitor your uptime with services like Pingdom
- Monitor which pages you have noindexed with meta tags or robots.txt (you would be shocked how often this happens by accident)
Some more tools to help you keep an eye out for problems:
- Dave Sottimano's traffic and rankings drop diagnosis tool
- Google Analytics Debugger
- The various rank tracking tools
- SEOmoz's Google Analytics hook, which formats the landing pages sending you traffic into an easy-to-read graph
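The accidental-noindex check in the list above is easy to automate. Here is a minimal sketch using only the Python standard library; fetching the live pages is left out, and the sample HTML and robots.txt rules are made up for illustration.

```python
import urllib.robotparser
from html.parser import HTMLParser

class NoindexChecker(HTMLParser):
    """Detects <meta name="robots" content="...noindex..."> in a page."""
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if (tag == "meta" and a.get("name", "").lower() == "robots"
                and "noindex" in (a.get("content") or "").lower()):
            self.noindex = True

def is_noindexed(html):
    checker = NoindexChecker()
    checker.feed(html)
    return checker.noindex

def blocked_by_robots(robots_txt, url, agent="Googlebot"):
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return not rp.can_fetch(agent, url)

page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print(is_noindexed(page))  # True -> this page would drop out of the index

rules = "User-agent: *\nDisallow: /private/"
print(blocked_by_robots(rules, "https://example.com/private/page"))  # True
```

Run a check like this over your important URLs on a schedule and alert when an answer changes; that is exactly the kind of silent regression this step is meant to catch.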
7. Embrace inbound marketing
To me, inbound marketing is just a logical progression from SEO; thinking about your organic traffic in a vacuum really just doesn't make sense. Dedicate yourself to improving your website for your users and they will reward you. Balsamiq, which I mentioned earlier, is a perfect example of this. I guarantee you they have done little to no SEO, and yet they rank first for their most important keywords and have a Domain Authority of 81. How did they do it? Fewer features.
So what does that really mean? Balsamiq had a rigorous dedication to what their customers really wanted. That's really good marketing, smart business, and intelligent product design all in one. Remember, the future is all about relevance to your users; if you aren't actively seeking this you will get left behind. There is no excuse anymore: there are plenty of proven examples of making seemingly boring page types fascinating and engaging.
Want to learn more?
If you need more high impact changes to your SEO, check out the topic list for SearchLove San Francisco. It's the first time Distilled is doing a conference on the West Coast.
I think it's a good article. But I want to add something to number 4:
I have found another new query (a possible loophole) in Google which gives a list of low quality/thin pages directly. Instead of telling people the query and having Google turn it off, I built a simple tool which returns the data from Google at webtaker.com, and it actually works very well.
Hi Jim. Tool worked great. The data is limited but creates a good place to start when looking across a site.
How do you manage to run the query against Google and not get blocked as 'robot traffic'? Do you use an API, or a 'normal query' via multiple proxies?
Just asking because I find I get challenged by Google if I manually run a series of advanced queries via the search bar - I know I can be quick at running queries but I ain't no robot :-)
Thanks. I use something simple I wrote myself in PHP: curl to fetch a normal URL. I fetch 3 pages, so the total maximum output is 30 results. I use sleep(4); between each fetch, and I randomize a new proxy for each query. Proxies from ipfreelyproxies.net have worked best for me (I am not affiliated).
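For readers curious what that looks like outside PHP, here is a rough Python equivalent of the approach Jim describes: three result pages, a four-second pause between fetches, and a fresh proxy per request. The proxy addresses are placeholders and the actual HTTP fetch is stubbed out; this is a sketch of the pacing logic, not a working scraper.

```python
import random
import time
from urllib.parse import urlencode

PROXIES = ["203.0.113.10:8080", "203.0.113.11:8080"]  # placeholder pool

def google_query_url(query, page=0, per_page=10):
    """Build the results URL for one page of a Google query."""
    params = {"q": query, "start": page * per_page, "num": per_page}
    return "https://www.google.com/search?" + urlencode(params)

def fetch_pages(query, pages=3, delay=4):
    """3 pages max (30 results), a sleep between fetches, and a random
    proxy per request, mirroring the PHP description above."""
    for page in range(pages):
        url = google_query_url(query, page)
        proxy = random.choice(PROXIES)
        # a real implementation would fetch `url` through `proxy` here
        print(url, "via", proxy)
        if page < pages - 1:
            time.sleep(delay)

print(google_query_url("site:example.com", page=1))
```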
Loved the tool Jim, but I'm having second thoughts about it, because the secret ingredient you are using seems to be based on link structure only, rather than content.
Searching a blog made in "BLOGGER" platform can actually show this.
There are two pages of the same quality on my blog. One can be accessed through the homepage link, www.abc.com, while the other page with the same structure and same quality content can be accessed at one of Blogger's ugly navigation links (when you click Older Posts, Blogger gives you an ugly link). And your tool is showing me the ugly link as low quality content, when in reality it is a page with the same quality content as my homepage.
Yes, I have now come to the conclusion that it returns pages in the supplemental results.
That tool is fantastic!
That tool is great. It works like a charm.
Great tool, Jim. Thank you very much. I love competitive analysis, so your tool is proving useful in checking up on our competitors and making sure we're not doing the same.
One thing... I ran it on webtaker.com, and it came up pretty ugly. =) You may want to look at normalizing the "click.php?" parameter. Just something I noticed. Have a great day, and thanks again for the tool!
Nice catch. I bought the domain and didn't realize that there is still junk indexed on it. But all those URLs return a 404 now, so they should be deindexed soon.
Hi Jim,
Great tool, bookmarked and ran.
Sarah
Wow, very cool, but if it pulls up a few product pages on our site that have the same URL structure and no offsite links, just like our other product pages, what does that mean? Why were these flagged?
Yes, sometimes the query (and tool) returns results that it probably should not return. I have yet to determine any common denominator with those URLs. It could just be false positives and that's what I write on the site as well.
Hi Jim,
Nice little tool but I am having the same problem. When I run a query for my site, it returns a number of pages that wouldn't appear to me to be of low quality. It certainly doesn't seem that Google thinks they are low quality because most are ranked in the top 3 (Google Australia).
Thanks,
Brad
Thanks for sharing Jim, though I would caution folks against using this to look for Google's "quality" assessment. I think I know what query it's using, and I believe it would be more appropriate to call it a check for "supplemental" index pages. This generally speaks less to content quality signals than it does to link structure and the hierarchy of these pages within the site. It's still a nice trick for folks looking for what might be filtered as supplemental, but this is different from what Google is doing with Panda, and I think positioning it as such could confuse people.
Hi Tom,
Thanks for your feedback on this. I did a little more research and yes, you are right; it really seems to be the supplemental index pages that are being returned. I did a lot of searches in Google and finally found another query similar to mine that returns an almost identical result set. For some domains it's the same result set, and for some domains the query I use performs better. That would also explain why the tool returned "false positives" of pages with high quality content (it's about link structure and hierarchy).
I have now repositioned the tool as a tool to locate supplemental index pages.
Thanks again.
Thanks for the response Jim. And I think this is a very useful tool for lots of publishers to get a sense of the type of things that Google would house in a supplemental index.
Great tool! Thanks for sharing this Jim!
Hello everyone! I just joined SEOmoz Pro after working in online marketing for years. Glad finally to be a part!
Just wanted to add some things I saw and fixed after Penguin hurt the rankings of one of my personal sites (Buffy the Vampire Slayer Online at www.btvsonline.com). This is based on the idea that Penguin somehow targets overoptimization and (perceived) spam.
1. I had unknowingly overoptimized my URLs. I changed:
https://www.btvsonline.com/buffy-merchandise-collectibles/buffy-mugs/
TO
https://www.btvsonline.com/merchandise-collectibles/mugs/
2. Meta titles were too similar. I had:
- Watch Buffy "Anne" Online for Free
- Watch Buffy "Chosen" Online for Free
- Watch Buffy "Becoming" Online for Free
- Etc.
Changed to:
- Buffy "Anne" (Season 3, Episode 1)
- Buffy "Chosen" (Season 7, Episode 22)
- Buffy "Becoming" (Season 2, Episodes 21 and 22)
3. Identical meta descriptions except for the keyword. This should be obvious. I manually rewrote each one to be entirely different.
I made all these changes, and rankings seem to be on the way back up. I hope this helps people!
Brilliant! I also think brand signals have a lot to do with this update. I've got a few clients whose websites have been hit by this update in several ways, and I've noticed that sites which have done similar amounts of off-page work in similar ways have been affected differently. The only thing consistent among the sites that haven't completely dropped off is the amount of brand recognition they've got compared to the sites that got hit hard.
So on top of the above points (which are very valid), I'd argue right now is the time for brand marketing in every way possible (from a link building perspective, I expect that relates back to point 1).
Great post, I just had one question regarding number 5:
Do perfect 301 redirects really lose link juice?
I only ask this because I've only ever seen this happen when redirects aren't perfect. Doesn't it make sense for Google to pass 100% of the link juice of a page that has stayed 100% identical (I'm talking code here, not content) and the URL has just had some rubbish removed?
I've heard a lot of people say that 301 redirects lose a small amount of link juice. To me it just seems to be one of those myths that gets stuck in the SEO psyche...
In my experience they stick for a while and then lose some of their value (obviously there isn't a perfect test, but I've seen it happen when trying to replace a result for a site with a more accurate and useful page).
Nick, you might want to read this then - https://www.marketingpilgrim.com/2010/03/google-confirms-301-redirects-result-in-pagerank-loss.html - Matt Cutts confirming the "natural PageRank decay" that occurs with 301 redirects. Remember that every 301 redirect introduces latency caused by the server being instructed to look elsewhere.
Thanks Chris for this post, especially because it helps put the Penguin update into a larger search optimization context.
Actually, I agree with you that the main and most visible "factor" of Penguin is the nature and, especially, the diversification of the anchor texts and sources in your link profile.
But I liked that you made a reference to the internal information architecture issue and the duplicated content one, which seem to be the linking chain between Panda and Penguin.
My take, as always when an update appears, is to understand the factors of the update but then, somehow, forget them, so that you're not thinking just in terms of optimizing your site because of that particular update, but simply because it needs to be optimized. Just by doing that you can really improve your site in an "update-independent" way; at least, that is how it works for me.
Finally... do SEO, think SEO, practice SEO... but make SEO invisible.
Finally... do SEO, think SEO, practice SEO... but make SEO invisible.
That is a fantastic quote; treating SEO as if it were independent of updates has always worked for me as well. It can be a tough sell to say no, you shouldn't put all that targeted anchor text in your links, even though you might rank immediately. I think those are the important conversations to have, and they put SEO into a larger conversation: what is the vision for your website? What is it going to be doing for you in 1 year, 5 years, or 10 years?
"Finally... do SEO, think SEO, practice SEO... but make SEO invisible."
I always think of SEO as being like plastic surgery - if it's noticeable, you're doing it wrong.
Thanks a lot for sharing this latest information about Google Penguin, Jim. I think we should always go for a natural way of link building so that we can stay on top of Google.
Nice post! These are definitely great steps. I love the last one as that has been the next logical step for us and it's been wonderful for us and for clients!
Thanks for sharing this Chris.
Incredibly informative post, bookmarking this. Thank you.
All I have to say is WOW! You put a lot of time and effort into this post, it really enlightened me on some aspects of Penguin that I didn't know of before. I look forward to reading more posts by you.
Love to hear it Zachary; thanks for letting me know. It means a lot when your work really does help someone.
Nice post. Yes, Open Site Explorer and distilled.net/excel-for-seo/ are both good tools.
Personally, I recommend Open Site Explorer.
Hola Chris, great article!
I've been doing SEO for a while ("Old School"), but I'm watching how Google is becoming more and more "social".
Panda, Penguin, and the Knowledge Graph are just some indicators of what Google is aiming for: quality search results. The algorithm math is still the same; the objective is changing, trying to build *better* search results for the users.
I don't know how some of the "Old School" SEO techniques are going to work, but at least "Content" is still on stage :) And don't forget "Web design & UX". Sometimes great content needs a good looking design too!
Cheers!
Hey Chris - I really enjoyed the article, it is especially nice to see someone step away from the speculative end of Penguin and give some universal advice that holds truth and value beyond the nitty-gritty details. In a very recent personal experience I saw a penalty for having over 10% of a link profile contain a 3 word anchor text phrase, so I think the threshold has a lot more to do with other signaling factors on the domain in question beyond just a straight ratio. Thanks again.
Thanks Chris for sharing this great, helpful post.
Thanks Chris! got completely sidetracked by the Balsamiq mockups tool..
In regards to changing a URL, what if you remove hashbangs from a URL? Same effect?
This is a great write up, I also use link detective.
It's not just for SEO, Pingdom and GotSiteMonitor actually give us peace of mind during vacations.
This is a good reminder of what's important when getting links, although I think the Penguin update primarily hit sites that were building "very bad links". No good SEO in their right mind would go back to the old linking tactics, which were never a good idea in the first place.
This is a very good article; however, I struggle to have confidence that many of these practices make much of a difference. Let's use one site I manage that has been around for 10+ years. We never ran any SEO or link-building campaigns for this popular site. Every effort was organic up until Panda, and both it and Penguin created a very noticeable problem. The site content is vast, excellent quality, and well curated. Not one suggestion made by "Bionic Webmasters" from Google's forum worked. Many of the suggestions here just didn't apply to this site: you can't "fix" inbound links with better anchor text when they were organically generated over more than a decade. The URLs didn't change and we didn't have any major 404 errors. The only move we made that worked was removing Google Analytics for over 1 month. Traffic jumped over 10% immediately and climbed even more over a few months. After we put Analytics back into the system, Penguin came and continued to cause problems. If there are doubters, I've got the data and a very nice looking site to prove it. Hard to believe.
Having high quality links to your website is absolutely powerful stuff. They can be valuable assets for your online business.
I still don't understand point number 3. How does Google see the crawl path? Is it from the URL, or from clicks starting at the homepage? If I place links to individual products and sub-categories on the homepage, does Google still crawl from category to sub-category to individual product, or does it crawl directly to the sub-category and individual product pages?
Very useful post. I was wondering what your take was on off-site blog creation: having a blog dedicated to a brand on a separate domain. Should links used in the content be consistent with your 2-out-of-10 ratio of rankings-focused anchor texts? Would Penguin flag this, since it primarily only links to the one brand, even though the content is relevant?
I haven't seen an issue with this yet, but I was wondering: as a best practice, should all branded blogs reside on the brand's own site because of Penguin?
The changes Penguin made to the SERPs have distorted search results, often bringing very poor websites to the top. Many good sites have been unfairly penalized; hopefully they will correct the bugs.
I like point 2; it's a really useful tip for building quality links.
Recently, I was working on some projects and I wanted to ask about a keyword and pages problem. My project keywords showed up on the Google search engine instantly, and after some time all the keywords disappeared, and the pages too. Why did they disappear from Google? Is it because of the Google Dance, or did Penguin hit my keywords and pages? There is no duplicate content on the site. If anyone knows the answer to this problem, kindly share it with me.
Thank You
In a world where we always see a lot of articles that say nothing, your article is really good. I was fed up reading articles that were written only to rank for some keywords. OK, your text has some self-promotion and a link at the end, but at least it has good content too. This is what SEO should be about: no one cares if you are doing link building (I'm talking about users, not SEO professionals) as long as you provide good content.
Great article. I'm so thankful that I found this site; I hope to see more posts.
Thanks Chris, a very good summary with insights into how to battle the almighty Penguin of Doom! Some of what you suggest has already been put before some of our clients.
Hi Chris,
I fully agree with your achievable techniques.
The Google Penguin update is aimed at catching web-spam links, right?
Recently one of my websites got a penalty for "phishing". A phishing flag means the website was hacked and someone inserted unnatural code into the pages; that is considered black hat SEO and not safe for eCommerce. Rewarding high quality sites and punishing black hat web-spam is the goal of this algorithm. So stay alert and increase your security :)
And another thing: Google is attempting to listen to webmasters. Cutts recently tweeted, “If you want to report spam that Google is missing, fill out a spam report and add the word penguin." Of course, there are no guarantees when it comes to listings in Google's search engine, but it may be worth a try for those who want to help Google battle web-spam.
Thanks for sharing this excellent post.
I had a similar problem. My WordPress site got hacked and somebody put a bunch of spam links in one of my posts. I recommend that everybody check their CMS, such as WordPress, for unwanted links in their articles, and then make a hack more difficult by hardening their installation against hackers.
Agreed. Seems to have a lot of impact in a post-Penguin world. I'd recommend installing Bulletproof Security, Block Bad Queries and Ultimate Security Checker. And if you find any links to fake pages the spammers have created, block them via robots.txt after deleting.
Hi Jenni,
First of all, thanks for the nice reply.
I agree with your suggestion, but those spammy links are not visible on our site; they only show up in the Webmaster Tools messages.
So how can I delete them?
@Jenni +1 Thank you
Hi Michael,
Thanks for your kind reply.
I'm thinking anchor text distribution is a big contributing factor. When someone links to your content, chances are they aren't using exact match anchor text. If too many of your anchor text phrases are keywords it's going to look suspicious and unnatural.
Thanks for that great post! Some of my pages are suffering heavily from the Penguin update, so I will use your post to review my SEO strategy for further action.
Awesome post Chris. I liked your choice of monitoring tools... I'll have to add a few of them to my arsenal as well.
Thanks a lot for posting such an informative blog article about the Google Penguin update. I think the SEO sector in India and other nations of the world now has to be very careful to avoid over-optimizing any website.
Fantastic post Chris, no-indexing the low value or 'thin' pages is an excellent suggestion and something I hadn't necessarily considered previously.
Would love to see some more regular posts from you here, great work! : )
I am happy to say that Link Detective is back; I was wondering this weekend where it had gone!
There is no fixed rule of thumb for investigating what is lacking in the structure or what the required crawl path is for optimum results; rather, all of that involves constant investigation.
Thank you very much for the ratio of 2 out of 10 links; it is extremely important. I would like to know: if you have 10 links and 2 of them are very good ones (maybe even authorities), would you put your money keywords there, or is it better to put your money keywords in the link text on sites that are less good?
Congrats on making it to the front page Chris.
IMO, the best way to get loved by Google is to first get loved by your visitors and community. Make content that's highly shareable (read: awesome content) and be as natural and as human as you can. The days when you could trick and manipulate search engines are over, plain and simple, and Google has put an emphasis on that with both the big bad Panda and Penguin.
One of the best articles I've read in SEOmoz in a while. Thanks for the insight, and tips.
Thanks that's quite a compliment!
Chris, the beginning of your post seems to suggest that with quick updates, Google is returning results more relevant to the user. My experience has been the opposite: earlier, Google seemed to read my mind. For example, even if I was unable to phrase what I was searching for properly, Google would either return a relevant link or somehow point me in that direction. Now, it just seems to search for a keyword on the page and return that to the user.
In fact, as a webmaster, I was surprised to see my own page in the top 10 results for Google for completely irrelevant keywords - https://www.anindyamozumdar.com/2012/05/disproportionate-influence-of-a-post-title-on-organic-search-rankings/ - it is rare for someone to be unhappy after ranking highly, right?
Another example, a tweet which I posted to show the complete irrelevance of results - https://twitter.com/#!/AnindyaMozumdar/status/203791306818985984/photo/1
Hi Anindya, I think Google is *trying* to serve up the most relevant results to the user but we are far, far away from a perfect world particularly when it comes to longtail searches. There have even been plenty of examples of pure spam ranking like the "car insurance" example someone posted here on SEOmoz a few months ago. In aggregate I think we are seeing more relevant results than ever before but there's always examples to the contrary.
One more thing to note: for true rank tracking, the best way to get clean results is to check them in an incognito-type browser (and in my experience, not Chrome); otherwise you could be seeing your site with personalized results.
True. However, in my experience, Google used to provide me with relevant results; now they are just trying. Hopefully things will "stabilize" and get better over time.
Thanks, Chris!
I wasn't touched by Penguin but that's great info for preventive SEO strategies.
Chris,
A couple of things I like about your article:
1) The importance of knowing where your links are coming from. It seems too many people are concerned with what they want to get, but it's more important to know what is already in the profile so you can balance it out.
2) I agree about thin content. A site we started working with a few weeks back had a handful of great pages and then hundreds of thin pages. Still, the average person doesn't understand that these thin pages are hurting their SEO efforts.
Chris, great article. I've just tried to use Link Detective for the first time and even after 3 hours the report has not run (after I imported the link data from OSE). Is this normal? My CSV only includes 422 links. Can anyone else shed light on this?
Thanks
Yup that's normal, can take up to a day in my experience.
Hey Brad. Link Detective can analyze a report of that size in 30 seconds to 2 minutes, depending on the response time of the pages it fetches. The problem is you have to wait in line behind everyone else who has submitted reports, and most of the reports submitted have a lot more links than that.
I'm obviously grateful when a post like this comes out that features Link Detective and says nice things about it (big thanks to Chris!), but it invariably means a ton of new users sign up and submit reports for analysis... which means everyone waits longer for results.
I'm thinking about rolling out a premium tier of service that would have its own queue and have access to more servers to run reports. I'd need to see if people were interested in that before I'd commit to the additional cost.
I started using the WebTaker.com tool today to check backlinks, and when I ran dmoz.org I was rather startled to see it come up as having low quality links. I am curious as to your thoughts on this result.
What I find interesting about point 2, the Matt Cutts video, is that he's saying Google likes to show a subset of links in its site: command not only for the typical reason an SEO might think of (because Google doesn't want people to game their search) but also because they don't want legit sites shooting themselves in the foot by copying bad link profiles.
The post-Penguin world sucks. I believe Google will turn down the link weighting they've recently introduced, because site owners don't control their link graph, which means that right now you can take down competitors' sites with spammy link building.
They will act, or there will be blood! Here's an example of what's happening:
https://webbactivemedia.net/marketing-blog/285-google-penguin-update-could-it-be-used-as-a-weapon.html
Nice post, Chris. It really emphasizes how much things have changed. The current SEO best practices are pretty much just inbound marketing.
Very true, that's where Google is going and as progressive marketers that's where I think we should all be headed. Right now there are lots of "details" but some day those will all be gone, and then there will just be people who are good at marketing.
Excellent stuff Chris.
I didn't know changing URLs could be damaging EVEN if all the right steps were taken. And Will's monitoring tips and tools were very useful.
Thanks for sharing this valuable information.
It's very helpful for SEOs looking to improve their rankings.
www.chooserank.com
It's a good tool for SEO and beneficial for quality websites.
Thanks Chris. The information you put here is awesome. It is like water to someone who is thirsty.