I wish I didn't have to say this. I wish I could look in the eyes of every victim of the last Panda 4.1 update and tell them it was something new, something unforeseeable, something out of their control. I wish I could tell them that Google pulled a fast one that no one saw coming. But I can't.
Like many in the industry, I have been studying Panda closely since its inception. Google gave us a rare glimpse behind the curtain by providing us with the very guidelines they set in place to build their massive machine-learned algorithm which came to be known as Panda. Three and a half years later, Panda is still with us and seems to still catch us off guard. Enough is enough.
What I intend to show you throughout this piece is that the original Panda questionnaire still remains a powerful predictive tool to wield in defense of what can be a painful organic traffic loss. By analyzing the winner/loser reports of Panda 4.1 using standard Panda surveys, we can determine whether Google's choices are still in line with their original vision. So let's dive in.
The process
The first thing we need to do is acquire a winners and losers list. I picked this excellent one from SearchMetrics although any list would do as long as it is accurate. Second, I proceeded to run a Panda questionnaire with 10 questions on random pages from each of the sites (both the winners and losers). You can run your own Panda survey by following Distilled and Moz's instructions here or just use PandaRisk like I did. After completing these analyses, we simply compare the scores across the board to determine whether they continue to reflect what we would expect given the original goals of the Panda algorithm.
The aggregate results
I actually want to do this a little bit backwards to drive home a point. Normally we would build toward the aggregate results, starting with the details and leaving you with the big picture. But Panda is a big-picture kind of algorithmic update. It is especially focused on the intersection of myriad features, where the sum is greater than the parts. While breaking down these features can give us some insight, at the end of the day we need to stay acutely aware that unless we do well across the board, we are at risk.
Below is a graph of the average cumulative scores across the winners and losers. The top row are winners, the bottom row are losers. The left and right red circles indicate the lowest and highest scores within those categories, and the blue circle represents the average. There is something very important that I want to point out on this graph. The highest individual average score of all the losers is less than the lowest average score of the winners. This means that in our randomly selected data set, not a single loser averaged as high a score as the worst winner. When we aggregate the data together, even with a crude system of averages rather than the far more sophisticated machine learning techniques employed by Google, there is a clear disparity between the sites that survive Panda and those that do not.
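The separation described above is easy to check once you have per-site averages. Below is a minimal sketch in Python using invented scores (the real values come from the PandaRisk surveys discussed earlier) that computes the lowest, mean, and highest score for each group and tests the key claim: the best loser still scores below the worst winner.

```python
# Hypothetical average Panda-survey scores (0-100) per site.
# These numbers are illustrative only, not the actual survey data.
winner_scores = [78, 82, 85, 88, 91]
loser_scores = [52, 58, 61, 64, 70]

def summarize(scores):
    """Return (lowest, mean, highest) for a list of site averages."""
    return min(scores), sum(scores) / len(scores), max(scores)

w_low, w_mean, w_high = summarize(winner_scores)
l_low, l_mean, l_high = summarize(loser_scores)

# The key observation from the graph: even a crude system of averages
# cleanly separates the groups when the best loser scores below the
# worst winner.
groups_separated = max(loser_scores) < min(winner_scores)
print(groups_separated)
```

With real survey data you would feed in one averaged score per site; the same three summary numbers correspond to the left circle, blue circle, and right circle on the graph.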
It is also worth pointing out here that there is no positive Panda algorithm to our knowledge. Sites that perform well on Panda do not see boosts because Google is giving them ranking preference; rather, their competitors have seen ranking losses, or their own previous Panda penalties have been lifted. In either scenario, we should remember that performing well on Panda assessments isn't necessarily going to increase your rankings, but it should help you sustain them.
Now, let's move on to some of the individual questions. We are going to start with the least correlated questions and move to those which most strongly correlate with performance in Panda 4.1. While all of the questions had positive correlations, a few lacked statistical significance.
Insignificant correlation
The first question which was not statistically significant in its correlation with Panda performance was "This page has visible errors on it". The scores have been inverted here so that the higher the score, the fewer respondents reported that the page has errors. You can see that while more respondents did say that the winners had no visible errors, the difference was very slight. In fact, there was only a 5.35% difference between the two. I will save comment on this until after we discuss the next question.
The second question which was not statistically significant in its correlation with Panda performance was "This page has too many ads". The scores have once again been inverted so that the higher the score, the fewer respondents reported that the page has too many ads. This was even closer: the winners performed only 2.3% better than the losers in Panda 4.1.
I think there is a clear takeaway from these two questions. Nearly everyone gets the easy stuff right, but that isn't enough. First, a lot of pages have no ads whatsoever because that isn't their business model. Even those that do have ads have caught on for the most part and optimized their pages accordingly, especially given that Google has other layout algorithms in place aside from Panda. Moreover, content inaccuracy is more likely to impact scrapers and content spinners than most sites, so it is unsurprising that few, if any, respondents reported that the pages were filled with errors. If you score poorly on either of these, fixing them is only the beginning, because most websites already get these right.
Moderate correlation
A number of Panda questions drew statistically significant differences in means, but there was still substantial crossover between the winners and losers. Whenever the average of the losers was greater than the lowest of the winners, I considered it only a moderate correlation. While the difference between means remained strong, there was still a good deal of variance in the scores.
The first of these to consider was the question of whether the content was "trustworthy". You will notice a trend in many of these questions: they involve a great deal of subjective human opinion. This subjectivity plays out quite a bit when the topics of the sites fall into very different categories of knowledge. For example, a celebrity fact site might be seen as very trustworthy (even if it is ad-laden), while an opinion piece in the New Yorker on the same celebrity might not be, even though it is plainly labeled as opinion. The trustworthiness question ties back nicely to the "does this page have errors" question, drawing attention to the difference between a subjective and an objective question, and to the way asking respondents for more of a personal opinion spreads the means apart. This might seem unfair, but in the real world your site, and Google itself, is judged by that subjective opinion, so it is understandable why Google wants to get at it algorithmically. Nevertheless, there was a strong difference in means between winners and losers of 12.57%, more than double the difference we saw on the question of errors.
Original content has long been a known requirement of organic search success, so no one was surprised when it made its way into the Panda questionnaire. It still remains an influential piece of the puzzle, with a difference in means of nearly 20%. It was only barely ruled out as a heavily correlated feature because the lowest-scoring winner fell just short of the losers' average. Notice, though, that one of the winners scored a perfect 100% on the survey, despite hundreds of respondents. It can be done.
As you can imagine, perception on what is and is not an authority is very subjective. This question is powerful because it pulls in all kinds of assumptions and presuppositions about brand, subject matter, content quality, design, justification, citations, etc. This likely explains why this question is beleaguered by one of the highest variances on the survey. Nevertheless, there was a 13.42% difference in means. And, on the other side of the scale, we did see what it is like to have a site that is clearly not an authority, scoring the worst possible 0% on this question. This is what happens when you include highly irrelevant content on your site just for the purpose of picking up either links or traffic. Be wary.
Everyone hates the credit card question, and there is huge variance in the answers. At least one site survived Panda despite scoring 5% on this question. Notice that there is a huge overlap between the lowest winner and the average of the losing sites. Also, the placement of the mean (blue circle) in the winners category shows the average wasn't skewed to the right by just one outlier; there was strong variance in the responses across the board. The same was true of the losers. However, with a +15% difference in means, there was a clear average differentiation between the performance of winners and losers. Once again, though, we are drawn back to that aggregate score at the top, where we see how Google can use all these questions together to build a much clearer picture of site and content quality. For example, it is possible that Google pays more attention to this question when it is analyzing a site that has other features like the words "shopping cart" or "check out" on the homepage.
I must admit that the bookmarking question surprised me. I always considered it the most subjective of the bunch. It seemed unfair that a site might be judged because it has material that simply doesn't appeal to the masses. The survey just didn't bear this out, though. There was a clear difference in means, but after comparing sites from similar content categories, there just wasn't any reason to believe that a bias was created by subject matter. The 14.64% difference seemed to be, editorially speaking, related more to the construction of the page and the quality of the content than to the topic being discussed. Perhaps a better way to think about this question is: would you be embarrassed if your friends knew THIS was the site you were getting your information from rather than another?
This wraps up the 5 questions that had good correlations but enough variance that it was possible for the highest loser to beat out the lowest winner. I think one clear takeaway from this section is that these questions, while harder to improve upon than the Low Ads and No Errors questions before, are completely within the webmaster's grasp. Making your content and site appear original, trustworthy, authoritative, and worthy of bookmarking isn't terribly difficult. Sure, it takes some time and effort, but these goals, unlike the next, don't appear that far out of reach.
Heavy correlation
The final three questions that seemed to distinguish the most between the winners and losers of Panda 4.1 all had high difference-in-means and, more importantly, had little to no crossover between the highest loser and lowest winner. In my opinion, these questions are also the hardest for the webmaster to address. They require thoughtful design, high quality content, and real, expert human authors.
The first question that met this classification was "could this content appear in print". With a difference in means of 22.62%, the winners thoroughly trounced the losers in this category. Their sites and content were just better designed and better written. They showed the kind of editorial oversight you would expect in a print publication. The content wasn't trite and unimportant; it was thorough and timely.
The next heavily correlated question was whether the page was written by experts. With over a 34% difference in means between the winners and losers, and literally no overlap at all between the winners' and losers' individual averages, it was clearly the strongest question. You can see why Google would want to look into things like authorship when they knew that expertise was such a powerful distinguisher between Panda winners and losers. This raises the question: who is writing your content, and do your readers know it?
Finally, insightful analysis had a huge difference in means of +32% between winners and losers. It is worth noting that the highest loser is an outlier, as shown by the skewed mean (blue circle) sitting closer to the bottom than the top; most of the answers were closer to the lower score. Thus, the overlap is exaggerated a bit. But once again, this draws us back to the original conclusion: the devil is not in the details, the devil is in the aggregate. You might be able to score highly on one or two of the questions, but it won't be enough to carry you through.
The takeaways
OK, so hopefully it is clear that Panda really hasn't changed all that much. The same questions we looked at for Panda 1.0 still matter. In fact, I would argue that Google is just getting better at algorithmically answering those same questions, not changing them. They are still the right way to judge a site in Google's eyes. So how should you respond?
The first and most obvious thing is you should run a Panda survey on your (or your clients') sites. Select a random sample of pages from the site. The easiest way to do this is get an export of all of the pages of your site, perhaps from Open Site Explorer, put them in Excel and shuffle them. Then choose the top 10 that come up. You can follow the Moz instructions I linked to above, do it at PandaRisk, or just survey your employees, friends, colleagues, etc. While the latter probably will be positively biased, it is still better than nothing. Go ahead and get yourself a benchmark.
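The shuffle-and-pick-10 step above doesn't require Excel; here is a minimal sketch in Python (the URL list is hypothetical) that draws a random sample of pages without replacement:

```python
import random

# Hypothetical export of your site's pages, e.g. from Open Site Explorer
# or a crawl. Replace with your actual URL list.
all_pages = [f"https://example.com/page-{i}" for i in range(1, 201)]

# Shuffle and take the top 10, as described above. Seeding the generator
# is optional; it just makes the sample reproducible between runs.
rng = random.Random(42)
sample = rng.sample(all_pages, k=10)  # sampling without replacement

for url in sample:
    print(url)
```

Run the same 10 URLs through each survey iteration so your benchmark and follow-up scores are comparable.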
The next step is to start pushing those scores up one at a time. I give some solid examples on the Panda 4.0 release article about improving press release sites, but there is another better resource that just came out as well. Josh Bachynski released an amazing set of known Panda factors over at his website The Moral Concept. It is well worth a thorough read. There is a lot to take in, but there are tons of easy-to-implement improvements that could help you out quite a bit. Once you have knocked out a few for each of your low-scoring questions, run the exact same survey again and see how you improve. Keep iterating this process until you beat out each of the question averages for winners. At that point, you can rest assured that your site is safe from the Panda by beating the devil in the aggregate.
Hey folks, I'll try to respond to all of these today. Thanks for your comments. Also, if you want to come ask in person, I'm at Pubcon this week and should be at the Remove'em table most of the time.
Thanks for the link to the known Panda factors on The Moral Concept. I'd never seen that before. Fascinating stuff! Nice write up btw.
thanks for the link Russ. i may even have a Whiteboard Friday coming up on my COMPLETE Panda Google Leaked Do and Don't List !!! STAY TUNED
Looking forward to it!
Yes Russ, it would be a great subject for a WF! I didn't know the PANDA Do & Don’t LIST, this is worth its weight in gold! Thanks Russ.
That is a ripper article, ta.
Great article, cheers for sharing. :)
Deliver high-value content to your audience, make it viral, and voila! You will never face issues from Panda.
Hey Russ,
This is one of the best Moz posts I've read in a while. The details of Panda have been a bit of an enigma to me for way too long (admittedly I haven't done enough reading on it). So, thanks.
PS. The link to Moz on Remove'em's footer still points to seomoz.org. I'm sure the lads and ladies at the mozplex don't mind, but I'm one of those guys that believes the devil IS in the detail (but ALSO in the aggregate, now!).
That will be all. Please and thank you.
Thanks! Fixed the link to Moz and added another of our data sources to that list!
Very useful information, Thank you Jones :)
I was surprised that ads didn't make more of a difference.
frankly so was I :) but it bore itself out time and time again in the data
Great Post..we need to face this devil panda 4.1
Great post. With Penguin here any second now, any tips for that too? ;)
Check out the post on Moz called Penguin Hunting Season - the model was re-validated with Penguin 2.1, so it is the best we've got.
What a way to start the week ;)
Excellent post. Thank you.
whoa, that's a lot - I'll have to read it later - but I think there are some panda/penguin mistakes here (from the last Panda 4.1)
That happens to me too - I wrote so much about panda and penguin these last weeks :) They go hand in hand - in my head it's only a penga actually...
thanks for pointing out the error. Dr Pete fixed it for me. Thanks Moz team!
No problem, guess it happens to me too in some posts ;)
Great to see numbers on Google Panda 4.1 (and tests on websites). Do you think it's becoming more semantic with each update?
And by the way, the article was terrific. Thanks for the insight and the data, Russ.
thanks for the response. Frankly, I don't know. The link to Josh's list might give some insight. The problem with validating Panda's underlying factors is that a good deal of those factors - like interaction between search users and the listing in the search results - are simply not available to us to build into the model. This is why we can't build a machine-learned correlate to Panda the same way we did with Penguin in the Open Penguin Data project.
I think one of the biggest problems people are running into is duplicate content due to their social media campaigns. Maybe Facebook and Google+ are indexed faster than your site. And you probably copy/pasted some stuff from that blog post, didn't you? It was easier, after all, and you were in a rush. Social strategies need to change.
This is interesting, but I think probably not a huge deal. Most social content is ephemeral in nature (sticks around for a while but then goes away) and is also a shortform version.
Hi, Russ, with regard to "who is writing your content", didn't Google recently announce they dropped authorship and don't use it in their ranking (I personally just removed all my rel=author links)?
correct, they did. However, Google can detect expertise through simple bag-of-words analysis, like Latent Dirichlet Allocation, by distinguishing content that carries language more distinct than random content. Think, for example, of an article written on Odesk by a novice. The content will have the main keywords, but the filler content will be generally similar to other content on the web, because the author won't know the inside language that an expert would. So a health article on heart disease is more likely to include the word "ventricular" if it is written by an expert. Notice that Google doesn't need to know semantically that ventricular is related; it merely needs to see that this content uses words, in conjunction with the primary keywords, that just aren't part of your average content.
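The "inside language" idea above can be illustrated with a toy sketch, much simpler than LDA: flag words in a candidate text that never appear in a generic background corpus. The two sample texts and the corpora are invented for illustration; a real system would use large corpora and proper topic modeling.

```python
from collections import Counter
import re

# Invented sample texts: one generic, one expert-sounding.
generic = "heart disease is a serious condition that affects many people"
expert = "ventricular remodeling after myocardial infarction drives heart failure"

def tokens(text):
    """Lowercase word tokens; a crude stand-in for real tokenization."""
    return re.findall(r"[a-z]+", text.lower())

# Word counts for the generic background text.
background = Counter(tokens(generic))

# Words in the candidate text absent from the background corpus:
# a rough proxy for the distinctive vocabulary an expert would use.
distinctive = [w for w in tokens(expert) if background[w] == 0]
print(distinctive)
```

Here "ventricular" and "myocardial" surface as distinctive, while shared words like "heart" do not; scaled up over real corpora, that asymmetry is the signal the comment describes.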
Very interesting. So, we just need to sprinkle throughout a page some rare professional terms to prove our expertise? (-:
Google didn't get rid of authorship, they took the authorship photos out of the SERPs.
That's 2 COMPLETELY different things...
Authorship didn't go away. Authorship photos in SERP went away. Google still has and uses authorship.
They dropped the authorship program. https://plus.google.com/u/0/+JohnMueller/posts/HZf...
They have instructed users to instead use Schema markup which existed prior to the Google Authorship program.
Does Google perhaps still look at authorship as a metric? Sure, who knows. But the Authorship Program proper is now gone.
Wow, these are great insights and some serious efforts have gone into getting these awesome analytics. Loved it.
Thanks for the details about Google Panda
It's quite fascinating. I had never seen this aspect of Panda.
No doubt SEO needs great content now; Panda 4.1 is doing its job. Researching a few sites, some got their best results after the Panda updates, while others had no luck with Panda 4.1.
I have seen several posts on this topic, it is very interesting to read and learn different points of view of each writer
Thanks, found this really useful, especially the moralconcept stuff I'd not seen before. Used PandaRisk for 10 sites, it's pretty good (as a reference point) - although I need to check with them as two of my sites have just come back with a blank PDF.
Cheers.
Hey Steve, thanks for the response! If you got blank PDFs it means the report failed to run, they will fix it for you.
Great insights @Russ. Love the way the Panda analysis has been brought about. Have implemented related techniques on a few of our clients and they have also recovered. Thanks MOZ Team.
Great analysis as always. Why do you think your best-scoring loser lost and why do you think your worst-scoring winner won? They were very similar in their overall score.
Thanks for giving a brief description of Panda
Great stuff!
Another factor that would be interesting to test: Is this website updated frequently? or Is this website up-to-date?
We believe that timeliness and frequent updating also positively impacts SEO, but it would be interesting to see whether users' perception of frequency or timeliness correlates.
Good Job Russ!!