Sometime in the last week, the first Penguin update in over a year began to roll out (Penguin 2.1 hit around October 4, 2013). After a year, emotions were high, and expectations were higher. So, naturally, people were confused when MozCast showed the following data:
The purple bar is Friday, October 17th, the day Google originally said Penguin 3.0 rolled out. Keep in mind that MozCast is tuned to an average temperature of roughly 70°F. Friday’s temperature was slightly above average (73.6°), but nothing in the last few days indicates a change on the scale of the original Penguin update. For reference, Penguin 1.0 measured a scorching 93°F.
So, what happened? I’m going to attempt to answer that question as honestly as possible. Fair warning – this post is going to dive very deep into the MozCast data. I’m going to start with the broad strokes, and paint the finer details as I go, so that anyone with a casual interest in Penguin can quit when they’ve seen enough of the picture.
What’s in a name?
We think that naming something gives us power over it, but I suspect the enchantment works both ways – the name imbues the update with a certain power. When Google or the community names an algorithm update, we naturally assume that update is a large one. What I’ve seen across many updates, such as the 27 named Panda iterations to date, is that this simply isn’t the case. Panda and Penguin are classifiers, not indicators of scope. Some updates are large, and some are small – updates that share a name share a common ideology and code-base, but they aren’t all equal.
Versioning complicates things even more – if Barry Schwartz or Danny Sullivan name the latest update “3.0”, it’s mostly a reflection that we’ve waited a year and we all assume this is a major update. That feels reasonable to most of us. That doesn’t necessarily mean that this is an entirely new version of the algorithm. When a software company creates a new version, they know exactly what changed. When Google refreshes Panda or Penguin, we can only guess at how the code changed. Collectively, we do our best, but we shouldn’t read too much into the name.
Was this Penguin just small?
Another problem with Penguin 3.0 is that our expectations are incredibly high. We assume that, after waiting more than a year, the latest Penguin update will hit hard and will include both a data refresh and an algorithm update. That’s just an assumption, though. I firmly believe that Penguin 1.0 had a much broader, and possibly much more negative, impact on SERPs than Google believed it would, and I think they’ve genuinely struggled to fix and update the Penguin algorithm effectively.
My beliefs aside, Pierre Far tried to clarify Penguin 3.0’s impact on Oct 21, saying that it affected less than 1% of US/English queries, and that it is a “slow, worldwide rollout”. Interpreting Google’s definition of “percent of queries” is tough, but the original Penguin (1.0) was clocked by Google as impacting 3.1% of US/English queries. Pierre also implied that Penguin 3.0 was a data “refresh”, and possibly not an algorithm change, but, as always, his precise meaning is open to interpretation.
So, it’s possible that the graph above is correct, and either the impact was relatively small, or that impact has been spread out across many days (we’ll discuss that later). Of course, many reputable people and agencies are reporting Penguin hits and recoveries, which raises the question – why doesn’t their data match ours?
Is the data just too noisy?
MozCast has shown me with alarming clarity exactly how messy search results can be, and how dynamic they are even without major algorithm updates. Separating the signal from the noise can be extremely difficult – many SERPs change every day, sometimes multiple times per day.
More and more, we see algorithm updates where a small set of sites are hit hard, but the impact over a larger data set is tough to detect. Consider the following two hypothetical situations:
The data points on the left have an average temperature of 70°, with one data point skyrocketing to 110°. The data points on the right have an average temperature of 80°, and all of them vary between about 75-85°. So, which one is the update? A tool like MozCast looks at the aggregate data, and would say it’s the one on the right. On average, the temperature was hotter. It’s possible, though, that the graph on the left represents a legitimate update that impacted just a few sites, but hit those sites hard.
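To make that contrast concrete, here's a minimal sketch with made-up numbers (not MozCast data): an aggregate mean flags the broad-but-modest shift on the right while barely registering the single extreme outlier on the left.

```python
# Hypothetical per-SERP temperatures for the two scenarios described above.
left = [70, 68, 71, 69, 110, 70, 72, 69, 68, 71]   # one SERP spikes to 110°
right = [78, 82, 80, 75, 85, 79, 83, 77, 81, 80]   # everything runs roughly 75-85°

def mean(temps):
    return sum(temps) / len(temps)

print(mean(left))   # 73.8°, barely above a 70° baseline
print(mean(right))  # 80.0°, the aggregate metric flags this one

# A simple outlier check (any SERP more than 30° above the day's median)
# would flag the left-hand scenario instead.
median_left = sorted(left)[len(left) // 2]
print([t for t in left if t - median_left > 30])  # [110]
```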
Your truth is your truth. If you were the red bar on the left, then that change to you is more real than any number I can put on a graph. If the unemployment rate drops from 6% to 5%, the reality for you is still either that you have a job or don’t have a job. Averages are useful for understanding the big picture, but they break down when you try to apply them to any one individual case.
The purpose of a tool like MozCast, in my opinion, is to answer the question “Was it just me?” We’re not trying to tell you if you were hit by an update – we’re trying to help you determine if, when you are hit, you’re the exception or the rule.
Is the slow rollout adding noise?
MozCast is built around a 24-hour cycle – it is designed to detect day-over-day changes. What if an algorithm update rolls out over a couple of days, though, or even a week? Is it possible that a relatively large change could be spread thin enough to be undetectable? Yes, it’s definitely possible, and we believe Google is doing this more often. To be fair, I don’t believe their primary goal is to obfuscate updates – I suspect that gradual rollouts are just safer and allow more time to address problems if and when things go wrong.
While MozCast measures in 24-hour increments, the reality is that there’s nothing about the system limiting it to that time period. We can just as easily look at the rate of change over a multi-day window. First, let’s stretch the MozCast temperature graph from the beginning of this post out to 60 days:
For reference, the average temperature for this time period was 68.5°. Please note that I’ve artificially constrained the temperature axis from 50-100° – this will help with comparisons over the next couple of graphs. Now, let’s measure the “daily” temperature again, but this time we’ll do it over a 48-hour (2-day) period. The red line shows the 48-hour flux:
It’s important to note that 48-hour flux is naturally higher than 24-hour flux – the average of the 48-hour flux for these 60 days is 80.3°. In general, though, you’ll see that the pattern of flux is similar. A longer window tends to create a smoothing effect, but the peaks and valleys are roughly similar for the two lines. So, let’s look at 72-hour (3-day) flux:
The average 72-hour flux is 87.7° over the 60 days. Again, except for some smoothing, there’s not a huge difference in the peaks and valleys – at least nothing that would clearly indicate the past week has been dramatically different from the past 60 days. So, let’s take this all the way and look at a full 7-day flux calculation:
I had to bump the Y-axis up to 120°, and you’ll see that smoothing is in full force – making the window any larger is probably going to risk over-smoothing. While the peaks and valleys start to time-shift a bit here, we’re still not seeing any obvious climb during the presumed Penguin 3.0 timeline.
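For the curious, the windowing itself is trivial. Here's a minimal sketch (not MozCast's production code), where flux_between stands in for the per-day comparison described later in this post; only the window size changes as we move from 24-hour to 7-day flux:

```python
def windowed_flux(dates, flux_between, window_days=1):
    """Compare each day to `window_days` earlier: 1 = 24-hour, 2 = 48-hour, 7 = weekly flux.

    `dates` is a chronological list of crawl dates, and `flux_between(old_day, new_day)`
    is a hypothetical function returning the temperature between two crawls.
    """
    return [flux_between(dates[i - window_days], dates[i])
            for i in range(window_days, len(dates))]

# Usage (illustrative):
#   temps_24h = windowed_flux(dates, flux_between, 1)
#   temps_72h = windowed_flux(dates, flux_between, 3)
#   temps_7d  = windowed_flux(dates, flux_between, 7)
```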
Could Penguin 3.0 be spread out over weeks or a month? Theoretically, it’s possible, but I think it’s unlikely given what we know from past Google updates. Practically, this would make anything but a massive update very difficult to detect. Too much can change in 30 days, and that base rate of change, plus whatever smaller updates Google launched, would probably dwarf Penguin.
What if our keywords are wrong?
Is it possible that we’re not seeing Penguin in action because of sampling error? In other words, what if we’re just tracking the wrong keywords? This is a surprisingly tough question to answer, because we don’t know what the population of all searches looks like. We know what the population of Earth looks like – we can’t ask seven billion people to take our survey or participate in our experiment, but we at least know the group that we’re sampling. With queries, only Google has that data.
The original MozCast was publicly launched with a fixed set of 1,000 keywords sampled from Google AdWords data. We felt that a fixed data set would help reduce day-over-day change (unlike using customer keywords, which could be added and deleted), and we tried to select a range of phrases by volume and length. Ultimately, that data set did skew a bit toward commercial terms and tended to contain more head and mid-tail terms than very long-tail terms.
Since then, MozCast has grown to what is essentially 11 weather stations of 1,000 different keywords each, split into two sets for analysis of 1K and 10K keywords. The 10K set is further split in half, with 5K keywords targeted to the US (delocalized) and 5K targeted to 5 cities. While the public temperature still usually comes from the 1K set, we use the 10K set to power the Feature Graph and as a consistency check and analysis tool. So, at any given time, we have multiple samples to compare.
So, how did the 10K data set (actually, 5K delocalized keywords, since local searches tend to have more flux) compare to the 1K data set? Here’s the 60-day graph:
While there are some differences in the two data sets, you can see that they generally move together, share most of the same peaks and valleys, and vary within roughly the same range. Neither set shows clear signs of large-scale flux during the Penguin 3.0 timeline.
Naturally, there are going to be individual SEOs and agencies that are more likely to track clients impacted by Penguin (who are more likely to seek SEO help, presumably). Even self-service SEO tools have a certain degree of self-selection – people with SEO needs and issues are more likely to use them and to select problem keywords for tracking. So, it’s entirely possible that someone else’s data set could show a more pronounced Penguin impact. Are they wrong or are we? I think it’s fair to say that these are just multiple points of view. We do our best to make our sample somewhat random, but it’s still a sample and it is a small and imperfect representation of the entire world of Google.
Did Penguin 3.0 target a niche?
Insofar as every algorithm update targets only a select set of sites, pages, or queries, then yes – every update is a "niche" update. The only question we can pose to our data is whether Penguin 3.0 targeted a specific industry category/vertical. The 10K MozCast data set is split evenly into 20 industry categories. Here's the data from October 17th, the supposed date of the main rollout:
Keep in mind that, split 20 ways, the category data for any given day is a pretty small set. Also, categories naturally stray a bit from the overall average. All of the 20 categories recorded temperatures between 61.7-78.2°. The "Internet & Telecom" category, at the top of the one-day readings, usually runs a bit above average, so it's tough to say, given the small data set, if this temperature is meaningful. My gut feeling is that we're not seeing a clear, single-industry focus for the latest Penguin update. That's not to say that the impact didn't ultimately hit some industries harder than others.
What if our metrics are wrong?
If the sample is fundamentally flawed, then the way we measure our data may not matter that much, but let’s assume that our sample is at least a reasonable window into Google’s world. Even with a representative sample, there are many, many ways to measure flux, and all of them have pros and cons.
MozCast still operates on a relatively simple metric, which essentially looks at how much the top 10 rankings on any given day change compared to the previous day. This metric is position- and direction-agnostic, which is to say that a move from #1 to #3 is the same as a move from #9 to #7 (they’re both +2). Any URL that drops out of the top 10 is a +10 (regardless of position), and any given keyword can score a change from 0-100. This metric, which I call “Delta100”, is then transformed by taking the square root, resulting in a metric called “Delta10”. That value is then multiplied by a constant based on an average temperature of 70°. The transformations involve a little more math, but the core metric is pretty simplistic.
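As a rough sketch of that description (not MozCast's production code, and the scaling constant below is an assumption chosen purely for illustration), the core calculation looks something like this:

```python
import math

def delta100(old_top10, new_top10):
    """Position- and direction-agnostic flux between two top-10 lists.

    A move from #1 to #3 and a move from #9 to #7 both score +2; any result
    that drops out of the top 10 scores +10, so a fully replaced SERP scores 100.
    """
    score = 0
    for old_pos, url in enumerate(old_top10, start=1):
        if url in new_top10:
            score += abs(old_pos - (new_top10.index(url) + 1))
        else:
            score += 10
    return score

def temperature(delta100_scores, scale=20.0):
    """Average the square roots (Delta10) and rescale toward a ~70° baseline.

    `scale=20.0` is illustrative only; it assumes a long-run average Delta10
    of 3.5 (70 / 3.5) and is not MozCast's real constant.
    """
    delta10 = [math.sqrt(s) for s in delta100_scores]
    return scale * sum(delta10) / len(delta10)
```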
This simplicity may lead people to believe that we haven’t developed more sophisticated approaches. The reality is that we’ve tried many metrics, and they tend to all produce similar temperature patterns over time. So, in the end, we’ve kept it simple.
For the sake of this analysis, though, I’m going to dig into a couple of those other metrics. One metric that we calculate across the 10K keyword set uses a scoring system based on a simple CTR curve. A change from, say #1 to #3 has a much higher impact than a change lower in the top 10, and, similarly, a drop from the top of page one has a higher impact than a drop from the bottom. This metric (which I call “DeltaX”) goes a step farther, though…
If you’re still riding this train and you have any math phobia at all, this may be the time to disembark. We’ll pause to make a brief stop at the station to let you off. Grab your luggage, and we’ll even give you a couple of drink vouchers – no hard feelings.
If you’re still on board, here’s where the ride gets bumpy. So far, all of our metrics are based on taking the average (mean) temperature across the set of SERPs in question (whether 1K or 10K). The problem is that, as familiar as we all are with averages, they generally rely on certain assumptions, including data that is roughly normally distributed.
Core flux, for lack of a better word, is not remotely normally distributed. Our main Delta100 metric falls roughly on an exponential curve. Here’s the 1K data for October 21st:
The 10K data looks smoother, and the DeltaX data is smoother yet, but the shape is the same. A few SERPs/keywords show high flux, they quickly drop into mid-range flux, and then it all levels out. So, how do we take an average of this? Put simply, we cheat. We tested a number of transformations and found that the square root of this value helped create something a bit closer to a normal distribution. That value (Delta10) looks like this:
If you have any idea what a normal distribution is supposed to look like, you’re getting pretty itchy right about now. As I said, it’s a cheat. It’s the best cheat we’ve found without resorting to some really hairy math or entirely redefining the mean based on an exponential function. This cheat is based on an established methodology – Box-Cox transformations – but the outcome is admittedly not ideal. We use it because, all else being equal, it works about as well as other, more complicated solutions. The square root also handily reduces our data to a range of 0-10, which nicely matches a 10-result SERP (let’s not talk about 7-result SERPs… I SAID I DON’T WANT TO TALK ABOUT IT!).
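If you want to see the effect of that cheat for yourself, here's an illustrative sketch on simulated data (not real flux scores): the square root, which is just a Box-Cox transformation with lambda = 0.5, pulls the long right tail in toward something more symmetric.

```python
import math
import random

# Simulate roughly exponential per-keyword flux: most SERPs barely change,
# a few change a lot. These are made-up values, not MozCast data.
random.seed(42)
delta100_like = [min(100, random.expovariate(1 / 12.0)) for _ in range(1000)]
delta10_like = [math.sqrt(x) for x in delta100_like]

def skewness(xs):
    """Crude sample skewness, just to show the right tail shrinking."""
    n = len(xs)
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sum(((x - m) / sd) ** 3 for x in xs) / n

print(round(skewness(delta100_like), 2))  # strongly right-skewed
print(round(skewness(delta10_like), 2))   # closer to symmetric, but still not normal
```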
What about the variance? Could we see how the standard deviation changes from day-to-day instead? This gets a little strange, because we’re essentially looking for the variance of the variance. Also, noting the transformed curve above, the standard deviation is pretty unreliable for our methodology – the variance on any given day is very high. Still, let’s look at it, transformed to the same temperature scale as the mean/average (on the 1K data set):
While the variance definitely moves along a different pattern than the mean, it moves within a much smaller range. This pattern doesn’t seem to match the pattern of known updates well. In theory, I think tracking the variance could be interesting. In practice, we need a measure of variance that’s based on an exponential function and not our transformed data. Unfortunately, such a metric is computationally expensive and would be very hard to explain to people.
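For the curious, here's a minimal sketch of how a day's standard deviation could be put on the same temperature scale (reusing the illustrative constant from the earlier sketch; this is not MozCast's actual calculation):

```python
import math

def stdev_temperature(delta100_scores, scale=20.0):
    """Day-level standard deviation of the per-keyword Delta10 values,
    rescaled with the same illustrative constant used for the mean-based
    temperature (an assumption, not MozCast's real constant)."""
    delta10 = [math.sqrt(s) for s in delta100_scores]
    mean = sum(delta10) / len(delta10)
    variance = sum((d - mean) ** 2 for d in delta10) / len(delta10)
    return scale * math.sqrt(variance)
```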
Do we have to use mean-based statistics at all? When I experimented with different approaches to DeltaX, I tried using a median-based approach. It turns out that the median flux for any given day is occasionally zero, so that didn’t work very well, but there’s no reason – at least in theory – that the median has to be measured at the 50th percentile.
This is where you’re probably thinking “No, that’s *exactly* what the median has to measure – that’s the very definition of the median!” Ok, you got me, but this definition only matters if you’re measuring central tendency. We don’t actually care what the middle value is for any given day. What we want is a metric that will allow us to best distinguish differences across days. So, I experimented with measuring a modified median at the 75th percentile (I call it “M75” – you’ve probably noticed I enjoy codenames) across the more sophisticated DeltaX metric.
That probably didn’t make a lot of sense. Even in my head, it’s a bit fuzzy. So, let’s look at the full DeltaX data for October 21st:
The larger data set and more sophisticated metric makes for a smoother curve, and a much clearer exponential function. Since you probably can’t see the 1,250th data point from the left, I’ve labelled the M75. This is a fairly arbitrary point, but we’re looking for a place where the curve isn’t too steep or too shallow, as a marker to potentially tell this curve apart from the curves measured on other days.
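To make DeltaX and M75 a little more concrete, here's a sketch of the two ideas. The CTR curve below is a generic, made-up top-10 curve (the real DeltaX weights aren't published), and the cut-off is the 75th percentile described above:

```python
# Generic, assumed top-10 CTR curve; the real DeltaX weights are not published.
CTR = [0.30, 0.15, 0.10, 0.07, 0.05, 0.04, 0.03, 0.025, 0.02, 0.015]

def delta_x(old_top10, new_top10):
    """CTR-weighted flux: a move from #1 to #3 costs far more than #7 to #9,
    and a drop off the top of page one costs more than a drop off the bottom."""
    score = 0.0
    for old_pos, url in enumerate(old_top10):
        new_ctr = CTR[new_top10.index(url)] if url in new_top10 else 0.0
        score += abs(CTR[old_pos] - new_ctr)
    return score

def m75(daily_scores):
    """'M75': the value at the 75th percentile of a day's per-keyword scores,
    used instead of the median (often zero) or the mean."""
    ordered = sorted(daily_scores)
    return ordered[int(0.75 * (len(ordered) - 1))]
```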
So, if we take all of the DeltaX-based M75’s from the 10K data set over the last 60 days, what does that look like, and how does it compare to the mean/average of Delta10s for that same time period?
Perhaps now you feel my pain. All of that glorious math and even a few trips to the edge of sanity and back, and my wonderfully complicated metric looks just about the same as the average of the simple metric. Some of the peaks are a bit peakier and some a bit less peakish, but the pattern is very similar. There’s still no clear sign of a Penguin 3.0 spike.
Are you still here?
Dear God, why? I mean, seriously, don’t you people have jobs, or at least a hobby? I hope now you understand the complexity of the task. Nothing in our data suggests that Penguin 3.0 was a major update, but our data is just one window on the world. If you were hit by Penguin 3.0 (or if you received good news and recovered) then nothing I can say matters, and it shouldn’t. MozCast is a reference point to use when you’re trying to figure out whether the whole world felt an earthquake or there was just construction outside your window.
Jeez - a man could lose himself in that data. :)
The comments here should be interesting - your data certainly matches our experience but I wonder if that is the same for all?
We are really seeing very little for most of our clients and niches we monitor - some small movement and some new folks on the page but very little in the way of recovery or penalisation even for some very spammy results we just watch. In all honesty, out of several spammy niches we monitor only one has seen any real movement so it is really subtle.
Google seemed to promise that those who worked hard to clean up should see some resolution - we have worked with a few folks to help with a clean-up, and one in particular has been truly diligent, but as of yet there is seemingly no movement at all.
It leads me to wonder if this is a completely different approach to managing link spam - maybe an approach that will take time to see the impact of sites / pages / links devalued and penalties applied. If this is the case, if it is a different approach, a slower and more subtle approach then maybe it needs a new name - how about Google Slug?
We will keep watching the skies as I suspect the true impact is not yet apparent.
Marcus
I think this is what we need to sort out - is this the Penguin we all hoped for and expected? I believe the answer to that is "no". There seems to have been a Penguin update, and the impact of that update for some individuals and possibly niches was large, but it's not on the scale we'd expect after a year. The question is, then - what does this tell us about Google's intent or capabilities? Trying to understand that is how we prepare for the next Penguin.
I agree. I think this is either a different beast (mutant penguin?) or a more nuanced take on the original concept that may take some time to truly reveal the impact. Which, as you state, makes it all but impossible to contrast from A to B.
The capabilities question is interesting though - if the original was too harsh then have they softened its impact? Looking at some of the junk we see being churned out in the UK that is still working, it's pretty much business as usual, and if I can manually see patterns (even if they are not as simple as Whois, C Class, IP etc) then surely, surely, surely Google can.
Intent is interesting as well - clearly, whilst some should know better and were in it for the money, many small businesses that paid out £x per month in good faith were thrown on the coals with earlier updates. Should the intent not be to neutralise link building and spam links, rather than this punitive approach against the business owners, when the 'SEO companies' simply rebrand and start again with new clients who they promise 100% white hat, safe, ethical, penguin friendly, yada, yada backlinks?
I am an optimist and idealist so I hope that it is a slower, safer neutralisation. That they are looking at more than algorithm tweaks and as much at things like localisation and personalisation to neutralise this link buyers marketplace. That they can reach the end goal without putting innocents (?) out of business.
I tend to think it will be an interesting few months.
What I'm about to say is mostly opinion, but I believe that Penguin 1.0, because it was punitive - most updates focus on maintaining quality over punishing guideline violators - went wrong, and went wrong much more than Google expected. They can test updates all day long, but once it goes out into the world, it's still going to be a different animal.
They've shown unusual caution with subsequent Penguin updates, many of which seem to have been small, and what we're seeing after waiting a year is underwhelming. All of this says to me that Penguin is an animal that has gotten out of Google's control, and they're still struggling with how to cage it.
I think you are probably right - penguin created a lot of bad blood with business owners and really, they want business owners on board hence the Google My Business platform. Not that folks have much choice at the moment but who knows how things like Facebook and Apple Maps will change the local space at least.
"I believe that Penguin 1.0, because it was punitive - most updates focus on maintaining quality over punishing guideline violators - went wrong, and went wrong much more than Google expected."------
This sounds extremely likely, although Google could never admit it. What little I have seen of this update so far is in close comparison to basic daily flux. Whatever update this was, it doesn't seem related to much at all.
Interesting read on the whole Penguin 3.0 piece - theory of course, but interesting nonetheless.
https://www.northcutt.com/blog/2014/11/mounting-evi...
More and more I think this is a different tack that looks more at the linking sites and devaluation rather than punitive measures for the link recipient.
Will be interesting to see how the niches we track and the folks we suspected would get hurt / recover look in six months.
Thank you for providing such a great post backed by thorough data. Based on what John Mueller said October 20th, do you believe the Penguin we hoped for will come out in January or February 2015, due to testing complications with the October launch?
We have been reading this and researching lots of sites. We have seen huge improvements which is great news but honestly we didn't change anything, so either what we do is being favoured by the new algorithm or lots of sites have dropped rankings.
It's also interesting that these rollouts aren't quite the same as a Matt Cutts update, in some ways.
Great analysis, Dr. Pete.
I'm awaiting two recoveries from two Penguin clean-up clients, so I got excited on Friday when lots of (mostly UK-based) SEOs on my Twitter stream started shouting about Penguin 3.0, followed by disappointment when I noticed that my clients hadn't yet recovered. Then there was panic when John Mueller mistakenly suggested that the roll-out was complete, followed by relief when Pierre Far corrected this and said that it'd carry on for a few more weeks. Lots of emotions in a couple of days...!
I'm confident that I've done a thorough enough removal/disavow job on each of them, so in my eyes it's a matter of "when," not "if." I have noticed another client's spammy competitor drop from the #1-3 spot to God-knows-where for their main keywords, with my client taking their place, so that's something...
I think we're so used to algorithm updates/refreshes happening almost immediately that it's been a surprise to see one roll out more slowly/gradually. It'll be really interesting to see what these next few weeks hold...
Historically, Penguin was a big, one-day hit, too, so that expectation is grounded in reality. I think we're going to see more and more rolling updates, though.
Wow. Well that was quite an epic read, and for me it got increasingly disappointing as it went on. The post is less about "How Big was Penguin 3.0" and more an (increasingly desperate as the post goes on) defence of the supposed quality of MozCast data.
The data sets aren't just a "small and imperfect representation of the entire world of Google", it's positively microscopic. I think I read somewhere there are around 6 billion searches on Google a day (of course not all unique, but that's just noted for scale), and you're tracking 10,000 keywords?
I honestly admire your tenacity and your belief in the product, and the open nature of you and your company is genuinely to be admired, but all a post like this does for me is remind me why I'd never be a user of a product such as this when the data is so clearly flawed. Or if not flawed, so so freaking tiny as to be insignificant.
To your credit, you also note that Pierre Far has indicated the scope of the update and how long it would take to roll out, so an analysis after 3 weeks rather than 5 days would be of more interest to me.
There's not really a way of me being able to say all this without coming across as a douchebag, which isn't my intention (although "your truth is your truth" on this matter as well!) but the data in this tool may as well be random for all the insight it gives me. I really appreciate your thinking about the tool being in existence to answer the question "is it just me?" and trying to put a scientific process to the data, but for me, it's just not anywhere near the place it would need to be.
Edit: bad spelling as per usual!
Martin,
This post doesn't mention specific timescales or scopes (albeit "slow worldwide rollout").
It's important to note that Pierre confirmed it would take a few weeks, from when it was released on Friday 19th (which you were one of the first to share - thanks!).
Also, Pete asks "could Penguin 3.0 be spread out over weeks or a month?". The above G+ statement from Pierre answers this..
Lastly:
"MozCast is built around a 24-hour cycle – it is designed to detect day-over-day changes. What if an algorithm update rolls out over a couple of days, though, or even a week? (edit: or a few weeks as Pierre confirmed) Is it possible that a relatively large change could be spread thin enough to be undetectable? (edit: Yes! Maybe the idea of this refresh was to be "undetectable"..) Yes, it’s definitely possible, and we believe Google is doing this more often." - this is precisely what (I believe) Google has done with this latest Penguin re-fresh and there-in lies the challenge that MozCast faces..
To Pete's credit, he did confirm (via Twitter) that MozCast hadn't picked up anything from Friday through to Sunday, in spite of reports coming in of a roll-out.
Now a small rant on Penguin posts (though not this one!)
I suggest we all wait until the timescale Pierre gave us has passed.
But a few words of warning: there's a &%$£ ton of posts being written about fixes and case studies. I suggest ignoring these (and not sharing them) as we simply don't know the full facts (do we ever?!) Once the dust has settled, we can all analyse data-led posts / studies and form our own opinions / share insights :)
*EDIT* - Over on SE Round Table, Jenny Halasz mentioned a hunch she had about this particular re-fresh / update here - the more I think about it, the more I'm agreeing with her
I share your frustration that this data isn't particularly actionable. I wish I could say more about Penguin with certainty, but I can't at this point. I don't mean the post to sound defensive - I'm sincerely interested in finding out why the data says what it does, and I basically dropped everything for two days to do that. This is a problem that matters to me.
As for the sample size, I disagree, for a number of reasons. Yes, 10K is a lot less than billions, no question. However, the exact sample size is just part of the picture. A big question is "Is the sample representative?", and we've taken pains to make this sample represent a large swath of queries and industries. A good political poll, carefully conducted, can make reasonable predictions about all voters in the US based on only talking to a thousand people - the trick is finding the right people and asking the right questions.
When we originally launched MozCast, we talked about using customer data. Anonymity issues aside (I think those can be addressed, if you're careful), the problem is that customer data is inherently biased. It also changes - people can add or remove thousands of keywords on any given day. Plus, at that scale, our systems are collecting rankings all the time, so there's not much consistency to the process (I can't get a clean 24-hour window, for example).
So, we opted to create more of a laboratory and control the conditions. More data could help cancel out some of the noise, but it would also be qualitatively different data that naturally had more noise - it's hard to say which is better. It's interesting to note that the noise really didn't go down that much when we went from 1K to 10K keywords. Some problems were solved (the old data set occasionally got caught up in a Google test, because it didn't hit enough data centers), but this is more than a numbers game. It's also important to note that the 10K unique queries we use represent millions of searches a day, as many of them are high-volume terms. Comparing the 10K to 6B is apples and oranges. Google thinks in terms of total searches, not unique queries.
I'd also argue that our track record is solid, and many people have commented that MozCast has helped them figure out if a problem was algorithmic or something they had (or hadn't) done. This is a work in progress and an open platform - flaws and all. It is *not* part of Moz's paid offering for that reason, and we don't have a paid version. My hope is that putting this information out there helps us collectively solve these problems better.
Finally, there's a lot to MozCast that goes beyond flux, and that's where much of my focus has been. Even if we knew the temperature were accurate the vast majority of days, it's mostly a gut-check. It's not always actionable to know that Google made a change, when they make 600+ changes per year. That's why more of my work has started focusing on things like the rich feature graph and detecting new features Google is testing. That work is also based on MozCast data, and has made what I feel are a number of important discoveries in the past year.
I appreciate that a representative sample is more important than pure size, but you can't make a comparison to political polls because (in the US) there are basically only two variables - you're either a donkey or an elephant out there, right?! Google SERPs are, to all intents and purposes, infinite in the number of possible outcomes, but I understand the underlying point you're making around "asking the right questions" (I just don't think you've used a great example there to highlight this). To be honest I don't know enough about how your 1-10k keywords were selected, so I wouldn't want to pass too much further comment on that.
I appreciate that it's not a paid-for tool and that others may find it useful, for which you have cited anecdotal evidence, but I just wanted to outline what "my truth" was for this data, because I worry too many people in the industry take such data sets as de facto in such matters when, as you and I both know, they certainly aren't. I know you have gone to lengths to explain that this is indeed the case, but in posts as lengthy and complex as this, it's all too easy for many *ahem* "professional" SEOs just to look at the headline stats and make business decisions off that without even a cursory understanding of the data. That of course is arguably their problem though, not yours!
As per usual I hope that my critical analysis of such posts on Moz is taken in the constructive, counter-point light which it is intended, and genuinely it is your posts I look forward to reading the most on Moz.
It's interesting - people's knee-jerk reactions to Google's announcements are exactly what got me interested in building MozCast (before I even knew what it would be). Google was launching 500+ changes per year (over 600 now), and yet we only named a handful, or waited for Google to name them, and then made huge decisions on the tiny amount of data they handed us.
So, in many ways, I see this data as a counterbalance to that. I actually welcome the other, similar tools that have come along, because I think they all add to that information and help us not just take Google's word for things. Way too many decisions and dollars have been spent over-reacting to Panda and Penguin, or reacting the wrong way.
That said, the same warning definitely applies to this data. If you see your traffic drop 40% and then you look at MozCast and say "Sunny day - I guess we're good to go!", then that's a terrible way to run your business. I hope what people will do is add this data to what Google is saying, what people like Barry are saying (interpreting Google and chatter the best they can), and what their own data is saying, to ultimately make better decisions.
You're both pretty much saying the same thing. Of course a sample of 10,000 isn't going to mirror or 100% accurately represent the unfathomable amount of data you would need to get a true "temperature" of the Google algo... but it's also not so tiny that you can just write it off. My issue with this mean "temperature" is that, as the Doc alluded to, it really doesn't matter to the people who just dropped 10+ spots for their most valued keywords. If the average temperature in America is 80 and sunny, but there is a tornado in my backyard in Kentucky, this metric is useless.
I noticed that in your categories you don't really have any keyword categories for manufacturing... was this on purpose? I work for a custom case manufacturer, all of our actual manufacturing is handled in Asia, and I see so much blackhat SEO from my competitors it's laughable. Almost none of them take any hits from Google even though they are constantly spamming and breaking every best practice known to man. I wonder if Google is trying to crack down on these types of industries with their updates, and that's why they're not showing up in the MozCast? Or, it could be that Google is not really concerned with black hat sites with less than 100k hits per month. Just ideas and thoughts...
First of all, it was just a "Penguin refresh", not a Penguin update, which is why it affected only about 1% of English queries; on the other hand, experts are saying it is still rolling out. The refresh is now coming up on two weeks, though, and I don't think there will be much more movement in rankings.
I have seen sites that haven't moved much up or down from their previous rankings, while most other sites recovered well, even though I applied the same disavow process to all of them. So what should I do from here? Should I start the disavow process again for the sites that didn't recover, after all the expectations for this latest Penguin?
Wow. Quite the scientific review! I definitely got off the train, but jumped back on to see your conclusion. I appreciate all that went into figuring out that not much has changed! ;)
And that was meant in the kindest way possible... in case the down vote was the result of taking my comment differently. :/
My comment above yours got a down vote too. Probably just a douche that's down voting everything.
I did not take even 1% offense at your comment. It was a lot of analysis to basically say "Yep, I still think not much happened." I viewed that as a "show your work" project, even knowing the actual answer on the exam would be the same.
I have to admire a post that starts out with a warning that it's tl;dr (and ends with amazement if you finished it). Lots of interesting data and supposition.
I think you're a bit too generous to Google in this line, however
I firmly believe that Penguin 1.0 had a much broader, and possibly much more negative, impact on SERPs than Google believed it would, and I think they’ve genuinely struggled to fix and update the Penguin algorithm effectively.
Panda has become something more of a known quantity. It's easy to understand poor content and how it affects your SERPs. Avoiding Panda isn't terribly hard (and my company benefited greatly). But Penguin is a different animal (no pun intended). When Penguin rolled out I initially thought it was one of Google's broad tests. The reason was that normally high quality SERPs (for even uncompetitive terms) went down the drain. I remember in the earliest days that I became so frustrated with Google returning bad results for searches I had to turn to Bing in some cases. Surely, I thought, this was a mistake of some sort. Google wouldn't swing its sword to cut spam and cut itself in the process. But that appears to be exactly what has happened.
Penguin hasn't gone away. It's stayed (with a vengeance) and I believe it did what Google wanted it to: it turned the SEO industry on its head. I believe that Penguin was more than a link spam tool; it was also designed to obfuscate their algorithm even more. That we're 2 years out and we still don't have any idea how Penguin works speaks volumes. That it can still be easier to launch a new site than to try and clean up a Penguin-penalized site also speaks volumes. That you can now get penalized for otherwise benign behavior and not have any chance to recover in a reasonable time... I empathized with Search Engine Land when they called for Google to remove Penguin. You're currently better off getting a manual penalty than getting hit by Penguin.
I think you're right in that Google doesn't know what to do with it (or, more accurately, how to make it move at the speed of Internet instead of the speed of government). They've changed the SEO landscape and apparently done so permanently. But they don't seem to know how to control the beast they've turned loose into the wild. But it's disturbing that Google is willing to harm sites to make a point.
Yeah, it's tough to say. I think Penguin was definitely punitive and was meant to send a message, but my gut feeling is that the collateral damage went beyond what Google expected. This isn't their usual method, and maybe they overshot. Speaking to their intent is nearly impossible but a lot of signals went haywire after Penguin 1.0. The ripple effects were huge.
Very, very interesting post, there is lot to digest here.
First off, I believe that it is dangerous and counterproductive to second-guess the results of your data based on an outcome that really is undefined. There are many factors that could support the results seen, just as there are many that could lead us to think that your model is flawed.
I diligently follow your MozCast metric on a daily basis and I think it is great. I update my spreadsheet daily with my GA KPIs and the MozCast metric. It serves as a kind of baseline to have an idea of whether events are unique to my sites or generalized.
The following comments are similar to comments I left at SER ten days ago, before Barry announced the update; Google has since officially announced an update, but nothing has significantly changed in the numbers.
I have been tracking the MozCast metric since mid-February, and since July 30th, the date of the last (almost) significant spike (>90°), the 30-day moving average of the metric has been close to its max of 70. In fact, the daily metric has only fallen below 60 twice in the last 30 days. The moving average has been above 67 for 62 of the last 67 days, and the exceptions all fell between July 30 and August 13, basically the time it took for the average to adjust to this new point. Prior to July 30th, the moving average reached this level only 15% of the time, and most of those days were the result of the Feb 5/6 update. Also of note is that the standard deviation of the MozCast metric has been relatively stable. So the increased average is not the result of spikes and jumps but the result of ongoing change.
What this suggests to me is that Google has been testing on an ongoing basis since sometime in late July. The tests are likely targeting smallish groups of sites. The rankings for these sites are changing, but since the number of sites for any given "test" is small, the change does not have a statistically significant impact on the MozCast metric. With continuous "testing" ongoing, the number of sites affected is likely growing, but again no single test has enough impact to be noticed. Over the long run, though, the impact shows up in the unusually high and sustained 30-day average. The October 13 weekend was likely a larger test, but MozCast reported 80° for Sunday, which is less than two standard deviations from the mean. Since that weekend the metrics have remained within the same range. The only change was Google's announcement, and even there they have mentioned that it is a slow and progressive rollout.
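For anyone who wants to run the same sort of check, here's a minimal sketch of the rolling-average test described above, assuming temps is simply a chronological list of daily MozCast temperatures (the 67° threshold is the one mentioned in this comment):

```python
def rolling_mean(temps, window=30):
    """30-day moving average of a chronological list of daily temperatures."""
    return [sum(temps[i - window:i]) / window for i in range(window, len(temps) + 1)]

def days_above(values, threshold):
    """Count how many values exceed a threshold, e.g. a 67° moving average."""
    return sum(1 for v in values if v > threshold)

# Usage (illustrative): days_above(rolling_mean(temps), 67)
```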
Since chatter about this latest update began more than two months ago (since early August), Google (through John Mueller and others) has said:
Short of the delighted comment, all the others seem to be supported by the data.
The question now is, will the data show us when the update is completed?
This is what's tough, from a prediction standpoint. We don't know what the "right" answer is to compare our answer to. We get the occasional "yes"/"no" about whether there was an update, but Google claims 600+ changes per year and maybe we get a dozen yes's each year. Plus, except for % of queries, they don't tell us how big any change is or what the baseline is. I don't want to ever say "I'm right, because I said so" or celebrate because my number matched what Google said on just one day, but at the same time, there's not a good benchmark for success here.
Thanks for another great post!
Certainly it seems that it wasn't exactly an earth-shattering update, but I'm wondering about your keyword sampling hiding some of the effect, even at the 10K level.
Assuming that Penguin is punitive and hits whole sites more than individual URLs, then most of the sites getting punished would not be sites that had a lot of top 10 results for head terms.
If 15% of the page one results you are tracking are from the "big 10", then I think it's pretty safe to assume that Penguin isn't going to touch those sites and many other big ones that hold first page for head terms, so those big sites that are very unlikely to get penalized are flattening the results out.
If there's Penguin effects to be found I'd expect them to be either deeper in the results for head terms or first page for long-tail stuff. I'd expect more of it on first page for long-tail than deep in head terms actually, since Penguin feels like it's kind of a whac-a-mole targeting offenders in the more thin-content areas that were having success with spammy tactics.
How much of that 1k or 10k is long-tail? Sites that I've seen that have come up or down with Penguin have mostly been long-tail sites that don't have a normal share of long-tail vs. head keyword traffic.
So since you made the polling analogy, I guess I'm asking what the cross-tabs look like. :)
We saw big hits to the Top 10 in Penguin 1.0, but it seems like the refreshes have been more subtle, and it's a fair point. We don't have a lot of very-long-tail data, and I haven't seen a big difference by volume or query length in the past, but it may be worth looking over the past week. I haven't done that breakdown. I decided infinite graphs might be too many :)
I discovered the best strategy with these Google rollouts - ignore them. I write content, it's original, and I don't bother with linking because I don't care about my traffic. I care about my audience.
My site just goes up and up. Don't run around like an idiot - do what you know is right.
I see articles like this and I know there's a lot of short-term thinkers that I can easily pass by.
What's with the mass down voting? Phantom down voter ?
"(let’s not talk about 7-result SERPs… I SAID I DON’T WANT TO TALK ABOUT IT!)."
Thanks for the laugh this morning, Pete. And you've got me morbidly thinking back to the Calculus days, wondering what the equation for the slope of that curve is...
SERPmetrics also doesn't show much flux for the US, but it does for the UK. Algoroo is run by an Australian search agency, so I'd guess it tracks more Australian SERPs than anything else. Given that the US tends to lead on SEO tactics, it's possible the effects were bigger in other countries?
Anecdotally out of 6 UK clients I saw one get big improvements in the search results and the rest saw no change. The winner was a garden sheds website.
I promote the same brand name in the USA, UK, and Australia.
The UK and Australian domains got a tremendous boost in rankings and traffic after 18 Oct, but there is no sign of improvement for my US-based site.
It'd be interesting to get Dan Petrovic's thoughts as his tool https://algoroo.com/ appears to show a major shift on the 19th.
I would welcome Dan's thoughts. He's a smart guy who I generally admire.
@lbi-tr the majority of our tracking data is based on Australian search results.
Awesome Dr. Pete. Awesome. I think the small update with the slow roll-out and additional turbulence of potentially concurrent extended Panda 4.1 roll-out all contributed to making this one just hard to detect. Thanks for the great update!
Great work and great analysis!! Thanks
Generally, webmasters look to Moz for detailed information and analysis on any Google algorithm update, and from day one I was surprised to see no such post on the Moz blog. Pete, I've been closely monitoring this update, which is still refreshing, and I've found that many websites that lost their rankings in the previous Penguin update have recovered.
I want to share one strange outcome of this update: one website that I've been monitoring for the last 2 years has recovered a huge portion of its traffic, even though it still has hidden, non-relevant, and spammy links! I cross-verified the data and was really surprised to see that the website recovered 70-80% of its traffic with spammy backlinks.
Have you or anyone on this board observed such cases?
Penguin 1.0 was really hard for webmasters. But as you all know, the recent Penguin 3.0 came after a year, so there were lots of expectations. I expected recovery for 2 to 3 websites I've been working hard on since last year. I hope it's a slower update and I'll get the desired results in the coming weeks.
Google is rolling out its algorithmic updates slowly, so that the bad actors will be confused by the resulting noise.
Wow, a really intense article that I might have to read again. But still, thanks for some awesome data and guidance.
What a perfect post. Thank you, Dr. Peter J. Meyers.
Hi Peter, nice post. I recovered 80% of my sites, but I'm still confused: I did the same recovery work everywhere, yet some sites are doing well and others haven't moved, and I don't know why they aren't ranking well. After seeing your post, I will try to analyze my sites along the lines of your research. Thank you so much for sharing.
I say that as long as the content is attractive, there is no need to worry. However, you have done a titanic amount of work that deserves attention. I think Penguin is just there to guide us, and this is only one tool among many that must be tamed.
Hi there,
I'm also caught up in this Penguin thing with Google. My website, https://resourceevents.co.za, had been #3 on the first page of Google South Africa, where my focus market is. Then we heard about the secure-connection boost for better rankings, so I thought that to keep my site safe and always up I should purchase a certificate, and I bought one... then, a few days after installing the certificate, my website dropped to around number 7 on the second page...
The question is: why? Is that the Penguin thing too, while other posts say Google will rank you better if you have a certificate?
Any help?
Thanks for the information. It has been very useful, since I was looking for information on the subject and finally found something interesting.
I recovered 12 websites that were completely demoted after 17th October (my birthday). Removing risky links didn't work, so I had to look into the on-site SEO; I found lots of on-site issues, fixed them, and the rankings recovered overnight. You can read the full story in my blog post here:
https://www.gitexpertgroup.com/blog/google-penguin-3-0-recovery-story-wordpress/
So big that it's still rolling out and causing us trouble even during the holidays.
My sports blog was penalized by the latest Google Panda update, and there has been an almost 35% fall in daily traffic. To recover, I disavowed links using Google Webmaster Tools, but I'm still facing the same problem.
Update 3.0 was really big. Big to the extent that it made me disavow over 8k links to get out of a manual penalty just 2 weeks ago for my site [link removed]. I did check a competitor who uses terrible practices and was disappointed to see them in the same spots. But another one of my sites got destroyed: from 2nd to 47th for one of my main keywords, and ranking in the 80s now for others. I don't know what I should be doing now.
Can anyone suggest anything?
Great!
I have a website, https://vietnamprotravel.com, and I'm now trying to learn about many things related to SEO, but there's a lot of software out there, such as Moz, SEO PowerSuite... which is the best?
I'll try the free trials and let you know my experience...
Dr. Peter J. Meyers
My keyword rankings have dropped locally in Pakistan.
My keyword rankings dropped today, Saturday 25 October. When I checked last night, Friday 24 October, my keywords were in the top 10 for the home page, keywords like "online shopping in Pakistan", and for other internal pages. In the morning my website's major keyword rankings were down on google.com.pk; I normally get 4K organic searches on my site from Pakistan, but that has dramatically changed and today I got only 600 organic searches.
Dr. Peter J. Meyers, or anyone, can you tell me whether there was a Google Penguin 3.0 algorithm update on google.com.pk? Has anybody else noticed dropped rankings in local search engines?
Can anyone please tell me the solution to this problem?
This would actually be a great question for our Q&A section at https://moz.com/community/q. Best of luck!
Very good article, and very thorough research. However, in my industry (and I'm sure a lot of people will say the same) the impact has been very significant. Our traffic has increased by around 10-20% from organic (we were a Penguin 3.0 winner) and I've noticed some major players in my industry fall off the map completely.
Perhaps across the board there may be negligible or very little movement, but look closer at niche keywords and industries and you can see a lot.
The industry I work in is car leasing in the UK if you're interested.
There's been a weird pattern of reports with this Penguin indicating that it hit outside of the US before it hit the US, which is very rare. Unfortunately, I can't validate that, but I'm hearing from a lot of folks in the UK and Australia.
That's been the scuttlebutt in a lot of forums as well, for whatever that is worth.
Great job as usual Dr P.
Some anecdotal evidence for the UK vs USA theory:
A friend of mine manages a portfolio of 18 sites. 8 x UK and 10 x US. It's a tiny sample, but thought it's worth sharing the spread:
UK:
- 4 sites up by an average of 17.6% (range 5%-43%)
- 3 unchanged (<1% change)
- 1 down by 78%
US:
- 7 sites up by an average of 6.9% (range 3%-17%)
- 3 sites down (range 1%-5%)
I'm not going to draw conclusions. Feel free to make your own.
But I can confirm that the sites were spread across 5-6 niches. Most had a similar history of debatable link tactics, and I believe some had received partial manual link penalties within the last year, with a decent amount of subsequent cleaning up.
So this data set seems like a poster-child case study for comparing US and UK effects, if only it were a sample size 10-100 times larger.
Use it, don't use it.
I had a big surge of traffic too (USA traffic). But I hadn't been hit by Penguin before, so it wasn't a recovery. All I can think of is that other sites went down, benefiting us, or that this algo wasn't just penalising bad sites, and there was an actual positive boost added to the good ones.
I won't lie and say I read the whole thing, but I got as far as the math phobia alert.
Have you tried analyzing the 10-20% of your keyword index with the highest CPC? If someone's spamming they're trying to rank for commercial intent, and CPC is the best and easiest proxy for how commercial a keyword is. Something like panda would affect informational keywords, too, but Penguin is going to be inherently commercial because that's where it makes sense to spam.
So, I'm wondering if you've considered a "Commercial 500" index of the highest CPC terms that you're tracking, to try and separate the nature of the keywords for further analysis?
This would be more comparable to what you mentioned other agencies are tracking, and they obviously showed higher flux like the graph that Cyrus shared from 7 sources the other day.
I've divvied it up by volume and competition in the past (since I have that AdWords data, although it's a little outdated), and the signs weren't terribly clear, unfortunately. The data set in general tends to lean a bit commercial.
My head hurts after reading that
Do we also get drinks vouchers if we made it all the way to the end? :P
They're for Southwest Airlines, and they're a bit crumpled, and also expired, but sure!
My guess is it has to do with the fact that you're only monitoring changes in results 1-10.
Our site escaped Penguin after this last update, and we started ranking again for a lot more terms. Thing is ... most of the terms we started ranking for again are on pages 2 to 5. Still, we're incredibly happy to have some of our pages ranking #18 and #26 (for example) rather than being off the map completely, which was the case before.
I think that if you were looking beyond the first page you'd see a lot more change. Many of the sites that benefited from this update may be returning to page 3 from oblivion rather than moving from page 2 to page 1.
It's a valid point, and we've seen that on rare occasion with other updates. We've chosen to focus on the top 10 and the highest impact rankings, but there are definitely updates that hit farther down the chain. Interesting to note that SERPmetrics splits out the Top 10 vs. Top 100, and they don't seem to be seeing an impact across the board in the Top 100.
All of the data! I have a feeling I might need to read through this a second time to really get it in there. Was a little overwhelming. Great post though.
Looking across my own clients I've seen some big rises in one in particular. They had some spammy links when we took them over. Got them disavowed but they could never get onto the first page despite us seemingly doing everything right. From 17th to the 19th they shot up. Top 5 for every KW now.
This suggests a couple of things to me. Link ghosts might be real (I think they are), and this Penguin update also caused a data refresh, so those negative link ghosts that were being "remembered" have been removed and the "black mark", for lack of a better term, is now gone.
Great post... but the good thing is there will be more frequent Penguin refreshes, which should help sites recover from its effects...
My suspicion is that this is the start of smaller, more frequent, Penguin updates. So, in a way, while this one may seem small, it's the start of a new process that's going to have a bigger impact over the course of the next few months.
I actually loved this article. Although the results may not have shown any sort of differential, it's by taking leaps like this that we evolve our understanding. Keep up the great work.
Took notes on every point; I'm sure it will help me a lot. Thanks, Rebecca Maynes. Great and useful post.
I work mostly with commodities and financial keywords, and I saw a hit around the 17th. It seems to be coming back to normal, although the change in organic traffic could be more market-based than Penguin. TBD.
Hmm, going back to analytics I can see one site starting to slide down on the 13th-17th and bouncing back up after the 17th, but not all the way back to pre-Oct 13th levels. Interesting.
When new algo changes occur, it's important not to overreact. Watch the results over 30 days before you make any major changes. Sometimes G will sort itself out.
Fantastic post, and rather a lot to get lost in there; I can imagine you had "fun" digging through it all! Do you have any plans to improve MozCast in the future?
The Penguin update hit many pages that shouldn't have been ranking. I still remember seeing pages in the top spots on Google only because they filled the text with keywords everywhere, and nonsense.
I have had a few old sites recover. Pretty exciting.
I was hit by Penguin 1.0 and went from over 15k unique visitors per day to barely 80 from Google. As you can imagine, that was a big hit to my AdSense earnings.
Since then, I've removed the over-optimisation of my sites and, last year in December, I did a massive disavow that consisted mainly of directory submissions I had done years ago. I'd missed the last Penguin refresh 2 months prior - so you can see that I've been waiting for this Penguin refresh for a long time.
I've seen no change in results after this latest update, so I'm just hoping that it's still in progress.
The main two websites affected were funfactz.com and funny-insults.com.
I used to rank #1 for "funny insults" and "fun facts" among many other popular terms. All I ask is to at least have another chance to be listed in Google again; alas, I am found nowhere. Do a site:funny-insults.com search and the homepage is not even on the first page.
I've read that Google have crushed small businesses with this update and enabled negative-SEO. Why do they not care for us?
Thanks for the insight! I wonder if this is any different for highly competitive niches.
Does your data set include terms like 'payday loans', 'viagra', 'online casinos', etc?
I assume they would fall into different categories (e.g. 'payday loans' probably under 'finance', 'viagra' probably under 'health', and 'online casinos' might fall under 'arts & entertainment') and might not easily stand out from the whole data set?
It does include highly competitive terms, but once you isolate them enough, not much data is left for analysis. I haven't seen a clear relationship between either volume or competitiveness and temperature, but that's generally over time (not analyzed for any given day).
I think the effect is hidden in the timeframe of the update. We personally experienced some waves. I think they are looking for people to paint with broader strokes. They want websites to have deep page depth and high quality content. Not like I'm saying anything new but it's my .02 cents.
God bless the weatherman! Thanks Doc!
I think one of the reasons we get into SEO is these sexy graphs and charts. I mean look at these things!!!
I was reading the comments and trying to find out how bad backlinks can be costly to our sites! What if someone else builds bad backlinks to our website out of jealousy? Does that have a bad impact on our website? I have my own website of attractive Facebook covers; I'm not a proper SEO, just a straggler trying to protect my site from spam and bad backlinks. Please reply if any one of you has an answer!
Dear Sir,
What is Penguin 3.0?
Please suggest how to keep a site from being penalized by the Penguin 3.0 update.
I was just waiting to see this here, but I'm still waiting to see a post about optimization after a Penguin penalty. Moz is the best place where I've found all the best solutions about SEO. Over the last three days I have searched a lot of blogs, but I only found "https://www.abac-bd.com/google-penguin-3-0-update-rolling-out-effects-and-cure", which describes a little about the cure. When will you people release a post on optimization after Penguin 3.0?