The Science of Ranking Correlations: How Does PageRank Perform?

Comments 89


Ben Oren

2010-04-22T01:52:55-07:00

Rand, your (or Ben's) reasoning for using Spearman correlation instead of Pearson is wrong. The difference between the two correlations is not that one describes linear and the other exponential correlation; it is that they differ in the type of variables they use. Both Spearman and Pearson are trying to find whether two variables correlate through a monotone function; the difference is that they treat different types of variables: Pearson deals with non-ranked or continuous variables while Spearman deals with ranked data.

I am not sure whether using Spearman for number of linking domains or external links is the correct statistical test to use.

Since PageRank can be treated as a ranked variable, using Spearman sounds about right. I am not so sure about mozRank and mozTrust (I don't know enough about those variables).

Another point is that regardless of what test you use, Correlation Coefficient of .2-.3 is extremely low. It very roughly translates to the fact that the chance of a significant monotonic correlation between two variables is 20-30%, which is considered random. Taking your standard errors into account, the differences between the correlation coefficients are not significant enough to draw conclusions. If one takes into account that Google uses more than 200 parameters when ranking a website, it would make sense that the correlation between a single parameter and ranking would be statistically best described as random.

Branko

Ben_Oren edited 2010-04-22T01:55:27-07:00
6 0

- Martin Pezet
 
 2010-04-22T03:35:04-07:00
 
 This is an excellent response and one that shouldn't be glossed over by people just because they might not understand how statistics work. I myself have only a fleeting experience of these correlation techniques, but it is enough to know that what whiteweb_b said:
 
 "Another point is that regardless of what test you use, Correlation Coefficient of .2-.3 is extremely low."
 
 ...is very true. Don't be put off by the first part of this guy's response if you don't understand it (I only did at a very basic level). Read the second half, because people need to understand what can actually be taken from this data (i.e. not a great deal if you're looking for a relational pattern).
 
 It was still an interesting original article, and the first half was still particularly useful to me.
 
 1 0
 
- Rand Fishkin
 
 2010-04-22T08:52:29-07:00
 
 I'm going to let Ben tackle the question regarding Spearman. We had previously used Pearson, but after significant research into potential correlation methodologies, determined this was the right one. Ben can certainly explain better than I.
 
 With regard to the "nearly random" - that would be a gross inaccuracy. Random would be a correlation of 0.00. Even if we include the highest standard error of 0.00559586, not one of these correlations is close to 0.00 or randomness. They all clearly have a significant, measurable correlation with ranking. That's not to say that any of them are excellent, but for single metrics in an algorithmic formula that contains 200+ factors, they are somewhat significant.
 
 Even with the 0.18 correlation (could be as low as 0.1745 with the standard error), I wouldn't try to claim that Google is lying on the corporate technology page. It may not be huge, but there is correlation with rankings, suggesting that indeed, pages with more PageRank have some better opportunity to rank higher.
 
 randfish edited 2010-04-22T09:11:16-07:00
 2 0
 
 - theprofessor31
 
 2010-05-14T00:05:23-07:00
 
 Let me divert and digress. Are we totally missing a bigger problem?
 
 1. Using random keywords for input data is bad. Instead, select keywords which are competitive (CPC and search volume). Those phrases are optimized. There is much more to learn from difficult phrases than from easy phrases.
 
 2. Consider backlink anchor text. A PR10 means nothing if your anchor text is the words "click here". Simple case in point: search for the phrase "search engine" on Google. Guess what: Google (PR10) is nowhere near the top, while Dogpile is (PR8, which is way less than a 10).
 
 Come on guys. Please re-do with a better designed test.
 
 1 0
 
- Sean Weigold Ferguson
 
 2010-04-22T09:03:54-07:00
 
 If I recall, Spearman correlation is the appropriate statistic for comparing ranked and non-ranked data. I believe Kendall serves a similar purpose.
 
 While a correlation of .2 is low, it can be significant with a large enough n. The significance levels of the correlations weren't reported here.
 
 Now to go out on a limb. If two correlations both have a standard error of .005, a difference of .01 or greater between them should be significant at p < .05. I may, however, be completely wrong, and wouldn't know for sure without seeing the data.
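A rough way to check that guess is to compute a z statistic for the gap between two coefficients. The sketch below assumes the two estimates are independent and approximately normal, which is a simplification: coefficients measured on the same SERPs are not independent, and that changes the standard error of the difference. The values 0.19 and 0.18 are hypothetical, chosen only to produce a gap of .01.

```python
import math

def z_for_difference(r1, r2, se1, se2):
    """z statistic for r1 - r2, treating the two estimates as independent (assumption)."""
    return (r1 - r2) / math.sqrt(se1 ** 2 + se2 ** 2)

def two_sided_p(z):
    """Two-sided p-value against a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical coefficients 0.01 apart, each with a standard error of 0.005.
z = z_for_difference(0.19, 0.18, 0.005, 0.005)
print(z, two_sided_p(z))
```

Under these assumptions a gap of .01 gives a z of about 1.41, short of the 1.96 needed for p < .05, so the gap would have to be nearer .014 to clear that bar.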
 
 1 0
 
- Ben Hendrickson
 
 2010-04-22T11:58:33-07:00
 
 Hi White Web,
 
 Sean's response is exactly right, but I'll repeat it more verbosely :-)
 
 Your statement that "both Spearman and Pearson are trying to find whether two variables correlate through a monotone function" is not accurate. The distinction between measuring only linear correlation or any monotonic correlation is actually the critical difference between them. I will touch on why that is, but first let me just cite the relevant Wikipedia articles.
 
 The Wikipedia article on "Pearson correlation coefficient" starts by describing it as a "measure of the correlation (linear dependence) between two variables".
 
 The Wikipedia article on "Spearman's rank correlation coefficient" starts with an example in the upper right showing that a "Spearman correlation of 1 results when the two variables being compared are monotonically related, even if their relationship is not linear. In contrast, this does not give a perfect Pearson correlation."
 
 Technically, Spearman's correlation is the same as Pearson's except that one first replaces the values of both variables with what their indices would be if one sorted them. This is what makes any monotonic function become a linear function, which normal Pearson's will score perfectly.
 
 You make the comment "Pearson deals with non-ranked or continuous variables while Spearman deals with ranked data". This is only true because Spearman converts everything to ranked variables! If variables are all ranked to begin with, Pearson and Spearman are identical. So it certainly is not correct to suggest one can only apply Spearman's to already ranked data, or else there would never be a case to use it where it would give a different value than Pearson's!
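The relationship described above can be sketched in a few lines of plain Python. This is an illustration rather than the code used for the article: the helper names (`pearson`, `ranks`, `spearman`) and the sample data are made up, and tied values are given their average sorted position, which is the usual convention.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based sorted positions; tied values share their average position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            out[order[k]] = shared
        start = end + 1
    return out

def spearman(xs, ys):
    """Spearman correlation: Pearson applied to the ranks of each variable."""
    return pearson(ranks(xs), ranks(ys))

# Made-up data: a monotonic but strongly non-linear relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2 ** v for v in x]

print(pearson(x, y))   # noticeably below 1
print(spearman(x, y))  # 1, up to floating-point rounding
```

Feeding already-ranked data (any two permutations of 1..n) to both functions returns the same number, which is the "identical when ranked to begin with" point.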
 
 .....
 
 Your second point, I think, confuses the idea of a small correlation coefficient with small certainty about the value of a correlation coefficient. We aren't looking at factors such as "does the page have any content relevant to the query", so we would expect the correlation to be fairly low. Nevertheless, understanding which metrics best measure this query-independent strength of a page is quite important to SEO. This can be done by assuming independence between page content and the metrics, and then measuring the correlation just to the query-independent measure. One can argue about the extent to which this assumption of independence is not valid and may cause some bias, but it is statistically quite sound. Does it make sense why this is valid, and why we would expect the correlation values to be fairly low?
 
 Anyway, I like discussing math more than my usual programming, so do answer back if any of this doesn't make sense or you still take exception to the approach of the article :-)
 
 Ben
 
 6 0
 
 - Ben Oren
 
 2010-04-22T14:41:06-07:00
 
 Hi Ben
 
 Thanks for the response
 
 So for the first part, I did not claim that Pearson is better suited here than Spearman's. My claim was that the justification as stated in the article was not correct. While we can argue the validity of each test, it still seems to me that the explanation that says that we use Spearman instead of Pearson because "link counts are generally exponentially correlated" is wrong. If we know that the link counts are exponentially correlated (or any other variable), there would be no need to establish independence. The rank correlation is used when we don't know whether variables are correlated and want to test that null hypothesis. Furthermore, the fact that Spearman deals with ranked data (either because that is its nature or because we ranked it so we can perform Spearman's on it) already tells us that the correlation (if existing) will be linear, hence the nickname of Spearman's as "the Pearson test of ranked data". Additionally, I wrote that PageRank correlation to rankings would suit the Spearman rank correlation perfectly but am not so sure about the backlink count. One would have to transform link counts into ranks and I do not see how that can be done consistently over a large number of SERPs (but please do correct me if I am wrong).
 
 As for the second point, I do not think there is confusion on my part. I was talking about low values of the correlation coefficient, not about the small certainty of the value. My point was that, yes, you can use Spearman's coefficient to test a null hypothesis which claims that two ranked variables are independent of one another (and to do that on a sample like yours, one must perform a Student's t-test, which I don't see in the article), but you cannot use Spearman's coefficient measured on different variables to compare the strength of correlations between those variables and the dependent variable. In other words, you can say that your Spearman's test rejects the null hypothesis which says that there is no correlation between the PR and ranking, but the strength of correlation cannot be established from it, let alone compared with different rho values of other parameters (especially parameters that are not naturally ranked, like link count).
 
 I have been breaking my teeth on similar measurements on a different SEO subject and have consulted several statisticians (and a large volume of literature) on these issues, so I can definitely appreciate the effort invested in trying to give interpretation to so much gathered data, but unfortunately the wish to publish the conclusion does not correlate with the significance of the results :)
 
 Again, thanks for your response. I second the request expressed here for some more in-depth study on statistical analysis of different types of data we gather when performing SEO research.
 
 Ben_Oren edited 2010-04-22T14:43:17-07:00
 2 0
 
 - Ben Hendrickson
 
 2010-04-22T17:50:52-07:00
 
 Thanks for engaging on this. I'm still pretty sure I applied the math right, but I also enjoy chatting about it, and think we can reach some conclusions about this together.
 
 Here is my argument for Spearman's over Pearson's a little more verbosely:
 
 1) We cannot assume linearity of our data. Said more precisely, we cannot a priori assume that the metrics we are looking at are correlated if and only if they are linearly correlated. If I were correct that links were more exponentially correlated than linearly correlated this would be obvious, although even if you don't accept that, I still think you must concede there is no reason to assume any correlation would necessarily be linear.
 
 2) Given the correlations might not be entirely linear, we shouldn't use a measure of correlation that only measures linear correlation.
 
 3) Pearson's only measures linear correlation. Spearman's does not. The quotes from Wikipedia in my first reply make this clear, although if you doubt this we can argue the math directly. So it seems pretty clear Spearman's is the way to go for this problem over Pearson's.
 
 .....
 
 It is true I didn't do the math to reject the null hypothesis on each value to show they were correlated, but that is only because I thought it was pretty clear we would always reject those null hypotheses and find them correlated. However, it is pretty easy to do this math. Consider the correlation of PR to Google.com. The correlation coefficient is greater than 0.18, standard error is less than 0.0056, so the null hypothesis being right would be an event of more than 32.143(=.18/0.0056) standard deviations. This is unlikely enough that most online calculators round that to 0 probability, although using wolframalpha.com we can see the chance is less than 1*10^-278. That is less likely than winning a major lottery 30 times in a row.
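The arithmetic above can be reproduced with the standard library alone. This sketch assumes a one-sided standard normal tail, which may not be exactly what was entered into the calculator, and it uses the rounded inputs 0.18 and 0.0056. With those inputs it lands near 10^-227 rather than 10^-278; the exact exponent depends on the unrounded inputs and the tail formula used, and the conclusion is the same either way.

```python
import math

def z_score(estimate, standard_error):
    """How many standard errors the estimate sits away from zero."""
    return estimate / standard_error

def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = z_score(0.18, 0.0056)  # rounded inputs quoted in the comment
p = upper_tail(z)

print(round(z, 3))
print(p)  # tiny, but still representable as a double
```

Using `math.erfc` rather than `1 - cdf` matters here: subtracting from 1 would round straight to 0 at this distance from the mean.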
 
 I think that is strong enough significance to publish :-)
 
 ....
 
 The claim you make in bold is that one cannot use Spearman's to compare the relative strength of correlations. Why is that? These coefficients are a measurement of correlation between 0 and 1, so greater and lesser are well defined. We compared them in the post; how did it not work? Do you really mean to say it is unfair to claim a Spearman's correlation of 0.99 is more correlated (measured by Spearman's coefficient) than a Spearman's correlation of 0.01? Spearman's correlation is a measure of how close the correlation is to being a monotonic function, and so it seems pretty clear a measurement of 0.99 would be very close to monotonic correlation and 0.01 would be far away (at least relative to each other). How is that not valid?
 
 You are right that one can use Spearman's coefficients to try to reject a NULL hypothesis of independence, but that doesn't mean one cannot also use the coefficients as a measure of correlation.
 
 .....
 
 You make the point that you think Spearman correlation cannot be applied to link counts, although you think it can be to PR, because one cannot convert the raw values to ranked values consistently over a large number of SERPs. Here is how one can (and must) do it. For each SERP, replace each link value with the index of that value if the values in that SERP were sorted. In fact, when applying Spearman's to PR values, one also needs to do this. It is part of Spearman's algorithm to do this, and because not all PR values will be in every SERP, one would get the wrong answer if one did not.
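The per-SERP conversion described above might look like the sketch below. The SERP data is invented, the helper names are hypothetical, and averaging the per-SERP coefficients into one number is an assumption made for illustration, not something stated in the thread. It also assumes every SERP contains at least two distinct metric values, since a constant column has no defined correlation.

```python
import math

def ranks(values):
    """1-based sorted positions within one SERP; ties share their average position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            out[order[k]] = shared
        start = end + 1
    return out

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def serp_spearman(serp):
    """serp: list of (position, metric_value) pairs for a single query."""
    # Negate positions so that "ranks higher" corresponds to a larger number.
    goodness = [-position for position, _ in serp]
    metric = [value for _, value in serp]
    return pearson(ranks(goodness), ranks(metric))

def mean_serp_spearman(serps):
    """Average of the per-SERP coefficients (an assumed way to combine them)."""
    return sum(serp_spearman(s) for s in serps) / len(serps)

# Invented link counts; the raw scales differ wildly between the two SERPs.
serps = [
    [(1, 5000), (2, 120), (3, 300), (4, 10)],
    [(1, 40), (2, 900), (3, 12), (4, 3)],
]
print(mean_serp_spearman(serps))
```

Because ranking happens inside each SERP, a result with 40 links in one SERP and a result with 5000 links in another are compared only against their own neighbours, which is why differing scales across queries do not matter.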
 
 ....
 
 Am I being clear? I tried to answer all of your objections, let me know if I missed something. I'm sure we can clarify everything.
 
 Ben
 
 bhendrickson edited 2010-04-22T17:56:41-07:00
 4 0
 
 - Sean Weigold Ferguson
 
 2010-04-22T18:32:31-07:00
 
 Ben,
 
 Really enjoying this exchange. Statistics are one of my biggest passions. Being well-versed in the field is certainly valuable for SEO. I'm glad to see that SEOmoz has a team member like you.
 
 I'd like to second the request below for SEOmoz to write a blog post about using statistics for SEO. I think that the community would find it very valuable, and it would help us gain a deeper understanding of a lot of SEOmoz's research.
 
 5 0
 
thetoad01

2010-04-22T06:44:24-07:00

I emailed this to my boss and he asked what we can do to get our page rank higher . . . go figure.

5 0

- goodnewscowboy
 
 2010-04-22T11:12:20-07:00
 
 LOL! It's kinda like you're living in a Monty Python sketch. Next thing you know, someone from the Ministry of Silly Walks will be emailing you asking for help to keyword stuff their home page.
 
 1 0
 
 - thetoad01
 
 2010-04-23T06:59:30-07:00
 
 I was already asked to keyword stuff the homepage. My boss also asked me if I can make the font color white so people can't read it. I actually remember 12 years ago when that worked (when Yahoo was #1).
 
 1 0
 
JohnAndrews

2010-05-03T18:19:40-07:00

I'm late to this discussion and not a statistician, but I will claim that comparisons between the "measured" coefficients e.g. "It's about 51% better correlated than PageRank - a big step up" are not valid.

I see Branko already noted this as well.

Given that these claims Rand makes such as "It's about 51% better" are what most people will take away from the discussion, and noting that they also suggest Rand's Tools are better than PageRank numbers, any responsible SEO has to question the ethics of publishing this (and labeling it science) -- as Michael Martine also noted above.

Now.. can *I* prove that it is invalid to compare the derived coefficients and assume a linear relationship such that claims of "51% better" are allowed? Correlation coefficients can almost never be compared to each other linearly as Rand compares them here. Branko noted that as well.

Maybe Ben can run some fabricated test data through his computers, to show that known-to-be correlated data produce high correlation coefficients via his scripts? And known-to-be-uncorrelated data produce low correlation coefficients? And then, using that test data, can we test whether or not the produced correlation coefficients relate to each other in a fashion that supports claims like "51% better"? Surely you can fabricate data that is actually 51% better, and run it through the programs. That would certainly remove a lot of the doubt surrounding these claims, and the validity of the approach.
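A check along these lines could be sketched as follows. Everything here is an assumption made for illustration: the data is synthetic (a signal plus Gaussian noise), the function names are invented, and the seed is fixed only so the run is repeatable. It tests whether the coefficient orders known cases correctly; it does not settle what "51% better" should mean, which would first need a definition in terms of the generated data.

```python
import math
import random

def ranks(values):
    """1-based sorted positions (continuous data, so ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for position, index in enumerate(order, start=1):
        out[index] = position
    return out

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman(xs, ys):
    return pearson(ranks(xs), ranks(ys))

def simulate(noise_sd, n=2000, seed=0):
    """Spearman's rho for y = x + noise; a larger noise_sd means a weaker true link."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [x + rng.gauss(0, noise_sd) for x in xs]
    return spearman(xs, ys)

def simulate_unrelated(n=2000, seed=0):
    """Spearman's rho for two variables generated independently of each other."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    return spearman(xs, ys)

print("strong:", simulate(0.5))
print("weak:  ", simulate(3.0))
print("none:  ", simulate_unrelated())
```

If the scripts behave, the strongly related case should score high, the weak case low but clearly positive, and the unrelated case close to zero.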

The SEO community (or perhaps more accurately, the seomoz community) needs to decide whether or not or how much to trust seo "research" based in such statistical analyses. Most work this deep into statistics is outside our reach. But does that mean we accept it, and allow the claims to stand?

I pitched a talk on this issue for the SMX Advanced meeting, where I included an outline of how one can safely report SEO research findings without stepping into the muddy waters of unverifiable claims. It didn't get accepted this time. It is an important issue that we all need to address.

4 1

Stephan Baldwin

2010-04-22T10:18:30-07:00

I'm not nearly as smart as Ben, but I have a couple of thoughts:

What if Ben gathered information only about competitive keywords, say those with a global monthly search volume over 3,000? I wonder if the correlation between PR and rankings would increase dramatically.

He may be using only popular keywords, but if he is using more random keywords then I could see where PR would be less predictive.

When I look up a couple random popular keywords here is what I find:

Pizza

Page 1 SERPS:

1. PR 7

2. PR 7

3. PR 7

4. PR 6

5. PR 5 (note domain name is pizza.com)

6. PR 6

7. PR 5

8. PR 4

9. PR 4

10. PR 5

Franchise

Page 1 SERPS:

1. PR 5 (note domain name is franchise.com)

2. PR 6

3. PR 6

4. PR 6

5. PR 6

6. PR 3 (note dictionary.reference.com - alexa rank 165)

7. PR 4

8. PR 4

9. PR 5

10. PR 5

To me PR is just one of a number of factors, but I still think that in popular keyword searches, the results will basically show sites with similar PR in descending order unless other things are overwhelming PR. Looking at the PR of other sites that come up can at least give you an indication of whether that is a keyword you should target with the page you had in mind or whether you are probably out of your league.
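As a rough check on the two lists above, the Spearman correlation between position and toolbar PR can be computed directly. This is a minimal sketch using only the twenty values quoted in this comment; the helper names are made up for the example, and two hand-picked keywords are far too few to say anything about the wider study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


positions = list(range(1, 11))                    # SERP positions 1..10
pizza_pr = [7, 7, 7, 6, 5, 6, 5, 4, 4, 5]         # toolbar PR as listed for "Pizza"
franchise_pr = [5, 6, 6, 6, 6, 3, 4, 4, 5, 5]     # toolbar PR as listed for "Franchise"

print("pizza:    ", round(spearman(positions, pizza_pr), 3))
print("franchise:", round(spearman(positions, franchise_pr), 3))
```

A negative value here means PR tends to fall as the position number grows, which is the "descending order" pattern described above. The two keywords give noticeably different coefficients, so a larger keyword sample would be needed before drawing conclusions.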

Improve your PR by acquiring great links and your site will generally rank higher for various keywords.

franchisegator edited 2010-04-22T10:21:38-07:00
3 0

- Sean Weigold Ferguson
 
 2010-04-22T10:51:47-07:00
 
 You raise a good point. I touched on this in my comment above. There are probably additional factors that mediate the relationship between PageRank and SERP rankings. Query volume may be one of them.
 
 1 0
 
 - Ben Hendrickson
 
 2010-04-22T18:16:24-07:00
 
 It is a good point. More slicing and dicing would be good.
 
 I did compare one-word vs two-word queries, as that was easy to do. It showed that one-word queries were more correlated with just about all of the metrics. One-word queries are probably high traffic, so this suggests you are probably right that higher-traffic terms correlate better with PR and all of the other metrics. But I thought this made a few too many assumptions to put in the post.
 
 Adwords data would really be the right thing to use.
 
 3 0
 
Fábio Ricotta

2010-04-22T04:35:37-07:00

Hi Rand and SEOmoz team, I loved this article. As you know, we are trying to do some correlations here in Brazil, and I was wondering if you could give us some good resources to understand more about correlation and how we can do that.

Maybe Ben could create an article explaining how to measure correlation like you did here in this article. I'd love to learn and help the SEO community as you are already doing.

Again, congrats for this post. You are doing a great job!

From a big fan of yours, Fábio

fabioricotta-84038 edited 2010-04-22T04:36:21-07:00
3 0

- Gianluca Fiorelli
 
 2010-04-22T04:41:44-07:00
 
 A thumbs up for the request of an article from Ben
 
 And a virtual one (I don't want to spam) for the will you show to help the SEO community.
 
 3 0
 
- Sean Weigold Ferguson
 
 2010-04-22T09:12:00-07:00
 
 Correlation is actually a fairly simple statistic. Excel or a similar program can calculate it as a function. Essentially, correlation (r) indicates how closely changes in Y track changes in X. Its square (r^2) gives the share of the variability in Y that can be attributed to X, as a percentage.
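To make the relationship between r and r^2 concrete, here is a small sketch. The function names `pearson` and `variance_explained` are just labels for this example, and the "share of variance" reading applies to a Pearson coefficient in a simple linear fit; for a Spearman coefficient the same statement holds only for the ranks.

```python
def pearson(xs, ys):
    """Pearson's r: covariance of x and y divided by the product of their spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def variance_explained(r):
    """Share of the variability in Y attributable to X in a linear fit (r squared)."""
    return r * r


# Coefficients in the range discussed in this thread.
for r in (0.2, 0.3):
    print(f"r = {r}: about {variance_explained(r):.0%} of the variability")
```

Note that going from r = 0.2 to r = 0.3 raises the coefficient by half but more than doubles r^2, so "percent better" depends heavily on which of the two numbers is being compared.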
 
 3 0
 
Associate

Thomas Høgenhaven
Associate

2010-04-22T02:09:13-07:00

Thank you for being so transparent about how SEOmoz's own metrics correlate with rankings.

I am sure many would find it tempting just to say they got the best tools on the web and try hiding the flaws.

And I am very happy to have some stats that prove PR shouldn't be used. I guess I should be proud of not having the google toolbar installed ;)

3 0

mikeseoaustin

2010-04-22T04:54:44-07:00

This is a nice, detailed post, and it gives us more to read on PR to clarify and affirm the views we share with our clients.

2 0

Geoff Andrews

2010-04-22T16:53:04-07:00

Ah, PageRank. The amber nectar that turned to water.

About as relevant as a dmoz listing.

Maybe any post extolling the virtues of PageRank and dmoz listings should be auto blocked by Google et al. to limit long conversations with the partially educated managers / clients?

2 0

Perséides Technologie

2010-04-22T07:03:28-07:00

Definitely need to get back to stats at some point. Useful information... I love trying to find THE SEO metric that beats 'em all, although I know it's impossible.

One day, all in-house SEO teams will need a mathematician to be able to get the best results. It's the heart of SEO R&D imho

One question: What was this nice little application SEOmoz developed that could help to find correlation between two variables using curves? Can someone point it to me again?

2 0

- Sean Weigold Ferguson
 
 2010-04-22T12:01:02-07:00
 
 Online Non-Linear Regression Tool (../user_files/online-regression/)
 
 2 0
 - Perséides Technologie
 
 2010-04-22T12:52:33-07:00
 
 Thank you! Exactly what I was looking for! :)
 
 1 0
 
Himanshu Sharma

2010-04-22T03:10:28-07:00

I haven't seen any crappy site with a high PR, say 7 or above. Conversely, I have never seen any authority site with zero or no PR (unless it was penalized). As a site grows in popularity I see a gradual rise in PR. See how the PR of Twitter has increased in the last year. Although algorithmically there may be no relationship between PR and popularity, from the point of view of a boss or client who doesn't know SEO at the algorithmic level, it looks like there is a relationship. This relationship is not just between popularity and PR but also between domain authority and PR. Your client can always say: "All the popular websites I see have high or very high PR, and all high-PR sites are popular in their niches. If you have low PR, then it means you have low global link popularity, i.e. not many people find your site worth linking to or it hasn't got juicy backlinks." This is very hard to explain without wrestling with Spearman correlations.

Another problem is that on one side we tell the client that PR is not important, and on the other side we want him to change his site architecture or do link consolidation so that the link juice can flow to the most important pages. This contradicts our own statement that PR is not important. PR may not be important ranking-wise, but it is very important for keeping pages on a site (especially very big sites) in the main index and hence rankable. If this were not the case, would there be any need to fix site crawlability issues?

3 1

- Jane Copland
 
 2010-04-22T03:34:20-07:00
 
 If you have low PR, then it means you have low global link popularity i.e. not many people find your site worth linking or it hasn't got juicy back links.
 
 If you have low true, under-the-hood PageRank, this is probably true. However, perform a highly competitive search like https://www.google.co.uk/search?q=bingo and note how the toolbar PageRank doesn't correlate to the ranking sites. WinkBingo's home page has a toolbar PR of 6 - higher than all the other ranking sites, but it ranks sixth.
 
 There is going to be a good reason why Google ranks these pages, or any others, where it does; however, the green bar at the top of the ranking pages isn't much of a factor.
 
 Secondly, paying close attention to any one page's toolbar PageRank (which is updated infrequently, often cosmetic and inaccurate due to PageRank's logarithmic nature, e.g. "One PR4 page might have 5 times more PageRank than another PR4 page") and building a site that can have authority easily passed around it, are two entirely different things.
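The effect of a logarithmic scale can be sketched as follows. Google has not published how toolbar PageRank is derived, so the base of 6 and the raw scores below are purely hypothetical, chosen only to show how two pages in the same toolbar bucket can differ several-fold underneath.

```python
def toolbar_bucket(raw_score, base=6):
    """Map a raw score to an integer bucket on a log scale.

    Hypothetical model: bucket k covers raw scores in [base**k, base**(k + 1)).
    Integer thresholds are used instead of math.log to avoid rounding at the edges.
    """
    bucket = 0
    threshold = base
    while raw_score >= threshold:
        bucket += 1
        threshold *= base
    return bucket


low_pr4 = 1300     # hypothetical raw score just inside bucket 4
high_pr4 = 6500    # hypothetical raw score near the top of bucket 4

print(toolbar_bucket(low_pr4), toolbar_bucket(high_pr4))  # same bucket
print(high_pr4 / low_pr4)                                  # ratio of the raw scores
```

Under this assumed model both pages display the same toolbar value even though one has five times the raw score of the other, which is the kind of gap the quoted sentence describes.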
 
 I have also worked on a site that had a PR7 and ranked for nothing. Its PR7 had been obtained by a legitimate content sharing scheme that looked like spam due to content containing links, even though it wasn't spam and wasn't done by SEOs or with SEO in mind. Google gave it its toolbar PageRank, but the site was penalised. We explained the issue and the penalty was lifted. The site's PageRank went down, but it began ranking top-10 for its primary, competitive phrases. In other words, after our reconsideration request Google discounted all of the PageRank passing through those links and the toolbar PageRank went down. However, those links weren't helping to begin with (i.e. the toolbar PageRank score wasn't helping). No one wants a tbPR 7 page that doesn't rank. I'll take a tbPR4 page that ranks over that any day.
 
 From the above example, it's worth noting that a penalised site won't always lose its toolbar PageRank, making toolbar PageRank even less trustworthy a metric.
 
 JaneCopland edited 2010-04-22T05:47:36-07:00
 6 0
 
 - Gianluca Fiorelli
 
 2010-04-22T03:47:00-07:00
 
 May I say this? Jane, I love the clarity of your answers here in the blog (where I'd love to see you more) and in the Q&A (www.seomoz.org/qa).
 
 3 0
 - Jane Copland
 
 2010-04-22T03:48:36-07:00
 
 Cheers! Shockingly, haven't blogged on here since my "I'm leaving Seattle" post in January of last year. Will do something about that when I can think of something original to say ;)
 
 1 0
 
 - goodnewscowboy
 
 2010-04-22T10:52:47-07:00
 
 My word Jane. When you come out of the woodwork, you really come out with a splash! I echo gfiorelli (It seems like that's all I've been doing this week Gianluca. Will you get out of my head!) I'd love to see a blog post here from you.
 
 1 0
 
 - Jane Copland
 
 2010-04-23T00:39:35-07:00
 
 Thanks for the kind words! I'll think about something to write :)
 
 1 0
 
- Philip Nikolayev
 
 2010-04-25T07:29:13-07:00
 
 Himanshu makes an excellent point regarding crawlability and indexation. Not all SEO value is about ranking correlations. If the page isn't indexed there can be no rankings.
 
 Matt Cutts has recently admitted (https://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml) that "the number of pages that we crawl is roughly proportional to your PageRank," speaking chiefly of the "crawl budget" of large sites and their crawling and indexation issues (a must read). This by itself gives PR significant importance.
 
 As to ranking value, what I get, simplistically, from this excellent post and ensuing discussion, after all the statistical back and forth, is what we have always known (or did we?!): that in general, PR plays some small positive part in rankings that we shouldn't obsess too much about. To me, this study says the same in a more elaborate way, and is valuable as confirming this notion. For which thank you.
 
 Philip-SEO edited 2010-04-25T07:32:48-07:00
 5 0
Associate

Will Critchlow
Associate

2010-04-22T02:26:08-07:00

Yet more reasons why I'm trying to teach myself about machine learning (thanks for Ben's help behind the scenes!).

2 0

Tola

2010-04-22T02:56:23-07:00

I guess when it's all said and done, it's gonna take us a while to completely wean ourselves off PageRank. It's in our veins!!

I'm not too crazy about statistics, so I'm not going to comment on the figures. But bottom line, if anyone asks about the importance of PageRank, I'll just point them here!

Thanks for that Rand.

2 0

Kien Lai

2010-04-22T10:50:54-07:00

I believe PR is part of the SEO mix, but not the be-all and end-all metric. It's interesting to hear what others in the SEO industry think about PR, particularly those who discredit PR and think it's worthless. I don't mean to bash those who think that; I'd rather learn and understand why they think that way.

2 0

- Sean Weigold Ferguson
 
 2010-04-22T10:59:26-07:00
 
 "Worthless" is certainly a value judgement. Does PageRank provide utility? Is it useful in some way? Does knowing a URL's PageRank provide actionable information? If the answer to any of these questions is yes, then it is certainly not worthless.
 
 1 0
 
bootleg

2010-04-22T03:15:44-07:00

First, great post.

Second: it's often enough to tell your boss/client that PageRank is one of over 200 signals, because many people think that PageRank = SERP rank. I recently came across a blog post stating that "google says that page speed will influence page rank" ... which of course isn't true.

Third: with your first image (hook it into my veins) I hope you're somehow referring to this Simpsons scene (https://www.youtube.com/watch?v=wWOzAW4ttSY) ;-)

bootleg edited 2010-04-22T03:16:46-07:00
2 0
- Ehren Reilly
 
 2010-04-22T15:01:27-07:00
 
 Yeah. It is very unfortunate that Larry "The Web" Page had to name the algorithm after himself. Laypersons assume that PageRank is equivalent to the rank of your pages.
 
 1 0
 
BarryW

2010-04-26T14:58:21-07:00

Thanks, good to know someone understands PR so I don't have to. Is there any truth in the gossip that PageRank can be inherited from a previously defunct URL that has been repurchased?

1 0

wyfwyf112

2010-04-25T21:01:29-07:00

PR is just one of many factors to estimate a website's value.

I think PR is so popular because it provides a simple way for ordinary people to judge a site.

Many people want to know a website's value, but they don't want to, or have no time to, use so many tools to look inside a website.

1 0

SolidSquid

2010-04-26T08:32:59-07:00

I realise this might be an obvious question, but in regards to the UK search results, what did you do to eliminate localization as a factor when testing pagerank? Google has pretty clearly said that localization will re-order things based on your location, so to get a relevant result for google.co.uk it would seem reasonable to do the searches from a UK IP (maybe through vpn) to make sure localization is eliminated as a factor

1 0

- Ben Hendrickson
 
 2010-04-26T12:10:01-07:00
 
 Good question.
 
 I made no effort, and I fetched all SERPs from American IPs.
 
 I am not exactly sure of the latest on the work to do IP-geolocation-based personalization, although my impression was that it was only going to apply to a pretty limited number of queries such as "pizza" and "restaurant", and I hadn't heard news that it was actually being used yet. I would be interested to hear otherwise.
 
 1 0
 
Dawn Lawson

2010-05-05T06:36:44-07:00

It is my understanding that when switching to a new domain, PageRank is inherited through a 301 redirect. Does the same apply if the domain is forwarded, as opposed to using a 301?

Thanks,

Dawn

1 0

destin2008

2010-04-26T22:02:57-07:00

It's funny that PageRank gets easily discounted mostly by those who don't have it and appreciated more by those who do. Some of my sites rank well with low PR (to me anything under PR4 is low), some rank high with high PR (to me PR7 and up is high).

Despite the above, I will take a non-ranking PR7 site over a ranking PR4 site any day, simply because I can always make it rank. And those who don't know how to take advantage of a high-PR site probably never owned one to begin with. Now, between owning one and managing one for a client there is a big difference :)

destin2008 edited 2010-04-26T22:06:27-07:00
1 0

- Belaid
 
 2010-04-30T10:57:13-07:00
 
 If the objective of your site is a high PR and you consider a PR of 7 or more a success in itself, then you are absolutely right. PR is important and you should spend time doing whatever it takes to rank.
 But if conversions are the objective, then you might want to further 'investigate' whether it really follows that PR7 sites always lead to better conversions than PR4 sites.
 
 1 0
 
10thdegree

2010-11-05T13:15:03-07:00

Absolute dynamite post! The more time I spend on SEOmoz and the more time I use the tools, the better analyst I become. I would like to see how Alexa rankings and SERPs correlate just for fun!

1 0

humanmathematics

2011-05-17T21:31:08-07:00

Your statement about standard errors is only true if Google.com rankings come from a Gaussian distribution -- which I would guess is not the case. Gaussians arise when many independent random processes come together to influence a phenomenon. That sounds like the opposite of Google.

1 0

humanmathematics

2011-05-17T22:23:38-07:00

A couple other statistical suggestions for y'all:
- a boxplot of the correlations is a clear, no-fudge way of communicating the important bulk of your testing results
- a nonparametric regression would give you the magnitudes of effectiveness of PageRank, mozRank, and other factors at predicting SERP standing
- a nonparametric regression will also, separately, give you a significance score so you can make sure you've done enough tests
- but this still won't be accurate unless you've tested a sufficiently broad coverage of the range of options for the regression to be considered an interpolation (https://blog.hiremebecauseimsmart.com/post/627821398/tufte-socialscientists). And how do you interpolate over a 400,000,000-dimensional space? (20,000 English-language words, squared for bigrams)
humanmathematics edited 2011-05-17T22:39:19-07:00
1 0
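
A rough sketch of the boxplot suggestion above: the five numbers a boxplot draws can be computed with Python's standard library. The `per_query_rho` values are made up for illustration and are not from the SEOmoz dataset, and `five_number_summary` is a hypothetical helper name.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a boxplot draws."""
    ordered = sorted(values)
    # n=4 asks for quartile cut points; "inclusive" treats the data as the
    # whole population rather than a sample from a larger one.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical per-query Spearman correlations (invented for this sketch).
per_query_rho = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45]

low, q1, median, q3, high = five_number_summary(per_query_rho)
print(f"min={low:.2f} Q1={q1:.2f} median={median:.2f} Q3={q3:.2f} max={high:.2f}")
```

Reporting the quartiles alongside the mean shows how much the correlation varies from query to query, which a single averaged coefficient hides.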
WebProRob

2011-11-24T02:06:33-08:00

Great Post!

I've got to say that PageRank has always been really confusing. Sometimes you'll see a PageRank 0 page at the top for a highly competitive keyword! Anyway, this article gives me some great information to relay to my clients and blog (Web-Directory-SEO). Great article, and keep them coming!

KeriMorgret edited 2011-11-25T13:03:42-08:00
1 0

humanmathematics

2011-05-16T23:49:02-07:00

re Spearman. Yeah! Spearman was developed specifically for correlations among ranked things.

1 0
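
For anyone who wants to see what Spearman actually does, here is a minimal sketch in plain Python, assuming tied values get the average of their rank positions (the usual convention). It is simply Pearson's correlation applied to ranks. The function names and sample numbers are hypothetical and invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n (smallest = 1); ties share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: a monotone but strongly non-linear relationship.
metric = [1, 2, 3, 4, 5]
links = [1, 10, 100, 1000, 10000]
rho_monotone = spearman(metric, links)
r_monotone = pearson(metric, links)

# Made-up toolbar PR values (with a tie) for SERP positions 1..5.
toolbar_pr = [7, 5, 5, 3, 2]
position = [1, 2, 3, 4, 5]
rho_pr = spearman(toolbar_pr, position)

print(rho_monotone, r_monotone, rho_pr)
```

Note that `rho_pr` comes out negative here only because position 1 is the best result; flip the sign (or rank positions in reverse) if you want "higher metric, better ranking" to read as a positive correlation.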

Eshwar J

2010-04-23T09:01:56-07:00

I bet you used other ranking signals to correlate with the rankings besides Compete, Alexa, Moz, Google Toolbar PR and Yahoo inlinks, for your own use.

I would like to know what those signals are. Let's say page loading time, etc.

I'd like to see all the known signals correlated with ranking, how it all fits together to give a better picture, and how to prioritize them for better understanding. Because this math is so fun.

I am tempted to create a sample of my own and compute it using Perl. So I'm curious to know more about the sample you used.

Thanks for the experiment, looking forward to more of these.

Cheers.

Esh

eshmoz edited 2010-04-23T09:03:35-07:00
1 0

- Ben Hendrickson
 
 2010-04-23T11:32:36-07:00
 
 If you do your own experiments, I would be interested to hear the results, and would probably try to convince you to write a YOUmoz post :-)
 
 Ben
 
 2 0
 
SEOler

2010-07-08T10:42:09-07:00

Hi everybody,

I just stumbled over this entry and I think there are some mistakes in how you draw your conclusions.

Just calculating a correlation between two values will not be enough, because, as you pointed out, Google uses 200 signals to rank pages.

You need pages where the other 199 signals are similar to actually measure the correlation between PageRank and rank. If the correlation is low then, your statement is valid.

Here it could be that a page with PR5 is ranked higher than a page with PR8 because all its other signals are better. If all 199 other signals are similar, a page with PR6 will probably outrank a PR5 page.

And if this is the case all the time, then we have a correlation of 1.

Cheers

1 0
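
A toy sketch of SEOler's point, under loud assumptions: the ranking score here is just `pr + other`, where `other` is a single made-up stand-in for the remaining signals, and the pages are invented. Nothing about it reflects Google's real algorithm. Instead of Spearman it uses a simpler Kendall-style pairwise agreement (hypothetical helper `pairwise_agreement`), which asks how often the higher-PR page of a pair also has the higher score.

```python
from itertools import combinations


def pairwise_agreement(pages):
    """Kendall-style score in [-1, 1].

    +1 means the higher-PR page has the higher score in every pair,
    -1 means it never does. Pairs with equal PR are skipped. The toy
    data below has no tied scores among the pairs that are counted.
    """
    concordant = discordant = 0
    for a, b in combinations(pages, 2):
        if a["pr"] == b["pr"]:
            continue
        if (a["pr"] - b["pr"]) * (a["score"] - b["score"]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Invented pages: PR 1..5 crossed with four levels of "everything else".
# The other signals move the score in steps of 10, so they dominate PR.
other_levels = (0, 10, 20, 30)
pages = [
    {"pr": pr, "other": other, "score": pr + other}
    for pr in range(1, 6)
    for other in other_levels
]

# Agreement across all pages, ignoring the other signals.
overall = pairwise_agreement(pages)

# Agreement among pages whose other signals are identical.
within = {
    other: pairwise_agreement([p for p in pages if p["other"] == other])
    for other in other_levels
}

print("overall:", overall)
print("holding other signals constant:", within)
```

In this toy, PR decides every comparison once the other signals are held equal, yet the overall agreement is weak because the other signals swamp it. That is only an illustration of the argument; it says nothing about how much weight PageRank carries in practice.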

Lisa Thomason

2010-09-02T10:33:42-07:00

Like anything with SEO, it's a long-term project, and attaining a high PR is certainly your goal. As others have stated, Google has 200 variables in its calculations and has spent 12 years perfecting its algorithms with huge, almost unlimited resources, so it is statistically very hard to emulate. That does not mean it is impossible, though. I found this article very interesting, and the feedback even more so. LT

1 0

Ronaguo

2010-06-02T19:38:24-07:00

Thanks for the article.

I think it's really helpful for me.

caseyhen edited 2010-08-20T14:51:28-07:00
1 0

HiveDigitalInc

2010-04-22T11:14:38-07:00

I think it is important to note that data like Alexa and Compete are directionally inaccurate. Yes, this post is about correlation, not causation, but unlike the other measurements, which are less directly affected by rankings, Alexa and Compete data are very much inflated as sites move up the rankings... i.e., while PageRank might be a Cause of good rankings, Traffic Measurements are a Result of good rankings.

Once again, I understand this is about correlation, but I think it is still worth being said. Don't go out and start using Alexa and Compete data to determine whether a domain is worthwhile for acquiring a link.

1 0

- Sean Weigold Ferguson
 
 2010-04-22T12:06:24-07:00
 
 I think the cause and effect relationships between these metrics are probably so nuanced that stating them with too much conviction would be a mistake. In my experience, multicollinearity becomes a big problem when working with multivariate statistics.
 
 SeanWF edited 2010-04-22T12:24:43-07:00
 1 0
 
- Paracelsus
 
 2010-04-22T12:14:46-07:00
 
 Is PageRank a Cause of good rankings or is it just the Result of good rankings, globally speaking? I don't know; that's why I'm asking.
 
 I believe it's probably the consequence, not the reason why you will rank better. The association between a good PageRank and a good SERP position won't always be 100%, not even near it, because for each website certain keywords rank better than others, the keywords themselves shift in relevance across the internet as more (or fewer) people search for them, and there are other parameters that PR takes into account. So in the end PageRank will be a big round number that gives you a global estimate of how "important" your site is (but since you have just 10 levels to rank every single website on earth, it will be hard to go up and down like a yo-yo).
 
 1 0
 
levis315

2010-04-22T05:59:41-07:00

It's so amazing! Thanks again

1 0

Richard Baxter

2010-04-22T06:43:53-07:00

These are by far my favourite kind of SEO blog posts, Rand - thanks for writing all of this out! Now, I just need to find the time to read it...

Bookmarked :-)

1 0

Tony Mandarich

2010-04-22T07:10:34-07:00

thanks for the post...very in depth...what do you think Google's replacement will be for PAGE RANK? Maybe a PAGE TRUST?

tony ;~)

1 0

Paracelsus

2010-04-22T04:59:12-07:00

But Google must be doing something right, since if I search, for instance, for "Macintosh" on Google.com, the first result (Apple's website) has the highest PR of all the results on the first page (9/10), and the word "Macintosh" is nowhere on Apple's homepage (not even once in the meta tags or page content).

I agree if people say PageRank is biased towards some types of criteria, but I disagree with describing it as "useless". Of course, you miss a lot of information, but what could you expect of a single number as a measure of everything inside a website? There will always be a bias related to the relative weight you give each parameter that accounts for PR, but that is also true of every other kind of ranking or metric.

1 0

- SolidSquid
 
 2010-04-22T06:05:00-07:00
 
 Google also takes synonyms into account when considering keywords. Since Apple's home page will have a large number of links to the domain, it will already have a high (non-toolbar) PageRank. Combine this with "Macintosh" most likely being classed as a synonym for "Apple" and it would perform well on that search, even though the word itself doesn't appear.
 
 edit: Also, I think the aim of this article was to point out that using the PageRank published to the Google Toolbar as an indicator of SEO success isn't always the best idea, and that there are other metrics which can match rankings better. This isn't the same as Google's internal algorithm, which will take into account more than just PageRank (which, iirc, is based mainly on links alone, whereas the Google algorithm also probably incorporates keyword density, domain age and a host of other factors).
 
 SolidSquid edited 2010-04-22T06:07:16-07:00
 1 0
 
 - Paracelsus
 
 2010-04-22T08:30:28-07:00
 
 I agree, PageRank isn't a measure of SEO success, and in fact I'm glad it is so because otherwise it would mean that if you followed every SEO practice, no matter how irrelevant the content on your site would be to others, it would always rank above another site with poorer SEO but much more "acclaimed" content by others.
 
 PR gives you an estimate of the "importance" of a website, and this "importance" is defined according to how important the people who developed PR think backlinks, meta tags and other parameters are, relative to each other, compared to other websites. Since other metrics give these parameters a different relative importance, their results will be different, as PR's would be if you changed the algorithm.
 
 So it depends on the way you see it, because what is important for the people who made PR is probably not the same as for the people who made the SEOmoz metrics... and I probably have a different definition of "important" in a certain context, and you another one, which matches neither PR nor the SEOmoz metrics.
 
 1 0
 
mssfldt

2010-04-22T03:46:17-07:00

As you said, the toolbar PR could be old, depending on the last PR update. So what I'm missing: how old is the toolbar PR you tested? The correlation should be much higher a few days after an update than some weeks later.

In the "Google model in my head", PageRank is not directly correlated with the SERPs. But there is a very clear correlation between PageRank and link power. That means a site that is linked to by high-PR sites has a better chance to rank well. Often, after a while, this effect shows up in the visible PageRank, but it takes some months.

So having PageRank is not important for the site's own ranking, but for all the sites it links to.

Thanks for the post :-)

mssfldt edited 2010-04-22T03:48:06-07:00
1 0

- Gianluca Fiorelli
 
 2010-04-22T03:51:07-07:00
 
 And isn't this the main marketing claim of sooo many scummy directories? Add your site to improve your PR
 
 gfiorelli1 edited 2010-04-22T03:51:31-07:00
 1 0
 
 - Karl Moss
 
 2010-04-22T17:02:27-07:00
 
 Yes, definitely. I've even seen ones named things like free pr web directory, seo friendly directory etc. and they charge people to add their sites, scummy is a nice way of putting it!
 
 1 1
 
- SolidSquid
 
 2010-04-22T05:58:47-07:00
 
 Firstly, as pointed out in the article, it's very difficult to know when the updates are going to happen. As such, waiting until after one to make the comparison doesn't simulate the situation most SEOs would be in when making use of it.
 
 To answer the question though, Wikipedia (https://en.wikipedia.org/wiki/PageRank#Google_Toolbar) says that April 3rd was the last update, so this test will be working with a fairly up-to-date number.
 
 1 0
 
- Ben Hendrickson
 
 2010-04-22T18:08:09-07:00
 
 I fetched all of the values after Sunday, so like SolidSquid notes, it would be using the values updated on April 3rd.
 
 It is a good point: if we had access to Google's internal PR numbers, which are fresher and have more resolution, they would almost certainly perform a bit better.
 
 1 0
 
Gianluca Fiorelli

2010-04-22T01:50:27-07:00

First: great insight

Second: I have to read it more quietly to really absorb all the info you gave

Third: you say > The next time your boss or client asks you about increasing their PageRank; show them this chart.

I'd love to (and I surely will), but the effectiveness could be weaker as the data sets are about the US and UK. Is it too much to ask you to follow up with this analysis and also check other regional Googles, for instance the Italian one (or whichever ones are useful according to the origin of the visits to SEOmoz)?

PS: and I have to find my own Ben Hendrickson in order to deal with the math formulas... I was subscribed to an F in math when at school.

gfiorelli1 edited 2010-04-22T02:20:54-07:00
2 1

- Sean Weigold Ferguson
 
 2010-04-22T08:54:18-07:00
 
 I'll be your very own Ben if you're hiring ;-)
 
 1 0
 
 - goodnewscowboy
 
 2010-04-22T10:48:31-07:00
 
 But...that's impossible. You're a Sean.
 
 2 0
 
 - Sean Weigold Ferguson
 
 2010-04-22T11:53:06-07:00
 
 That's never stopped me before..
 
 3 0
 
 - Ben Hendrickson
 
 2010-04-22T18:00:50-07:00
 
 I checked my email, and Sean is actually the one who recommended it to Danny, who then recommended that I use Spearman's instead of Pearson's for this sort of analysis. I edited the post above to note this!
 
 So I think it is only fair to conclude he is a bit better at this stuff than I am!
 
 bhendrickson edited 2010-04-22T18:24:04-07:00
 6 0
 
PaulPedersen

2010-04-22T03:10:32-07:00

The compulsion to check my PageRank is similar to the compulsion to Google myself. When I Google myself, it doesn't mean I'm going to show up for anything anyone actually searches ...it just helps me sleep better. It's kind of the SEO equivalent to checking my locks before going to bed.

PaulPedersen edited 2010-04-22T03:12:32-07:00
1 0

Bartjan

2010-04-23T00:23:43-07:00

"The data here is especially interesting. Yahoo!'s link count is a good deal better than Google's PageRank in correlating with Google's own search results!"

Is this about the link-queries or linkdomain-queries? (AKA links to the page or links to the domain?)

Also, what I really would like to see is how rankings correlate with how old the cache date is, which in my opinion is a better indicator than PageRank of how much authority (in the eyes of Google) a page has.

Bartjan edited 2010-04-23T00:34:36-07:00
1 0
- Ben Hendrickson
 
 2010-04-23T11:29:27-07:00
 
 When comparing PageRank to Yahoo's link counts, we used the number of external links to the URL reported by Yahoo Site Explorer.
 
 Later in the post, when comparing domain metrics like PageRank of the homepage and our Domain Authority score, we used the total number of external links to the domain.
 
 I didn't check cache date. That would be interesting to look at.
 
 1 0
 
Luci-Creare

2010-04-22T08:08:14-07:00

Great post. I found it quite hard to follow, as correlations and maths really aren't a strong point of mine, but the depth and explanations were really useful, and I appreciate the transparency too :)

1 0

algogmbh_petra

2010-04-22T22:33:57-07:00

But above all, you (SEOmoz) even show us PR, e.g. on https://www.seomoz.org/directories --> this "seduces" us (at least me) to look at PR :-)

Petra

1 0

- Rand Fishkin
 
 2010-04-22T22:42:46-07:00
 
 Fair point - we're planning to update that list in the near future, so that will be a good incentive to drop PR and put in DA/PA.
 
 1 0
 
Danny Richman

2010-04-22T22:55:34-07:00

A fascinating post (and responses) despite my complete lack of understanding of the statistical methods discussed.

My understanding of Google PR is that it is not intended to have a bearing on a page's ability to rank for any given keyword. If it means anything at all, it is an indication of the value that a link from that page would pass on to another page.

A page that has been highly optimised for a specific keyword, with the same keyword in the anchor text of links into that page will always rank better than a poorly optimised page but with a high PR.

For me, PageRank is only a broad indication of the value of a link from that page. The PageRank of the pages on my clients' sites is only of interest to me in terms of the link juice they may pass internally.

1 0

Matt Schmoldt

2010-04-22T21:12:39-07:00

I always hear people saying pagerank means nothing at all anymore - and I always knew it meant a little something - thanks for the data to prove it :)

1 0

Ashish Kothari

2010-04-22T15:18:23-07:00

Just yesterday I had to educate a client about PageRank and Google Search Rankings. I wish I had read this post earlier to show him the hard data. Thanks for the post.

1 0

Sean Weigold Ferguson

2010-04-22T09:18:20-07:00

Thanks for looking into using Spearman's correlation guys. I'm glad to see it was useful.

Because I'm a research fanatic, I would love to see a follow-up exploring these results further. If it were me, I would ask:
- Under what conditions are these metrics extremely good predictors of SERP rankings?
- Under what conditions are they poor predictors?
I believe the answers to these questions would give a great deal of insight into the strengths and limitations of measures like PageRank.

1 0
Bharati Ahuja

2010-04-22T01:49:52-07:00

Thanks for the post.

Now at least when clients ask about PageRank I can show them these details and tell them to focus more on the content of the website and do some quality link building; then the PageRank will take care of itself.

Bharati Ahuja

2 1

shane Murray

2010-08-17T19:23:02-07:00

I have been looking for a post like this~!!!
Thanks!!

1 1

Belaid

2010-04-25T21:35:37-07:00

The real and simple objective of any site (or at least of commercial sites) is not to achieve a PageRank of 8, 9 or 10, but to convert visitors into customers (i.e. $$), period.

Yes, a high PageRank usually means a site and its content are important and will get more traffic. But if all that traffic is not contributing to the site's goals, then PageRank 'matters not.'

In addition to this very useful post that sheds light on PageRank's usefulness, managers would probably love to dig one step further and see if a correlation between PageRank and conversions exists.

In other words, can we establish that high PageRank means high ROI? If so, then PageRank should be considered a 'big deal' by both SEOs and Google.

Put bluntly, I will pay attention to PageRank if I can say to my manager: "Look Jim, our revenues doubled because our PageRank went from 3 to 7 this quarter."

- Sean Weigold Ferguson
 
 2010-04-25T22:07:37-07:00
 
 Ah, but correlation does not necessarily mean causation.
 
 - Belaid
 
 2010-04-30T10:18:36-07:00
 
 I agree.
 
 But what is the ultimate practical objective of looking at any metric?
 
 In my case, I look at how that metric affects my bottom line (i.e., conversions).
 

