Update: The tool mentioned below is probably more the work of the rest of the dev staff: Jeff, Mel, Mike, and Timmy. They put in a lot of long hours while I was doing stuff like this post.
Here at the mozPlex we've been hard at work on some new tools (see below for exclusive early preview screenshots!). One of the things we try to do is incorporate some of the great advice that's already out there. At the latest SMX Advanced I heard from SEOs and search engine reps alike that we should focus on data, and for goodness' sake not just check your rank! So we've been crunching some numbers... a lot of numbers. And all that data can start to get a little confusing. We've employed a few techniques, including regression analysis, to help us make sense of our data. I actually whipped up an online regression tool to help out.
The particular issue we're trying to understand is illustrated in the image below:
You can see that we're pulling some great data together from some very authoritative sources. However, if you look at the numbers along the right-hand side, you can see that there are a lot of different scales, and if you know anything about any of these metrics, you know that the difference between 5 URL mentions and 50 means something very different than the difference between 20,005 and 20,050.
Suppose we're trying to understand the importance of the number of domain mentions as reported by Google. Likely, more domain mentions across the web means your domain has a greater ranking strength and influence. But how can we make that more precise? Is 100 mentions good? Is jumping from 1000 to 1100 a big jump? What should be 10% and what should be 90%? If you're a savvy SEO (and I actually know a few around here and abroad) you can come up with some examples:
| Domain Mentions | Value |
| --- | --- |
| 1,890,000 | 100 |
| 1,280,000 | 100 |
| 866,000 | 100 |
| 659,000 | 96 |
| 584,000 | 94 |
| 247,000 | 80 |
| 115,000 | 65 |
| 32,500 | 45 |
| 13,400 | 30 |
| 11,300 | 28 |
| 6,590 | 15 |
| 218 | 5 |
| 4 | 1 |
The idea is to come up with some equation, a model, that matches the pattern we observe -- in this case, what you might intuitively believe as a smart SEO. You can check out what my tool will suggest for this data.
You'll notice that, in addition to the specific model I recommend, I also include a couple of graphs. If you've done this before, you know how important it is to get a feel for what your data looks like and how your fitted model compares. I also include a graph of the "residuals," which are the errors in the model's estimates. For instance, if you ask for 80% at 247,000 domain mentions and the model predicts 75.55, the residual is 4.45. It's often valuable to look at the square of the residual for statistical reasons (that's the "residual sq" column in the table); squaring also emphasizes larger errors.
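If you want to reproduce this kind of fit outside the tool, here's a minimal sketch in Python (assuming numpy and scipy are available; the logarithmic form is just one candidate model, not necessarily the exact algorithm the tool uses):

```python
import numpy as np
from scipy.optimize import curve_fit

# Observations from the table above
mentions = np.array([1890000, 1280000, 866000, 659000, 584000, 247000,
                     115000, 32500, 13400, 11300, 6590, 218, 4], dtype=float)
value = np.array([100, 100, 100, 96, 94, 80, 65, 45, 30, 28, 15, 5, 1], dtype=float)

# One candidate model: value = a * ln(mentions) + b
def log_model(x, a, b):
    return a * np.log(x) + b

params, _ = curve_fit(log_model, mentions, value)
predicted = log_model(mentions, *params)

residuals = value - predicted        # e.g. 80 - 75.55 = 4.45
residual_sq = residuals ** 2         # the "residual sq" column
rmse = np.sqrt(residual_sq.mean())   # one number summarizing overall fit
```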
Intro stuff done, let's get advanced! For this particular data you'll notice (go ahead and open that "suggest" link above in a new tab) that our observations are capped at 100. This is an artifact of the 100-point scale, and it poses some mathematical difficulties for simple modeling techniques. You'll also notice that the suggested model does quite poorly in the middle range. So here are a couple of tips:
If you have a truncated scale, like our 100-point scale here, it's valuable to let your model predict outside the range and truncate its estimates later. To do this you can, for example, drop the extra truncated observations from your data (1,890,000 -> 100 and 1,280,000 -> 100). Just delete those rows from the text box and click that "fit model" button, and you'll get better results; the sketch below shows the same idea in code.
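In code terms, that amounts to fitting on the uncapped rows only and clipping predictions afterward -- a rough sketch under the same assumptions as above:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_model(x, a, b):
    return a * np.log(x) + b

mentions = np.array([1890000, 1280000, 866000, 659000, 584000, 247000,
                     115000, 32500, 13400, 11300, 6590, 218, 4], dtype=float)
value = np.array([100, 100, 100, 96, 94, 80, 65, 45, 30, 28, 15, 5, 1], dtype=float)

# Fit only on observations below the 100-point cap...
uncapped = value < 100
params, _ = curve_fit(log_model, mentions[uncapped], value[uncapped])

# ...then let the model predict freely and truncate its estimates afterward
estimates = np.clip(log_model(mentions, *params), 0, 100)
```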
Also, the algorithm I'm using tries to match all the data points as if they are all equally important. If you feel that some range of your data is more important, just add more observations in that range. For instance, we might want to come up with some more examples between 30% and 65%, since this range is not well fitted by our initial models, and most of our users will probably fall into this range. For example, we might add 26,900 -> 38% and 47,000 -> 50%. With these new observations the model will emphasize this range of the data.
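If you're scripting it, duplicating rows works, and most fitters also accept explicit per-point weights -- a hedged sketch of both approaches (same made-up data and model as the sketches above):

```python
import numpy as np
from scipy.optimize import curve_fit

def log_model(x, a, b):
    return a * np.log(x) + b

mentions = np.array([659000, 584000, 247000, 115000, 32500,
                     13400, 11300, 6590, 218, 4], dtype=float)
value = np.array([96, 94, 80, 65, 45, 30, 28, 15, 5, 1], dtype=float)

# One way: add extra observations in the 30%-65% range so the fit
# emphasizes it (these two points are the made-up examples from the post)
mentions_extra = np.append(mentions, [26900, 47000])
value_extra = np.append(value, [38, 50])
params, _ = curve_fit(log_model, mentions_extra, value_extra)

# Another way: keep the data as-is and weight points directly;
# with curve_fit, smaller sigma means a point counts for more
sigma = np.where((value >= 30) & (value <= 65), 0.5, 1.0)
params, _ = curve_fit(log_model, mentions, value, sigma=sigma)
```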
You should also consider the other possible models, even if they are not "suggested." You might like the behavior of a power model over a logarithmic, for instance. Just click the links for the other models and see the graph of the model and its residuals.
We're using these techniques in-house to score our users' pages, domains, and blogs (check out the screenshot below!), but you can use these techniques to better understand the behavior of data. The next time someone says, "What will happen if we double factor X?", you can turn to your body of experience and mathematical model and tell them, "In general, the importance of factor X falls off on a logarithmic scale. Let me show you graphically..."
Here's just one example of how we're going to use all these nifty models:
So keep your eyes peeled for evidence that college stats still matters!
Playing with large amounts of data is fun; the problem is extracting information that is really useful.
When using nonlinear regression, the equations that best fit the data very rarely correspond to a meaningful model.
Having been through this exercise myself, I came to the conclusion that with some data sets a different and more modern approach is required.
I suggest you look closely at using an adaptive heuristic search algorithm. I wrote my own (which is also fun!), but there are many programs available; search for "genetic algorithm software".
- Michael
You've got a good point. Often with lots of data and complex relationships, if you really want an automated inference algorithm you might be better off with some other approach (such as Genetic Algorithms, Neural Networks, or Support Vector Machines).
However, the advantage of the models in a tool like this is that they are simple. Using these you can often get a good feel for trends in the data. More complex models can easily overfit your data or be very difficult to understand intuitively.
Playing with large amounts of data is fun; the problem is extracting information that is really useful.
w3rd!
ps. how do you do the quotes?
indent button, to the right of strikethrough
This is a terrific tool, and one I think was sorely needed. Prior to it, I'd been using this Simple Regression Utility, but it only dealt with a few different ways of creating the formula, whereas this one goes all out. Way to make the web a better place, Nick!
BTW - For SEOs wondering what to do with something like this, one application that I've done is to grab data about sites, match them up to rankings, and see how the correlation curve fits :) You'd really need a lot more data and a system that took multiple sets into consideration to be truly valuable, but it can be fun to see PR matched up to rankings, link numbers, or Alexa data.
p.s. The tool Nick's teasing above is called "Trifecta" and should be launched a week from today. It's replacing our Page Strength tool and will let you compare blogs, pages and domains separately, based on the factors that affect each.
You make a good point about needing a lot of data and taking into account multiple variables. This tool does single regression. What you're describing is multiple regression. But that requires some linear algebra and a couple of Gaussian eliminations. So I skipped it... for now ;)
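(For anyone curious, if you have a linear algebra library handy you can sidestep hand-rolled Gaussian elimination entirely -- a minimal sketch with made-up feature data, not anything our tool actually does:)

```python
import numpy as np

# Made-up observations: each row is (domain mentions, links, PR) for a page
X = np.array([[247000, 1200, 5],
              [115000,  800, 4],
              [ 32500,  150, 3],
              [ 13400,   90, 2],
              [  6590,   40, 1]], dtype=float)
y = np.array([80, 65, 45, 30, 15], dtype=float)  # target scores

# Add an intercept column, then solve the least-squares problem directly
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
```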
It can get kind of complicated, but multiple regression is nice because you can see the correlation between efforts and results.
Good work on explaining what can be a mind-numbing and hard-to-understand subject!
Well, some SEOs already do maintain trended data (e.g. PR, ranks, links of different types and other metrics) for their projects and competitors... So aside from looking forward to Trifecta, this new regression tool unto itself: word.
Yup, we've been playing with that within the company I work for, and even hired some econometricians to do the cool multiple regression analyses and more. The problem always remains that for the "coolest" SERPs you want to run tests like that on, there's a LOT of hand editing going on, and so the data turns out to be less useful...
My head hurts! As the line graph points upwards though, it must be good :)
Just got done with a class for my MBA called "Advanced Statistics" that makes all of this sound really familiar! :) This was a nice analysis and I can't wait for the new tool.
Who knew that they taught us stuff in school that's like USEFUL! ;)
Thank god they only require me to take stats in my first year of the MBA program... it nearly killed me! :P Such valuable stuff, but man almighty it's hard to wrap your brain around. Good job on surviving "Advanced" Stats!!!!
I suspect you'll garner quite a few edu links with this little tool, that's for sure. Nice work.
There are plenty of other online regression tools out there. I kind of feel bad adding another one to the fray. But honestly, none of the other ones did any graphing, and most wouldn't show you a model's predictions with residuals.
1. I hope this is useful
2. If it's useful, maybe I will get a few edu links ;)
Wow, you guys really have been busy :)
Will this tool be available free or will it just be for Pro members?
Like Page Strength, it will be free to run 1-2X per day, then require a PRO account to access more.
Nick, you rock. Great post and great tool.
If you want to come and work in London at any point, let me know (sorry Rand).
You've now killed my productivity twice in one morning - between this and your email.
Have you guys checked out SEOintelligence? It's another great tool to take a look at for regression analysis and SEO tracking purposes.
Aren't a lot of these variables collinear, and as such "hard" to calculate with, since individual predictors may have weird effects on the outcome?
I wish someone would answer this question from joost.
Hi,
Do you have another concrete example, for instance putting together the metrics from a comparison of domains in Linkscape?
A step-by-step help page would be helpful.
Thanks,
You could really just grab Excel and do some regressions with the data points made available via the tools on here. A quick Excel regression tutorial like the one at https://phoenix.phys.clemson.edu/tutorials/excel/regression.html can get you rolling.
I did not understand; I'm a bit confused.
A regression analysis tool is available in Microsoft Excel as well, under Tools -> Data Analysis -> Regression. Why are you creating a new one?
When using regression analysis, you should look at the significance value as well as the standard error. And make sure not to include two factors that have a high level of correlation between themselves, like PageRank and number of links. Otherwise you may end up with an absolutely wrong conclusion.
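(A quick way to catch that kind of overlap before fitting is to check the pairwise correlation between candidate factors -- a small sketch with made-up PageRank and link-count columns:)

```python
import numpy as np

# Made-up factor columns: PageRank and number of links for some pages
pagerank = np.array([5, 4, 3, 2, 6, 7], dtype=float)
links = np.array([1200, 800, 150, 90, 2100, 5300], dtype=float)

r = np.corrcoef(pagerank, links)[0, 1]
if abs(r) > 0.8:  # rule-of-thumb threshold, not a hard rule
    print(f"Highly correlated (r={r:.2f}); consider dropping one factor")
```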
You make a great point about Excel's tools (as long as you install the Analysis ToolPak). I've done a lot of work with it.
However, I found that for the application where I'm trying to take a set of known data points and extrapolate between them, it's nice to have a single, free tool which computes several models, along with RMSE, and graphs the functions with one click.
I know this is an old thread now, but... do you know of (or are you developing) a non-linear regression tool that does high-order (6th-order) polynomials?
I would love to find some PHP app that does high-order polynomials.
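(Not PHP, but for a sense of scale, a 6th-order polynomial fit is nearly a one-liner in Python with numpy -- a sketch with made-up points; high-order polynomials chase noise easily, so treat the fit with care:)

```python
import numpy as np

# Made-up sample points, just for illustration
x = np.arange(1, 11, dtype=float)
y = np.array([2, 9, 30, 70, 140, 250, 410, 630, 920, 1290], dtype=float)

coeffs = np.polyfit(x, y, deg=6)  # least-squares 6th-order polynomial
poly = np.poly1d(coeffs)
estimate = poly(4.5)              # evaluate the fitted polynomial
```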
Sounds really interesting, Nick. I'll be curious to see how you work the results into the new tools. I've actually been playing around with some multivariate regression models for SEO lately, and it's really starting to sink in for me just how widely the impact of different variables varies across industry, competitive landscape, site size, and other situational variables.
This takes me back to 2nd year stats analysis class.
Who knew I'd ever need to use that stuff again, and here you go taking it all a step further.
Fab! Well done and THANKS!
This is great... off the charts in mind-numbingness, but in a good way. And I'm sure this is just the start of seeing the tools stepping up to the next level.
Although this stuff has always given me brain cramps, it is extremely valuable (don't tell my stats prof I said that :) ). It's invaluable having a way to graphically express your point, especially to those who may not have the same in-depth understanding of the environment.
Some may argue with me on this one... but I always found Excel to be pretty easy to use for this type of analysis as well.
Great tool!!
Hi Rand, I don't know if it's just me, but the results I have been getting from the SEO Pro Tools are very poor. For example, the backlink anchor text analysis only gives a few results (50) for a site that has thousands of links, and the same with others; I have mentioned this to you before. Please look into this issue. Thanks.
We have been looking into it and I think our solution, at least with backlink anchor text, is going to be to use some different systems for grabbing data. That's part of our October release as these things take a long time to develop if you want them scalable.
In the meantime, though, the Trifecta tool is launching next week, and to date, it's given us extremely fast, very accurate results, which we expect will make it one of the most valuable tools in the collection very quickly.
Hmm... interesting