Since Google released their patent application on historical information way back in March of 2005, search marketers have recognized that trends in temporal link and content analysis do have a real impact on rankings. What isn't quite clear is how this concept functions, and it's the subject of tonight's stay-up-until-2am-blogathon.
The engines are trying to measure patterns - they're looking for indications of increasing or decreasing relevance and authority that temporal trends provide. There are several specific items they want to identify:
- Content growth patterns - how often does a particular site tend to add new pages
- Content update patterns - how often are documents edited and updated
- Link growth patterns - how often are new links pointing to the site
- Link stagnation patterns - does the number of links to the site stagnate or decrease
There's also some calculus here - they're not just interested in how many links point to a site today vs. yesterday (or how many pages have been added); there's a fundamental interest in tracking patterns over time. Below, I've drawn some graphs showing the rate of new external links (and, in the last two instances, pages) created over time, with some speculations about what the trends might indicate.
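To make that concrete, here's a minimal sketch (entirely my own illustration - the patent publishes no formulas like this) of how an engine might turn weekly crawl snapshots into the growth rates these graphs plot. All the numbers are invented:

```python
# Hypothetical weekly snapshots of total inbound links and indexed pages
# for a single site (made-up numbers, purely for illustration).
links = [1200, 1260, 1330, 1415, 1510]
pages = [300, 305, 311, 316, 322]

# First differences: how many NEW links/pages appeared each week.
new_links = [b - a for a, b in zip(links, links[1:])]  # [60, 70, 85, 95]
new_pages = [b - a for a, b in zip(pages, pages[1:])]  # [5, 6, 5, 6]

# A rising new-link rate alongside steady content additions looks like the
# SEOmoz pattern below; a shrinking rate looks more like the DMOZ one.
print(new_links, new_pages)
```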
These assumptions don't necessarily hold true for every site or instance, but the graphs make it easy to see how temporal link and content growth information can be used by the engines to make guesses about the relevance or worthiness of a particular site. Let's look at some guesstimates of a few real sites and how these trends have affected them.
Wikipedia has had tremendous growth in both pages and links over the last 5 years. This success shows itself in the search engines, who reward Wikipedia's massive link authority with high rankings for much of its content.
DMOZ has experienced a relative decline in popularity. While they were once a default reference link for many sites, their relative influence has waned. In my experience, they also appear in far fewer of the competitive rankings they dominated 2-4 years ago.
SEOmoz itself has a relatively steady rate of content additions, but the number of new links pointing to the site on a weekly or monthly basis has continued to rise over the last few years. The result has been high rankings for some competitive queries in a very competitive sphere (Internet marketing). (BTW - the graphs are obviously not to scale with one another)
This metric makes me think that many forms of spam and manipulative link building are going to stand out like a sore thumb when put under the temporal microscope. When a large gain in links appears relative to a site's sphere of influence and historical link growth, the engines can take a closer look at the source of the links or even trigger a manual review. Common sense would dictate that a small-time local real estate site doesn't usually attract a few thousand new links in a week unless it's done something newsworthy or link-worthy.
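As a toy illustration of how such a check might work (purely speculation on my part - no engine has published a heuristic like this), a spike could be flagged whenever a week's new links dwarf the site's own historical baseline:

```python
def flag_link_spikes(weekly_new_links, multiplier=10, min_history=4):
    """Flag weeks where new links exceed `multiplier` times the trailing
    average - a hypothetical stand-in for 'a large gain relative to the
    site's historical link growth'."""
    flagged = []
    for i in range(min_history, len(weekly_new_links)):
        baseline = sum(weekly_new_links[:i]) / i
        if weekly_new_links[i] > multiplier * max(baseline, 1):
            flagged.append(i)  # week worth a closer look or a manual review
    return flagged

# A small-time real estate site: ~10 new links/week, then 3,000 in one week.
print(flag_link_spikes([8, 12, 9, 11, 3000]))  # -> [4]
```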
There's little limit to what the engines can do with data like this, and no reason they shouldn't be analyzing it (since it's easily available). I'd invite you to consider how the link and content growth patterns for your own sites may affect the engines' perspectives on your rankings.
Rand,
I was all like "Yeah, Go Rand" until I read this:
Common sense and link patterns aren't always best friends, especially with the viral nature of the internet. I would say some good examples of small-time sites that exploded with links without doing anything newsworthy or link-worthy are social networking sites. I especially think that LinkedIn is a good example of this. I would suspect that links to LinkedIn exploded when they enabled the ability to link to your external profile, yet content did not immediately increase, since it was only people who had LinkedIn profiles that had the ability to add those links. This was not newsworthy or a byproduct of creating content that was link-worthy, but more a way of creating additional links to the content that was already there - such as what many widgets are enabling companies to do. Of course, as more people saw LinkedIn links on sites, the more people joined, so there was a period where content increased after links increased. I don't believe this could be seen in a month-to-month chart, as there had to be a period of time where content remained the same even though it eventually increased due to people noticing LinkedIn badges and joining the service.
And re: "Despite more content, relevance or appeal may be declining"
I have to disagree with the analysis in that chart, again using social networking sites as an example. The ability of one user on a site like MySpace to create content via blogs, video, etc. that is never linked to is far greater than someone creating one funny video that gets linked to by tons of people. So in my view, a consumer-generated content site that has enabled multiple streams of content creation by one user (whose users are dedicated to that content creation) will always have a constant content increase without necessarily increasing link counts at an equivalent or greater rate, and this would not be an indicator of stagnancy. In fact, it would be quite the opposite. Of course, the developers of said CGC or social networking platform could enable linkage to the new content :) which would show a correlation between the two.
BTW: While it goes without saying that this is a great article, I'll say it. And look at what you made me write before my morning coffee - I'm sure I'll read this later and think "what the heck was I writing?"... Thanks a lot! :)
Natasha - You make some very good points, but I think that "LinkedIn" is significantly different than Bob's Scottsdale Real Estate business. And, I'd surmise that while LinkedIn had a very spiky link profile, it certainly was not something that experienced a downturn. After that bump, the level of links and content added to the site continued to be high.
As far as UGC, where users contribute content - again, I'd say that anytime a site gets popular enough to where tons of folks are making pages, there will also be a lot of new links pointing in. It would be very, very rare for the one to happen without the other - think MySpace, Wikipedia, YouTube, Squidoo, Technorati, etc.
Right - LinkedIn NOW is significantly different than Bob's Scottsdale Real Estate business. But there was a point in time, when it was a fledgling site, that it was on par with Bob's. How would an algorithm determine the difference? A human definitely could, but an algorithm, I'm not so sure.
Re: "anytime a site gets popular enough to where tons of folks are making pages, there will also be a lot of new links pointing in." Not always - see Friendster, which enabled all of these UGC areas after popularity waned, and I'm not sure that the links increased with it.
Now you have the hamster in my brain pondering a new site example (one that starts off as Bob's - from an algorithm's POV - and becomes the next MySpace, or peaks and declines like Friendster) that can be tracked over time to test your theory. I would love to see a site like Work.com or BUMPzee.com (I read about both this week) track this data and publish their findings.
Friendster, though, could easily fall into the stagnant site category. Look at where they were and where they are.
With LinkedIn I wonder if it's not so much a question of a single spike. I would think LinkedIn was gaining links both before and after the spike. I'd think an algorithm could see the difference in that when compared to a site that had, say, two spikes in links but only marginal growth outside the spikes. This is about trends over time, after all. One spike might only say so much, but when looking at a site over a greater period of time, the data should reveal more.
You raised some interesting points though and I'm trying to think of some examples to support your argument since I'm sure there are some.
VanGogh99 (& Rand) - Thanks for seeing some points in my pre-coffee post - lol. I can agree with your analysis about Friendster and trends over time. Here's something that may throw a monkey wrench into the theory, though - micro sites.
An example: the Blackberry Pearl micro site. The link-to-content ratio on the day of launch was like 50:1 (because of anticipation). The content has rarely increased over the few months the site has been live, yet there is always a spike in the linkage when a new ad campaign is run for the Pearl (even when it isn't run by Blackberry directly). Now, how would an algorithm decipher a mom & pop (or spammer, for that matter) running a balls-to-the-walls link building campaign (that shows spikes as links are gained over time) without adding much content, from a micro site which sees bumps in linkage over time? To me, the links-to-content patterns of the micro site and the mom & pop, if graphed, would look similar.
A point could definitely be made about the authority of sites linking in to the Pearl site. But in the Blackberry Pearl case there is a healthy mix of authoritative links as well as links from non-topical sites far and wide, which to me would mimic the patterns of most link building campaigns. What do you think?
(BTW: kudos to whoever owns the domain https://www.d-e-f-i-n-i-t-e-l-y.com/ since i never spell that word correctly.)
I'm not really sure why they would need to recognize the difference between the micro site and the mom and pop. I think search engines are only concerned with presenting the most relevant results to a search query and not who happens to own that particular site. Either the micro site or the mom and pop could be just as relevant to the end user depending on the specific query.
I think in the case of the spammer the situation changes. I'm just speculating, but I would suggest that while both the micro site and the mom and pop would have spikes in links, they might also be showing an increase in links between the spikes. The spam site might not be seeing those in-between-spike gains.
A lot of what I get from this post is less about the specific cases Rand pointed out and more that search engines can combine a lot of different factors algorithmically instead of having to look at everything in isolation. So I think other factors could be looked at as well to distinguish between all three sites.
Maybe the sites of the incoming links could be looked at, and the footprint of the links could be compared to other footprints. Even though the micro site and the mom and pop could show similar spikes in links with a lack of page content growth over time, I would suspect that the links would still show different footprints. I think the spam site would definitely show a different footprint.
My guess is the spikes in links would reach different amplitudes. Something tells me a typical mom and pop isn't going to show the same spike as a Blackberry micro site. It could, but I'd think over time the two sites would end up showing different patterns, and the longer the time frame, the more different.
I'm also thinking there could be other metrics or combinations of metrics that could distinguish all three of the sites. Comparing the growth of links and content over time would probably just be one way to compare them.
Again I could very well be wrong since this is all speculation on my part. It's fun thinking about though.
Oh lord - for some reason when I saw this, it reminded me of the sort of thing you'd see on a final exam. I imagine just receiving the graphs printed on a page with the instructions: Explain the circumstances that would result in these trends. (For 5 points: give 1 real-life example of a site that fits each of these.)
I now want to curl up in a ball. Thanks for bringing back bad memories of SEO 451 - which, by the way, isn't offered this quarter. :)
Wait, so I have to wait until NEXT quarter to take it?
Man, there goes spring graduation...
That's a great overview of this - thanks Rand. I wonder how far they go on the bigger sites in taking into account second-order calculus (rates of change of the rates of change). There's a huge amount they could be doing.
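For what it's worth, that second-order check is just one more differencing pass over the same data - again a hypothetical sketch with invented numbers:

```python
def diff(series):
    """First difference: change from one period to the next."""
    return [b - a for a, b in zip(series, series[1:])]

links = [1000, 1100, 1250, 1450, 1700]  # made-up weekly link totals
velocity = diff(links)         # [100, 150, 200, 250] new links per week
acceleration = diff(velocity)  # [50, 50, 50] - the rate of change of that rate
print(velocity, acceleration)
```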
Incidentally, what did you draw the graphs in? They're purty... I know you make your documents available in openoffice format - did you make the graphs in openoffice calc?
Fantastic line of thought and seems really intuitive and simple but takes a smart, sharp mind to come up with it *winks*
One of the things I did at my last job was link building campaigns and I used to email the inhouse list regularly (not spammy regularly - as in over the course of a year or two regularly) and request those people who have websites give us some link love.
We got lots, and I think they keep that up even now; plus there is code on the front page to create a banner link on a page.
Ask and ye shall receive *winks*
But this now makes me wonder about the spikes I created in incoming links as a result of the campaigns - I should have kept graphs... *smiles* Yours are so sexy Rand *winks*
I need a winking icon....
Perfect timing as I've been putting more focus on link building.
Of course this is one piece of the pie, as I think others have questioned how this plays into the trust factor of the sites where new links are coming from, as well as spiking in links.
What this adds to the mix is also the correlation of links to content, so you also factor in spikes in content creation.
I wouldn't be surprised if they also factor in link destination... are new links pointing more towards specific content pages, new or old, or at the home page. There are lots of assumptions that could be made at this level.
The graphs are excellent in getting the ideas across. I think this is a prime demonstration to clients of the bigger picture...
Just like companies create long-term marketing calendars, it would be just as beneficial to create content development and link building calendars, and obviously work to overlay and integrate all three when possible.
Good points about the bigger picture. I think this is yet another example of the need for a more holistic approach to SEO, as so many things work together and are looked at together. It's one thing to get a lot of new links, but if other aspects of your site, like content, aren't growing proportionally to the influx of links, it could signal something isn't quite right in sitesville.
Very interesting and very well worded article, but what conclusions can be made out of it?
There are of course a few that are obvious, such as: if the number of pages and links increases, so will the rankings. But what about the following?
1) The number of new pages is declining, but the number of links is increasing. To me this sounds like a good site that deserves better rankings. Why does the number of new pages have to grow? There is only so much good content that can be written and deserve links.
2) Is it really bad to have spikes of incoming links? I can't remember where I read it, but somebody once investigated some other sites' linking patterns and found that it's pretty normal for a site to have spikes. If there are sites that consistently get 100 links per month, now that is suspicious and looks like manipulation. Spikes are good, IMO.
Good insight, as ever. It will be interesting to see if Google starts to treat linkbaiting, DIGG, PR stuff in a less positive light - and looks instead to find gradual, natural-looking accretion. Not entirely sure how they'd determine what exactly 'natural-looking accretion' is, but I'm sure they probably don't want to serve results to the layfolk purely on the basis of a geek-orientated linkbaiting exercise or media hype.
There's probably a third metric in there somewhere (or will be) that looks at where the links actually come from in terms of neighbourhood and timeliness.
Search engines are very good at deciphering what type of site you are and placing your site into a topical neighbourhood. The search engine can then analyze your site or page and compare it to your 'neighbours' in a trillion different ways (besides the 4 that Rand mentioned).
In this neighbourhood analysis, some sites will stick out like a sore thumb. In some way they have deviated radically from the rest of the neighbourhood and this then trips a flag at a search engine. This could then lead to a number of possibilities: eg. further investigation (either automatic or manual), an application of a penalty or an application of a filter.
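A crude sketch of what such a neighbourhood check might look like (my own paraphrase of the idea, with invented numbers) is a simple z-score against the site's topical peers:

```python
import statistics

def neighbourhood_outlier(site_growth, neighbour_growth, threshold=3.0):
    """Flag a site whose weekly link growth sits more than `threshold`
    standard deviations from its topical neighbourhood's mean."""
    mean = statistics.mean(neighbour_growth)
    stdev = statistics.stdev(neighbour_growth)
    z = (site_growth - mean) / stdev
    return z > threshold, z

# A neighbourhood of local real estate sites gaining ~5-15 links/week:
flagged, z = neighbourhood_outlier(400, [5, 8, 12, 7, 15, 9, 11])
print(flagged, round(z, 1))  # -> True - far outside the 'normal' range
```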
There was a great white paper that (I think) Microsoft released a few years ago which had lots of sexy graphs plotting website distributions. It showed how certain sites existed outside of a specified 'normal' range and proposed that if a site breached a number of these neighbourhood factors, it could be flagged for being 'unnatural'. I'm paraphrasing a bit, but that was the gist of it. (It's 6am and I can't find the paper in del.icio.us... I'm sure some other SEO has seen it!)
Shor - I'd love it if you posted that link when you find it.
No problem Natasha,
Spam, Damn Spam, and Statistics is a very 'cool' white paper from the people at Microsoft Research.
It showed how search engines could use statistical methods to identify sites that show symptoms of being spammers. These include changes over time in a site's link structure, on-page text, and the number of pages. As mentioned, it has lots of sexy color graphs and is extremely digestible (for a research paper). This was released in 2004, so it's scary to think what the big SEs are doing in 2007.
If that did not slake your thirst, Bill Slawski is the man when it comes to papers and patents - he has more current revelations on this topic here: Phrase Based Information Retrieval and Spam Detection
Thanks for the link Shor!
Jeez - I thought I'd spend Thursday watching Ugly Betty, Men In Trees and Grey's Anatomy... now I'm gonna be sidetracked because I know I won't be able to tear myself away from this. If I miss something good, I'm blaming you! :)
Just another sign that spamming and automation are going away and SEO is going to turn into more of a traditional marketing endeavor.
Good for white hats, bad for black hats.
Either that, or good for black hats to know so they can adjust their manipulation accordingly :)
I'd say that most black hats already have a good understanding of these principles - since they generally can control the number of links coming into the site and the rate at which they create them.
Just happened to come across this site, so I'm new here, but I will say this is by far one of the most informational SEO-related sites I know. I will be sharing, recommending, posting, word-of-mouthing (if that's even an actual term?) to everyone I know! Another great post!
I've read like 25+ tonight. Keep it up!
(as if you won't)
Thanks!
Do you think that the spam and automation we've seen in the past will resurge (hate that word) in a different form in the coming years?
Very nicely done - especially the pretty graphs. I didn't see it discussed, but I wonder how another component of these trends might come into play: the proportion of New Links to New Pages, and how that proportion trends as New Pages grow.
From a human standpoint, a steadily growing number of new links to new content would evidence a site that is continuously producing content that is relevant and link-worthy. I am guessing that this trend might be tracked and computed accordingly, as well.
With the exception of linkbait campaigns, I tend to compartmentalize content creation and link dev, but it may be time to update some methods.
Food for thought!
I think the new links graph might (or should) be slightly higher than the new content graph, but it is usually the opposite: links grow more slowly while content grows faster.
Rand, this is EXCELLENT, with a capital "E" - oh, I capped the rest too.
I'm going to circulate this to the math whiz on the team, yet it still makes a lot of sense to me. I have been thinking... will G also spider / review things like press releases (not for SEO, but legit ones) as a way of realizing that a spike may also be legitimate?
Just something I have been thinking about lately myself, as it relates to the speed at which links are attained.
I would suggest that the SEs might well add another dimension to these graphs... so they can split this data between index page links and deep links pointing to internal pages. The deep links should grow over time and might also be more responsive to link bait.
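As a quick hypothetical of that extra dimension, tracking the share of inbound links that hit deep pages rather than the home page is straightforward once each link's target can be classified:

```python
from urllib.parse import urlparse

def deep_link_ratio(link_targets):
    """Share of inbound links pointing at internal pages rather than the
    home page - a hypothetical metric for the dimension suggested above."""
    deep = sum(1 for url in link_targets if urlparse(url).path not in ("", "/"))
    return deep / len(link_targets)

# Invented weekly samples: a healthy site's deep-link share creeping upward.
week1 = ["https://example.com/", "https://example.com/blog/post-1",
         "https://example.com/"]
week2 = ["https://example.com/blog/post-2", "https://example.com/tools",
         "https://example.com/"]
print(deep_link_ratio(week1), deep_link_ratio(week2))  # ~0.33, then ~0.67
```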
The sandbox issue and/or site trust might well be addressed by how tolerant the algorithm is to changes on these graphs. What would be really interesting to know is: if you have a real "spike" in your links and that is checked (manually?) and judged genuine, does this increase your "trust rank" the next time?
So the best recipe is constant growth of content pages and links? Like Wikipedia?
Great post Rand. I'm having one of those "seems so obvious now that I see it" moments.
One thing to take from this is the idea that most SEO factors should never be looked at in isolation, but rather in their relation to other factors. As search engines grow in the amount of data they can and do collect, they will be able to recognize more complex patterns in just about everything we do, and they will be able to fine-tune their algorithms based on those new complexities.
Good overview. Content like this is the reason why you get those links.
thanks
What (free) tool would anybody suggest if I wanted to get detailed rankings like that? I guess I could use Google Analytics, but my client is doing his own Google campaign, and I don't know if I want to track his clicks through my campaign until he's ready to give up control to me.
I'm using Active Meter right now, which seems to be pretty decent. Also, has anybody used Caphyon's Advanced Web Ranking & Advanced Link Manager? It's like a Mac version of Web Positions.
As long as your site is getting spidered and indexed with decent frequency, Yahoo! Site Explorer is a pretty solid, cheap and dirty way to get link and page data.
I'd say G.A personally. You can do some seriously cool stuff with it. :)
What's with all of this professionalism lately, Rand? I mean, graphs, charts, GoogleBots? You're making the rest of us look lazy.
I have to ask the obvious follow-up: what will befall those whose sites exhibit the link-bait spike in Figure 1? My blog pulled in a huge number of cross-links from a reputable site (due to a strange linking strategy), jumping my inbounds from about a dozen to over 10,000 in a month. Now they're slowly crawling back down to a couple/few hundred (I've built up a respectable amount since then). Should I worry? Or, as the Canadians might say, "Am I hosed?"
Isn't this related to why some sites vanish from the Google SERPs and then reappear as if by magic a few days later? Especially if they are young domains?
We had a discussion here about this a few days ago, when our site gained a link from a site that could be considered to be a spam site.