Don’t panic – it’s not what you think. Last fall, I analyzed 50 blog posts before and after Google+ to see what factors drove traffic. At the time, I really wanted to do more, but collecting the data posed multiple challenges. Rand suggested that 50 was OK, but 500 would be great. So, I set out to make it 1000, just to make the boss proud. Then, I thought, “Why not 2000?!” Three months passed...

Long story short, I built a crawler and not only expanded the 50-post analysis to 2011 posts, but added a chunk of variables for good measure. This analysis covers the top 2011 SEOmoz posts of 2011, ranked by Unique Pageviews (UPVs). These posts could have been written at any time (some go back to 2005) – I’m just looking at which pages got traffic during 2011.

Let’s See Those Numbers

I could keep talking, or I could show you the numbers. The following graph shows Spearman correlations (r-values) for 13 variables with UPVs. Blue bars are social factors, green are community factors, and purple are content factors:

[Figure: Correlations with Unique Pageviews]

Most of the variables are self-explanatory, but a few that might need elaborating:

  • Words (Post) is the word count of the post’s content
  • Words (Title) is the word count of just the post’s title
  • Headers is the count of all header tags (<h1>, <h2>, etc.)
  • Bold Tags is the count of all <b> and <strong> tags
  • Lists is the count of all <ol> and <ul> lists
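
For anyone who wants to collect similar content variables, here is a minimal sketch of how the counts above could be computed from a post's HTML. This is not the crawler used for the analysis; the class name, function name, and sample markup are made up for illustration, and it uses only Python's standard-library HTML parser.

```python
from html.parser import HTMLParser


class PostFeatureCounter(HTMLParser):
    """Counts header, bold, and list tags and collects the text between tags."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    BOLD_TAGS = {"b", "strong"}
    LIST_TAGS = {"ol", "ul"}

    def __init__(self):
        super().__init__()
        self.headers = 0
        self.bold = 0
        self.lists = 0
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in self.HEADER_TAGS:
            self.headers += 1
        elif tag in self.BOLD_TAGS:
            self.bold += 1
        elif tag in self.LIST_TAGS:
            self.lists += 1

    def handle_data(self, data):
        self.text_parts.append(data)


def count_post_features(title, body_html):
    """Return the five content variables for one post (hypothetical helper)."""
    parser = PostFeatureCounter()
    parser.feed(body_html)
    parser.close()
    # Joining on spaces keeps words in adjacent tags apart, but it would also
    # split a word that is only partly wrapped in an inline tag.
    words_post = len(" ".join(parser.text_parts).split())
    return {
        "words_post": words_post,
        "words_title": len(title.split()),
        "headers": parser.headers,
        "bold_tags": parser.bold,
        "lists": parser.lists,
    }


# Made-up sample post, for illustration only.
sample = (
    "<h2>Intro</h2>"
    "<p>Some <b>bold</b> and <strong>strong</strong> text.</p>"
    "<ul><li>one</li><li>two</li></ul>"
)
features = count_post_features("A Short Example Title", sample)
print(features)
```

A real crawler would also need to isolate the post body from the page template and skip script and style content, both of which this sketch ignores.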

We use Spearman rank-order correlations because many of these variables tend to be skewed (for example, some posts get a ton of Tweets, whereas many get very few), and a rank-based measure keeps a handful of extreme posts from dominating the result. As always, correlation does not imply causation. I originally captured both Pageview (PV) and Unique Pageview (UPV) data, but the correlation between them was very high (r = 0.998), so I kept it simple and report only UPVs. Every cited r-value is significant at p < 0.01. Many thanks to our resident stats guru, Dr. Matt Peters, for helping me pull the numbers together.
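
To make the skew point concrete, here is a small sketch comparing a Spearman correlation with an ordinary Pearson correlation on made-up numbers where one post gets a huge number of Tweets. The data and function names are illustrative assumptions, not values or code from the study. Spearman is computed here as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: Tweets are heavily skewed by one runaway post.
tweets = [2, 5, 9, 14, 30, 2500]
upvs = [100, 180, 260, 400, 900, 1000]

rho = spearman(tweets, upvs)
r = pearson(tweets, upvs)
print(f"Spearman: {rho:.3f}  Pearson: {r:.3f}")
```

In this toy data, more Tweets always goes with more UPVs, so the rank-order correlation is perfect, while the single outlier drags the Pearson value well below that.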

What Does It All Mean?

First off, I’d better explain the “Post Age” data (in red). That’s actually a negative correlation with UPVs. In other words, the older the post, the less traffic it got. That may sound counterintuitive, but remember that the traffic data was only from 2011, whereas the posts could have been written at any time. Naturally, posts written in 2011 tended to get more traffic in 2011. In retrospect, that seems obvious. Interestingly, thumbs up were also negatively correlated with post age (r = -0.76), which points to a second factor: the community has simply grown over time.

Clearly, social factors had the strongest correlations with traffic in this data set. Causality is a bit tough to pin down, as we do have a chicken-vs-egg problem. Likes, for example, may drive sharing and traffic, but posts with a lot of traffic will naturally get more clicks on the Like button. Which came first? Probably a little of both. As we saw in the smaller data set last year, there does seem to be “cross-talk” between the 3 social buttons. People who Like a post will often +1 it as well. For reference, here are the inter-correlations between social factors:

[Figure: Social Factor Inter-Correlations]

As you can see, they’re pretty highly correlated with each other. It’s hard to separate why, at least from this data. It could be that (1) the best content attracts the most social signals and the most traffic, (2) people who regularly use social tend to use all 3 services, or (3) people use all 3 because the buttons are close to each other.
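
If you want to run the same kind of check on your own buttons, a rough sketch of the pairwise calculation follows. The counts are made up for illustration and are not the SEOmoz data. Because none of the made-up values are tied, this version uses the classic shortcut formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between a post's two ranks; that shortcut is only valid when there are no ties.

```python
from itertools import combinations


def ranks_no_ties(values):
    """1-based ranks; assumes every value is distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Spearman correlation via the shortcut formula (valid only without ties)."""
    n = len(x)
    d_squared = sum(
        (a - b) ** 2 for a, b in zip(ranks_no_ties(x), ranks_no_ties(y))
    )
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up share counts for six hypothetical posts.
social = {
    "tweets": [12, 40, 7, 95, 300, 21],
    "likes": [3, 15, 5, 60, 120, 9],
    "plus_ones": [1, 8, 2, 11, 45, 10],
}

# One correlation per pair of social factors.
inter = {
    (a, b): spearman_no_ties(social[a], social[b])
    for a, b in combinations(social, 2)
}

for (a, b), rho in inter.items():
    print(f"{a} vs {b}: {rho:.3f}")
```

Real share counts almost always contain ties (lots of posts with zero +1s, for example), so a tie-aware rank correlation is the safer choice outside of a toy example like this one.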

Community factors are similarly tricky – posts with more traffic get more thumbs, all else being equal. Still, it seems that our community metrics have some validity – posts that get a lot of thumbs up and comments tend to also get a lot of traffic.

The content factors are the weakest group, as a whole, but here the causality is at least clear. No post magically got longer or had more images in it because more people visited it. It does appear that longer posts tended to fare pretty well with our audience.

Where Do We Go From Here?

While we can’t predict the future of any given social network, and Google+ is still in its infancy (even by internet time), I think that 2011 was the year when social really made its mark. Even with the causality caveats above, it’s clear that social and traffic go hand in hand, and the impact of social factors on SEO is growing fast.

I think both studies suggest that you shouldn’t be afraid to use all 3 of the major social buttons. I wouldn’t go crazy (if you have 50 social buttons, you weaken them all), but the inter-correlations strongly suggest that, at worst, the 3 big buttons don’t hurt each other. People who regularly use social probably send multiple signals.

It’s also interesting to me that long posts seem to do pretty well on SEOmoz. When I wrote my duplicate content mega-post, it was a bit of an experiment. We had talked about doing another guide for e-commerce SEO and opted to try a long-form post on one sub-topic instead. I don’t think that every post needs to be that long, but there’s certainly room for mega-posts when the topic merits them. To give credit where credit’s due, Oli’s mega-post made that point before mine did.

Of course, every audience is different. I admit that I do these analyses as much for myself as anyone else – I’m really fascinated by trying to figure out what works and what doesn’t. Much like with SEO in general, though, “quality” is a complicated thing. If you write a long post just to fill up space, you’ll have a mountain of crap instead of a pile. Use the data wisely.