On Wednesday I presented at SMX on the panel called "Facebook, Twitter, and SEO". I was excited to speak alongside Horst Joepen (SearchMetrics), Jim Yu (BrightEdge), and Michael Gray (Atlas Web Services). In my talk, I showed some information from patents describing how a search engine might detect a person's topical relevance and authority, and then use that to decide whether or not to pass link juice from their social shares. Let's explore some of these a bit more.
What Factors Might Search Engines Look At?
There are three concepts I would like to introduce you to.
Topical Trustrank
The first concept you should be familiar with is "topical Trustrank". The original Trustrank was first mentioned in this Yahoo patent from 2004. At the time, it seemed underdeveloped. Worse, it was open to spam, since it relied on websites to label themselves (not unlike the meta keywords tag). The patent was granted in 2009 as a way to rank sites based on labels given to them by people, according to this article called Google Trustrank Patent Granted.
Another take on Trustrank is Topical Trustrank, which was introduced in 2006. Because Trustrank seemed to be biased heavily towards larger communities that could attract more spam pages (without tripping a spam threshold, maybe?), Topical Trustrank aimed to build trust based on the relevance of the connecting sites (and I would argue now, the topical relevance of those sharing links via social networks).
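To make the idea of topic-based trust a bit more concrete, here is a minimal sketch of trust propagation in the spirit of Trustrank, run once per topic. This is not the method from the patent or the 2006 paper. The link graph, the seed sets, the damping value, and the function name `propagate_trust` are all assumptions invented for illustration.

```python
from collections import defaultdict


def propagate_trust(links, seeds, damping=0.85, iterations=50):
    """Spread trust outward from hand-picked seed pages (biased PageRank style).

    links: dict mapping a page to the list of pages it links to.
    seeds: set of trusted pages for a single topic (an assumption of this sketch).
    """
    pages = set(links) | set(seeds)
    for targets in links.values():
        pages.update(targets)

    # Trust is only ever injected at the seed pages.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)

    for _ in range(iterations):
        incoming = defaultdict(float)
        for page, targets in links.items():
            for target in targets:
                # A page splits its trust evenly across its outgoing links.
                incoming[target] += trust[page] / len(targets)
        trust = {
            p: damping * incoming[p] + (1 - damping) * seed_share[p]
            for p in pages
        }
    return trust


# Hypothetical link graph and per-topic seed sets, invented for illustration.
links = {
    "garden-blog": ["seed-shop", "compost-tips"],
    "seed-shop": ["compost-tips"],
    "clinic-site": ["nutrition-guide"],
    "spam-page": ["spam-page-2"],
    "spam-page-2": ["spam-page"],
}
topic_seeds = {"gardening": {"garden-blog"}, "health": {"clinic-site"}}

# One trust score per page, per topic.
topic_trust = {
    topic: propagate_trust(links, seeds) for topic, seeds in topic_seeds.items()
}
```

In this toy graph the two spam pages only link to each other, so no seed trust ever reaches them, and a page that is trusted for gardening earns nothing for health.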
Author Rank
According to one Yahoo patent application, "...author rank is a 'measure of the expertise of the author in a given area.'" Since this is delightfully vague, here are some specific areas (taken from Bill Slawski's How Search Engines May Rank User Generated Content) that the search engines might look at to determine if you are authoritative:
- A number of relevant/irrelevant messages posted;
- Document goodness of all documents initiated by the author;
- Total number of documents initiated posted by the author within a defined time period;
- Total number of replies or comments made by the author; and,
- A number of [online] groups to which the author is a member.
We can take these and apply them to social as well. If they are calculating author rank based off of content taken from around the web, why would they not also use this author rank for your social shares? Here are some more questions a search engine might ask about a user (according to an email I received from Bill Slawski):
- Do they contribute something new, useful, interesting?
- Are they tweeting new articles, or recycling old articles? Are they sharing articles from just one site, or are they sharing articles from a number of different sites? What's their engagement/CTR?
- Do they participate in meaningful conversations with others?
- Are they replying to others through @replies or other channels (DMs, maybe?)? What topics?
- Do those others contribute something new, useful, interesting?
- Are they themselves keeping the cycle going and replying to various others, or always responding to the same users?
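As a rough illustration of how the author rank signals listed earlier in this section could be folded into a single number, here is a small sketch. Nothing in the patent application gives a formula, so the weights, the caps, the 0-to-1 "document goodness" scale, and the names `AuthorActivity` and `author_score` are all my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class AuthorActivity:
    relevant_posts: int
    irrelevant_posts: int
    avg_document_goodness: float  # assumed to be normalized to 0..1
    documents_in_period: int
    replies: int
    groups: int


def author_score(a: AuthorActivity) -> float:
    """Combine the listed signals into a 0..1 score using made-up weights."""
    total_posts = a.relevant_posts + a.irrelevant_posts
    relevance_ratio = a.relevant_posts / total_posts if total_posts else 0.0
    # Cap the raw counts so sheer volume cannot dominate the score.
    volume = min(a.documents_in_period, 50) / 50
    engagement = min(a.replies, 100) / 100
    membership = min(a.groups, 10) / 10
    return (
        0.4 * relevance_ratio
        + 0.3 * a.avg_document_goodness
        + 0.1 * volume
        + 0.1 * engagement
        + 0.1 * membership
    )


# Two hypothetical authors: one focused and engaged, one posting off-topic noise.
focused = AuthorActivity(90, 10, 0.8, 20, 40, 3)
noisy = AuthorActivity(5, 95, 0.1, 50, 0, 1)
focused_score = author_score(focused)
noisy_score = author_score(noisy)
```

The caps are the interesting design choice here: posting more only helps up to a point, while relevance and quality carry most of the weight.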
Agent Rank
According to this article from Search Engine Land, Google applied for a patent around a way to determine an agent, or author's, authority in a specific niche. According to the article:
Content creators could be given reputation scores, which could influence the rankings of pages where their content appears, or which they own, edit, or endorse.
Also according to the article, here are some of the goals of Agent Rank:
- Identifying individual agents responsible for content can be used to influence search ratings.
- The identity of agents can be reliably associated with content.
- The granularity of association can be smaller than an entire web page, so agents can disassociate themselves from information appearing near the information for which the agent is responsible.
- An agent can disclaim association with portions of content, such as advertising, that appear on the agent’s web site.
- The same agent identity can be attached to content at multiple locations.
- Multiple agents can make contributions to a single web page where each agent is only associated to the content that they provided.
Does the following sound like the new rel=author markup that we're seeing in the search results? I think it does:
"Tying a page to an author can influence the ranking of that page. If the author has a high reputation, content created by him or her may be considered to be more authoritative than similar content on other pages. If the agent reviewed or edited content instead of authoring it, the score for the content might be ranked differently." "An agent may have a high reputation score for certain kinds of content, and not for others – so someone working on a site involving celebrity news might have a strong reputation score for that kind of content, but not such a high score for content involving professional medical advice."
The article goes on to explain that authority scores will be hard to build up, but easy to harm. This would be one way to keep authors producing high quality content. Some more factors that may influence authority:
- Quality of the response
- Relevance of the response
- The authority of those who respond to what you post
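Two ideas from this section lend themselves to a quick sketch: reputation is kept per topic, and it is slow to build but quick to lose. The class below is only an illustration of that behavior. The `gain` and `penalty` values and the `AgentReputation` name are assumptions of mine, not anything described in the patent application.

```python
from collections import defaultdict


class AgentReputation:
    """Per-topic reputation that rises slowly and falls quickly."""

    def __init__(self, gain=0.05, penalty=0.25):
        self.gain = gain        # hypothetical: small step toward 1.0 for good content
        self.penalty = penalty  # hypothetical: large fractional loss for bad content
        self.scores = defaultdict(float)

    def record(self, topic, good):
        score = self.scores[topic]
        if good:
            # Close a small fraction of the remaining gap to a perfect score.
            score += self.gain * (1.0 - score)
        else:
            # Lose a large fraction of whatever has been earned so far.
            score -= self.penalty * score
        self.scores[topic] = score
        return score


rep = AgentReputation()
for _ in range(10):
    built_up = rep.record("celebrity news", good=True)

after_bad = rep.record("celebrity news", good=False)
after_one_good = rep.record("celebrity news", good=True)
medical = rep.scores["medical advice"]  # never written about, so still 0.0
```

With these numbers, one bad piece of content costs more than a single good piece can win back, and a strong score in celebrity news says nothing about medical advice.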
The Google Person Theory
- If you share articles frequently around a certain topic, you must be involved with that topic.
- If you are involved with that topic, you will also be writing about that topic.
- If you are writing about that topic, others will be sharing your writing on that topic.
- If others are sharing your writing on that topic, you must be authoritative about it.
- Therefore, articles you share within that same topic can be trusted (and potentially ranked higher).
How might search engines view my sharing and that of my followers?
Two weeks ago Duane Forrester from Bing posted an interesting article showing how they might visualize whether someone is attempting to game their ranking signals by sharing a lot, or whether the rise in sharing is natural. According to the Information Retrieval based on historical data (PDF) patent:
A large spike in the quantity of back links may signal a topical phenomenon (e.g., the CDC web site may develop many links quickly after an outbreak, such as SARS), or signal attempts to spam a search engine (to obtain a higher ranking and, thus, better placement in search results) by exchanging links, purchasing links, or gaining links from documents without editorial discretion on making links.
If we take "back links" and replace it with social shares, we get this:
A large spike in the quantity of [social shares] may signal a topical phenomenon (e.g., the CDC web site may develop many links quickly after an outbreak, such as SARS), or signal attempts to spam a search engine (to obtain a higher ranking and, thus, better placement in search results) by exchanging [shares], purchasing [shares], or gaining [shares] from [others] without editorial discretion...
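Spotting the spike itself is the easy part. Here is a minimal sketch that flags days where the share count jumps far above its recent average. The 7-day window, the threshold of 3 standard deviations, and the sample numbers are assumptions for illustration, and a flagged day on its own does not tell you whether the cause is a topical phenomenon or spam.

```python
from statistics import mean, stdev


def find_spikes(daily_shares, window=7, threshold=3.0):
    """Return the indexes of days whose share count is far above the trailing window."""
    spikes = []
    for day in range(window, len(daily_shares)):
        history = daily_shares[day - window:day]
        baseline = mean(history)
        spread = stdev(history) or 1.0  # avoid dividing by zero on flat history
        if (daily_shares[day] - baseline) / spread > threshold:
            spikes.append(day)
    return spikes


# Hypothetical daily share counts for one URL.
shares = [10, 12, 9, 11, 10, 13, 11, 12, 95, 14]
spike_days = find_spikes(shares)  # [8]
```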
If you are automatically tweeting every interesting article that comes your way, and you have a large network of people who do the same in an attempt to game the signals, here is the image of how Bing might view those manipulated ranking signals (the below is an example of a "Like Farm"). Check out all of the hubs on the image below:
And here is an image of non-manipulated, truly viral signals. Check out the wide scatter of sources:
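One simple way to put a number on "a few hubs" versus "a wide scatter" is to ask what fraction of all shares came through the busiest few sources. This is only a sketch of that idea, not how Bing actually does it. The function name `hub_concentration`, the choice of the top 3 sources, and the sample data are assumptions.

```python
from collections import Counter


def hub_concentration(share_sources, top_n=3):
    """Fraction of all shares that came through the top_n busiest sources."""
    counts = Counter(share_sources)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(top_n))
    return top / total


# Hypothetical data: each entry is the account a share was relayed through.
like_farm = ["hub_a"] * 40 + ["hub_b"] * 35 + ["hub_c"] * 20 + [
    f"user_{i}" for i in range(5)
]
viral = [f"user_{i}" for i in range(100)]

farm_score = hub_concentration(like_farm)  # 0.95
viral_score = hub_concentration(viral)     # 0.03
```

A value close to 1.0 means almost everything flowed through a handful of hubs, which is the pattern in the "Like Farm" picture. A value near zero looks like the wide scatter of a truly viral piece.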
Some quick pieces of data to dissuade you from spamming or completely automating
We hear a lot of talk around automating your social stream. This seems like an oxymoron to me, since it undercuts the whole purpose of "social" media. Here is an interesting statistical graph for you: Manual tweets get twice the clicks on average!
Next, if you're interested in whether automating your Twitter stream will increase your followers, take this next graph into account:
Key learning: Less automation = more followers
(All data gathered from Triberr - The Reach Multiplier)
How can I build my author trustrank with the search engines?
Here are some ways to benchmark and build your author presence in the eyes of the search engines:
Author microformats - if you own a website, you most definitely should implement the new rel=author microformat, validating through Google Plus. This is a fantastic way to directly claim your content to the search engines. Here is how to implement it on WordPress (via Joost de Valk) and here is the official Google page on authorship.
Klout Topics - Since we were talking about topical trustrank earlier as well, you might want an idea of which topics the search engines might consider you authoritative about. I think that Klout Topics is a good place to start.
Gravatar - Ross Hudgens wrote a great post a few months ago called Generating Static Force Multipliers for Great Content wherein he talked about the importance of a consistent personal brand and image across the Internet. If you have the same photo across many different sites, how could the search engines not use this in determining trustworthiness?
KnowEm is a website where you can find if your username has been taken across many different social networks. This is a great place to go to learn where you need to sign up to protect your username, and therefore your personal brand and author trust.
Conclusion
Author authority has long been a topic of discussion in SEO circles and we've wondered "Does Google have an author rank?" From these patents, I think it is obvious that they have the capability, and especially now with Google Plus for Google, and Facebook for Bing, both are going to be making this even more of a priority.
I'd love to hear your thoughts.
A lot of good information here. I'm going to share it. :)
Great stats, thanks for your work
I think it's worthwhile on certain social media platforms to take the time and receive author status in a specific niche or profession that pertains to your business. But a lot of social media sites are spammed so heavily, I would only take the time and effort to do this on the big boys. High PR sites like Reddit, Digg, LinkedIn and such. Awesome post.
I can see value in the "rel=author" markup in helping to establish authority. My dilemma is that I'm in a couple of quite diverse niches - home and garden and health.
I may also wish to enter other niches too.
Does this diversity simply undermine authority in the "rel=author" context, rather than help establish it? Or does it just mean you've got to work twice or more times as hard to establish authority in each niche.
I'm wondering if it's better to stay under the author radar, rather than put my hand up to being a multi-nicher.
Any ideas?
Social media is actually already a major factor and it can't be overlooked. It has started to get complicated and it requires attention - there are too many signals and factors that come into play. Even if it can be considered an early concept (social media related to rankings), I think the old SEO cowboy days of 2001-2002 are long gone and won't repeat with social network signals.
I think Google knew a long time ago that if they wanted to use social signals as a ranking factor, author authority would have to come into play. It's too easy for social signals to be misused for the benefit of one site or another. Author authority isn't perfect, but it's one way to cut down on spam gaining too much power.
Exactly, Nick! Well said. This is why I like author rank - it makes social signals harder to game, especially surrounding the idea of Topical Trustrank.
I honestly wonder about Klout sometimes - it seems to peg me for topics that I've never or hardly ever mentioned in my tweets, or even on Facebook for that matter.
Apparently, I'm influential about Heroes (the TV series), although I've never mentioned it ... ever ... it's not on my radar. I've been marked influential in a couple of fairly random topics before.
Whether this hurts or helps is something I haven't been able to figure out yet.
Google are now trying out the Celebrity Endorsement on the Adwords so I guess there is some kind of correlation about who the Author is etc.
Great post, thanks for the advice on knowing the social pitfalls when sharing and not just re-sharing.
I couldn't agree more. Automating posts is spam (period) and soon enough the social media sites will be better about flagging and "black hatting it" - Anyone who would want to contribute to the dilution of such powerful tools, valuable social tools, should rethink.
Great post on what search engines look for while crawling. Keep updating with posts like this, thanks.
If you haven’t already claimed your name on the major social network sites, start with that. If you want to make that process a bit easier, go to Claimbrand.com to see whether your name is available. You can reserve it on more than 600 sites, and for half the price of namechk or knowem: https://claimbrand.com/.
Thanks John for your research and brilliant post.
A very nicely compiled post, I must say! The way you have compared nuances like Topical Trustrank, Author Rank and Agent Rank, apart from highlighting them well with graphical representations, shows the kind of research involved in the post.
Somehow "KnowEm" slipped through my filter of sites to check out. Thanks for pointing that one out, it'll be extremely useful for some of my clients.
We have implemented Authorship on some of our top clients' websites and I have already seen a huge improvement and in some cases even an increase in page rank.
Does Google's "authorship" mechanism work well enough that content copied by scraper sites disappears from Google results?
Google seems to be trying to do something to solve their provenance problem (tracking where something originated, and distinguishing the original from copies). Are they succeeding?
John -
It seems too early to tell if this will be the case. Theoretically, I would doubt that scraper sites will be removed from the index, but the authorship markup may make it easier for Google to find out who is a scraper and who is not.
This is something to keep an eye on!
Thanks for the comment. That is interesting that you have seen some improvements in rankings. Fancy writing a case study about it? I'm sure YOUmoz would love to see it!
Social is bad for search, and search is bad for social.
Social signals for search have been tried, many times. They've backfired badly. In October 2010, Google started merging “Places” results into web search results. Spamming Google Places was known to be easy, but until last October, few people bothered, because spamming the search engine for Google Maps wasn’t worth much. After the merger into web results, SEO-generated places spam via social inputs went mainstream. It’s cheaper than link farm building, which requires setting up web sites. SEO firms went overboard. "Guaranteed 1st page placement or your money back!" That outfit wasn't the only one. Google search quality went way down, and for the first time, Google search was heavily criticized in major media, like the New York Times.
Google is still being hammered in the press. This week, their spam problems have hit both the New York Times and Fox News. When the NYT and Fox both say you have problems, you have big problems.
Google was burned because, at first, social signals seem to help search. Not much SEO effort went into spamming Google Maps Search until those results were merged into web results. Nobody seems to be bothering to spam Blekko, because their reach is so tiny. Hook Google+ to web search SERPs, and the floodgates of spam will open.
For social spam, Yelp, Citysearch, Twitter, Facebook and Google host your spam for free. The typical small business only needs tens of spam entries to rank higher, not the hundreds or thousands of bogus links needed to gain equivalent rank. That’s how search spam is ruining social. Social turns into a pipe between spambots and web crawlers. Twitter and Citysearch are barely readable by humans now. This is an SEO blog; you all know this.
Now we have Google+, where Google has access to personal information about their users and can violate their privacy to try to find spammers and fakes. Maybe that will work. Maybe it will get Google in more legal and political trouble. On September 21, Eric Schmidt appears before the U.S. Senate Committee on the Judiciary to explain Google's previous privacy violations and other embarrassingly evil activity. He will not have a good day on Capitol Hill.
This is why search is bad for social, and social is bad for search.
It's quite possible to stop most web spam. "Social" isn't the answer. Due diligence on companies is. But that's all I'm going to say about that for now.
Hey John -
Thanks for the well-written and argued comment. You know, in a lot of ways I agree with you, though I do not go as far as saying that search is bad for social and vice versa. I think the devil lies in the details, and those details are in how you use social.
I agree that spam is a problem online, and I especially don't like that Local listings are so gameable. It also frustrates me that it is an issue, since Google has obviously thought about it and has even applied for patents to deal with it!
This is also why I like that they might give ranking preference to those using rel=author, because this rewards people who are being transparent and ethical online. Google+ is a step in this direction. Also, I hope that Google+ will be harder to game and harder to have spam accounts on. But this remains to be seen.
I think you are also right that "social" isn't the answer to web spam. But could it be part of the solution? Social, rel=author, verified accounts. I think these are all parts of the solution. There is no "one solution".
Thanks again.
Hi John,
I would really like to believe that rel=author will be a part of the solution, but I fear that once again scammers are being handed an opportunity to benefit while being protected by the widespread expectation that use of the markup is reliable.
In my view, the absence of any kind of robust external verification process for the implementation of rel=author leaves the way open for those with bad intent to assume identities online.
Implementation of the markup will provide validity and therefore authority. However, it remains ridiculously easy to set up the profile account(s) required for implementation.
The only difficulty for those who wish to assume an identity is to identify a person with significant offline personal or brand recognition who does not have a strong online presence already established.
Download a few pics from various news or gossip sites. Set up a profile in the person's name. Create a bogus blog, add some code and place a few links, create a fake twitter account and suddenly you have the eyes and ears of the world (and the dollars from ads placed on the blog for all those visitors to see!).
I'm afraid my only real peeve with Google is their penchant for developing solutions for dealing with bad guys that actually just create a new problem!
Don't get me wrong, rel=author is a great idea IF it is built upon a highly secure personal verification process. After all - it is effectively an invitation to claim a person's identity. I for one am seriously perturbed by the potential for harm to people whose only mistake is not to have found a need for a Google profile yet! :(
OK...here endeth the rant :)
Sha
You write "Local listings are so gameable." Yes.
There is an objective right answer to "is there a locksmith shop at this address?" That is a fact, not an opinion. "Likes" are opinions. SERPs are opinions. Physical locations of businesses are objective facts. Getting that right is the first step to cleaning up the mess on Google Places. There are objective hard data sources for such data - business licenses, corporation records, and credit ratings. Those are hard to fake, and trying to fake them has real-world consequences.
Google has proposed looking at WHOIS data and Yellow Pages data, but those, too, are easily gamed. You have to look at the hard data sources. The data is available, although not necessarily for free.
"Rel=author" rewards people for signing up for Google+. If scraper sites start signing up for Google+ and inserting the Google-required "rel=author" links, they could get an edge over real sources that don't bother.
Google's track record at filtering out fake accounts is not impressive. Look up "Jiffy GMail Creator" and "PVA accounts for sale". The PVA (phone verified account) business is so blatant that Google is running three ads on that search. There's already "Facebook Devil" for creating Facebook accounts in bulk. Presumably a similar product will be available for Google+, if there isn't one already. It's just like link farms, except that Facebook and Google will host the link farm for you.
I wrote previously "Presumably a similar product will be available for Google+, if there isn't one already."
Three weeks later, "buyplusonenow.com" appears. Which does exactly what you think it does. It's not even considered black hat SEO any more. They issued a press release on PR Newswire.
I stand by my previous statement: social is bad for search, and search is bad for social.
John Nagle, SiteTruth
I put the whole picture of black-hat SEO for social together in a paper, "Social is bad for search, and search is bad for social". It's not a pretty picture, and it's rapidly getting worse. There's now a program available to create fake Google+ accounts in bulk for SEO purposes. “250,000 +1 votes per day on a fast connection”. There's now a whole ecosystem for social spamming, and it's growing rapidly. The paper has names and screenshots.
I'm tempted to present this paper at a SEO conference, but I might need bodyguards.
Google's plan was to enforce a tough "real names" policy to try to limit social spam, but had to back down. Anything that does stop generation of fake user accounts has to be so intrusive that it will run into privacy laws.
What we really need is more intrusive investigation of advertisers.
Interesting stats about Less automation = more followers. Social media has been given more and more importance not just by search engines but also by businesses, which use these streams as a point of contact and an avenue for sales and support. Building authority on them would make sense even if search engines don't take this as a factor.
Keep up the Good Content John! Appreciate the keen insights.
Cheers!
Thanks man! I'm doing my best :-)
Gravatar! Absolutely.
Ross Hudgens mentioned it as seen in this (thanks John!) great post, it has all the sense in the world.
I've been using Gravatar for some 3 years now and just hued my profile picture according to each platform's corporate colors, so you may find a green-blue-grey version of myself. I have to say that I instinctively thought this may be good for a solid and coherent identity across the 2.0 space, the same way a consistent address reference all across the internet seems to be good for being featured on Google Maps for a search near your place. In addition, it makes life easier for logins on blog comments and some other places.
Find more at (english) https://es.gravatar.com/ with an intro video (want more? see an actual profile)
This is something interesting for me. Author markup/rank is a really good concept and we would definitely use it in many ways. For example, when we publish an article we can add our author markup "rel=author" so that the search engines are aware of the author. Recently I read about the author markup on the Google WMC blog and it is really helpful.
Lastly, I would like to thank dohertyjf for sharing this wonderful article.
-Hiren
A lot of good info in this post. My main takeaway - implement the author tag on all my sites and articles! Thanks =]
The Google Person Theory seems straightforward, but there is a lot of truth to it. Great article.
Great article! I really enjoyed it. One of the things that I love about Search Engine Optimization is that it is always growing and changing. You have to stay up-to-date and informed if you want to stay at the top of your game. Thanks for sharing!
Hi John,
Thanks for posting on this and for pulling all the threads together to create a clearer picture. It's really easy to catch each little clue as it goes past in a tweet, a post, a comment thread etc, but also very easy to become focused on whichever of them happens to get the most attention and forget the total picture.
Clarity is a prerequisite for intelligently targeted action. Thanks for bringing it!
Sha
Sha -
Thanks for the comment! You're very right that it is easy to be focused on whatever is the hot topic at the moment and forget the full picture.
I think I said in my SMX presentation (maybe not, I don't remember. If not, I meant to :-) that sometimes I hate that we talk about how social affects rankings, because the purpose of social is not for ranking, but rather for information exchange and conversation. Some of the most-shared posts on Moz are about social and how it affects rankings. I wonder why.
Cheers!
Nice post! It's interesting to see how social media has been evolving to become a search factor. The quantity of social shares already affects results moderately, but with the advent of influence quantifiers like Klout, Followerwonk and Peerindex, I can see how the engines might begin using the sharer's influence score to affect rankings as well. Obviously it's not going to be a huge signal, but still, it's another incentive to work on your social profile.
Good stuff, thanks for working on this!
Thanks Mitch! I appreciate your feedback and insights!
I think that author signals will become stronger in some areas than others. For example, in news I think it will be more of a factor. In depersonalized niches, like ecommerce, it won't be a signal at all.
At least, that's how I'd do it if I was Google.