If I had been paying attention on April 15th, instead of swanning around New Zealand, I would have noticed that Facebook had launched what amounts to an elementary keyword research tool... and as I write that, I realise that Facebook Lexicon is perhaps less elementary than some of the tools that we already pay for. An anonymous aggregation of "public and semi-public" keywords from across the site, the tool does not provide actual statistics as to how many times a keyword or two-word phrase appears, but like Google Trends, it shows spikes and trends in keyword / phrase popularity. However, it is different to a lot of keyword research tools in that we can be fairly certain that its stats are accurate.



Numerous tools like this already exist for Twitter, which has never been anything but forthcoming with its data. Facebook, on the other hand, arguably contains more involved conversations and has always fiercely protected its content. People, Robert Scoble included, have run into trouble with Facebook for using "its" data, even when simply using a script to do something that could have been carried out manually.

Terms of Service aside, it would have been very difficult for a third party to put together a tool like this for Facebook, as it isn't easy to access the profiles of people with whom one is not directly connected. A tool like this - one that is actually useful - really could only have been developed by the company itself.

The "public and semi-public" information that Facebook uses includes comments on users' profiles, groups and events. It actually surprises me that a company whose attention to privacy borders on extreme is using data from profile comments. If users want to, they can completely privatise their profiles so that only their friends have access. Even though Lexicon's Help section makes the anonymity of comments explicit, I never would have guessed that Facebook would begin scraping profiles for a tool like this. For advertising revenue? Go for it. But they're providing Lexicon to us for free.

At the moment, the tool isn't as useful as it may one day be. There is mention of a "threshold" which a keyword must exceed before it's included in the Lexicon data. From what I can see, the threshold is relatively high at the moment. Queries that actually bring up results are rather generic. Given a lower threshold, we could begin to use this tool for some buzz-centered keyword research.

As to the tool's accuracy, I see no reason to doubt it: the data looks fairly solid. Its findings for a few of this year's more popular Internet phenomena aren't surprising:



"Rick roll" and "rick rolled" only register on Lexicon's radar on and around April 1



SXSW becomes popular...



... and the rise & rise of Twitter follows.

Some other interesting graphs include the fall of Ron Paul, the fact that Digg is huge while Reddit doesn't register, and the fact that people talked about commercials at a far higher rate than usual during this year's Super Bowl. It's also interesting to compare synonyms and similar words to see how Facebook users are referring to certain things (which is what I was doing when I noticed the Super Bowl spike). This comparative data is just as interesting as the ability to track trends, although it does highlight how the graphs tell us nothing about the actual number of instances of a word. A search for "twitter" alone makes it look like Twitter suddenly became a hugely popular word, but when it is plotted against a really popular word, the graph looks very different.
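To see why a single-term graph can mislead, here is a rough Python sketch of the scaling effect. Everything in it is an assumption made for illustration: the weekly counts are invented, the `scale_to_peak` helper is hypothetical, and Facebook has not said how Lexicon actually draws its graphs.

```python
def scale_to_peak(series, peak=None):
    """Rescale counts so that `peak` (default: the series' own maximum) maps to 1.0."""
    top = peak if peak is not None else max(series)
    return [value / top for value in series]


# Invented weekly mention counts, purely for illustration.
niche_term = [2, 3, 4, 40, 55]
popular_term = [900, 950, 1000, 980, 1000]

# Plotted alone, the niche term fills the whole chart and looks like an explosion.
alone = scale_to_peak(niche_term)

# Plotted on an axis shared with the popular term, the same spike barely leaves the floor.
shared_peak = max(niche_term + popular_term)
shared = scale_to_peak(niche_term, peak=shared_peak)

print("alone: ", alone)
print("shared:", shared)
```

The shape of the curve is identical in both cases; only the axis it is measured against changes, which is why comparing a term against a known heavyweight is a useful sanity check.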

To my mind, Lexicon may also help us define the Facebook demographic a little further. When Facebook was solely a university community, it was easier to pinpoint what constituted a typical Facebook user, but the site's diversification has raised arguments as to who and what constitutes the heralded "average user." The SXSW and Twitter stats tell me that, at the least, the population of tech-savvy people is still relatively high: given the relatively high threshold for keywords, both terms must have been mentioned a lot to register. What would make this data far more interesting is an intersection of Facebook's statistics with similar stats from MySpace and Bebo.

Keep an eye on Lexicon. Facebook isn't known for releasing features and then leaving them untouched, so I can only guess that the tool will evolve. One final note: it seems you have to be signed in to Facebook to use it, so if you're a marketer without an account, now is about the time you should get one!