Term Weight & Glasgow Weight vs. Keyword Density

For search engines, keyword density has long been held to be a reliable measure of how concentrated a particular keyword or phrase is on a specific page. However, this method of measurement has long been known in the scientific community to be a poor way of discovering the 'weight' a particular page has for a term.

Instead, IR (information retrieval) developers, i.e. search engineers, have proposed and used a system called 'term weight' and variations of it (normalized term weight, Glasgow weight, etc.).

Classic Normalized Term Weight uses the following equation:

Wi = (tfdi / max tfdi) * log(D / dfi)

Where:
tfdi = term (or phrase of a given length) frequency in document
max tfdi = maximum frequency of any phrase of the same word count in the document
D = number of documents in the database (when using Google, I estimate at 8.1 billion)
dfi = number of documents containing the term/phrase (# of results for a search in quotes)
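
As a rough sketch, the normalized term weight calculation above could be written in Python like this. The function and parameter names are my own, the natural log is an assumption (the formula does not name a base, and a different base only rescales every score by the same factor), and the sample numbers are made up for illustration.

```python
import math


def normalized_term_weight(tf, max_tf, total_docs, docs_with_term):
    """Return (tf / max_tf) * log(total_docs / docs_with_term).

    tf             -- frequency of the term or phrase in the document
    max_tf         -- highest frequency of any phrase of the same word count
    total_docs     -- D, number of documents in the database
    docs_with_term -- dfi, number of documents containing the term

    Assumption: natural logarithm.
    """
    if max_tf <= 0 or total_docs <= 0 or docs_with_term <= 0:
        raise ValueError("max_tf, total_docs and docs_with_term must be positive")
    return (tf / max_tf) * math.log(total_docs / docs_with_term)


# Hypothetical page: the phrase appears 5 times, the most frequent
# phrase of the same length appears 20 times, and a quoted search
# returns 1,000,000 results out of an estimated 8.1 billion documents.
weight = normalized_term_weight(tf=5, max_tf=20,
                                total_docs=8.1e9, docs_with_term=1_000_000)
print(f"normalized term weight: {weight:.3f}")
```

A term that appears in every document gets a weight of zero, because log(D / dfi) is log(1). Rare terms are what push the score up.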

A second equation, Glasgow Weight, can also be useful (I generally use both when analyzing my own site vs. the competition):

Wij = (log(freqij + 1) / log(lengthj)) * log(N / ni)

Where:
freqij = frequency of term i (a word or phrase of a given length) in document j
lengthj = number of unique terms (word or phrase of the same length) in document j
N = number of documents in database (again, I use 8.1 billion for Google)
ni = number of documents containing the term (results of a search in quotes)
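
The Glasgow weight can be sketched the same way. Again, the names are my own, the natural log is an assumption, and the sample figures are invented. The guard on the unique-term count is also my addition: log(1) is zero, so a document with a single unique term would cause a division by zero.

```python
import math


def glasgow_weight(freq, unique_terms, total_docs, docs_with_term):
    """Return log(freq + 1) / log(unique_terms) * log(total_docs / docs_with_term).

    freq           -- freqij, frequency of the term in the document
    unique_terms   -- lengthj, number of unique terms in the document
    total_docs     -- N, number of documents in the database
    docs_with_term -- ni, number of documents containing the term

    Assumption: natural logarithm.
    """
    if unique_terms <= 1:
        raise ValueError("unique_terms must be greater than 1")
    if total_docs <= 0 or docs_with_term <= 0:
        raise ValueError("total_docs and docs_with_term must be positive")
    return (math.log(freq + 1) / math.log(unique_terms)
            * math.log(total_docs / docs_with_term))


# Hypothetical comparison of two pages for the same phrase.
my_page = glasgow_weight(freq=5, unique_terms=300,
                         total_docs=8.1e9, docs_with_term=1_000_000)
competitor = glasgow_weight(freq=12, unique_terms=900,
                            total_docs=8.1e9, docs_with_term=1_000_000)
print(f"my page: {my_page:.3f}, competitor: {competitor:.3f}")
```

Because the frequency is dampened by a log and divided by the log of the document's unique-term count, repeating a phrase many more times raises the score only slowly.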

Comparing the numbers these equations produce is a much better way to check your page's use of a specific keyword phrase or term than just measuring keyword density. I hope to have an SEOmoz tool up in the next 2 months that can help by making the calculations for you. In the meantime, use Excel.