By: Dan-Petrovic

User Behaviour Data as a Ranking Signal

Advanced SEO | Search Engines | User Experience (UX)
The author's views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

Question: How does a search engine interpret user experience?
Answer: They collect and process user behaviour data.

Types of user behaviour data used by search engines include click-through rate (CTR), navigational paths, time, duration, frequency, and type of access.

Click-through rate

Click-through rate analysis is one of the most prominent search quality feedback signals in both commercial and academic information retrieval papers. Both Google and Microsoft have made considerable efforts towards developing mechanisms that help them understand when a page receives a higher or lower CTR than expected.

Position bias

CTR values are heavily influenced by position, because users are more likely to click on top results. This is called “position bias,” and it’s what makes it difficult to accept CTR as a useful ranking signal. The good news is that search engines have numerous ways of dealing with the bias problem. In 2008, Microsoft found that the "cascade model" worked best in bias analysis. Despite a slight degradation in confidence for lower-ranking results, it performed well without any need for training data and operated parameter-free. The significance of their model lies in the fact that it offered a cheap and effective way to handle position bias, making CTR more practical to work with.
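
To make the cascade idea concrete, here is a minimal, illustrative Python sketch of cascade-style CTR de-biasing. It is not Microsoft's actual implementation; the session format and the assumption that a user examines every position down to their click (or the whole list when they abandon the query) are simplifications of the model.

    # Minimal sketch of cascade-model CTR de-biasing (illustrative only).
    # Each session records the 1-based position clicked, or None for an
    # abandoned query with no click.
    from collections import defaultdict

    def cascade_relevance(sessions, num_positions=10):
        """Estimate relevance as clicks / times a position was examined."""
        examined = defaultdict(int)
        clicked = defaultdict(int)
        for click_pos in sessions:
            last_examined = click_pos if click_pos else num_positions
            for pos in range(1, last_examined + 1):
                examined[pos] += 1
            if click_pos:
                clicked[click_pos] += 1
        return {pos: clicked[pos] / examined[pos]
                for pos in range(1, num_positions + 1) if examined[pos]}

    # Six sessions for one query: positions clicked (None = no click).
    print(cascade_relevance([1, 1, 3, None, 2, 1]))

The point of the adjustment is that a result in, say, position three is only "blamed" for sessions in which the user plausibly saw it, rather than for every impression.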

Result attractiveness

Good CTR is a relative term. A 30% CTR for a top result in Google wouldn't be a surprise, unless it’s a branded term, in which case it would be a terrible CTR. Conversely, the same value for a competitive term would be extraordinarily high if the result is nested between “high-gravity” search features (e.g. an answer box, knowledge panel, or local pack).

I've spent five years closely observing CTR data in the context of its dependence on position, snippet quality, and special search features. During this time I've come to appreciate the value of knowing when a deviation from the norm occurs (a minimal sketch of that deviation check follows the list below). In addition to ranking position, consider other elements which may impact the user’s choice to click on a result:

  • Snippet quality
  • Perceived relevance
  • Presence of special search result features
  • Brand recognition
  • Personalisation
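
Here is the deviation check mentioned above as a rough Python sketch. The baseline CTR curve is a placeholder rather than a set of real Google averages; in practice you would build it from your own Search Console data, per vertical and per SERP layout.

    # Illustrative sketch of spotting CTR outliers against a position baseline.
    # The baseline percentages are placeholders, not real Google averages.
    EXPECTED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}

    def ctr_deviation(position, impressions, clicks):
        """Ratio of observed to expected CTR (>1 means over-performing)."""
        observed = clicks / impressions
        expected = EXPECTED_CTR.get(position)
        return observed / expected if expected else None

    # A result at position 2 with a 6% CTR is pulling roughly 40% of its
    # expected clicks: worth reviewing the snippet, title, or nearby SERP features.
    print(round(ctr_deviation(position=2, impressions=5000, clicks=300), 2))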

Practical application

Search result attractiveness is not an abstract academic problem. When done right, CTR studies can provide a lot of value to a modern marketer. Here's a case study where I take advantage of CTR average deviations in my phrase research and page targeting process.

Google's title bolding study

Google is also aware of additional factors that contribute to result attractiveness bias, and they've been busy working on non-position click bias solutions.

(Image: Google CTR study)

They show strong interest in finding ways to improve the effectiveness of CTR-based ranking signals. In addition to solving position bias, Google's engineers have gone one step further by investigating SERP snippet title bolding as a result attractiveness bias factor. I find it interesting that Google recently removed bolding in titles for live search results, likely to eliminate the bias altogether. Their paper highlights the value in further research focused on the bias impact of specific SERP snippet features.

URL access, duration, frequency, and trajectory

Logged click data is not the only useful user behaviour signal. Session duration, for example, is a high-value metric if measured correctly: a user could navigate to a page and leave it idle while they go out for lunch. This is where active user monitoring systems become useful.

There are many assisting user-behaviour signals which, while not indexable, aid measurement of engagement time on pages. This includes various types of interaction via keyboard, mouse, touchpad, tablet, pen, touch screen, and other interfaces.
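
As a rough illustration of what "active" measurement could look like, here is a small Python sketch that only counts the time between interaction events (clicks, scrolls, key presses) when the gap stays under an idle threshold. The event format and the 30-second threshold are assumptions, not anything Google has published.

    # Sketch of "active" session duration: only count gaps between interaction
    # events that fall below an idle threshold. Timestamps are seconds since
    # page load; the 30-second threshold is an arbitrary assumption.
    def active_duration(event_times, idle_threshold=30.0):
        total = 0.0
        for prev, curr in zip(event_times, event_times[1:]):
            gap = curr - prev
            if gap <= idle_threshold:
                total += gap
        return total

    # The user interacts, goes to lunch (a 3600-second gap), then interacts again.
    events = [0, 5, 12, 20, 3620, 3625, 3640]
    print(active_duration(events))  # 40.0 -- the lunch break is not counted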

Google's John Mueller recently explained that user engagement is not a direct ranking signal, and I believe this. Kind of. John said that this type of data (time on page, filling out forms, clicking, etc.) doesn't do anything automatically.

At this point in time, we're likely looking at a sandbox model rather than a live listening and reaction system when it comes to the direct influence of user behaviour on a specific page. That said, Google does acknowledge limitations of quality-rater and sandbox-based result evaluation. They’ve recently proposed an active learning system, which would evaluate results on the fly with a more representative sample of their user base.

"Another direction for future work is to incorporate active learning in order to gather a more representative sample of user preferences."

Google's result attractiveness paper was published in 2010. In early 2011, Google released the Panda algorithm. Later that year, Panda went into flux, indicating an implementation of one form of an active learning system. We can expect more of Google's systems to run on their own in the future.

The monitoring engine

Google has designed and patented a system in charge of collecting and processing user behaviour data. They call it "the monitoring engine", but I don't like that name; it's too long. Maybe they should call it, oh, I don't know... Chrome?

The actual patent describing Google's monitoring engine is a truly dreadful read, so if you're in a rush, you can read my highlights instead.

MetricsService

Let's step away from patents for a minute and observe what's already out there. Chrome's MetricsService is a system in charge of the acquisition and transmission of user log data. Transmitted histograms contain very detailed records of user activities, including opened/closed tabs, fetched URLs, maximized windows, et cetera.

Enter this in Chrome: chrome://histograms/

Here are a few external links with detailed information about Chrome's MetricsService, reasons and types of data collection, and a full list of histograms.

Use in rankings

Google can process duration data in an eigenvector-like fashion using nodes (URLs), edges (links), and labels (user behaviour data). Page engagement signals, such as session duration, are used to calculate the weights of nodes. Here are the two modes of a simplified graph of three nodes (A, B, C), with time labels attached to each:

(Image: undirected and directed graph modes of nodes A, B, and C with time labels)

In an undirected graph model (undirected edges), the weight of node A is directly driven by its label value (a 120-second active session). In a directed graph (directed edges), node A links to nodes B and C. By doing so, it receives a time-label credit from the nodes it links to.

In plain English, if you link to pages that people spend a lot of time on, Google will add a portion of that “time credit” towards the linking page. This is why linking out to useful, engaging content is a good idea. A “client behavior score” reflects the relative frequency and type of interactions by the user.

What's interesting is that the implicit quality signals of deeper pages also flow up to higher-level pages.
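
A toy Python sketch of this idea follows. The 30% credit share and the dwell-time figures are invented for illustration; the patent does not specify how much of the engagement credit flows back to the linking page.

    # Illustrative sketch of "time credit" flowing back to linking pages.
    # The credit_share of 0.3 is an arbitrary assumption.
    def engagement_weights(dwell, links, credit_share=0.3):
        """dwell: active seconds per URL; links: URL -> list of linked URLs."""
        weights = {}
        for url, seconds in dwell.items():
            credit = sum(dwell.get(target, 0.0) for target in links.get(url, []))
            weights[url] = seconds + credit_share * credit
        return weights

    dwell = {"A": 120.0, "B": 300.0, "C": 60.0}
    links = {"A": ["B", "C"]}  # A links out to B and C
    print(engagement_weights(dwell, links))
    # {'A': 228.0, 'B': 300.0, 'C': 60.0} -- A gains credit from the pages it links to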

Reasonable surfer model

“Reasonable surfer” is the random surfer's successor. The PageRank damping factor reflects the original assumption that after each followed link, our imaginary surfer is less likely to click on another random link, eventually abandoning the surfing path. Most search engines today work with a more refined model encompassing a wider variety of influencing factors.

For example, the likelihood of a link being clicked on within a page may depend on:

  • Position of the link on the page (top, bottom, above/below fold)
  • Location of the link on the page (menu, sidebar, footer, content area, list)
  • Size of anchor text
  • Font size, style, and colour
  • Topical cluster match
  • URL characteristics (external/internal, hyphenation, TLD, length, redirect, host)
  • Image link, size, and aspect ratio
  • Number of links on page
  • Words around the link, in title, or headings
  • Commerciality of anchor text

In addition to perceived importance from on-page signals, a search engine may judge link popularity by observing common user choices. A link on which users click more within a page can carry more weight than one with fewer clicks. Google in particular mentions user click behaviour monitoring in the context of balancing out traditional, more easily manipulated signals (e.g. links).
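
To show how this differs from the classic model, here is a hedged Python sketch of a "reasonable surfer" style calculation: a plain power-iteration PageRank in which each page's outbound probability mass is split according to per-link weights (which could be derived from link prominence or observed clicks) rather than uniformly. The weights and graph are invented for illustration, and dangling pages simply lose their mass in this simplified version.

    # "Reasonable surfer" sketch: PageRank with per-link weights instead of a
    # uniform split. Weights are illustrative (e.g. prominence or click share).
    def reasonable_surfer_rank(out_links, damping=0.85, iterations=50):
        """out_links: page -> {target: link_weight}."""
        pages = set(out_links) | {t for targets in out_links.values() for t in targets}
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1 - damping) / len(pages) for p in pages}
            for page, targets in out_links.items():
                total_weight = sum(targets.values())
                for target, weight in targets.items():
                    new_rank[target] += damping * rank[page] * weight / total_weight
            rank = new_rank
        return rank

    # Page A links to B (prominent, frequently clicked) and C (footer link).
    print(reasonable_surfer_rank({"A": {"B": 0.8, "C": 0.2}}))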

In the following illustration, we can see two outbound links on the same document (A) pointing to two other documents, (B) and (C). On the left is what would happen in the traditional "random surfer" model, while on the right we have a link which sits in a more prominent location and tends to be the preferred choice of many of the page's visitors.

(Image: random surfer vs. reasonable surfer link weighting between documents A, B, and C)

This method can be used on a single document or in a wider scope, and is also applicable to both single users (personalisation) and groups (classes) of users determined by language, browsing history, or interests.

Pogo-sticking

One of the most telling signals for a search engine is when users perform a query and quickly bounce back to search results after visiting a page that didn't satisfy their needs. The effect was described and discussed a long time ago, and numerous experiments show it in action. That said, many question the validity of SEO experiments, largely due to their rather non-scientific execution and general data noise. So it's nice to know that the effect has been on Google's radar.
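
As an illustration only, here is a small Python sketch of how pogo-sticking could be flagged from a simplified click log; the event format and the 15-second "short dwell" threshold are assumptions, not anything Google has disclosed.

    # Flag pogo-sticking from a simplified click log.
    # Each event: (query, url, dwell_seconds, returned_to_serp).
    from collections import defaultdict

    def pogo_stick_rate(events, short_dwell=15.0):
        """Per-URL share of clicks that bounced straight back to the SERP."""
        clicks, pogos = defaultdict(int), defaultdict(int)
        for query, url, dwell, returned in events:
            clicks[url] += 1
            if returned and dwell < short_dwell:
                pogos[url] += 1
        return {url: pogos[url] / clicks[url] for url in clicks}

    log = [
        ("best laptop", "example.com/laptops", 8.0, True),     # quick bounce back
        ("best laptop", "example.com/laptops", 240.0, False),  # satisfied visit
        ("best laptop", "example.com/laptops", 5.0, True),     # quick bounce back
    ]
    print(pogo_stick_rate(log))  # {'example.com/laptops': 0.666...}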

Address bar

URL data can include whether a user types a URL into the address field of a web browser, or whether a user accesses a URL by clicking on a hyperlink to another web page or a hyperlink in an email message. So, for example, if users type in the exact URL and hit enter to reach a page, that represents a stronger signal than visiting the same page via browser autofill/suggest or by clicking on a link (a small scoring sketch follows the list below).

  • Typing in full URL (full significance)
  • Typing in partial URL with auto-fill completion (medium significance)
  • Following a hyperlink (low significance)
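
Here is the scoring sketch referred to above, in Python. The numeric weights are invented purely to mirror the ordering in the list; the patent does not publish actual values.

    # Assumed weights that mirror the significance ordering above.
    ACCESS_WEIGHTS = {
        "typed_full_url": 1.0,       # full significance
        "typed_with_autofill": 0.6,  # medium significance
        "followed_link": 0.2,        # low significance
    }

    def navigational_significance(visits):
        """visits: list of access types for one URL -> aggregate score."""
        return sum(ACCESS_WEIGHTS.get(access, 0.0) for access in visits)

    print(navigational_significance(
        ["typed_full_url", "followed_link", "typed_with_autofill"]))  # 1.8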

Login pages

Google monitors users and maps their journey as they browse the web. They know when users log into something (e.g. social network) and they know when they end the session by logging out. If a common journey path always starts with a login page, Google will add more significance to the login page in their rankings.

"A login page can start a user on a trajectory, or sequence, of associated pages and may be more significant to the user than the associated pages and, therefore, merit a higher ranking score."

I find this very interesting. In fact, as I write this, we're setting up a login experiment to see if repeated client access and page engagement impacts the search visibility of the page in any way. Readers of this article can access the login test page with username: moz and password: moz123.

The idea behind my experiment is to have all the signals mentioned in this article ticked off:

  • URL familiarity, direct entry for maximum credit
  • Triggering frequent and repeated access by our clients
  • Expected session length of 30-120 seconds
  • Session length credit up-flow to home page
  • Interactive elements add to engagement (export, chart interaction, filters)

Combining implicit and traditional ranking signals

Google treats various user-generated data with different degrees of importance. Combining implicit signals, such as day of the week, active session duration, visit frequency, or type of article, with traditional ranking methods improves the reliability of search results.

(Image: page quality metrics combining implicit and traditional signals)
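
One simple way to picture such a combination is a weighted blend of a traditional relevance score and normalised behaviour signals, as in the Python sketch below. The weight, the signal names, and the example values are all invented; the real combination function is not public.

    # Blend a traditional ranking score with implicit behaviour signals.
    def combined_score(traditional, implicit, implicit_weight=0.3):
        """traditional: relevance score in [0, 1]; implicit: signals in [0, 1]."""
        behaviour = sum(implicit.values()) / len(implicit) if implicit else 0.0
        return (1 - implicit_weight) * traditional + implicit_weight * behaviour

    page_signals = {
        "active_session_duration": 0.8,  # long engaged visits
        "visit_frequency": 0.6,          # users come back
        "direct_access": 0.4,            # some typed-in traffic
    }
    print(round(combined_score(traditional=0.7, implicit=page_signals), 3))  # 0.67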

Impact on SEO

The fact that behaviour signals are on Google's radar underscores the rising importance of user experience optimisation. Our job is to incentivise users to click, engage, convert, and keep coming back. This complex task requires a multidisciplinary mix of technical, strategic, and creative skills. We're being evaluated by both users and search engines, and everything users do on our pages counts. The evaluation starts at the SERP level and follows users throughout their whole journey on your site.

"Good user experience"

Search visibility will never depend on subjective user experience, but on search engines' interpretation of it. Our most recent research into how people read online shows that users don't react well when facing large quantities of text (this article included) and will often skim content and leave if they can't find answers quickly enough. This type of behaviour may send the wrong signals about your page.

My solution was to present all users with a skeletal form of the content, with supplementary content available on demand through the use of hypotext. As a result, our test page (~5,000 words) increased the average time per user from 6 to 12 minutes, and its bounce rate dropped from 90% to 60%. The very article where we published our findings shows click, hover, and scroll-depth activity at double or triple the values of the rest of our content. To me, this was convincing enough.

(Image: click, hover, and scroll activity on the test page)

Google's algorithms disagreed, however, devaluing the content not visible on the page by default. Queries contained within unexpanded parts of the page aren't bolded in SERP snippets, and the page currently doesn't rank as well as pages which copied that same content but made it visible. This is ultimately something Google has to work on, but in the meantime we have to be mindful of this perception gap and make calculated decisions in cases where good user experience doesn't match Google's best practices.


About Dan-Petrovic — Australian marketing agency specialising in corporate-strength SEO, PPC, CRO, content marketing and outreach campaigns. For more information visit: DEJAN

Comments


  • Great article Dan, thank you! Maybe one of the best about UX optimization for search

    • Thanks Olivier, so much of what I touched on can be expanded into a whole new article, the concept of result attractiveness in particular.

  • I am now trying new strategies to increase the CTR of my websites. For now it is something that does improve, and it seems that Google is taking it into account. Thanks for clarifying some of the questions I had. :)

    • It's a win-win activity. Even if Google doesn't increase your rank, you still end up with a stronger CTR. Definitely worth doing.

  • I was actually wondering about Hypotext. I coded a similar plugin, Expander, which does the same thing using CSS3. I haven't checked SERP and rankings, so I might do it now, and compare it to your Hypotext.

  • When I saw this tweet from @randfish saying that "some of the best, most advanced writing in SEO right now is from @dejanseo" I was intrigued, but I thought it would be an exaggeration. After reading this post I am obliged to say that it is, in fact, an understatement.

    Dan, I'm going to melt your post into bars and keep them in a vault in Switzerland. Pure gold. Thank you for such an interesting and well documented article, and for all the research work supporting it.

  • Thank you Dan for an amazing study on user behavior and its impact as a ranking signal. I would say that SEO from now onwards is more about enhancing the user experience and providing the fruitful information users are seeking. And about Google Chrome, I personally believe that they are using it to create a database with things like usage patterns, demographics and other stuff that you can easily target in Google Adwords, and we might see more targeting options in the future...

  • Well done, Dan! One of the best articles so far about user behaviour data compared with ranking; I did hear Mr. John Mueller talking about it. On the other hand, when it comes to the Adwords ad scoring system, Google does measure user behaviour as part of the Ad Quality Score. Thank you Dan

  • Wow, this is really thorough and quite eye-opening. Interesting to see how Google attempts to break down such a complicated concept into something machine-understandable.

  • Now this is what I call a study! Top work Dan...

    I think that what most will find interesting is the impact on SEO. Many believe SEO ends when you have tidied up your titles or built a few links - it goes so much deeper!

    10/10

    -Andy

  • Wow! I had to spend a lot of time reading all the content (external included) but no doubt... REALLY GREAT user behaviour data article Dan!!

    Thanks for sharing all your knowledge and test results. Good job!

  • Apart from reducing the bounce rate, what impact does adopting a plugin such as Hypotext have from an SEO perspective? Is the hidden text readable by search engine crawlers?

  • I'm wondering how sustainable the CTR-based growth of rankings might be. This could apply to the news type of content, not necessarily to evergreen content. Any news becomes outdated at some point.

    In my opinion we should look at it as an additional algorithm layer that makes SERPs more attuned to dynamically changing demand for content.

    As for the statement that Google doesn't know what people are doing on websites: they know more than they tell us, and even more than we could imagine.

    Nice read @Dan-Petrovic. You've earned my share ;)

  • Great article Dan! Even without being sure how strong UX signals are, we highly recommend our clients focus on those aspects of web design (where it doesn't negatively affect SEO, of course). Just one thing bothers me. Do you know anything about Google collecting, processing, and filtering "user behavior" data from referrals in the context of growing spammy traffic from pages like semalt.com, social-buttons.com, traffic2money.com (mostly from Russia)? Have you checked how blocking this traffic affects SEO? It generates really low-quality bot traffic (short visits, high bounce rate, etc.). Filtering it in Google Analytics probably won't solve the problem and it should be blocked on the server, right? Do you think that those bots can imitate logged-in users?

  • First of all I would like to thank you for such a nice informative article.

    I implemented the same suggestions and I can see improvements in rank. I have never done any sort of SEO activity in the past; only the changes suggested in this article have been implemented recently, and I've noticed lots of ups and downs in ranking. But I am facing one problem with my website: none of my internal pages are ranking except 2-3, and most of the keywords are ranking for the homepage, which I am not even targeting.

    Please suggest what I should do.

  • You are the Man! This is the kind of information I love thanks for sharing. Everything is really spot on maybe a few philosophical differences.

    Achieving a high click-through rate is a pretty simple formula. You are creating a virtual meeting of the minds between a search query and a result, which is hopefully your web page. Then there is all this "noise" that gets in the way. Bummer; no one said life was going to be easy.

    Maybe they should call it, oh, I don't know... Chrome? << Funny I lol’d on that, you are correct!

    Of course click-through rate is important, but I do have this saying: "you don't take clicks to the bank, you take conversions." What I mean by that is you can have a boat ton of clicks with a low conversion rate, and what good is that? I like going to the bank.

    See, here is a quick shot of an Adwords campaign. (Last post 2012? Yeah, I don't post to FB anymore.) I love using PPC to give me insights into SEO.

    The click thru rate is only 11% but the conversion rate is 33%. BANK lol

    https://www.facebook.com/pages/AD-Web-Designs/132216191971


    Happy Monday

  • Great summary of what Google does to correct its main algorithm based on user feedback.

    And it's just amazing to see the amount of data collected by Chrome with its MetricsService!

    • Yes, they collect quite a few things, but apparently only from those who opt in. It's sent as feedback upon application crash, but I've seen some evidence of it being sent intermittently outside of the "crash report" scenario.

  • Great article. Lots of good information in there. Thank you for the write-up and research! Much appreciated.

    One note... Google said they will no longer be indexing content hidden in scripting like toggles or accordions, or any item where user interaction is required to expand the content. Now, how well this is applied is anyone's guess, but it sounds like why your sections were not having their content indexed.

    John Mueller said they assume if you are hiding it, it must not be important, so I don't think it will change soon.

    • Hidden content (tabs, accordions, hypotext, etc.) is still indexed, only devalued. I've seen evidence of it not ranking as well as it should. That said, in an experimental setting it didn't make any difference: https://dejanseo.com.au/experiment-results-can-hidden-content-rank-well/

      • I do see some of it being indexed, some of it not. However, Google's official statement is that they do not index it if the content is being loaded on click, which could explain the differences: https://www.seroundtable.com/google-content-hidden-dynamic-20653.html

  • Thank you Dan for your hard-earned data. I'll refer this to the team and we'll see if we can create a valuable infographic based on this.

  • This is the reason why I don't use Chrome and don't like it at all.

    It all began in 2009 when we made a site for a friend of mine. We put the site into production so the customer could make a few minor changes, and I visited the site with Chrome (version 5 or 6 at the time) just to be sure that everything worked. We had agreed to wait for the changes and then submit the site in WMT for indexing. At that time the site had 0 links, so no one in the world knew about it. Almost...

    The next day I got an angry call that the customer's site was indexed in Google before their changes were applied. We were all #WTF because no one had submitted the site. Then I remembered my innocent testing with Chrome.

    Even today we don't know what kind of information Chrome (desktop: Win, OSX; mobile: Android, iOS; and ChromeOS) sends back and forth to its servers. For paranoid users there is Chromium, the open source project; this is the same Chrome without the vendor extensions.


    Second, my thoughts on "long" content are similar to yours, but I don't like "hiding content" at all. Probably because I'm a victim of this technique too, using tabs in my case. The result was terrible (for me): 99% of the page content was devalued and can't be found in search engines at all. When this happens in large quantities the whole site can get algorithmic filtering, making the situation even worse.

    • I first saw this article and then decided to investigate further. During my research friends recommended I use Fiddler and Wireshark to monitor outbound packets sent by Chrome. Have you tried them?

      • Yup, also CommView, HTTPScoop, and tcpdump. Also a few others.

        But those guys are smart and already know that someone could inspect the information, so they're prepared: encryption, compression, and obfuscation on many layers.

        And the story goes on, just as in this article. Debian users (one of the open source community's guardians) filed a bug ticket because Chromium (the open source product) was trying to load Hotword (a closed source extension). Funnily enough, this extension can't be seen in the list of "active extensions". Hotword just gets access to the microphone, listening for a specific phrase.

        So in 2015 it isn't a surprise that many users use Ghostery, AdBlock, and similar extensions.

        • Sounds like you could publish a fascinating article with your technical knowledge. I'd read it :)

  • Good post Dan!

    When we look for something on Google, the user trend, at least from what I have observed, is to click on the top results. Only when we don't find what we want on the first page do we go on to the following ones.

    It is important to try to have a good position to win customers, and to have good content that makes them wish to return.

    Thanks for sharing your study with us.

  • Dan, this phrase research tool that you linked in the case study for xbmc-skins.com, is it one of your own? Is it intended for free usage?

    • You can use it for free, but to calculate difficulty you need to load credits into it as it costs us to retrieve difficulty data via an API.

  • Thanks for such a good post: lots of valuable info!

  • hello

    nice article. thanks for sharing all that useful information about ux optimization!

  • Dan,

    This post was more than just amazing, you are an invaluable asset to the SEO community.

    Thank you.

  • I'm pretty sure pogo-sticking is the most important of the factors you listed. If enough people pogo-stick back into the SERPs, it either means that G has served the wrong page or the page itself is not very useful (despite G thinking it is), and hence it's worth them substituting it with another.

  • Well worth the time to read, love your work mate.

  • The writer of the post is very intelligent.... I can learn many things from the post... thank you friend..... https://privatechefsclub.com/

  • Nice article Dan-Petrovic. Your article is very insightful, no doubt, but I don't think user behavior has much impact on ranking positions. I am working on many different projects at this time and many of them are not getting much traffic, yet they have stable ranking positions.

  • Great article. I especially like this: "if you link to pages that people spend a lot of time on, Google will add a portion of that “time credit” towards the linking page". It makes a lot of sense; Google loves when you share good resources with others, not only when others link to you.

  • Excellent insights, Dan! Extremely valuable info! User experience optimization is indeed the advanced approach to improving CTR. With the diverse use of mobile devices, crafting the user experience should be at the top of the priority list.

  • Great insight Dan. Truly more focus should be on optimizing the factors responsible for increasing the CTR of your website.

  • Awesome post.... A real learning platform. For the next post, please give some new ideas for getting better user behaviour.....

  • Great article Dan.

    I have one question: if user behavior data is a ranking signal, then what about newly created domains? These domains are attacked by spammy bots which are incessantly pushing the bounce rate up to 100%. In that case, what should we do?

    • I don't think bot activity can impact a site.

    • Asim, the study is about user behavior on SERPs: whether the behavior of users landing on a particular page from Google can impact the rankings or not.

  • Very nice analysis. Congrats Dan.

  • Fantastic post Dan. The way search engines are maturing and increasing their ability to understand behavioral patterns is just amazing. I'd love to see how they are going to tackle the use of JavaScript in the upcoming years, as lots of modern designs are built on it and so far search engines have a hard time dealing with it.

  • Great article - VERY insightful! Thanks much!

  • Hello Dan,

    Really great insights on CTR and I agree 100%. When it comes to user experience, I would like to add one more point: sometimes we open our own website, and our employees do too. That creates a bunch of unwanted bounces which look bad in analytics, so download the Analytics opt-out browser add-on here https://tools.google.com/dlpage/gaoptout and stay away from your own unwanted traffic.

    Hope this helps everyone who did not know about it.

    Keep it up Dan :)

    • jackjohney

      What if the user or employee changes the browser? Every time, they would need to install it on the new browser. I hope you know about the filter options in GA. The best option to avoid your own traffic is to create an IP filter; no need for any add-ons. Here is the link for the same: https://support.google.com/analytics/answer/1034840?hl=en

      • What if the employee gets an IP via DHCP from the ISP?

        • jackjohney

          True, and a genuine DHCP server gives the LAN admin a ton of control over IP assignment, so the admin can manage or set an IP range; thus you can filter that IP range in GA.