We've talked plenty in the past about methods to control search engine spiders' access to documents on your website, and we've discussed cloaking in some depth as well. But there's an under-utilized and extremely powerful methodology for serving unique content to visitors and search engines, based on the different experiences each seeks, that I consider critical to advanced search engine optimization. So... let's dive in!
What's a Cookie?
A cookie is a small text file that websites can leave on a visitor's hard disk, helping them to track that person over time. Cookies are the reason Amazon.com remembers your username between visits and the reason you don't necessarily need to log in to your Hotmail account every time you open your browser. Cookie data typically contains a short set of information about when you last accessed a site, an ID number, and, potentially, information about your visit.
As a website developer, you can create options to remember your visitors using cookies for tracking purposes or to display different information to users based on their actions or preferences. Common uses include remembering a username, maintaining a shopping cart, or keeping track of previously viewed content. For example, if you've signed up for an account with SEOmoz, we'll give you options in your My Account page about how you want to view the blog and remember that the next time you visit.
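In PHP, for example, dropping and reading a cookie takes only a line or two each (a minimal sketch; the cookie name and value here are just illustrations):

```php
<?php
// Remember a visitor's display preference for 30 days (name and value
// are illustrative). Must be sent before any page output.
setcookie('blog_view', 'full_posts', time() + 60 * 60 * 24 * 30, '/');

// On a later request, read the preference back.
if (isset($_COOKIE['blog_view']) && $_COOKIE['blog_view'] === 'full_posts') {
    // ...show full posts instead of snippets...
}
```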
What are Session IDs?
Session IDs are virtually identical to cookies in functionality, with one big difference: when you close (or restart) your browser, the session ID information is (usually) no longer stored on your hard drive. The website you were interacting with may still remember your data or actions on its end, but it cannot retrieve a session ID from your machine once the session has expired (and by default, session IDs expire when the browser shuts down). In essence, they're temporary cookies (although, as we'll see below, there are options to control this).
While, technically speaking, session IDs are just cookies without an expiration date, it is possible to set them with expiration dates similar to cookies' (going out decades); in that sense, they're virtually identical to cookies. Session IDs do come with an important caveat, though: they are frequently carried in the URL string, which can create serious problems for search engines (every request produces a unique URL with duplicate content). A simple fix, however, uses conditional 301 redirects to show bots a non-sessioned version of the page (described in detail here - search engine friendly cloaking by removing session IDs).
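For instance, a minimal sketch of that conditional redirect in PHP (the bot list and the PHPSESSID parameter name are assumptions; adjust for your own setup):

```php
<?php
// Sketch: 301 known bots to a session-free URL. The user-agent list
// and the PHPSESSID parameter name are illustrative, not definitive.
$agent  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$is_bot = (bool) preg_match('/(Googlebot|Slurp|msnbot)/i', $agent);

if ($is_bot && isset($_GET['PHPSESSID'])) {
    // Rebuild the query string without the session ID.
    $params = $_GET;
    unset($params['PHPSESSID']);
    $clean = strtok($_SERVER['REQUEST_URI'], '?');
    if (!empty($params)) {
        $clean .= '?' . http_build_query($params);
    }
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: ' . $clean);
    exit;
}
```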
IMPORTANT NOTE: Any user can turn off session IDs and/or cookies in their browser settings. This often makes web browsing considerably more difficult, and many sites will actually display a page saying that cookies/sessions are required to view their content or interact. Cookies, persistent though they may be, are also deleted by users on a semi-regular basis. This Comscore study from 2007 found that 33% of web users deleted their cookies at least once per month.
How do Search Engines Interpret Cookies & Session IDs?
They don't. Search engine spiders aren't built to accept cookies or session IDs; they act as browsers with this functionality shut off. However, unlike visitors with non-cookie-accepting browsers, the crawlers can sometimes reach sequestered content, by virtue of webmasters who specifically want to let them through. Many sites have pages that require cookies or sessions to be enabled, but carry special rules for search engine bots, permitting them to access the content as well. Although this is technically cloaking, search engines generally allow this type of segmented content delivery.
Despite the occasional access granted to engines for cookie/session-restricted pages, the vast majority of cookie and session ID usage creates content, links, and pages that limit access. As web developers, we can leverage the power of this "accepted cloaking" to build more intelligent sites and pages that function in optimal ways for both humans and engines.
Why Would I Want to Use Cookies or Session IDs to Control Search Engine Access?
There are numerous potential tactics for leveraging cookies and session IDs for search engine control. Below, I've listed many of the major strategies you can implement with these tools, but there are certainly countless other possibilities:
- Show Multiple Navigation Paths While Sculpting the Flow of Link Juice
Visitors to a website often have complex needs for the ways in which they'd like to view or access content. Your site may benefit from offering many paths to content (by date, topic, tag, relationship, ratings, etc.), but those extra paths expend PageRank or link juice that would be better focused on a single, search-engine-friendly navigational structure. By showing one set of navigation to cookied users and another to the engines, you can effectively have your cake and eat it, too.
- Keep Limited Pieces of a Page's Content Out of the Engines' Indices
Many pages contain both content you'd like to show to search engines and pieces you'd prefer appeared only for human visitors. These could include ads, login-restricted information, links, or even rich media. Once again, showing non-cookied users the plain version and cookie-accepting visitors the extended information can be invaluable (see the sketch after this list). Note that this is often used in conjunction with a login, so only registered users can access the full content (think sites like Facebook or LinkedIn).
- Grant Access to "Human-Only" or "Registered User-Only" Pages
As with snippets of content, there are often entire pages or sections of a site to which you'd like to restrict search engine access. This can be easily accomplished with cookies/sessions, and it can even help bring in search traffic that converts to "registered-user" status. For example, if you had desirable content you wished to restrict, you could create a page with a short snippet and an offer to continue reading upon registration, which would then grant access to the full work at the same URL. Registered visitors would keep linking to the same URL the spiders index and rank, yet you wouldn't give away the content for free in a cached version. Be aware that in these instances, the search engines will only be able to "see" the content you've listed on the non-registered-user page, so be careful to target your titles and snippets with keywords to receive the most traffic possible. You can see examples of this at sites like the Economist, the New York Times, and WebmasterWorld.
- Avoid Duplicate Content Issues
One of the most promising areas for cookie/session use is prohibiting spiders from reaching multiple versions of the same content while allowing visitors to get the version they prefer. As an example, here at SEOmoz, logged-in users can see full blog entries on our blog homepage, but search engines and non-registered users will see only the snippets. This prevents our content from being listed on multiple pages (the blog homepage and the specific post pages) and provides a positive user experience for our members. I discussed this specifically in this post on dealing with pagination and duplicate content on blogs.
- Display Content Based on a User's Actions or Patterns of Action
Many sites like to keep track of their users' activities and serve targeted content that is more likely to fit their interests. In the case of many media websites, this means advertising, while for e-commerce sites, it's more likely to be related or recently-viewed products. Bluefly.com is a good example of this - showing visitors the clothing they've most recently browsed.
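To make these tactics concrete, here's a minimal sketch of the cookie check in PHP (the cookie name and content variables are placeholders, not our actual code):

```php
<?php
// Minimal sketch of cookie-based content segmentation. Spiders don't
// accept cookies, so they land in the non-cookied branch along with
// cookieless human visitors.
$full_entry = '...full post, extra navigation paths, member content...';
$snippet    = '...indexable teaser with a single, clean nav structure...';

if (isset($_COOKIE['registered_user'])) {
    echo $full_entry;   // cookied (e.g. logged-in) visitor
} else {
    echo $snippet;      // non-cookied visitor or search engine spider
}
```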
Hopefully, this brief tutorial has given you a chance for that "Eureka!" moment to help inspire some clever cookie & session use for your SEO campaigns. As always, I'd love to hear your feedback about how you use these features on your own sites.
Hi everybody! I've been reading a lot on SEOmoz but never posted a comment, so here is my first post... (fingers are shaking)
I'm currently working on a shopping cart. (Each visit from a bot to the same page would generate a new session ID, and thus a new URL with the same content, so there we are with a great case of duplicate content.)
To avoid duplicate content, we use .htaccess with a rule that removes the session ID and 301s to the page with no session ID; the rule is activated for a list of known bots.
It's a simple solution for a big problem. I'm not sure it's suitable for all cases of duplicate content, but it definitely helps for shopping carts...
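The rule looks roughly like this (the bot list and the PHPSESSID parameter name are examples, not our exact production rule):

```apache
RewriteEngine On
# Only fire for known bots (this list is just an example)
RewriteCond %{HTTP_USER_AGENT} (Googlebot|Slurp|msnbot) [NC]
# ...and only when the query string carries a session ID
RewriteCond %{QUERY_STRING} PHPSESSID= [NC]
# 301 to the same path with the query string dropped (assumes the
# session ID is the only parameter; merge the rest back otherwise)
RewriteRule ^(.*)$ /$1? [R=301,L]
```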
I'm not sure it's really true that search engines (Google at least) don't accept cookies. I recently (well, 6 months ago) created a site that checks for cookies before allowing customers access to the shopping cart; if cookies are disabled, it sends the user to an info page on the topic. Google indexed the actual shopping cart page perfectly well, totally bypassed the "cookie info" page, and never indexed that at all. Cookie checking was done entirely via PHP code.
The PHP code functions roughly like this (a sketch; the actual function and page names will differ):
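```php
<?php
// Step 1: drop a test cookie and bounce the visitor back to this page
// with a flag in the query string (names here are illustrative).
if (!isset($_GET['cookie_test'])) {
    setcookie('cookie_test', '1');
    header('Location: ' . $_SERVER['PHP_SELF'] . '?cookie_test=1');
    exit;
}

// Step 2: after the round trip, the cookie should come back with the
// request. If it doesn't, cookies are disabled; send the user to the
// info page instead of the cart.
if (!isset($_COOKIE['cookie_test'])) {
    header('Location: /cookie-info.php');
    exit;
}

// Cookies work: continue on to the shopping cart.
```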
Kind of complicated, but it's not too bad since it's all in a few functions.
How is the redirect done?
Google won't follow some types of redirects; it would just ignore them.
King - This sounds like a good test project. I would like to confirm this myself for each search engine.
Accepting cookies is trivial, so I wouldn't be surprised if my tests matched your results.
I would be interested to hear about the results of your test...
I just confirmed search engines do not support cookies. I will write a post shortly with all the details.
I like how Google (well, Googlebot anyway) gets this hugely awesome-looking graphic, while poor MSN is just a blue circle.
That's MSN.com, not Live Search. They have their own bot, and he's pretty cool looking. I'll try to trot him out on the blog more often.
Do eeeeeeit.
Just did - check out the new post on Microsoft buying Yahoo! :)
Both Yahoo and MSN seem ...um... evil (is it only me?)... Google always seems so cute btw... :)
So Rand, what if you wanted a search engine to get past a login form to see all your content, but wanted users to have to log in and/or register? Should you detect the bots and serve up a session to them (without using it in the URL)? If this were done, then all people would have to do is change their user-agent to get access without logging in or registering.
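Something like this is what I mean (the bot list is illustrative, and the weakness is exactly that the user-agent string can be faked):

```php
<?php
// Let bots through the login gate based on user-agent alone; anyone
// who spoofs the user-agent gets the same free pass.
$agent  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$is_bot = (bool) preg_match('/(Googlebot|Slurp|msnbot)/i', $agent);

session_start();

if (!$is_bot && empty($_SESSION['logged_in'])) {
    // Humans without a login go to the registration form...
    header('Location: /login.php');
    exit;
}
// ...while bots (or logged-in users) fall through to the full content.
```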
Any good solutions?
Hi Rand,
Thanks for the link to my article (enarion.net)! Any ideas how this could be improved?
Thx,
Tobias
My first post :) and I have a question:
Our website uses cookies for premium content. It gives 5 free visits, and the cookie's age is 6 months.
After 5 visits, it serves a subscription box even if the visitor is coming from Google or any other search engine.
Does this fall under cloaking? Has anyone seen a warning or penalty for such an issue?
Just like to point out that many of the bots nowadays do accept cookies: Googlebot, MSNBot, and Slurp (Yahoo!) all accept cookies.
Hi!
I enjoyed reading the article. It is very interesting and it made me think of things in another way. I am glad to be a member of SEOMoz.
P.S. Yeah, the graphics are very cool as Lorisa rightly stated before!
I don't know if this is something you see as part of the purpose of the blog, but I bet a lot of people would appreciate some sample code for cookies and explanations of what the code would do.
Cookies are great for those users who are fond of browsing so many sites.
Hey, this is a great idea. I never thought of using cookies for cloaking!
Technically speaking, session IDs are cookies too. The original concept, as you suggest, was avoiding expiration dates: without an expiration date, the browser won't persist the cookie to disk and will only keep it in memory. However, as r_wetzlamayr says, popular programming languages do specify an expiration date by default; they just make it short (a couple of hours, a few minutes, etc.). You can experience this when you are automatically logged out of most dynamic sites after you haven't done anything for a while.
Cookies are usually great for the consumer and the site owner as well, but in the illustration I think the PC would eat the cookie rather than store it for later... :-)
Rand,
Nice post. Like seeing this more advanced level of SEO on SEOmoz.
The company I will most likely be working for in the next couple of weeks (final revisions being made to the offer letter) has significant duplicate content issues created by the content itself, beyond the standard canonicalization and URL structure issues (though they have those too). During the interview process, I was asked a specific question about this problem; I proposed a solution via JavaScript, but doing it with cookies would be even better.
Keep this 'deeper' content flowing please!
Brent D. Payne
Great article, and I too stress the importance of avoiding duplicate content issues with session IDs.
I don't know why, but I just love these technical posts even though I don't always understand (or have use for) the information they contain. Maybe I just feel smarter after reading them?
Anyway, I really appreciate the time you put into posts like this, especially the cute and helpful graphics.
Great post. thanks
Rand: "The website you were interacting with may remember your data or actions, but they cannot retrieve session IDs from your machine." This is not generally true; it depends solely on the session implementation.
For PHP, the session cookie's lifetime is determined by session_set_cookie_params, so a programmer can choose to extend a session's lifetime to any duration that seems fit, up to the year 2037, just like for any cookie.
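For example (a minimal sketch; 30 days is an arbitrary choice, and the call has to come before session_start):

```php
<?php
// Make the session cookie survive browser restarts by giving it an
// explicit lifetime (30 days here; any duration works).
session_set_cookie_params(60 * 60 * 24 * 30, '/');
session_start();

$_SESSION['last_visit'] = time();
```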
Thanks for the catch - I've edited the post and added in that note (along with some extra information about warnings for bots seeing session IDs in the URL). Need to be more careful when publishing late at night...
I love how reading an article about something that every SEO has addressed at some point in their careers inspires new thought! I've been using session IDs for years (and cloaking them), but I just had a eureka moment with a site that's duplicating a snippet of content in the page template...
Cloak the snippet, duh!!
Great post Rand, cheers
richardbaxterseo
How do you turn off session IDs in the browser?
The session ID is in the URL in the link on the site you are viewing.
The session data is stored on the server. The expiry of the session is controlled by the server.
I could be wrong, but I think that by turning off cookies, you also prevent session IDs from being stored on your machine (obviously, this won't prevent sessions that are stored on the server side only, though).
Session IDs are in the URL, in the link you click on.
Sessions IDs can be implemented as either URL parameters or cookies. Most modern implementations today use cookies or a combination of both.
From the web server's perspective, a user session is simply a collection of information about a current visitor that is accessible via an ID. That information is usually recorded in memory, though there are many implementations that record it in a database or on disk. The ID is given to the user in the form of a cookie or URL parameter.
If the user deletes cookies or disables them (and there is no URL parameter for the session ID), the server won't be able to retrieve the information (even though it is still in memory).
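A minimal sketch of what this looks like in PHP (the counter is just an illustration):

```php
<?php
// session_start() reads the session ID from the cookie (or URL) and
// loads the matching data stored on the server. The data itself never
// leaves the server; only the ID travels with the visitor.
session_start();

if (!isset($_SESSION['page_views'])) {
    $_SESSION['page_views'] = 0;
}
$_SESSION['page_views']++;

echo 'Pages viewed this session: ' . $_SESSION['page_views'];
```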
There's a big difference between session IDs in URLs and session cookies. However, I can see why people can become confused about these things.
g1smd,
I am not sure what you mean by "a big difference". Session IDs can be implemented as cookies with short expiration dates, as session cookies, or as URL parameters.
See https://cookies.lcs.mit.edu/seq_sessionid.html
See https://searchsoftwarequality.techtarget.com/sDefinition/0,,sid92_gci1158582,00.html
While doing the search, I found a technique I was not aware of; I need to see if current browsers and servers support it. There is a (new?) HTTP header called Session-Id. See https://www.w3.org/TR/WD-session-id