Google has always attempted to collect an extraordinary amount of data about users (and webmasters in particular) - a few recent examples have spotlighted this trend. From what I can tell, there's virtually no limit to the amount of personal data the search giant could collect:
From Google's Search Box
- Every keyword search you perform
- Any websites in the results you visit
- The amount of time you spend on sites before returning to Google
- Common patterns of navigation via search
From Google's Other Public Services
- Locations/Directions you plot on Google Maps
- Messages you send and receive via Gmail or Talk
- Documents you create/edit on Google Docs
- Calendar information you add to Google Calendar
- Sites you subscribe to or click on at Google's Personalized Homepage
From Google's Webmaster Tools
- Any websites you control or have access to via Google Checkout, Google AdSense, Google AdWords, Google Analytics, Google Sitemaps & Google's Embeddable Site Search
- Accounts you create or information you post to Blogger, Groups, AdSense/AdWords, Picasa
From Google's Apps
- Anything you do on your computer after installing Google Desktop or Web Accelerator
- Sites you visit and what you do on the web using their built-in toolbar or the auto-embedded Firefox plug-in
From Google's Web Initiatives
- Complete browsing habits of users connecting via free wifi in Mountain View
- Complete browsing habits of users on Google's connections in colleges, airports or cities
- Domain registration data and relationships between websites around the world
At this point, it's not that far-fetched for a black-hat affiliate marketer to be putting on the tinfoil hat every time they get on the web. I certainly know more than one who connects through proxies, never uses the same password/name/address/phone number/credit card twice and maintains that any Google actions against their networks are the result of "data leakage."
The broader question here is - where would Google draw the line? Once they have access to every piece of data about your online activity, from the sites you own to those you visit to where you live, who you email and what you like to do on the weekends, will they build associative databases to help correlate? Is there a "Rand Fishkin" file at Google that a curious engineer could pull up if ever they wanted to research my activities? If so... just FYI - that whole string of searches regarding "smelly pirate hookers" isn't what you think.
p.s. I mistakenly did not write "could collect" and thus made it sound as if Google already tracked all of this information. I have no source to suggest that this is the case, and this post is merely about uncovering the possibilities for data collection, not actualities. Sorry for any confusion.
Google definitely is collecting the data you mention under "Search Box" - that is, if you have a Gmail account and you are logged into it. I have a Gmail account that I am logged into almost all the time, and the other day I was searching around my Google account and clicked on a link called "Personalized Search".
Imagine my surprise when I could view all my searches, keywords, sites, images, etc. going back to March of this year. That's when I started my Gmail account. EVERYTHING was there, just as you say: the amount of time spent at each site, how many times it was visited, etc. as long as the site/link/image was arrived at via Google.
If you read the disclaimer at the bottom of Personalized Search, Google assures you that this search info is kept private; but they clearly leave it open that the info is recorded and stored somewhere and might be used by them for whatever purpose (refining searches, ads, etc) if they so desire.
They give you the option to clear your search history - for your own viewing - but again, they make it clear that they might store that info someplace else also.
Edited to add: they say that Personalized Search is in beta (what Google feature isn't?) and that they are only doing it in a limited fashion. It's definitely the wave of the future, though.
I don't think too much about this; the answers are kinda obvious. But knowing that every page I load in my browser is stored in some kind of indexing queue (because of any PR query made) is not comfortable for me, especially when my test pages/folders get indexed before I have even finished them.
That is very similar to my own attempt, where I was not able to find the right words. The result was chaos :). I could not phrase it well, or I added some junk :). It was fun and a learning experience.
Some food for thought: 1) https://www.google-watch.org/gmail.html 2) https://www.google-watch.org/krane.html, which quotes Google CEO Eric Schmidt: "We are moving to a Google that knows more about you."
What can Google and MSN do with this? Both are moving toward behavioral marketing; MSN publicly acknowledges it, but Google has not yet.
Google's privacy policy says, "Google collects personal information when you register for a Google service or otherwise voluntarily provide such information. We may combine personal information collected from you with information from other Google services or third parties to provide a better user experience, including customizing content for you."
And further: "We may use personal information to provide the services you've requested, including services that display customized content and advertising."
So we can expect more data collection from Google for good (hopefully).
Thanks, AjiNIMC
The question I have is how much website visitor data they mine with AdSense and Google Checkout code. Can you imagine the intense marketing machine they could build with data collected from the millions of websites using AdSense? I suspect they are collecting data, but mostly for money-making purposes, not so much to spy on masses of people.
Is Google so secure that we shouldn't concern ourselves with data theft? Methinks not.
No offense taken here, only because it's the truth. I just finished a chapter in John Battelle's The Search that talks a bit about privacy, data gathering, etc. It is scary at times to think a machine such as Google can know so much about you. I feel, though, that one major story of data leakage to the government or to the press would be the death of Google. Once someone's secrets are revealed, panic ensues and everybody stops using Google.
This is not only about black hat marketing. Google hired former CIA and NSA specialists; indeed, the company that became Google Maps was CIA-financed. So this is about surveillance.
onreact.com - Dude, you need to take your meds.
BUT in your defense Google must:
1. Protect its culture. 2. Protect itself from future attempts by government to gain full access.
If government gained FULL access to Google in the name of war, terrorism or other threats, people like onreact.com wouldn't be the only ones wearing tin foil.
I am going to the bathroom now, did Google know that I was having a bowel movement? Oh no!!! I was using the ploppy plugin for wordpress: https://www.fatsquirrel.org/software/ploppy/
HA!!!
(Rand, try to lay off the linkbaiting posts for a while; they are tiring our eyes and brains, thank you!)
Aaron - when was the last time we did linkbait on SEOmoz? Maybe this one from 3 weeks ago... If you don't like the direction the blog's going, by all means - please suggest some topics you'd like to see covered.
Linkbait? Ha, that's ripe. This is an extremely valid point. Google has constantly increasing amounts of user information available for its use. They openly assert their right to access user data in the TOS of most of their services.
I'm not wearing a tinfoil hat, and I'm not a conspiracy theorist. However, corporate power always needs to be watched and discussed openly just as with government power.
I don't think Google is going "big brother" on us. The motives are simply to maintain and increase their competitive edge, but the more information they receive the more power they have, and not just in the marketplace. Heck, simply the search data they receive from being the dominant search engine is mind-boggling. Compared with all the hoopla Microsoft received regarding their dominance of the OS market and the inherent privacy concerns, Google receives remarkably little, especially considering the fact that they collect and store active user queries and actions.
I've often wondered how predictive Google's DB could become. By monitoring search trends prior to past events and discovering common patterns, it's not far-fetched that they could see certain events coming well before anyone else noticed. As a simple example, take real estate markets. Anyone think they could correlate past search trends to find the next booming market before real estate experts could? I don't know, but I do know information is power in this day and age, and they are collecting it at a pace far exceeding any other entity besides perhaps the government.
Seems like everyone thinks bullet points or slightly controversial subject matter = link bait
"Dude", wake up first before babbling, and use Google first before offending people: https://google.blognewschannel.com/archives/20... https://www.dailytech.com/article.aspx?newsid=...
It's not really difficult to find out, so why don't you bother? It was even on Slashdot. So if you are not well informed enough, don't yell at people for knowing more than you.
Google's Technology Director Craig Silverstein claimed that Google collects as little information as possible - which seems quite contradictory to Rand's posting.
Google is an information company. They make their money by collecting information, organizing it, and monetizing it. While their primary data source has been web pages, there is little reason to believe that it will not soon be user data. That is where the real money is.
After learning that Google pays out a significant portion of its AdWords earnings to Mozilla when Firefox is used to search Google, I was a bit confused. I had been convinced that Google's stake in Firefox/Mozilla was to prevent IE from giving an unfair advantage to MSN search. While this may still be the primary driving factor, Google's stake in Mozilla means that privacy protections will not become a core function of the Mozilla browser any time soon.
Toolbars, proxy switchers, extensions, etc. have been around for years - yet protecting your privacy with a standard browser still requires digging through options and advanced LAN preferences, searching the internet for free and open proxies, or downloading something like TOR.
By becoming the primary financial provider for Mozilla, Google can stymie attempts to add direct privacy protections.
Shameless Plug: There is a proposed search privacy standard which we could get the search engines to adopt. While certainly not a panacea, it is far more accessible than installing third-party software. At least then we would be able to hold them accountable. https://www.poundprivacy.org
I agree that Google is potentially collecting huge amounts and types of information. But I think you go too far in saying that they have personal files or something. They just collect all the data to define patterns and get 'general' behaviour knowledge. Sure, the information can be related back to persons in some cases (like the AOL data showed us), but that's not the main purpose.
Or am I being terribly naive?
I agree that their intentions are for the enhancement of their services using general trends in the data collected. However, their personalized search maps your queries directly to your account, basically building a user profile of your search activities. Tracking someone this way does not require any heavy lifting (as it did in AOL's situation); they would simply have to pick you out of the database to view your profile.
I think so. Imagine that you got HIV last night. It's quite possible you'll be doing some research online and tossing Google all kinds of HIV and disease-related keyword searches. You might also share a few Gmails with your wife about it using similar keywords. Not only are all these keywords and conversations saved and associated with your user account, but they are used to calculate which ads to show you. This isn't very intrusive by itself, but it is on par with the AOL data mishap, where a number = a person. As long as Google doesn't release this info and no government demands access, you are only subject to what Google would like to do with this information.
**edit: I re-read this and I suppose the HIV stuff might offend people. Sorry if that happens to happen.
Ah the heck, what they can't get themselves they simply get from their monthly subscription of the NSA internet traffic logs :)