Since the beginning of SEO time, practitioners have been trying to crack the Google algorithm. Every once in a while, the industry gets a glimpse into how the search giant works and we have an opportunity to deconstruct it. We don’t get many of these opportunities, but when we do (assuming we spot them in time) we try to take advantage of them so we can “fix the Internet.”
On Feb. 16, 2015, news started to circulate that NBC would start removing images of and references to Brian Williams from its website.
This was it!
A golden opportunity.
This was our chance to learn more about the Knowledge Graph.
Expectation vs. reality
Often it’s difficult to predict what Google is truly going to do. We expect something to happen, but in reality it’s nothing like we imagined.
Expectation
What we expected to see was that Google would change the source of the image. Typically, if you hover over the image in the Knowledge Graph, it reveals the location of the image.
This would mean that if the image disappeared from its original source, then the image displayed in the Knowledge Graph would likely change or even disappear entirely.
Reality (February 2015)
The only problem was, there was no official source (this changed, as you will soon see) and identifying where the image was coming from proved extremely challenging. In fact, when you clicked on the image, it took you to an image search result that didn't even include the image.
Could it be? Had Google started its own database of owned or licensed images and was giving it priority over any other sources?
In order to find the source, we tried taking the image from the Knowledge Graph and running a “search by image” in images.google.com to find others like it. For the NBC Nightly News image, Google failed to locate a match anywhere on the Internet for the image it was actually using. For other television programs, it was successful. Here is an example of what happened for Morning Joe:
So we found the potential source. In fact, we found three potential sources. Seemed kind of strange, but this seemed to be the discovery we were looking for.
This looks like Google is using someone else’s content and not referencing it. These images have a source, but Google is choosing not to show it.
Then Google pulled the ol’ switcheroo.
New reality (March 2015)
Then things changed, and Google decided to attach a source to its images. Unfortunately, I had mistakenly assumed that hovering over an image showed the same thing as the file path at the bottom, but I was wrong. The URL you see when you hover over an image in the Knowledge Graph is actually nothing more than the title. The source is different.
Luckily, I still had two screenshots I took when I first saw this saved on my desktop. Success. One screen capture was from NBC Nightly News, and the other from the news show Morning Joe (see above) showing that the source was changed.
(NBC Nightly News screenshot.)
The source is a Google-owned property: gstatic.com. You can clearly see the difference in the source change. What started as a hypothesis is now a fact. Google is certainly creating a database of images.
If this is the direction Google is moving, then it is creating all kinds of potential risks for brands and individuals. The implication is a loss of control for any brand that is looking to optimize its Knowledge Graph results. It also seems to pose a conflict of interest for Google, whose mission is to organize the world’s information, not license and prioritize it.
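The host check described above can be sketched in a few lines of Python. This is only an illustration, assuming you have copied an image URL out of the Knowledge Graph by hand; the function name `is_google_hosted` and the example URLs in the usage note are hypothetical, and only gstatic.com comes from the observations in this post.

```python
from urllib.parse import urlparse

# Assumption: the only Google-owned image host checked here is gstatic.com,
# the one observed in this post. Extend the tuple if you see others.
GOOGLE_IMAGE_HOSTS = ("gstatic.com",)


def is_google_hosted(image_url: str) -> bool:
    """Return True if the URL's host is gstatic.com or one of its subdomains."""
    # hostname is None when the URL has no scheme, so fall back to "".
    host = (urlparse(image_url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in GOOGLE_IMAGE_HOSTS
    )
```

For example, `is_google_hosted("https://t0.gstatic.com/images?q=example")` is true, while `is_google_hosted("https://www.example.com/show.jpg")` is false. Matching on the full domain suffix, rather than a substring, keeps a look-alike host such as `notgstatic.com` from being counted.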
How do we think Google is supposed to work?
Google is an information-retrieval system tasked with sourcing information from across the web and supplying the most relevant results to users' searches. In recent months, the search giant has taken a more direct approach by answering questions and assumed questions in the Answer Box, some of which come from un-credited sources. Google has clearly demonstrated that it is building a knowledge base of facts that it uses as the basis for its Answer Boxes. When it sources information from that knowledge base, it doesn't necessarily reference or credit any source.
However, I would argue there is a difference between an un-credited Answer Box and an un-credited image. An un-credited Answer Box provides a fact that is indisputable, part of the public domain, and unlikely to change (e.g., what year was Abraham Lincoln shot? How long is the George Washington Bridge?). Answer Boxes that offer more than just a basic fact (an opinion, instructions, etc.) always credit their sources.
There are four possibilities when it comes to Google referencing content:
- Option 1: It credits the content because someone else owns the rights to it
- Option 2: It doesn't credit the content because it’s part of the public domain, as seen in some Answer Box results
- Option 3: It doesn't reference it because it owns or has licensed the content. If you search for “Chicken Pox” or other diseases, Google appears to be using images from licensed medical illustrators. The same goes for song lyrics, which Eric Enge discusses here: Google providing credit for content. This adds to the speculation that Google is giving preference to its own content by displaying it over everything else.
- Option 4: It doesn't credit the content, but neither does it necessarily own the rights to the content. This is a very gray area, and is where Google seemed to be back in February. If this were the case, it would imply that Google is “stealing” content—which I find hard to believe, but felt was necessary to include in this post for the sake of completeness.
Is this an isolated incident?
At Five Blocks, whenever we see these anomalies in search results, we try to compare the term in question against others like it. This is a categorization concept we use to bucket individuals or companies into similar groups. When we do this, we uncover some incredible trends that help us determine what a search result “should” look like for a given group. For example, when looking at searches for a group of people or companies in an industry, this grouping gives us a sense of how much social media presence the group has on average or how much media coverage it typically gets.
Upon further investigation of terms similar to NBC Nightly News (other news shows), we noticed the un-credited image scenario appeared to be a trend in February, but now all of the images are being hosted on gstatic.com. When we broadened the categories further to TV shows and movies, the trend persisted. Rather than show an image in the Knowledge Graph and reference its actual source, Google tends to show an image and reference its own database of stored images as the source.
And just to ensure this wasn't a case of tunnel vision, we researched other categories, including sports teams, actors and video games, in addition to spot-checking other genres.
Unlike terms for specific TV shows and movies, terms in each of these other groups all link to the actual source in the Knowledge Graph.
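The category comparison can be sketched as a simple tally. This is a rough illustration rather than the tooling used for this post: it assumes the image host for each search term was recorded by hand, and the category names, terms, and hosts below are placeholders, not measured results.

```python
from collections import defaultdict


def google_share_by_category(observations):
    """Map each category to the fraction of its images hosted on gstatic.com.

    `observations` is an iterable of (category, term, image_host) tuples.
    """
    totals = defaultdict(int)
    google_hosted = defaultdict(int)
    for category, _term, host in observations:
        totals[category] += 1
        if host == "gstatic.com" or host.endswith(".gstatic.com"):
            google_hosted[category] += 1
    return {cat: google_hosted[cat] / totals[cat] for cat in totals}


# Placeholder observations for illustration only (hypothetical terms and hosts).
observations = [
    ("tv_show", "Show A", "t0.gstatic.com"),
    ("tv_show", "Show B", "t1.gstatic.com"),
    ("sports_team", "Team A", "www.example.com"),
    ("sports_team", "Team B", "images.example.org"),
]
shares = google_share_by_category(observations)
```

A category whose share sits near 1.0 behaves like the TV show and movie group described above, while a share near 0.0 matches the groups that still link to the actual source.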
Immediate implications
It’s easy to ignore this and say “Well, it’s Google. They are always doing something.” However, there are some serious implications to these actions:
- The TV shows/movies aren't receiving their due credit because, from within the Knowledge Graph, there is no actual reference to the show’s official site
- The more Google moves toward licensing and then retrieving their own information, the more biased they become, preferring their own content over the equivalent—or possibly even superior—content from another source
- It feels wrong and misleading to get a Google Image Search result rather than an actual site because:
  - The search doesn't include the original image
  - Considering how poor Image Search results are normally, it feels like a poor experience
- If Google is moving toward licensing as much content as possible, then it could make the Knowledge Graph infinitely more complicated when there is a “mistake” or something unflattering. How could one go about changing what Google shows about them?
Google is objectively becoming subjective
It is clear that Google is attempting to create databases of information, including lyrics stored in Google Play, photos, and, previously, facts in Freebase (which is now Wikidata and not owned by Google).
I am not normally one to point my finger and accuse Google of wrongdoing. But this really strikes me as an odd move, one bordering on a clear bias to direct users to stay within the search engine. The fact is, we trust Google with a heck of a lot of information with our searches. In return, I believe we should expect Google to return an array of relevant information for searchers to decide what they like best. The example cited above seems harmless, but what about determining which is the right religion? Or even who the prettiest girl in the world is?
Questions such as these, which Google is returning credited answers for, could return results that are perceived as facts.
Should we next expect Google to decide who is objectively the best service provider (e.g., pizza chain, painter, or accountant), then feature them in an un-credited Answer Box? Given the direction Google is moving right now, it feels like we should be calling its objectivity into question.
But that’s only my (subjective) opinion.
Hi Aaron,
I have a slightly different point to share.
Let's think about this from Google's point of view. Google indexes upwards of 48 billion websites each month to keep them in search results, right? That means lots of energy, heat, huge bandwidth, etc. Google also wants to index billions of websites faster than ever. In my opinion, once Google finds organic content, it wants to keep it in its own database and might not re-index it as often as usual, to cut heat, bandwidth, and energy costs and to speed up its spider.
I don't necessarily disagree with you. It might be one of the reasons. But it does mean that getting things indexed which are deserving might be really tough once Google has made a decision. That part is not OK to me.
Aaron,
Talk about a thought-provoking post. Many of us have noticed the ways in which Google is playing loose with the rules as regards images, content and such. I've often chosen to think it's simply a test or the company attempting to provide users with the best results possible.
However, as you make clear with the image examples, not crediting the source--or, worse yet, crediting themselves as the source when the content is not owned by the search giant--seems quite a distance beyond misleading.
RS
Thanks Ronell!
Google made some changes when I was halfway through this post, which forced me to pivot. Lucky for them too, because I was about to rain fire and brimstone on them!
Cheers!
That clearly shows they're reading your drafts behind your back! Or maybe your thoughts. Fearing their stocks would plummet in the wake of your discoveries, they decided to roll back their villainous changes... halfway through your post.
Note to self: Apply tinfoil hat without prejudice.
I've noticed this and am already affected by it in the niche I work in: travel. Google is especially bad when it comes to flights, as they show a list of prices from airlines which are, more often than not, the correct prices, but there are often quite a few lower prices available from online travel agents (which Google ignores) and superior technology out there (sites like Dohop) that can find cheaper flights.
Showing a price for a flight quickly is not, in my mind at least, superior to a greater range of options including low flight prices that Google can't show.
Interesting and good addition. That would be a clear example of Google showing content that they want to, not what deserves to be there. Really good point!
Google has yet to improve in that niche. Good point.
Very well written article and other comments. This may be a little off topic, but it is on point about Google's bias. I redesigned a website for a client a few days ago and announced it on my Google Plus page. As I always do, I then searched the key phrase for my client to get a baseline ranking to monitor rank movement, and I was shocked to see my Google Plus post on the first page, right after the paid ads. No great content, no educational information, nothing special about my post. He is a wedding photographer in San Diego, and when I searched that phrase, my Google Plus post beat out almost all legitimate listings. If that is not bias, I'm not sure what is.
It is a little off topic, but I think still relevant. Google does tend to still include Google+ when they can. I wrote about it once on Search Engine Land. It's odd every time I see it.
While searching on Google, make sure you are not logged in with your Google account.
Thanks so much for this detailed look into Google's inner practices. I definitely see that, over the last few years, Google has reduced the amount of first-page real estate given to organic results, and given more room to its own ads and Knowledge Graph.
I feel as if they are moving to their own content because they know with 100% certainty that it is factual and accurate. Sourcing information from other places, regardless of how trustworthy, still leaves them open to displaying incorrect information. Also, the other day on Google.com right below the search box they had a link to another one of their domains and the link was nofollowed, which I thought was odd, but it shows that maybe they are still trying to remain laissez faire in the content that they return.
Hi Aaron,
Great piece of work. I think Google has two major interests: 1) quality results, which lead to a great user experience, and 2) its own commercial interest, which makes sense.
I think it makes sense for any organization that wants to take over a market to have these two factors.
"Should we next expect Google to decide who is objectively the best service provider (e.g., pizza chain, painter, or accountant)"
This topic has been in the back of my mind for a while. I've had this uneasy feeling about Google's developments, as they basically have full hold of the internet. When you say 'should we next expect...' I think of the Google ecommerce affiliate system, where Google really does just that so as to reap the rewards of tapping into sales. I worry when it comes to data that companies can get too greedy. The rise of Apple made me cringe as I watched an entire platform build walls around data - I don't want to see Google take a similar road.
I hope the evolution of Google continues to work to answer questions to search in a very dynamic and creative way. When you are on a very slow connection you always count on Google to be one of the only websites that will load up quickly. I'd imagine this might be reason for Google to have a sort of 'caching' system instead of pulling from direct sources. The grey area does come into play though without sourcing the context in which they extract data and I find it very wise that they shifted their platform.
You reminded me of something interesting Brian, which I wanted to talk about too. And I will play the opposite side here backing Google.
I think it's a little nutty how EVERYONE is always pointing a finger at Google while other companies like Apple and Facebook get off scot-free. Apple is a completely closed community. They even change their chargers from product to product, forcing users to re-buy everything. Facebook does shady testing ALL the time.
And yet Google, who practically gives away their Android platform to anyone who wants it, for free, gets smashed for all sorts of wrongdoing when they limit that and impose some kind of restriction.
It's a love hate relationship with Google. People love to hate those in power. I just want to be sure they are acting ethically.
I am generally not in any way a Google defender (which is evident if you follow my blog). Here is what I find most interesting: Google already does decide what gets shown via their algorithms and is certainly not terribly transparent about it. As distasteful as it might be to think about, they are a for-profit company and their job is not to provide exposure for other businesses and organizations. Their job is to make money for their shareholders.
I wrote a post about this today entitled "Google Search Results - What Is Their Obligation to the Public?" which is more about the recent FTC issues. I'm happy to share the link if Aaron gives the ok!
I'm still not clear on how the Answer Boxes make money for Google. They might even take money away if an ad would otherwise have been clicked to find an answer to the query. The image issue you describe here is also interesting. I agree, crediting should absolutely happen - that is in line with copyright laws at a minimum. But why shouldn't they compile their own database of image results to display, as long as they are properly credited and attributed? I'm not sure I can come up with a great answer for why not.
I am really interested to see how this all plays out.
Please do link to it! I look forward to reading it.
Like I said in the post, I think it's a gray area. On one hand, you are absolutely right. They are a for-profit business and should find ways to make money and please their shareholders. On the other hand, search engines are about organic listings and displaying the best there is, I believe. I have no problem with Google making money, but then they need to change the rules to say: we will show the best information... unless we have something which we believe is better, then we will show that. Or better yet, imagine a world where you could buy organic placement in SERPs. It's effectively the same, but the opposite: Google buying a database of answers / images / information. As for how they are making money, they might not be. But I don't necessarily see this going in such a positive direction.
Here is the link to my post "Google Search Results - What is Their Obligation to the Public?": https://bit.ly/1aSzQIy
Hi Julie - Sorry for the delay, but I read your post. I think Google is a business and the goal is to make money. I agree with you there. But they are also a publicly traded company, and they still have to follow the guidelines set forth by the FTC and other laws. Many people believe them to be a monopoly, which isn't bad on its own, but abusing that power is. If they start to prefer their own content over other people's / businesses', then that is not how a search engine works, at least as I define it. That is more how a database works, and they would need to clarify what they are as a company.
Great article, Aaron, with lots of food for thought. One area where we can and should give Google some credit: they provide a "feedback" option underneath every Knowledge Graph. I don't know how much attention is paid to the form, but at least there is this possibility of giving specific feedback to correct any mistakes you might notice. I do wonder how effective this feedback is. Maybe your next Moz article? ;)
Interesting idea Noam. Very interesting... I just wonder who / what is looking at that feedback. I imagine it's like a big giant black hole, but it would be great if we knew.
It's a great article with good info for bloggers.
Oh youngsters, only at the end do you realize what is happening. As someone who has been practicing SEO for 20+ years, since before Google was born, I have seen this coming for a long time. Their quest for domination accelerated once they turned into a public company. Their ambitions are clear. Owning the web was the start, and watch how they will creep into the "internet of things" in the near future.
They only get 95% of their total revenue from search. Their approach is relatively juvenile: get more traffic going to paid search by any means necessary. They have all the statistics in the world about internet usage. They need to grow revenue by any means necessary, including taking away as much "free traffic" as possible and monetizing it.
Maybe it's payback because we used to manipulate the crap out of Google, and that is really what SEO is: manipulation to a favorable outcome.
Anyways, WE gave Google the power, and if we get together, I mean really get together, we can take that power away. Whether it is by switching to another search engine, banding our sites together for networks of traffic trading, or promoting the crap out of AdWords ad blockers. And tons of other ways.
I do like Google and they do come out with cool stuff but just like a child will push boundaries so does G. Sometimes you need to smack them on the butt to get their attention.
Take the web back it is yours.
This article asks some very good questions. Thanks for sharing it Moz.
Google, the search engine, gets more publicity than ever, but it is now acting as a monopoly, doing business for the benefit of its shareholders and paying customers and giving more value to them. Maybe in a few days Google will make organic search paid as well.
Sometimes I wonder about Google. They say one thing and do another. They claim certain things are bad but then reward sites with great placement when they use the same tactics Google claims to dislike. Then they launch a "mobile algorithm" that is said to "change everything," and so far, a month after the so-called change, I have seen nothing. No changes at all.
Questions like "How do we think Google is supposed to work?" just display how naive some of us can still be when it comes to 'business' and the fact that we all have rose-tinted glasses when it comes to Google... the answer to the question is it's a business, which means it's supposed to make money! Given Google's listed on the stock exchange, we could therefore say Google exists more specifically to make money for its shareholders... I kind of miss the point too - a question like "what are Google playing at with the recent mobile update" is far more relevant than the point being made.
A competitor site formed around January and carried on the wings of spammy SAPE links has just earned a Google answer box, outranking my site. Google can't get anything right. It's infuriating.
Well, I personally think that Google wants to be the powerhouse of all information, and they are sort of creating their own artificial world built with the information walls of other websites. The worst part, though, is not giving the sources proper credit: the Answer Boxes, Knowledge Graph, and Google Now features are all based on information fetched from other sources, but no credit is given to them.
From the user's perspective it's great. I don't have to click into websites and hunt around for information. It's right there for me in Google's search results.
Now I have my business head on and it's about to explode!
Aaron, thank you for taking the time to write this blog post. This lends itself to a worry I have long had about Google: "they may have too much power." People and businesses live and die by what is being shown by Google. Two quotes come to mind: "With great power comes great responsibility" and "Power corrupts, absolute power corrupts absolutely."
Google has been fairly open about how they do things, I have been a big fan of Matt Cutts videos and blogs, but he's been on leave and I haven't seen anybody as prolific as him in regards to SEO and what Google is doing. That uneasy feeling is certainly creeping in.
Great post again
Don
Nice article. I did not see that on Google.
Google is only interested in itself. I find it shameful what they are doing here but as long as nobody stops them from doing so, they will continue on this path. Looking at the money they spend in politics (probably a lot more than what is visible to the greater public) I assume that this will not change in the near future.
Hi Aaron
This is a great article with fascinating research. Always great to see people who take the time to do the research to back up their hypotheses.
Google has been taking over our information by stealth for many years, and this should be a grave concern to many more people. Google promote themselves as altruistic and sharing information as a service to the world, but when one monolith has so much control over information, then control and manipulation are the danger. It seems eerily reminiscent of the control the Vatican once had over all printed and written material to me.
Simon_ensor makes a good point above that Google doesn't always return the correct answer to questions. And when so many people take what they read online/wikipedia/google for gospel without question - then Google begins to have serious power. A lot of non tech people even think that Google is 'the internet'.
Then again, as Matt-POP says above, who actually owns 'information'? That is a fascinating theological based question that can keep you awake for hours. Who does own the right to information?
I constantly face issues of my images being copied and used online - I even had a Russian cloned version of my entire 'What is creativity?' (https://www.creativity101.com/what-is-creativity/) site - I have to say I was impressed with how it looked in Cyrillic. In the end you have to let it go, as once you put something out there you can't control it.
Google has a lot of responsibility with the vast amount of information it has access to. Everyone always thought that Big Brother (George Orwell) was the prediction of a dystopian world government - I say Big G is the one to watch.
I guess I just think Google has to be held to a higher standard given who/what they are. With great power comes great responsibility. And if they start taking advantage of that power, they are being irresponsible.
Google doesn't seem to get it. You can't destroy all websites because then there IS no original work to steal. Eventually something has to give. Google can't just keep stealing whatever they want - or if they do, creators will stop allowing their work to be indexed at all. If it's not profitable to the creator, why bother letting Google have it?
Maybe we go down the "all app" route. Maybe we find other ways to get discovered that don't all rely on Google. Maybe, as thought leaders in all our own spaces, we direct our friends & family to other search engines and other ways to find what they want. It's hard to say but the current direction is obviously unsustainable for everyone but Google.
Great point, Matt. If they keep it up and start making their own content, creators will give up because they "assume" Google will do it better... until they don't. And then what!
Really solid point!
Hi Aaron, whilst people not getting due credit for images and other information is not ideal, I think a more important issue is that Google does not always return the right answer. I have searched a few times and found Google Answers to be mostly correct, but with certain facets that were misleading. It is an interesting debate as to whether they are becoming more subjective with data, especially if they are then licensing it - they do definitely seem to have changed their tack on subjectivity!
I think you are right on Simon. It's one thing when we as users make a mistake because we choose the wrong result. But Google is displaying this content as fact in the answer boxes (or so it seems to me). Once that starts to happen, they are crossing into some dangerous territory IMHO.
Google's systems need to change, because day by day Google changes in many different ways, and sometimes it comes across as irresponsible.
Hello Aaron,
Great work. It seems that Google made mistakes in some areas, but it is still useful in many areas, and that is the reason people use it. I must say you keep a very close eye on Google and captured very interesting images as proof. Well done :)
PS: Aaron, please check your comment dates; many comments show as posted 23 days ago or a month ago, even though this blog was published just yesterday. I already informed Rand & team about this bug; hopefully it will be fixed soon.
Thanks Shubham! The reason it's like that is because it was originally on YouMoz. So it is actually that old :)
Hello Aaron,
I didn't know about it, anyway congratulations, your post got promoted :)
Hi Aaron, it is difficult to determine the future of the Internet, but it seems as though it would only be a matter of time before Google started to capitalize (even more than before) on all things related to search. This article is just showing one more thing that Google is doing. As a photographer, I find it very disconcerting that Google is attempting to create their own database of photos from around the Internet. That being said, if it happened to one of my originals, I could fight it but would probably lose in the end because Google is such a giant. I hope that this is something they are doing to make an easier experience for everyone and not to screw the users. Google has been a great company until recently, and I definitely think that the actions they have taken in the past year, including this one, are too far outside of their scope.
Hi Aaron
Wonderful article, well done! I have often thought about this as well, especially years down the line. While we are often at Google's mercy, it does seem a bit off putting the amount of credit it tends to lend itself in the SERPs, with no one really saying "hey...uh...they're doing this..." - which your article just did. Answering questions with information that's not yours seems a bit off putting.
While this is happening, I would hope they are thinking of ways to help credit and push traffic to brands that are actively providing great content in an optimized fashion. Google always said they wanted to be an answer engine. They are now able to be more so than ever with the Knowledge Graph and Schema.
Like I said, I would hope that Google finds a way to credit those that it receives information from. I would like to think that they thought that through. It will be interesting going forward. Otherwise, I think we're going to see more issues with Google popping up much like, while not directly related to your article, what's happening in the EU with Google Shopping and antitrust suits.
Again, well done. Made us think early in the morning!
Thanks Patrick. I see it happening too often and something about it always made me uncomfortable. I finally figured out how to articulate that on paper :)
Really glad you enjoyed!
Gadzooks! This is shocking stuff! Thanks for this.
Hello Aaron Great post.
The key thing about Google is that they keep updating their algorithm and changing with what the masses require, and they know very well what their audience needs while searching the web. So I think we have many things to learn ahead to know how the algorithm works. I think SEO has a long-lasting future as well.
Interesting article. I didn't get all of it, but it sounds realistic.
Great Post. Thanks for sharing.
Terrific post! Represent!!!
Very good post. Thank you.
Whether you get it or not, Google is doing its business very well, haha.
Thank you, it was great.