Parker v. Yahoo!, Microsoft, 2008 U.S. Dist. LEXIS 74512 (E.D. Pa. Sept. 26, 2008)
A federal court recently ruled that a lawsuit against Yahoo! and Microsoft for displaying cached versions of websites after the website owner has complained can go forward.
Some of you may be experiencing a tingling feeling of deja vu. That's because the same plaintiff who brought this lawsuit against Yahoo! and Microsoft brought a copyright infringement lawsuit against Google several years ago. See Parker v. Google, Inc., 422 F. Supp. 2d 492
(E.D. Pa. 2006), aff’d, 242 F. App'x 833 (3d Cir. 2007) (non-precedential), cert. denied, 128 S. Ct. 1101 (2008). In that case, the Court ruled that Google was not liable for direct copyright infringement by archiving and displaying usenet postings that contained copyrighted material, and also by displaying excerpts of websites in a list of search results. The case was one of many major search engine "wins" validating the way search engines operate and return content.
However, the case did not resolve all issues regarding search engine listings. The Parker v. Google court didn't make a ruling on whether Google committed direct copyright infringement by republishing "cached" copies of web pages on Google's own site. Parker's case against Yahoo! and Microsoft, however, directly examines the caching issue.
The Parties
Yahoo! and Microsoft need no introduction, so I'll skip straight to the plaintiff.
Gordon Roy Parker (AKA Ray Parker) is the author of several copyrighted works, including Outfoxing the Foxes and Why Hotties Choose Losers. Both are published online and freely available from Parker's website. Parker, a rather frequent litigant, represents himself in his lawsuit against Yahoo! and Microsoft. It's important to note that Parker did not employ the robots exclusion protocol to prevent search engines from crawling, indexing, or displaying his content. Further, he did not send either search engine a take-down notice requesting that they remove the content. He went straight to filing this lawsuit.
Parker is suing Yahoo and MS because they create and republish allegedly unauthorized "cached" copies of his works.
The Claims
Parker claims that by making cached copies of his websites available to their users, both Yahoo and Microsoft republish his works in their entirety without his permission. Accordingly, Parker brought a bunch of claims, but I'm only interested in direct copyright infringement for purposes of this post. (The rest of the claims get dismissed outright anyway.)
It's worth noting that the case is nowhere near a trial stage. The decision I'm writing about today deals with legal technicalities about whether it is even possible to bring these kinds of claims. Basically, Yahoo and MS asked the judge to dismiss the case before it really got started because Parker's claims aren't valid and don't make sense.
The Ruling
Generally, the judge agreed with Yahoo and MS and dismissed most of the claims outright. Surprisingly, the Judge allowed the direct copyright infringement claim to go forward. Well, sort of.
Based on the law previously established in the Google case, the judge ruled that Yahoo and Microsoft are not breaking the law when they initially download Parker's website for the purpose of indexing (assuming they follow robots exclusion protocols). The only unresolved issue, thus, is whether Yahoo! and Microsoft commit copyright infringement by displaying cached copies of Parker's website.
The judge ruled that, at least initially, search engines do not infringe copyright by displaying cached copies of websites that don't utilize robots exclusion protocols. According to the judge, search engines are allowed to index and display cached copies because it is a reasonable assumption that if a website owner doesn't want his or her site to be indexed and displayed, he or she will use robots.txt to communicate to the engines. Thus, there is presumed permission, or "implied license," to let search engines do their thing. The onus is on the website owner to tell them no.
Thus, the Judge ruled that to the extent that Parker was seeking to hold search engines liable for initially indexing and displaying his cached content, the case is dismissed; Parker gave the search engines implied license by not using robots.txt.
HOWEVER, the judge did not completely dismiss the case. She allowed part of the case to go forward.
The judge ordered that Parker can only continue the case on the issue of whether Parker revoked his permission by filing this lawsuit. Thus, the court left open the proposition that the search engines may be liable for infringement once they knew or should have known that they no longer had permission to display the cached content. The unresolved question is what does a website owner have to do, if anything, to put the search engines on notice that she doesn't want her site's cached content to be displayed?
Updating your robots.txt file and waiting for the search engines to re-crawl could take months, but for most people in most situations, that will be sufficient. Sending a take-down notice is almost assuredly the quickest way to get your cached content removed from the search engines. That's certainly what I would do in an emergency. It seems to me that Parker chose the most laborious and expensive route: filing a lawsuit. Could it be he doesn't really care about the cached content being displayed? Perhaps he's just more interested in the attention? In which case, posts like mine do nothing but encourage wasteful lawsuits.
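For readers who haven't used these mechanisms, the opt-out the court relies on is trivial to implement. A robots.txt file at your domain root blocks compliant crawlers from fetching pages at all, while a robots meta tag can allow indexing but forbid the cached-copy link specifically. A minimal sketch (example.com is a placeholder; the directives themselves are the standard Robots Exclusion Protocol conventions the engines publish):

```text
# robots.txt -- served at http://example.com/robots.txt
# Blocks all compliant crawlers from the entire site.
User-agent: *
Disallow: /

<!-- Alternatively, in an individual page's <head>:
     let engines index the page, but don't display a cached copy. -->
<meta name="robots" content="noarchive">
```

Either approach takes effect the next time the engines re-crawl, which is why a take-down notice remains the faster option when timing matters.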
Conclusion
The fallout of this particular legal circus is that search engine practices are even further legitimized.
I for one think this is a good thing. As a consumer, I get tremendous value out of search engines and how they operate, including cached pages. I'm a little perplexed by Parker's motivations. He knows he can opt out; he is just choosing not to do so. I suppose he has a worldview that puts more emphasis on private property rights than on the democratization of knowledge. I'd be somewhat miffed if there were no opt-out mechanisms. But there are. So I'm not.
I don't think the unresolved issue of whether filing a lawsuit is revocation of an "implied license" will have an impact on the way search engines cache and display websites. Most content owners will continue to employ robots exclusion protocols and take-down notices to assist them in managing their content. Realistically, how many people would choose filing a lawsuit as their first choice for communicating their content management preferences to search engines? I'd hazard to guess that Parker is just about the only one.
Thus, even if Parker does manage to pull out a 'win' against the search engines on this narrow issue, it probably won't have a great impact on search engine caching strategies. Overall, the opinion is a win for search engines because it further legitimizes their practice of crawling, archiving and displaying web content.
Best Regards,
Sarah Bird
Case Library
More about the Parker v. Google decision.
New Media & Technology blog on the Parker v. Yahoo! decision.
Eric Goldman, my hero, also has a great post on Parker v. Yahoo!
If you want to know more about digital media and implied licenses, read this excellent article.
I'm at a loss. What does Parker really expect to gain from this? You have to assume that if this person is intelligent enough to represent themselves in court, they have the forethought to look into what would be needed to revoke the ability to display the cached pages. That being said, it smells like a lame attempt to make a buck.
On a side note, this topic may make for a good SEO 101 check list... steps you can take to remove a site from the engines.
"I for one think this is a good thing. As a consumer, I get tremendous value out of search engines and how they operate, including cached pages."
And as a consumer, you'd probably get tremendous value out of having a book given to you for free without buying it, but that doesn't mean it's legal for someone to reprint in this fashion.
I totally back the opt-out mechanisms that exist for online content: ways to stay out of an index or prevent being cached. They are effective, and Parker (and others, like some Belgian newspapers I know) should use them rather than waste court time.
But in 2006, we had a court rule this way on caching:
"First, Google’s cache functionality enables users to access content when the original page is inaccessible. The Internet is replete with references from academics, researchers, journalists, and site owners praising Google’s cache for this reason. In these circumstances, Google’s archival copy of a work obviously does not substitute for the original. Instead, Google’s “Cached” links allow users to locate and access information that is otherwise inaccessible."
https://forums.searchenginewatch.com/showthread.php?threadid=9809
That ruling freaked me out. The court to some degree said it was OK to reprint content that's not available, simply because it is not available -- not because you have permission to do it.
Again, useful. But until this ruling, not necessarily legal. And I think it was a bad ruling.
Point of history to also consider. Google was the search engine that popularized caching. The others at the time didn't do it. The legality was questionable right at the start, and they provide noarchive as a way to stay out.
Now, around the same time, you had people who wanted to cloak content against Google. That also meant using noarchive. And Google put out a strong signal that if you used noarchive, you increased the odds that they would take a closer look at your site. Which, naturally, caused some people to back away from using noarchive.
I love the cache myself. It has saved me many times. If the court simply ruled that opt-out was enough to take away permission, I'd be pretty mellow over it. But that earlier ruling only said that was part of it and seemed to hand over rights that were frightening.
Maybe this federal case will help negate the other one.
Interesting post as always Sarah,
I have a little background question for you. Did the Google case establish the Robots Exclusion Protocol as a legal standard?
Do people in the United States have to follow robots.txt by law? The protocol isn't well documented or defined. This seems like it could become a very big problem.
Right after Parker got ditched by a hottie even though he's a loser, he went straight to the bar where he met this angry copyright lawyer who told him to forget the online copies of his books and to go straight after the big guys -- Yahoo! and MS. They have deep pockets so why not be like every other annoying litigious American and sue the corporations... If Parker needs to make money, he needs to be more creative...that's my take on why he went after them.
"displaying usenet postings that contained copyrighted material" - Now, don't search engines exclude content via robots.txt and the nofollow meta tag?
Also, wouldn't notifying the search engines via Webmaster Tools to exclude or remove the content have helped?
Parker just wanted to create some buzz.
It would be fun if the court agreed with the Search Engines and asked them to remove Parker's website from the index for the sake of case. You can guess what will happen to his traffic then.
I agree, the search engines should remove him from their indexes indefinitely. There has to be a thou-shalt-not-try-to-sue-us clause in the guidelines somewhere, right?
I agree with Jasee that Parker may be looking for free-of-charge publicity and raise his profile through the controversy.