Over the past few days, I've seen a lot of questions in our Q+A section and across the blogosphere that suggest it's time for some direct answers from the search engines on major issues that affect business practices, consulting, and website building. Here are the ones I believe are in desperate need of straight responses:
- To all the engines - does content on subdomains inherit the full ranking ability provided by the pay-level domain? Website builders deserve to know this ahead of time so they can be intelligent about the ways they design their site structure (via SEJ).
- To Yahoo! and Microsoft/Live - Are you going to offer geo-targeting options like Google does in their Webmaster Tools? If not, would you be willing to follow a common format in robots.txt or meta tags to let sites tell you which countries/languages their content is targeted towards?
- To Google - if a site creates two subdomains, one targeting Canada and one targeting the US with very similar or nearly the same content on each, can both of those subdomains operate successfully targeting their specified country without causing duplicate content issues? If not, why?
- To Google - although you claim to notify webmasters inside Webmaster Tools about penalties to their site, I've seen many, many penalized domains and only one penalty message, ever. Why create and publicize the feature if you're going to continue to withhold that information from site owners? Surely your algorithms are savvy enough to detect pure spam vs. legitimate sites and businesses that simply made mistakes or attempted bad practices - why not give these domains the benefit of the doubt? Does more penalty reporting actually correlate with more spam? I'd find that hard to believe.
- To Yahoo! and Microsoft/Live - Penalty reporting would be an excellent feature - will we ever see it?
- To Google - An extraordinarily small percentage of questions get answered or addressed by Google representatives in the Google Groups for Webmasters area, yet you could easily create a more open, communicative environment by allowing your analysts to participate actively. What fears are preventing that direction? The strategy of Webmaster Central was always to open up communication, yet the party line of "well, most threads get good answers from the community" really ignores the potential abilities of your staff and the publicly promoted strategy of the group - where's the disconnect?
- To Yahoo! and Microsoft/Live - Any chance you'll offer groups/forums where webmasters can interact with engine representatives? (UPDATE: Yahoo! has a webmaster forum here and Live has one here, though it's down at the time of this writing.)
- To Google - Yahoo!'s Dynamic URL Rewriting system is a clear leap forward in letting site owners properly canonicalize page-level content and is not a massively challenging process to implement - why not offer it (or something similar in meta tags or robots.txt)? Savvy SEOs can conditionally redirect (see the sketch after this list), but most organizations don't have that SEO intelligence - why punish them?
(BTW - to Yahoo!ers reading this, the reason folks don't use the system is that they still have to fix canonicalization manually for Google & Live, so it's generally not worth the effort. It really is a great system.)
- To all the engines - can you offer some clear guidelines for what is cloaking vs. what is IP delivery (or whatever other name you have for showing different content to humans and bots in an acceptable way)? Many companies worry about the practice even when it's wise and the engines would probably approve, while others push the boundaries because those boundaries haven't been well defined.
- To all the engines - would you consider offering a clear method for showing whether a site's links no longer pass link juice? Google's current process of sometimes lowering PageRank in the toolbar is particularly weak, as the buyers naive enough to be purchasing links from low-quality directories and link-selling sites are often the same people who have no idea how to check what a domain's PageRank used to be or whether the current PR bar is meant to tell them something. Yahoo! and Live, meanwhile, offer no method whatsoever. This strategy would seem to fit very well with the concept of protecting consumers, and the downside - that some sites that escape detection could still sell links for a time - would seem to be, by far, the lesser of the two evils here.
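Here's the sketch referenced in the canonicalization question above: a minimal, hypothetical example of the kind of conditional redirect savvy SEOs set up today, written as Python WSGI middleware. The parameter names treated as non-canonical (sessionid, ref, sort) and the overall structure are invented for illustration - this is a sketch of the technique, not anyone's production code.

```python
# Minimal sketch: 301-redirect requests carrying non-canonical query
# parameters to the clean, canonical URL. Parameter names are examples.
from urllib.parse import parse_qsl, urlencode

NON_CANONICAL_PARAMS = {"sessionid", "ref", "sort"}

class CanonicalRedirect:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        query = environ.get("QUERY_STRING", "")
        pairs = parse_qsl(query, keep_blank_values=True)
        cleaned = [(k, v) for k, v in pairs if k not in NON_CANONICAL_PARAMS]
        if len(cleaned) != len(pairs):
            # Send bots and humans alike to the canonical version of the URL.
            path = environ.get("PATH_INFO", "/")
            location = path + ("?" + urlencode(cleaned) if cleaned else "")
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        return self.app(environ, start_response)
```

A feature like Yahoo!'s rewriting system would let site owners declare this once, instead of every organization having to build and maintain middleware like the above.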
Sometimes, I'm on the side of the search engines keeping things quiet, and I understand and empathize with their reasons. However, for these questions, I think straight answers are in their best interests. I must say that I've been generally impressed with the search engines offering more transparency of late, so I have hope that we might get good responses on some of these.
Please feel free to provide your own questions of similar nature in the comments if you've got them.
Can I add my two cents, Rand?
11. To all SEs - What tools will you give us webmasters to curb duplicate content and spam sites?
12. To Google - Why don't you give us control over sitelinks - what appears and what doesn't? 90% of the time, the links shown are irrelevant.
Mani, it's a good point you're tackling here.
Even though you can have some sort of control by disabling some of the links that you don't want to appear, I think it's still not enough.
Mani, you know that in Google's Webmaster Central you can block sitelinks that you don't like, right?
Matt (or anyone in the know): Why is there an expiration date on the sitelink blocking, as opposed to just a simple block/unblock? I get a notice that, if I don't visit Webmaster Tools, my blocked sitelinks will go active again in August, which seems to have no rhyme or reason to it.
I think because our sitelinks can change over time, and we don't want people to go from 8 sitelinks to 7 sitelinks forever when they could have had 8 good ones after a while.
I'll take that as "it's a work in progress" ;) Thanks, Matt; I was just curious what the reasoning was. I have to admit, I was slightly paranoid about blocking one, in case that was a lever that could never be unpulled. I'm still thrilled that one of my clients got sitelinks, even if it's a bit of a magical, mystery box.
Hey Matt, thanks for replying. You are right, I can block sitelinks selectively, but I can't decide what appears. As of now, I have to wait and see what appears first and, if it's irrelevant, block it, only to see something else irrelevant appear in its place.
I don't think we have the right kind of control unless we can decide -
a) what appears AND
b) what doesn't appear.
Having only one of those options isn't really something to be happy with, I guess. Correct me if I'm wrong.
In Google Webmaster Tools, you have some control over what to include and what not to include in the sitelinks section.
Chetan, you can ONLY block the already "automatically" generated sitelinks. You can't decide what appears; it's an automated process.
I can give a quick reply or two. On #4 and #10, it's a matter of how spammers would abuse that information. If a spammer always knew how much Google trusted a given link, that would let them spam much more.
I've been meaning to do a post about #9 (cloaking vs. IP delivery) for a while and haven't found the time, so I may ask for some help at the Googleplex to write about that topic.
On #6, our intent was to create a healthy forum where not only Googlers can help, but other webmasters could weigh in as well. I think our primary effort has to be finding and removing spam and otherwise improving the quality of our index, and with the remaining time and resources that we have available, we try to find scalable ways to communicate with as many people as possible. So the webmaster discussion forum is one good way, blogs are another, and conferences are one more way. We'll keep trying new approaches too, e.g. the live chat that we did with webmaster forum members attracted several hundred people to ask questions (for free).
Matt,
Can you confirm or deny the information I stated above about the penalty notifications in WMC?
Thanks,
Brent D. Payne
I am admittedly biased, but I think the Webmaster Central forums and articles are actually really good for many, many issues.
It's easy to point out the controversial and vocal minority of issues which are not completely addressed, for reasons including what Matt mentions above. But for a large portion of the (few) Q+A I've answered/helped to answer here at SEOmoz, I've found very good and authoritative answers in Webmaster Central articles or forum posts.
Matt,
Regarding point 4: I understand that there is potential for abuse by spammers. They could make a change, see whether they get an email, and use that as an iterative process to probe for algorithm loopholes.
However, I'm sure there are ways to limit this. You could limit the number of emails for a given site, or for a webmaster account accessed from an IP address tied to multiple penalized webmaster accounts.
There will always be some abuse but assuming that all webmasters will abuse the notification system seems a bit harsh.
I see more abuse potential for point 10.
Lance, I think the best way to think of it is a spectrum. On one side of the spectrum, there's the innocent site owner with (say) a small amount of hidden text that their webmaster might have added. On the other side of the spectrum there's the mega-blackhat doing all kinds of sophisticated things. The discussion is where on that spectrum Google should give feedback to people that we believe have some spam on their site.
We don't have the resources to engage in a one-on-one conversation with every site owner in the world, so we have to choose where on the spectrum to alert people and to encourage a reconsideration request. So far, we've stayed closer to the innocent side of the spectrum (things such as hidden text or someone with a hacked site). I'm open to ways that Google could move further along that spectrum. But when I see a pretty savvy site that was doing some really unusual link gathering or exchanging or other tricks, I tend to expect that the person made that choice deliberately and was willing to tolerate some of the risks. Given limited resources, we tend to try to help the smaller sites that didn't veer as far into the high-risk spectrum.
Matt,
Thanks for responding. I can see your point and understand the limited resources.
I can only imagine the amount of incoming requests and email traffic you and your team have to deal with.
Wait, why aren't you just focusing on those that are signed up for GWT? Why worry about "every site owner in the world"? Clearly those that are signed up for GWT care about why or how they might have been penalized. Surely focusing on these webmasters only is scalable, Matt.
Matt, when will Google start recognizing hidden links and text efficiently?
I still find so many websites ranking well because of hidden links coming from different .gov and .org sites. Sometimes even I feel tempted to put on my black hat some day...
Great list and thanks for linking to the SEJ piece.
However, you seem to be asking too much with this one:
With openness like this, the SEO world will go mad, I'm afraid. Besides, whether link juice passes can be seen clearly enough in a site's ranking behavior...
Ann - I think you're overestimating the ability of most players in the search marketing field to test the "ranking" ability of a site. I personally don't even feel qualified to say for certain whether a site is passing juice, and this feeds an industry of folks who prey on that ignorance to sell a product (links) that provides no value.
"I personally don't even feel qualified to say for certain whether a site is passing juice"
Sure, no one can be sure. But that's the beauty - testing and experimenting.
I personally don't expect Google to clearly tell us what sites to buy links from :) Moreover, I don't think that would be the right thing to do. If information like this gets revealed, the anti-link-buying war can never be stopped (unless the power of inlinks is devalued completely)...
I have to agree with Ann (and I guess Matt below); it seems this would just open Pandora's box. Not sure we need even more abuse or exploitation of sites.
I think the abuse happening now from link sellers preying on ignorant buyers is far greater than the potential harm of sellers being able to determine which sites have already been caught vs. which haven't. A formal statement of "we caught you" would, in my opinion, have a drastically positive effect on the overall problem - not only would buyers be warned, but the selling market would shrink massively as sellers became frustrated pouring time, money, and effort into business models the engines could snuff out in a second. I think this is a case of selfishness on the part of the engines and a failure to grasp the big picture.
I have to agree about not really wanting the world at large to know for sure which sites are still passing PR (or link juice, strength, etc), although I hadn't thought of Rand's reasoning either: that the uncertainty allows useless sites to fake and inflate their ability to pass PR and prey on the uninformed.
Now I'm on the fence, but at the same time, I don't ever see a world where an official tool or a metric becomes available where we can clearly see that Site A's links are stellar and Site B's are worthless.
At the same time, knowing for sure would put those sites that still pass PR on notice: they'd know that they were being watched and that abuse of their strong standing would result in a public penalty.
4. Man, is this ever so true. They need to come forward with this info pronto for those that are penalized. It is like being held in jail without being charged.
6. I do see John Mueller and a few others floating around there, but they need a full-time team going through Google Groups.
7.
https://www.jaankanellis.com/get-answers-from-all-three-search-engines/
MSN: https://forums.microsoft.com/webmaster/default.aspx?siteid=79
Yahoo: https://suggestions.yahoo.com/?prop=SiteExplorer
I feel terrible that I didn't know about the Yahoo! and Live forums (although the Live Search forum appears to be down and isn't listed on this page https://forums.microsoft.com/ either).
Thanks for that - going to edit the post!
No problem Rand, this is why we have these things called blogs and comments. Information transfer ;)
I wish I knew how Google Maps connects a listing to reviews from other sites like TripAdvisor.
When I verified my sites, it essentially created new, duplicate listings. I really just wanted to add/correct data in the listing that comes up in the OneBox in regular search and in Google Maps.
It seems like Map spam to me to have multiple listings for the same business. If there's some magic key to getting all the data to funnel into one listing, I sure would like to know it.
I.e., some random user (I'm told it was a user) added another listing for my business (with the wrong area code on the phone number, to boot), and the TripAdvisor reviews attached to it. Yet they don't attach to mine... the verified-by-business-owner one.
#2 - I explicitly asked a Yahoo rep this question at a recent conference here in Ireland - apparently, yes, they are developing this functionality.
#3 - my experience with this is that you can run into some quite large problems if the content is identical or near identical, especially with users of google.com in those locales.
#4 - they don't notify all webmasters of penalties. I work with a large media corp that has a TBPR penalty and not a whisper of a notification. On the recent webmaster call the Googlers mentioned that they don't notify all sites of penalties.
#6 - the recent appointment of John Mueller (JohnMu of GSiteCrawler fame) marks a change, I think - John has been far more open than other Googlers, and I think we're seeing that in some of the recent info that's coming out (the sitelinks info at #61 springs to mind).
#8 - not sure that Yahoo!'s dynamic URL thingie has any benefit, personally. Used it on a large site that had massive dynamic URL issues but can't say we saw any return.
#10 - doubt you'll see any movement there while spammers tend to be some of the cleverest people in the industry...
Great set of questions :)
[edit - I started writing this about 4 hours ago and went on a call. Now I see tonnes of responses, so sorry if I've repeated what others have said already.]
To Google - referring to point 3 from Randy: as a company operating worldwide, does the company have to create subdomains for each country, or will subdirectories do the job as well? Keep in mind that the content of the index page would be the same for many countries where we would use English, but the rest of the site would be different.
Thank you, Randy, for asking all those questions. Hopefully we'll eventually get some answers from the search engines!
I know I'm picky about this, but my name just has the four letters in it - "rand" - there's no y or i at all. :)
As for your question, it's been answered several times by the Google folks - subdirectories are fine for targeting different regions. You can separately register them with Webmaster Tools and target them towards the region of your choice (which is a terrific feature).
Sorry about the mistake in your name.. :-(
I won't make that mistake again...
Wonderful post Rand. Thank you for taking the time to compile this list.
I'd love to see Google more proactive with penalty notification. I think it would benefit Google more to let sites know they have been penalized and even why. It may cause the sites to change their ways, improve the website and in the long run improve user experience.
After all, isn't Google always talking about user experience?
This one is directed at all the engines:
When will we see added weight given to microformats and other "Semantic Web" propagation techniques in your ranking algorithms? I feel this adds a ton of distinguishable content and data and only helps the Internet come closer to what it could be.
I'll be happy to hear any other SEOs' thoughts on this as well.
Hi Randfish,
This question is for Google:
I am sure everybody is familiar with the notorious search query “Who is failure?” and its result. Though this is discussed over and over again in many forums, Google has not yet done anything about this. Why is this not yet rectified?
Nice post, Rand, but I want to add this... to all SEs:
> Can you provide data that shows our CTR and impressions in organic results?
Hi Rand
Nice post. Will add:
#11, to Google: Imagine two sites, A and B, and four pages A1, A2, B1, B2, all having the same PageRank. Page B1 is targeted at "widget". Now change the setup so that A1 links to B1 with the anchor text "widget" and B2 links to A2 with any text. Should I expect B1 to rank higher for "widget" in the second setup than in the first? Or will the IR score not value the incoming link because of the reciprocal B2->A2 link?
#12: Google: As Joost de Valk, amongst others, has documented, the nofollow attribute can be utilized by skilled SEOs to perform PageRank sculpting. It is also used by Wikipedia on all external links. Apparently this attribute is being (ab)used by skilled SEOs to get an even bigger advantage over the ordinary webmaster, while it contradicts the original "random surfer" in the PageRank algorithm. Especially for internal links, it looks most of all like a tool for the best SEOs. Have you considered discontinuing support for this attribute (for internal links only, or entirely) due to its abuse?
"#12: Google: As Joost De Valk amongst others have documented has documented the nofollow attribute can be utilized by skilles SEOs to perform PageRank sculpting. It is also being used by wikipedia for all external links. Apparently this attribute is being (ab)used by skilled SEO's to get an even bigger advantaged over the ordinary webmaster, while it contradicts with the original "random visitor" in the PageRank algorithm. Especially for internal links, it looks most of all as a tool for the best SEO's. Have you considered to discontinue support for this tag (partly for internal links or entirely) due to its abuse?"
Lots of problems with this post:
1. I don't see anyone getting much of an advantage using PR sculpting. People may say they do, but with no proof. I don't see the tag being abused by anyone.
2. Using nofollow on external links is different from what most think of as PR sculpting, which mostly applies to internal links.
3. Matt Cutts and others have said many times that you would be wise to spend your time and resources elsewhere rather than worrying about PR sculpting.
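Since the "random surfer" model keeps coming up here, below is a toy sketch of PageRank computed over a four-page example, once with every internal link counted and once with one link dropped to mimic sculpting. Treating a nofollowed link as simply absent from the graph is an assumption made purely for illustration - it is not a claim about how Google actually handles the attribute.

```python
# Toy PageRank on a tiny graph to illustrate the "random surfer" point above.
# Dropping a nofollowed link from the graph is an illustrative assumption,
# not a statement about how any engine really treats the attribute.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
            else:
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

# Every internal link followed:
full = {"home": ["about", "widgets", "terms"], "about": ["home"],
        "widgets": ["home"], "terms": ["home"]}
# Same site with the home -> terms link nofollowed (dropped):
sculpted = {"home": ["about", "widgets"], "about": ["home"],
            "widgets": ["home"], "terms": ["home"]}

print(pagerank(full)["widgets"], pagerank(sculpted)["widgets"])
```

In this toy model, the remaining link targets of "home" simply pick up the share the dropped link would have received - which is all PR sculpting claims to do.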
Rand, I have a problem with Q2:
"To Yahoo! and Microsoft/Live - Are you going to offer geo-targeting options like Google does in their Webmaster Tools? If not, would you be willing to follow a common format in robots.txt or meta tags to let sites tell you which countries/languages their content is targeted towards?"
Surely the HTML "lang" attribute would do this sufficiently? e.g. lang="en-us" (English, USA or lang="fr-ca" (French, Canada)
It's an already establish convention that's been around for years, very easy to implement. Why re-invent the wheel. Or am I mis-reading the question?
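To make the suggestion concrete, here's a rough sketch of how an engine (or a site audit script) could read that existing attribute instead of requiring a new robots.txt or meta-tag format. The sample page and the idea that engines would consume the signal exactly this way are assumptions for illustration only.

```python
# Rough sketch: read the existing HTML lang attribute as a language/region
# signal rather than inventing a new robots.txt or meta-tag convention.
from html.parser import HTMLParser

class LangSniffer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # Capture the lang attribute from the root <html> element only.
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")

page = '<html lang="fr-ca"><head><title>Exemple</title></head><body>...</body></html>'
sniffer = LangSniffer()
sniffer.feed(page)
print(sniffer.lang)  # -> "fr-ca" (French, Canada)
```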
Let's see some Yahoo! folks come over and answer what you asked!
Great post. Rand, I appreciate you asking the questions that most of our voices aren't loud enough to ask.
I also appreciate Matt Cutts for his timely responses and commitment to engage in the webmaster/marketing community.
I think Yahoo and MS are too busy playing with each other to give our community much input, but I hope they prove us wrong. Maybe Yahoo will learn a thing or two with their new relationship with Google.
I couldn't agree more with #4. What good is the tool if it's not being utilized? I took over the SEO duties on a site after the first of the year, and the site was severely penalized before I had the chance to see what kind of practices had previously been used. The site now has only a handful of internal pages indexed, and I still haven't seen anything from Google as to why the site was penalized. Furthermore, I've picked apart what I can see and still don't find anything on- or off-site that looks wrong.
If Google actually supported this tool, I might actually be able to fix whatever it is they are unhappy with. If they are afraid reporting penalties will increase spam, why offer the service at all?
While I agree with most of the questions, there are a few things I wanted to ask about.
#9: It would seem that cloaking is a bad practice in general. Specifically with respect to Google, they cache a copy of the pages they spider - why would you want to serve them something different than your actual site? Really, we're talking about not serving up advertising, since it is irrelevant to a bot and saves us bandwidth, right? Or is it purely that we are trying to serve content in a way optimized for search engines? That being the case, shouldn't we, the developers of the web, be using open standards to create our sites? The engines use those standards to grab content and rank pages; with that in mind, our sites should be using those standards too. Therefore, we shouldn't _need_ an "optimized" version of the site for SEs vs. users.
Plus, throwing a process in front of the server adds overhead to requests and requires post-processing of the server's response for the page. This will slow things down and make your server work harder, which will decrease the lifetime of the server and, depending on the number of hits you receive, add major overhead.
@Mani
If I read your addition (number 12) correctly, you want to show specific content by relevance, correct? Shouldn't using the sitemap protocol help with this (https://www.sitemaps.org/)? In theory, you are providing them with the correct site structure to search, and therefore relevancy should be better based on what you specify - see the sketch after this comment.
Or are you saying that you'd like more control over your pages' relation to other sites, meaning that site X links to yours and is roughly the same topic, but site Y links to yours and is just spam-related, and you want to remove that link? If so, right on! :)
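For anyone who hasn't used the sitemap protocol mentioned above, here's a minimal sketch of generating a sitemaps.org-format file programmatically. The URLs and priority values are made-up examples, and priority is only a hint to the engines, not a guarantee of anything (certainly not of sitelink selection).

```python
# Minimal sketch of emitting a sitemaps.org-format file; URLs and priority
# values below are invented examples.
from xml.sax.saxutils import escape

def build_sitemap(entries):
    """entries: list of (url, priority) tuples."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    for url, priority in entries:
        lines.append("  <url><loc>%s</loc><priority>%.1f</priority></url>"
                     % (escape(url), priority))
    lines.append("</urlset>")
    return "\n".join(lines)

print(build_sitemap([("https://www.example.com/", 1.0),
                     ("https://www.example.com/widgets/", 0.8)]))
```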
Nicholas - there are a surprising number of times when site owners need to show different content to engines vs. humans, for reasons ranging from landing page testing to ad interstitials to login requirements to geo-targeting and plenty more. White hat "cloaking" usually is not about showing an optimized version of a page, but simply about showing an accessible one or a more universal one.
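As a rough, hypothetical sketch of that "accessible version" idea: the underlying article is the same for everyone, and only an ad-interstitial wrapper is skipped for known crawlers. The crawler token list and the interstitial_wrapper helper are invented for illustration, and whether a given engine considers this acceptable is exactly the ambiguity the question in the post asks them to resolve.

```python
# Rough sketch of the "accessible version" idea described above: the article
# content is identical for bots and humans; only the ad interstitial wrapper
# is skipped for known crawlers. The token list and helper are hypothetical.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")  # illustrative list

def render_page(article_html, user_agent, has_session):
    ua = (user_agent or "").lower()
    is_crawler = any(token in ua for token in CRAWLER_TOKENS)
    if is_crawler or has_session:
        return article_html            # same content everyone ultimately sees
    return interstitial_wrapper(article_html)  # humans see the ad page first

def interstitial_wrapper(article_html):
    # Hypothetical helper: wraps the article in a "continue to article" page.
    return ("<div class='interstitial'>Advertisement... "
            "<a href='#article'>Continue</a></div>" + article_html)
```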
Just to back up Rand: Even Google's own tools, like Website Optimizer, essentially cloak. Showing multiple test versions to the engines would potentially make a mess and would ultimately be confusing and harmful to end-users.
Ultimately, defining cloaking has always depended on intent; the problem is that no one seems to want to give a clear definition of what bad intent is.
FYI here's Google's take on cloaking and multivariate testing.
Great list Rand and hope some of these get checked off in the future.
I think there are also still some real basics that I'd love to hear the engines weigh in on, particularly around the advanced queries and their reporting.
re: Google's site: results count. I did notice the other day that when checking sitemaps in Webmaster Tools, it was telling me how many of the pages in the sitemap are in the index.
I haven't noticed that before - if it is accurate (and I haven't tried to check it yet) it could be quite useful.
I've had conversations with Google about the lack of a WMC notification on more than one occasion.
I received EXTREMELY vague responses (just love it when I get those), but here is what I interpreted from them (disclaimer: I may have interpreted it wrong).
The consensus from conversations with more than one Googler was that the notification process is set by a computer, not a human, with some type of dial controlling the number and type of notifications that Google sends. This dial is controlled by a small group of people (they alluded to engineers or some other similar group) and not a group that most (and I'd venture to say any) outsiders could really influence.
This community MIGHT be able to do it but I honestly don't feel Google could change this in the short term even if they did want to change it--and I'm not convinced they do want to change it.
I think we oftentimes believe G' can do anything... in reality, though, it takes time for change to occur with them too, and the larger they get, the longer change will take.
Great post Rand! An excellent way to rally the community to create a voice. Hopefully someone from G' will swing by here and provide whatever information they can.
Brent D. Payne
I'm tempted to open another SEOmoz account to give you another thumbs up
Nice work - very much what we need from the engines. I would love to see a truthful engine; sometimes even they don't follow what they've said. I'd love to see more reporting on duplicate content issues and clearer clarification from the engines about duplicate content.
I love SEOmoz, as it has been VERY, VERY helpful - I am new to the SEO community and to my job (the only SEO manager). I am still working on my boss to get me the Pro membership. I can't afford it on my own, as gas is expensive and I drive a military-issue Hummer... well, maybe not, but it sure feels like it.
My question involves indexing a large database of subscribers to a website. Does anyone have an idea how Facebook opened up its database to be indexed without showing private information/profiles? Here is the message I sent; they, of course, were no help, with an automated email reply directing me to webmaster support.
"Hello at Google,
I am contacting you in regards to having a database of names indexed by Google. We currently have over 850,000 people in our database. We would like Google's search engines to index those names, so that when Google users enter a person's name (e.g. John Smith), they will see that a John Smith (who served in the Army from 1965 to 1973) has registered on VetFriends.com. How do I have Google's web crawlers index that information? These names are secured in our database, meaning that you have to sign up for our website (either free or paid) to be able to search them. The names do not currently get indexed, probably because our database is secured. We would like our results to show up in Google much like results show up for Classmates.com. The following was taken from your results page for the name Adam Moore; how might we go about this?"
Classmates: Adam Moore View Adam Moore’s profile on Classmates.com. Find old friends from school, college, old jobs, or military installations. www.classmates.com/directory/public/memberprofile/list.htm?regId=109013951 - 22k -
Have you tried linking to each in a directory format, e.g. like linkedin.com?
Honestly, we just had this idea pop into our heads the other day; I was curious about the best way this could be done. Thanks for your help!
It's not all that complex. Just go to LinkedIn and see how they are doing it.
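For what it's worth, here's a hedged sketch of that directory-format idea: generate crawlable index pages that expose only the public data (a display name and a stub profile URL), keeping everything else behind the login. The base URL, page size, and member data below are invented examples, and whether this fits a given site's privacy policy is a separate question.

```python
# Sketch of the "directory format" suggestion above: crawlable index pages
# that list only public data (name + stub profile URL); everything else stays
# behind the login. Names, IDs, and URLs here are invented examples.
from itertools import islice

PAGE_SIZE = 100

def directory_pages(members, base_url="https://www.example.com/directory"):
    """members: iterable of (member_id, display_name) containing public names only."""
    members = iter(members)
    page_num = 1
    while True:
        batch = list(islice(members, PAGE_SIZE))
        if not batch:
            break
        items = "\n".join(
            '  <li><a href="%s/profile/%s">%s</a></li>' % (base_url, mid, name)
            for mid, name in batch)
        yield ("%s/page/%d" % (base_url, page_num),
               "<ul>\n%s\n</ul>" % items)
        page_num += 1

for url, html in directory_pages([("109013951", "Adam Moore"),
                                  ("109013952", "John Smith")]):
    print(url)
```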