As you may already know, I’ve been a Web Developer since 1999 and run Website Setup dot net.
I use Google Webmaster Tools for several of my and my customers’ websites. Recently, under Diagnostics > Crawl Errors, I discovered quite a few 404 (not found) errors pointing to the same non-existent location:
Sure, a 404 error is totally expected. After all, the “/a” directory does not exist, however, the question remains, how did the Google-bot get the idea to crawl there in the first place? Perhaps a programming error on my part or a typo?
Well, this is strange, I’m now seeing it on more than one site and I’m finding other people complaining about the same thing suddenly appearing in their Webmaster Tools Dashboards.
Let’s now examine my Dashboard’s Linked From data:
Hmm, no clues there. Nothing in any of those pages link to a “/a” location.
We find this occurrence…
<a style="color: red; float: left; opacity: .55;" href="/a">a</a>
What’s the solution to all this? You don’t want a bunch of 404 errors piling up… although Google is smart enough to drop bad URL’s from their index, they can also penalize a site for this by reducing the crawl rate.
Solution 1: Redirect “/a” to your home page with a 301 in your htaccess file. This approach has two minor issues. One, that your server is doing the work by sending the Googlebot back to your home page and two, the page never existed in any Search Index, theoretically, there should be no reason to redirect it elsewhere.
Solution 2: Block this location from the Googlebot in your robots.txt file. This puts the responsibility on Google to stay out of someplace they don’t belong.
Disallow: /a/ Disallow: /a
After several weeks, you should see these erroneous 404 errors disappear. Good luck!
EDIT: In this article, I’m only assuming this is an issue for sites that host jQuery locally. I cannot imagine the google-bot trying to crawl scripts hosted on it’s own CDN!
EDIT 2: Here is an official response from a Google employee posting in Google Groups:
4/28/11 – 4:39 AM
I would also recommend not explicitly disallowing crawling of the jQuery file. While we generally wouldn’t index it on its own, we may need to access it to generate good Instant Previews for your site.
So to sum it up: If you’re seeing “/a” in the crawl errors in Webmaster Tools, you can just leave it like that, it won’t cause any problems. If you want to have it removed there, you can do a 301 redirect to your homepage.