Cleaning Up After Google’s URL Mess

Recently Google announced the release of their own URL shortening service, Goo.gl. If we put aside the data gathering aspects for a moment, this product is the third in a series of products that make the web a more difficult place to crawl. In this post we’ll follow a Goo.gl URL through its course and look at the trail of junk each piece leaves in its wake.

Running the Goo.gl URL through the SeoConsultants header checker shows the path below: three server responses, the first two of them 301 redirects.

http://goo.gl/fb/gNHq

#1 Server Response: http://goo.gl/fb/gNHq
HTTP Status Code: HTTP/1.0 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Location: http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter
Expires: Tue, 15 Dec 2009 02:18:06 GMT
Date: Tue, 15 Dec 2009 02:18:06 GMT
Cache-Control: private, max-age=86400
X-Content-Type-Options: nosniff
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Server: GFE/2.0
Redirect Target: http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter

#2 Server Response: http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter
HTTP Status Code: HTTP/1.0 301 Moved Permanently
Location: http://graywolfseo.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29
Content-Type: text/html; charset=UTF-8
Date: Tue, 15 Dec 2009 02:18:06 GMT
Expires: Tue, 15 Dec 2009 02:18:06 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-XSS-Protection: 0
Server: GFE/2.0
Redirect Target: http://graywolfseo.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29

#3 Server Response: http://graywolfseo.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29
HTTP Status Code: HTTP/1.1 200 OK
Date: Tue, 15 Dec 2009 02:18:06 GMT
Server: Apache/2.2
X-Powered-By: PHP/5.2.0-8+etch15aaa+tigertech1
Vary: Cookie
X-Pingback: http://graywolfseo.com/xmlrpc.php
WP-Super-Cache: WP-Cache
Connection: close
Content-Type: text/html; charset=UTF-8
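
If you want to reproduce a trace like the one above without a web-based header checker, here is a minimal Python sketch. It is not how the SeoConsultants tool works internally; the function names `head` and `trace_redirects` and the `fetch` parameter are made up for illustration. It sends one request per hop, records the status code, and follows the Location header by hand instead of letting the HTTP library follow redirects silently.

```python
import http.client
from urllib.parse import urlsplit, urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}

def head(url):
    """Send a single HEAD request, without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def trace_redirects(url, fetch=head, max_hops=10):
    """Return [(url, status), ...], one entry per hop.

    Stops at the first non-redirect response, or after max_hops
    so a redirect loop cannot run forever. `fetch` is injectable
    so the chain logic can be exercised without a network.
    """
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace_redirects("http://goo.gl/fb/gNHq")` and printing each `(url, status)` pair gives one line per hop. What a live request returns today will not necessarily match the 2009 trace above.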

I can tell you this is not a setup I would ever personally recommend. I can also tell you from firsthand experience that I have sat on panels with Google employees where we all but held hands, sang Kumbaya, and preached about the value of “pretty” URLs and the importance of keeping things easy for search engine crawlers to understand. That is why it’s so mind-boggling that a search engine is behind the current state of things.

Back to the case at hand: the Goo.gl URL 301 redirects to a FeedBurner feedproxy URL. This is where things start to get screwy. (Irony: FeedBurner is a Google company.) Since Google already knows the final URL, there’s no need for the intermediate step. The program should share the data and pass you straight to the destination without taking the long road, if you know what I mean.

http://feeds.feedburner.com/~r/Wolf-howl/~3/D-BSa1LCo88/?utm_source=feedburner&utm_medium=twitter

On to step number two. If you are using FeedBurner and Google Analytics, Google appends a bunch of tracking parameters, making your URL a big fat ugly mess like this:

http://graywolfseo.com/google/google-personalized-search-news/?utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+Wolf-howl+%28Graywolfs+SEO+Blog%29

IMHO this is where the train completely jumps off the tracks. There’s no need for the program to make things that complicated to crawl or to create duplicate content issues. Again, both are Google services and should exchange the info behind the scenes. And NO, rel=canonical is not a solution. Band-aid solutions are not the answer to bad architecture.

The last hop comes from a cleanup plugin I have from Joost de Valk that redirects you to a clean URL with all of the query-string garbage removed. But I wouldn’t need this step if the first two steps hadn’t created problems in the first place.
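
To make the cleanup idea concrete, here is a rough Python sketch of stripping the tracking parameters from a URL. This is not Joost de Valk's plugin, which is a WordPress plugin; the function name `strip_tracking` and the `prefix` argument are hypothetical, and the sketch assumes every parameter you want gone starts with `utm_`.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_tracking(url, prefix="utm_"):
    """Drop query parameters whose names start with `prefix`.

    Other parameters are kept in their original order. Note that
    urlencode re-encodes whatever is kept, so the surviving query
    string may not be byte-for-byte identical to the input.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(prefix)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Run against the tagged URL above, this returns `http://graywolfseo.com/google/google-personalized-search-news/`, which is the one address every tracked variant should collapse to. A site would still need to issue a redirect to that address, as the plugin does; the sketch only computes it.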

So how ’bout it, Google? Instead of releasing 38 new products in 70 days, put more resources into fixing the messes some of your existing ones are creating.

Warning: Conspiracy Theory Ahead

Now some people might make the case that because Google has all of this data, they are eventually able to sort it out in the back of the house and it’s not an issue at all. However, by creating all of this unnecessary complexity at the URL level, Google is intentionally making the web a more difficult place for less sophisticated crawlers, spiders, and search engines to deal with. Basically they are laying down land mines to slow or trip up the competition. But that’s not something we would ever see come from the home of lava lamps and bean bag chairs, right? Right …
