Superficial Crawling SEO Strategies

On WebmasterWorld people are discussing Big Daddy strategies, and on SEO Roundtable they are highlighting how this is becoming a problem for directories. No discussion about Google’s new crawling method would be complete without also looking at Matt Cutts on the indexing timeline. While things are still in flux to some extent, I think we’ve hit a turning point for SEO that needs to be reckoned with.

For those of you who’ve been playing in the space for any length of time, what we’re seeing and hearing may look familiar. In fact it’s following a path very similar to that of the alleged ‘thing that looks and acts like a sandbox but really isn’t a sandbox’. A brief history: some people were reporting difficulties getting new content to rank in the last quarter of 2003. Lots of folks dismissed it, saying things were just in flux and under adjustment, but by early 2004 the reports were so numerous they became hard to ignore, and the ‘sandbox’ term was born. Now I know there are some who steadfastly believe there is no sandbox. What they mean to say is the sandbox doesn’t exist for them, either because they have long-term marketing goals that don’t depend on free search engine traffic (as Mike Grehan says) or because they are creating unique and truly link-worthy content (When Unique Content Is Not “Unique” – Sugarrae). However, if you ask anyone who plays in the highly competitive trenches, dabbles in the black arts, has websites that are slightly gray hat, or is an outright button-pushing spammer, they can confirm exactly what Danny Sullivan says: yes, there is a Google Sandbox (2005 Year End Revisit: Is There A Google Sandbox?).

However, I’m not looking to stir up yet another pointless is-there/is-there-not sandbox debate. What I am here to say is that I think we are hitting the leading edge of a new force to be reckoned with, which for lack of a better term I’m calling ‘sandbox crawling’. Here’s the way I see it: if your website is missing the right ‘quality indicators’, what you’ll start to see is superficial crawling and indexing of your website. Your site, which may have hundreds, thousands or even hundreds of thousands of pages, will just not be as well represented in Google’s index as you would like it to be. You will still be indexed in some way: your home page for certain, and probably most of your first-level pages, but second-level, third-level and fourth-level pages will remain largely untapped. Now what if you, like Andy, firmly believe The Money’s In The Archives Stupid? Isn’t this light crawling going to have an effect on the way you monetize your website? You bet it will!

Let’s look at some areas it will start to hit first, like directories. Well-established quality directories like botw.org and the Yahoo directory are and will remain well indexed ([site:botw.org] and [site:dir.yahoo.com]). However, some of the mid-level and lower-level directories are going to start dropping pages. When these pages start falling out of the index, so will all the backlinks on them. Who’s going to be another candidate that gets hit hard? Article directories. If you look at EzineArticles and IdeaMarketers you’ll still see they are fairly well represented ([site:ezinearticles.com] and [site:ideamarketers.com]), but lots of other article directories won’t be. What does that mean? Fewer pages indexed = fewer backlinks indexed.

So what are some strategies going forward? Well, 4 out of 5 members of the Department of the Overly Obvious say write quality, compelling, unique, link-worthy content, and actively promote your website with methods that aren’t dependent on free search engine traffic for a viable business model. Here are a few other tips to consider:

Google Sitemaps: If you have a good, reasonably clean website there’s no compelling reason not to use Google Sitemaps. The information you can get from it is helpful, and it may even help you gain a minuscule amount of trust. The most luck I’ve had with Google Sitemaps is on websites of under 100 pages.
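
If you have never built a sitemap file, here is a rough sketch of what generating one might look like in Python. Treat it as an illustration only: the example.com URLs and the build_sitemap helper are made up for this example, and the namespace used is the one from the sitemaps.org protocol, which may differ from whatever schema your tool of choice expects.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (an assumption here;
# check which schema your sitemap consumer actually expects).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; example.com is a placeholder domain.
pages = [
    "http://www.example.com/",
    "http://www.example.com/widgets/",
    "http://www.example.com/widgets/blue-widget.html",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Save the output as sitemap.xml at the root of your site and submit it through your Google Sitemaps account. Listing the deeper pages explicitly is the whole point, since those are the ones most likely to be skipped by a superficial crawl.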

Improve Your Architecture: The need for well-thought-out, planned architecture has never been greater. Organize things into logical categories that make sense for users; go wide, not deep. For heaven’s sake put up a site map already, and interlink different areas of your site as much as possible. Don’t use nofollow to manipulate your PageRank so that you have a PR7 homepage and everything else is a PR1; it’s not the end result you want.
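
One way to put a number on "wide, not deep" is to count how many clicks each page sits from the homepage. The sketch below does that with a breadth-first search over a made-up internal link graph; the click_depths function and both toy sites are hypothetical, and it says nothing about how Google itself measures depth.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from `home`; returns {page: clicks from home}."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graphs: page -> pages it links to.
deep_site = {
    "/": ["/cat/"],
    "/cat/": ["/cat/sub/"],
    "/cat/sub/": ["/cat/sub/item.html"],
}
# Same pages, plus a site map linked from the homepage.
wide_site = {
    "/": ["/cat/", "/sitemap.html"],
    "/sitemap.html": ["/cat/", "/cat/sub/", "/cat/sub/item.html"],
    "/cat/": ["/cat/sub/"],
    "/cat/sub/": ["/cat/sub/item.html"],
}
deep_depths = click_depths(deep_site, "/")
wide_depths = click_depths(wide_site, "/")
print(max(deep_depths.values()), max(wide_depths.values()))
```

In this toy example, adding a single site map page pulls the deepest page from three clicks away to two. If the crawler really is stopping after a level or two, that is the difference between being in the index and not.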

Get the Deep Links: Stop trying to control how people come to your site. Let them link to wherever they want; in fact, encourage it. My most well-indexed site has about 70% of its links pointing to pages other than the homepage.
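
If you want to check your own deep link percentage, something like the following would do it, assuming you can export a list of the URLs your backlinks point at. The deep_link_share function and the sample targets are invented for illustration; they are not data from any real site.

```python
from urllib.parse import urlparse


def deep_link_share(backlink_targets):
    """Fraction of inbound links whose target is not the site root."""
    if not backlink_targets:
        return 0.0
    deep = sum(
        1 for url in backlink_targets
        if urlparse(url).path not in ("", "/")  # "" and "/" both mean homepage
    )
    return deep / len(backlink_targets)


# Hypothetical backlink targets: 3 point at the homepage, 7 at inner pages.
targets = [
    "http://www.example.com/",
    "http://www.example.com",
    "http://www.example.com/",
    "http://www.example.com/archives/post-1.html",
    "http://www.example.com/archives/post-2.html",
    "http://www.example.com/archives/post-3.html",
    "http://www.example.com/widgets/",
    "http://www.example.com/widgets/blue-widget.html",
    "http://www.example.com/about.html",
    "http://www.example.com/tools/checker.html",
]
share = deep_link_share(targets)
print(f"{share:.0%} of links point to pages other than the homepage")
```

A low number here suggests most of your link equity lands on the homepage, which is exactly the profile that leaves second-level and third-level pages untouched.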

Blackhat Strategies: OK, this section’s for you churn-and-burn folks; the rest of you move along. Making money from those 5,000 or 50,000 pages of autogen content is harder now and may require a shift. Again, I think the key here is thinking wide: combine all those short 100-word pages together, and build a massively long page linked right off the homepage. Let’s be honest, you’re not concerned with usability; in fact bad usability is good, because it makes visitors more likely to click on an advertisement and leave. How about we start thinking about uber-microsites and one-page websites? You don’t have to worry about a one-page website being deep crawled and indexed; it’s all or nothing. What about keeping domain costs down … well friend, subdomains can be your friend …

It is possible all of this is a temporary glitch and everything will return to normal, but I’m not betting on it. I learned my lesson from the sandbox and am starting to take a much more proactive role to counteract the new Big Daddy crawling method. I’m not spending time debating how ‘right’, ‘wrong’, or ‘unfair’ it is. Plain and simple, that’s not going to line my pockets. I’m doing what I can to get ahead of the curve.
