One of the things toolbars and personalized search will do is reduce algo busting to the equivalent of speaking Latin. However, I personally hate personalized search, and I don’t think it’s going to be as big as they all hope, so I like to practice a little algo dissection from time to time.
One of my favorite methods is pushing a SERP way beyond the levels it’s normally expected to operate within. Now of course you can learn things by giving it a gentle shove, or even a firm push, but I want to really push it and see what it does under the stress. So to perform this test I’m going to pick three words. Not just any three words, but three words with the greatest possible distance from each other, both from an alphabetical standpoint and a topical standpoint. So I’ll pick a word that starts with the first letter of the alphabet, A; then a word that starts with the 13th letter, M (those of you who are older than 30 may remember when the number 13 in a window meant something); and one that starts with the last letter of the alphabet, Z.
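The selection step above can be sketched in a few lines of Python. The inline wordlist is just a stand-in for illustration; any dictionary file would do.

```python
# Pick one word each starting with the 1st, 13th, and 26th letters
# of the alphabet, to maximize alphabetical spread between test terms.
words = ["arabesque", "basalt", "malachite", "quartz", "ziphius"]

def pick_spread(words, letters=("a", "m", "z")):
    """Return the first word in the list starting with each letter."""
    return [next(w for w in words if w.startswith(c)) for c in letters]

print(pick_spread(words))  # -> ['arabesque', 'malachite', 'ziphius']
```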
So our test case will involve the words arabesque malachite ziphius. OK, I know you’re going to ask: arabesque is a ballet term (since the girls go to ballet, I have to learn these things). Malachite is an opaque greenish stone; I used to work for a store that sold jewelry and we had to learn about gemstones, and yes, it was on the test. Ziphius is the first part of the Latin name for Cuvier’s beaked whale (sometimes I’m just wired that way, OK?). Now we have three terms that are far apart in the alphabet, and the number of documents that talk about ballet, gemstones, and rare deep-sea cetaceans all at once is going to be pretty low.
Let’s look at Google’s SERP for [arabesque malachite ziphius]. First off, as of this writing I get 46 results (of course, a few months after this is published the results will be different, and this document will probably appear as well). The second thing to notice is that all of these results are supplemental results. Another thing to notice is the size of these documents: as low as 1200K and as high as 6500K. If you actually open any of the results you’ll see the files are dictionary files, with nothing more than words. Two questions come to mind: why are these files available on public servers, and why is Google indexing them?
OK, but let’s get back to the SEO perspective. Current thinking is that the optimal size for pages is about 100K-ish; however, this shows Google is capable of reading quite a bit deeper. Of course, it should be noted there is no formatting data on these pages, and they are in the supplemental index. Next let’s look at proximity. If you’ll notice, the words arabesque and malachite are hyperlinked and ziphius isn’t. Does that mean the last word wasn’t in the file, or that it was too far away? Let’s choose the last word in the dictionary to check: [arabesque malachite zyzomys]. OK, the top SERP is still the same, but the word isn’t hyperlinked, so it’s a proximity issue. Let’s go back to the original SERP. I didn’t count the number of characters between arabesque and malachite because my software can’t do it easily; however, arabesque is on line 5076 and malachite is on line 54973, so it is a considerable distance away.
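Counting the distance between two terms is easy to script, if you’d rather not eyeball line numbers. Here’s a minimal sketch; the toy word-per-line document stands in for the much larger dictionary files in the SERP.

```python
def term_distance(text, term_a, term_b):
    """Return (char_distance, line_distance) between the first
    occurrences of term_a and term_b, or None if either is missing."""
    pos_a = text.find(term_a)
    pos_b = text.find(term_b)
    if pos_a == -1 or pos_b == -1:
        return None
    line_a = text.count("\n", 0, pos_a) + 1
    line_b = text.count("\n", 0, pos_b) + 1
    return abs(pos_b - pos_a), abs(line_b - line_a)

# Toy word-per-line file standing in for a dictionary dump:
doc = "\n".join(["aardvark", "arabesque", "basalt", "malachite", "zebra"])
print(term_distance(doc, "arabesque", "malachite"))
```

Run it against a real dictionary file and you get both the character and line gap in one shot, which is exactly the proximity measurement above.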
So what does all this mean? Well, Google’s algo is a bit funny: single factors really seem to mean very little anymore; things work in pairs, groups, and sets. It’s almost as if it’s trying to be intelligent (notice I said trying). Here’s my current working theory. Google likes authoritative websites, at least authoritative from a linkage standpoint. We also know from the sandbox that authoritativeness isn’t enough; you also need a certain amount of quality and/or trust. The question is: are these documents the only dictionary files with those words close enough together, or are they the only ones on domains with enough authority and trust? I can’t understand the non-English ones well enough to comment, but I would say it’s true for most of the domains listed there. So again, my current working theory is that authority and trust are primary ranking components right now (which would account for Amazon, eBay, and Craigslist being all over the SERPs I watch). Faking authority from a linkage viewpoint is possible. Faking trust is more difficult or expensive (you could buy/rent links from trusted websites). Add in a delta-t factor (difference over time) for those two components to catch the SEOs, and you’ve got a nice aggressive spam filter that matches many of the characteristics of the sandbox.
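To make the theory concrete, here’s a toy sketch of a ranking score that combines authority and trust and damps sites whose metrics grew suspiciously fast (the delta-t factor). Every name, weight, and threshold here is invented for illustration; nobody outside Google knows the real formula.

```python
def score(authority_now, authority_then, trust_now, trust_then,
          max_growth=2.0):
    """Combine authority and trust; penalize any component that grew
    more than max_growth-fold over the time window (delta-t check)."""
    base = 0.6 * authority_now + 0.4 * trust_now
    for now, then in ((authority_now, authority_then),
                      (trust_now, trust_then)):
        if then > 0 and now / then > max_growth:
            base *= 0.25  # aggressive, sandbox-style penalty
    return base

# An established site vs. one that inflated its links overnight:
print(score(8.0, 7.0, 6.0, 5.5))  # steady growth, no penalty
print(score(8.0, 1.0, 6.0, 5.5))  # authority grew 8x in the window
```

The point of the sketch: two sites with identical authority and trust today rank very differently once the rate of change is factored in, which matches the sandbox behavior described above.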
So what’s a workaround? Well, I have yet to find one that is repeatable with any measure of certainty; however, I will share what I’m thinking with you. Don’t plan on monetizing a website for at least 6 to 12 months. Build your website around a topic you can monetize, just don’t try to do it until it ranks (at least for secondary terms). That means no affiliate links, no AdSense, no Chitika, no YPN, no nuthin’, just content. Grow it slowly, very slowly. Once you’ve got your base content (12-20 articles), add a few new articles a month (a few: not 50, not 20, not even 10). Give it as much link bait as possible. Wait until you are getting regular traffic from Google before you monetize it, and when you do, do it slowly.