On my list of the top 5 problems I encounter when doing client work, number one is poor URL structure. In this post I’ll go over the basics, how to get it right, what the most common problems are, and how to fix them.
While canonicalization may sound like a big fancy term, what it basically means is that your content needs to have one URL and only one URL to be most effective. For example, a website could have identical content on each of these URLs, yet search engines would consider them as different pages:
Depending on how you link to pages internally or how others link to them externally, search engines will eventually discover these multiple URLs. If one “page” can have 6 or more “page” permutations, it’s easy to see how a website can get out of control quickly.
You can fix this issue with the rel=canonical link tag, but this is really more of a band-aid solution. If 99% of the people who link to you use a version of your URL that doesn’t match the rel=canonical tag, the search engines may ignore your tag in favor of the links.
The best way to handle this problem is on the server side. Use .htaccess (Apache) or ISAPI (IIS) to configure your web server to always serve pages with the “www.” or always without it. If you have been serving both, do a link audit to see which version has the most inbound links and choose that one. For new domains, I omit the “www.” Keeping it adds a few characters to every URL, and omitting it lets you use a bigger font size in printed material, but there is no SEO value one way or the other. You just want to be consistent.
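To make the "always with or always without" rule concrete, here is a minimal Python sketch of the decision the server has to make. It assumes the non-www host was chosen as the canonical one; `example.com` and the function name `canonical_redirect` are placeholders of mine, and a real setup would normally do this in the web server configuration instead.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the non-www host is the canonical one.
CANONICAL_HOST = "example.com"


def canonical_redirect(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "www." + CANONICAL_HOST:
        # Same page, non-canonical host: rebuild the URL on the canonical host.
        return urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
        )
    # Already canonical, or a host this sketch does not know about.
    return None


print(canonical_redirect("http://www.example.com/widgets"))
print(canonical_redirect("http://example.com/widgets"))
```

If you settled on the “www.” version instead, the same idea works with the comparison reversed.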
File extensions in URLs
There is no SEO value in having .html, .php, .asp or any other file extension, as long as you choose a common one and not something irregular like “.dx” that executes a programming language of your choice. That said, the best method is to not use any extension at all. If your website is around long enough, chances are good you will move from one language to another and change file extensions. When you do, you will need to set up redirects so you don’t lose your inbound link equity. Sure, you can set up your server to serve .asp pages using .php programming, but this makes your server work harder than it has to. That may not matter on some sites but it does on busy ones. If you are using extensions now, don’t change them just for the sake of changing them. Wait until you upgrade the code, then rip the band-aid off (so to speak) and remove all of the extensions.
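When the day comes to drop the extensions, every old URL needs a redirect target. A rough Python sketch of that mapping might look like the following; the extension list comes from the examples above, while the function name and sample paths are made up for illustration.

```python
import posixpath

# Extensions to retire, taken from the examples in the text.
LEGACY_EXTENSIONS = {".html", ".php", ".asp"}


def strip_extension(path):
    """Return the extension-free path that an old URL should redirect to."""
    root, ext = posixpath.splitext(path)
    if ext.lower() in LEGACY_EXTENSIONS:
        return root
    # Leave everything else (PDFs, images, directories) untouched.
    return path


for old_path in ["/about/team.php", "/downloads/report.pdf", "/about/"]:
    print(old_path, "->", strip_extension(old_path))
```

Each old path whose result differs from the input would then get a redirect to the new path.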
Word Delimiters in URLs
There was a time when Google really cared about what you used between words, but nowadays it’s less important. You can use hyphens or underscores; it doesn’t really matter. I like hyphens for readability and usability, but either is fine. If you choose another character you are “going off the reservation” and search engines may have trouble deciphering your words and giving them value, so choose this course with extreme caution. I also advise against “smushing” your words together. Search engines may be smart enough to figure out easy, common, high volume keywords, but other words become problematic.
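If you generate URLs from page titles, the hyphen convention is easy to automate. This is a minimal sketch, assuming ASCII-only titles; `slugify` is my own helper name here, not a standard library function.

```python
import re


def slugify(title):
    """Turn a page title into a lowercase, hyphen-delimited URL slug."""
    # Drop apostrophes first so "don't" becomes "dont", not "don-t".
    cleaned = re.sub(r"['’]", "", title.lower())
    # Keep runs of letters and digits; everything else acts as a separator.
    words = re.findall(r"[a-z0-9]+", cleaned)
    return "-".join(words)


print(slugify("Blue Widgets & Gadgets"))
```

Titles with accented or non-Latin characters would need extra handling that this sketch leaves out.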
Shorter is almost always better than longer. Ideally you want to have no more than five words. If you can get it closer to three, that’s even better. Don’t use abbreviations, stop words, or really short words. While search engines can handle longer URLs, in practice I have found it better to be under 60 characters in total length including the domain. You can certainly go longer without running into problems–in fact, you’ll need to on large sites–but when you do go over 60 characters, do it for a reason.
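These guidelines are simple enough to check in bulk. Below is a small Python sketch that reports on a single URL; it assumes the 60 characters cover the domain plus path but not the "https://" prefix, and that the five-word limit applies to the last path segment. Both are my reading of the rule of thumb, and the function name is hypothetical.

```python
import re
from urllib.parse import urlsplit

MAX_WORDS = 5   # "no more than five words"
MAX_CHARS = 60  # "under 60 characters in total length including the domain"


def url_length_report(url):
    """Measure a URL against the length guidelines described above."""
    parts = urlsplit(url)
    visible = parts.netloc + parts.path  # assumption: scheme is not counted
    last_segment = parts.path.rstrip("/").rsplit("/", 1)[-1]
    words = [w for w in re.split(r"[-_]", last_segment) if w]
    return {
        "chars": len(visible),
        "words": len(words),
        "over_char_limit": len(visible) > MAX_CHARS,
        "over_word_limit": len(words) > MAX_WORDS,
    }


print(url_length_report("https://example.com/blue-widget-reviews"))
```

A flagged URL is not automatically wrong; it is just one you should be able to justify.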
Keywords and Numbers in your URLs
Generally speaking, it’s better to have keywords in your URL than not to. However, having too many keywords in your URL can work against you from an algorithmic and perception point of view. For example, which URL looks more trustworthy to you?
The second doesn’t break any of the rules and it is short, but it has that untrustworthy, keyword stuffing quality to it that will work against you.
If you use numbers in your URLs, be careful. Make sure they aren’t in a format that Google might interpret as a date. If a number looks like a date, it may be interpreted as one, and your post could look older (or younger) than it really is. My advice? Don’t wander down that dark alley, keep off the moors, and avoid that issue entirely if you can.
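One way to catch risky numbers before they go live is a quick pattern check. The Python sketch below flags year-month and year-month-day shapes; the pattern is only my guess at what "looks like a date", since nobody outside Google knows exactly what it treats as one.

```python
import re

# Assumed date shapes: YYYY/MM, YYYY-MM-DD, YYYYMMDD and similar.
DATE_LIKE = re.compile(
    r"(?<!\d)(?:19|20)\d{2}"               # four-digit year, 1900-2099
    r"[-/_]?(?:0[1-9]|1[0-2])"             # month 01-12, optional separator
    r"(?:[-/_]?(?:0[1-9]|[12]\d|3[01]))?"  # optional day 01-31
    r"(?!\d)"
)


def has_date_like_number(path):
    """Return True if the URL path contains something shaped like a date."""
    return DATE_LIKE.search(path) is not None


for path in ["/2013/05/url-structure", "/top-5-problems", "/model-201205"]:
    print(path, has_date_like_number(path))
```

Treat a hit as a prompt to look at the URL, not as proof that a search engine will misread it.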
Parameters in URLs
One of the sticky issues with URLs is using parameters. In the early days of the internet, programmers and developers used parameters in URLs to keep track of things like products or shopping sessions. Today they are used by everyone from the marketing and advertising department to customer service. In the beginning, search engines had a lot of problems with parameters, so they ignored any URLs with parameters in them. Nowadays they are much more sophisticated and, in most cases, can handle them just fine. They even have specialized tools that allow you to tell them which parameters to ignore, but don’t fall into this trap. Using parameters is like sword fighting: unless you are very, very good, eventually someone will get hurt. You can avoid that problem if you avoid parameters entirely. If it seems like I’m being overly cautious here, you are correct; I’ve seen parameters derail too many websites. If you need to use URL tracking, use hash tag tracking, where the tracking data goes after a # at the end of the URL.
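As a rough illustration of hash-based tracking, this Python sketch rewrites a tracked link so the tracking values sit after the # instead of in the query string. The set of parameter names is an assumption for the example, so swap in whatever your campaigns actually use; the function name is hypothetical too.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameter names; adjust to match your own campaigns.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def move_tracking_to_fragment(url):
    """Move tracking parameters from the query string into the fragment."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    tracking = [(k, v) for k, v in pairs if k in TRACKING_PARAMS]
    kept = [(k, v) for k, v in pairs if k not in TRACKING_PARAMS]
    # Note: an existing fragment is replaced when tracking values are present.
    fragment = urlencode(tracking) if tracking else parts.fragment
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), fragment)
    )


print(move_tracking_to_fragment(
    "https://example.com/widgets?utm_source=news&utm_medium=email"
))
```

Browsers do not send the part after the # to the server, so your analytics script has to read it on the page.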
Directories in URLs
The question of whether or not to use directories in your URLs is slightly nuanced. If you are going to have a large site in excess of 500 or 1000 pages, then yes, you will absolutely need directories in your URLs. If you are going to have a smaller site, it’s up to you. Using directories gives you additional control, but it costs you in complexity and overall URL length, so consider those facts before you make your choice.
Moving, Changing and Redirecting URLs
If your website is around long enough, for one reason or another you will have to change URLs. When you do, the best way to make sure things don’t go off the rails is to use a 301 redirect from the old URL to the new URL. Search engines have gotten smarter about other types but using them is still a bit of a dice roll, so don’t take that chance if you don’t need to. See my article on how to create a smart custom 404 page for an efficient way to put that logic into action.
Unless you are working with a custom built CMS, it’s unlikely you will have the flexibility and control to get each of the aspects I discussed above. Just think of each one as a trade-off. The more of them you give up, the more obstacles you put in the way of your overall SEO effort and the harder you make things for yourself. So let’s recap the important aspects of this post:
- Don’t allow multiple versions of your pages to exist under more than one URL, including “www”, pages, trailing slashes and other canonical issues.
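For a sense of what the redirect logic boils down to, here is a minimal sketch written as a bare Python WSGI app. The redirect table and its paths are invented for the example, and on most sites you would express the same mapping in your web server or CMS configuration.

```python
# Hypothetical old-to-new URL table; a real one would come from your site audit.
REDIRECTS = {
    "/old-widgets.php": "/widgets",
    "/products/blue_widget.asp": "/blue-widget",
}


def redirect_app(environ, start_response):
    """Answer moved URLs with a 301 and everything else with a 404."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and search engines the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

To try it locally, you could serve it with `wsgiref.simple_server.make_server("", 8000, redirect_app)` from the standard library and request one of the old paths.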
- File type extensions don’t matter, but ideally you should avoid extensions so you don’t have performance issues or redirect issues in the future.
- Keep your URLs as short as possible.
- Put keywords in your URLs without stuffing.
- Avoid numbers if possible, especially numbers that could be interpreted as dates.
- Avoid parameters if possible; if you absolutely need them for tracking, use hash tags.
- Directories are only necessary for large sites or for sites where you need to segment the content; otherwise, they are optional.
- Use a 301 for redirects: it’s the least problematic solution.