This post is part of the Friday Q&A section. If you want to ask a question, just write a comment below.
Should I use an HTML sitemap on my blog, or an XML one? Or both? Chris Pearson says that HTML sitemaps aren’t recommended anymore because these pages end up having 100 links or far more, and Google has said that this is a poor practice to be avoided if at all possible.
He himself used to have both an HTML sitemap and an XML sitemap. Now he has removed the former from his site. Pro Blog Design uses an HTML sitemap but has it split into pages using the Dagon Design Sitemap Generator.
I notice you use only the HTML sitemap and have more than 1000 links in it. Has this impacted your rankings? Which sitemap would you recommend? And if it is one or the other, which plugin would you recommend to generate the sitemap?
First of all, let’s clarify the difference between the two types of sitemaps for those who don’t know it. An HTML sitemap is a page within your site that links to all the other important pages on your site. An “Archives” page is nothing more than an HTML sitemap.
An XML sitemap does pretty much the same thing, but instead of using HTML (which a browser renders like any other page) it uses XML, a markup language for encoding structured data. This means that the XML sitemap will only be useful to search bots, while the HTML sitemap will also be visible and useful to human visitors.
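To make the difference concrete, here is a minimal sketch of what generating an XML sitemap could look like. It assumes the common sitemaps.org format (a `urlset` element containing one `url`/`loc` pair per page); the function name and the example.com URLs are hypothetical and only there for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol (assumed format for this sketch).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls):
    """Return an XML sitemap string listing each URL in a <url><loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
example_urls = [
    "https://example.com/",
    "https://example.com/archives/",
]
sitemap_xml = build_xml_sitemap(example_urls)
print(sitemap_xml)
```

The output is a list of addresses wrapped in tags, which is why it makes sense to a search bot but offers little to a human visitor. In practice a plugin would normally generate this file for you.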
In my opinion most websites should have an HTML sitemap. Why? For two main reasons: it helps search engines crawl and understand your site, and it helps human visitors browse your site more efficiently.
Why do I say “most” and not “all” websites? Because some specific types might not need it. Very small sites or online stores, for example, might not need an HTML sitemap because finding the individual pages there is straightforward. For content-based websites like blogs, however, the HTML sitemap is really helpful.
Now if you are worried about having too many links on a single page, you could use a sectioned HTML sitemap, where the initial page links to all the months, and then inside each month’s page you’ll have links to the single posts.
However, I believe this structure is less functional than having all the posts on a single page, like I do. Try to find a specific post on one of those sectioned sitemaps and you’ll see how much time you waste going back and forth between the pages until you find what you are looking for. On my archives, on the other hand, you just need to scroll through a single page until you find the right post.
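For those who still prefer the sectioned layout, the grouping by month could be sketched roughly like this. The post titles and dates are made up, and the function name is hypothetical; a real sitemap plugin would pull this data from the blog’s database.

```python
from collections import defaultdict
from datetime import date


def group_posts_by_month(posts):
    """Map (year, month) to the list of post titles published in that month."""
    sections = defaultdict(list)
    for title, published in posts:
        sections[(published.year, published.month)].append(title)
    # Sort by (year, month) so the index page lists the months in order.
    return dict(sorted(sections.items()))


# Hypothetical posts, for illustration only.
posts = [
    ("Post A", date(2009, 1, 5)),
    ("Post B", date(2009, 1, 20)),
    ("Post C", date(2009, 2, 3)),
]
sections = group_posts_by_month(posts)
for (year, month), titles in sections.items():
    print(f"{year}-{month:02d}: {len(titles)} posts")
```

Each key would become one sub-page of the sitemap, which keeps every page short at the cost of the extra clicks described above.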
Google does recommend that webmasters keep fewer than 100 links per page (including sitemap pages), but I believe this is not a strict policy. In other words, as long as your page with over 100 links exists to help your visitors, Google should be fine with it. This is the case with my archives. In fact you can see that it has a PR 5, and if you search on Google for “daily blog tips archives” it will appear as the first result, which suggests that Google is fine with it.
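If you are curious how many links one of your own pages carries, a rough count can be done with Python’s standard HTML parser. This is only a sketch: the sample markup is invented, and it counts every `a` tag that has an `href` attribute, which may not match exactly how Google counts links.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Count <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1


def count_links(html):
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Invented markup, for illustration only; the last anchor has no href.
sample_html = (
    '<ul><li><a href="/a">A</a></li>'
    '<li><a href="/b">B</a></li>'
    '<li><a name="x">no href</a></li></ul>'
)
link_count = count_links(sample_html)
print(link_count)
```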
I use the Clean Archives WordPress plugin to create that page automatically.
What about XML sitemaps? In my opinion using such a sitemap is not necessary, but it might be useful in some situations. Why don’t I think it is necessary? Because if you craft your website with an efficient structure (i.e., with a flat hierarchy of pages, clear navigation and sound permalinks) Google should have no problems crawling and indexing all your pages correctly.
You can test this by using the “site:domain.com” operator on Google. It will reveal roughly how many pages Google currently indexes from your site. Daily Blog Tips, for example, has around 1,700 pages indexed by Google. I have 1,425 posts published. If you then add the secondary pages (e.g., the paginated pages that follow the homepage, plus category and archive pages) this number should get close to 1,700, so yeah, Google is indexing all my pages right now.
Now, in what situations do I think one should use an XML sitemap? Exactly when one is having crawling or indexation problems. If Google were only indexing part of my pages, say 500 out of 1,700, then I would consider using an XML sitemap. Similarly, if my newest posts were taking too long to get indexed by Google (if at all) I would also add an XML sitemap to try to solve the problem.
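The rule of thumb above can be written down as a small check: compare the number of pages Google reports as indexed with the number of pages you expect. The 90% threshold below is an arbitrary assumption chosen for illustration, not a figure from Google, and the function names are hypothetical.

```python
def index_coverage(indexed_pages, total_pages):
    """Return the fraction of expected pages that are currently indexed."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return indexed_pages / total_pages


def should_consider_xml_sitemap(indexed_pages, total_pages, threshold=0.9):
    """True when coverage falls below the (assumed) threshold."""
    return index_coverage(indexed_pages, total_pages) < threshold


# The two scenarios from the text: everything indexed vs. 500 out of 1,700.
healthy = should_consider_xml_sitemap(1700, 1700)
partial = should_consider_xml_sitemap(500, 1700)
print(healthy, partial)
```

With all 1,700 pages indexed the check says no sitemap is needed, while 500 out of 1,700 (about 29% coverage) falls well below the threshold.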
Summing up: I believe most sites should use an HTML sitemap, because it is useful both for search engines and for human visitors (and if you don’t want to have a page with more than 100 links, just break your sitemap into sub-sections). As for XML sitemaps, I believe they are useful if you are having crawling or indexation problems. Obviously adding an XML sitemap to a healthy site won’t do any harm, but I am not sure how much good it will do either.