The wasted web

Over the last few years, search engines have become the most important source of traffic to websites. The web has also become one of the most popular sources for information, especially among regular web users. Because of this, exposing your content to the web, and making it easy to find in search engines, is one of the best ways to maximize the return you get from creating content.

Unfortunately, at most sites, much of the valuable content is "hidden": it can't be reached directly by clicking on a text link. The amount of content that is hidden on the web may be hundreds of times larger than the amount that's easily available. According to search company BrightPlanet, there are hundreds of billions of pages that traditional search engines can't reach.

This content is sometimes called the "invisible web", because it's invisible to typical web users. That title, though, is weak when you consider the tremendous lost value of this content. To businesses, this content is the "wasted web" - content that companies have invested substantial resources in, yet that is inaccessible to the majority of web users.

This problem is huge for ebusinesses, and can have a big effect on your bottom line. One example is found in many companies' self-help systems. The only way to access this content is through the web interface within the site's self-help area. Often, this content is not indexed by the site itself, let alone by the major search engines. This means that customers coming to the site can't search for the data unless they are within the "self-help" area of the site. To customers using other search engines, the content doesn't even exist.

The potential value of subscription content is similarly limited. If the subscription content is hidden away, no one will link to it, the content won't be in search engines, and it will have only a fraction of its potential audience. In a recent article in Editor & Publisher in the US, Steve Outing argues that it may be counterproductive for news sites to charge for access to content. "All that 'good stuff' that publishers think is worth charging for will, for the most part, be visible only to users of those individual publishers' Web sites. News search engines, which send plenty of user traffic to news sites, won't see or refer their users to such material."

There are many ways that content ends up being hidden from the web:

* Information stored in online databases
* Self-help systems that require user choice
* Secured content
* Pages that are filtered for business or political reasons
* Subscription or paid content

If you've got content that falls into these categories, or that is not directly available via the web for other reasons, there's probably a big gap between the potential value of this content and the value that you're currently getting from it. Identifying this content that's being "wasted" and exposing it to the web can dramatically improve the return on investment from this content.

Making wasted content available

There are often technical hurdles to making the content indexable, and there may be business reasons for keeping it secured.

Most hidden content is either in a database system or a content management system. Fortunately, the same tools that keep content hidden can be used to make it easy to search. For example, content in web-enabled databases is hidden because it can only be reached by running a query. One way to make it visible to the web is to create a summary report for the most common queries. A summary report could contain a heading and a brief entry for each database record, with a link to the detailed content. Because a summary report is plain text with links, it can be easily indexed.
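
As a rough illustration of the summary report idea, the sketch below reads records from a database and writes them out as a static page of headings, short summaries, and links. Everything specific in it is an assumption made for the example: the `articles` table, its `id`, `heading`, and `summary` columns, and the `/articles/<id>.html` URL pattern are hypothetical, and an in-memory SQLite database stands in for whatever system actually holds the content.

```python
import sqlite3
from html import escape


def build_summary_page(conn, title):
    """Render each row of the (hypothetical) articles table as a linked list item."""
    rows = conn.execute(
        "SELECT id, heading, summary FROM articles ORDER BY heading"
    ).fetchall()

    items = []
    for article_id, heading, summary in rows:
        # Plain text plus an ordinary link: something a crawler can follow.
        items.append(
            f'<li><a href="/articles/{article_id}.html">{escape(heading)}</a>: '
            f"{escape(summary)}</li>"
        )

    return (
        f"<html><head><title>{escape(title)}</title></head><body>\n"
        f"<h1>{escape(title)}</h1>\n"
        "<ul>\n" + "\n".join(items) + "\n</ul>\n"
        "</body></html>"
    )


# Demo data in an in-memory database (stand-in for the real content store).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, heading TEXT, summary TEXT)"
)
conn.executemany(
    "INSERT INTO articles (heading, summary) VALUES (?, ?)",
    [
        ("Resetting a password", "Steps to reset a forgotten password."),
        ("Billing & invoices", "How to read and download invoices."),
    ],
)

page = build_summary_page(conn, "Support articles")
print(page)
```

A page like this could be regenerated on a schedule and linked from the main navigation, so that both the site's own search and outside search engines have a static path to each record.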

Secured content is hidden from the web intentionally, but that also limits its value. One way around this is to create two versions of the content, a secure and a non-secure version. This can easily be done with content management systems:

* The content management entry needs to contain a summary entry for each page.
* The non-secured version could contain a heading, summary information, and a link to the secured version.
* The secured version would have the complete page.
* When published, the two versions are generated and moved to secured and non-secured directories.

By doing this, you expose a limited version of the content to the web to be indexed by search engines. Salon magazine's site provides a good example of this. They publish an unsecured page that works as a teaser, to get you interested. If you're a subscriber, you can just click to the full version. If you're not a subscriber, you can choose from several options in order to view the article. The result of this strategy is that their articles show up in Google and other search engines, driving readers and potential customers to their site.
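
The two-version publishing step could look something like the following sketch. It is not any particular content management system's API: the `Entry` class, the `publish` function, the directory names, and the `/members` URL prefix are all hypothetical. It also assumes that access control on the secured directory is handled elsewhere (for example, by the web server) and that the entry body is already trusted HTML.

```python
import tempfile
from dataclasses import dataclass
from html import escape
from pathlib import Path


@dataclass
class Entry:
    """A hypothetical CMS entry: a full body plus the summary used for the teaser."""
    slug: str
    heading: str
    summary: str
    body: str  # assumed to be trusted HTML


def publish(entry, public_dir, secure_dir, secure_url_prefix="/members"):
    """Write a public teaser page and the full secured page for one entry."""
    public_dir.mkdir(parents=True, exist_ok=True)
    secure_dir.mkdir(parents=True, exist_ok=True)
    name = f"{entry.slug}.html"

    # Non-secured version: heading, summary, and a link to the secured version.
    teaser = (
        f"<h1>{escape(entry.heading)}</h1>\n"
        f"<p>{escape(entry.summary)}</p>\n"
        f'<p><a href="{secure_url_prefix}/{name}">Read the full article</a></p>\n'
    )
    # Secured version: the complete page.
    full = f"<h1>{escape(entry.heading)}</h1>\n{entry.body}\n"

    public_path = public_dir / name
    secure_path = secure_dir / name
    public_path.write_text(teaser, encoding="utf-8")
    secure_path.write_text(full, encoding="utf-8")
    return public_path, secure_path


# Demo: publish one entry into a temporary directory tree.
root = Path(tempfile.mkdtemp())
entry = Entry(
    slug="quarterly-outlook",
    heading="Quarterly outlook",
    summary="A short look at trends for the next quarter.",
    body="<p>The full analysis, available to subscribers only.</p>",
)
public_path, secure_path = publish(entry, root / "public", root / "members")
print(public_path.read_text(encoding="utf-8"))
```

Only the teaser page lands in the directory that crawlers can reach, so search engines index the heading and summary while the complete text stays behind the login.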

Using this approach, information in secured areas, siloed areas of a website, and self-help systems can be published in a version that will be visible to the web. This will make it easier for people to find your content, increase traffic, and maximize the value of this content.
