Keeping tabs on web activity

As e-commerce grows in importance, log reporting has become an essential tool for understanding web behavior. There are three main ways to log the activity of web visitors: application logging, web server logging, and client-side data collection. Each has benefits for understanding customer behavior, and each has limitations. Knowing the pros and cons of each is important when deciding which approach to use.

Application logging

Application logging can often give the most detailed information about customer activity:

* It supports logging the details of secured transactions;

* It can log any request or form entry that an application receives; and

* It can be used to capture a complete record of client activity within an application.

This comes at a heavy price, though:

* Application logging must be implemented for each application;

* It requires custom development; and

* It's not useful for understanding the "big picture" of clients' behavior throughout a site.

Application logging is often implemented on a per-application basis by individual business areas. This allows each area to get the information it needs that is unique to that application. This is also a limitation, though, because the information is often not in a form that is useful for sharing across business areas, or for developing a corporate view of customer behavior. As a result, application logging is good for identifying what a customer does within an application, but less useful for understanding how the customer got to the application, or what motivated them.
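As an illustration, here is a minimal sketch of application logging, assuming a Python application. The logger name, the `log_form_entry` function, and the field names are hypothetical and chosen only for this example. Each form submission is written as one JSON line, a format that other business areas could read without knowing the application's internals.

```python
import io
import json
import logging
from datetime import datetime, timezone


def make_event_logger(stream):
    """Return a logger that writes one JSON event per line to `stream`."""
    logger = logging.getLogger("checkout_app.events")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_form_entry(logger, session_id, form_name, fields):
    """Record a single form submission received by the application."""
    event = {
        "time": datetime.now(timezone.utc).isoformat(),
        "session": session_id,
        "form": form_name,
        "fields": fields,
    }
    logger.info(json.dumps(event, sort_keys=True))
    return event


# Usage: capture the log in memory; a real application would write to a file.
buffer = io.StringIO()
events = make_event_logger(buffer)
log_form_entry(events, "s-123", "shipping_address", {"country": "AU"})
logged = json.loads(buffer.getvalue())
```

Because this code lives inside the application, it has to be written and maintained for every application that needs it, which is the cost described above.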

Web server logging

Web server logging is the most commonly used tool for understanding customer web activity. There are many reasons that web server logs are popular:

* They allow logging to be done per-domain, rather than per application;

* They can be used with standard tools, and don't require custom development;

* They provide technical information, such as load balancing and broken links;

* They log requests to graphics and other downloadable files;

* They provide a detailed account of web server activity; and

* They collect many details that other methods miss.

There are some problems with web server logs, though:

* Web log management can be challenging for busy sites served by many servers;

* Most information in web log files is irrelevant for understanding browsing behavior; and

* Log files do not capture activity handled by browser or network caches.

Log file reporting is the easiest method to implement, because there are many commercial and free log analysis tools. Log reporting will show all the requests handled by the server, including requests for graphics, PDF files and other downloadables, and even requests for missing or moved files.
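To show the kind of report such tools produce, here is a small sketch that assumes the server writes the widely used Common Log Format; the article does not say which format is in use, and the sample lines below are invented for illustration. It counts requests per path and picks out requests for missing files.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "method path protocol" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count requests per path, and separately count requests that returned 404."""
    by_path = Counter()
    missing = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that are not in the assumed format
        by_path[record["path"]] += 1
        if record["status"] == "404":
            missing[record["path"]] += 1
    return by_path, missing


# Invented sample data, using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.10 - - [10/Oct/2000:13:55:37 -0700] "GET /images/logo.gif HTTP/1.0" 200 4120',
    '192.0.2.11 - - [10/Oct/2000:13:56:01 -0700] "GET /old-page.html HTTP/1.0" 404 209',
    '192.0.2.11 - - [10/Oct/2000:13:56:05 -0700] "GET /brochure.pdf HTTP/1.0" 200 51200',
]
by_path, missing = summarize(SAMPLE_LOG)
```

Note that the graphic and the PDF each appear as their own request, alongside the page itself and the missing file.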

Unfortunately, log reports show what servers have done, rather than what customers have done. In some cases, this can be very different. The discrepancy between what customers do and what servers log is most noticeable with public, non-secured content.

Pages that are not encrypted are often cached within users' browsers, within proxies, or within other network caches. When someone visits a site, popular pages may be served by these caches rather than by the web server. In that case, only the page requests unique to each visitor are recorded by the web server. This results in a view of customer navigation that can look random or arbitrary, because the common links between pages are missing from the log files.
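A toy model makes the gap concrete. The sketch below deliberately simplifies real caching: it assumes that any page already held in a cache never reaches the server at all. The page names and the function name are invented for illustration.

```python
def server_visible_requests(pages_viewed, cached_pages):
    """Return only the page views that would reach the web server's log."""
    return [page for page in pages_viewed if page not in cached_pages]


# What the visitor actually did, in order.
visit = [
    "/index.html",
    "/products.html",
    "/products/widget-42.html",
    "/products.html",
    "/checkout",
]
# Popular pages assumed to be served from a browser or proxy cache.
cached = {"/index.html", "/products.html"}

logged = server_visible_requests(visit, cached)
```

In this model the server log holds only the widget page and the checkout page, with no record of the home page or the product listing that led the visitor there.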

Client-side logging

Because of the limitations of server logging, many companies are using client-side logging. This is usually implemented by embedding JavaScript in each web page; the script requests an invisible 1 x 1 pixel GIF image.

The JavaScript executes every time the page is loaded, so even cached pages generate new requests for the GIF image. The request transmits page-specific information and information about the customer's system to the image server, where the information is logged. This provides a clear log of every page a customer views, including pages held in the browser's cache or other network caches.
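The exchange can be sketched from both ends. The page-side script would run in the browser as JavaScript; to keep the example in one language, the sketch below models in Python the URL that script would request and how the image server might turn that request into a page-view record. The endpoint, the parameter names, and the unique `nonce` value (a common way to keep caches from answering the image request) are assumptions for illustration, not details from any specific product.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BEACON_ENDPOINT = "https://stats.example.com/pixel.gif"  # hypothetical image server


def build_beacon_url(page, referrer, screen, nonce):
    """Build the image URL that the page's script would request.

    The `nonce` makes every URL different, so a cache that has seen an
    earlier beacon request does not already hold a response for this one.
    """
    query = urlencode({"page": page, "ref": referrer, "screen": screen, "n": nonce})
    return f"{BEACON_ENDPOINT}?{query}"


def record_page_view(request_url):
    """Image-server side: turn a beacon request into a page-view record."""
    params = parse_qs(urlsplit(request_url).query)
    return {
        "page": params.get("page", [""])[0],
        "referrer": params.get("ref", [""])[0],
        "screen": params.get("screen", [""])[0],
    }


# Usage: one page view, reported with a little client configuration.
url = build_beacon_url("/products/widgets", "/index.html", "1024x768", "8f3a")
view = record_page_view(url)
```

Each record describes one page view and nothing else, which is why logs collected this way tend to be much smaller than web server logs.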

There are many benefits to this approach:

* Client-side logs are a fraction of the size of web server logs;

* They track user page views rather than web server requests;

* They can be used to track users across many servers and even multiple domains; and

* They can collect detailed and customized information on client configuration.

The client-side approach comes with several limitations, though:

* It requires JavaScript, so users with JavaScript disabled will not be logged;

* It records only hits to pages, not graphics or other downloadable files;

* It doesn't provide a record of unfulfilled page requests; and

* It doesn't provide any information on server load.

Which is the best tool?

There are many approaches to logging web activity because no single approach is a complete solution. Each tool may be the best in different situations. Application logging is usually the best approach to understanding the details of web applications and transactions. Web server logging is the best approach for understanding what's happening on web servers, and can be very accurate for tracking activity in secured areas. Server logging also provides the broadest picture of customer behavior, including requests for PDF files, multimedia content and other downloadables. Finally, client-side logging is often the best tool for getting a business view of browsing activity. It accurately tracks the "big picture" of how users browse through a site.
