Trial by fire

Realistic stress testing is essential when launching online services, experts say

Organizations too often are not correctly testing and tuning their Web sites to handle unexpected user volume, according to industry experts.

Proper load testing and tuning of a site during development can help ensure a successful launch. During the Thrift Savings Plan (TSP) online system's rocky start last month, site activity slowed to a crawl, locking scores of users out of their accounts. Now, experts advise agencies to use load-testing software or services that realistically emulate Web traffic.

"Most [organizations] don't load test correctly," said Eric Siegel, principal Internet consultant for Keynote Systems Inc., which provides Internet performance management and testing products and services.

Siegel said organizations run load tests, in which they emulate a large number of concurrent users, but they don't test for flash loads, which occur when heavy traffic appears quickly.
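A minimal Python sketch can make the difference concrete. It is not taken from Keynote or any load-testing product. It assumes a server that completes a fixed number of requests per second and compares a gradual ramp with a flash load. The function names, the 800-requests-per-second capacity and the timings are all illustrative assumptions; only the 1,000-per-second peak echoes a figure reported in this story.

```python
def ramp_profile(peak, ramp_secs, total_secs):
    """Requests per second rising linearly to `peak` over `ramp_secs`, then holding."""
    return [min(peak, peak * (t + 1) / ramp_secs) for t in range(total_secs)]


def flash_profile(baseline, peak, start, total_secs):
    """Requests per second jumping from `baseline` to `peak` at second `start`."""
    return [peak if t >= start else baseline for t in range(total_secs)]


def backlog(profile, capacity):
    """Queued requests after each second, for a server (hypothetical) that
    completes at most `capacity` requests per second."""
    queued, history = 0.0, []
    for arrivals in profile:
        queued = max(0.0, queued + arrivals - capacity)
        history.append(queued)
    return history


# First minute of a five-minute ramp versus a flash load that starts at second 10.
ramp = backlog(ramp_profile(peak=1000, ramp_secs=300, total_secs=60), capacity=800)
flash = backlog(flash_profile(baseline=50, peak=1000, start=10, total_secs=60), capacity=800)
```

Under these assumptions the ramped test is still below capacity after a minute, while the flash load starts queueing requests in the first second of the surge. A test that only ramps up slowly would not exercise that condition.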

Another issue organizations often overlook is abandonment, when users click away from a site because of poor response time or other reasons. Although legacy systems usually indicate that a session has been terminated, Web protocols don't. As a result, abandoned sessions use resources until the server forces them to close after a timeout, Siegel said.
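The cost of abandonment can be estimated with simple arithmetic. The sketch below is a rough illustration using Little's law (sessions held = arrival rate x time held), not a method attributed to Siegel. It assumes that users who finish their task end the session explicitly, while abandoned sessions linger for the full server timeout. All names and numbers are hypothetical.

```python
def open_sessions(arrivals_per_sec, abandon_fraction, active_secs, timeout_secs):
    """Steady-state session counts by Little's law.

    Returns (sessions held by users still working,
             sessions held by users who already left).
    """
    # Assumption: users who finish release their session when they are done.
    active = arrivals_per_sec * (1 - abandon_fraction) * active_secs
    # Abandoned sessions stay open until the server times them out,
    # because the server is never told that the user left.
    stale = arrivals_per_sec * abandon_fraction * timeout_secs
    return active, stale


# Made-up inputs: 10 new sessions per second, one in four abandoned,
# two-minute visits, 30-minute session timeout.
active, stale = open_sessions(
    arrivals_per_sec=10, abandon_fraction=0.25, active_secs=120, timeout_secs=1800
)
```

With these made-up inputs, sessions belonging to users who have already left outnumber the sessions still in use by five to one, which is why a load test that ignores abandonment can understate resource use.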

The Federal Trade Commission experienced a flash load with the launch of its "Do Not Call" national registry June 27. The FTC created the registry to make it easier for consumers to block telemarketing calls. Users can register their phone numbers online at www.donotcall.gov if they have an active e-mail address or by calling a toll-free number.

On opening day, consumers overwhelmed the site with attempts to register their numbers. The Web site received about 1,000 hits per second, according to the FTC, significantly slowing the site.

AT&T, which is hosting the site, stress tested it before deployment using the services of Mercury Interactive Corp., said Richard Callahan, a client business manager with AT&T. In fact, AT&T tested "far in excess" of the anticipated Web load.

Glitches can always surface, however, in a site as complex as the FTC's, he added. "We reacted and brought in additional resources and services," he said. "Other Web-hosting firms might have had to pull the site off-line, but AT&T, which has a hosting facility with unlimited bandwidth, never thought about shutting down."

Keynote Systems, which has been monitoring activity at the agency's site, noted that Web servers and network performance have been excellent, and availability has been 100 percent since June 27. As of July 7, 19.6 million telephone numbers were logged into the registry. Users favored online registrations 89 percent to 11 percent, according to the FTC.

The TSP site has not been as fortunate. Federal employees and retirees still could not access parts of it more than two weeks after the TSP launched the new Web interface. Officials said the site was stress tested, but a former government official said it was "never adequately tested." He indicated that the TSP problems stem from officials being pressured to get something up and running quickly.

The former official said that many organizations are probably not doing adequate testing.

To perform realistic load tests, agencies should measure performance from end-user locations before production, Siegel said. Information technology personnel at agencies can measure performance from their users' Internet service provider by installing agent software on the ISP links. They can get the information from Web logs.

Once in production, the IT workers need to continue to measure user transaction arrival rates to detect and fix problems before the users notice, Siegel said. To prepare for flash loads, agencies should build thin sites, as news sites do, to handle more users when an important event creates a surge in traffic. Arrangements with content distributors will help agencies quickly activate additional bandwidth, servers and backup equipment.

Tuning the Web site before deployment is also critical. Agencies should decrease the size and number of page elements and speed up the performance of secure pages by using a hardware accelerator that offloads encrypted traffic, such as Secure Sockets Layer traffic, from Web servers, Siegel said.

Additionally, IT personnel should monitor image and content servers because slow servers may hamper downloads and make users think the site is dead, he said.

The emergence of dynamic interface-based portals during the past three years also demands a different design mindset, said Craig Roth, a vice president with META Group Inc.

"The biggest step missed before agencies design a site is understanding different customer patterns," he said.

If a site is designed properly, users, based on their requirements, will be able to navigate more easily through the site, Roth said.

***

Potential hot spots

Organizations often overlook critical details before launching a Web site. What follows is a short list of important areas they should consider before taking a site online.

Problem ... Remedy

Heavy traffic ... Conduct realistic load testing across the Internet; measure transaction arrival rates as opposed to concurrent users; measure across the Net from user locations.

Sudden traffic loads ... Build a thin Web page that can handle an increase in volume, similar to what news Web sites do during spikes in traffic, or use content distributors to redirect traffic to a mirror site. Measure performance from users' ISPs. Stress tests may not reflect real-world performance.
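One way to see why the table above favors transaction arrival rates over concurrent users is the following Python sketch. It models a test driven by a fixed pool of virtual users, each of whom waits for a response, pauses, and then sends the next request. The function name, user count and timings are illustrative assumptions, not figures from the experts quoted here.

```python
def closed_loop_rate(users, response_secs, think_secs):
    """Requests per second generated by a fixed pool of virtual users.

    Each user sends one request, waits `response_secs` for the reply,
    pauses `think_secs`, then repeats.
    """
    return users / (response_secs + think_secs)


# The same 500 virtual users against a healthy and a struggling site.
healthy = closed_loop_rate(users=500, response_secs=1.0, think_secs=9.0)
degraded = closed_loop_rate(users=500, response_secs=16.0, think_secs=9.0)
```

In this model the load drops from 50 to 20 requests per second as the site slows, so the test eases off exactly when the site is in trouble. Real visitors arriving from the Internet do not coordinate that way, which is the case for driving tests by arrival rate instead.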

Tuning before deployment:

Problem ... Remedy

Large pages with too many elements ... Use style sheets and decrease total page sizes.

Slow load times ... Monitor image and content servers for problems that might slow performance.

Slow security pages ... Offload encrypted traffic to hardware accelerator.
