There are many ways to measure Web response times. While almost all provide some
useful data, few actually measure the customer's experience. Below, you'll find
each of the principal methods and their strengths and weaknesses briefly described
in relatively simple terms.
No one technique can be used universally to measure page load times or study
timing issues. We use several of these techniques as part of our consulting
activities and rely on two as part of our day-to-day timing processes. By
choosing among several well-understood tools we are able to make highly accurate
measurements.
Description
By using one of several HTML and script techniques it is possible to cause a
page to load a specific URL after the user's browser has completely loaded the
page as seen by the user. Most Web servers keep a log file that records, amongst
other things, the times at which specific page objects are loaded. The log file
can thus contain both the start and completion time for the page. Various post
processing tools can analyze the file and compute the page load time.
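The post-processing step can be sketched as follows. This is a minimal illustration, not the tooling described above: it assumes the server writes Common Log Format lines and that each instrumented page requests a hypothetical beacon URL, `/timing-beacon?page=<path>`, from its "onload" handler. Note that Common Log Format timestamps have only one-second resolution, so a real deployment would want a finer-grained log format.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit, parse_qs

# Assumed Common Log Format: host ident user [time] "METHOD path PROTO" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+) [^"]*"')
BEACON_PATH = "/timing-beacon"  # hypothetical URL loaded by the page's onload handler


def page_load_times(lines):
    """Return (client, page, seconds) for each page request matched by a later beacon."""
    started = {}  # (client, path) -> time the request was logged
    results = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        client, stamp, target = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        url = urlsplit(target)
        if url.path == BEACON_PATH:
            # The beacon names the page that has just finished loading.
            page = parse_qs(url.query).get("page", [None])[0]
            begun = started.pop((client, page), None)
            if begun is not None:
                results.append((client, page, (when - begun).total_seconds()))
        else:
            # Every other request is remembered as a possible page start.
            started[(client, url.path)] = when
    return results
```

Pairing requests by client address is a simplification; users behind a shared proxy would need a session identifier in the beacon URL to be told apart.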
Positives
- This technique does a relatively good job of measuring the URL's load
time as experienced by the user.
- It measures every URL load as opposed to sampling.
Negatives
- The underlying pages must be modified to include the timing content. For
a large site this is a significant undertaking.
- The measuring technique typically uses the browser's "onload" event. Many
dynamic pages use this event for their own purposes requiring some page
"recoding."
- Measurements are generally not real time. A post processing analysis is
required to extract the measurements.
- The technique actually slows the site down just a bit.
- The technique will not work on all pages, particularly those that
continue to push content to the user.
- There is some time distortion due to issues of browser rendering
behaviors.
Bottom Line
- A reasonably good technique for dealing with a few pages or
investigating specific problems.
- Requirement to modify the page limits its applicability.
- Non-real time post processing makes it unsuitable for alerting services.
There are a number of workstation side techniques that can be broken into the
following categories:
These techniques use a software tool called a protocol analyzer that watches
the low level communications protocols used to retrieve the URL's content. By
noting when the connections start and terminate it is possible to estimate the
URL's download time, although only approximately.
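As a rough sketch of the arithmetic involved, the download time can be taken as the span from the first connection opening to the last connection closing. The `Connection` record and its field names are assumptions made for this illustration, not the output format of any particular analyzer.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    opened: float  # seconds, when the connection was first seen opening
    closed: float  # seconds, when it was seen closed or reused


def download_time(connections):
    """Span from the first connection opening to the last one closing."""
    if not connections:
        raise ValueError("no connections were captured for this URL")
    first_open = min(c.opened for c in connections)
    last_close = max(c.closed for c in connections)
    return last_close - first_open
```

A single connection that is held open long after the content has arrived inflates the result, which is one reason the measurement is only approximate.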
Positives
- Provides a great deal of detailed data about the low level protocol
behaviors that is very valuable in dealing with protocol related issues.
Negatives
- Protocol analyzer must use a detection algorithm to determine when the
URL load is complete. Typically this involves waiting for all the URL's
connections to be either closed or reused. While the technique can work
reasonably well for simple pages, for pages with significant active content,
detecting page complete is a serious concern. In our experience, the failure
rate is unacceptably large.
- Requires modifying the workstation's communications facility, slowing the
system down and distorting the time measurements.
- Sensitive to the engine that requests the URLs. Many firms use a pseudo
browser to drive the protocol analyzer. Pseudo browsers have serious
measurement negatives.
Bottom Line
- If you are concerned about protocol issues this is a very good
technique. We use it to investigate specific protocol related issues.
- Modification of the communications stack distorts timing and is a
potentially serious issue. In our experience, protocol based timings are not
generally reliable.
- It does not measure the user's experience.
This technique uses a proxy server as a stand-in for the user's browser. It
is similar in many ways to using a protocol analyzer, but monitors higher level
data flows. The proxy server is driven by a browser or more commonly a pseudo
browser. It notes the time at which the page load starts and when the
browser/pseudo browser stops requesting content. The difference is the page load
time.
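One way to decide that the browser has "stopped requesting content" is a quiet-period rule, sketched below. The rule and the two-second default are assumptions chosen for illustration; the text above does not say how any particular proxy makes this decision.

```python
def proxy_page_load_time(requests, quiet_period=2.0):
    """Estimate page load time from (request_start, response_end) pairs seen at a proxy.

    The load is treated as finished at the last response received before the
    first gap longer than quiet_period seconds with no new request.
    """
    ordered = sorted(requests)
    if not ordered:
        raise ValueError("no requests were observed")
    start, finish = ordered[0]
    for request_start, response_end in ordered[1:]:
        if request_start - finish > quiet_period:
            break  # the browser went quiet; later traffic is treated as unrelated
        finish = max(finish, response_end)
    return finish - start
```

The choice of quiet period matters: too short and slow pages are cut off early, too long and content pushed after the load is counted as part of it.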
Positives
- A proxy server can provide very valuable data on HTTP protocol issues
useful in understanding certain page related functionality issues.
Negatives
- Generally all the same as a protocol analyzer.
Bottom Line
- If you are concerned about HTTP protocol issues this is a very good
technique. We use this tool to help us when dealing with form and security
related site issues.
- Proxy servers distort time measurements and introduce other artifacts in
the timing process. In our experience, proxy based timings are not generally
reliable.
- It does not measure the user's experience.
Pseudo browsers are simply applications that work like a browser. They open
the URL, read its content, parse it, and then load any other content required by
the page in a fashion that attempts to mimic a browser's behavior.
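A minimal pseudo browser can be sketched with the Python standard library. The class and function names here are our own, and the sketch is deliberately naive: it fetches resources one at a time and only finds references present in the static markup.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ResourceFinder(HTMLParser):
    """Collect URLs of images, scripts, and stylesheets named in static markup."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            self.resources.append(attributes["src"])
        elif tag == "link" and attributes.get("rel") == "stylesheet" and attributes.get("href"):
            self.resources.append(attributes["href"])


def fetch(url):
    """Default fetcher: download a URL and return its body as bytes."""
    with urlopen(url) as response:
        return response.read()


def pseudo_browser_load(url, fetch=fetch, clock=time.monotonic):
    """Fetch a page plus the resources its markup names; return (seconds, urls fetched)."""
    started = clock()
    html = fetch(url).decode("utf-8", errors="replace")
    finder = ResourceFinder()
    finder.feed(html)
    fetched = [url]
    for reference in finder.resources:
        absolute = urljoin(url, reference)
        fetch(absolute)  # sequential; a real browser fetches in parallel
        fetched.append(absolute)
    return clock() - started, fetched
```

Because no script is ever executed, any content a script would add to the page is never requested, so the reported time covers only part of what a real browser would load.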
Positives
- This is a very simple technique that verifies that the server can
deliver content and can provide accurate timings of its own behavior.
Negatives
- Most pseudo browsers do not in fact mimic the behavior of real browsers
very well. We've tried several and found that their timings are very
doubtful.
- Pseudo browsers generally do not deal with page scripts. As a
consequence they can fail to load all the required content. As dynamic pages
become more common this is an increasingly serious defect.
Bottom Line
- While the technique is very common, it simply doesn't work very well. We
do not generally use it at all.
Microsoft makes IE available as a component that can be embedded in an
application. The application can drive IE and measure the page load time.
Positives
- Behaves very much like IE since the IE component is actually responsible
for fetching the page, parsing, building a rendering tree, etc.
Negatives
- The documented technique for detecting page load complete in the IE component does
not work reliably. In our experience over 15% of URLs are inaccurately timed
using the embedded IE component.
- For reasons we do not fully understand, the times reported are about 18%
fast on average. We suspect that an issue with the embedded IE component is
the underlying cause.
Bottom Line
- An interesting experimental tool but not very good for its intended
purpose.
- We occasionally use one for quick and dirty tests but not as part
of our production environment.
IE is a COM component and thus can be automated by external applications. In
this approach an external application directs a running unmodified instance of
the IE browser to navigate to a page and times the process.
Positives
- Uses a real browser just like a real user and accurately exhibits all IE
behaviors.
- Does the "right thing" with scripts and active pages.
Negatives
- The major problem with this technique is determining when the page is
really loaded. The documented techniques do not work reliably.
Bottom Line
- The technique does not work well in its documented form and cannot be
reliably used in that manner.
Notwithstanding the bottom line, this is one of the techniques we use in
production. After testing with several hundred thousand pages we determined that
one or both of two conditions are true when a page has been completely loaded
and displayed in the IE browser. These conditions are signaled by a sequence of
browser events that are tied to the HTML DOM and/or the browser's internal
operations. They do not materially impact the
browser's performance. Based on these proprietary techniques we are able to
measure page load time with a high degree of accuracy without modifying the page
or communications environment.
IE runs in a Windows window. It is possible to drive IE with an external
application and watch the browser's window. When the window exhibits certain
knowable content it has been completely loaded.
Positives
- Uses a real browser just like a real user and accurately exhibits all IE
behaviors.
- Does the "right thing" with scripts and active pages.
- Depends on deep characteristics of the browser and operating system and
is thus relatively independent of the URL.
Negatives
- While it works with the vast majority of pages, there are some that have
content that causes failures.
- Requires polling of the window to determine when the page is loaded.
Timings are sensitive to the polling time.
- Cannot probe the browser window's content and thus cannot be made
sensitive to "user need" examination.
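The polling sensitivity noted above can be shown with a little arithmetic. The model below is a simplification that assumes the first check happens one interval after navigation starts and that each check takes no time; the function name is ours.

```python
import math


def polled_load_time(true_load_time, poll_interval):
    """Load time a poller reports when it checks the window every poll_interval seconds.

    The page is only seen as loaded at the first check at or after true_load_time.
    """
    if poll_interval <= 0:
        raise ValueError("poll_interval must be positive")
    polls_needed = max(1, math.ceil(true_load_time / poll_interval))
    return polls_needed * poll_interval
```

Under this model a page that truly loads in 1.3 seconds is reported as 2.0 seconds with a one-second poll and 1.5 seconds with a quarter-second poll. The overestimate is always less than one polling interval, but shorter intervals mean more polling work on the workstation.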
Bottom Line
- Excepting the polling issue, this is a generally good technique with
broad applicability.
This is one of the techniques we use when IE automation does not work
acceptably.
It is possible to construct an HTML page that uses frames and the frame's
"onload" event to measure the time it takes for a target URL to load in a second
frame. You can see a sample of this technique on our site and at several other
sites on the Web.
Positives
- Uses a real browser very much like a real user.
- Does the "right thing" with most scripts and active pages.
Negatives
- Only works with pages that will successfully load in a frame. A number
of increasingly common active page techniques are inconsistent with frames
and this approach will not work with those pages.
- A server side application is required to save the time data.
- It is difficult to issue alerts. As a practical matter, the server side
application must issue the alert.
Bottom Line
- Excepting the frame issue, this technique works generally well but does
require significant server support.
This is another technique we use when IE automation does not work acceptably.
Using some proprietary technology we have developed, we are able to identify
pages that are frame compatible and only use the technique with those pages.