The list of URLs that webclient
will fetch is read from
an input or request file. These requests must include
a checksum to validate the returned data, and may also be interspersed
with a variety of directives controlling the think times, the
request header and body, and the way in which summary statistics
are generated. Several examples of an input file are shown in a
previous section.
The request file is formatted with a quasi-XML markup language. It normally starts with a <<START>> tag and ends with an <<END>> tag. In between, each line specifies an HTTP request, indicates any additional POST data, tells how often the request needs to be run, whether or not the request is part of a group block that needs to run together, and gives the checksum information. Lines beginning with the hash sign (#) are treated as comments.
The collection of requests between
<<START>>
and <<END>>
is termed a "session", and is meant to model a typical
user session, from logon, to doing some work, to logging
off. When a session is played back with webclient, the
requests are processed in the sequence specified in the
file.
A session may be run multiple times. In order to simulate a real user's pauses between web pages, the playback can be adjusted so that there is a pause between requests. The pause can be specified to be (exponentially) random or a fixed amount of time.
The basic elements of the syntax are described below:
Indicates the start of a session description.
All URLs up to the <<END>>
input line are read,
saved and submitted as a session.
Currently, only one such tag may appear in a request file.
Denotes the end of a session description. It is followed by two parameters:
count: integer, the number of times that the session will
be replayed. Note that this value can be overridden with the
--repeat-count
command-line flag.
think_time: floating point, number of seconds.
If zero, then webclient will play back the URIs in as rapid
succession as possible.
If negative, then webclient will wait exactly -think_time
seconds between each URI.
If positive, then webclient will wait a random,
exponentially-distributed amount of time between each URI.
The mean of these think times will equal think_time.
Note that this value can be overridden with the
command-line --think-time option (see the
Think Time Distributions section).
Currently, only one such tag may appear in a request file.
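As a rough illustration of the layout described so far, the following Python sketch pulls the request lines, the repeat count and the think time out of a request file. It is not webclient's own parser: the name read_session is hypothetical, it assumes the two parameters follow <<END>> on the same line separated by white space (as in the examples in this document), and it ignores multi-line directives such as <<BODY>>.

```python
def read_session(text):
    """Return (request_lines, count, think_time) from request-file text.

    Hypothetical helper for illustration only; not part of webclient.
    """
    lines = []
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        if line.startswith("<<START>>"):
            inside = True                 # the session begins here
        elif line.startswith("<<END>>"):
            fields = line.split()
            # Assumed layout: <<END>> count think_time
            return lines, int(fields[1]), float(fields[2])
        elif inside:
            lines.append(line)            # a request or directive line
    raise ValueError("no <<END>> tag found")
```

For instance, a file containing two GET lines followed by "<<END>> 50 8.3" would yield those two lines, a count of 50 and a think time of 8.3.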
Requests may be formatted into a single input line each, as
shown here. Alternately, the data, fraction and checksum info
can be broken up into multiple lines by using the
<<REQUEST>>
etc. directives below.
Depending on the style and nature of the session, the
single-line approach may be simpler and easier to read
than the multi-line approach; this is left as a matter
of taste.
If the single-line approach is used, then each request must really be a single input line. If the URL or data gets long, do not insert a CR or NL to split the line: the CR/NL will be interpreted as white space and the URL will end at that point. If the URL includes a query string (data after the ?), you must include it as part of the URL without spaces.
The fraction and checksuminfo fields have the same
meaning as those described below. Besides GET and POST,
any valid HTTP method may be specified, including
OPTIONS, HEAD, PUT, DELETE, TRACE and CONNECT.
Note that the POST form must have
the post body appearing as shown in the second form.
Request the indicated url
using method
from the web server.
The method should be a valid HTTP method; viz. one of
GET, HEAD, POST, etc.
The url may be a fully qualified URI (such as
http://server.com/some/page.html) or a path fragment
(such as /some/page.html). The former form allows
a session to stretch across multiple servers; webclient
will resolve and address each server in turn. If these two
types of requests are intermixed, the most recently specified
server will be used for the fragmentary requests. Alternatively,
an initial web server can be specified with the -w or
--webserver flag (this initial server will be overridden
if/when a fully qualified URL appears in the input file).
Note that POST
requests should be followed by a
<<BODY>>
tag specifying the body of the POST
request.
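The "most recently specified server" rule can be sketched in Python as follows. This is only an illustration of the rule as described above, not webclient's code: resolve_urls and its initial_server parameter (standing in for the -w / --webserver flag) are hypothetical names, and raising an error when no server is known is an assumption.

```python
from urllib.parse import urlsplit

def resolve_urls(urls, initial_server=None):
    """Turn a mix of fully qualified URLs and path fragments into full URLs.

    Hypothetical helper for illustration only; not part of webclient.
    """
    current = initial_server              # e.g. "http://server.com", or None
    resolved = []
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme and parts.netloc:
            # Fully qualified: remember its server for later fragments.
            current = f"{parts.scheme}://{parts.netloc}"
            resolved.append(url)
        elif current is None:
            # Assumption: a fragment with no known server is an error.
            raise ValueError(f"no server known for {url!r}")
        else:
            resolved.append(current + url)
    return resolved
```

With an initial server of http://server.com, the sequence /a.html, http://other.example/b.html, /c.html would resolve the last fragment against other.example, since that is the most recently named server.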
These two tags are synonyms. They are used to specify an HTTP
body that is sent to the webserver along with the HTTP header.
The end of the body should be marked with a <</BODY>>
or <</POSTDATA>>
tag appearing on a new line.
This tag allows webclient
to emulate not only the use of
HTML forms, but also to support non-HTML-based HTTP protocols,
such as OFX.
The body of a POST may be specified out-of-line, as a
separate file. The indicated file should contain the text
of the POST body. This form is especially convenient
when working with HTTP protocols that have large bodies
(such as OFX), or when the bodies are available through an
independent test suite.
The fraction controls randomization of the session.
If the fraction is 1.0, then this URL request is
submitted as part of every session that is run. If
the fraction is between 0.0 and 1.0, then that
value indicates the fraction of sessions for which
this request will be run. For example, to run the
request on 50% of all sessions, use a fraction of
0.50. For each time through the session, webclient
will generate a random number to determine if the
request should be run, using fraction
as the
probability.
Values greater than one are interpreted as run counts.
Thus, a value of 1.5 will be interpreted to mean
that the request should be run at least once, and
possibly twice. (Note, however, that the summary
timing statistics report might not be generated in the
form you expect when using values greater than 1.0.
This may change in future versions.)
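One plausible reading of the fraction rules is sketched below in Python. It assumes that the whole-number part of the fraction is a guaranteed number of runs and the remainder is the probability of one additional run, which matches the 0.50 and 1.5 examples above but is not confirmed as webclient's exact algorithm. The name runs_this_session is hypothetical, and negative fractions (block members) are not handled here.

```python
import math
import random

def runs_this_session(fraction, rng=random):
    """Number of times to issue a request in one pass through the session.

    Hypothetical helper; assumes fraction >= 0.
    """
    whole = math.floor(fraction)          # guaranteed runs (0 for fraction < 1)
    remainder = fraction - whole          # chance of one additional run
    extra = 1 if rng.random() < remainder else 0
    return whole + extra
```

Under this reading a fraction of 1.0 always gives exactly one run, 0.50 gives one run in about half of the sessions, and 1.5 gives one or two runs.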
A block of repeated URLs may be specified by using a negative number for the fraction. This usage is discussed in greater detail below.
Specifies a checksum against which the returned page
will be validated. This checksum is normally computed
by webmon
when a session is being recorded; the checksum
can also be recomputed by specifying the -v
flag.
To disable the use of checksum validation for the current
request, specify a -1
for the first integer.
Text in between the <<HEADER>>
and the
<</HEADER>>
tags will be used to form the HTTP header. This header will remain
in effect for the current and subsequent requests until a new
header is specified (with either the <<HEADER>>
or
the <<HEADERFILE>>
directive).
Specify a file that contains the request header to use.
This tag performs the same function as the --header-file
flag, except that it applies on a per-url basis.
For greater detail, review the section on
Header Substitution.
The specified header will remain
in effect for the current and subsequent requests until a new
header is specified (with either the <<HEADER>>
or
the <<HEADERFILE>>
directive).
Specifies that a pause should occur between this and the next request.
Used to emulate a user pausing to "think" between page fetches.
If the number of seconds is specified, then the pause will be for
that length; otherwise, the default think-time (specified on the
<<END>>
tag or with the --think-time
flag)
will be used. Note that, as elsewhere, times specified as a negative
number denote a fixed think time, while those specified with a positive
number denote an average for a random exponential distribution.
Specifies that performance statistics should be rolled up between
this and the previous <<MARK>> and reported as a
unit. This is useful when a single page view might appear as multiple
URLs in the request file, and means and deviations are wanted for the
combination.
Oftentimes, a number of URL requests need to be run as a group, with the same random choice being made for all members of the group. For example, suppose you want to run the "purchase product" transaction in 35% of all sessions. Well, in a real session, the user wouldn't be able to jump into the middle of the set of web pages that form the transaction. Instead, they must first pull up a services page, then select a service/product, then fill out a form, then verify their order before submission, etc. This sequence of pages must be treated as a block in order for them to make sense. If the first page of the block is randomly chosen to be run, then the whole block will be run. If the first page is randomly rejected, then none of the block is to be run.
Members of a block can be indicated by specifying the fraction -1.0. If the fraction is negative, then that web page will be treated as part of a block that begins with the most recent non-negative URI. Thus, to group a series of pages together, you do something like the following:
GET /page1 0.75
GET /page2 -1.0
GET /page3 -1.0
GET /page4 -1.0
This will cause pages 1-4 to be run on 75% of all sessions. Note that the group ends with the next page with a non-negative value for the fraction.
(Extensions are planned for nested hierarchical blocks, but have not been implemented.)
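The block rule can be illustrated with a short Python sketch: one random choice is made at each non-negative fraction, and the negative-fraction requests that follow inherit it. The name select_requests is hypothetical, fractions above 1.0 are treated simply as "always run" rather than as run counts, and skipping a negative-fraction request that appears before any non-negative one is an assumption.

```python
import random

def select_requests(session, rng=random):
    """Pick the requests to run in one pass through a session.

    session is a list of (request, fraction) pairs.
    Hypothetical helper for illustration only; not part of webclient.
    """
    chosen = []
    block_selected = False
    for request, fraction in session:
        if fraction >= 0:
            # A non-negative fraction starts a new block (possibly of one).
            block_selected = rng.random() < fraction
        # A negative fraction reuses the decision made for the block's head.
        if block_selected:
            chosen.append(request)
    return chosen
```

Applied to the four-page example above, each pass returns either all four pages or none of them, with all four appearing in roughly 75% of passes.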
Think times are used to emulate a user pausing between requests to
"think" about what they are doing before issuing the next request.
webclient
has a number of ways of specifying the think time
that should be applied between requests.
By default, the think time specified on the <<END>> line
applies to each gap between input URLs. That is, there
will be a pause before each URL is issued. This value
can be overridden by specifying the --think-time flag; but again,
the value specified applies to each gap. The location of the gaps
and the length of the gaps can be overridden by using the
<<THINK>> directive.
The think-times used can be fixed or randomly generated. When the think-time
is randomly generated, it is drawn from an exponential distribution
whose mean is the specified time. The exponential distribution
provides a more realistic model of actual human behavior, with some
pauses taking longer than others, but all tending towards a mean.
In all cases where a think time is specified to webclient, a positive
number implies that the exponential distribution should be used, and
a negative number implies that a fixed length of time should be used.
A Gaussian distribution may also be specified. Further details are
presented in the Think Time Distributions section.
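The sign convention can be summarized in a few lines of Python. This is a sketch of the rule as stated above, not webclient's implementation: think_delay is a hypothetical name, and the Gaussian option is left out.

```python
import random

def think_delay(think_time, rng=random):
    """Return the pause, in seconds, for one gap between requests.

    Hypothetical helper for illustration only; not part of webclient.
    """
    if think_time == 0:
        return 0.0                        # zero: no pause at all
    if think_time < 0:
        return -think_time                # negative: fixed pause of |think_time|
    # Positive: exponentially distributed pause whose mean is think_time.
    # random.expovariate takes the rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / think_time)
```

A value of -2.5 therefore always yields a 2.5 second pause, while 8.3 yields pauses that vary from gap to gap but average about 8.3 seconds over many requests.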
For example, the input file
<<START>>
GET /pageone.html
GET /pagetwo.html
<<THINK>>
GET /pagethree.html
GET /pagefour.html
GET /more.html
<<THINK>>
GET /another.html
GET /andmore.html
<<THINK>>
GET /last.html
<<THINK>>
<<END>> 50 8.3
will result in no pause between the fetch of /pageone.html
and /pagetwo.html, and a think-time of 8.3 seconds
(on average, since the value is positive)
between the fetch of /pagetwo.html and /pagethree.html.
That is, the think-time will be non-zero only where the
<<THINK>> directive appears.
Optionally, independent think times can be specified, like so:
<<START>>
GET /pageone.html
GET /pagetwo.html
<<THINK>> 5.2
GET /pagethree.html
GET /pagefour.html
GET /more.html
<<THINK>>
GET /another.html
GET /andmore.html
<<THINK>> 49.0
GET /last.html
<<END>> 50 8.3
The first pause will last 5.2 seconds, the second will last
8.3 seconds (the default), and the last pause will last 49 seconds.
In this example, there is no pause between the fetch of /last.html
and /pageone.html on the next pass through the session. A final
<<THINK>> is needed if you want to avoid immediately going back to the beginning.
If the keyword <<THINK>> never appears in the file, then the
default think-time will be applied between each and every URL.
Thus, the input file
<<START>>
GET /pageone.html
GET /pagetwo.html
GET /pagethree.html
<<END>> 50 8.3
is identical to
<<START>>
GET /pageone.html
<<THINK>> 8.3
GET /pagetwo.html
<<THINK>> 8.3
GET /pagethree.html
<<THINK>> 8.3
<<END>> 50 8.3
Additional flexibility is provided by the --think-time
flag.
If this flag is used, it overrides the default value specified
with the <<END>>
tag.
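The way <<THINK>> directives and the default think time interact can be sketched in Python. The function below takes the lines between <<START>> and <<END>> and reports the think-time value that applies after each request. It is an illustration of the rules in this section only: think_values is a hypothetical name, every line that is not a <<THINK>> directive is assumed to be a request, and the values keep the sign convention described earlier (positive for an exponential mean, negative for a fixed time).

```python
def think_values(lines, default_think):
    """Return (request, think_time_after) pairs for one session.

    Hypothetical helper for illustration only; not part of webclient.
    """
    has_think = any(l.split()[0] == "<<THINK>>" for l in lines if l.strip())
    pairs = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "<<THINK>>":
            # A bare <<THINK>> uses the default; otherwise use the given value.
            value = float(fields[1]) if len(fields) > 1 else default_think
            if pairs:
                pairs[-1] = (pairs[-1][0], value)
        else:
            # With no <<THINK>> anywhere, every request gets the default.
            pairs.append((line.strip(), 0.0 if has_think else default_think))
    return pairs
```

For the second example above (with a default of 8.3), the values after the eight requests would be 0, 5.2, 0, 0, 8.3, 0, 49.0 and 0; for a file with no <<THINK>> at all, every request would be followed by 8.3.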