QualityAssurance

Monday, December 13, 2010

Content Delivery Networks (CDNs)

Accellion offers secure large file delivery services (e.g., replacing email, ftp, and CDs).

Akamai is the company that made content delivery networks famous with a record-setting IPO. Its core offering is distribution services (both HTTP content and streaming media), and it has recently unbundled other services, including network monitoring and geographic targeting. In April 2000, Akamai purchased InterVu; in 2005, Speedera; and in 2007, Netli.

AppStream provides content delivery services for applications, not static web content. It delivers what the client needs to get started, and continues to anticipate what the user will need next, enabling fast startup times, even for large Java applications.

AT&T offers Intelligent Content Distribution services.

BitGravity is a content distribution network, established in 2006, to provide services for the delivery of audio, video, software, and advertising.

CacheNetworks offers the CacheFly CDN, which uses BGP AnyCast to find the server with lowest latency to the user.

CDNetworks provides content acceleration with a global network of POPs and serves as Asia's largest CDN service provider.

EdgeCast (founded 2006) is a content delivery network (CDN) offering Flash, Windows, Silverlight and HTTP Progressive download streaming all for the same low price. They also provide website acceleration for increasing web site performance and speeding up page load times as well as advanced reporting and analytics.

EdgeStream provides video delivery solutions, including a complete software platform from client to server and a distributed and fault-tolerant hosting network.

Limelight Networks offers an advanced content delivery network for Internet distribution of high-bandwidth media such as video, music, games, and downloads.

LocalMirror offers a managed content delivery network for static content delivery, DNS services, and audio and video streaming.

Mirror Image provides content and streaming delivery services through a global network of Content Access Points. In March 1999, Xcelera purchased a majority of Mirror Image, and in January 2001, Xcelera/Mirror Image purchased Clearway Technologies.

NaviSite offers two content delivery services: Delta Edge, which builds content on the edge, updating only the portions that have changed, and HotRoute, which provides network-based proximity detection to select the closest server to deliver the requested content.

Panther Express offers a reliable, scalable, yet cost effective content delivery solution.

Peer 1's Rapid Edge CDN provides global load balancing across its network of caches with on-demand propagation.

Savvis offers Content Delivery Services that were once under the names of Sandpiper, Digital Island, and Exodus.

TTS Europe focuses on media rich electronic learning with its quantum solution for distributing and tracking media-rich e-learning content throughout your network. It is a scalable and flexible CDN that integrates with a learning management system or intranet training database.

SyncCast provides CDN web-casting services as well as DRM and ad-insertion products.

VitalStream (purchased by InterNap in Feb 2007) provides media streaming and broadcasting services including pay-per-view, on-demand streaming, and live-event streaming.

Companies offering content delivery (and related) software

Castify Networks (purchased Dec 2007 by Omneon) provides technology for large scale and high quality streaming media delivery, including ad insertion, scheduling, and distribution restriction.

Blue Coat offers caching and compression solutions.

Circadence provides the Conductor QOS Managed Service, which incorporates caching, content and customer prioritization, managed transport, and soft landing (sort of a Web busy signal).

Cisco offers a number of products (including content-aware routers, switches, caches, etc.) as part of its Content Networking initiative.

Imperial Software Technology (IST) offers DeltaStream which transmits live data to interested websites to deliver dynamic, real-time information.

Kasenna's MediaBase provides content management and distribution for high-quality video.

Radware supplies products for web server, cache, and firewall traffic management.

RepliWeb offers software to distribute content by replication across systems.

Resonate supplies software solutions for multisite redundancy, load-balancing, monitoring and management.

SinoCDN provides the intelligent streaming gateway appliance with which streaming content delivery networks can be built, supporting the major streaming protocols.

Solid State Networks uniquely uses the BitTorrent protocol to provide speedy content delivery at a lower cost.

Software Pursuits offers SureSynch file synchronization software for Windows clients and servers.

In addition to proxy caches, Stratacache offers software and hardware CDN solutions, targeted primarily to the enterprise.

WARP Solutions offers products such as a dynamic content accelerator to optimize application server performance.

HTTP caching

13 Caching in HTTP

HTTP is typically used for distributed information systems, where performance can be improved by the use of response caches. The HTTP/1.1 protocol includes a number of elements intended to make caching work as well as possible. Because these elements are inextricable from other aspects of the protocol, and because they interact with each other, it is useful to describe the basic caching design of HTTP separately from the detailed descriptions of methods, headers, response codes, etc.

Caching would be useless if it did not significantly improve performance. The goal of caching in HTTP/1.1 is to eliminate the need to send requests in many cases, and to eliminate the need to send full responses in many other cases. The former reduces the number of network round-trips required for many operations; we use an "expiration" mechanism for this purpose (see section 13.2). The latter reduces network bandwidth requirements; we use a "validation" mechanism for this purpose (see section 13.3).

Requirements for performance, availability, and disconnected operation require us to be able to relax the goal of semantic transparency. The HTTP/1.1 protocol allows origin servers, caches, and clients to explicitly reduce transparency when necessary. However, because non-transparent operation may confuse non-expert users, and might be incompatible with certain server applications (such as those for ordering merchandise), the protocol requires that transparency be relaxed

      - only by an explicit protocol-level request when relaxed by client or origin server

      - only with an explicit warning to the end user when relaxed by cache or client

Therefore, the HTTP/1.1 protocol provides these important elements:

      1. Protocol features that provide full semantic transparency when this is required by all parties.

      2. Protocol features that allow an origin server or user agent to explicitly request and control non-transparent operation.

      3. Protocol features that allow a cache to attach warnings to responses that do not preserve the requested approximation of semantic transparency.

A basic principle is that it must be possible for the clients to detect any potential relaxation of semantic transparency.

      Note: The server, cache, or client implementor might be faced with design decisions not explicitly discussed in this specification. If a decision might affect semantic transparency, the implementor ought to err on the side of maintaining transparency unless a careful and complete analysis shows significant benefits in breaking transparency.

13.1.1 Cache Correctness

A correct cache MUST respond to a request with the most up-to-date response held by the cache that is appropriate to the request (see sections 13.2.5, 13.2.6, and 13.12) which meets one of the following conditions:

      1. It has been checked for equivalence with what the origin server would have returned by revalidating the response with the origin server (section 13.3);

      2. It is "fresh enough" (see section 13.2). In the default case, this means it meets the least restrictive freshness requirement of the client, origin server, and cache (see section 14.9); if the origin server so specifies, it is the freshness requirement of the origin server alone.

         If a stored response is not "fresh enough" by the most restrictive freshness requirement of both the client and the origin server, in carefully considered circumstances the cache MAY still return the response with the appropriate Warning header (see section 13.1.5 and 14.46), unless such a response is prohibited (e.g., by a "no-store" cache-directive, or by a "no-cache" cache-request-directive; see section 14.9).

      3. It is an appropriate 304 (Not Modified), 305 (Proxy Redirect), or error (4xx or 5xx) response message.

If the cache can not communicate with the origin server, then a correct cache SHOULD respond as above if the response can be correctly served from the cache; if not it MUST return an error or warning indicating that there was a communication failure.

If a cache receives a response (either an entire response, or a 304 (Not Modified) response) that it would normally forward to the requesting client, and the received response is no longer fresh, the cache SHOULD forward it to the requesting client without adding a new Warning (but without removing any existing Warning headers). A cache SHOULD NOT attempt to revalidate a response simply because that response became stale in transit; this might lead to an infinite loop. A user agent that receives a stale response without a Warning MAY display a warning indication to the user.

13.1.2 Warnings

Whenever a cache returns a response that is neither first-hand nor "fresh enough" (in the sense of condition 2 in section 13.1.1), it MUST attach a warning to that effect, using a Warning general-header. The Warning header and the currently defined warnings are described in section 14.46. The warning allows clients to take appropriate action.

Warnings MAY be used for other purposes, both cache-related and otherwise. The use of a warning, rather than an error status code, distinguishes these responses from true failures.

Warnings are assigned three digit warn-codes. The first digit indicates whether the Warning MUST or MUST NOT be deleted from a stored cache entry after a successful revalidation:

1xx Warnings that describe the freshness or revalidation status of the response, and so MUST be deleted after a successful revalidation. 1xx warn-codes MAY be generated by a cache only when validating a cached entry. They MUST NOT be generated by clients.

2xx Warnings that describe some aspect of the entity body or entity headers that is not rectified by a revalidation (for example, a lossy compression of the entity bodies) and which MUST NOT be deleted after a successful revalidation.

See section 14.46 for the definitions of the codes themselves.
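The deletion rule for the two warn-code classes can be sketched in a few lines of Python. This is only an illustration, assuming the Warning headers have already been parsed into (warn_code, warn_text) pairs; the function name warnings_after_revalidation is hypothetical, and keeping codes outside the 1xx and 2xx classes is an assumption, since the text above defines only those two.

```python
def warnings_after_revalidation(warnings):
    """Return the warnings to keep on a stored entry after a successful revalidation.

    `warnings` is a list of (warn_code, warn_text) tuples, an assumed and
    simplified representation of already-parsed Warning header values.
    """
    kept = []
    for code, text in warnings:
        if 100 <= code <= 199:
            # 1xx: describes freshness or revalidation status, so it is deleted.
            continue
        # 2xx: describes the entity itself and is not fixed by revalidation,
        # so it stays. Other classes are kept too (an assumption of this sketch).
        kept.append((code, text))
    return kept
```

With one warning of each class, such as 110 (Response is stale) and 214 (Transformation applied), only the 214 warning would survive the revalidation.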

HTTP/1.0 caches will cache all Warnings in responses, without deleting the ones in the first category. Warnings in responses that are passed to HTTP/1.0 caches carry an extra warning-date field, which prevents a future HTTP/1.1 recipient from believing an erroneously cached Warning.

Warnings also carry a warning text. The text MAY be in any appropriate natural language (perhaps based on the client's Accept headers), and include an OPTIONAL indication of what character set is used.

Multiple warnings MAY be attached to a response (either by the origin server or by a cache), including multiple warnings with the same code number. For example, a server might provide the same warning with texts in both English and Basque.

When multiple warnings are attached to a response, it might not be practical or reasonable to display all of them to the user. This version of HTTP does not specify strict priority rules for deciding which warnings to display and in what order, but does suggest some heuristics.

13.1.3 Cache-control Mechanisms

The basic cache mechanisms in HTTP/1.1 (server-specified expiration times and validators) are implicit directives to caches. In some cases, a server or client might need to provide explicit directives to the HTTP caches. We use the Cache-Control header for this purpose.

The Cache-Control header allows a client or server to transmit a variety of directives in either requests or responses. These directives typically override the default caching algorithms. As a general rule, if there is any apparent conflict between header values, the most restrictive interpretation is applied (that is, the one that is most likely to preserve semantic transparency). However, in some cases, cache-control directives are explicitly specified as weakening the approximation of semantic transparency (for example, "max-stale" or "public").

The cache-control directives are described in detail in section 14.9.
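To make the directive syntax concrete, here is a minimal Python sketch that splits a Cache-Control header value into a dictionary. It is a simplification rather than a complete parser: the function name parse_cache_control is hypothetical, and the sketch assumes no directive carries a quoted value containing a comma (such as a quoted field-name list).

```python
def parse_cache_control(header_value):
    """Parse a Cache-Control header value into a dict of directives.

    Directives without an argument map to True; numeric arguments
    (delta-seconds) become ints; anything else is kept as a string.
    Simplification: quoted values containing commas are not handled.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if sep:
            value = value.strip().strip('"')
            directives[name] = int(value) if value.isdigit() else value
        else:
            directives[name] = True
    return directives
```

For instance, the value "max-age=3600, must-revalidate" would be parsed into a max-age of 3600 seconds plus a must-revalidate flag.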

13.1.4 Explicit User Agent Warnings

Many user agents make it possible for users to override the basic caching mechanisms. For example, the user agent might allow the user to specify that cached entities (even explicitly stale ones) are never validated. Or the user agent might habitually add "Cache-Control: max-stale=3600" to every request. The user agent SHOULD NOT default to either non-transparent behavior, or behavior that results in abnormally ineffective caching, but MAY be explicitly configured to do so by an explicit action of the user.

If the user has overridden the basic caching mechanisms, the user agent SHOULD explicitly indicate to the user whenever this results in the display of information that might not meet the server's transparency requirements (in particular, if the displayed entity is known to be stale). Since the protocol normally allows the user agent to determine if responses are stale or not, this indication need only be displayed when this actually happens. The indication need not be a dialog box; it could be an icon (for example, a picture of a rotting fish) or some other indicator.

If the user has overridden the caching mechanisms in a way that would abnormally reduce the effectiveness of caches, the user agent SHOULD continually indicate this state to the user (for example, by a display of a picture of currency in flames) so that the user does not inadvertently consume excess resources or suffer from excessive latency.

13.1.5 Exceptions to the Rules and Warnings

In some cases, the operator of a cache MAY choose to configure it to return stale responses even when not requested by clients. This decision ought not be made lightly, but may be necessary for reasons of availability or performance, especially when the cache is poorly connected to the origin server. Whenever a cache returns a stale response, it MUST mark it as such (using a Warning header) enabling the client software to alert the user that there might be a potential problem.

It also allows the user agent to take steps to obtain a first-hand or fresh response. For this reason, a cache SHOULD NOT return a stale response if the client explicitly requests a first-hand or fresh one, unless it is impossible to comply for technical or policy reasons.

13.1.6 Client-controlled Behavior

While the origin server (and to a lesser extent, intermediate caches, by their contribution to the age of a response) are the primary source of expiration information, in some cases the client might need to control a cache's decision about whether to return a cached response without validating it. Clients do this using several directives of the Cache-Control header.

A client's request MAY specify the maximum age it is willing to accept of an unvalidated response; specifying a value of zero forces the cache(s) to revalidate all responses. A client MAY also specify the minimum time remaining before a response expires. Both of these options increase constraints on the behavior of caches, and so cannot further relax the cache's approximation of semantic transparency.

A client MAY also specify that it will accept stale responses, up to some maximum amount of staleness. This loosens the constraints on the caches, and so might violate the origin server's specified constraints on semantic transparency, but might be necessary to support disconnected operation, or high availability in the face of poor connectivity.
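The three client-side controls described in this section (maximum age, minimum remaining freshness, and maximum staleness) can be combined into one decision, sketched below in Python. This is an illustration under stated assumptions, not a normative algorithm: the function name and the dictionary keys are hypothetical, a max-stale directive without a value is represented as True, and an age equal to max-age is treated as too old so that max-age=0 always forces revalidation.

```python
def may_serve_without_validation(current_age, freshness_lifetime, request_directives):
    """Decide whether a cached response satisfies the client's cache directives.

    All times are in seconds. `request_directives` is an assumed dict such as
    {"max-age": 0}, {"min-fresh": 30} or {"max-stale": 120}.
    """
    # max-age: upper bound on the age of an unvalidated response.
    max_age = request_directives.get("max-age")
    if max_age is not None and current_age >= max_age:
        return False

    remaining = freshness_lifetime - current_age
    if remaining > 0:
        # Still fresh: honour min-fresh (minimum time left before expiry).
        return remaining >= request_directives.get("min-fresh", 0)

    # Stale: acceptable only if the client opted in with max-stale.
    max_stale = request_directives.get("max-stale")
    if max_stale is None:
        return False
    if max_stale is True:  # max-stale with no value: any staleness is accepted
        return True
    return -remaining <= max_stale
```

Note that the first two directives can only turn a "yes" into a "no", while max-stale is the only one that lets a stale entry through; a cache that serves a stale response on that basis still has to attach the appropriate Warning.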

13.2 Expiration Model

13.2.1 Server-Specified Expiration

HTTP caching works best when caches can entirely avoid making requests to the origin server. The primary mechanism for avoiding requests is for an origin server to provide an explicit expiration time in the future, indicating that a response MAY be used to satisfy subsequent requests. In other words, a cache can return a fresh response without first contacting the server.

Our expectation is that servers will assign future explicit expiration times to responses in the belief that the entity is not likely to change, in a semantically significant way, before the expiration time is reached. This normally preserves semantic transparency, as long as the server's expiration times are carefully chosen.

The expiration mechanism applies only to responses taken from a cache and not to first-hand responses forwarded immediately to the requesting client.

If an origin server wishes to force a semantically transparent cache to validate every request, it MAY assign an explicit expiration time in the past. This means that the response is always stale, and so the cache SHOULD validate it before using it for subsequent requests. See section 14.9.4 for a more restrictive way to force revalidation.

If an origin server wishes to force any HTTP/1.1 cache, no matter how it is configured, to validate every request, it SHOULD use the "must-revalidate" cache-control directive (see section 14.9).

Servers specify explicit expiration times using either the Expires header, or the max-age directive of the Cache-Control header.

An expiration time cannot be used to force a user agent to refresh its display or reload a resource; its semantics apply only to caching mechanisms, and such mechanisms need only check a resource's expiration status when a new request for that resource is initiated. See section 13.13 for an explanation of the difference between caches and history mechanisms.

13.2.2 Heuristic Expiration

Since origin servers do not always provide explicit expiration times, HTTP caches typically assign heuristic expiration times, employing algorithms that use other header values (such as the Last-Modified time) to estimate a plausible expiration time. The HTTP/1.1 specification does not provide specific algorithms, but does impose worst-case constraints on their results. Since heuristic expiration times might compromise semantic transparency, they ought to be used cautiously, and we encourage origin servers to provide explicit expiration times as much as possible.

13.2.3 Age Calculations

In order to know if a cached entry is fresh, a cache needs to know if its age exceeds its freshness lifetime. We discuss how to calculate the latter in section 13.2.4; this section describes how to calculate the age of a response or cache entry.

In this discussion, we use the term "now" to mean "the current value of the clock at the host performing the calculation." Hosts that use HTTP, but especially hosts running origin servers and caches, SHOULD use NTP [28] or some similar protocol to synchronize their clocks to a globally accurate time standard.

HTTP/1.1 requires origin servers to send a Date header, if possible, with every response, giving the time at which the response was generated (see section 14.18). We use the term "date_value" to denote the value of the Date header, in a form appropriate for arithmetic operations.

HTTP/1.1 uses the Age response-header to convey the estimated age of the response message when obtained from a cache. The Age field value is the cache's estimate of the amount of time since the response was generated or revalidated by the origin server.

In essence, the Age value is the sum of the time that the response has been resident in each of the caches along the path from the origin server, plus the amount of time it has been in transit along network paths.

We use the term "age_value" to denote the value of the Age header, in a form appropriate for arithmetic operations.

A response's age can be calculated in two entirely independent ways:

      1. now minus date_value, if the local clock is reasonably well synchronized to the origin server's clock. If the result is negative, the result is replaced by zero.

      2. age_value, if all of the caches along the response path implement HTTP/1.1.

Given that we have two independent ways to compute the age of a response when it is received, we can combine these as

       corrected_received_age = max(now - date_value, age_value)

and as long as we have either nearly synchronized clocks or all-HTTP/1.1 paths, one gets a reliable (conservative) result.

Because of network-imposed delays, some significant interval might pass between the time that a server generates a response and the time it is received at the next outbound cache or client. If uncorrected, this delay could result in improperly low ages.

Because the request that resulted in the returned Age value must have been initiated prior to that Age value's generation, we can correct for delays imposed by the network by recording the time at which the request was initiated. Then, when an Age value is received, it MUST be interpreted relative to the time the request was initiated, not the time that the response was received. This algorithm results in conservative behavior no matter how much delay is experienced. So, we compute:

      corrected_initial_age = corrected_received_age + (now - request_time)

where "request_time" is the time (according to the local clock) when the request that elicited this response was sent.

Summary of age calculation algorithm, when a cache receives a response:

      /*
       * age_value
       *      is the value of Age: header received by the cache with
       *              this response.
       * date_value
       *      is the value of the origin server's Date: header
       * request_time
       *      is the (local) time when the cache made the request
       *              that resulted in this cached response
       * response_time
       *      is the (local) time when the cache received the
       *              response
       * now
       *      is the current (local) time
       */

      apparent_age = max(0, response_time - date_value);
      corrected_received_age = max(apparent_age, age_value);
      response_delay = response_time - request_time;
      corrected_initial_age = corrected_received_age + response_delay;
      resident_time = now - response_time;
      current_age   = corrected_initial_age + resident_time;

The current_age of a cache entry is calculated by adding the amount of time (in seconds) since the cache entry was last validated by the origin server to the corrected_initial_age. When a response is generated from a cache entry, the cache MUST include a single Age header field in the response with a value equal to the cache entry's current_age.
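The age summary translates almost line for line into Python. The following is a minimal sketch, assuming every argument is a number of seconds on a common scale (for example, values comparable to time.time()) and that the Age and Date headers have already been parsed; the function name current_age simply mirrors the variable in the summary.

```python
def current_age(age_value, date_value, request_time, response_time, now):
    """Compute the current age of a cache entry, in seconds.

    age_value      value of the received Age header (0 if absent)
    date_value     value of the origin server's Date header
    request_time   local time when the cache sent the request
    response_time  local time when the cache received the response
    now            current local time
    """
    # Age as seen from our own clock; never negative despite clock skew.
    apparent_age = max(0, response_time - date_value)
    # Trust whichever estimate is larger (the conservative choice).
    corrected_received_age = max(apparent_age, age_value)
    # Charge the whole request/response round trip to the response's age.
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Time the entry has spent sitting in this cache.
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

As a worked case: a response arrives with an Age of 10 seconds and a Date 5 seconds before receipt, after a 2-second round trip, and then sits in the cache for 60 seconds. Its current age is max(5, 10) + 2 + 60 = 72 seconds.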

The presence of an Age header field in a response implies that a response is not first-hand. However, the converse is not true, since the lack of an Age header field in a response does not imply that the response is first-hand unless all caches along the request path are compliant with HTTP/1.1 (i.e., older HTTP caches did not implement the Age header field).

13.2.4 Expiration Calculations

In order to decide whether a response is fresh or stale, we need to compare its freshness lifetime to its age. The age is calculated as described in section 13.2.3; this section describes how to calculate the freshness lifetime, and to determine if a response has expired. In the discussion below, the values can be represented in any form appropriate for arithmetic operations.

We use the term "expires_value" to denote the value of the Expires header. We use the term "max_age_value" to denote an appropriate value of the number of seconds carried by the "max-age" directive of the Cache-Control header in a response (see section 14.9.3).

The max-age directive takes priority over Expires, so if max-age is present in a response, the calculation is simply:

      freshness_lifetime = max_age_value

Otherwise, if Expires is present in the response, the calculation is:

      freshness_lifetime = expires_value - date_value

Note that neither of these calculations is vulnerable to clock skew, since all of the information comes from the origin server.

If none of Expires, Cache-Control: max-age, or Cache-Control: s-maxage (see section 14.9.3) appears in the response, and the response does not include other restrictions on caching, the cache MAY compute a freshness lifetime using a heuristic. The cache MUST attach Warning 113 to any response whose age is more than 24 hours if such warning has not already been added.

Also, if the response does have a Last-Modified time, the heuristic expiration value SHOULD be no more than some fraction of the interval since that time. A typical setting of this fraction might be 10%.
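One way to picture the heuristic is the Python sketch below. It is only an assumption-laden illustration: the specification does not mandate any particular algorithm, the function names are hypothetical, the interval is measured from Last-Modified to the response's Date value, and the 10% fraction is the "typical setting" mentioned above, expressed as an integer percentage to avoid floating-point rounding.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def heuristic_freshness_lifetime(date_value, last_modified_value, percent=10):
    """Estimate a freshness lifetime (seconds) from the Last-Modified time.

    Both arguments are seconds on a common scale. The lifetime is a fraction
    (default 10%) of the interval since the entity was last modified.
    """
    interval = max(0, date_value - last_modified_value)
    return interval * percent // 100


def needs_heuristic_warning(current_age):
    """True when a heuristically fresh response is older than 24 hours,
    the point at which Warning 113 has to be attached."""
    return current_age > SECONDS_PER_DAY
```

Under these assumptions, an entity last modified exactly one day before the response was generated would be treated as fresh for 8640 seconds (2.4 hours).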

The calculation to determine if a response has expired is quite simple:

      response_is_fresh = (freshness_lifetime > current_age)
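Putting the expiration formulas together, a cache's freshness check might look like the following Python sketch. The function names are hypothetical, the Date and Expires values are assumed to be well-formed HTTP-date strings (malformed values are not handled), and max_age_value is assumed to be already parsed to an integer. Returning None when neither Expires nor max-age is present is a choice of this sketch, leaving room for a heuristic.

```python
from email.utils import parsedate_to_datetime


def http_date_delta(later, earlier):
    """Seconds from HTTP-date string `earlier` to HTTP-date string `later`."""
    delta = parsedate_to_datetime(later) - parsedate_to_datetime(earlier)
    return int(delta.total_seconds())


def freshness_lifetime(date_header, expires_header=None, max_age_value=None):
    """Freshness lifetime in seconds, or None if no explicit expiration exists."""
    if max_age_value is not None:
        # max-age takes priority over Expires.
        return max_age_value
    if expires_header is not None:
        # Both values come from the origin server, so clock skew cancels out.
        return http_date_delta(expires_header, date_header)
    return None


def response_is_fresh(lifetime, current_age):
    """A response is fresh while its lifetime exceeds its current age."""
    return lifetime > current_age
```

For example, a response with Date "Thu, 01 Dec 1994 16:00:00 GMT" and Expires "Thu, 01 Dec 1994 17:00:00 GMT" has a freshness lifetime of 3600 seconds, so it is still fresh at an age of 120 seconds.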

13.2.5 Disambiguating Expiration Values

Because expiration values are assigned optimistically, it is possible for two caches to contain fresh values for the same resource that are different.

If a client performing a retrieval receives a non-first-hand response for a request that was already fresh in its own cache, and the Date header in its existing cache entry is newer than the Date on the new response, then the client MAY ignore the response. If so, it MAY retry the request with a "Cache-Control: max-age=0" directive (see section 14.9), to force a check with the origin server.

If a cache has two fresh responses for the same representation with different validators, it MUST use the one with the more recent Date header. This situation might arise because the cache is pooling responses from other caches, or because a client has asked for a reload or a revalidation of an apparently fresh cache entry.

13.2.6 Disambiguating Multiple Responses

Because a client might be receiving responses via multiple paths, so that some responses flow through one set of caches and other responses flow through a different set of caches, a client might receive responses in an order different from that in which the origin server sent them. We would like the client to use the most recently generated response, even if older responses are still apparently fresh.

Neither the entity tag nor the expiration value can impose an ordering on responses, since it is possible that a later response intentionally carries an earlier expiration time. The Date values are ordered to a granularity of one second.

When a client tries to revalidate a cache entry, and the response it receives contains a Date header that appears to be older than the one for the existing entry, then the client SHOULD repeat the request unconditionally, and include

       Cache-Control: max-age=0

to force any intermediate caches to validate their copies directly with the origin server, or

       Cache-Control: no-cache

to force any intermediate caches to obtain a new copy from the origin server.

If the Date values are equal, then the client MAY use either response (or MAY, if it is being extremely prudent, request a new response). Servers MUST NOT depend on clients being able to choose deterministically between responses generated during the same second, if their expiration times overlap.

13.3 Validation Model

When a cache has a stale entry that it would like to use as a response to a client's request, it first has to check with the origin server (or possibly an intermediate cache with a fresh response) to see if its cached entry is still usable. We call this "validating" the cache entry. Since we do not want to have to pay the overhead of retransmitting the full response if the cached entry is good, and we do not want to pay the overhead of an extra round trip if the cached entry is invalid, the HTTP/1.1 protocol supports the use of conditional methods.

The key protocol features for supporting conditional methods are those concerned with "cache validators." When an origin server generates a full response, it attaches some sort of validator to it, which is kept with the cache entry. When a client (user agent or proxy cache) makes a conditional request for a resource for which it has a cache entry, it includes the associated validator in the request.

The server then checks that validator against the current validator for the entity, and, if they match (see section 13.3.3), it responds with a special status code (usually, 304 (Not Modified)) and no entity-body. Otherwise, it returns a full response (including entity-body). Thus, we avoid transmitting the full response if the validator matches, and we avoid an extra round trip if it does not match.

In HTTP/1.1, a conditional request looks exactly the same as a normal request for the same resource, except that it carries a special header (which includes the validator) that implicitly turns the method (usually, GET) into a conditional.

The protocol includes both positive and negative senses of cache-validating conditions. That is, it is possible to request either that a method be performed if and only if a validator matches or if and only if no validators match.

      Note: a response that lacks a validator may still be cached, and served from cache until it expires, unless this is explicitly prohibited by a cache-control directive. However, a cache cannot do a conditional retrieval if it does not have a validator for the entity, which means it will not be refreshable after it expires.
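The validation exchange described in this section can be sketched from the server's side in Python. This is a deliberately small illustration, not a full implementation: the function name handle_conditional_get is hypothetical, the request carries at most one validator in an If-None-Match header, headers are a plain dict, and validators are compared as exact strings.

```python
def handle_conditional_get(request_headers, current_etag, body):
    """Answer a GET that may carry a validator in If-None-Match.

    Returns a (status_code, response_headers, entity_body) tuple.
    """
    if request_headers.get("If-None-Match") == current_etag:
        # Validator matches: the cached copy is still good, so send
        # 304 (Not Modified) with no entity-body.
        return 304, {"ETag": current_etag}, b""
    # No validator, or it no longer matches: send the full response
    # straight away, avoiding an extra round trip.
    return 200, {"ETag": current_etag}, body
```

A cache holding an entry tagged "v2" would send that tag back and receive an empty 304 if the entity is unchanged, or a complete 200 response if the server's current tag has moved on.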

13.3.1 Last-Modified Dates

The Last-Modified entity-header field value is often used as a cache validator. In simple terms, a cache entry is considered to be valid if the entity has not been modified since the Last-Modified value.
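As a small illustration of date-based validation, the Python sketch below compares a cached Last-Modified value against the entity's current one. The function name is hypothetical, and both arguments are assumed to be well-formed HTTP-date strings such as those sent in Last-Modified and If-Modified-Since headers.

```python
from email.utils import parsedate_to_datetime


def not_modified_since(if_modified_since, last_modified):
    """True if the entity has not changed after the date the cache holds.

    `if_modified_since` is the validator the cache sends back;
    `last_modified` is the entity's current modification date.
    """
    return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since)
```

When this returns True the cache entry is still valid; when it returns False the entity has changed and a full response is needed.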

13.3.2 Entity Tag Cache Validators

The ETag response-header field value, an entity tag, provides for an "opaque" cache validator. This might allow more reliable validation in situations where it is inconvenient to store modification dates, where the one-second resolution of HTTP date values is not sufficient, or where the origin server wishes to avoid certain paradoxes that might arise from the use of modification dates.

Entity Tags are described in section 3.11. The headers used with entity tags are described in sections 14.19, 14.24, 14.26 and 14.44.

13.3.3 Weak and Strong Validators

Since both origin servers and caches will compare two validators to decide if they represent the same or different entities, one normally would expect that if the entity (the entity-body or any entity-headers) changes in any way, then the associated validator would change as well. If this is true, then we call this validator a "strong validator."

However, there might be cases when a server prefers to change the validator only on semantically significant changes, and not when insignificant aspects of the entity change. A validator that does not always change when the resource changes is a "weak validator."

Entity tags are normally "strong validators," but the protocol provides a mechanism to tag an entity tag as "weak." One can think of a strong validator as one that changes whenever the bits of an entity change, while a weak validator changes whenever the meaning of an entity changes. Alternatively, one can think of a strong validator as part of an identifier for a specific entity, while a weak validator is part of an identifier for a set of semantically equivalent entities.

      Note: One example of a strong validator is an integer that is incremented in stable storage every time an entity is changed.

      An entity's modification time, if represented with one-second resolution, could be a weak validator, since it is possible that the resource might be modified twice during a single second.

      Support for weak validators is optional. However, weak validators allow for more efficient caching of equivalent objects; for example, a hit counter on a site is probably good enough if it is updated every few days or weeks, and any value during that period is likely "good enough" to be equivalent.

A "use" of a validator is either when a client generates a request and includes the validator in a validating header field, or when a server compares two validators.

Strong validators are usable in any context. Weak validators are only usable in contexts that do not depend on exact equality of an entity. For example, either kind is usable for a conditional GET of a full entity. However, only a strong validator is usable for a sub-range retrieval, since otherwise the client might end up with an internally inconsistent entity.

Clients MAY issue simple (non-subrange) GET requests with either weak validators or strong validators. Clients MUST NOT use weak validators in other forms of request.

The only function that the HTTP/1.1 protocol defines on validators is comparison. There are two validator comparison functions, depending on whether the comparison context allows the use of weak validators or not:

      - The strong comparison function: in order to be considered equal, both validators MUST be identical in every way, and both MUST NOT be weak.

      - The weak comparison function: in order to be considered equal, both validators MUST be identical in every way, but either or both of them MAY be tagged as "weak" without affecting the result.

An entity tag is strong unless it is explicitly tagged as weak. Section 3.11 gives the syntax for entity tags.
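The two comparison functions can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `EntityTag`, `parse_etag`, `strong_compare` and `weak_compare` are hypothetical, and the parser assumes a single well-formed entity tag of the form `"opaque"` or `W/"opaque"`.

```python
from typing import NamedTuple


class EntityTag(NamedTuple):
    opaque: str  # contents of the quoted string, without the quotes
    weak: bool   # True when the tag carried the W/ prefix


def parse_etag(value: str) -> EntityTag:
    """Parse one entity tag such as "xyzzy" or W/"xyzzy" (hypothetical helper)."""
    value = value.strip()
    weak = value.startswith("W/")
    if weak:
        value = value[2:]
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        raise ValueError(f"not an entity tag: {value!r}")
    return EntityTag(opaque=value[1:-1], weak=weak)


def strong_compare(a: EntityTag, b: EntityTag) -> bool:
    # Equal only if identical in every way and neither tag is weak.
    return not a.weak and not b.weak and a.opaque == b.opaque


def weak_compare(a: EntityTag, b: EntityTag) -> bool:
    # Equal if the opaque parts are identical; the weak flag is ignored.
    return a.opaque == b.opaque
```

A cache would use `weak_compare` for a conditional GET of a full entity and `strong_compare` for anything involving sub-ranges.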

A Last-Modified time, when used as a validator in a request, is implicitly weak unless it is possible to deduce that it is strong, using the following rules:

      - The validator is being compared by an origin server to the actual current validator for the entity, and

      - That origin server reliably knows that the associated entity did not change twice during the second covered by the presented validator.

or

      - The validator is about to be used by a client in an If-Modified-Since or If-Unmodified-Since header, because the client has a cache entry for the associated entity, and

      - That cache entry includes a Date value, which gives the time when the origin server sent the original response, and

      - The presented Last-Modified time is at least 60 seconds before the Date value.

or

      - The validator is being compared by an intermediate cache to the validator stored in its cache entry for the entity, and

      - That cache entry includes a Date value, which gives the time when the origin server sent the original response, and

      - The presented Last-Modified time is at least 60 seconds before the Date value.

This method relies on the fact that if two different responses were sent by the origin server during the same second, but both had the same Last-Modified time, then at least one of those responses would have a Date value equal to its Last-Modified time. The arbitrary 60-second limit guards against the possibility that the Date and Last-Modified values are generated from different clocks, or at somewhat different times during the preparation of the response. An implementation MAY use a value larger than 60 seconds, if it is believed that 60 seconds is too short.
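For the client and intermediate-cache cases, the 60-second rule could be checked along the following lines. This is a sketch under stated assumptions: the function name `last_modified_is_strong` and its `margin_seconds` parameter are hypothetical, both header values are assumed to be well-formed HTTP dates, and the origin-server case (which depends on knowledge only the server has) is not covered.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime
from typing import Optional


def last_modified_is_strong(last_modified: str,
                            date: Optional[str],
                            margin_seconds: int = 60) -> bool:
    """Return True if a stored Last-Modified value may be treated as strong.

    `date` is the Date header stored with the cache entry, or None if the
    entry has no Date value (in which case the validator stays weak).
    """
    if date is None:
        return False
    modified_at = parsedate_to_datetime(last_modified)
    sent_at = parsedate_to_datetime(date)
    # Strong only if the entity was last modified at least
    # `margin_seconds` before the origin server sent the response.
    return sent_at - modified_at >= timedelta(seconds=margin_seconds)
```

An implementation that considers 60 seconds too short can pass a larger `margin_seconds`.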

If a client wishes to perform a sub-range retrieval on a value for which it has only a Last-Modified time and no opaque validator, it MAY do this only if the Last-Modified time is strong in the sense described here.

A cache or origin server receiving a conditional request, other than a full-body GET request, MUST use the strong comparison function to evaluate the condition.

These rules allow HTTP/1.1 caches and clients to safely perform sub-range retrievals on values that have been obtained from HTTP/1.0 servers.

13.3.4 Rules for When to Use Entity Tags and Last-Modified Dates

We adopt a set of rules and recommendations for origin servers, clients, and caches regarding when various validator types ought to be used, and for what purposes.

HTTP/1.1 origin servers:

      - SHOULD send an entity tag validator unless it is not feasible to generate one.

      - MAY send a weak entity tag instead of a strong entity tag, if performance considerations support the use of weak entity tags, or if it is unfeasible to send a strong entity tag.

      - SHOULD send a Last-Modified value if it is feasible to send one, unless the risk of a breakdown in semantic transparency that could result from using this date in an If-Modified-Since header would lead to serious problems.

In other words, the preferred behavior for an HTTP/1.1 origin server is to send both a strong entity tag and a Last-Modified value.

In order to be legal, a strong entity tag MUST change whenever the associated entity value changes in any way. A weak entity tag SHOULD change whenever the associated entity changes in a semantically significant way.

      Note: in order to provide semantically transparent caching, an origin server must avoid reusing a specific strong entity tag value for two different entities, or reusing a specific weak entity tag value for two semantically different entities. Cache entries might persist for arbitrarily long periods, regardless of expiration times, so it might be inappropriate to expect that a cache will never again attempt to validate an entry using a validator that it obtained at some point in the past.

HTTP/1.1 clients:

      - If an entity tag has been provided by the origin server, MUST use that entity tag in any cache-conditional request (using If-Match or If-None-Match).

      - If only a Last-Modified value has been provided by the origin server, SHOULD use that value in non-subrange cache-conditional requests (using If-Modified-Since).

      - If only a Last-Modified value has been provided by an HTTP/1.0 origin server, MAY use that value in subrange cache-conditional requests (using If-Unmodified-Since). The user agent SHOULD provide a way to disable this, in case of difficulty.

      - If both an entity tag and a Last-Modified value have been provided by the origin server, SHOULD use both validators in cache-conditional requests. This allows both HTTP/1.0 and HTTP/1.1 caches to respond appropriately.

An HTTP/1.1 origin server, upon receiving a conditional request that includes both a Last-Modified date (e.g., in an If-Modified-Since or If-Unmodified-Since header field) and one or more entity tags (e.g., in an If-Match, If-None-Match, or If-Range header field) as cache validators, MUST NOT return a response status of 304 (Not Modified) unless doing so is consistent with all of the conditional header fields in the request.

An HTTP/1.1 caching proxy, upon receiving a conditional request that includes both a Last-Modified date and one or more entity tags as cache validators, MUST NOT return a locally cached response to the client unless that cached response is consistent with all of the conditional header fields in the request.

      Note: The general principle behind these rules is that HTTP/1.1 servers and clients should transmit as much non-redundant information as is available in their responses and requests. HTTP/1.1 systems receiving this information will make the most conservative assumptions about the validators they receive.

      HTTP/1.0 clients and caches will ignore entity tags. Generally, last-modified values received or used by these systems will support transparent and efficient caching, and so HTTP/1.1 origin servers should provide Last-Modified values. In those rare cases where the use of a Last-Modified value as a validator by an HTTP/1.0 system could result in a serious problem, then HTTP/1.1 origin servers should not provide one.
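The requirement that an origin server return 304 (Not Modified) only when that is consistent with all conditional header fields might look roughly like this for a full-body GET. The sketch is illustrative only: `may_send_304` is a hypothetical name, request headers are assumed to arrive as a dictionary with lowercase keys, only If-None-Match and If-Modified-Since are considered, and the comma split does not handle entity tags that themselves contain commas.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping


def _opaque(tag: str) -> str:
    """Drop the W/ prefix so that tags are compared with the weak function."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def may_send_304(request_headers: Mapping[str, str],
                 current_etag: str,
                 last_modified: datetime) -> bool:
    """Return True only if every conditional header present permits a 304."""
    if_none_match = request_headers.get("if-none-match")
    if_modified_since = request_headers.get("if-modified-since")
    if if_none_match is None and if_modified_since is None:
        return False  # not a conditional request at all

    if if_none_match is not None:
        tags = [t.strip() for t in if_none_match.split(",")]
        if "*" not in tags and _opaque(current_etag) not in map(_opaque, tags):
            return False  # the entity tag condition says "changed"

    if if_modified_since is not None:
        if last_modified > parsedate_to_datetime(if_modified_since):
            return False  # the date condition says "changed"

    return True
```

If either validator indicates a change, the server falls back to a normal response, which is the conservative behavior the note above describes.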

13.3.5 Non-validating Conditionals

The principle behind entity tags is that only the service author knows the semantics of a resource well enough to select an appropriate cache validation mechanism, and the specification of any validator comparison function more complex than byte-equality would open up a can of worms. Thus, comparisons of any other headers (except Last-Modified, for compatibility with HTTP/1.0) are never used for purposes of validating a cache entry.

13.4 Response Cacheability

Unless specifically constrained by a cache-control (section 14.9) directive, a caching system MAY always store a successful response (see section 13.8) as a cache entry, MAY return it without validation if it is fresh, and MAY return it after successful validation. If there is neither a cache validator nor an explicit expiration time associated with a response, we do not expect it to be cached, but certain caches MAY violate this expectation (for example, when little or no network connectivity is available). A client can usually detect that such a response was taken from a cache by comparing the Date header to the current time.

      Note: some HTTP/1.0 caches are known to violate this expectation without providing any Warning.

However, in some cases it might be inappropriate for a cache to retain an entity, or to return it in response to a subsequent request. This might be because absolute semantic transparency is deemed necessary by the service author, or because of security or privacy considerations. Certain cache-control directives are therefore provided so that the server can indicate that certain resource entities, or portions thereof, are not to be cached regardless of other considerations.

Note that section 14.8 normally prevents a shared cache from saving and returning a response to a previous request if that request included an Authorization header.

A response received with a status code of 200, 203, 206, 300, 301 or 410 MAY be stored by a cache and used in reply to a subsequent request, subject to the expiration mechanism, unless a cache-control directive prohibits caching. However, a cache that does not support the Range and Content-Range headers MUST NOT cache 206 (Partial Content) responses.

A response received with any other status code (e.g. status codes 302 and 307) MUST NOT be returned in a reply to a subsequent request unless there are cache-control directives or another header(s) that explicitly allow it. For example, these include the following: an Expires header (section 14.21); a "max-age", "s-maxage", "must-revalidate", "proxy-revalidate", "public" or "private" cache-control directive (section 14.9).
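The status-code rules in the two preceding paragraphs can be sketched as a single predicate. The code below is an illustration under assumptions: `may_reuse_response` and `cache_control_directives` are hypothetical names, headers are a dictionary with lowercase keys, only `no-store` is treated as a prohibiting directive, and the naive comma split ignores quoted directive arguments.

```python
from typing import Mapping, Set

# Status codes that may be stored and reused unless caching is prohibited.
CACHEABLE_BY_DEFAULT = frozenset({200, 203, 206, 300, 301, 410})

# Directives that explicitly allow reuse of other status codes.
EXPLICIT_DIRECTIVES = frozenset({
    "max-age", "s-maxage", "must-revalidate",
    "proxy-revalidate", "public", "private",
})


def cache_control_directives(headers: Mapping[str, str]) -> Set[str]:
    """Return the lowercase directive names found in Cache-Control."""
    names = set()
    for part in headers.get("cache-control", "").split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def may_reuse_response(status: int,
                       headers: Mapping[str, str],
                       supports_ranges: bool = True) -> bool:
    directives = cache_control_directives(headers)
    if "no-store" in directives:
        return False  # simplification: the only prohibition modeled here
    if status in CACHEABLE_BY_DEFAULT:
        # A cache without Range/Content-Range support must not cache 206.
        return not (status == 206 and not supports_ranges)
    # Any other status code needs an explicit allowance.
    return "expires" in headers or bool(directives & EXPLICIT_DIRECTIVES)
```

Reuse is still subject to the expiration mechanism; this predicate covers only the status-code part of the decision.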

13.5 Constructing Responses From Caches

The purpose of an HTTP cache is to store information received in response to requests for use in responding to future requests. In many cases, a cache simply returns the appropriate parts of a response to the requester. However, if the cache holds a cache entry based on a previous response, it might have to combine parts of a new response with what is held in the cache entry.

13.5.1 End-to-end and Hop-by-hop Headers

For the purpose of defining the behavior of caches and non-caching proxies, we divide HTTP headers into two categories:

      - End-to-end headers, which are transmitted to the ultimate recipient of a request or response. End-to-end headers in responses MUST be stored as part of a cache entry and MUST be transmitted in any response formed from a cache entry.

      - Hop-by-hop headers, which are meaningful only for a single transport-level connection, and are not stored by caches or forwarded by proxies.

The following HTTP/1.1 headers are hop-by-hop headers:

      - Connection

      - Keep-Alive

      - Proxy-Authenticate

      - Proxy-Authorization

      - TE

      - Trailers

      - Transfer-Encoding

      - Upgrade

All other headers defined by HTTP/1.1 are end-to-end headers.

Other hop-by-hop headers MUST be listed in a Connection header (section 14.10) to be introduced into HTTP/1.1 (or later).
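A proxy or cache that follows this division might filter headers as in the sketch below. The function name `end_to_end_headers` is hypothetical, and headers are assumed to be a list of (name, value) pairs; the set of fixed hop-by-hop names is the list given above.

```python
from typing import List, Tuple

# Hop-by-hop headers defined by HTTP/1.1.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})


def end_to_end_headers(headers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return only the headers that may be stored or forwarded."""
    # Any header named in a Connection field is hop-by-hop as well.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(tok.strip().lower()
                          for tok in value.split(",") if tok.strip())
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]
```

For example, a response carrying `Connection: close, X-Custom` loses both the Connection header and the X-Custom header before it is stored.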

13.5.2 Non-modifiable Headers

Some features of the HTTP/1.1 protocol, such as Digest Authentication, depend on the value of certain end-to-end headers. A transparent proxy SHOULD NOT modify an end-to-end header unless the definition of that header requires or specifically allows that.

A transparent proxy MUST NOT modify any of the following fields in a request or response, and it MUST NOT add any of these fields if not already present:

      - Content-Location

      - Content-MD5

      - ETag

      - Last-Modified

A transparent proxy MUST NOT modify any of the following fields in a response:

      - Expires

but it MAY add any of these fields if not already present. If an Expires header is added, it MUST be given a field-value identical to that of the Date header in that response.

A proxy MUST NOT modify or add any of the following fields in a message that contains the no-transform cache-control directive, or in any request:

      - Content-Encoding

      - Content-Range

      - Content-Type

A non-transparent proxy MAY modify or add these fields to a message that does not include no-transform, but if it does so, it MUST add a Warning 214 (Transformation applied) if one does not already appear in the message (see section 14.46).

      Warning: unnecessary modification of end-to-end headers might cause authentication failures if stronger authentication mechanisms are introduced in later versions of HTTP. Such authentication mechanisms MAY rely on the values of header fields not listed here.

The Content-Length field of a request or response is added or deleted according to the rules in section 4.4. A transparent proxy MUST preserve the entity-length (section 7.2.2) of the entity-body, although it MAY change the transfer-length (section 4.4).

13.5.3 Combining Headers

When a cache makes a validating request to a server, and the server provides a 304 (Not Modified) response or a 206 (Partial Content) response, the cache then constructs a response to send to the requesting client.

If the status code is 304 (Not Modified), the cache uses the entity-body stored in the cache entry as the entity-body of this outgoing response. If the status code is 206 (Partial Content) and the ETag or Last-Modified headers match exactly, the cache MAY combine the contents stored in the cache entry with the new contents received in the response and use the result as the entity-body of this outgoing response (see section 13.5.4).

The end-to-end headers stored in the cache entry are used for the constructed response, except that

      - any stored Warning headers with warn-code 1xx (see section 14.46) MUST be deleted from the cache entry and the forwarded response.

      - any stored Warning headers with warn-code 2xx MUST be retained in the cache entry and the forwarded response.

      - any end-to-end headers provided in the 304 or 206 response MUST replace the corresponding headers from the cache entry.

Unless the cache decides to remove the cache entry, it MUST also replace the end-to-end headers stored with the cache entry with corresponding headers received in the incoming response, except for Warning headers as described immediately above. If a header field-name in the incoming response matches more than one header in the cache entry, all such old headers MUST be replaced.

In other words, the set of end-to-end headers received in the incoming response overrides all corresponding end-to-end headers stored with the cache entry (except for stored Warning headers with warn-code 1xx, which are deleted even if not overridden).

      Note: this rule allows an origin server to use a 304 (Not Modified) or a 206 (Partial Content) response to update any header associated with a previous response for the same entity or sub-ranges thereof, although it might not always be meaningful or correct to do so. This rule does not allow an origin server to use a 304 (Not Modified) or a 206 (Partial Content) response to entirely delete a header that it had provided with a previous response.
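The header-combining rules of this section can be illustrated with a short Python sketch. The function name `merge_validation_headers` is hypothetical; both arguments are assumed to be lists of (name, value) pairs that already contain only end-to-end headers, and a Warning value is assumed to begin with its three-digit warn-code.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def merge_validation_headers(stored: List[Header],
                             incoming: List[Header]) -> List[Header]:
    """Combine stored headers with those of a 304 or 206 response."""
    incoming_names = {name.lower() for name, _ in incoming}
    merged = []
    for name, value in stored:
        lname = name.lower()
        if lname == "warning":
            if value.lstrip().startswith("1"):
                continue  # 1xx warn-codes are deleted
            merged.append((name, value))  # 2xx warn-codes are retained
            continue
        if lname in incoming_names:
            continue  # every old header of that name is replaced
        merged.append((name, value))
    merged.extend(incoming)
    return merged
```

Because a stored header is dropped only when the incoming response supplies a header of the same name, a 304 or 206 response can update a header but cannot delete one, as the note above points out.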

13.5.4 Combining Byte Ranges

A response might transfer only a subrange of the bytes of an entity-body, either because the request included one or more Range specifications, or because a connection was broken prematurely. After several such transfers, a cache might have received several ranges of the same entity-body.

If a cache has a stored non-empty set of subranges for an entity, and an incoming response transfers another subrange, the cache MAY combine the new subrange with the existing set if both the following conditions are met:

      - Both the incoming response and the cache entry have a cache validator.

      - The two cache validators match using the strong comparison function (see section 13.3.3).

If either requirement is not met, the cache MUST use only the most recent partial response (based on the Date values transmitted with every response, and using the incoming response if these values are equal or missing), and MUST discard the other partial information.
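The decision described above might be sketched as follows. Everything here is illustrative: `PartialResponse` and `combine_partials` are hypothetical names, ranges are modeled as a dictionary from start offset to bytes, and only entity tags are considered as validators (a strong Last-Modified value, as defined in section 13.3.3, is left out for brevity).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class PartialResponse:
    etag: Optional[str]        # raw ETag value, e.g. '"v1"' or 'W/"v1"'
    date: Optional[datetime]   # Date header value, if present
    ranges: Dict[int, bytes] = field(default_factory=dict)


def strong_match(a: Optional[str], b: Optional[str]) -> bool:
    """Strong comparison: both present, neither weak, identical."""
    if a is None or b is None:
        return False
    if a.startswith("W/") or b.startswith("W/"):
        return False
    return a == b


def combine_partials(stored: PartialResponse,
                     incoming: PartialResponse) -> PartialResponse:
    if strong_match(stored.etag, incoming.etag):
        merged = dict(stored.ranges)
        merged.update(incoming.ranges)
        return PartialResponse(incoming.etag, incoming.date, merged)
    # Otherwise keep only the most recent partial response, preferring
    # the incoming one when the Date values are equal or missing.
    if (stored.date is not None and incoming.date is not None
            and stored.date > incoming.date):
        return stored
    return incoming
```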

13.6 Caching Negotiated Responses

Use of server-driven content negotiation (section 12.1), as indicated by the presence of a Vary header field in a response, alters the conditions and procedure by which a cache can use the response for subsequent requests. See section 14.44 for use of the Vary header field by servers.

A server SHOULD use the Vary header field to inform a cache of what request-header fields were used to select among multiple representations of a cacheable response subject to server-driven negotiation. The set of header fields named by the Vary field value is known as the "selecting" request-headers.

When the cache receives a subsequent request whose Request-URI specifies one or more cache entries including a Vary header field, the cache MUST NOT use such a cache entry to construct a response to the new request unless all of the selecting request-headers present in the new request match the corresponding stored request-headers in the original request.

The selecting request-headers from two requests are defined to match if and only if the selecting request-headers in the first request can be transformed to the selecting request-headers in the second request by adding or removing linear white space (LWS) at places where this is allowed by the corresponding BNF, and/or combining multiple message-header fields with the same field name following the rules about message headers in section 4.2.

A Vary header field-value of "*" always fails to match and subsequent requests on that resource can only be properly interpreted by the origin server.
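A simplified version of this matching rule is sketched below. The names `normalize` and `selecting_headers_match` are hypothetical, request headers are assumed to be a dictionary from lowercase field name to the list of field values received, and the normalization treats every selecting header as a plain comma-separated list, which ignores quoted strings and white space inside list elements.

```python
from typing import List, Mapping


def normalize(values: List[str]) -> str:
    """Combine repeated fields and remove optional white space."""
    elements = []
    for value in values:
        for item in value.split(","):
            item = " ".join(item.split())
            if item:
                elements.append(item)
    return ", ".join(elements)


def selecting_headers_match(vary: str,
                            stored_request: Mapping[str, List[str]],
                            new_request: Mapping[str, List[str]]) -> bool:
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return False  # "*" always fails to match
    for name in names:
        if normalize(stored_request.get(name, [])) != \
                normalize(new_request.get(name, [])):
            return False
    return True
```

With this sketch, a stored request that sent `Accept-Encoding: gzip,deflate` matches a new request that sends the same two codings as separate header fields.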

If the selecting request header fields for the cached entry do not match the selecting request header fields of the new request, then the cache MUST NOT use a cached entry to satisfy the request unless it first relays the new request to the origin server in a conditional request and the server responds with 304 (Not Modified), including an entity tag or Content-Location that indicates the entity to be used.

If an entity tag was assigned to a cached representation, the forwarded request SHOULD be conditional and include the entity tags in an If-None-Match header field from all its cache entries for the resource. This conveys to the server the set of entities currently held by the cache, so that if any one of these entities matches the requested entity, the server can use the ETag header field in its 304 (Not Modified) response to tell the cache which entry is appropriate. If the entity-tag of the new response matches that of an existing entry, the new response SHOULD be used to update the header fields of the existing entry, and the result MUST be returned to the client.

If any of the existing cache entries contains only partial content for the associated entity, its entity-tag SHOULD NOT be included in the If-None-Match header field unless the request is for a range that would be fully satisfied by that entry.

If a cache receives a successful response whose Content-Location field matches that of an existing cache entry for the same Request-URI, whose entity-tag differs from that of the existing entry, and whose Date is more recent than that of the existing entry, the existing entry SHOULD NOT be returned in response to future requests and SHOULD be deleted from the cache.

13.7 Shared and Non-Shared Caches

For reasons of security and privacy, it is necessary to make a distinction between "shared" and "non-shared" caches. A non-shared cache is one that is accessible only to a single user. Accessibility in this case SHOULD be enforced by appropriate security mechanisms. All other caches are considered to be "shared." Other sections of this specification place certain constraints on the operation of shared caches in order to prevent loss of privacy or failure of access controls.

13.8 Errors or Incomplete Response Cache Behavior

A cache that receives an incomplete response (for example, with fewer bytes of data than specified in a Content-Length header) MAY store the response. However, the cache MUST treat this as a partial response. Partial responses MAY be combined as described in section 13.5.4; the result might be a full response or might still be partial. A cache MUST NOT return a partial response to a client without explicitly marking it as such, using the 206 (Partial Content) status code. A cache MUST NOT return a partial response using a status code of 200 (OK).

If a cache receives a 5xx response while attempting to revalidate an entry, it MAY either forward this response to the requesting client, or act as if the server failed to respond. In the latter case, it MAY return a previously received response unless the cached entry includes the "must-revalidate" cache-control directive (see section 14.9).

13.9 Side Effects of GET and HEAD

Unless the origin server explicitly prohibits the caching of their responses, the application of GET and HEAD methods to any resources SHOULD NOT have side effects that would lead to erroneous behavior if these responses are taken from a cache. They MAY still have side effects, but a cache is not required to consider such side effects in its caching decisions. Caches are always expected to observe an origin server's explicit restrictions on caching.

We note one exception to this rule: since some applications have traditionally used GETs and HEADs with query URLs (those containing a "?" in the rel_path part) to perform operations with significant side effects, caches MUST NOT treat responses to such URIs as fresh unless the server provides an explicit expiration time. This specifically means that responses from HTTP/1.0 servers for such URIs SHOULD NOT be taken from a cache. See section 9.1.1 for related information.
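The exception for query URLs could be checked as in the following sketch. The names `has_explicit_expiration` and `may_be_treated_as_fresh` are hypothetical, headers are assumed to be a dictionary with lowercase keys, and "explicit expiration time" is taken here to mean an Expires header or a max-age or s-maxage directive. The function does not compute freshness; it only reports whether this rule forbids treating the response as fresh.

```python
from typing import Mapping


def has_explicit_expiration(headers: Mapping[str, str]) -> bool:
    """True if the response carries Expires, max-age or s-maxage."""
    if "expires" in headers:
        return True
    for part in headers.get("cache-control", "").split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name in ("max-age", "s-maxage"):
            return True
    return False


def may_be_treated_as_fresh(uri: str, headers: Mapping[str, str]) -> bool:
    """Apply the query-URL exception of section 13.9."""
    if "?" not in uri:
        return True  # the exception does not apply
    return has_explicit_expiration(headers)
```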

13.10 Invalidation After Updates or Deletions

The effect of certain methods performed on a resource at the origin server might cause one or more existing cache entries to become non-transparently invalid. That is, although they might continue to be "fresh," they do not accurately reflect what the origin server would return for a new request on that resource.

There is no way for the HTTP protocol to guarantee that all such cache entries are marked invalid. For example, the request that caused the change at the origin server might not have gone through the proxy where a cache entry is stored. However, several rules help reduce the likelihood of erroneous behavior.

In this section, the phrase "invalidate an entity" means that the cache will either remove all instances of that entity from its storage, or will mark these as "invalid" and in need of a mandatory revalidation before they can be returned in response to a subsequent request.

Some HTTP methods MUST cause a cache to invalidate an entity. This is either the entity referred to by the Request-URI, or by the Location or Content-Location headers (if present). These methods are:

      - PUT

      - DELETE

      - POST

In order to prevent denial of service attacks, an invalidation based on the URI in a Location or Content-Location header MUST only be performed if the host part is the same as in the Request-URI.

A cache that passes through requests for methods it does not understand SHOULD invalidate any entities referred to by the Request-URI.
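The invalidation rules of this section can be summarized in a small Python sketch. The name `uris_to_invalidate` is hypothetical, the Request-URI is assumed to be absolute, response headers are a dictionary with lowercase keys, and the set of methods the cache "understands" without invalidating (GET, HEAD, OPTIONS, TRACE) is an assumption made for illustration.

```python
from typing import Mapping, Set
from urllib.parse import urljoin, urlsplit

INVALIDATING = frozenset({"PUT", "DELETE", "POST"})
UNDERSTOOD_SAFE = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})  # assumption


def uris_to_invalidate(method: str,
                       request_uri: str,
                       response_headers: Mapping[str, str]) -> Set[str]:
    """Return the set of URIs whose cache entries should be invalidated."""
    method = method.upper()
    if method in UNDERSTOOD_SAFE:
        return set()
    targets = {request_uri}
    if method not in INVALIDATING:
        # A method the cache does not understand: Request-URI only.
        return targets
    request_host = urlsplit(request_uri).netloc.lower()
    for name in ("location", "content-location"):
        value = response_headers.get(name)
        if not value:
            continue
        absolute = urljoin(request_uri, value)
        # Guard against denial of service: same host part only.
        if urlsplit(absolute).netloc.lower() == request_host:
            targets.add(absolute)
    return targets
```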

13.11 Write-Through Mandatory

All methods that might be expected to cause modifications to the origin server's resources MUST be written through to the origin server. This currently includes all methods except for GET and HEAD. A cache MUST NOT reply to such a request from a client before having transmitted the request to the inbound server, and having received a corresponding response from the inbound server. This does not prevent a proxy cache from sending a 100 (Continue) response before the inbound server has sent its final reply.

The alternative (known as "write-back" or "copy-back" caching) is not allowed in HTTP/1.1, due to the difficulty of providing consistent updates and the problems arising from server, cache, or network failure prior to write-back.

13.12 Cache Replacement

If a new cacheable (see sections 14.9.2, 13.2.5, 13.2.6 and 13.8) response is received from a resource while any existing responses for the same resource are cached, the cache SHOULD use the new response to reply to the current request. It MAY insert it into cache storage and MAY, if it meets all other requirements, use it to respond to any future requests that would previously have caused the old response to be returned. If it inserts the new response into cache storage the rules in section 13.5.3 apply.

      Note: a new response that has an older Date header value than existing cached responses is not cacheable.

13.13 History Lists

User agents often have history mechanisms, such as "Back" buttons and history lists, which can be used to redisplay an entity retrieved earlier in a session.

History mechanisms and caches are different. In particular history mechanisms SHOULD NOT try to show a semantically transparent view of the current state of a resource. Rather, a history mechanism is meant to show exactly what the user saw at the time when the resource was retrieved.

By default, an expiration time does not apply to history mechanisms. If the entity is still in storage, a history mechanism SHOULD display it even if the entity has expired, unless the user has specifically configured the agent to refresh expired history documents.

This is not to be construed to prohibit the history mechanism from telling the user that a view might be stale.

      Note: if history list mechanisms unnecessarily prevent users from viewing stale resources, this will tend to force service authors to avoid using HTTP expiration controls and cache controls when they would otherwise like to. Service authors may consider it important that users not be presented with error messages or warning messages when they use navigation controls (such as BACK) to view previously fetched resources. Even though sometimes such resources ought not to be cached, or ought to expire quickly, user interface considerations may force service authors to resort to other means of preventing caching (e.g. "once-only" URLs) in order not to suffer the effects of improperly functioning history mechanisms.

Saturday, December 11, 2010

Stress Testing an Apache Application Server in a Real World Environment

We've all had an experience in which the software is installed on the servers, the network is connected and the application is running. Naturally, the next step is to think, "I wonder how much traffic this system can support?" Sometimes the question lingers and sometimes it passes, but it always presents itself. So, how do we figure out how much traffic our server and application can handle? Can it handle only a few active clients or can it withstand a proper Slashdotting? To appreciate fully the challenges one faces in trying to answer these questions, we must first understand the dynamic application and how it works.

A traditional dynamic application has five main components: the application server, the database server, the application, the database and the network. In the open-source world, the application server usually is Apache. And, often, Apache is running on Linux.

The database server can be almost anything that can do the job; for most smaller applications, this tends to be MySQL. In this article, I highlight the open-source PostgreSQL server, which also runs on Linux.

The application itself can be almost anything that fits the project requirements. Sometimes it makes sense to use Perl, sometimes PHP, sometimes Java. It is beyond the scope of this article to determine the benefits or liability of a particular platform, but a firm understanding of the best tool for the job is necessary to plan properly for adequate performance in a running application.

The database itself can mean the difference between a maximum load of one user and 5,000 users. A bad schema can be the death of an application, while a good schema can make up for a multitude of other shortcomings.

The network tends to be the forgotten part of the equation, but it can be as detrimental as bad application code or a bad schema. A noisy network can slow intra-server communications dramatically. It also can introduce errors and other unknowns into communications that, in turn, have unknown results on the running code.

As you have probably guessed, finding where our optimal performance lies and pushing those limits is more than a minor challenge. As with a Formula One race car that runs with almost absolute technical efficiency, the five main components of the Web-based application determine whether the system can handle its load optimally. By looking at those components and measuring how they react under certain circumstances, we can use that data to better tune the system as a whole.

Introduction to Testing

To begin the testing, we need to create an environment that facilitates micro-management of the five components. Because most enterprise-class applications are based on large proprietary hardware configurations, setting up a testing configuration often is prohibitive in cost. One advantage of the open-source model, however, is that many configurations are based on commodity hardware. The commodity hardware configuration, therefore, is the basic assumption used throughout the testing setup. This is not to say that a setup based on large proprietary hardware is not as valid or that the methods outlined are not compatible; it simply is more expensive.

We first need to set up a testing network. For this we use three computers on a private network segment. The systems should be exact replicas of the servers going into production or ones that already exist in the production environment. This, in a simple sense, accounts for the application/Web server and the database server, with the third system being a traffic generator and monitor. These three computers are connected through a hub for testing, because the shared nature of the hub facilitates monitoring network traffic. A better but more expensive solution would replace the hub with a switch and introduce an Ethernet tap into the configuration. The testing network we use, though, is a fairly accurate representation of the network topology that exists in the DMZ or behind the firewall of a live network.

Accurately monitoring the activity of the network and the systems involved in serving the applications requires some software, the first of which is the operating system. In this article, I use Red Hat 7.3, although there are few Red Hat-isms that are specific to these setups and tests. To get the best performance from the server machines, it is a good idea to make sure only the most necessary services are running. On the application server, this list includes Apache and SSH (if necessary); on the database server the list normally includes PostgreSQL and SSH (again, if necessary). As a general preference, I like to make sure all critical services, including Apache, PostgreSQL and the kernel itself, are compiled from source. The benefit of doing this is that only the necessary options are activated and nothing extraneous is taking up critical memory or processor time.

On the application and database servers, a necessary component is the sysstat package. This package normally is installed on the default Red Hat installation. For other distributions, the sysstat package can be downloaded and compiled from source. Sysstat is a good monitoring tool for most activities, as it can display at a glance almost all of the relevant information about a running system, including network activity, system loads and much more. This package works by polling data at specified intervals and is useful for general system monitoring. For our tests, we run sysstat in a more active mode, from the command line--a topic discussed in more depth later in this article.

It is a good idea to be familiar with the tools collected in the sysstat package, especially the sar and sadc programs. The man pages for both of these programs provide a wealth of details. One of the limitations of the sysstat package is that it has a minimum data sampling interval of one second. In my experience with this type of testing, a one-second sample is adequate for assessing where problems begin to creep into the configuration.

As we move to a different testing tool, we also are moving to a different portion of our testing network, the network itself. One of the best tools for this task is tcpdump. Tcpdump is a general purpose network data collection tool and, like sysstat, is available in binary form for most distributions, as well as in source code from www.tcpdump.org.

About now you may be asking why we are looking at raw network data. On occasion, I have seen errors introduced into the communications between servers. For instance, sometimes data packets can become mangled in transit. Raw network data, then, is a great resource to refer back to in the event of a problem that cannot be diagnosed easily.

Tcpdump could be an article unto itself due to the depth and complexity of the subject of networking as well as the program itself. Specific usage examples follow in the next section, in which the actual testing procedure is explained. For now, tcpdump should be installed on our traffic generator system.

The last major component we need for our testing is a piece of software named flood, which is written by the Apache Group and available at www.apache.org. Flood still is considered alpha software and, therefore, is not well documented. On-line support also is limited, as few people seem to use it.

To begin, we need to download the flood source. We can get the source from here. A nice and simple document on how to build the flood source can be found there as well. If the Web application to be tested runs over https, reading this document is a must.

In its simplest form, the method to build the software is:

     tar -zxvf flood-0.x.tar.gz
     cd flood-0.x/
     ./buildconf
     ./configure --disable-shared
     make all

Flood is run from its source directory using the newly created ./flood executable.

The "./flood" syntax is quite simple. It generally follows the format:

     ./flood configuration-file > output.file

The configuration file is where the real work and power of flood is revealed, and several example files are provided in the ./examples directory in the flood source. It is a good idea to have a working knowledge of their construction, as well as some knowledge of XML. See Listing 1 for an example configuration file.

Listing 1. Example Configuration File

The general form of the configuration file is:

The <urllist> section is where the specific URLs are placed that flood uses to step through and access the application. Due to the way flood processes these URLs under certain configurations, it is possible to simulate a complete session a visitor may make to the Web application.

The <profile> section is where specifics are set about how the file should be processed as well as which URLs should be used. This section uses several tags to define the behavior of the flood process. They are:

These seven tags are relatively well defined in the configuration file examples. The other main sections--farmer, farm and seed--set the parameters of how many times to run through the list, how often and the seed number for easy test duplication.

A real world note about flood from my own experience: if the application has rigidly defined URLs that reference individual pages, the stock flood report is useful with little modification. If, however, the Web application uses a few pages that refresh depending on variables and change accordingly, as is the case with most dynamic Web applications, flood results can be difficult to use. In the latter case, flood's primary usefulness comes in the scripting of traffic to a test environment for the purpose of simulating traffic. It is important to understand the benefits and the shortcomings of any applications being used; testing a Web application is no different.

Testing

The actual testing of the systems is similar to a ballet in terms of the level of choreography necessary to make everything run in concert. The most essential step is time synchronization. Having all machine times set as close as possible to one another is imperative; without this simple step, it is impossible to correlate actions with events. Setting accurate time across our testing network should be our first task in beginning testing.

The second task for testing should be to create our flood configuration file. There are many ways to create the flood configuration file, but one method that creates some usable results is to parse the production Web server's access logs. A simple Perl script can be created to parse the log file and output the correct format, an XML configuration file. This method also is one of the easiest ways to create scripted sessions that reflect actual system usage the way a real visitor would use it.
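The article builds its configuration file with a Perl script that is not shown. As a rough illustration of the same idea, here is a minimal Python sketch that pulls GET requests out of access-log lines and emits a URL list. The function name, the `base_url` parameter and the regular expression are hypothetical, and the `<urllist>`, `<name>` and `<url>` element names are an assumption modeled on the example files shipped in flood's ./examples directory, so check them against those files before use.

```python
import re
from xml.sax.saxutils import escape

# Matches the request portion of a Common Log Format line, GET requests only.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def access_log_to_urllist(lines, base_url, name="Replayed session"):
    """Turn access-log lines into a flood-style <urllist> XML fragment.

    Requests are kept in log order so the list replays a session the way
    a visitor actually stepped through the application.
    """
    prefix = base_url.rstrip("/")
    urls = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            urls.append(prefix + match.group(1))

    parts = ["<urllist>", "    <name>%s</name>" % escape(name)]
    # escape() makes query strings containing '&' safe inside XML.
    parts.extend("    <url>%s</url>" % escape(url) for url in urls)
    parts.append("</urllist>")
    return "\n".join(parts)
```

The fragment still has to be placed inside a complete configuration file alongside the profile, farmer, farm and seed sections described earlier.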

The third task we need to perform is setting up the system monitoring on the application and database servers. As described above, we use the sysstat package to monitor these systems. The program sadc is the back end process for collecting the data, and the simplest form for setting up the sysstat monitoring is:

     /usr/lib/sa/sadc 1 [# of seconds to report] outfile

As is probably obvious, it is important to capture enough data to encompass the entire duration of the testing. The above command should be started on both testing systems used to serve data, for example, the Web and database servers.
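One way to avoid under-sizing the capture is to derive the sample count from the planned test length. The sketch below is only an illustration: the helper name and the 60-second safety margin are assumptions, and the sadc path is the one shown in the command above.

```python
import math


def sadc_command(test_seconds, interval=1, margin_seconds=60,
                 outfile="outfile", sadc_path="/usr/lib/sa/sadc"):
    """Build the sadc argument list for a test of the given length.

    The sample count covers the test plus a safety margin, so data
    collection does not stop before the traffic generator does.
    """
    count = math.ceil((test_seconds + margin_seconds) / interval)
    return [sadc_path, str(interval), str(count), outfile]
```

For a ten-minute test at one-second sampling, `sadc_command(600)` asks for 660 samples; the resulting list can be handed to `subprocess.run` on each server.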

The fourth step should be to start up tcpdump monitoring on the traffic generation/monitoring system. The easiest way to do this is to issue the command tcpdump -w outfile. This command outputs all network data to the outfile specified in a format easily loadable into an analysis tool, such as Ethereal.

Now that all of our monitoring is set up and running on the appropriate systems, the last part is to begin actual traffic generation by starting flood. In this stage, it is a good idea to start slow with little traffic and increment the volume up at a consistent pace until the limits of the server are reached.

In the previous two sections, we looked at the setup and the actual testing on our test network, but we have not looked at the data our software generates. For sample data, please see Listing 2.

Listing 2. Sample Data from Testing Software

To utilize the generated data, we go back to our old friend, Perl. For both tcpdump and flood, the individual data is measured in utime and easily can be compared and analyzed based on the reported times. The raw output of the flood report is:

     Absolute time started
     relative time (to first column) to open the socket
     relative time to write to the socket
     relative time to read the socket
     relative time to close the socket
     OK or FAIL notification
     the thread or PID of the farmer making the request
     the URL of the target without query strings

Some of the example report processing scripts included in the flood source turn the raw data into a simple yet readable output. Either by using these scripts as they are or by using them as a starting point to build a different report, it is possible to glean some essential data from the flood report. One method I have used to identify trends in the data quickly is to run the raw flood output through a Perl script that translates the utime values to a more "readable" number by dividing them by one million. This modified output then is passed to a GNUplot script (see Listing 3), which creates a nice graphic where trends can be seen at a glance. It then is trivial to match up which offending activity happened at what time and to see across the entire network what was going on with all systems at that moment. Once the offending activity is determined, it is quite possible to adjust the systems to correct the problem and then retest using the same method.
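The conversion step can be sketched in a few lines. This Python version stands in for the Perl script, which is not reproduced here, and it assumes each raw report line holds the eight fields listed above, separated by whitespace, with times in microseconds. The `FloodSample` type and its field names are hypothetical.

```python
from typing import NamedTuple

MICROSECONDS_PER_SECOND = 1_000_000


class FloodSample(NamedTuple):
    started: float   # absolute start time, in seconds
    connect: float   # seconds from start until the socket was open
    write: float     # seconds from start until the request was written
    read: float      # seconds from start until the response was read
    close: float     # seconds from start until the socket was closed
    ok: bool         # True for OK, False for FAIL
    worker: str      # thread or PID of the farmer
    url: str         # target URL without query string


def parse_flood_line(line: str) -> FloodSample:
    """Convert one raw flood report line to seconds-based values."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected 8 fields, got %d" % len(fields))
    # Divide the five timing columns by one million, as described above.
    times = [int(value) / MICROSECONDS_PER_SECOND for value in fields[:5]]
    return FloodSample(*times, fields[5] == "OK", fields[6], fields[7])
```

Writing `started` and `close` for each sample to a two-column text file gives a data set that a plotting script can consume directly.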

Listing 3. Modified Output Sent to GNUplot Script

The last item I want to address is the tcpdump data. The easiest method of working with tcpdump files is to use Ethereal. Ethereal is a graphical interface that loads all of the tcpdump data into an easy-to-read format. Its best feature, however, is its ability to trace or follow an individual connection--very handy in tracking down problematic connections.
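When only the packet timing is needed, for example to line network activity up against the flood timestamps, the capture file can also be read without a graphical tool. The following is a minimal sketch, not a replacement for Ethereal: it assumes the file is in the classic libpcap format with microsecond timestamps, and the function name is hypothetical.

```python
import struct
from typing import BinaryIO, Iterator, Tuple

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic libpcap magic, microsecond timestamps


def read_pcap_timestamps(stream: BinaryIO) -> Iterator[Tuple[float, int]]:
    """Yield (timestamp_in_seconds, captured_length) for each packet."""
    header = stream.read(24)  # the global header is always 24 bytes
    if len(header) < 24:
        raise ValueError("file too short for a pcap global header")

    # The magic number also tells us the byte order of the file.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC_USEC:
        endian = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC_USEC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")

    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            break
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", record)
        stream.read(incl_len)  # skip the packet bytes themselves
        yield ts_sec + ts_usec / 1_000_000, incl_len
```

Opening the capture with `open("outfile", "rb")` and passing the file object in yields one timestamp per packet, in the same seconds-since-epoch scale as the converted flood times.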

Conclusions and Recommendations

Every Web-based application is different from every other one, and no two pieces of hardware are exactly the same when running these types of applications. It is difficult to say exactly where problems might arise or where things might break. Stress testing requires an intimate knowledge of the software, the systems and the network that encompass the operating environment. These are the truisms of this type of activity, and although the challenges and learning curve are daunting, it is well worth the effort.

Stress testing requires a degree of patience, as rushing the testing can result in collecting bad data and/or ambiguous results. Always take the time to understand fully the results of the previous test before continuing on to the next round.

Drawing on my own experience in these types of tests and the resulting system tuning, I have reached these conclusions about dynamic Web-based application performance. Whenever possible:

Separate the application from the db.
Use as many diverse data channels as possible (i.e., separate drives for data and system on separate channels or controllers).
Use as good a machine as is practical.
Databases are memory hungry--feed them.
Understand relational database theory and the five normal forms.
Understand good development practices and follow them.
RAID 5 sounds like a great idea until a database lives on it and that database liberally uses INSERTS and UPDATES. If you need hardware redundancy, there are more database-friendly ways to accomplish it.
If it sounds like a great idea to put lots of XML into a db and let the front end parse it out, think again.
Remember that your servers can communicate only as fast as the network goes. Use good networking components and cables.

I have given you a brief overview of how to stress test Web application systems, as well as some of the tools to use. Now it's your turn to set up everything and use what you have learned. Remember to be creative and don't be afraid to hunt down new or better tools to do the job. The better your information, the better you can understand how to answer the questions listed at the beginning of this article.

Saturday, December 11, 2010

Stress Testing an Apache Application Server in a Real World Environment
