Caching in HTTP / RFC 2068

   HTTP is typically used for distributed information systems, where
   performance can be improved by the use of response caches. The
   HTTP/1.1 protocol includes a number of elements intended to make
   caching work as well as possible. Because these elements are
   inextricable from other aspects of the protocol, and because they
   interact with each other, it is useful to describe the basic caching
   design of HTTP separately from the detailed descriptions of methods,
   headers, response codes, etc.

   Caching would be useless if it did not significantly improve
   performance. The goal of caching in HTTP/1.1 is to eliminate the need
   to send requests in many cases, and to eliminate the need to send
   full responses in many other cases. The former reduces the number of
   network round-trips required for many operations; we use an
   "expiration" mechanism for this purpose (see section 13.2). The
   latter reduces network bandwidth requirements; we use a "validation"
   mechanism for this purpose (see section 13.3).

   Requirements for performance, availability, and disconnected
   operation require us to be able to relax the goal of semantic
   transparency. The HTTP/1.1 protocol allows origin servers, caches,
   and clients to explicitly reduce transparency when necessary.
   However, because non-transparent operation may confuse non-expert
   users, and may be incompatible with certain server applications (such
   as those for ordering merchandise), the protocol requires that
   transparency be relaxed

  o  only by an explicit protocol-level request when relaxed by client
     or origin server

  o  only with an explicit warning to the end user when relaxed by cache
     or client

   Therefore, the HTTP/1.1 protocol provides these important elements:

  1. Protocol features that provide full semantic transparency when this
     is required by all parties.

  2. Protocol features that allow an origin server or user agent to
     explicitly request and control non-transparent operation.

  3. Protocol features that allow a cache to attach warnings to
     responses that do not preserve the requested approximation of
     semantic transparency.

   A basic principle is that it must be possible for the clients to
   detect any potential relaxation of semantic transparency.

     Note: The server, cache, or client implementer may be faced with
     design decisions not explicitly discussed in this specification. If
     a decision may affect semantic transparency, the implementer ought
     to err on the side of maintaining transparency unless a careful and
     complete analysis shows significant benefits in breaking
     transparency.

13.1.1 Cache Correctness

   A correct cache MUST respond to a request with the most up-to-date
   response held by the cache that is appropriate to the request (see
   sections 13.2.5, 13.2.6, and 13.12) which meets one of the following
   conditions:

  1. It has been checked for equivalence with what the origin server
     would have returned by revalidating the response with the origin
     server (section 13.3);

  2. It is "fresh enough" (see section 13.2). In the default case, this
     means it meets the least restrictive freshness requirement of the
     client, server, and cache (see section 14.9); if the origin server
     so specifies, it is the freshness requirement of the origin server
     alone.

  3. It includes a warning if the freshness demand of the client or the
     origin server is violated (see sections 13.1.5 and 14.45).

  4. It is an appropriate 304 (Not Modified), 305 (Proxy Redirect), or
     error (4xx or 5xx) response message.

   If the cache cannot communicate with the origin server, then a
   correct cache SHOULD respond as above if the response can be
   correctly served from the cache; if not, it MUST return an error or
   warning indicating that there was a communication failure.

   If a cache receives a response (either an entire response, or a 304
   (Not Modified) response) that it would normally forward to the
   requesting client, and the received response is no longer fresh, the
   cache SHOULD forward it to the requesting client without adding a new
   Warning (but without removing any existing Warning headers). A cache
   SHOULD NOT attempt to revalidate a response simply because that
   response became stale in transit; this might lead to an infinite
   loop. A user agent that receives a stale response without a Warning
   MAY display a warning indication to the user.

13.1.2 Warnings

   Whenever a cache returns a response that is neither first-hand nor
   "fresh enough" (in the sense of condition 2 in section 13.1.1), it
   must attach a warning to that effect, using a Warning response-
   header. This warning allows clients to take appropriate action.

   Warnings may be used for other purposes, both cache-related and
   otherwise. The use of a warning, rather than an error status code,
   distinguishes these responses from true failures.

   Warnings are always cachable, because they never weaken the
   transparency of a response. This means that warnings can be passed to
   HTTP/1.0 caches without danger; such caches will simply pass the
   warning along as an entity-header in the response.

   Warnings are assigned numbers between 0 and 99. This specification
   defines the code number and meaning of each currently assigned
   warning, allowing a client or cache to take automated action in some
   (but not all) cases.

   Warnings also carry a warning text. The text may be in any
   appropriate natural language (perhaps based on the client's Accept
   headers), and may include an indication of which character set is
   used.

   Multiple warnings may be attached to a response (either by the origin
   server or by a cache), including multiple warnings with the same code
   number. For example, a server may provide the same warning with texts
   in both English and Basque.

   When multiple warnings are attached to a response, it may not be
   practical or reasonable to display all of them to the user. This
   version of HTTP does not specify strict priority rules for deciding
   which warnings to display and in what order, but does suggest some
   heuristics.

   The Warning header and the currently defined warnings are described
   in section 14.45.
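
   As a rough illustration of attaching several warnings to one
   response, the sketch below formats Warning header values. It assumes
   the layout given in section 14.45 (a two-digit code, the agent that
   added the warning, and a quoted text), which is not reproduced in
   this section. The agent name "cache.example" and the helper names
   are hypothetical, and quoting of special characters in the text is
   deliberately left out.

```python
def format_warning(code, agent, text):
    """Format one warning value: code, agent, quoted text (assumed layout)."""
    if not 0 <= code <= 99:
        raise ValueError("warning codes are numbers between 0 and 99")
    # Simplification: the text is assumed to contain no double quotes.
    return f'{code:02d} {agent} "{text}"'


def warning_header(warnings):
    """Join several (code, agent, text) tuples into one header value."""
    return ", ".join(format_warning(*w) for w in warnings)


# Two warnings with the same code number are allowed on one response.
value = warning_header([
    (10, "cache.example", "Response is stale"),
    (10, "cache.example", "Erantzuna zaharkituta dago"),
])
print("Warning: " + value)
```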

13.1.3 Cache-control Mechanisms

   The basic cache mechanisms in HTTP/1.1 (server-specified expiration
   times and validators) are implicit directives to caches. In some
   cases, a server or client may need to provide explicit directives to
   the HTTP caches. We use the Cache-Control header for this purpose.

   The Cache-Control header allows a client or server to transmit a
   variety of directives in either requests or responses. These
   directives typically override the default caching algorithms. As a
   general rule, if there is any apparent conflict between header
   values, the most restrictive interpretation should be applied (that
   is, the one that is most likely to preserve semantic transparency).
   However, in some cases, Cache-Control directives are explicitly
   specified as weakening the approximation of semantic transparency
   (for example, "max-stale" or "public").

   The Cache-Control directives are described in detail in section 14.9.
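
   To make the idea of explicit directives concrete, here is a minimal
   sketch that splits a Cache-Control header value into its directives.
   The function name is hypothetical, and the parser is simplified: it
   assumes no quoted value contains a comma.

```python
def parse_cache_control(header_value):
    """Return a dict mapping lowercased directive names to their values.

    Directives that carry no value (for example "public") map to None.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("max-stale=3600, public"))
```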

13.1.4 Explicit User Agent Warnings

   Many user agents make it possible for users to override the basic
   caching mechanisms. For example, the user agent may allow the user to
   specify that cached entities (even explicitly stale ones) are never
   validated. Or the user agent might habitually add "Cache-Control:
   max-stale=3600" to every request. The user should have to explicitly
   request either non-transparent behavior, or behavior that results in
   abnormally ineffective caching.

   If the user has overridden the basic caching mechanisms, the user
   agent should explicitly indicate to the user whenever this results in
   the display of information that might not meet the server's
   transparency requirements (in particular, if the displayed entity is
   known to be stale). Since the protocol normally allows the user agent
   to determine if responses are stale or not, this indication need only
   be displayed when this actually happens. The indication need not be a
   dialog box; it could be an icon (for example, a picture of a rotting
   fish) or some other visual indicator.

   If the user has overridden the caching mechanisms in a way that would
   abnormally reduce the effectiveness of caches, the user agent should
   continually display an indication (for example, a picture of currency
   in flames) so that the user does not inadvertently consume excess
   resources or suffer from excessive latency.

13.1.5 Exceptions to the Rules and Warnings

   In some cases, the operator of a cache may choose to configure it to
   return stale responses even when not requested by clients. This
   decision should not be made lightly, but may be necessary for reasons
   of availability or performance, especially when the cache is poorly
   connected to the origin server. Whenever a cache returns a stale
   response, it MUST mark it as such (using a Warning header). This
   allows the client software to alert the user that there may be a
   potential problem.

   It also allows the user agent to take steps to obtain a first-hand or
   fresh response. For this reason, a cache SHOULD NOT return a stale
   response if the client explicitly requests a first-hand or fresh one,
   unless it is impossible to comply for technical or policy reasons.

13.1.6 Client-controlled Behavior

   While the origin server (and to a lesser extent, intermediate caches,
   by their contribution to the age of a response) is the primary
   source of expiration information, in some cases the client may need
   to control a cache's decision about whether to return a cached
   response without validating it. Clients do this using several
   directives of the Cache-Control header.

   A client's request may specify the maximum age it is willing to
   accept of an unvalidated response; specifying a value of zero forces
   the cache(s) to revalidate all responses. A client may also specify
   the minimum time remaining before a response expires. Both of these
   options increase constraints on the behavior of caches, and so cannot
   further relax the cache's approximation of semantic transparency.

   A client may also specify that it will accept stale responses, up to
   some maximum amount of staleness. This loosens the constraints on the
   caches, and so may violate the origin server's specified constraints
   on semantic transparency, but may be necessary to support
   disconnected operation, or high availability in the face of poor
   connectivity.
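
   The three client options above can be sketched as a single decision
   function. This is an illustration only: the function and parameter
   names are hypothetical, all values are in seconds, and the treatment
   of boundary cases (for example, an age exactly equal to the maximum
   age) is an assumption of this sketch rather than something stated in
   this section.

```python
def may_serve_without_validation(current_age, freshness_lifetime,
                                 max_age=None, min_fresh=None, max_stale=None):
    """Decide whether a cached response may be returned unvalidated.

    A parameter left as None means the client did not send that directive.
    """
    # Maximum age: a value of zero forces revalidation of every response.
    if max_age is not None and current_age >= max_age:
        return False
    remaining = freshness_lifetime - current_age  # freshness left, in seconds
    # Minimum time remaining before the response expires.
    if min_fresh is not None and remaining < min_fresh:
        return False
    if remaining > 0:
        return True
    # The response is stale: acceptable only within the staleness the
    # client has said it will tolerate.
    return max_stale is not None and -remaining <= max_stale
```

   The first two checks can only tighten the cache's behavior; only the
   final line loosens it, which matches the distinction drawn above.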

13.2 Expiration Model

13.2.1 Server-Specified Expiration

   HTTP caching works best when caches can entirely avoid making
   requests to the origin server. The primary mechanism for avoiding
   requests is for an origin server to provide an explicit expiration
   time in the future, indicating that a response may be used to satisfy
   subsequent requests.  In other words, a cache can return a fresh
   response without first contacting the server.

   Our expectation is that servers will assign future explicit
   expiration times to responses in the belief that the entity is not
   likely to change, in a semantically significant way, before the
   expiration time is reached. This normally preserves semantic
   transparency, as long as the server's expiration times are carefully
   chosen.

   The expiration mechanism applies only to responses taken from a cache
   and not to first-hand responses forwarded immediately to the
   requesting client.

   If an origin server wishes to force a semantically transparent cache
   to validate every request, it may assign an explicit expiration time
   in the past. This means that the response is always stale, and so the
   cache SHOULD validate it before using it for subsequent requests. See
   section 14.9.4 for a more restrictive way to force revalidation.

   If an origin server wishes to force any HTTP/1.1 cache, no matter how
   it is configured, to validate every request, it should use the
   "must-revalidate" Cache-Control directive (see section 14.9).

   Servers specify explicit expiration times using either the Expires
   header, or the max-age directive of the Cache-Control header.
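
   As a small sketch of how a server might emit these headers, the
   function below builds Date, Expires, and Cache-Control values for a
   response. The function name and the choice to send both forms at once
   are assumptions made for illustration; timestamps are seconds since
   the Unix epoch.

```python
from email.utils import formatdate


def expiration_headers(generated_at, lifetime_seconds):
    """Build headers giving a response an explicit expiration time."""
    return {
        "Date": formatdate(generated_at, usegmt=True),
        "Expires": formatdate(generated_at + lifetime_seconds, usegmt=True),
        "Cache-Control": f"max-age={lifetime_seconds}",
    }


for name, value in expiration_headers(0, 3600).items():
    print(f"{name}: {value}")
```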

   An expiration time cannot be used to force a user agent to refresh
   its display or reload a resource; its semantics apply only to caching
   mechanisms, and such mechanisms need only check a resource's
   expiration status when a new request for that resource is initiated.
   See section 13.13 for explanation of the difference between caches
   and history mechanisms.

13.2.2 Heuristic Expiration

   Since origin servers do not always provide explicit expiration times,
   HTTP caches typically assign heuristic expiration times, employing
   algorithms that use other header values (such as the Last-Modified
   time) to estimate a plausible expiration time. The HTTP/1.1
   specification does not provide specific algorithms, but does impose
   worst-case constraints on their results. Since heuristic expiration
   times may compromise semantic transparency, they should be used
   cautiously, and we encourage origin servers to provide explicit
   expiration times as much as possible.

13.2.3 Age Calculations

   In order to know if a cached entry is fresh, a cache needs to know if
   its age exceeds its freshness lifetime. We discuss how to calculate
   the latter in section 13.2.4; this section describes how to calculate
   the age of a response or cache entry.

   In this discussion, we use the term "now" to mean "the current value
   of the clock at the host performing the calculation." Hosts that use
   HTTP, but especially hosts running origin servers and caches, should
   use NTP [28] or some similar protocol to synchronize their clocks to
   a globally accurate time standard.

   Also note that HTTP/1.1 requires origin servers to send a Date header
   with every response, giving the time at which the response was
   generated. We use the term "date_value" to denote the value of the
   Date header, in a form appropriate for arithmetic operations.

   HTTP/1.1 uses the Age response-header to help convey age information
   between caches. The Age header value is the sender's estimate of the
   amount of time since the response was generated at the origin server.
   In the case of a cached response that has been revalidated with the
   origin server, the Age value is based on the time of revalidation,
   not of the original response.

   In essence, the Age value is the sum of the time that the response
   has been resident in each of the caches along the path from the
   origin server, plus the amount of time it has been in transit along
   network paths.

   We use the term "age_value" to denote the value of the Age header, in
   a form appropriate for arithmetic operations.

   A response's age can be calculated in two entirely independent ways:

     1. now minus date_value, if the local clock is reasonably well
        synchronized to the origin server's clock. If the result is
        negative, the result is replaced by zero.

     2. age_value, if all of the caches along the response path
        implement HTTP/1.1.

   Given that we have two independent ways to compute the age of a
   response when it is received, we can combine these as

          corrected_received_age = max(now - date_value, age_value)

   and as long as we have either nearly synchronized clocks or
   all-HTTP/1.1 paths, we get a reliable (conservative) result.

   Note that this correction is applied at each HTTP/1.1 cache along the
   path, so that if there is an HTTP/1.0 cache in the path, the correct
   received age is computed as long as the receiving cache's clock is
   nearly in sync. We don't need end-to-end clock synchronization
   (although it is good to have), and there is no explicit clock
   synchronization step.

   Because of network-imposed delays, some significant interval may pass
   between the time that a server generates a response and the time it is
   received at the next outbound cache or client. If uncorrected, this
   delay could result in improperly low ages.

   Because the request that resulted in the returned Age value must have
   been initiated prior to that Age value's generation, we can correct
   for delays imposed by the network by recording the time at which the
   request was initiated. Then, when an Age value is received, it MUST
   be interpreted relative to the time the request was initiated, not
   the time that the response was received. This algorithm results in
   conservative behavior no matter how much delay is experienced. So, we
   compute:

         corrected_initial_age = corrected_received_age
                               + (now - request_time)

   where "request_time" is the time (according to the local clock) when
   the request that elicited this response was sent.

   Summary of age calculation algorithm, when a cache receives a
   response:

      /*
       * age_value
       *      is the value of Age: header received by the cache with
       *              this response.
       * date_value
       *      is the value of the origin server's Date: header
       * request_time
       *      is the (local) time when the cache made the request
       *              that resulted in this cached response
       * response_time
       *      is the (local) time when the cache received the
       *              response
       * now
       *      is the current (local) time
       */
      apparent_age = max(0, response_time - date_value);

      corrected_received_age = max(apparent_age, age_value);
      response_delay = response_time - request_time;
      corrected_initial_age = corrected_received_age + response_delay;
      resident_time = now - response_time;
      current_age   = corrected_initial_age + resident_time;

   When a cache sends a response, it must add to the
   corrected_initial_age the amount of time that the response was
   resident locally. It must then transmit this total age, using the Age
   header, to the next recipient cache.
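
   The summary above translates almost line for line into runnable
   code. The following is a sketch, assuming all times are expressed as
   seconds on a common numeric scale (for example, Unix timestamps); the
   function name is hypothetical.

```python
import time


def current_age(age_value, date_value, request_time, response_time, now=None):
    """Compute the current age of a cached response, in seconds."""
    if now is None:
        now = time.time()
    # Age implied by the Date header, never negative.
    apparent_age = max(0, response_time - date_value)
    # Take the more conservative of the two independent estimates.
    corrected_received_age = max(apparent_age, age_value)
    # Charge the full request/response round trip to the response.
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Add the time the entry has been held locally.
    resident_time = now - response_time
    return corrected_initial_age + resident_time


# Generated at t=1000, Age: 5 on arrival, requested at t=1002,
# received at t=1004, and examined at t=1064.
print(current_age(5, 1000, 1002, 1004, now=1064))  # prints 67
```

   The value returned here is also what the cache would transmit in the
   Age header when forwarding the response.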

     Note that a client cannot reliably tell that a response is first-
     hand, but the presence of an Age header indicates that a response
     is definitely not first-hand. Also, if the Date in a response is
     earlier than the client's local request time, the response is
     probably not first-hand (in the absence of serious clock skew).

13.2.4 Expiration Calculations

   In order to decide whether a response is fresh or stale, we need to
   compare its freshness lifetime to its age. The age is calculated as
   described in section 13.2.3; this section describes how to calculate
   the freshness lifetime, and to determine if a response has expired.
   In the discussion below, the values can be represented in any form
   appropriate for arithmetic operations.

   We use the term "expires_value" to denote the value of the Expires
   header. We use the term "max_age_value" to denote an appropriate
   value of the number of seconds carried by the max-age directive of
   the Cache-Control header in a response (see section 14.9).

   The max-age directive takes priority over Expires, so if max-age is
   present in a response, the calculation is simply:

         freshness_lifetime = max_age_value

   Otherwise, if Expires is present in the response, the calculation is:

         freshness_lifetime = expires_value - date_value

   Note that neither of these calculations is vulnerable to clock skew,
   since all of the information comes from the origin server.
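
   The two cases can be sketched as follows. This is an illustration
   under stated assumptions: headers arrive as a dict keyed by
   lowercased header name (a hypothetical representation), and the
   function returns None when neither header is usable so that a caller
   could fall back to a heuristic.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the explicit freshness lifetime in seconds, or None."""
    # max-age takes priority over Expires.
    for part in headers.get("cache-control", "").split(","):
        name, sep, value = part.strip().partition("=")
        if name.lower() == "max-age" and sep:
            return int(value)
    if "expires" in headers and "date" in headers:
        expires_value = parsedate_to_datetime(headers["expires"])
        date_value = parsedate_to_datetime(headers["date"])
        return (expires_value - date_value).total_seconds()
    return None


print(freshness_lifetime({
    "date": "Thu, 01 Dec 1994 15:00:00 GMT",
    "expires": "Thu, 01 Dec 1994 16:00:00 GMT",
}))
```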

   If neither Expires nor Cache-Control: max-age appears in the
   response, and the response does not include other restrictions on
   caching, the cache MAY compute a freshness lifetime using a
   heuristic. If the value is greater than 24 hours, the cache must
   attach Warning 13 to any response whose age is more than 24 hours if
   such warning has not already been added.

   Also, if the response does have a Last-Modified time, the heuristic
   expiration value SHOULD be no more than some fraction of the interval
   since that time. A typical setting of this fraction might be 10%.
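
   One possible heuristic along these lines is sketched below. The
   specification does not prescribe an algorithm, so everything here is
   an assumption: the function names, the use of the Date value as the
   end of the interval, and the default of 10 percent. Times are in
   whole seconds.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def heuristic_freshness_lifetime(date_value, last_modified_value, percent=10):
    """Estimate a lifetime as a percentage of the time since last change."""
    interval = max(0, date_value - last_modified_value)
    return interval * percent // 100


def needs_heuristic_warning(freshness_lifetime, current_age):
    """True when Warning 13 should be attached, per the 24-hour rule above."""
    return freshness_lifetime > SECONDS_PER_DAY and current_age > SECONDS_PER_DAY


# An entity last modified 10 days before the response was generated.
print(heuristic_freshness_lifetime(1_000_000, 136_000))  # prints 86400
```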

   The calculation to determine if a response has expired is quite
   simple:

         response_is_fresh = (freshness_lifetime > current_age)

13.2.5 Disambiguating Expiration Values

   Because expiration values are assigned optimistically, it is possible
   for two caches to contain fresh values for the same resource that are
   different.

   If a client performing a retrieval receives a non-first-hand response
   for a request that was already fresh in its own cache, and the Date
   header in its existing cache entry is newer than the Date on the new
   response, then the client MAY ignore the response. If so, it MAY
   retry the request with a "Cache-Control: max-age=0" directive (see
   section 14.9), to force a check with the origin server.

   If a cache has two fresh responses for the same representation with
   different validators, it MUST use the one with the more recent Date
   header. This situation may arise because the cache is pooling
   responses from other caches, or because a client has asked for a
   reload or a revalidation of an apparently fresh cache entry.

13.2.6 Disambiguating Multiple Responses

   Because a client may be receiving responses via multiple paths, so
   that some responses flow through one set of caches and other
   responses flow through a different set of caches, a client may
   receive responses in an order different from that in which the origin
   server sent them. We would like the client to use the most recently
   generated response, even if older responses are still apparently
   fresh.

   Neither the entity tag nor the expiration value can impose an
   ordering on responses, since it is possible that a later response
   intentionally carries an earlier expiration time. However, the
   HTTP/1.1 specification requires the transmission of Date headers on
   every response, and the Date values are ordered to a granularity of
   one second.

   When a client tries to revalidate a cache entry, and the response it
   receives contains a Date header that appears to be older than the one
   for the existing entry, then the client SHOULD repeat the request
   unconditionally, and include

          Cache-Control: max-age=0

   to force any intermediate caches to validate their copies directly
   with the origin server, or

          Cache-Control: no-cache

   to force any intermediate caches to obtain a new copy from the origin
   server.

   If the Date values are equal, then the client may use either response
   (or may, if it is being extremely prudent, request a new response).
   Servers MUST NOT depend on clients being able to choose
   deterministically between responses generated during the same second,
   if their expiration times overlap.
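
   The client-side rule can be sketched as a helper that decides which
   extra header, if any, to send on the repeated request. The function
   and parameter names are hypothetical; Date values are assumed to have
   been converted to seconds already.

```python
def retry_headers(existing_date, received_date, want_new_copy=False):
    """Return headers for an unconditional retry, or None if none is needed."""
    if received_date < existing_date:
        # The new response appears older than the entry already held.
        directive = "no-cache" if want_new_copy else "max-age=0"
        return {"Cache-Control": directive}
    # Equal or newer Date values: the client may simply use the response.
    return None


print(retry_headers(existing_date=1000, received_date=990))
```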

13.3 Validation Model

   When a cache has a stale entry that it would like to use as a
   response to a client's request, it first has to check with the origin
   server (or possibly an intermediate cache with a fresh response) to
   see if its cached entry is still usable. We call this "validating"
   the cache entry.  Since we do not want to have to pay the overhead of
   retransmitting the full response if the cached entry is good, and we
   do not want to pay the overhead of an extra round trip if the cached
   entry is invalid, the HTTP/1.1 protocol supports the use of
   conditional methods.

   The key protocol features for supporting conditional methods are
   those concerned with "cache validators." When an origin server
   generates a full response, it attaches some sort of validator to it,
   which is kept with the cache entry. When a client (user agent or
   proxy cache) makes a conditional request for a resource for which it
   has a cache entry, it includes the associated validator in the
   request.

   The server then checks that validator against the current validator
   for the entity, and, if they match, it responds with a special status
   code (usually, 304 (Not Modified)) and no entity-body. Otherwise, it
   returns a full response (including entity-body). Thus, we avoid
   transmitting the full response if the validator matches, and we avoid
   an extra round trip if it does not match.

     Note: the comparison functions used to decide if validators match
     are defined in section 13.3.3.
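
   The exchange described above can be sketched from the server's side.
   This is a deliberately minimal illustration: the function name is
   hypothetical, and validators are compared with plain equality rather
   than the comparison functions of section 13.3.3.

```python
def answer_conditional_get(current_validator, current_body, presented_validator=None):
    """Return (status_code, entity_body) for a possibly conditional GET."""
    if presented_validator is not None and presented_validator == current_validator:
        # The cached entry is still good: no entity-body is sent.
        return 304, None
    # No validator, or a mismatch: send the full response in the same
    # round trip.
    return 200, current_body


print(answer_conditional_get('"v2"', b"hello", presented_validator='"v2"'))
print(answer_conditional_get('"v2"', b"hello", presented_validator='"v1"'))
```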

   In HTTP/1.1, a conditional request looks exactly the same as a normal
   request for the same resource, except that it carries a special
   header (which includes the validator) that implicitly turns the
   method (usually, GET) into a conditional.

   The protocol includes both positive and negative senses of cache-
   validating conditions. That is, it is possible to request either that
   a method be performed if and only if a validator matches or if and
   only if no validators match.
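
   A sketch of the two senses, using the If-Match and If-None-Match
   request headers defined elsewhere in HTTP/1.1 as the positive and
   negative forms. The function name is hypothetical, and validators
   are compared by simple equality for brevity.

```python
def method_should_proceed(header_name, presented_validators, current_validator):
    """Evaluate a cache-validating condition in its positive or negative sense."""
    matched = current_validator in presented_validators
    if header_name == "If-Match":
        return matched          # perform only if a validator matches
    if header_name == "If-None-Match":
        return not matched      # perform only if no validator matches
    raise ValueError("unsupported conditional header: " + header_name)
```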

     Note: a response that lacks a validator may still be cached, and
     served from cache until it expires, unless this is explicitly
     prohibited by a Cache-Control directive. However, a cache cannot do
     a conditional retrieval if it does not have a validator for the
     entity, which means it will not be refreshable after it expires.

13.3.1 Last-modified Dates

   The Last-Modified entity-header field value is often used as a cache
   validator. In simple terms, a cache entry is considered to be valid
   if the entity has not been modified since the Last-Modified value.

13.3.2 Entity Tag Cache Validators

   The ETag entity-header field value, an entity tag, provides for an
   "opaque" cache validator. This may allow more reliable validation in
   situations where it is inconvenient to store modification dates,
   where the one-second resolution of HTTP date values is not
   sufficient, or where the origin server wishes to avoid certain
   paradoxes that may arise from the use of modification dates.

   Entity Tags are described in section 3.11. The headers used with
   entity tags are described in sections 14.20, 14.25, 14.26 and 14.43.

13.3.3 Weak and Strong Validators

   Since both origin servers and caches will compare two validators to
   decide if they represent the same or different entities, one normally
   would expect that if the entity (the entity-body or any entity-
   headers) changes in any way, then the associated validator would
   change as well.  If this is true, then we call this validator a
   "strong validator."

   However, there may be cases when a server prefers to change the
   validator only on semantically significant changes, and not when
   insignificant aspects of the entity change. A validator that does not
   always change when the resource changes is a "weak validator."

   Entity tags are normally "strong validators," but the protocol
   provides a mechanism to tag an entity tag as "weak." One can think of
   a strong validator as one that changes whenever the bits of an entity
   change, while a weak validator changes whenever the meaning of an
   entity changes.  Alternatively, one can think of a strong validator as
   part of an identifier for a specific entity, while a weak validator is
   part of an identifier for a set of semantically equivalent entities.

     Note: One example of a strong validator is an integer that is
     incremented in stable storage every time an entity is changed.

     An entity's modification time, if represented with one-second
     resolution, could be a weak validator, since it is possible that
     the resource may be modified twice during a single second.

     Support for weak validators is optional; however, weak validators
     allow for more efficient caching of equivalent objects; for
     example, a hit counter on a site is probably good enough if it is
     updated every few days or weeks, and any value during that period
     is likely "good enough" to be equivalent.

     A "use" of a validator is either when a client generates a request
     and includes the validator in a validating header field, or when a
     server compares two validators.

   Strong validators are usable in any context. Weak validators are only
   usable in contexts that do not depend on exact equality of an entity.
   For example, either kind is usable for a conditional GET of a full
   entity. However, only a strong validator is usable for a sub-range
   retrieval, since otherwise the client may end up with an internally
   inconsistent entity.

   The only function that the HTTP/1.1 protocol defines on validators is
   comparison. There are two validator comparison functions, depending
   on whether the comparison context allows the use of weak validators
   or not:

  o  The strong comparison function: in order to be considered equal,
     both validators must be identical in every way, and neither may be
     weak.
  o  The weak comparison function: in order to be considered equal, both
     validators must be identical in every way, but either or both of
     them may be tagged as "weak" without affecting the result.

   The weak comparison function MAY be used for simple (non-subrange)
   GET requests. The strong comparison function MUST be used in all
   other cases.
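
   For entity tags, the two comparison functions can be sketched as
   follows. The sketch assumes the entity tag syntax of section 3.11, in
   which a weak tag carries a "W/" prefix before the quoted opaque
   value; the function names are hypothetical.

```python
def split_entity_tag(etag):
    """Return (is_weak, opaque_tag) for a tag such as W/"xyzzy" or "xyzzy"."""
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def strong_compare(a, b):
    """Equal only if the tags are identical and neither is weak."""
    weak_a, tag_a = split_entity_tag(a)
    weak_b, tag_b = split_entity_tag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_compare(a, b):
    """Equal if the opaque tags are identical, ignoring weakness."""
    return split_entity_tag(a)[1] == split_entity_tag(b)[1]


print(strong_compare('W/"xyzzy"', '"xyzzy"'))  # prints False
print(weak_compare('W/"xyzzy"', '"xyzzy"'))    # prints True
```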

   An entity tag is strong unless it is explicitly tagged as weak.
   Section 3.11 gives the syntax for entity tags.

   A Last-Modified time, when used as a validator in a request, is
   implicitly weak unless it is possible to deduce that it is strong,
   using the following rules:

  o  The validator is being compared by an origin server to the actual
     current validator for the entity and,
  o  That origin server reliably knows that the associated entity did
     not change twice during the second covered by the presented
     validator.
   or

  o  The validator is about to be used by a client in an If-Modified-
     Since or If-Unmodified-Since header, because the client has a cache
     entry for the associated entity, and
  o  That cache entry includes a Date value, which gives the time when
     the origin server sent the original response, and
  o  The presented Last-Modified time is at least 60 seconds before the
     Date value.
   or

  o  The validator is being compared by an intermediate cache to the
     validator stored in its cache entry for the entity, and
  o  That cache entry includes a Date value, which gives the time when
     the origin server sent the original response, and
  o  The presented Last-Modified time is at least 60 seconds before the
     Date value.

   This method relies on the fact that if two different responses were
   sent by the origin server during the same second, but both had the
   same Last-Modified time, then at least one of those responses would
   have a Date value equal to its Last-Modified time. The arbitrary 60-
   second limit guards against the possibility that the Date and Last-
   Modified values are generated from different clocks, or at somewhat
   different times during the preparation of the response. An
   implementation may use a value larger than 60 seconds, if it is
   believed that 60 seconds is too short.
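
   The second and third rule sets above (for clients and intermediate
   caches) reduce to one comparison, sketched here. The function and
   parameter names are hypothetical; both times are assumed to be in
   seconds, and the origin-server rule is not covered because it
   depends on knowledge only the server has.

```python
def last_modified_is_strong(last_modified, date_value, min_gap=60):
    """True if a stored Last-Modified time may be treated as a strong validator.

    min_gap is the arbitrary limit discussed above; an implementation
    may choose a larger value.
    """
    return date_value - last_modified >= min_gap


print(last_modified_is_strong(last_modified=1000, date_value=1060))  # prints True
print(last_modified_is_strong(last_modified=1000, date_value=1059))  # prints False
```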

   If a client wishes to perform a sub-range retrieval on a value for
   which it has only a Last-Modified time and no opaque validator, it
   may do this only if the Last-Modified time is strong in the sense
   described here.

   A cache or origin server receiving a cache-conditional request, other
   than a full-body GET request, MUST use the strong comparison function
   to evaluate the condition.

   These rules allow HTTP/1.1 caches and clients to safely perform sub-
   range retrievals on values that have been obtained from HTTP/1.0
   servers.

13.3.4 Rules for When to Use Entity Tags and Last-modified Dates

   We adopt a set of rules and recommendations for origin servers,
   clients, and caches regarding when various validator types should be
   used, and for what purposes.

   HTTP/1.1 origin servers:

  o  SHOULD send an entity tag validator unless it is not feasible to
     generate one.
  o  MAY send a weak entity tag instead of a strong entity tag, if
     performance considerations support the use of weak entity tags, or
     if it is unfeasible to send a strong entity tag.
  o  SHOULD send a Last-Modified value if it is feasible to send one,
     unless the risk of a breakdown in semantic transparency that could
     result from using this date in an If-Modified-Since header would
     lead to serious problems.

   In other words, the preferred behavior for an HTTP/1.1 origin server
   is to send both a strong entity tag and a Last-Modified value.

   In order to be legal, a strong entity tag MUST change whenever the
   associated entity value changes in any way. A weak entity tag SHOULD
   change whenever the associated entity changes in a semantically
   significant way.

     Note: in order to provide semantically transparent caching, an
     origin server must avoid reusing a specific strong entity tag value
     for two different entities, or reusing a specific weak entity tag
     value for two semantically different entities. Cache entries may
     persist for arbitrarily long periods, regardless of expiration
     times, so it may be inappropriate to expect that a cache will never
     again attempt to validate an entry using a validator that it
     obtained at some point in the past.

   HTTP/1.1 clients:

     o  If an entity tag has been provided by the origin server, MUST
        use that entity tag in any cache-conditional request (using
        If-Match or If-None-Match).

     o  If only a Last-Modified value has been provided by the origin
        server, SHOULD use that value in non-subrange cache-conditional
        requests (using If-Modified-Since).
     o  If only a Last-Modified value has been provided by an HTTP/1.0
        origin server, MAY use that value in subrange cache-conditional
        requests (using If-Unmodified-Since). The user agent should
        provide a way to disable this, in case of difficulty.
     o  If both an entity tag and a Last-Modified value have been
        provided by the origin server, SHOULD use both validators in
        cache-conditional requests. This allows both HTTP/1.0 and
        HTTP/1.1 caches to respond appropriately.
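
   A minimal sketch of the client rules above for an ordinary
   (non-subrange) validating GET. The function name and the use of a
   plain dictionary for headers are assumptions of this example; the
   subrange case using If-Unmodified-Since is deliberately left out.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str] = None,
                        last_modified: Optional[str] = None) -> Dict[str, str]:
    """Build validation headers for a non-subrange GET.

    An entity tag, when available, is always sent (If-None-Match).
    A Last-Modified value is sent as well (If-Modified-Since), so that
    HTTP/1.0 caches along the path can still respond appropriately.
    """
    headers: Dict[str, str] = {}
    if etag is not None:
        headers["If-None-Match"] = etag
    if last_modified is not None:
        headers["If-Modified-Since"] = last_modified
    return headers
```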

   An HTTP/1.1 cache, upon receiving a request, MUST use the most
   restrictive validator when deciding whether the client's cache entry
   matches the cache's own cache entry. This is only an issue when the
   request contains both an entity tag and a last-modified-date
   validator (If-Modified-Since or If-Unmodified-Since).

     A note on rationale: The general principle behind these rules is
     that HTTP/1.1 servers and clients should transmit as much non-
     redundant information as is available in their responses and
     requests. HTTP/1.1 systems receiving this information will make the
     most conservative assumptions about the validators they receive.

     HTTP/1.0 clients and caches will ignore entity tags. Generally,
     last-modified values received or used by these systems will support
     transparent and efficient caching, and so HTTP/1.1 origin servers
     should provide Last-Modified values. In those rare cases where the
     use of a Last-Modified value as a validator by an HTTP/1.0 system
     could result in a serious problem, then HTTP/1.1 origin servers
     should not provide one.

13.3.5 Non-validating Conditionals

   The principle behind entity tags is that only the service author
   knows the semantics of a resource well enough to select an
   appropriate cache validation mechanism, and the specification of any
   validator comparison function more complex than byte-equality would
   open up a can of worms.  Thus, comparisons of any other headers
   (except Last-Modified, for compatibility with HTTP/1.0) are never
   used for purposes of validating a cache entry.

13.4 Response Cachability

   Unless specifically constrained by a Cache-Control (section 14.9)
   directive, a caching system may always store a successful response
   (see section 13.8) as a cache entry, may return it without validation
   if it is fresh, and may return it after successful validation. If
   there is neither a cache validator nor an explicit expiration time
   associated with a response, we do not expect it to be cached, but
   certain caches may violate this expectation (for example, when little
   or no network connectivity is available). A client can usually detect
   that such a response was taken from a cache by comparing the Date
   header to the current time.

     Note that some HTTP/1.0 caches are known to violate this
     expectation without providing any Warning.

   However, in some cases it may be inappropriate for a cache to retain
   an entity, or to return it in response to a subsequent request. This
   may be because absolute semantic transparency is deemed necessary by
   the service author, or because of security or privacy considerations.
   Certain Cache-Control directives are therefore provided so that the
   server can indicate that certain resource entities, or portions
   thereof, may not be cached regardless of other considerations.

   Note that section 14.8 normally prevents a shared cache from saving
   and returning a response to a previous request if that request
   included an Authorization header.

   A response received with a status code of 200, 203, 206, 300, 301 or
   410 may be stored by a cache and used in reply to a subsequent
   request, subject to the expiration mechanism, unless a Cache-Control
   directive prohibits caching. However, a cache that does not support
   the Range and Content-Range headers MUST NOT cache 206 (Partial
   Content) responses.

   A response received with any other status code MUST NOT be returned
   in a reply to a subsequent request unless there are Cache-Control
   directives or other headers that explicitly allow it. For
   example, these include the following: an Expires header (section
   14.21); a "max-age", "must-revalidate", "proxy-revalidate", "public"
   or "private" Cache-Control directive (section 14.9).
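
   The status code rules above might be sketched as follows. This is a
   simplified illustration: the function names are hypothetical, headers
   are assumed to be a dictionary with canonically capitalized names,
   and "no-store" is used as the only example of a directive that
   prohibits caching.

```python
from typing import Dict, Set

# Status codes that may be stored and reused by default (from the text).
CACHEABLE_BY_DEFAULT = frozenset({200, 203, 206, 300, 301, 410})

# Cache-Control directives that explicitly allow reuse for other codes.
EXPLICITLY_ALLOWING = frozenset(
    {"max-age", "must-revalidate", "proxy-revalidate", "public", "private"}
)


def cache_control_directives(headers: Dict[str, str]) -> Set[str]:
    """Return the lower-cased directive names found in Cache-Control.

    Simplification: splits on commas and ignores quoted-string syntax.
    """
    names = set()
    for part in headers.get("Cache-Control", "").split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def may_store_and_reuse(status: int, headers: Dict[str, str],
                        supports_ranges: bool = True) -> bool:
    """Decide whether a response may be kept for later requests."""
    directives = cache_control_directives(headers)
    if "no-store" in directives:  # assumption: the only prohibition modeled
        return False
    if status == 206 and not supports_ranges:
        return False  # caches without Range support MUST NOT cache 206
    if status in CACHEABLE_BY_DEFAULT:
        return True
    # Any other status needs an explicit allowance.
    return "Expires" in headers or bool(directives & EXPLICITLY_ALLOWING)
```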

13.5 Constructing Responses From Caches

   The purpose of an HTTP cache is to store information received in
   response to requests, for use in responding to future requests. In
   many cases, a cache simply returns the appropriate parts of a
   response to the requester. However, if the cache holds a cache entry
   based on a previous response, it may have to combine parts of a new
   response with what is held in the cache entry.

13.5.1 End-to-end and Hop-by-hop Headers

   For the purpose of defining the behavior of caches and non-caching
   proxies, we divide HTTP headers into two categories:

  o  End-to-end headers, which must be transmitted to the
     ultimate recipient of a request or response. End-to-end
     headers in responses must be stored as part of a cache entry
     and transmitted in any response formed from a cache entry.
  o  Hop-by-hop headers, which are meaningful only for a single
     transport-level connection, and are not stored by caches or
     forwarded by proxies.

   The following HTTP/1.1 headers are hop-by-hop headers:

     o  Connection
     o  Keep-Alive
     o  Public
     o  Proxy-Authenticate
     o  Transfer-Encoding
     o  Upgrade

   All other headers defined by HTTP/1.1 are end-to-end headers.

   Hop-by-hop headers introduced in future versions of HTTP MUST be
   listed in a Connection header, as described in section 14.10.
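
   As a sketch of how a cache or proxy might apply this division, the
   following filter drops the hop-by-hop headers listed above together
   with any header named in a Connection header. The function name and
   the list-of-pairs representation are assumptions of this example.

```python
from typing import List, Tuple

Header = Tuple[str, str]

# The HTTP/1.1 hop-by-hop headers listed in this section.
HOP_BY_HOP = frozenset(
    name.lower()
    for name in ("Connection", "Keep-Alive", "Public",
                 "Proxy-Authenticate", "Transfer-Encoding", "Upgrade")
)


def end_to_end_headers(headers: List[Header]) -> List[Header]:
    """Return only the headers that may be stored or forwarded."""
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Tokens in Connection name additional hop-by-hop headers.
            listed.update(tok.strip().lower()
                          for tok in value.split(",") if tok.strip())
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]
```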

13.5.2 Non-modifiable Headers

   Some features of the HTTP/1.1 protocol, such as Digest
   Authentication, depend on the value of certain end-to-end headers. A
   cache or non-caching proxy SHOULD NOT modify an end-to-end header
   unless the definition of that header requires or specifically allows
   that.

   A cache or non-caching proxy MUST NOT modify any of the following
   fields in a request or response, nor may it add any of these fields
   if not already present:

     o  Content-Location
     o  ETag
     o  Expires
     o  Last-Modified

   A cache or non-caching proxy MUST NOT modify or add any of the
   following fields in a response that contains the no-transform Cache-
   Control directive, or in any request:

     o  Content-Encoding
     o  Content-Length
     o  Content-Range
     o  Content-Type

   A cache or non-caching proxy MAY modify or add these fields in a
   response that does not include no-transform, but if it does so, it
   MUST add a Warning 14 (Transformation applied) if one does not
   already appear in the response.

     Warning: unnecessary modification of end-to-end headers may cause
     authentication failures if stronger authentication mechanisms are
     introduced in later versions of HTTP. Such authentication
     mechanisms may rely on the values of header fields not listed here.

13.5.3 Combining Headers

   When a cache makes a validating request to a server, and the server
   provides a 304 (Not Modified) response, the cache must construct a
   response to send to the requesting client. The cache uses the
   entity-body stored in the cache entry as the entity-body of this
   outgoing response. The end-to-end headers stored in the cache entry
   are used for the constructed response, except that any end-to-end
   headers provided in the 304 response MUST replace the corresponding
   headers from the cache entry. Unless the cache decides to remove the
   cache entry, it MUST also replace the end-to-end headers stored with
   the cache entry with corresponding headers received in the incoming
   response.

   In other words, the set of end-to-end headers received in the
   incoming response overrides all corresponding end-to-end headers
   stored with the cache entry. The cache may add Warning headers (see
   section 14.45) to this set.

   If a header field-name in the incoming response matches more than one
   header in the cache entry, all such old headers are replaced.

     Note: this rule allows an origin server to use a 304 (Not Modified)
     response to update any header associated with a previous response
     for the same entity, although it might not always be meaningful or
     correct to do so. This rule does not allow an origin server to use
     a 304 (Not Modified) response to entirely delete a header that it
     had provided with a previous response.
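
   The replacement rule can be illustrated with a short sketch. The
   function name is hypothetical, and both arguments are assumed to
   contain end-to-end headers only, as lists of (name, value) pairs.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def merge_304_headers(stored: List[Header],
                      incoming: List[Header]) -> List[Header]:
    """Combine stored headers with those from a 304 response.

    Every stored header whose field-name appears in the incoming
    response is dropped (all occurrences), and the incoming headers
    are used in its place. Stored headers that the 304 response does
    not mention are kept, so a 304 cannot delete a header.
    """
    replaced = {name.lower() for name, _ in incoming}
    kept = [(n, v) for n, v in stored if n.lower() not in replaced]
    return kept + list(incoming)
```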

13.5.4 Combining Byte Ranges

   A response may transfer only a subrange of the bytes of an entity-
   body, either because the request included one or more Range
   specifications, or because a connection was broken prematurely. After
   several such transfers, a cache may have received several ranges of
   the same entity-body.

   If a cache has a stored non-empty set of subranges for an entity, and
   an incoming response transfers another subrange, the cache MAY
   combine the new subrange with the existing set if both the following
   conditions are met:

     o  Both the incoming response and the cache entry must have a cache
        validator.
     o  The two cache validators must match using the strong comparison
        function (see section 13.3.3).

   If either requirement is not met, the cache must use only the most
   recent partial response (based on the Date values transmitted with
   every response, and using the incoming response if these values are
   equal or missing), and must discard the other partial information.
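
   The decision above might look like the following sketch. It tracks
   only which byte ranges are held, not the bytes themselves, and it
   assumes entity tags are the only validators in use (a strong
   Last-Modified time is not modeled). All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first-byte, last-byte) positions


@dataclass
class Partial:
    etag: Optional[str]
    date: Optional[datetime]
    ranges: List[Range]


def strong_match(a: Optional[str], b: Optional[str]) -> bool:
    """Strong comparison: both tags present, neither weak, identical."""
    if a is None or b is None:
        return False
    return not a.startswith("W/") and not b.startswith("W/") and a == b


def merge_intervals(ranges: List[Range]) -> List[Range]:
    """Coalesce overlapping or adjacent inclusive byte ranges."""
    merged: List[Range] = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def combine(stored: Partial, incoming: Partial) -> Partial:
    """Combine two partial responses, or keep only the most recent."""
    if strong_match(stored.etag, incoming.etag):
        return Partial(incoming.etag, incoming.date,
                       merge_intervals(stored.ranges + incoming.ranges))
    # Validators missing or mismatched: keep the most recent response,
    # preferring the incoming one when Date values are equal or missing.
    if (stored.date is not None and incoming.date is not None
            and stored.date > incoming.date):
        return stored
    return incoming
```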

13.6 Caching Negotiated Responses

   Use of server-driven content negotiation (section 12), as indicated
   by the presence of a Vary header field in a response, alters the
   conditions and procedure by which a cache can use the response for
   subsequent requests.

   A server MUST use the Vary header field (section 14.43) to inform a
   cache of what header field dimensions are used to select among
   multiple representations of a cachable response. A cache may use the
   selected representation (the entity included with that particular
   response) for replying to subsequent requests on that resource only
   when the subsequent requests have the same or equivalent values for
   all header fields specified in the Vary response-header. Requests
   with a different value for one or more of those header fields would
   be forwarded toward the origin server.
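
   One way to sketch the matching rule is shown below. It uses exact
   string equality as a conservative stand-in for "same or equivalent
   values", and it treats a Vary value of "*" (defined in section
   14.43) as never matching. The function name and the use of
   dictionaries keyed by lower-cased header names are assumptions.

```python
from typing import Dict


def vary_matches(vary: str,
                 stored_request: Dict[str, str],
                 new_request: Dict[str, str]) -> bool:
    """Return True if a stored response may serve the new request.

    `vary` is the Vary value from the stored response; the two
    dictionaries hold the request headers of the original and the
    new request, keyed by lower-cased field name.
    """
    if vary.strip() == "*":
        return False  # selection depended on more than request headers
    for field in vary.split(","):
        name = field.strip().lower()
        if not name:
            continue
        # A header absent from both requests counts as the same value.
        if stored_request.get(name) != new_request.get(name):
            return False
    return True
```

   A request that fails this check would be forwarded toward the origin
   server rather than answered from the stored representation.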

   If an entity tag was assigned to the representation, the forwarded
   request SHOULD be conditional and include the entity tags in an If-
   None-Match header field from all its cache entries for the Request-
   URI. This conveys to the server the set of entities currently held by
   the cache, so that if any one of these entities matches the requested
   entity, the server can use the ETag header in its 304 (Not Modified)
   response to tell the cache which entry is appropriate. If the
   entity-tag of the new response matches that of an existing entry, the
   new response SHOULD be used to update the header fields of the
   existing entry, and the result MUST be returned to the client.

   The Vary header field may also inform the cache that the
   representation was selected using criteria not limited to the
   request-headers; in this case, a cache MUST NOT use the response in a
   reply to a subsequent request unless the cache relays the new request
   to the origin server in a conditional request and the server responds
   with 304 (Not Modified), including an entity tag or Content-Location
   that indicates which entity should be used.

   If any of the existing cache entries contains only partial content
   for the associated entity, its entity-tag SHOULD NOT be included in
   the If-None-Match header unless the request is for a range that would
   be fully satisfied by that entry.

   If a cache receives a successful response whose Content-Location
   field matches that of an existing cache entry for the same Request-
   URI, whose entity-tag differs from that of the existing entry, and
   whose Date is more recent than that of the existing entry, the
   existing entry SHOULD NOT be returned in response to future requests,
   and should be deleted from the cache.

13.7 Shared and Non-Shared Caches

   For reasons of security and privacy, it is necessary to make a
   distinction between "shared" and "non-shared" caches. A non-shared
   cache is one that is accessible only to a single user. Accessibility
   in this case SHOULD be enforced by appropriate security mechanisms.
   All other caches are considered to be "shared." Other sections of
   this specification place certain constraints on the operation of
   shared caches in order to prevent loss of privacy or failure of
   access controls.

13.8 Errors or Incomplete Response Cache Behavior

   A cache that receives an incomplete response (for example, with fewer
   bytes of data than specified in a Content-Length header) may store
   the response. However, the cache MUST treat this as a partial
   response.  Partial responses may be combined as described in section
   13.5.4; the result might be a full response or might still be
   partial. A cache MUST NOT return a partial response to a client
   without explicitly marking it as such, using the 206 (Partial
   Content) status code. A cache MUST NOT return a partial response
   using a status code of 200 (OK).

   If a cache receives a 5xx response while attempting to revalidate an
   entry, it may either forward this response to the requesting client,
   or act as if the server failed to respond. In the latter case, it MAY
   return a previously received response unless the cached entry
   includes the "must-revalidate" Cache-Control directive (see section
   14.9).

13.9 Side Effects of GET and HEAD

   Unless the origin server explicitly prohibits the caching of their
   responses, the application of GET and HEAD methods to any resources
   SHOULD NOT have side effects that would lead to erroneous behavior if
   these responses are taken from a cache. They may still have side
   effects, but a cache is not required to consider such side effects in
   its caching decisions. Caches are always expected to observe an
   origin server's explicit restrictions on caching.

   We note one exception to this rule: since some applications have
   traditionally used GETs and HEADs with query URLs (those containing a
   "?" in the rel_path part) to perform operations with significant side
   effects, caches MUST NOT treat responses to such URLs as fresh unless
   the server provides an explicit expiration time. This specifically
   means that responses from HTTP/1.0 servers for such URIs should not
   be taken from a cache. See section 9.1.1 for related information.
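
   The exception for query URLs could be checked roughly as follows.
   The function names are hypothetical, and "explicit expiration time"
   is taken here to mean an Expires header or a max-age directive.

```python
from typing import Dict


def has_query(uri: str) -> bool:
    """True if the URI contains a "?" before any fragment."""
    return "?" in uri.split("#", 1)[0]


def has_explicit_expiration(headers: Dict[str, str]) -> bool:
    """True if the response carries Expires or Cache-Control: max-age."""
    if "Expires" in headers:
        return True
    parts = headers.get("Cache-Control", "").split(",")
    return any(p.strip().lower().startswith("max-age=") for p in parts)


def query_rule_allows_freshness(uri: str, headers: Dict[str, str]) -> bool:
    """Apply only the query-URL exception.

    True means this rule does not forbid treating the response as
    fresh; the ordinary freshness rules still apply afterwards.
    """
    return not has_query(uri) or has_explicit_expiration(headers)
```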

13.10 Invalidation After Updates or Deletions

   The effect of certain methods at the origin server may cause one or
   more existing cache entries to become non-transparently invalid. That
   is, although they may continue to be "fresh," they do not accurately
   reflect what the origin server would return for a new request.

   There is no way for the HTTP protocol to guarantee that all such
   cache entries are marked invalid. For example, the request that
   caused the change at the origin server may not have gone through the
   proxy where a cache entry is stored. However, several rules help
   reduce the likelihood of erroneous behavior.

   In this section, the phrase "invalidate an entity" means that the
   cache should either remove all instances of that entity from its
   storage, or should mark these as "invalid" and in need of a mandatory
   revalidation before they can be returned in response to a subsequent
   request.

   Some HTTP methods may invalidate an entity. This is either the entity
   referred to by the Request-URI, or by the Location or Content-
   Location response-headers (if present). These methods are:

     o  PUT
     o  DELETE
     o  POST

   In order to prevent denial of service attacks, an invalidation based
   on the URI in a Location or Content-Location header MUST only be
   performed if the host part is the same as in the Request-URI.
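
   A sketch of the invalidation rule, including the host check. The
   function name is hypothetical; the Request-URI is assumed to be an
   absolute URI, and relative Location or Content-Location values are
   resolved against it before the hosts are compared.

```python
from typing import Dict, List
from urllib.parse import urljoin, urlsplit

INVALIDATING_METHODS = frozenset({"PUT", "DELETE", "POST"})


def uris_to_invalidate(method: str, request_uri: str,
                       response_headers: Dict[str, str]) -> List[str]:
    """List the URIs whose cache entries should be invalidated."""
    if method.upper() not in INVALIDATING_METHODS:
        return []
    targets = [request_uri]
    request_host = urlsplit(request_uri).hostname
    for name in ("Location", "Content-Location"):
        value = response_headers.get(name)
        if value is None:
            continue
        resolved = urljoin(request_uri, value)
        # Only honor the header when the host matches the Request-URI,
        # so one server cannot invalidate another server's entries.
        if urlsplit(resolved).hostname == request_host \
                and resolved not in targets:
            targets.append(resolved)
    return targets
```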

13.11 Write-Through Mandatory

   All methods that may be expected to cause modifications to the origin
   server's resources MUST be written through to the origin server. This
   currently includes all methods except for GET and HEAD. A cache MUST
   NOT reply to such a request from a client before having transmitted
   the request to the inbound server, and having received a
   corresponding response from the inbound server. This does not prevent
   a cache from sending a 100 (Continue) response before the inbound
   server has replied.

   The alternative (known as "write-back" or "copy-back" caching) is not
   allowed in HTTP/1.1, due to the difficulty of providing consistent
   updates and the problems arising from server, cache, or network
   failure prior to write-back.

13.12 Cache Replacement

   If a new cachable (see sections 14.9.2, 13.2.5, 13.2.6 and 13.8)
   response is received from a resource while any existing responses for
   the same resource are cached, the cache SHOULD use the new response
   to reply to the current request. It may insert it into cache storage
   and may, if it meets all other requirements, use it to respond to any
   future requests that would previously have caused the old response to
   be returned. If it inserts the new response into cache storage it
   should follow the rules in section 13.5.3.

     Note: a new response that has an older Date header value than
     existing cached responses is not cachable.

13.13 History Lists

   User agents often have history mechanisms, such as "Back" buttons and
   history lists, which can be used to redisplay an entity retrieved
   earlier in a session.

   History mechanisms and caches are different. In particular, history
   mechanisms SHOULD NOT try to show a semantically transparent view of
   the current state of a resource. Rather, a history mechanism is meant
   to show exactly what the user saw at the time when the resource was
   retrieved.

   By default, an expiration time does not apply to history mechanisms.
   If the entity is still in storage, a history mechanism should display
   it even if the entity has expired, unless the user has specifically
   configured the agent to refresh expired history documents.

   This should not be construed to prohibit the history mechanism from
   telling the user that a view may be stale.

     Note: if history list mechanisms unnecessarily prevent users from
     viewing stale resources, this will tend to force service authors to
     avoid using HTTP expiration controls and cache controls when they
     would otherwise like to. Service authors may consider it important
     that users not be presented with error messages or warning messages
     when they use navigation controls (such as BACK) to view previously
     fetched resources. Even though sometimes such resources ought not
     to be cached, or ought to expire quickly, user interface
     considerations may force service authors to resort to other means
     of preventing caching (e.g. "once-only" URLs) in order not to
     suffer the effects of improperly functioning history mechanisms.