[Mod_gzip] "mod_gzip_send_vary=Yes" disables caching on IE? (1.3.26.1a)
Slava Bizyayev
mod_gzip@lists.over.net
Mon, 9 Dec 2002 02:34:17 -0600
Yes, the heap of misunderstanding (and of the misleading statements that
follow from it) seems to be growing quickly. So let me briefly update the
status of the problem, skipping some details of the discussion (even ones
that matter in other cases) in order to concentrate on the main problem.
It is suspected that M$IE doesn't cache a response locally in the presence
of a Vary HTTP header, even when the browser is explicitly instructed to
cache the particular page for some (limited) time. How the browser behaves
in the absence of an Expires HTTP header is less important here, because
the browser's default behavior is not even implied by any RFC.
This discussion began from the statement that any Vary prevents M$IE from
caching the response. From my point of view that is not quite true.
For example, you can try
http://devl4.outlook.net/devdoc/Dynagzip/Dynagzip.html using M$IE 6.0 like:
C05 --> S06 GET /devdoc/Dynagzip/Dynagzip.html HTTP/1.1
C05 --> S06 Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/msword, */*
C05 --> S06 Referer:
http://users.outlook.net/~sbizyaye/cgi-bin/pp-slav.cgi/index.html
C05 --> S06 Accept-Language: en-us
C05 --> S06 Accept-Encoding: gzip, deflate
C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
C05 --> S06 Host: devl4.outlook.net
C05 --> S06 Accept-Charset: ISO-8859-1
== Body was 0 bytes ==
C05 <-- S06 HTTP/1.1 200 OK
C05 <-- S06 Date: Mon, 09 Dec 2002 04:54:27 GMT
C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
C05 <-- S06 X-Module-Sender: Apache::Dynagzip
C05 <-- S06 Expires: Monday, 09-December-2002 05:24:27 GMT
C05 <-- S06 Vary: Accept-Encoding,*
C05 <-- S06 Transfer-Encoding: chunked
C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
C05 <-- S06 Content-Encoding: gzip
C05 <-- S06 == Incoming Body was 12181 bytes ==
== Transmission: text gzip chunked ==
== Chunk Log ==
a (hex) = 10 (dec)
1cb7 (hex) = 7351 (dec)
12ba (hex) = 4794 (dec)
0 (hex) = 0 (dec)
== Latency = 0.280 seconds, Extra Delay = 0.380 seconds
== Restored Body was 51665 bytes ==
For the next 30 minutes, any attempt to reach the same URL from the same
browser does not generate any HTTP request; the browser displays the cached
page every time. Jordan Russell confirmed the same behavior for compressed
content in his own experiments. Since then the discussion has continued
about uncompressed responses only. It is especially important to send the
right Vary even on uncompressed responses so that they work properly with
cache proxies.
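To make the cache-proxy side concrete, here is a minimal sketch in Python of the Vary matching rule from rfc2616 section 13.6, as a proxy might apply it. The function name and the dict-based header representation are hypothetical, for illustration only. Treating a list that contains "*" the same as a bare "*" is my assumption (the rfc grammar only defines "*" on its own), and nothing here describes how M$IE or any particular proxy is implemented.

```python
def vary_matches(vary_value, stored_request, new_request):
    """Return True if a cached entry may be reused for new_request.

    vary_value     -- Vary header value of the stored response
    stored_request -- request headers (dict) that produced the stored response
    new_request    -- request headers (dict) of the incoming request
    """
    names = [n.strip().lower() for n in vary_value.split(",") if n.strip()]

    # rfc2616 13.6: a Vary value of "*" always fails to match, so the
    # request has to go back to the origin server.
    # Assumption: a list containing "*" is handled like a bare "*".
    if "*" in names:
        return False

    # Header names are case-insensitive, so normalize before comparing.
    old = {k.lower(): v.strip() for k, v in stored_request.items()}
    new = {k.lower(): v.strip() for k, v in new_request.items()}

    # Every selecting request header must be equal (or absent in both).
    return all(old.get(n) == new.get(n) for n in names)
```

With "Vary: Accept-Encoding", a proxy following this rule could reuse a gzipped entry only for a client that sent the same Accept-Encoding, which is why the header matters even when a particular response goes out uncompressed.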
In my observations M$IE behaves correctly in the following experiment:
C05 --> S06 GET /devdoc/Dynagzip/Dynagzip.html HTTP/1.1
C05 --> S06 Accept: */*
C05 --> S06 Referer:
http://users.outlook.net/~sbizyaye/cgi-bin/pp-slav.cgi/index.html
C05 --> S06 Accept-Language: en-us
C05 --> S06 Accept-Encoding: gzip, deflate
C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
C05 --> S06 Host: devl4.outlook.net
C05 --> S06 Pragma: no-cache
C05 --> S06 Accept-Charset: ISO-8859-1
== Body was 0 bytes ==
C05 <-- S06 HTTP/1.1 200 OK
C05 <-- S06 Date: Mon, 09 Dec 2002 06:20:15 GMT
C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
C05 <-- S06 X-Module-Sender: Apache::Dynagzip
C05 <-- S06 Expires: Monday, 09-December-2002 06:50:15 GMT
C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
C05 <-- S06 Content-Length: 51665
C05 <-- S06 Vary: User-Agent,*
== Body was 51665 bytes ==
The response comes back uncompressed. Expires instructs the browser to
cache this content for 30 minutes. It works fine on my machine: no new
requests during the next 30 minutes.
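The 30 minutes can be checked directly from the Date and Expires headers in the trace above. This is only a sketch: the function name is hypothetical, the two format strings are written to fit exactly the header spellings shown in this message (a general HTTP date parser would have to accept more forms), and the day and month names are parsed with Python's default locale.

```python
from datetime import datetime

# Formats matching the two header spellings seen in the trace above.
DATE_FMT = "%a, %d %b %Y %H:%M:%S GMT"      # Mon, 09 Dec 2002 06:20:15 GMT
EXPIRES_FMT = "%A, %d-%B-%Y %H:%M:%S GMT"   # Monday, 09-December-2002 06:50:15 GMT


def freshness_seconds(date_header, expires_header):
    """Freshness lifetime implied by Expires relative to Date, in seconds."""
    served = datetime.strptime(date_header, DATE_FMT)
    expires = datetime.strptime(expires_header, EXPIRES_FMT)
    return int((expires - served).total_seconds())


lifetime = freshness_seconds(
    "Mon, 09 Dec 2002 06:20:15 GMT",
    "Monday, 09-December-2002 06:50:15 GMT",
)
print(lifetime)  # 1800 seconds, i.e. 30 minutes
```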
Unfortunately, I can not say the same about the pure "Vary: *". In my
experiment:
C05 --> S06 GET /devdoc/Dynagzip/Dynagzip.html HTTP/1.1
C05 --> S06 Accept: */*
C05 --> S06 Accept-Language: en-us
C05 --> S06 Accept-Encoding: gzip, deflate
C05 --> S06 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
C05 --> S06 Host: devl4.outlook.net
C05 --> S06 Accept-Charset: ISO-8859-1
== Body was 0 bytes ==
C05 <-- S06 HTTP/1.1 200 OK
C05 <-- S06 Date: Mon, 09 Dec 2002 05:53:09 GMT
C05 <-- S06 Server: Apache/1.3.22 (Unix) Debian GNU/Linux mod_perl/1.26
C05 <-- S06 X-Module-Sender: Apache::Dynagzip
C05 <-- S06 Expires: Monday, 09-December-2002 06:23:09 GMT
C05 <-- S06 Content-Type: text/html; charset=iso-8859-1
C05 <-- S06 Content-Length: 51665
C05 <-- S06 Vary: *
== Body was 51665 bytes ==
My browser generates the same (unconditional) request every time I try to
reach the same URL.
So it seems safer to use "Vary: User-Agent,*" instead of just "Vary: *" for
uncompressed content, even if there are no User-Agent specific features in
your response...
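As a rough illustration of that recommendation, here is a sketch in Python of choosing the Vary value per response. The function name is hypothetical and this is not the actual Apache::Dynagzip code; it only reproduces the header values that appear in the traces above.

```python
def response_headers(compressed):
    """Build the cache-related headers for one response.

    compressed -- True if the body is sent gzip-encoded.
    """
    headers = {"Content-Type": "text/html; charset=iso-8859-1"}
    if compressed:
        headers["Content-Encoding"] = "gzip"
        # Value seen on the compressed response in the first trace.
        headers["Vary"] = "Accept-Encoding,*"
    else:
        # A bare "Vary: *" made the browser re-request the page every
        # time in the experiment above, so name User-Agent as well.
        headers["Vary"] = "User-Agent,*"
    return headers
```

An Expires header would still be set separately, as in the traces, since the Vary value by itself says nothing about how long the response may be cached.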
Thanks,
Slava
----- Original Message -----
From: <TOKILEY@aol.com>
To: <mod_gzip@lists.over.net>
Sent: Sunday, December 08, 2002 9:36 PM
Subject: Re: [Mod_gzip] "mod_gzip_send_vary=Yes" disables caching on IE? (1.3.26.1a)
>
> > Slava wrote...
> >
> >> "Vary:" is only meant to be used to indicate what REQUEST
> >> headers might be involved in the 'varying' conditions that
> >> might influence what PROXY CACHE should do.
> >>
> >> In other words... "Vary:" is IN the RESPONSE headers but
> >> is only supposed to reference REQUEST headers.
> >
> > Except "Vary: *" which is absolutely legal and is not associated with
> > any request header. It is defined by rfc2616 especially to serve some
> > special responses, like personalized content.
>
> Yes it is. You are right about that... but note that even "Vary: *"
> is referencing REQUEST headers... not RESPONSE headers...
> and the "Vary: *" instruction is still intended as a signal to
> intermediate Proxies and NOT to actual end-point User Agents.
>
> What "Vary: *" is actually 'telling' an inline Proxy Cache Server
> could be translated like this...
>
> "You can go ahead and cache this response if you choose to
> so you can hand out the same response to the next person
> who asks for the same URI... but be advised that if ANY of
> the REQUEST headers from the next person who asks for
> this differs in any way from the REQUEST headers associated
> with the CURRENT REQUEST ( the one that produced this
> response ) then do NOT give them this response and you
> must come back to the Content Origin Server and let IT
> decide what response this person should get."
>
> Note that there still is no reason that a Proxy Cache Server
> should NOT go ahead and 'cache' this response even though
> it has "Vary: *" in the RESPONSE headers. "Vary: *" does
> not mean DON'T CACHE IT... it just means that unless the
> next set of REQUEST headers matches perfectly then you
> can't consider this the correct response to the URI request.
>
> In many cases... the next request for the URI will, in fact,
> be IDENTICAL. If you have a client base that is all pretty
> much using the same version of MSIE or Netscape then
> whoever asks for that URI next is very likely to send
> the exact same request headers... in which case it is still
> OK for the Proxy Cache Server to 'dish out' the cached
> response even though it has "Vary: *" on it.
>
> Of course... even a slight change in the TIME or the DATE
> on an "If-Modified-Since:" request will mean that at least
> one of the request headers is now 'different' but that's
> beside the point. Theoretically... a Proxy Cache Server
> SHOULD go ahead and 'cache' responses even though
> they have "Vary: *" on them because there is ALWAYS
> the chance that other requests will arrive that have
> identical request headers... and the "Vary: *" has
> NOT been satisfied (yet).
>
> Example: Let's say that the response to a request
> for a static URI is always going to be output from
> some Content Origin Server with "Vary: *" in the
> response header.
>
> If you were doing a test using 100 identical
> browsers ( same exact version ) and you cleared all
> of the browser caches so there is no chance of
> sending an "If-Modified-Since" request header ( or any
> header that might have a time/date in it ) and then you had
> all of them request the exact same URI via SQUID or some other
> Proxy Cache Server... then what SHOULD happen is that the
> very FIRST request for that URI SHOULD get 'cached'
> by the Proxy even though it has "Vary: *" on it.
>
> There are 99 other requests for the exact same URI
> that are about to arrive which will, in fact, have the
> EXACT SAME REQUEST HEADERS as the first
> request ( that should have primed the cache ) so
> it would violate the 'principle of least astonishment'
> to think that the Proxy Server would make 99 more
> requests for the same URI from the same upstream
> Server when the "Vary: *" conditions have NOT been
> met yet and it SHOULD be serving up the same
> non-expired response right out of its own cache.
>
> This test would be similar to the famous ( and very
> common ) 9:00 AM syndrome whereby everyone at
> a company makes the same exact request for the
> same exact page right about the same exact time
> every day ( Like a base level company web page )
> because everyone fires up their computer within
> a few moments of each other every morning.
>
> In a company environment like that it is highly likely
> that everyone WILL be using the 'exact same browser'
> and that the request headers for the same URI will
> all be IDENTICAL for each and every request.
>
> If that company web page happened to be across
> the Atlantic Ocean and was NOT being 'cached'
> solely because it has "Vary: *" in the response
> header(s) then I imagine some IT director somewhere
> would be scratching his head wondering why the overseas
> bandwidth bills are so high since the company is
> SUPPOSED to be using a Proxy Cache on either
> side of the Ocean to minimize the high-cost traffic.
>
> Now that SQUID 2.5 is finally not refusing to cache
> ANY "Vary:" responses... I wonder what the policy is
> with regards to "Vary: *" and SQUID 2.5? Will SQUID 2.5
> still treat "Vary: *" as "Expires: -1" and NEVER cache
> those responses or will it, in fact, go ahead and cache
> it on the off chance that it MIGHT be able to use it
> for awhile ( until the Vary: * condition is satisfied ).
>
> I would bet that even SQUID 2.5 is simply treating "Vary: *"
> as "Expires: -1" and is never, ever caching those responses.
>
> >> 1. If a response shows up with a "Vary:" header and there
> >> isn't any compressed data along with it... then MSIE
> >> will NOT cache that response because it has no idea
> >> how to handle "Vary:" ( nor is it really supposed to
> >> since it's not really a Proxy Cache ). It is simply doing what
> >> even SQUID has done with regards to "Vary:" up until
> >> just 7 weeks ago when SQUID 2.5 was released. SQUID
> >> would just refuse to ever cache anything that had a "Vary:"
> >> header.
> >
> >Disagree. When, for example, you provide User-Agent specific JavaScript
> >or CSS, you would consider placing "Vary: User-Agent" in your response
> >even with no compression at all.
>
> Yes... of course. That's what "Vary:" was meant for... but
> 2 points come to mind...
>
> 1. The "Vary: User-Agent" signal on a RESPONSE is still meant
> for inline Proxy Server Caches only and NOT for the 'end point'
> User-Agent itself. The 'User-Agent' knows what its own 'User-Agent'
> string is and it's never going to change. ( at least not while the
> browser is currently running, anyway ) "Vary: User-Agent" is only
> meant for an inline Server that has some chance of receiving
> DIFFERENT requests from DIFFERENT "User-Agents".
> "Vary: User-Agent" is meaningless to the end-point browser.
>
> 2. Are you saying that if/when you send CSS or JS to MSIE with
> "Vary: User-Agent" then that response DOES get 'cached' by
> MSIE? If it does... then that would be a scenario that defies
> what others have been saying on this thread. What they seem
> to be saying is that it looks like MSIE will NEVER 'cache'
> anything that has a "Vary:" field of any kind in the response
> header(s). ( Unless I totally missed the point of the discussion ).
>
> Don't get me wrong here... I think that MSIE should most certainly
> take a look at the "Vary: User-Agent" response header and realize
> that the current response is an answer to its OWN request and
> go ahead and CACHE IT... but that's not the behavior that seems
> to be reported at this time.
>
> Could that be why a lot of Javascript is so damn slow?
> It's arriving with "Vary: User-Agent" and MSIE is refusing to
> cache it and so it keeps reloading the same damn Javascript
> all the time?
>
> >It is not supposed to affect the Expires header which you could provide
> >with the same response.
>
> No.. it is not.
>
> As far as I know... sending something like "Expires: -1" should
> take precedence over ANY other 'caching rules' and that
> response should NEVER be cached by ANYONE... Inline
> Proxy Cache Server or end-point User-Agent (browser).
>
> >> If anyone wants to know exactly what MSIE is doing with
> >> regards to "Vary:" then I suggest you just ASK them.
> >
> > Last time I've sent my request to M$ on June 13, 2002. I'm still waiting
> > for the response...
> >
> > Thanks,
> > Slava
>
> Then follow it up.
>
> Light a fire under someone's ass.
>
> They are now LEGALLY supposed to tell you what you
> want to know about their 'internals'. That's what the
> court case was all about and that was what a JUDGE
> told them they have to do.
>
> Yours, Kevin
>
> PS: Disclaimer... I am not a lawyer and I have not read
> the entire court decision against Microsoft but I did give
> a once over and this is my layman's impression of what
> they are supposed to do. They do not have to break the
> company ( Microsoft ) up into tiny little divisions so long
> as they begin to 'play fair' and let developers who are
> trying to work with their products have the information
> they need to do their work... AND to 'develop similar
> products'. It was all about (finally) letting go of 'secrets'
> which created the monopoly in the first place.