[Mod_gzip] "mod_gzip_send_vary=Yes" disables caching on IE? (1.3.26.1a)

mod_gzip@lists.over.net
Sun, 8 Dec 2002 22:36:35 EST


> Slava wrote...
>
>> "Vary:" is only meant to be used to indicate what REQUEST
>> headers might be involved in the 'varying' conditions that
>> might influence what PROXY CACHE should do.
>>
>> In other words... "Vary:" is IN the RESPONSE headers but
>> is only supposed to reference REQUEST headers.
>
> Except "Vary: *" which is absolutely legal and is not associated with any
> request header. It is defined by rfc2616 especially to serve some
> special responses, like personalized content.

Yes it is. You are right about that... but note that even "Vary: *"
is referencing REQUEST headers... not RESPONSE headers...
and the "Vary: *" instruction is still intended as a signal to
intermediate Proxies and NOT to actual end-point User Agents.

What "Vary: *" is actually 'telling' an inline Proxy Cache Server
could be translated like this...

"You can go ahead and cache this response if you choose to,
so you can hand out the same response to the next person 
who asks for the same URI... but be advised that if ANY of
the REQUEST headers from the next person who asks for 
this differs in any way from the REQUEST headers associated
with the CURRENT REQUEST ( the one that produced this
response ) then do NOT give them this response and you 
must come back to the Content Origin Server and let IT
decide what response this person should get."

Note that there still is no reason that a Proxy Cache Server
should NOT go ahead and 'cache' this response even though
it has "Vary: *" in the RESPONSE headers. "Vary: *" does
not mean DON'T CACHE IT... it just means that unless the
next set of REQUEST headers matches perfectly then you
can't consider this the correct response to the URI request.

In many cases... the next request for the URI will, in fact,
be IDENTICAL. If you have a client base that is all pretty
much using the same version of MSIE or Netscape then
whoever asks for that URI next is very likely to send
the exact same request headers... in which case it is still
OK for the Proxy Cache Server to 'dish out' the cached
response even though it has "Vary: *" on it.

Of course... even a slight change in the TIME or the DATE
on an "If-Modified-Since:" request will mean that at least
one of the request headers is now 'different' but that's 
beside the point. Theoretically... a Proxy Cache Server
SHOULD go ahead and 'cache' responses even though
they have "Vary: *" on them because there is ALWAYS
the chance that other requests will arrive that have
identical request headers... and the "Vary: *" has
NOT been satisfied (yet).

Example: Let's say that the response to a request 
for a static URI is always going to be output from
some Content Origin Server with "Vary: *" in the
response header.

If you were doing a test using 100 identical
browsers ( same exact version ) and you cleared all
of the browser caches so there is no chance of 
sending an "If-Modified-Since" request header ( or any
header that might have a time/date in it ) and then you had 
all of them request the exact same URI via SQUID or some other 
Proxy Cache Server... then what SHOULD happen is that the
very FIRST request for that URI SHOULD get 'cached'
by the Proxy even though it has "Vary: *" on it.

There are 99 other requests for the exact same URI
that are about to arrive which will, in fact, have the
EXACT SAME REQUEST HEADERS as the first
request ( that should have primed the cache ) so 
it would violate the 'principle of least astonishment'
to think that the Proxy Server would make 99 more
requests for the same URI from the same upstream
Server when the "Vary: *" conditions have NOT been
met yet and it SHOULD be serving up the same
non-expired response right out of its own cache.
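
To make the 100-browser test concrete, here is a minimal sketch of a
toy proxy that follows the reading argued above: a stored "Vary: *"
response is reused only when EVERY request header is identical.
ToyProxy, origin() and the header values are made up for illustration
and are not SQUID code. Note that RFC 2616 section 13.6 words this more
strictly ( a "*" value always fails to match ), so treat this purely as
a model of the argument, not of what a compliant cache must do.

```python
class ToyProxy:
    """Hypothetical cache: reuse a response only if all request headers match."""

    def __init__(self, origin):
        self.origin = origin        # callable(uri, headers) -> body
        self.store = {}             # (uri, frozen request headers) -> body
        self.origin_fetches = 0     # how often we had to go upstream

    @staticmethod
    def _key(uri, headers):
        # Header names are case-insensitive, so normalise them first.
        return (uri, frozenset((k.lower(), v) for k, v in headers.items()))

    def get(self, uri, headers):
        key = self._key(uri, headers)
        if key not in self.store:
            self.origin_fetches += 1
            self.store[key] = self.origin(uri, headers)
        return self.store[key]


def origin(uri, headers):
    # Stand-in for the Content Origin Server.
    return "body of " + uri


proxy = ToyProxy(origin)
browser = {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Accept-Encoding": "gzip, deflate",
}

# 100 identical browsers with empty caches ask for the same URI.
for _ in range(100):
    proxy.get("/index.html", dict(browser))
identical_fetches = proxy.origin_fetches

# One extra header ( here an If-Modified-Since ) breaks the match.
revalidating = dict(browser)
revalidating["If-Modified-Since"] = "Sun, 08 Dec 2002 22:36:35 GMT"
proxy.get("/index.html", revalidating)
after_ims_fetches = proxy.origin_fetches
```

Under this model the 100 identical requests cost one upstream fetch,
and the request carrying the extra header costs a second one.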

This test would be similar to the famous ( and very
common ) 9:00 AM syndrome whereby everyone at
a company makes the same exact request for the
same exact page right about the same exact time
every day ( Like a base level company web page )
because everyone fires up their computer within
a few moments of each other every morning.

In a company environment like that it is highly likely
that everyone WILL be using the 'exact same browser'
and that the request headers for the same URI will
all be IDENTICAL for each and every request.

If that company web page happened to be across
the Atlantic Ocean and was NOT being 'cached'
solely because it has "Vary: *" in the response
header(s) then I imagine some IT director somewhere 
would be scratching his head wondering why the overseas
bandwidth bills are so high since the company is
SUPPOSED to be using a Proxy Cache on either
side of the Ocean to minimize the high-cost traffic.

Now that SQUID 2.5 has finally stopped refusing to cache
"Vary:" responses altogether... I wonder what the policy is
with regards to "Vary: *" and SQUID 2.5? Will SQUID 2.5
still treat "Vary: *" as "Expires: -1" and NEVER cache
those responses or will it, in fact, go ahead and cache
them on the off chance that it MIGHT be able to use them
for a while ( until the Vary: * condition is satisfied )?

I would bet that even SQUID 2.5 is simply treating "Vary: *"
as "Expires: -1" and is never, ever caching those responses.

>> 1. If a response shows up with a "Vary:" header and there
>> isn't any compressed data along with it... then MSIE
>> will NOT cache that response because it has no idea
>> how to handle "Vary:" ( nor is it really supposed to
>> since it's not really a Proxy Cache ). It is simply doing what
>> even SQUID has done with regards to "Vary:" up until
>> just 7 weeks ago when SQUID 2.5 was released. SQUID
>> would just refuse to ever cache anything that had a "Vary:"
>> header.
>
>Disagree. When for example you provide the User-Agent specific JavaScript,
>or CSS, you would consider to place "Vary: User-Agent" in your response 
>even with no compression at all.

Yes... of course. That's what "Vary:" was meant for... but
2 points come to mind...

1. The "Vary: User-Agent" signal on a RESPONSE is still meant
for inline Proxy Server Caches only and NOT for the 'end point'
User-Agent itself. The 'User-Agent' knows what its own "User-Agent"
string is and it's never going to change. ( at least not while the
browser is currently running, anyway ) "Vary: User-Agent" is only
meant for an inline Server that has some chance of receiving
DIFFERENT requests from DIFFERENT "User-Agents".
"Vary: User-Agent" is meaningless to the end-point browser.

2. Are you saying that if/when you send CSS or JS to MSIE with
"Vary: User-Agent" then that response DOES get 'cached' by
MSIE? If it does... then that would be a scenario that defies 
what others have been saying on this thread. What they seem
to be saying is that it looks like MSIE will NEVER 'cache'
anything that has a "Vary:" field of any kind in the response
header(s). ( Unless I totally missed the point of the discussion ).
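
Point 1 above can be sketched in a few lines. This is only an
illustration of how a cache might fold the request headers named in
"Vary:" into its lookup key; variant_key() and the User-Agent strings
are invented for the example and do not come from MSIE, SQUID or
mod_gzip. "Vary: *" is deliberately not handled here.

```python
def variant_key(uri, vary_value, request_headers):
    """Build a cache key from the URI plus the request headers named in Vary."""
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # "Vary: Accept-Encoding, User-Agent" -> ["accept-encoding", "user-agent"]
    names = sorted(n.strip().lower() for n in vary_value.split(",") if n.strip())
    return (uri, tuple((n, lowered.get(n)) for n in names))


msie = {"User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"}
netscape = {"User-Agent": "Mozilla/4.79 [en] (Windows NT 5.0; U)"}

# An inline proxy sees requests from DIFFERENT User-Agents.
proxy_keys = {variant_key("/site.js", "User-Agent", h)
              for h in (msie, netscape, msie)}

# A single browser only ever sees its own User-Agent string.
browser_keys = {variant_key("/site.js", "User-Agent", msie)
                for _ in range(3)}
```

The proxy ends up with one key per distinct User-Agent, while the
browser's key never changes, which is why the header carries no new
information for the end point.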

Don't get me wrong here... I think that MSIE should most certainly
take a look at the "Vary: User-Agent" response header and realize
that the current response is an answer to its OWN request and
go ahead and CACHE IT... but that's not the behavior that seems
to be reported at this time.

Could that be why a lot of Javascript is so damn slow?
It's arriving with "Vary: User-Agent" and MSIE is refusing to
cache it and so it keeps reloading the same damn Javascript
all the time?

>It is not supposed to affect the Expires header which you could provide 
>with the same response.

No... it is not.

As far as I know... sending something like "Expires: -1" should
take precedence over ANY other 'caching rules' and that 
response should NEVER be cached by ANYONE... Inline
Proxy Cache Server or end-point User-Agent (browser).
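
For what it's worth, "-1" is not a valid HTTP date, and RFC 2616
section 14.21 says invalid Expires values are to be treated as being
in the past ( already expired ). A rough sketch of that one check
follows; is_expired() is a made-up helper, and it does NOT model
whether any other header overrides Expires.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value, now):
    """Return True if an Expires header value is invalid or not in the future."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Unparseable values such as "-1" or "0" count as already expired.
        return True
    if expires_at.tzinfo is None:
        # Treat a date without zone information as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return expires_at <= now


now = datetime(2002, 12, 9, tzinfo=timezone.utc)
minus_one = is_expired("-1", now)
far_future = is_expired("Thu, 01 Jan 2099 00:00:00 GMT", now)
yesterday = is_expired("Sun, 08 Dec 2002 22:36:35 GMT", now)
```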

>> If anyone wants to know exactly what MSIE is doing with
>> regards to "Vary:" then I suggest you just ASK them.
>
> Last time I've sent my request to M$ on June 13, 2002. I'm still waiting
> for the response...
>
> Thanks,
> Slava

Then follow it up.

Light a fire under someone's ass.

They are now LEGALLY supposed to tell you what you
want to know about their 'internals'. That's what the
court case was all about and that was what a JUDGE
told them they have to do.

Yours, Kevin

PS: Disclaimer... I am not a lawyer and I have not read
the entire court decision against Microsoft but I did give
it a once-over and this is my layman's impression of what
they are supposed to do. They do not have to break the
company ( Microsoft ) up into tiny little divisions so long
as they begin to 'play fair' and let developers who are 
trying to work with their products have the information
they need to do their work... AND to 'develop similar
products'. It was all about (finally) letting go of 'secrets'
which created the monopoly in the first place.