[Mod_gzip] MSIE cannot handle Vary header(s)

Slava Bizyayev mod_gzip@lists.over.net
Tue, 10 Dec 2002 23:45:09 -0600


Yes, I'm reading...

I'm pretty sure I saw the right behavior of M$IE-6.0 on "Vary: User-Agent,*"
for uncompressed content with my own eyes. I will definitely double check my
experiments (as soon as I restore my server) and let you know, because from
my point of view the restriction on the use of * has a huge impact on the
strategy of content delivery to M$ browsers.

Regards,
Slava

----- Original Message -----
From: <TOKILEY@aol.com>
To: <mod_gzip@lists.over.net>
Cc: <tomaz.borstnar@over.net>; <robertc@squid-cache.org>;
<hno@marasystems.com>; <cranstone1@attbi.com>; <CK1@wwwtech.de>;
<sbizyaye@outlook.net>; <jr-list-mod_gzip@quo.to>
Sent: Tuesday, December 10, 2002 3:20 AM
Subject: Re: [Mod_gzip] MSIE cannot handle Vary header(s)


>
> Hello all.
>
> This is a continuation of the thread entitled...
>
> [Mod_gzip] "mod_gzip_send_vary=Yes" disables caching on IE
>
> After several hours spent doing my own testing with MSIE and
> digging into MSIE internals with a kernel debugger I think I
> have the answers.
>
> The news is NOT GOOD.
>
> I will start with a SUMMARY first for those who don't have the
> time to read the whole, ugly story but for those who want to
> know where the following 'conclusions' are coming from I
> refer you to the rest of the message and the "detail".
>
> SUMMARY
>
> There is only 1 request header value that you can use with
> "Vary:" that will cause MSIE to cache a non-compressed
> response and that is ( drum roll please ) "User-Agent".
>
> If you use ANY other (legal) request header field name in
> a "Vary:" header then MSIE ( Versions 4, 5 and 6 ) will
> REFUSE to cache that response in the MSIE local cache.
>
> This is why Jordan is seeing a caching problem and Slava
> is not. Slava is 'accidentally' using the only possible "Vary:"
> field name that will cause MSIE to behave as it should
> and cache a non-compressed response.
>
> Jordan is seeing non-compressed responses never being
> cached by MSIE because the responses are arriving
> with something other than "Vary: User-Agent" like
> "Vary: Accept-Encoding".
>
> It should be perfectly legal and fine to send "Vary: Accept-Encoding"
> on a non-compressed response that can 'Vary' on that field
> value and that response SHOULD be 'cached' by MSIE...
> but so much for assumptions. MSIE will NOT cache this response.
>
> MSIE will treat ANY field name other than "User-Agent"
> as if "Vary: *" ( Vary + STAR ) was used and it will
> NOT cache the non-compressed response.
>
> The reason the COMPRESSED responses are, in fact,
> always getting cached no matter what "Vary:" field name
> is present is just as I suspected... it is because MSIE
> decides it MUST cache responses that arrive with
> "Content-Encoding: gzip" because it MUST have a
> disk ( cache ) file to work with in order to do the
> decompression.
>
> The problem exists in ALL versions of MSIE but it's
> even WORSE for any version earlier than 5.0. MSIE 4.x
> will not even cache responses with "Vary: User-Agent".
>
> That's it for the SUMMARY.
>
> The rest of this message contains the gory details.
>
> There are 'sections' to this since it gets a little deep.
>
> * WHY WILL MSIE ONLY CACHE "Vary: User-Agent"
>
> Because this was specifically reported as a bug against
> MSIE 4.0 way back in 1999 and they hacked a fix into
> the browser base code for it. That 'hack' has been
> carried forward to every new version but they have never done
> anything else with "Vary:" and to this day "User-Agent" is the
> only string value for "Vary:" that they are even 'checking' for.
>
> I discovered this only AFTER using a kernel debugger and
> watching the code evaluate the "Vary:" field. Once I saw that
> it was the only possible value that would cause anything
> to be 'cached' I did a GOOGLE search and found that it
> has been a known issue since 1999.
>
> For anyone who is interested you might want to check out
> the following links and read the 'history' behind this bug...
>
> This first link is a Problem Report submitted to the Apache
> Server folks on March 25, 1999. The TITLE of that bug report is...
>
> "Client bug: IE 4.0 breaks with "Vary" header"
>
> http://bugs.apache.org/index.cgi/full/4118
>
> The problem was much more serious in MSIE 4.0 than
> it is now. MSIE 4.0 was actually getting TOTALLY
> confused whenever a "Vary:" header would arrive
> in a response and it was treating it as a download error
> and putting up a nasty Dialog box.
>
> It was a very VISIBLE bug and that's why Microsoft
> stepped in and "fixed" it.
>
> The following is taken from the PR report itself.
>
> [snip]
>
> When Internet Explorer receives a "Vary: Host" header, or a "Vary: *"
> header, the system will improperly report "file not found".  The exact
> error message is: "Internet Explorer cannot download from the Internet
> site viewer.zip from palm.dahm.com.  The downloaded file is not available.
> This could be due to your Security or Language settings or because the
> server was unable to retrieve the requested file."
>
> [snip]
>
> What (apparently) happened when Microsoft realized they had
> this bad bug in their "Vary:" handler is that they simply 'hacked'
> it so that at least it would not put up such a bad ( and incorrect )
> error message. It was around that time that they at least added
> some code to pick up a "Vary:" field but all they really did was
> set a flag to cause ALL RESPONSES that have "Vary:" headers
> to be treated as "Vary: *" and nothing would ever be cached.
>
> This was really no different from what all existing versions of
> SQUID ( at that time ) would do. It was only 7 weeks ago that
> a version of SQUID emerged which would do anything other
> than this base level 'hack' at "Vary:" handling.
>
> ASIDE: What is interesting about the above Apache PR report
> from 1999 is that they added a 'hack' of their own to get
> around the problem which few people know about but which
> could NOW cause all kinds of NEW problems if anyone
> is still actually using the 'hack'...
>
> The following is taken directly from Dean Gaudet's commit
> log when he 'patched' the server to get around the "Vary:"
> problem(s) in MSIE...
>
> [snip]
>
> A new environment variable, "force-no-vary", has been added.
> If set with BrowserMatch, the Vary field will not be sent as part of
> the response header.  This change should appear in the next release
> after 1.3.6.  Thanks for the report and for using Apache!
>
> [snip]
>
> What that means is that people can UNCONDITIONALLY
> set their Servers to NEVER send "Vary:" headers based
> totally on the "BrowserMatch" directive in Apache.
>
> This means that NO ONE would be able to add a "Vary:"
> header ( like mod_gzip or mod_deflate or DynaZip or
> whoever ) even if they WANTED to. It will be 'stripped out'.
>
> Something to keep in mind as all this "Vary:" stuff starts
> looming closer over the horizon.
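
For anyone who has not seen the directive in use, here is a minimal sketch of what such a setting might look like in httpd.conf. The "force-no-vary" variable and the BrowserMatch directive are the ones named in the commit log above; the pattern "MSIE 4\." is only an assumption chosen for illustration, not a recommended setting.

```apache
# Hypothetical fragment: suppress the Vary response header,
# but only for clients whose User-Agent matches MSIE 4.x.
# The regex below is an illustrative assumption.
BrowserMatch "MSIE 4\." force-no-vary
```

With a line like this in place, any "Vary:" header added by a compression module would be removed for the matching clients before the response goes out.
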
>
> But I digress...
>
> Sometime after those initial MSIE bug reports and the first addition
> of any code to even handle "Vary:" at all... the single pickup
> for "Vary: User-Agent" was added in response to a specific
> bug report against MSIE.
>
> They added a flag to allow responses with "Vary: User-Agent"
> to be cached locally and this fixed the 'bug' report but ALL
> other values for "Vary:" field were (are) still ignored and
> are treated just like "Vary: *" and nothing is cached locally.
>
> The following is a 'message thread' at the W3C.ORG
> forum itself from just this past spring/summer which shows
> that the problems still exist even in MSIE 6.0.
>
> If you go to this message link you can just click on 'Previous
> Message' and 'Next message' to move forward and back
> through the thread....
>
> http://lists.w3.org/Archives/Public/ietf-http-wg/2002AprJun/0046.html
>
> * TEST CASES
>
> I played with an HTTP Server and MSIE and narrowed down
> the test response to the absolute minimum that would
> produce the CORRECT behavior.
>
> NOTE: At no time during these tests was I sending
> any actual compression. My goal was to narrow down
> what happens with NON-COMPRESSED responses
> which is really the whole issue.
>
> The CORRECT behavior is for the response to be CACHED
> locally by MSIE and only retrieved if/when it EXPIRES or
> if the user presses CTRL-R ( Reload ) or hits the "Refresh"
> button.
>
> Once MSIE has cached a page locally then hitting either the
> FORWARD or BACK buttons or choosing the page from the
> History list should NOT cause a new request for the page
> to be emitted from the browser. When it is functioning
> correctly MSIE should do nothing but reload the page
> from the local cache.
>
> The following was my 'base response' ( Similar to Jordan's
> test case since it only sends back "Hello World"... )
>
> [snip]
> HTTP/1.1 200 OK
> Content-Type: text/plain
> Connection: Close
>
> Hello World
> [snip]
>
> All versions of MSIE will simply 'do the right thing' when this
> document arrives and will store it in the local cache and will
> NOT 'reload' it unless you force it to by pressing 'Refresh'
> or by clearing your local cache.
>
> It makes no difference if there is an "Expires:" field.
> If there isn't one... the default value assigned to the
> document in the local cache is "Expires: None" which
> means it will be there until you clear your cache ( manually ).
>
> The very next test I tried was simply adding a "Vary:"
> field and I chose the one that is MOST relevant to
> this discussion... "Vary: Accept-Encoding"
>
> Here is the next response that arrived in MSIE...
>
> [snip]
> HTTP/1.1 200 OK
> Content-Type: text/plain
> Connection: Close
> Vary: Accept-Encoding
>
> Hello World
> [snip]
>
> No version of MSIE will cache this response.
>
> If any response ever arrives with "Vary: Accept-Encoding"
> then MSIE will constantly go back upstream to get
> a new copy of the document... even when you are
> simply using the BACK/FORWARD buttons or choosing
> the original URI from the browser history list.
>
> This is the behavior that Jordan and Tomaz and others
> have discovered.
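
To reproduce tests like these, a minimal sketch in Python follows. It builds the same bare response, optionally with a "Vary:" header, and can answer every connection with it. The names build_response and serve are hypothetical, and nothing here is the original test harness; it only assumes you point the browser at a local port and watch how often a request actually arrives.

```python
import socket
from typing import Optional


def build_response(vary: Optional[str] = None, body: str = "Hello World") -> bytes:
    """Return the raw bytes of the minimal test response.

    If `vary` is given, a "Vary:" header with that value is appended
    after the three base headers; otherwise no Vary header is sent.
    """
    headers = [
        "HTTP/1.1 200 OK",
        "Content-Type: text/plain",
        "Connection: Close",
    ]
    if vary is not None:
        headers.append(f"Vary: {vary}")
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + body.encode("ascii")


def serve(port: int, vary: Optional[str] = None) -> None:
    """Answer every connection on 127.0.0.1:port with the same canned response."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", port))
        listener.listen(5)
        while True:
            conn, peer = listener.accept()
            with conn:
                conn.recv(65536)  # read and discard the request
                print("request from", peer)  # one line per trip upstream
                conn.sendall(build_response(vary))
```

Calling serve(8080, "Accept-Encoding") and then using BACK/FORWARD in the browser should show a new "request from" line each time the browser goes upstream instead of using its local cache.
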
>
> The next thing I tried was simply some 'other'
> request field name along with Vary:
>
> The "User-Agent" field name would probably
> be the second most-relevant to compression
> variants so here is what I sent to MSIE next...
>
> [snip]
> HTTP/1.1 200 OK
> Content-Type: text/plain
> Connection: Close
> Vary: User-Agent
>
> Hello World
> [snip]
>
> This worked fine. The response WAS CACHED by MSIE ( 5.x, 6.x )
> just fine... just like the original base document with no "Vary:" field.
>
> NOTE: Only versions of MSIE higher than 4.x will cache this.
>
> I then tried all kinds of 'combinations' of strings in the
> "Vary:" field. In the interests of time here I will just
> list what 'variations' I tried for "Vary:" and the results.
>
> Each test was identical to the 'base test' in all
> other respects.
>
> This gets pretty interesting...
>
> Vary: Accept-Encoding   <- Response is NOT cached by MSIE
> Vary: User-Agent        <- Response is CACHED! Always reloads from local disk
> Vary: Accept-Language   <- Response is NOT cached by MSIE
> Vary: *                 <- Response is NOT cached by MSIE ( Correct behavior )
> Vary: Host              <- Response is NOT cached by MSIE
>
> NOTE: I think it's pretty amazing that MSIE won't even
> cache anything just because it has "Vary: Accept-Language".
> Of all the request headers... "Accept-Language" would
> probably be the most often used "Vary:" field and
> "Vary: Accept-Language" represents one of the reasons
> why "Vary:" was invented in the first place... so people
> could be sure they are getting the right LANGUAGE
> on the pages they ask for.
>
> * SPELLING COUNTS ( SORT OF )
>
> A little more testing proved that the actual pickup for
> "User-Agent" in MSIE ( The only one they are checking
> for ) is, in fact, a 'strncmp()' style compare that is NOT
> case-sensitive ( i.e. it behaves more like 'strnicmp()' ).
> This is what you would expect.
>
> Vary: User-Agent   <- Response is CACHED
> Vary: user-Agent   <- Response is CACHED
> Vary: User-agent   <- Response is CACHED
> Vary: user-agent    <- Response is CACHED
>
> However... here is where SPELLING COUNTS...
>
> Vary: User-Agent:  <- Response is NOT CACHED ( Extra colon on the end )
> Vary: User Agent   <- Response is NOT CACHED ( SPACE instead of HYPHEN )
> Vary: UserAgent    <- Response is NOT CACHED ( Hyphen left out )
>
> Punctuation also produces some strange results...
>
> ( Keep in mind that you are supposed to be allowed to list
> any number of field names separated by commas... )
>
> Vary: User-Agent,   <- Response is NOT CACHED ( Single comma on the end )
> Vary: ,User-Agent   <- Response is NOT CACHED ( Single comma on front )
> Vary: "User-Agent"  <- Response is NOT CACHED ( Quote marks not allowed )
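
For comparison, here is a minimal sketch of how a tolerant parser might split a "Vary:" value into field names. It assumes the HTTP/1.1 comma-list rule, under which empty list elements are simply skipped, so a stray leading or trailing comma would not change the result. The function name parse_vary is hypothetical, and the sketch does not validate that each element is a legal token ( so it would not reject the quoted form ).

```python
from typing import List


def parse_vary(value: str) -> List[str]:
    """Split a Vary header value into lower-cased field names.

    Elements are separated by commas; surrounding whitespace is
    trimmed and empty elements are skipped.
    """
    names = []
    for element in value.split(","):
        name = element.strip()
        if name:
            names.append(name.lower())
    return names
```

Under this reading "User-Agent," and ",User-Agent" both come out as the single name "user-agent", which is why the NOT CACHED results above look strange.
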
>
> The following was a surprise....
>
> Even though the "User-Agent" field name is present MSIE will still
> refuse to cache the response if there is ANOTHER field name present...
>
> Vary: User-Agent, Accept-Encoding  <- Response is NOT CACHED.
>
> And despite what Slava says about using "Vary: User-Agent,*"
> ( User-Agent + comma + STAR ) I could NOT get any of the
> following responses to cache at all..
>
> Vary: User-Agent,*    <- Response is NOT CACHED ( No space after comma )
> Vary: User-Agent, *   <- Response is NOT CACHED ( Space after comma )
> Vary: *,User-Agent    <- Response is NOT CACHED ( No space after comma )
> Vary: *, User-Agent   <- Response is NOT CACHED ( Space after comma )
>
> Slava? Are you still reading along?
>
> Are you SURE you are seeing non-compressed documents
> cached locally by MSIE using "Vary: User-Agent,*" ?
>
> I could NOT get this to work in ANY version of MSIE
> but just plain old "Vary: User-Agent" DOES WORK.
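
The results reported in this message can be condensed into a small model. This is only a sketch of the OBSERVED behavior as described here, not MSIE's actual code. The function name and parameters are hypothetical, and trimming whitespace around the value is an assumption the tests above do not cover.

```python
from typing import Optional


def msie_keeps_local_copy(
    vary: Optional[str],
    msie_major: int = 6,
    content_encoding: Optional[str] = None,
) -> bool:
    """Model whether MSIE caches a response, per the reported tests."""
    # Compressed responses were reported as always cached, because
    # MSIE needs a disk file to decompress from.
    if content_encoding is not None and content_encoding.strip().lower() == "gzip":
        return True
    # No Vary header at all: cached normally.
    if vary is None:
        return True
    # MSIE 4.x was reported not to cache even "Vary: User-Agent".
    if msie_major < 5:
        return False
    # Only the exact single name "User-Agent" ( any letter case ) works.
    # Stripping surrounding whitespace is an assumption.
    return vary.strip().lower() == "user-agent"
```

For example, msie_keeps_local_copy("User-Agent, Accept-Encoding") is False in this model while msie_keeps_local_copy("user-agent") is True, matching the lists above.
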
>
> * CACHE-CONTROL FORCES MSIE TO WORK CORRECTLY
>
> "Vary: Accept-Encoding" will always cause MSIE to refuse
> to cache the response.
>
> However... if you simply send the following instead...
>
> Vary: Accept-Encoding
> Cache-Control: private
>
> Then MSIE will now function 'normally' and cache the
> ( non-compressed ) response.
>
> This is not a 'magic bullet' however nor is it a 'fix'.
> It's really just an interesting discovery.
>
> Unfortunately you gain nothing but a new local cache
> file because MSIE will still go back upstream for a
> new version of the page every time you hit BACK
> or FORWARD just as if the page was never written
> to the local cache at all ( which it WAS... and it
> even has "Expires: None" on it which means MSIE
> SHOULD be reloading from the local cache )
>
> This looks like yet another bug with regards to
> 'Cache-Control:' or something. Not sure.
>
> That's about it.
>
> Like I said... the news is NOT GOOD here.
>
> It means that despite the fact that no Proxy Servers
> ( other than the brand new SQUID 2.5 only out
> for 7 weeks now ) really support "Vary:" the way it was
> designed... MSIE itself has a LOOONG way to go before
> anyone will be able to use "Vary:" for anything.
>
> ...and since you can't stop an inline Proxy from
> forwarding the "Vary:" headers to the end-point
> browser ( You probably SHOULD be able to
> but that's a whole 'nother thread of discussion )
> then it is looking more and more as if using
> "Vary:" for anything is just simply going to CAUSE
> far more problems than it will SOLVE.
>
> It will be YEARS before all this gets fixed... if ever.
>
> Gotta run.
>
> If you have read this far down then your brain
> is probably as fried as mine is at the moment.
>
> Later...
> Kevin
>
> PS: If anyone is still seeing totally different results than
> what I have now seen with my own eyes ( and debuggers )
> then let's keep this going and figure out what the heck
> is going on here.
>
>
>
>
> _______________________________________________
> mod_gzip mailing list
> mod_gzip@lists.over.net
> http://lists.over.net/mailman/listinfo/mod_gzip
>