Antwort: Re: [Mod_gzip] "mod_gzip_send_vary=Yes" disables caching on IE? (1.3.26.1a)

mod_gzip@lists.over.net mod_gzip@lists.over.net
Mon, 9 Dec 2002 20:35:45 +0200


Hi Kevin,


> Looks like Slava is saying he's not having any problems at
> all and that everything is working fine while (Jordan?) says
> that MSIE isn't caching ANY of the non-compressed variants
> if they have ANY kind of "Vary:" header.
>
> Can we narrow this down to the specific problem and stop
> spinning our wheels?
> Is this just an MSIE version specific issue?
> What (Exact) versions and build levels of MSIE are we talking
> about here and does one really actually WORK while another does
> NOT?

I am running Apache 1.3.26 plus mod_gzip 1.3.19.1a, but
with my own mod_headers configuration for sending "Vary:"
headers (as mod_gzip 1.3.26.1a still has a minor bug).
This setup has been running for about half a year now, with
customers using every kind of M$IE from 5.0 through 5.5 to
6.0 on this server.

I am running a log file analysis tool (mgzta) that is
partly UserAgent specific. For example, I can see the
rate of mod_gzip compression events for each exact
UserAgent string, as well as the average HTTP response
size for each UserAgent string. In some cases these values
differ from each other because certain customers
have specific network configurations; one of them even
asked us to turn off mod_gzip for them because printing
the pages with Netscape 4.7 is their top priority.

My users are working on the same server almost every day.
Therefore caching is a high priority issue for me.
Besides, I am able to trace each individual user in my
Apache logs (because they are forced to have unique cookie
values from login, and I have these in access_log).
Therefore I can easily compute the rate of HTTP status
codes 200, 304 and others for each user, and I can compute
tables that can be sorted by each of these values.
I run CGI scripts doing just that.
So what are they telling me?

My logs contain 720 individual users for today, using 77
different UserAgent strings. Of these users, I can see 184
who did not cause a single HTTP 304 response the whole day -
they are either caching perfectly (they are supplied with
"Expires:" headers for images that last for 1-2 weeks, and
most HTML pages are either "no-cache" or have "Expires:"
intervals as well) or not at all - but the latter would cause
an unusually high number of HTTP requests because the pages
in question contain large numbers of small images. And among
all these 184 users I cannot identify a single one who has
obviously configured his browser not to cache anything at
all (I would then detect certain HTTP request patterns at
first sight), and most of them are far below the average
number of total HTTP requests.
At the other end of the table, there are 114 users for whom
more than 30% of all HTTP requests are cache validations -
these seem to have a bad caching strategy configured,
mostly they will run "validate always". I can detect all
types of browsers there, M$IE of all versions as well as
different Netscape 4 variants ... given the corresponding
working style with our product, this may be as bad as 81% of
all HTTP requests being of the 304 type (lots of small images
for certain pages ...). There is even one Netscape 7 among
these - you can do it wrong with every UserAgent. :-(
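
A minimal Python sketch of this kind of per-user analysis
(not the CGI scripts mentioned above). The log format, with
the Cookie header as the last quoted field, and all function
names are assumptions made for illustration only:

```python
import re
from collections import Counter, defaultdict

# Assumed format: common log format followed by the Cookie
# header as a final quoted field, e.g.
# ... "GET /a.gif HTTP/1.0" 304 - "session=u1"
LINE_RE = re.compile(r'"[^"]*" (\d{3}) \S+ "([^"]*)"\s*$')


def parse_line(line):
    """Return (user_cookie, status) or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group(2), int(m.group(1))


def tally_by_user(lines):
    """Count HTTP status codes per user cookie value."""
    tallies = defaultdict(Counter)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines in an unexpected format
        user, status = parsed
        tallies[user][status] += 1
    return tallies


def ratio_304(counts):
    """Share of a user's requests that were answered with 304."""
    total = sum(counts.values())
    return counts[304] / total if total else 0.0


def classify(tallies, heavy_threshold=0.30):
    """Return (users without any 304, users above the 304 threshold)."""
    never, heavy = [], []
    for user, counts in tallies.items():
        if counts[304] == 0:
            never.append(user)
        elif ratio_304(counts) > heavy_threshold:
            heavy.append(user)
    return sorted(never), sorted(heavy)
```

Sorting the tallies by ratio_304() gives the kind of table
described above, with the "validate always" users at one end.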

All in all I have an HTTP 304 ratio of 16.5% today, which
is the result of sending "Expires:" and "Cache-Control:"
headers as well as including JavaScript and CSS mostly on
the server side (so they can be compressed for Netscape 4).
The rate of compressed responses is slightly above 80% for
HTML files, which make up 60% of the requests but 90%
of the uncompressed traffic volume, even when adding an
estimated 1 KB of HTTP request and response headers to each
content size; overall traffic reduction is then 56%.
With the best possible browser configuration a user can
easily get below 10% of HTTP 304 responses - one
third of all our users are working in this range, including
all of my colleagues from my office. But that would require
all visitors to set their browser configuration to
"automatic", thus believing the HTTP headers we send them.
Unlike compression support, which shows in the
"Accept-Encoding:" header, I cannot detect the
browser cache configuration by some individual HTTP header,
thus I can't show the users a warning on the login page
(which I do when they are not using compression). I may be
able to see whether they do a conditional GET, but I cannot
easily find out whether they should rather have relied on
the expiration period I told them via the corresponding
HTTP headers.
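
To illustrate what can be read from the request headers, here
is a small Python sketch. It assumes the headers are available
as a dict with lower-cased names; the function names are made
up for this example, and the "*" wildcard in "Accept-Encoding:"
is deliberately ignored:

```python
def accepts_gzip(headers):
    """True if Accept-Encoding lists gzip with a non-zero q-value."""
    value = headers.get("accept-encoding", "")
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() not in ("gzip", "x-gzip"):
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False


def is_conditional_get(method, headers):
    """True if the request asks for cache validation."""
    return method == "GET" and (
        "if-modified-since" in headers or "if-none-match" in headers
    )
```

The first check is enough to decide about a compression
warning; the second only shows that a validation happened,
not whether the browser's cache settings made it necessary.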

All of these responses are getting "Vary: Accept-Encoding",
regardless of whether they are compressed or not - I want to
"play fair" to whatever proxy may be out there (and there
are Squids of all shapes and sizes at the locations of our
customers, back to version 2.1, as well as some other
proxies).
What I actually _don't_ see is any special behaviour of
any particular M$IE version when dealing with "Vary:" headers.
(I don't have Apple Macs as clients, only Windows, plus
some OS/2 with Netscape 4.61.)
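
The idea of sending "Vary: Accept-Encoding" on every response,
compressed or not, can be sketched in Python as follows. This
is only an illustration, not how mod_gzip or mod_headers work
internally; build_response is a hypothetical helper, and the
substring test on the header value is a deliberate
simplification:

```python
import gzip


def build_response(body, accept_encoding):
    """Return (headers, body), compressing only if the client allows gzip."""
    # Vary is set unconditionally, so a proxy knows that the response
    # depends on Accept-Encoding even when this variant is uncompressed.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in accept_encoding.lower():
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body
```

A proxy that honours "Vary:" can then keep both variants apart
and never hand a compressed copy to a client that did not ask
for one.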

On the other hand I _do_ know a behaviour of M$IE
similar to the one that was described in this thread. I can
even reproduce it against one specific Apache server in our
office (but only with M$IE, not with any other browser).
I have another Apache server running the same Apache and
mod_gzip version and the same configuration as the
aforementioned one, and the effect will _not_ occur there; I
can reproduce both within the same browser session, for each
M$IE version we have here, 5.0 as well as 5.5 or even 6.0.
This behaviour started some months ago, but I don't have
any explanation for it ... but I definitely send identical
HTTP headers from both servers, and there is no proxy
between my browser and either of them ...

Regards, Michael