Antwort: Re: [Mod_gzip] Gzip encoding, IE 5 & 6, proxy caches,
NetCache
mod_gzip@lists.over.net
mod_gzip@lists.over.net
Fri, 4 Oct 2002 20:11:33 +0200
Hi Nigel,
> There may be a solution to the problem. When using Gzip + SSL I
> know some people precede the document with 2048 space characters
> (these compress easily) ... it's not elegant but it is reported to work.
I don't think this is still a valid solution today, although
it was one about twelve or more months ago.
First of all, I would suggest letting the document start with the
same 2041 bytes it would normally start with, but enclosed in an
HTML comment <!-- ... --> (7 bytes of delimiters, 2048 in total).
This way, the gzip algorithm compresses this content the first time
it sees it and can simply reference the identical string the second
time. I would bet this is a candidate for the best compression rate
you can get in this case.
You could even write an Apache handler to generate such a 2048-byte
"prefix" automatically, so you wouldn't have to do this manually.
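To make the idea concrete, here is a minimal Python sketch (not an
Apache handler) that builds such a prefix and measures how much it
adds to the compressed size, next to the 2048-spaces variant. The names
make_comment_prefix, deflated_size and SAMPLE_PAGE are made up for
this illustration. It assumes an ASCII page, wraps the copy in
<!-- ... -->, and replaces every "-" in the copy with "_" because "--"
may not appear inside an HTML comment, so the copy is not perfectly
identical to the real start of the page.

```python
import zlib

# Hypothetical sample page, only here so the sketch runs on its own.
SAMPLE_PAGE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">\n'
    "<html><head><title>Sample</title></head><body>\n"
    + "".join(f"<p>Paragraph {i}: item {i * 7919 % 1000}</p>\n" for i in range(120))
    + "</body></html>\n"
)


def make_comment_prefix(html: str, total: int = 2048) -> str:
    """Return an HTML comment of exactly `total` characters that repeats
    the start of `html` (assumption: ASCII, so characters == bytes)."""
    opener, closer = "<!--", "-->"
    room = total - len(opener) - len(closer)
    # "--" is not allowed inside a comment, so neutralise every dash.
    # Pages shorter than `room` are padded with spaces.
    body = html[:room].replace("-", "_").ljust(room)
    return opener + body + closer


def deflated_size(text: str) -> int:
    """Size in bytes of `text` after zlib (deflate) compression, level 9."""
    return len(zlib.compress(text.encode("ascii"), 9))


if __name__ == "__main__":
    plain = deflated_size(SAMPLE_PAGE)
    with_spaces = deflated_size(" " * 2048 + SAMPLE_PAGE)
    with_comment = deflated_size(make_comment_prefix(SAMPLE_PAGE) + SAMPLE_PAGE)
    print("page alone:         ", plain)
    print("2048 spaces + page: ", with_spaces)
    print("comment copy + page:", with_comment)
```

Running it on your own pages shows how many compressed bytes each
kind of prefix really costs; the numbers depend on the content.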
But anyway, this won't work with modern browsers like M$IE 6.0.
The reason for this is that you change the _semantics_ of the
document.
A _valid_ HTML document requires a DOCTYPE identifier as the
first line of the document.
And modern browsers like M$IE 6.0 and Mozilla now check for this
line and make their interpretation of the content _depend_ upon
having such a line containing a valid DOCTYPE identifier:
- If they find one, they will display the document according to the
HTML standard;
- if they don't find one, they will tolerate many more errors and
"interpret" the document content.
This is called "standards compliance mode" vs. "quirks mode" in
Mozilla - I don't know whether there are corresponding names in the
M$IE universe.
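The decision described above can be sketched in a few lines of
Python. This is only the simplified model from this mail (a DOCTYPE
at the very start of the document or not); real browsers apply more
detailed rules, for example depending on which DOCTYPE is declared,
so the function guess_render_mode is purely illustrative and not
what Mozilla or M$IE actually run.

```python
def guess_render_mode(document: str) -> str:
    """Simplified model: standards mode only if the document begins
    with a DOCTYPE declaration, quirks mode otherwise."""
    # Compare case-insensitively; anything in front of the DOCTYPE
    # (spaces, a comment, other markup) counts as "no DOCTYPE" here.
    if document[:9].upper() == "<!DOCTYPE":
        return "standards compliance mode"
    return "quirks mode"


if __name__ == "__main__":
    page = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">\n<html></html>'
    print(guess_render_mode(page))
    print(guess_render_mode(" " * 2048 + page))
```

Under this model, the same page flips from one mode to the other as
soon as 2048 bytes of anything are put in front of it.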
You can check this yourself, using Mozilla (Netscape 6.2.3 is too
old for this, but Netscape 7.0PR1 will suffice):
- Display some HTML page in the browser,
- right-click your mouse,
- select "Page Info" from the context menu, and
- look at the line named "Render mode" there.
You will find "Quirks mode" for most pages of the Web, but
"Standards compliance mode" when you visit a page that claims to
be standards compliant by offering a correct DOCTYPE - you can
take my mod_gzip documentation pages at
http://www.schroepl.net/projekte/mod_gzip/
as an example of standards compliant pages. ;-)
So if you send 2048 extra bytes at the start of the document, you
cannot send a DOCTYPE there, and browsers like Mozilla and M$IE
6.0 will interpret _any_ number of HTML tags in _any_ number of
undisclosed ways, because the DOCTYPE decides whether their fancy
HTML error autocorrection code is turned on or off. This is a
sophisticated guessing game, but not HTML.
So you would be forced to violate HTML standards to ensure that
your page will be rendered the same way in all requests - which
is a contradiction in itself, because HTML standards are meant
to _ensure_ that a browser can know exactly how to
render a page.
Regards, Michael