Reply: Re: [Mod_gzip] Gzip encoding, IE 5 & 6, proxy caches, NetCache

mod_gzip@lists.over.net mod_gzip@lists.over.net
Fri, 4 Oct 2002 20:11:33 +0200


Hi Nigel,


 > There may be a solution to the problem. When using Gzip + SSL I
 > know some people precede the document with 2048 space characters
 > (these compress easily) ... it's not elegant but it is reported to work.

 I don't think this is still a valid solution today, although it
 was one about 12 or more months ago.

 First of all, I would suggest letting the document start with the
 same 2041 bytes it would normally start with, but enclosed in HTML
 comment delimiters <!-- ... -->, which add 7 bytes for a total of
 2048.
 This way, the gzip algorithm compresses this content the first time
 it sees it and can simply reference the identical string the second
 time. I would bet this is a candidate for the best compression
 ratio you can get in this case.
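
 A minimal sketch of how one might measure this, assuming Python's
 zlib module (same DEFLATE algorithm as gzip, different wrapper) and
 HTML comment delimiters <!-- and -->. The sample page and the names
 comment_prefix and deflate_size are made up for illustration; the
 script only prints the measured overhead of each padding variant
 and makes no claim about which one wins.

```python
import zlib

TOTAL = 2048                    # padding size quoted in this thread
OPEN, CLOSE = b"<!--", b"-->"   # HTML comment delimiters, 7 bytes together


def comment_prefix(document: bytes, total: int = TOTAL) -> bytes:
    """Wrap the document's own first bytes in an HTML comment of `total` bytes."""
    body = document[: total - len(OPEN) - len(CLOSE)]
    # "--" is not allowed inside an HTML comment, so this sketch refuses it.
    if b"--" in body:
        raise ValueError("document start contains '--'")
    return OPEN + body + CLOSE


def deflate_size(data: bytes) -> int:
    """Compressed size in bytes at zlib level 6."""
    return len(zlib.compress(data, 6))


# Hypothetical sample page, long enough to fill the whole prefix.
document = (
    b"<html><body>\n"
    + b"".join(b"<p>Paragraph %d: value %d</p>\n" % (i, i * i) for i in range(300))
    + b"</body></html>\n"
)

prefix = comment_prefix(document)
base = deflate_size(document)
overhead_spaces = deflate_size(b" " * TOTAL + document) - base
overhead_comment = deflate_size(prefix + document) - base

print("unpadded document:", base, "bytes compressed")
print("2048 spaces add:", overhead_spaces, "bytes")
print("comment prefix adds:", overhead_comment, "bytes")
```

 Both variants add 2048 bytes to the uncompressed response; the
 printed numbers show how little of that survives compression for
 this particular sample page.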

 You could even write an Apache handler to generate such a 2048-byte
 "prefix" automatically, so you wouldn't have to do this manually.
 But anyway, this won't work with modern browsers like M$IE 6.0.
 The reason is that you change the _semantics_ of the document.

 A _valid_ HTML document requires a DOCTYPE declaration as the
 first line of the document.
 Modern browsers like M$IE 6.0 and Mozilla now check for this line
 and make their interpretation of the content _depend_ upon finding
 a valid DOCTYPE declaration there:
 - If they find one, they will display the document according to the
   HTML standard;
 - if they don't find one, they will tolerate many more errors and
   "interpret" the document content.
 This is called "standards compliance mode" vs. "quirks mode" in
 Mozilla - I don't know whether there are corresponding names in
 the M$IE universe.
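
 The rule as stated above can be sketched as follows. This is only
 a toy model of "DOCTYPE on the first line", not the actual sniffing
 code of any browser, which also looks at which DOCTYPE is given and
 differs between vendors. The function names and the sample page are
 made up for illustration.

```python
def first_line_is_doctype(document: str) -> bool:
    """True if the very first line begins with a DOCTYPE declaration."""
    first_line = document.split("\n", 1)[0]
    return first_line.upper().startswith("<!DOCTYPE")


def assumed_render_mode(document: str) -> str:
    """Map the first-line check to the mode names Mozilla shows in Page Info."""
    if first_line_is_doctype(document):
        return "standards compliance mode"
    return "quirks mode"


# Hypothetical page with an HTML 4.01 Strict DOCTYPE on its first line.
page = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<html><head><title>t</title></head><body><p>Hi</p></body></html>\n"
)
# The same page with the 2048-space padding in front of it.
padded = " " * 2048 + page

print(assumed_render_mode(page))    # standards compliance mode
print(assumed_render_mode(padded))  # quirks mode
```

 Under this strict reading, any padding in front of the DOCTYPE
 pushes the page out of the first category.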

 You can check this yourself, using Mozilla (Netscape 6.2.3 is too
 old for this, but Netscape 7.0PR1 will suffice):
 - Display some HTML page in the browser,
 - right-click your mouse,
 - select "Page Info" from the context menu, and
 - look at the line named "Render mode" there.
 You will find "Quirks mode" for most pages on the Web, but
 "Standards compliance mode" when you visit a page that claims to
 be standards compliant by offering a correct DOCTYPE - you can
 take my mod_gzip documentation pages at
     http://www.schroepl.net/projekte/mod_gzip/
 as an example of standards-compliant pages. ;-)

 So if you send 2048 padding bytes at the start of the document,
 you cannot send a DOCTYPE there, and browsers like Mozilla and
 M$IE 6.0 may interpret an unknown number of HTML tags in unknown,
 differing ways, because the DOCTYPE decides whether their fancy
 HTML error autocorrection code is turned on or off. This is a
 sophisticated guessing game, not HTML.

 So you would be forced to violate HTML standards to ensure that
 your page is rendered the same way for all requests - which is a
 contradiction in itself, because HTML standards are meant to
 _ensure_ that a browser knows exactly how to render a page.

Regards, Michael