Re: [Mod_gzip] Is there a test-suite which exercises gzip compression
in various servers
mod_gzip@lists.over.net
mod_gzip@lists.over.net
Mon, 22 Mar 2004 15:38:45 +0100
This is a multipart message in MIME format.
--=_alternative 00507882C1256E5F_=
Content-Type: text/plain; charset="US-ASCII"
Hi Yusuf,
> Not really a mod_gzip question but I would like to know if there is a
> test-suite which excerises gzip compression in various http servers.
I don't quite understand what you are looking for.
My problem is that "exercise gzip compression" is way too underspecified
to be assigned a boolean value.
gzip itself is an algorithm that can take a number of parameters, such as
compression quality (level 1..9). So you cannot just compare two gzip
compressed files and say that if one is "correct" the other must be "broken".
gzip may produce any number of different correct results - it just depends
on how cleverly the compression algorithm finds repeated strings and builds
the codes that the decompression algorithm uses to decode the content.
There can be any number of gzip implementations - all they need to
produce is "compatible" output so that the gzip decoding algorithm can
uncompress the data.
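A minimal Python sketch of this point, using the standard gzip module (the sample payload is made up for illustration): the same input compressed at level 1 and at level 9 can yield different byte streams, and both decode to the identical original.

```python
import gzip

# Made-up sample payload; any bytes would do.
payload = b"<html><body>" + b"hello gzip " * 200 + b"</body></html>"

# Compress the same input at the fastest and at the strongest level.
# mtime=0 fixes the timestamp in the gzip header so runs are repeatable.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=0)

# The two compressed streams may differ in length and content ...
print(len(payload), len(fast), len(best), fast == best)

# ... yet both decode to exactly the original bytes.
assert gzip.decompress(fast) == payload
assert gzip.decompress(best) == payload
```

So a byte-for-byte comparison of compressed output says nothing about correctness; only the decoded result can be compared.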
mod_gzip doesn't support configuring these parameters, as the quality
level is hard-coded in the source. But mod_deflate provides configuration
directives for several of them, so different Apache 2.x servers using
mod_deflate with different configuration files may actually serve
differently compressed content, all of which might still be "correct".
So you cannot know beforehand what kind of result Apache 2 with
mod_deflate would produce unless you know its configuration.
And how should a "test suite" be able to know?
> However, when I ask for the test page via wget
> wget --header="Accept-Encoding: gzip" URL -O /tmp/foo.gz
> foo.gz when gunzip'ped checksums to the correct size of the original
> file.
And what about the content of the HTTP response being sent to a browser?
Any difference?
What number of bytes does the access log claim to have sent?
Could you provide the complete set of HTTP response headers?
You are still not sure whether you are hunting a bug in Cherokee or in
the browsers.
I have sent an HTTP request to the Cherokee URL you named, and this site
serves gzipped content, yet doesn't provide a "Content-Length" header,
and possibly lacks other things as well.
To make a serious comparison I'd suggest you post some real URL of a
simple HTML page (not referencing any other URL) on your Cherokee site.
I'd then use
http://www.schroepl.net/cgi-bin/http_trace.pl
to request the page from your site, save the (uncompressed) file on my
machine, let it be served by mod_gzip, and then compare the results
(Content-Length and other HTTP headers).
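The comparison described above could also be scripted. The following is only a sketch in Python, not the http_trace.pl tool; the helper names fetch_raw and summarize are hypothetical, and the URL has to be supplied by whoever runs it.

```python
import gzip
import urllib.request


def fetch_raw(url):
    """Request url with Accept-Encoding: gzip; return (headers, raw body)."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as resp:
        # urllib does not decompress, so the body is what the server sent.
        # dict() keeps only one value per header name, enough for this check.
        return dict(resp.headers.items()), resp.read()


def summarize(headers, body):
    """Collect the facts worth comparing between two servers."""
    h = {k.lower(): v for k, v in headers.items()}
    encoding = h.get("content-encoding")
    declared = h.get("content-length")
    report = {
        "content_encoding": encoding,
        "transfer_encoding": h.get("transfer-encoding"),
        "declared_length": (
            int(declared) if declared and declared.strip().isdigit() else None
        ),
        "bytes_received": len(body),
        "decoded_length": None,
    }
    if encoding == "gzip":
        # Raises an error if the body is not valid gzip data.
        report["decoded_length"] = len(gzip.decompress(body))
    return report
```

Running summarize(*fetch_raw(url)) against the same page on two servers gives two small dictionaries to compare: the decoded lengths should match, while the compressed lengths and the headers are where the servers may legitimately or illegitimately differ.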
The gzipping algorithm is likely to be taken from the zlib library (you
might ask the Cherokee developer about this, he should know) - not many
people would ever try to reimplement this algorithm (Kevin Kiley being
one of these, so mod_gzip indeed has its own gzipping code).
Therefore I doubt your server sends differently compressed (!) responses
to your different 'browsers', but it might well send HTTP response headers
that your browsers don't like. Having a look at these HTTP response
headers might be key to finding the source of the effect that you
experience.
> The Cherokee author is looking into this but it's strange that wget
> gets the file correctly whereas both IE/Firefox don't.
I believe the browser gets the file (!) as correctly as your wget. But
then the browser interprets the HTTP headers while your wget ignores these.
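To make that difference concrete, here is a small Python sketch of a client that, like a browser, lets the headers decide how to treat the body. The function browser_style_decode and the faulty-server scenario are hypothetical illustrations of one possible failure mode, not a diagnosis of Cherokee.

```python
import gzip


def browser_style_decode(headers, body):
    """Decode the body only if the headers say it is gzip-encoded."""
    h = {k.lower(): v.strip().lower() for k, v in headers.items()}
    if h.get("content-encoding") in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    return body  # no encoding announced: treated as plain content


page = b"<html><body>test page</body></html>"
wire = gzip.compress(page)

# Headers and body agree: the page comes out intact.
ok = browser_style_decode({"Content-Encoding": "gzip"}, wire)

# Hypothetical faulty server: gzip body, but no Content-Encoding header.
# A header-driven client hands the compressed bytes on unchanged.
garbled = browser_style_decode({"Content-Type": "text/html"}, wire)

# Saving the bytes and gunzipping by hand (the wget workflow) works
# in both cases, because it never looks at the headers.
by_hand = gzip.decompress(wire)
```

In this scenario the bytes on the wire are identical in both cases; only the headers decide whether the header-driven client ends up with the page or with compressed data.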
> A test-suite might make it easier to validate various http server
> for their gzip serving
I don't think the gzipping itself is the problem (see above).
I rather suspect this server (or browser!) of not complying with HTTP in
some aspect or another, like sending broken HTTP headers or not sending
headers that the browsers require.
Regards, Michael
--=_alternative 00507882C1256E5F_=--