[Mod_gzip] mod_gzip + mod_proxy in forward proxy mode
mod_gzip@lists.over.net
Mon, 15 Dec 2003 17:48:42 EST
Hi Jason...
This is Kevin Kiley...
There are only 2 real possibilities here...
1. You have uncovered a bug in mod_gzip 2.0.40
2. You have uncovered a 'filtering' bug in Apache 2.0.48
mod_gzip 2.0.40 is numbered that way because it was written
and tested against Apache 2.x release 2.0.40 (some time back).
It was tested with files of all sizes, from small to HUGE, and there
was never a problem such as the one you are now describing.
Once it started compressing... EVERYTHING got compressed.
There have been lots of bugs uncovered in the Apache 2.x 'filtering'
scheme since then, and more are being found every day, so
it's quite possible that something is simply going haywire with
the Apache 2.x filtering scheme. mod_proxy for 2.x has also
undergone many revisions since then, because it was hardly working
at all with Apache 2.0.40.
Actually... according to your description my best guess would be
that this is some bug in the Apache 2.0 filtering.
If the compression seems to START OK and the beginning of
the output contains valid GZIP signature bytes then the only reason
that 'uncompressed' data should come after that is if, somehow,
the output filter is being BYPASSED by Apache for some
subsequent 'output' and the filter never 'sees' it or gets a chance
to add it to the compressed stream.
Once the compression starts... ANY output data seen by the filter
should also be compressed and added to the output stream. The only
reason it wouldn't be, methinks, is if the filter is not actually
getting the data at all and it goes right out the door uncompressed.
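If you want to confirm that from the saved file rather than by eyeballing
a hex dump, here is a minimal Python sketch. It is not part of mod_gzip or
Apache; the function name inspect_gzip_body and the 64-byte chunk size are
just illustrative choices. It checks for the 1F 8B signature and then feeds
the body to zlib in small chunks, so you can see roughly how much data
decodes before the stream goes bad.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip stream
CHUNK = 64                  # illustrative chunk size, kept small on purpose


def inspect_gzip_body(body: bytes) -> dict:
    """Report how much of `body` decodes as a single gzip stream."""
    report = {
        "has_magic": body[:2] == GZIP_MAGIC,
        "complete": False,     # True only if the gzip trailer was reached
        "decoded_bytes": 0,    # plain bytes recovered before stopping
        "error": None,         # zlib error text, if decoding failed
    }
    # wbits=31 makes zlib expect a gzip header and trailer.
    decoder = zlib.decompressobj(31)
    for offset in range(0, len(body), CHUNK):
        try:
            out = decoder.decompress(body[offset:offset + CHUNK])
        except zlib.error as exc:
            report["error"] = str(exc)
            break
        report["decoded_bytes"] += len(out)
        if decoder.eof:
            report["complete"] = True
            break
    return report
```

To use it, read the file wget saved in binary mode and pass the bytes in.
If the body really does switch to plain HTML partway through, you would
expect has_magic to be true and complete to be false.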
Here is an easy way to take the 'mod_gzip' component out of
the picture and see if it's a deep-level Apache bug...
Just use Apache's own 'mod_deflate' filter instead of mod_gzip.
If you are seeing the same problem using 'mod_deflate' (which
comes with Apache 2.0) then it's a bug in Apache itself.
If everything is OK using mod_deflate then that must mean
the mod_gzip 2.0.40 code is now outdated and needs some kind
of patch to keep up with the changes being made to Apache 2.x.
Repeat the tests using mod_deflate + mod_proxy and let us
know if you get the same results.
If you have Apache 2.0.48 then you ALREADY HAVE mod_deflate.
Apache 2.0.48 ships with it.
Later...
Kevin
PS: It's also quite possible that mod_proxy is the culprit here,
but mod_deflate can help prove that as well, so give mod_deflate a try.
In a message dated 12/15/2003 1:51:48 PM Central Standard Time,
jindor@yahoo.com writes:
> Hello all,
>
> Before I post this message I went through the past archive to see some similar
> threads... but there seemed to be no discussion on the problem I have.
>
> I'm using apache 2.0.48, mod_gzip 2.0.40. What I aim to do is to use mod_gzip,
> mod_proxy (in forward proxy mode) to compress some of the HTTP objects like
> html, doc files. No caching is involved. So the Apache would be like a general
> forward proxy server with compression feature for particular objects.
>
> I confirmed that mod_gzip can compress the objects from mod_proxy too. But it
> was not 100% successful. It could compress only *small* HTML pages (less than,
> approximately 1K) successfully. For the bigger HTML, however, the HTML body was
> not correctly encoded, even though the response has "Content-Encoding: gzip"
> header.
>
> By "not correctly" I mean that the HTML body does begin with gzip signature (1F
> 8B ...), but in the body HTML tags are clearly visible! It was not encoded. Of
> course browser can't analyze it, failing to display it on the screen. It is
> definitely working, but something is wrong.
>
> Since with IE I cannot see what's going on between browser and proxy, I used
> wget to see the response. (http://www.xyz.com/end.html is of 422 bytes).
>
> wget http://www.xyz.com/end.html -S \
>   --user-agent="Mozilla/4.0 (Compatible; MSIE 6.0; Windows NT 5.0)" \
>   --header="Accept-Encoding: gzip, deflate" -O dnfile
>
> Connecting to 192.168.2.20:8880... connected! <--- I run proxy at port 192.168.2.20:8880
> Proxy request sent, awaiting response... 200 OK
> 2 Date: Mon, 15 Dec 2003 19:20:43 GMT
> 3 Server: Apache/2.0.47 (Unix) PHP/4.3.4 mod_ssl/2.0.47 OpenSSL/0.9.7a
> 4 Last-Modified: Mon, 18 Feb 2002 15:49:48 GMT
> 5 ETag: "20f69-1a6-4e58df00"
> 6 Accept-Ranges: bytes
> 7 Content-Length: 305
> 8 Content-Type: text/html; charset=ISO-8859-1
> 9 Via: 1.0 199.26.172.28
> 10 Content-Encoding: gzip
> 11 Vary: Accept-Encoding
> 12 Connection: close
> 13
>
>
> Everything looks OK. The downloaded file gunzips cleanly, and IE could show
> the content too.
>
> But if I requested http://www.xyz.com/index.html which is of 2690 bytes, IE
> can't display it, and wget shows something unexpected.
>
> ################################
>
> wget http://199.26.172.28/index.html -S \
>   --user-agent="Mozilla/4.0 (Compatible; MSIE 6.0; Windows NT 5.0)" \
>   --header="Accept-Encoding: gzip, deflate" -O dnfile
>
> --14:26:51-- http://199.26.172.28:80/index.html
> => `dnfile'
> Connecting to 192.168.2.20:8880... connected!
> Proxy request sent, awaiting response... 200 OK
> 2 Date: Mon, 15 Dec 2003 19:23:48 GMT
> 3 Server: Apache/2.0.47 (Unix) PHP/4.3.4 mod_ssl/2.0.47 OpenSSL/0.9.7a
> 4 Last-Modified: Thu, 04 Sep 2003 22:31:32 GMT
> 5 ETag: "20f6a-a82-8bb6d900"
> 6 Accept-Ranges: bytes
> 7 Content-Length: 2690
> 8 Content-Type: text/html; charset=ISO-8859-1
> 9 Via: 1.0 199.26.172.28
> 10 Content-Encoding: gzip
> 11 Vary: Accept-Encoding
> 12 Connection: close
> 13
>
> 0K -> . [ 48%]
>
> 14:26:51 (1.25 MB/s) - Connection closed at byte 1312. Giving up.
>
> ################################
>
>
> And the HTML body is weird:
>
> Have a look -> http://209.178.198.123/dmp.bmp
>
> Following is the relevant portion of httpd.conf:
>
> ##############################################
> # LoadModule foo_module modules/mod_foo.so
> #
> LoadModule proxy_module modules/mod_proxy.so
> LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
> LoadModule proxy_http_module modules/mod_proxy_http.so
> LoadModule proxy_connect_module modules/mod_proxy_connect.so
> LoadModule gzip_module modules/mod_gzip.so
>
> ProxyRequests On
> ProxyVia On
>
> <Proxy *>
>
> Order allow,deny
> Allow from all
>
> </Proxy>
>
> mod_gzip_on Yes
> mod_gzip_can_negotiate Yes
> mod_gzip_add_header_count Yes
> mod_gzip_minimum_file_size 100
> mod_gzip_maximum_file_size 1000000
> mod_gzip_keep_workfiles No
> mod_gzip_maximum_inmem_size 100000
> mod_gzip_dechunk Yes
> mod_gzip_min_http 1000
>
> mod_gzip_item_include file \.htm$
> mod_gzip_item_include mime text/.*
> mod_gzip_item_exclude mime ^image/.*
> mod_gzip_temp_dir "/tmp"
> mod_gzip_item_include handler proxy-server
> mod_gzip_command_version mod_gzip_version
> CustomLog logs/gzip.log mod_gzip_info2
>
> ##############
>
>
> Would anyone please tell me what (I've done) is wrong? Any kind of clue would
> be greatly appreciated.
>
> Thanks.
>
> Jason J Kim
>