caml-list - the Caml user's mailing list
* integration of compression with channels
From: Eric Cooper @ 2008-11-04 12:50 UTC
  To: caml-list

I was interested to see Zack's work on integrating gzip and bzip2 with
I/O channels:
    http://upsilon.cc/~zack/blog/posts/2008/11/ocaml_batteries_gzip/

I initially tried something like this in the approx proxy server, but
found out the hard way that it was difficult to deal with corrupt .gz
files.  You might only discover the corruption after reading garbage
for a while, and an exception at that point would be unexpected.

Eventually I switched to spawning a "gunzip" process that writes to a
temporary file, and then reading that file.  In addition to detecting
corruption early, this was significantly faster than CamlZip.
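
For illustration, the gunzip-to-temporary-file approach might look
roughly like the sketch below.  It is not the code used in approx; the
names gunzip_to_temp and with_gunzipped are hypothetical, and it
assumes a "gunzip" binary on the PATH, the standard Unix library, and
OCaml 4.08 or later (for Fun.protect).

```ocaml
(* Requires the unix library, e.g. ocamlfind ocamlopt -package unix. *)

(* Hypothetical helper: decompress [src] into a fresh temporary file by
   running "gunzip -c".  Returns [Ok path] only if gunzip exited with
   status 0, so a corrupt archive is rejected before any data is read. *)
let gunzip_to_temp (src : string) : (string, string) result =
  let tmp = Filename.temp_file "gunzip" ".out" in
  let out = Unix.openfile tmp [Unix.O_WRONLY; Unix.O_TRUNC] 0o600 in
  let status =
    Fun.protect
      ~finally:(fun () -> Unix.close out)
      (fun () ->
        (* The child's stdout is redirected into the temporary file. *)
        let pid =
          Unix.create_process "gunzip" [| "gunzip"; "-c"; src |]
            Unix.stdin out Unix.stderr
        in
        snd (Unix.waitpid [] pid))
  in
  match status with
  | Unix.WEXITED 0 -> Ok tmp
  | Unix.WEXITED n ->
      Sys.remove tmp;
      Error (Printf.sprintf "gunzip exited with status %d" n)
  | Unix.WSIGNALED n | Unix.WSTOPPED n ->
      Sys.remove tmp;
      Error (Printf.sprintf "gunzip killed or stopped by signal %d" n)

(* Hypothetical wrapper: run [f] on an ordinary input channel over the
   decompressed data, then close the channel and delete the file. *)
let with_gunzipped (src : string) (f : in_channel -> 'a) : ('a, string) result =
  match gunzip_to_temp src with
  | Error msg -> Error msg
  | Ok tmp ->
      let ic = open_in_bin tmp in
      Fun.protect
        ~finally:(fun () -> close_in_noerr ic; Sys.remove tmp)
        (fun () -> Ok (f ic))
```

With this shape, a call such as with_gunzipped "Packages.gz" input_line
(the file name is only an example) either fails up front with an Error
or hands the reader a plain channel that cannot raise a decompression
error midway through.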

I suppose one could argue that you can get an I/O error even from
reading an uncompressed file (bad disk block, or whatever), and that
a robust program should be equally prepared to deal with that.
But I think there's a real difference in practice.

The integrated approach is definitely more elegant, and perhaps the
performance will be competitive someday.  So I'd be interested
if anyone has a better way of handling potentially corrupt files.

-- 
Eric Cooper             e c c @ c m u . e d u

