From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id C3233BB84 for ; Mon, 8 Sep 2008 02:10:53 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwBAPsJxEhKfU4ZlWdsb2JhbACCMY9UPgEBAQEJAwoHD5kahT0BAg X-IronPort-AV: E=Sophos;i="4.32,354,1217800800"; d="scan'208";a="28910953" Received: from ey-out-2122.google.com ([74.125.78.25]) by mail4-smtp-sop.national.inria.fr with ESMTP; 08 Sep 2008 02:10:53 +0200 Received: by ey-out-2122.google.com with SMTP id 6so523596eyi.15 for ; Sun, 07 Sep 2008 17:10:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:received:received:organization:to:subject:date :user-agent:cc:references:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:message-id:from; bh=hl/C1oxidKTnexFiJF4jzfs8UVdkJqx9MIlwag0uKkE=; b=plAfaTFYd+Mm767nOqAozK08BkLq9vCkzd+/2TL3WP29IAR0ZXBy5AcSO7LXeeD17c 56IPqScm83dfk+1M8wTFF+1q5E9fowDGGyWdI90kHxS54W/8rrCcxO+jkILBVFp3MpUp yLfdWD9agsfeh6n4QYJkADqTuYcUtZG/t4Fmo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=organization:to:subject:date:user-agent:cc:references:in-reply-to :mime-version:content-type:content-transfer-encoding :content-disposition:message-id:from; b=pDZ3Rg6dNvm3Z6CBY04IGSgMoJtaW5yFJTQXZEuzG2ZmEP9h/iDxhJhA2N7Hn9kuG8 AoVK4RKnaAf4c8cCEfsJaXIt5lT7qtq7iDsvvNKz1KWKqnuTVqDdrIvFfueT4WJO/ETZ 2+9EmVRQzttM5XJllQchvSTbOXSrpbfncN8mU= Received: by 10.210.65.2 with SMTP id n2mr5580046eba.111.1220832652608; Sun, 07 Sep 2008 17:10:52 -0700 (PDT) Received: from leper.local ( [86.140.183.215]) by mx.google.com with ESMTPS id j8sm6496326gvb.1.2008.09.07.17.10.51 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sun, 07 Sep 2008 17:10:52 -0700 (PDT) Organization: Flying Frog Consultancy Ltd. To: "Nathaniel Gray" Subject: Re: [Caml-list] New Ocaml Plug-in for NetBeans Date: Mon, 8 Sep 2008 02:10:56 +0100 User-Agent: KMail/1.9.9 Cc: "Maxence Guesdon" , caml-list@yquem.inria.fr References: <1217061979.488ae45b452b9@webmail.inescporto.pt> <200808201732.07275.jon@ffconsultancy.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200809080210.57000.jon@ffconsultancy.com> From: Jon Harrop X-Spam: no; 0.00; ocaml:01 plug-in:98 frog:98 token:01 wrote:01 caml-list:01 string:02 optimizing:03 gzip:04 gray:90 file:11 file:11 ltd:87 source:12 writing:12 On Monday 08 September 2008 00:31:47 Nathaniel Gray wrote: > In fact, gzip does a pretty fine job of optimizing .annot files. The > source file this .annot came from is 15K, the .annot file is 78K, and > gzipping it gets it down to 9K. Tokenizing that .annot format by simply mapping each string onto a unique token and writing out the mapping first will shrink such files enormously whilst still being easily dissected from any language. Moreover, the benefit is super-linear... -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e