Date: Thu, 28 Jul 2011 17:18:12 +0200
From: Joerg Sonnenberger
To: tech@mdocml.bsd.lv
Subject: Re: Need hash: uthash?

On Thu, Jul 28, 2011 at 05:12:27PM +0200, Kristaps Dzonsons wrote:
> On 28/07/2011 17:04, Joerg Sonnenberger wrote:
> >On Thu, Jul 28, 2011 at 12:32:32PM +0200, Kristaps Dzonsons wrote:
> >>mandoc is getting a `tr' implementation*, needed primarily for
> >>perlpod.  This is expensive as it involves iterating over each
> >>character in each text string, then each element in an array of `tr'
> >>characters (or escape sequences).  Expect it in the next few commits
> >>(now it's in polish phase).
> >
> >Shouldn't this use a simple byte lookup table for the hot path?
> >Most of the tr processing applies to non-special sequences and unicode
> >or other \-literals are rare.
>
> Joerg,
>
> On the contrary.
> perlpod (followed by GNU) is the main offender and
> makes significant use of escape-translation.  Yes, I could
> special-case \(*W, but really would rather not.

I think you misunderstand me. My point is that the really critical path
for .tr processing is handling non-backslashed sequences, i.e. normal
text. For that, building a byte lookup table drops the performance
penalty to a few percent at most. The rest is relatively expensive
anyway, but it should be only a very small part of the input. So the
O(nm) cost drops down to O(normal text + m * backslash-literals).

Joerg
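
For illustration, the byte lookup table idea could look roughly like the
sketch below. This is not mandoc code: every name in it (tr_table, tr_esc,
tr_init, esc_len, tr_apply) is made up for the example, escape parsing is
reduced to "\(xx" and "\x", and each translation target is assumed to be a
single byte. Ordinary bytes cost one array lookup; only backslash
sequences fall through to the linear scan over the m escape entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none of this is mandoc's API. */
struct tr_esc {
	const char	*from;	/* escape sequence, e.g. "\\(*W" */
	char		 to;	/* single replacement byte (assumption) */
};

struct tr_table {
	unsigned char		 map[256];	/* fast path: byte -> byte */
	const struct tr_esc	*esc;		/* slow path: escape list */
	size_t			 nesc;
};

/* Identity mapping and no escape translations. */
static void
tr_init(struct tr_table *t)
{
	int	 i;

	for (i = 0; i < 256; i++)
		t->map[i] = (unsigned char)i;
	t->esc = NULL;
	t->nesc = 0;
}

/* Length of the escape starting at the backslash p (simplified). */
static size_t
esc_len(const char *p)
{
	size_t	 want, n;

	if (p[1] == '\0')
		return 1;
	want = (p[1] == '(') ? 4 : 2;	/* "\(xx" or "\x" */
	for (n = 2; n < want && p[n] != '\0'; n++)
		continue;
	return n;
}

/*
 * Translate in to out; out needs strlen(in) + 1 bytes.
 * Returns the number of bytes written, excluding the terminator.
 */
static size_t
tr_apply(const struct tr_table *t, const char *in, char *out)
{
	size_t	 i, len, o = 0;

	while (*in != '\0') {
		if (*in != '\\') {
			/* Hot path: one table lookup per ordinary byte. */
			out[o++] = (char)t->map[(unsigned char)*in++];
			continue;
		}
		/* Rare path: linear scan over the escape list. */
		len = esc_len(in);
		for (i = 0; i < t->nesc; i++)
			if (strlen(t->esc[i].from) == len &&
			    strncmp(t->esc[i].from, in, len) == 0)
				break;
		if (i < t->nesc) {
			out[o++] = t->esc[i].to;
		} else {
			/* No translation registered: copy it verbatim. */
			memcpy(out + o, in, len);
			o += len;
		}
		in += len;
	}
	out[o] = '\0';
	return o;
}
```

A caller would run tr_init() once, set t.map['a'] = 'b' for a request
like ".tr ab", point t.esc at the escape entries, and then call
tr_apply() on each text line. A real roff parser has many more escape
forms than esc_len() recognises, so that part is only a placeholder.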