Date: Sat, 11 Dec 2010 00:12:42 +0100
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Subject: Re: roff.c question
Message-ID: <20101210231242.GD18607@iris.usta.de>
References: <4CF77A2B.6020702@bsd.lv> <4CF79F45.6080105@bsd.lv>
 <20101202225019.GD12188@iris.usta.de> <20101203214929.GB28384@iris.usta.de>
 <4CFBAC80.1060004@bsd.lv> <20101208010527.GA25360@iris.usta.de>
 <4D01F5A4.8010300@bsd.lv> <20101210204513.GB18607@iris.usta.de>
 <20101210205203.GA30244@britannica.bec.de>
 <20101210211020.GC18607@iris.usta.de>
In-Reply-To: <20101210211020.GC18607@iris.usta.de>

Hi,

> i'm tempted to do some measurements...

I took the ksh(1) manual, which is rather large, but a real-world
manual with few user-defined strings, concatenated it 20 times
to itself (only one header, of course), and added one .ds near
the beginning, such that roff_res() gets called.
The result is a bit above 100k lines, two and a half MB total size.

Timing for mandoc -Tlint -Wfatal test.1 on my notebook is:

  three checks up front:  1.40s
  delayed checks:         1.39s
  with strchr:            1.39s

calling roff_res N times instead of once:

  N=0:                    1.37s
  N=100, strchr:          3.25s
  N=100, up front:        4.45s

So, time spent in roff_res() is about 2.2 percent of the total
parsing time, and we can save nearly 40 percent of these 2.2 percent.
Thus, on a typical mdoc(7) manual, we economize between 0.8 and 0.9
percent parsing time with these optimizations.  Rendering time is
about 0.55s, so we save about 0.6 percent total time.  Given that the
rendering time for the ksh(1) manual is 100 milliseconds on my
notebook, we can save 600 microseconds in absolute numbers, on
a large manual.

Thus, here is a version using these optimizations, which fortunately
does not make the code more complicated.

I diffed against OpenBSD, because that makes the diff easier
to understand.

Yours,
  Ingo


Index: roff.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
retrieving revision 1.23
diff -u -p -r1.23 roff.c
--- roff.c	9 Dec 2010 20:56:30 -0000	1.23
+++ roff.c	10 Dec 2010 22:58:19 -0000
@@ -345,18 +345,11 @@ roff_res(struct roff *r, char **bufp, si
 	size_t		 nsz;
 	char		*n;
 
-	/* String escape sequences have at least three characters. */
+	/* Search for a leading backslash and save a pointer to it. */
 
-	for (cp = *bufp + pos; cp[0] && cp[1] && cp[2]; cp++) {
-
-		/*
-		 * The first character must be a backslash.
-		 * Save a pointer to it.
-		 */
-
-		if ('\\' != *cp)
-			continue;
-		stesc = cp;
+	cp = *bufp + pos;
+	while (NULL != (cp = strchr(cp, '\\'))) {
+		stesc = cp++;
 
 		/*
 		 * The second character must be an asterisk.
@@ -364,7 +357,9 @@ roff_res(struct roff *r, char **bufp, si
 		 * so it can't start another escape sequence.
 		 */
 
-		if ('*' != *(++cp))
+		if ('\0' == *cp)
+			return(1);
+		if ('*' != *cp++)
 			continue;
 
 		/*
@@ -373,7 +368,9 @@ roff_res(struct roff *r, char **bufp, si
 		 * Save a pointer to the name.
 		 */
 
-		switch (*(++cp)) {
+		switch (*cp) {
+		case ('\0'):
+			return(1);
 		case ('('):
 			cp++;
 			maxl = 2;
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
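
For readers who want to try the scanning idea from the patch outside
of mandoc, here is a minimal standalone sketch.  It is not code from
roff.c: count_str_escapes() is a hypothetical helper invented for this
illustration, and it only counts "\*" introducers instead of expanding
them.  It assumes a NUL-terminated buffer and follows the same pattern
as the patched loop: let strchr(3) jump to the next backslash, then
check for the end of the string before looking at each following
character.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc: count the "\*" string
 * escape introducers in a NUL-terminated buffer, using a
 * strchr-driven scan modelled on the patched loop.
 */
static size_t
count_str_escapes(const char *buf)
{
	const char	*cp;
	size_t		 n;

	n = 0;
	cp = buf;
	while (NULL != (cp = strchr(cp, '\\'))) {
		cp++;			/* Step past the backslash. */
		if ('\0' == *cp)
			break;		/* Trailing backslash. */

		/*
		 * Any second character other than an asterisk is
		 * consumed here, so it cannot start another escape.
		 */
		if ('*' != *cp++)
			continue;
		if ('\0' == *cp)
			break;		/* "\*" without a name. */
		n++;
	}
	return(n);
}
```

Because strchr(3) skips ordinary text in one call, the per-character
checks only run at backslashes, which is where the saving measured
above would be expected to come from.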