From: Ingo Schwarze <schwarze@usta.de>
To: tech@mdocml.bsd.lv
Subject: Re: roff.c question
Date: Fri, 10 Dec 2010 21:45:13 +0100
Message-ID: <20101210204513.GB18607@iris.usta.de>
In-Reply-To: <4D01F5A4.8010300@bsd.lv>

Hi Kristaps,

Kristaps Dzonsons wrote on Fri, Dec 10, 2010 at 10:40:52AM +0100:
>>> Ingo wrote:

>>>> 1) roff_res() should not expand \\* or \\\\*,
>>>>    but it should expand \* and \\\*.

> I worry a bit about performance, as this is an
> often-called function:

Yes, the function is called L+E times,
where L is the number of input lines and E is the number
of expanded strings and user-defined roff macro invocations.

But my algorithm is still O(N), where N is the number of
characters on the line, as long as no expansion is done.
So, if C is the total number of characters in the input file,
and e is the mean number of expansions per line,
then the total order of the search algorithm (without
doing the actual expansions) is O(C*(1+e)).

The work to do for the expansions is O(C*e), but that
didn't change.

All that is still nearly linear, which is hardly a bad order.

> why the funny business looking for three
> non-nils then the escaped asterisk?
> 
> Seems (in pseudo-C)
> 
>   for (cp = buf + pos; cpp = strstr(cp, "\\*"); cp++)
>      if ('\0' == cpp[2]) continue;
>      ...
> 
> would be much clearer and faster.  No?

No.

Faster, no.  That's O(C*(1+e)) as well.
So we are talking about constant factors here.
Maybe you are saving two char comparisons
per loop cycle.  Assuming e=1, each of the 100k characters
of a 100k input file is scanned about twice, so that will
buy you perhaps 400k char comparisons.
That might be about a millisecond on a very old PC.

Clearer, no.  It's incorrect.  You cannot use strstr().
The number of backslashes before "\\*" is relevant.
It must be even, or you don't want to substitute.

If we want to go into micro-optimization, we could do this:

	for (cp = *bufp + pos; *cp; cp++) {
		if ('\\' != *cp)
			continue;
		stesc = cp;
		if ('\0' == *(++cp))
			break;
		if ('*' != *cp)
			continue;
		if ('\0' == *(++cp))
			break;
		switch (*cp) {

deferring the tests until they are really needed (untested).
That's fewer tests because most of the time, the
continue branches will be taken.

But is that really better?  It is more code, and harder
to understand.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

