mailing list of musl libc
From: Rich Felker <dalias@aerifal.cx>
To: musl@lists.openwall.com
Subject: Re: multibyte performance findings
Date: Tue, 9 Apr 2013 01:54:36 -0400
Message-ID: <20130409055436.GU20323@brightrain.aerifal.cx>
In-Reply-To: <20130406060852.GH20323@brightrain.aerifal.cx>

On Sat, Apr 06, 2013 at 02:08:52AM -0400, Rich Felker wrote:
> On Sat, Apr 06, 2013 at 01:21:21AM -0400, Rich Felker wrote:
> > Hi all,
> > 
> > I've been examining performance in the multibyte conversion functions
> > (as part of the POSIX locale controversy), and have some interesting
> > findings so far:
> > [...]
> 
> And here's a diff of the proposed changes so far..
> 
> Rich

> diff --git a/src/multibyte/mbrtowc.c b/src/multibyte/mbrtowc.c
> index cc49781..d552652 100644
> --- a/src/multibyte/mbrtowc.c
> +++ b/src/multibyte/mbrtowc.c
> @@ -18,6 +18,7 @@ size_t mbrtowc(wchar_t *restrict wc, const char *restrict src, size_t n, mbstate
>  	const unsigned char *s = (const void *)src;
>  	const unsigned N = n;
>  
> +	if (!n) return -2;
>  	if (!st) st = (void *)&internal_state;
>  	c = *(unsigned *)st;
>  	
> @@ -27,9 +28,9 @@ size_t mbrtowc(wchar_t *restrict wc, const char *restrict src, size_t n, mbstate
>  		n = 1;
>  	} else if (!wc) wc = (void *)&wc;
>  
> -	if (!n) return -2;

This change turned out to be wrong (it's an invalid transformation
when s is null, since that case sets n to 1 and so must be handled
before n is tested) and I found a better improvement anyway, which
I've committed. The commit log message is actually rather interesting:

http://git.musl-libc.org/cgit/musl/commit/?id=a49e038bab7b3927b6a9c7d0c52f9e1a9cb82629

and I think this finding serves as a warning about writing 'clever'
code for special cases that "falls through" to the general code,
rather than just writing the special case code explicitly.
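
To make the null case concrete, here is a minimal sketch of just the
argument handling. It is a toy model, not musl's code: the function
names, the RESULT_INCOMPLETE constant and the ASCII-only "decoding" are
made up for illustration. It only shows that moving the n==0 test
ahead of the null-pointer fixup changes the result for a null source
with n==0.

```c
#include <stddef.h>

/* Hypothetical stand-in for the (size_t)-2 "incomplete" result. */
enum { RESULT_INCOMPLETE = -2 };

/* Original order: a null src is rewritten to ("", n = 1) first,
 * so the n == 0 test never fires for a null src. */
static int check_after_null_fixup(const char *src, size_t n)
{
	if (!src) { src = ""; n = 1; }
	if (!n) return RESULT_INCOMPLETE;
	return *src != 0; /* ASCII-only stand-in for real decoding */
}

/* Hoisted order, as in the rejected hunk: n is tested before the
 * null fixup, so (NULL, 0) is reported as incomplete. */
static int check_before_null_fixup(const char *src, size_t n)
{
	if (!n) return RESULT_INCOMPLETE;
	if (!src) { src = ""; n = 1; }
	return *src != 0;
}
```

The two functions agree whenever src is non-null; they differ only for
a null src with n==0, which is exactly the case the hoisted test gets
wrong.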

> +	/* This condition can only be true if *s<0x80 and c==0 */
> +	if (*s + c < 0x80) return !!(*wc = *s);
>  	if (!c) {
> -		if (*s < 0x80) return !!(*wc = *s);
>  		if (*s-SA > SB-SA) goto ilseq;
>  		c = bittab[*s++-SA]; n--;
>  	}

I omitted this for now too since the improvement seems difficult to
measure. In principle it should be better, so I may revisit this
later.
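
For anyone wanting to check the merged condition, a small sketch
follows. It assumes what the comment in the hunk asserts: that any
nonzero state value c is at least 0x80 and not so close to UINT_MAX
that adding a byte wraps around. The helper names are hypothetical, and
the state values to try are up to the caller; nothing here is taken
from musl's tables.

```c
#include <stdbool.h>

/* Two-step test as in the existing code: initial state, ASCII byte. */
static bool ascii_fast_path_split(unsigned char b, unsigned c)
{
	return c == 0 && b < 0x80;
}

/* Merged test from the proposed hunk: one add and one compare.
 * b promotes to int and the sum is computed as unsigned. */
static bool ascii_fast_path_merged(unsigned char b, unsigned c)
{
	return b + c < 0x80;
}

/* Count the byte values for which the two tests disagree under a
 * given state value c. Zero means they are interchangeable for c. */
static int count_mismatches(unsigned c)
{
	int bad = 0;
	for (unsigned b = 0; b < 256; b++) {
		unsigned char byte = (unsigned char)b;
		if (ascii_fast_path_split(byte, c) !=
		    ascii_fast_path_merged(byte, c))
			bad++;
	}
	return bad;
}
```

count_mismatches(0) and count_mismatches(0x80) both give zero, while a
small nonzero value such as 1 does not, so the merged form is only safe
if the state encoding never produces such values.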

Rich

