mailing list of musl libc
From: Isaac Dunham <idunham@lavabit.com>
To: musl@lists.openwall.com
Subject: Re: strcasestr.c
Date: Wed, 20 Feb 2013 22:18:27 -0800
Message-ID: <20130220221827.6ee8a6ff.idunham@lavabit.com>
In-Reply-To: <20130221010328.GN20323@brightrain.aerifal.cx>

On Wed, 20 Feb 2013 20:03:28 -0500
Rich Felker <dalias@aerifal.cx> wrote:

> 
> Yes, it seems to have been an early mistake I made getting wget to
> work. Unfortunately, I think it would be bad policy to remove it now
> since that would break existing dynamic binaries using it, but one
> could make an argument that breaking them is "right" since they were
> already broken (not behaving as intended)...
> 
> > >Since strcasestr is nonstandard and not clearly specified,
> > 
> > it's so non-standard that even nobody uses it.
> > i looked up the usage of the function in codesearch.debian.net, and
> > the only *user* (from all ~20K debian packages) of the function is
> > gnu wget.
> 
> Are you sure this search was correct? IIRC there were more...

A quick check here indicates that busybox, mutt, git, midnight commander, sylpheed, foomatic-rip, elinks, and a couple of libraries use it.
Busybox uses it in grep and for checking passwords (see libbb/obscure.c).
 
> I think leaving it as-is is the worst case. It's impossible to detect
> without runtime checks that it's incorrect, so it might encourage
> configure scripts to add runtime checks for broken strcasestr that
> break cross compiling. Or it might lead to programs just assuming it's
> correct, then breaking.
> 
> > in any case it doesnt make sense to put much work and especially
> > much code into it.
> > if it's gonna be implemented "correctly" at all, it should be as
> > slim as possible, in the order of 3-5 LOC.
> 
> As much as I appreciate Todd's interest in contributing a 2way-based
> version, I tend to agree. Not only would adopting the 2way code be a
> fairly large code addition for a never-used feature, but it would also
> bind us to the choice to do ASCII-only case mapping or drop
> performance drastically in the future if we want to change that
> decision.
> 
> My leaning right now would be to write the naive strstr loop using
> strcasecmp instead of strcmp (or an inline loop) for the inner loop.
> This will cause strcasestr to have the exact same case-folding
> semantics as strcasecmp, whatever those are in the future. This is
> best for consistency. Unfortunately, it's very bad from a performance
> standpoint, but I don't know of any code using this function for
> high-performance use.

Busybox implements their own version using this approach.
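
For reference, a minimal sketch of what that naive approach could look like. This is not the busybox code and not a proposed patch; the name naive_strcasestr is hypothetical (chosen to avoid clashing with any libc symbol), and it uses strncasecmp rather than strcasecmp for the inner comparison so that only the needle's length is compared at each position.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical name; a sketch of the naive loop, not an actual libc routine.
 * Case folding is whatever strncasecmp does, so the semantics stay
 * consistent with strcasecmp/strncasecmp by construction. */
char *naive_strcasestr(const char *h, const char *n)
{
	size_t l = strlen(n);

	/* Like strstr, an empty needle matches at the start of the haystack. */
	if (!l)
		return (char *)h;

	/* Try every starting position; worst case O(strlen(h) * l). */
	for (; *h; h++)
		if (!strncasecmp(h, n, l))
			return (char *)h;

	return NULL;
}
```

That is a handful of lines, at the cost of quadratic worst-case time on adversarial input.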
 
> The other somewhat reasonable option would be removing the function,
> which would expose breakage in programs that were already using the
> broken version in musl. I'm mildly against this, but I'd be interested
> in hearing arguments either way.

If usage were as rare as claimed, I would want it gone. As it stands, I think that a small but slow version is justifiable. A large one isn't.
-- 
Isaac Dunham <idunham@lavabit.com>


