mailing list of musl libc
From: Hans Harder <hans@atbas.org>
To: Rich Felker <dalias@libc.org>
Cc: musl@lists.openwall.com
Subject: Re: [musl] hostname is using a case sensitive search in function name_from_hosts
Date: Wed, 15 Jun 2022 08:30:02 +0200
Message-ID: <CAKzsc6dqT_u9UyQvH1jEqJ_=yzyU+XFhn_6cA_m=decfeuMpPA@mail.gmail.com>
In-Reply-To: <20220609203426.GE7074@brightrain.aerifal.cx>

I don't have much experience making patches, but do you mean something
like this patch?

It uses strtok and only does the case-insensitive comparison if the
token is the same length as name.
It also simplifies the remaining code.

Hans

diff -u a/src/network/lookup_name.c b/src/network/lookup_name.c
--- a/src/network/lookup_name.c 2022-04-07 17:12:40.000000000 +0000
+++ b/src/network/lookup_name.c 2022-06-15 06:15:46.680000000 +0000
@@ -6,6 +6,7 @@
 #include <ctype.h>
 #include <stdlib.h>
 #include <string.h>
+#include <strings.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <pthread.h>
@@ -49,6 +50,7 @@
 static int name_from_hosts(struct address buf[static MAXADDRS], char canon[static 256], const char *name, int family)
 {
    char line[512];
+   char sep[4] = " \t\n";
    size_t l = strlen(name);
    int cnt = 0, badfam = 0, have_canon = 0;
    unsigned char _buf[1032];
@@ -62,17 +64,22 @@
        return EAI_SYSTEM;
    }
    while (fgets(line, sizeof line, f) && cnt < MAXADDRS) {
-       char *p, *z;
-
+       char *p;
        if ((p=strchr(line, '#'))) *p++='\n', *p=0;
-       for(p=line+1; (p=strstr(p, name)) &&
-           (!isspace(p[-1]) || !isspace(p[l])); p++);
-       if (!p) continue;
+       if (line[0] == 0 || line[0]=='\n') continue;
+       p = strtok(line, sep);
+       while( p != NULL ) {
+           /* only compare case insensitive if length of both are the same */
+           if (strlen(p) == l  && strcasecmp(p,name)==0) {
+               p = strtok(line, sep);
+               break;
+           }
+           p = strtok(NULL, sep);
+       }
+       if (p == NULL) continue;

        /* Isolate IP address to parse */
-       for (p=line; *p && !isspace(*p); p++);
-       *p++ = 0;
-       switch (name_from_numeric(buf+cnt, line, family)) {
+       switch (name_from_numeric(buf+cnt, p, family)) {
        case 1:
            cnt++;
            break;
@@ -86,12 +93,10 @@
        if (have_canon) continue;

        /* Extract first name as canonical name */
-       for (; *p && isspace(*p); p++);
-       for (z=p; *z && !isspace(*z); z++);
-       *z = 0;
-       if (is_valid_hostname(p)) {
+       p = strtok(NULL, sep);
+       if (p != NULL && is_valid_hostname(p)) {
            have_canon = 1;
-           memcpy(canon, p, z-p+1);
+           strcpy(canon, p);
        }
    }
    __fclose_ca(f);
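
For comparison, here is a minimal standalone sketch of the same idea
(whole-token, length-checked, case-insensitive matching) that does not
modify the line. It is not musl code: hosts_line_has_name is a
hypothetical helper, and it assumes ASCII-only names, since strncasecmp
folds single bytes only. Leaving the line untouched matters because,
once strtok has split a buffer, calling strtok(line, sep) on it again
yields only the first token, and the following strtok(NULL, sep)
returns NULL.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not part of musl: returns 1 if `name` appears as
 * a whole token after the address field of a hosts-file line, compared
 * without regard to ASCII case; returns 0 otherwise. The line is only
 * read, never modified. */
static int hosts_line_has_name(const char *line, const char *name)
{
    static const char sep[] = " \t\r\n";
    size_t l = strlen(name);
    const char *p = line;
    int field = 0;

    if (l == 0) return 0;
    for (;;) {
        p += strspn(p, sep);                /* skip separators */
        if (!*p || *p == '#') return 0;     /* end of line or comment */
        size_t n = strcspn(p, " \t\r\n#");  /* length of this token */
        /* field 0 is the address; only later fields are names */
        if (field++ > 0 && n == l && strncasecmp(p, name, l) == 0)
            return 1;
        p += n;
    }
}
```

With this sketch, the line "127.0.0.1 LocalHost lh" matches the names
"localhost" and "LH" but not "local" or "127.0.0.1". The caller can
then parse the address and the first name from the still-intact line.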



On Thu, Jun 9, 2022 at 10:34 PM Rich Felker <dalias@libc.org> wrote:
>
> On Thu, Jun 09, 2022 at 08:42:28PM +0200, Hans Harder wrote:
> > Hi,
> > I discovered that the function name_from_hosts parses the /etc/hosts
> > file and does a case sensitive search for a name.
> > Sometimes I encounter mixed upper and lowercase hostnames in a /etc/hosts file.
> > It would be easier if the function searches for the name in a case
> > insensitive way....
> >
> > By changing line 68  in src/network/lookup_name.c
> >        for(p=line+1; (p=strstr(p, name)) &&
> > to:
> >        for(p=line+1; (p=strcasestr(p, name)) &&
> >
> > That would resolve the problem.
>
> strcasestr isn't a good match here, because it's quadratic time and
> would be potentially quite slow (depending on file contents). It's
> also not in a usable namespace, and is something of a junk function we
> included for questionable reasons.
>
> The core problem here is that strstr isn't really the right operation
> to be using, and was something of a lazy hack. Due to the linear-time
> implementation it doesn't hurt, but it would make a lot more sense to
> parse this right looking at separators. Even then though it's some
> work to make it properly case-insensitive; strcasecmp is insufficient
> and only handles single-byte characters. So the right thing to do is
> really picking up review and merge of the draft IDN handling work,
> which (if I'm remembering right) normalizes case as an inherent part
> of the process.
>
> Rich
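
The single-byte limitation mentioned above can be illustrated with a
small sketch. The helper name same_ignoring_case is hypothetical, and
the example assumes the program runs in the default "C" locale with
UTF-8 encoded input.

```c
#include <string.h>
#include <strings.h>

/* Hypothetical demonstration helper: returns 1 if strcasecmp treats the
 * two strings as equal, 0 otherwise. */
static int same_ignoring_case(const char *a, const char *b)
{
    return strcasecmp(a, b) == 0;
}
```

Here same_ignoring_case("Example", "EXAMPLE") is 1, but comparing
"\xc3\x89" with "\xc3\xa9" (the UTF-8 encodings of an uppercase and a
lowercase e with acute accent) gives 0, because the bytes are folded
one at a time and bytes outside ASCII are left unchanged.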

Thread overview: 5+ messages
2022-06-09 18:42 Hans Harder
2022-06-09 20:34 ` Rich Felker
2022-06-15  6:30   ` Hans Harder [this message]
2022-06-16 16:19   ` NRK
2022-06-16 16:45     ` Rich Felker