From: "Robert Högberg" <robert.hogberg@gmail.com>
To: musl@lists.openwall.com
Subject: Unexpected regex behaviour
Date: Mon, 29 Oct 2018 23:26:19 +0100
Message-ID: <CAFYbUHMxwFOzW9f_T0etsi9efH3RoSJpBhCVigBXB9LM-ANE-A@mail.gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 1195 bytes --]
Hi,
I've noticed that the musl regex implementation behaves slightly
differently than the glibc implementation. I'm attaching a short program
showing the behaviour.
The difference makes yate (http://yate.null.ro) misbehave when running with
musl (reported here: https://github.com/openwrt/telephony/issues/378).
Yate uses a regexp like this:
"^\\([[:alpha:]][[:alnum:]]\\+:\\)\\?/\\?/\\?\\([^[:space:][:cntrl:]@]\\+@\\)\\?\\([[:alnum:]._+-]\\+\\|[[][[:xdigit:].:]\\+[]]\\)\\(:[0-9]\\+\\)\\?"
.. to parse strings like:
"sip:012345678@11.111.11.111:5060;user=phone"
.. and the matches produced by musl are:
Match 0: 0 - 32 sip:012345678@11.111.11.111:5060
Match 1: -1 - -1
Match 2: 0 - 14 sip:012345678@
Match 3: 14 - 27 11.111.11.111
Match 4: 27 - 32 :5060
.. while glibc produces:
Match 0: 0 - 32 sip:012345678@11.111.11.111:5060
Match 1: 0 - 4 sip:
Match 2: 4 - 14 012345678@
Match 3: 14 - 27 11.111.11.111
Match 4: 27 - 32 :5060
What do you think?
I've only tested musl 1.1.19, so apologies if this no longer applies to
later releases. I skimmed the 1.1.20 release notes and didn't find anything
regex-related.
Regards
Robert
[-- Attachment #2: yate_regexp.c --]
[-- Type: text/x-csrc, Size: 1402 bytes --]
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_MATCH 9

int main(void)
{
    const char *s = "sip:012345678@11.111.11.111:5060;user=phone";
    const char *re = "^\\([[:alpha:]][[:alnum:]]\\+:\\)\\?/\\?/\\?\\([^[:space:][:cntrl:]@]\\+@\\)\\?\\([[:alnum:]._+-]\\+\\|[[][[:xdigit:].:]\\+[]]\\)\\(:[0-9]\\+\\)\\?";
    regex_t data;
    regmatch_t rmatch[MAX_MATCH];

    /* cflags == 0 compiles the pattern as a basic regular expression (BRE) */
    if (regcomp(&data, re, 0) != 0) {
        fprintf(stderr, "regcomp failed\n");
        return 1;
    }
    if (regexec(&data, s, MAX_MATCH, rmatch, 0) != 0) {
        fprintf(stderr, "no match\n");
        regfree(&data);
        return 1;
    }
    for (int i = 0; i < MAX_MATCH; i++) {
        char substr[256] = "";
        /* Groups that did not take part in the match have rm_so == -1 */
        if (rmatch[i].rm_so >= 0) {
            size_t len = (size_t)(rmatch[i].rm_eo - rmatch[i].rm_so);
            if (len >= sizeof substr)
                len = sizeof substr - 1;
            memcpy(substr, s + rmatch[i].rm_so, len);
            substr[len] = '\0';
        }
        /* regoff_t is not necessarily int, so cast before printing */
        printf("Match %d: %2d - %2d \t%s\n",
               i, (int)rmatch[i].rm_so, (int)rmatch[i].rm_eo, substr);
    }
    regfree(&data);
    return 0;
}
/*
glibc:
Match 0: 0 - 32 sip:012345678@11.111.11.111:5060
Match 1: 0 - 4 sip:
Match 2: 4 - 14 012345678@
Match 3: 14 - 27 11.111.11.111
Match 4: 27 - 32 :5060
Match 5: -1 - -1
Match 6: -1 - -1
Match 7: -1 - -1
Match 8: -1 - -1
musl 1.1.19:
Match 0: 0 - 32 sip:012345678@11.111.11.111:5060
Match 1: -1 - -1
Match 2: 0 - 14 sip:012345678@
Match 3: 14 - 27 11.111.11.111
Match 4: 27 - 32 :5060
Match 5: -1 - -1
Match 6: -1 - -1
Match 7: -1 - -1
Match 8: -1 - -1
*/
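One detail of the test program that may matter here: regcomp() is called
with cflags == 0, so the pattern is compiled as a POSIX basic regular
expression, and POSIX leaves the meaning of \+, \? and \| in a BRE
undefined (both libcs evidently accept them as extensions, since both match
the whole string). A minimal sketch of the same pattern rewritten as an
extended regular expression, where +, ? and | are standard operators,
follows. The names yate_uri_ere, match_uri_ere and print_groups are made up
for this example and do not come from yate. Whether the ERE form changes
the musl result is not something this message establishes.

```c
#include <regex.h>
#include <stdio.h>

#define NGROUPS 5 /* whole match plus the four parenthesised groups */

/* ERE translation of the BRE above: \( \) \+ \? \| become ( ) + ? | */
static const char *const yate_uri_ere =
    "^([[:alpha:]][[:alnum:]]+:)?/?/?"
    "([^[:space:][:cntrl:]@]+@)?"
    "([[:alnum:]._+-]+|[[][[:xdigit:].:]+[]])"
    "(:[0-9]+)?";

/* Returns 0 and fills m[0..NGROUPS-1] on a match, nonzero otherwise. */
static int match_uri_ere(const char *s, regmatch_t m[NGROUPS])
{
    regex_t re;
    int rc = regcomp(&re, yate_uri_ere, REG_EXTENDED);

    if (rc != 0)
        return rc;
    rc = regexec(&re, s, NGROUPS, m, 0);
    regfree(&re);
    return rc;
}

/* Prints each group's offsets and text; unmatched groups show -1 - -1. */
static void print_groups(const char *s, const regmatch_t m[NGROUPS])
{
    for (int i = 0; i < NGROUPS; i++) {
        int so = (int)m[i].rm_so;
        int eo = (int)m[i].rm_eo;

        if (so >= 0)
            printf("Match %d: %2d - %2d \t%.*s\n", i, so, eo, eo - so, s + so);
        else
            printf("Match %d: %2d - %2d \t\n", i, so, eo);
    }
}
```

Calling match_uri_ere(s, m) and then print_groups(s, m) on the SIP string
above prints groups 0 through 4 in the same layout as the test program, so
the offsets of groups 1 and 2 can be compared directly between libcs.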