mailing list of musl libc
 help / color / mirror / code / Atom feed
* Regression: git no longer works with musl libc's regex impl
@ 2016-10-04 15:08 Rich Felker
  2016-10-04 15:27 ` Jeff King
  0 siblings, 1 reply; 32+ messages in thread
From: Rich Felker @ 2016-10-04 15:08 UTC (permalink / raw)
  To: git; +Cc: musl

This commit broke support for using git with musl libc:

https://github.com/git/git/commit/2f8952250a84313b74f96abb7b035874854cf202

Rather than depending on non-portable GNU regex extensions, there is a
simple portable fix for the issue this code was added to work around:
When a text file is being mmapped for use with string functions which
depend on null termination, if the file size:

1. is nonzero mod page size, it just works; the remainder of the last
   page reads as zero bytes when mmapped.

2. if an exact multiple of the page size, then instead of directly
   mmapping the file, first mmap a mapping 1 byte (thus 1 page) larger
   with MAP_ANON, then use MAP_FIXED to map the file over top of all
   but the last page. Now the mmapped buffer can safely be used as a C
   string.

If such a solution is acceptable I can try to prepare a patch.

Rich


^ permalink raw reply	[flat|nested] 32+ messages in thread
* RE: [musl] Re: Regression: git no longer works with musl libc's regex impl
@ 2016-10-05 16:37 writeonce
  0 siblings, 0 replies; 32+ messages in thread
From: writeonce @ 2016-10-05 16:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: musl, git, Jeff King

Hi Johannes,

> 
> 
> -------- Original Message --------
> Subject: RE: [musl] Re: Regression: git no longer works with musl libc's
> regex impl
> From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Date: Wed, October 05, 2016 3:49 am
> To: writeonce@midipix.org
> Cc: musl@lists.openwall.com, git@vger.kernel.org, Jeff King
> <peff@peff.net>
> 
> Hi writeonce,
> 
> On Tue, 4 Oct 2016, writeonce@midipix.org wrote:
> 
> > < On Tue, 4 Oct 2016, Rich Felker wrote:
> > < 
> > < > On Tue, Oct 04, 2016 at 11:27:22AM -0400, Jeff King wrote:
> > < > > On Tue, Oct 04, 2016 at 11:08:48AM -0400, Rich Felker wrote:
> > < > > 
> > < > > > 1. is nonzero mod page size, it just works; the remainder of the
> > < > > > last page reads as zero bytes when mmapped.
> > < > > 
> > < > > Is that a portable assumption?
> > < > 
> > < > Yes.
> > < 
> > < No, it is not. You quote POSIX, but the matter of the fact is that we
> > < use a subset of POSIX in order to be able to keep things running on
> > < Windows.
> > 
> > As far as I can tell (and as the attached program may help demonstrate),
> > the above assumption has been valid on all versions of Windows since at
> > least Windows 2000.
> 
> And since W2K is already past its end of life, it would be safe for
> practical considerations.
> 
> However, I have to add two comments to that:
> 
> - it is *not* guaranteed. The behavior is undefined, even if you see
>  consistent behavior so far. Future Windows versions might break that
>  assumption freely, though.
> 

That is of course the official language, and generally speaking a good
rule of thumb. However... there is enough information to suggest that
when it comes to mapping of file-backed sections, the NT kernel
developers will choose to keep things the way they are. In brief, here's
why:

Given a "gray zone" that spans from EOF to end-of-page, there are in
essence three possible behaviors:

[1] bytes in the gray zone are accessible and are all zero's. This is
the current behavior.
[2] bytes in the gray zone are not accessible; trying to read past EOF
would result in a segfault.
[3] bytes in the gray zone are accessible but might contain random data
or junk.

Assessment:

[1] backward-compatible, POSIX-compliant, single code path for both
WIN32 and LXW.
[2] requires changing memory access granularity from 4096 bytes to a
single byte, and is therefore extremely costly.
[3] introduces a whole new class of security vulnerabilities, and will
thus be a lot of fun to watch:-)

All in all taken, then, I'd argue that relying on the current behavior
is very reasonable. If you, too, find the above assessment valid, and
since you mentioned that you were a Microsoft employee, it would be
great if you could make a good-faith effort to have the current behavior
added to the Driver Documentation and thus guaranteed.

PS. this isn't to say that the regex extension should or should not be
used, only that a decision on the matter should not be based on the
undocumentedness of current behavior.

midipix

> - some implementations of the REG_STARTEND feature have the nice property
>  that they can read past NUL characters. Granted, not all of them do
>  (AFAIU one example is FreeBSD itself, the first platform to sport
>  REG_STARTEND), but we at least reap the benefit whenever using a regex
>  that *can* read past NUL characters.
> 
> > In this context, one thing to remember is that the page-size for the mod
> > operation is 4096, whereas the POSIX page-size (for the purpose of mmap
> > and mremap) is 65536.
> 
> Indeed. A colleague of mine spotted the segfault when diffing a file that
> was *exactly* 4,096 bytes.
> 
> > Note also that in the case of file-backed mapped sections, using
> > kernel32.dll or msvcrt.dll or cygwin/newlib or midipix/musl is of little
> > significance, specifically since all invoke ZwCreateSection and
> > ZwMapViewOfSection under the hood.
> 
> Right. It's all backed by the very same kernel functions.
> 
> Ciao,
> Johannes



^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-10-07 11:30 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-04 15:08 Regression: git no longer works with musl libc's regex impl Rich Felker
2016-10-04 15:27 ` Jeff King
2016-10-04 15:40   ` Rich Felker
2016-10-04 16:08     ` Johannes Schindelin
2016-10-04 16:11       ` Rich Felker
2016-10-04 17:16         ` Johannes Schindelin
2016-10-04 18:00           ` Ray Donnelly
2016-10-04 17:29       ` Kurt H Maier
2016-10-04 17:33       ` Julien Ramseier
2016-10-04 17:36         ` Rich Felker
2016-10-04 17:39       ` [musl] " Rich Felker
2016-10-05 11:17         ` Johannes Schindelin
2016-10-05 13:01           ` Szabolcs Nagy
2016-10-05 13:15           ` Rich Felker
2016-10-04 22:06       ` [musl] " James B
2016-10-04 22:33         ` Rich Felker
2016-10-04 22:48           ` [musl] " Junio C Hamano
2016-10-05 13:11           ` Jakub Narębski
2016-10-05 16:15             ` Rich Felker
2016-10-05 10:41         ` Johannes Schindelin
2016-10-05 11:59           ` [musl] " James B
2016-10-05 16:11             ` Jeff King
2016-10-05 16:27               ` Rich Felker
2016-10-06 10:44             ` Johannes Schindelin
2016-10-06 19:18       ` Ævar Arnfjörð Bjarmason
2016-10-06 19:23         ` Jeff King
2016-10-06 19:25           ` Rich Felker
2016-10-06 19:28             ` Jeff King
2016-10-06 22:42         ` Ramsay Jones
2016-10-07 11:30           ` Jakub Narębski
2016-10-04 16:01   ` Johannes Schindelin
2016-10-05 16:37 [musl] " writeonce

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).