mailing list of musl libc
 help / color / mirror / code / Atom feed
* RE: [musl] Re: Regression: git no longer works with musl libc's regex impl
@ 2016-10-05 16:37 writeonce
  0 siblings, 0 replies; 6+ messages in thread
From: writeonce @ 2016-10-05 16:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: musl, git, Jeff King

Hi Johannes,

> 
> 
> -------- Original Message --------
> Subject: RE: [musl] Re: Regression: git no longer works with musl libc's
> regex impl
> From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Date: Wed, October 05, 2016 3:49 am
> To: writeonce@midipix.org
> Cc: musl@lists.openwall.com, git@vger.kernel.org, Jeff King
> <peff@peff.net>
> 
> Hi writeonce,
> 
> On Tue, 4 Oct 2016, writeonce@midipix.org wrote:
> 
> > < On Tue, 4 Oct 2016, Rich Felker wrote:
> > < 
> > < > On Tue, Oct 04, 2016 at 11:27:22AM -0400, Jeff King wrote:
> > < > > On Tue, Oct 04, 2016 at 11:08:48AM -0400, Rich Felker wrote:
> > < > > 
> > < > > > 1. is nonzero mod page size, it just works; the remainder of the
> > < > > > last page reads as zero bytes when mmapped.
> > < > > 
> > < > > Is that a portable assumption?
> > < > 
> > < > Yes.
> > < 
> > < No, it is not. You quote POSIX, but the matter of the fact is that we
> > < use a subset of POSIX in order to be able to keep things running on
> > < Windows.
> > 
> > As far as I can tell (and as the attached program may help demonstrate),
> > the above assumption has been valid on all versions of Windows since at
> > least Windows 2000.
> 
> And since W2K is already past its end of life, it would be safe for
> practical considerations.
> 
> However, I have to add two comments to that:
> 
> - it is *not* guaranteed. The behavior is undefined, even if you see
>  consistent behavior so far. Future Windows versions might break that
>  assumption freely, though.
> 

That is of course the official language, and generally speaking a good
rule of thumb. However... there is enough information to suggest that
when it comes to mapping of file-backed sections, the NT kernel
developers will choose to keep things the way they are. In brief, here's
why:

Given a "gray zone" that spans from EOF to end-of-page, there are in
essence three possible behaviors:

[1] bytes in the gray zone are accessible and are all zero's. This is
the current behavior.
[2] bytes in the gray zone are not accessible; trying to read past EOF
would result in a segfault.
[3] bytes in the gray zone are accessible but might contain random data
or junk.

Assessment:

[1] backward-compatible, POSIX-compliant, single code path for both
WIN32 and LXW.
[2] requires changing memory access granularity from 4096 bytes to a
single byte, and is therefore extremely costly.
[3] introduces a whole new class of security vulnerabilities, and will
thus be a lot of fun to watch:-)

All in all taken, then, I'd argue that relying on the current behavior
is very reasonable. If you, too, find the above assessment valid, and
since you mentioned that you were a Microsoft employee, it would be
great if you could make a good-faith effort to have the current behavior
added to the Driver Documentation and thus guaranteed.

PS. this isn't to say that the regex extension should or should not be
used, only that a decision on the matter should not be based on the
undocumentedness of current behavior.

midipix

> - some implementations of the REG_STARTEND feature have the nice property
>  that they can read past NUL characters. Granted, not all of them do
>  (AFAIU one example is FreeBSD itself, the first platform to sport
>  REG_STARTEND), but we at least reap the benefit whenever using a regex
>  that *can* read past NUL characters.
> 
> > In this context, one thing to remember is that the page-size for the mod
> > operation is 4096, whereas the POSIX page-size (for the purpose of mmap
> > and mremap) is 65536.
> 
> Indeed. A colleague of mine spotted the segfault when diffing a file that
> was *exactly* 4,096 bytes.
> 
> > Note also that in the case of file-backed mapped sections, using
> > kernel32.dll or msvcrt.dll or cygwin/newlib or midipix/musl is of little
> > significance, specifically since all invoke ZwCreateSection and
> > ZwMapViewOfSection under the hood.
> 
> Right. It's all backed by the very same kernel functions.
> 
> Ciao,
> Johannes



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Re: Regression: git no longer works with musl libc's regex impl
  2016-10-05 11:59           ` [musl] " James B
@ 2016-10-05 16:11             ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2016-10-05 16:11 UTC (permalink / raw)
  To: James B; +Cc: Johannes Schindelin, musl, Rich Felker, git

On Wed, Oct 05, 2016 at 10:59:34PM +1100, James B wrote:

> Number downloads does not make first-tier platform. You know that as
> well as everyone else.
> 
> First-tier support is the decision made by the maintainers that the
> entire features of the software must be available on those first tier
> platforms. So if Windows is indeed first-tier platform for git, it
> means any features that don't work on git version of Windows must not
> be used/developed or even castrated. That's a scary thought.

Prepare to be scared, then, I guess. Ever since the msysgit project
started years ago, we have made concessions in the code to work both
with POSIX-ish systems and with the msys layer. E.g., see how git-daemon
does not fork(), but actually re-spawns itself to handle connections.

When possible we try to put our abstractions at a level where they can
be implemented in a performant way on all platforms (the git-daemon
things is probably the _most_ ugly in that respect; I think nobody has
really cared about the performance enough to add back in a forking code
path for POSIX systems).

> So this decision that "Windows is now a first-tier platform for git" -
> is your own opinion, or is this the collective opinion of *all* the
> git maintainers?

There is only one maintainer of git: Junio. However, you'll note that I
also used "we" in the paragraphs above. And that is because the approach
I am talking about is something that has been done over the course of
many years by many members of the development community.

You may disagree with that approach, but it is nothing new. The msysgit
project started in 2007.

> Well thank you for being honest. I can see now why you responded the
> way you did (and still do). By being employed by Microsoft, and
> especially paid to work on Git for Windows, you have all the
> incentives to make it work best on Windows, and to make it as its
> first-tier platform within the limitation of Windows.

Please don't insinuate that Johannes is a Microsoft shill. He has been
working on the Windows port of Git for over 9 years, and was only
employed by Microsoft this year. Furthermore, his original REG_STARTEND
patch actually did a run-time fallback of NUL-terminating the input
buffers. It was _I_ who suggested that we should simply push people
towards our compat/regex routines instead. So if you want to be mad at
somebody, be mad at me.

-Peff


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Re: Regression: git no longer works with musl libc's regex impl
  2016-10-05 10:41         ` Johannes Schindelin
@ 2016-10-05 11:59           ` James B
  2016-10-05 16:11             ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: James B @ 2016-10-05 11:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: musl, Rich Felker, Jeff King, git

On Wed, 5 Oct 2016 12:41:50 +0200 (CEST)
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> > 
> > Wow, I don't know that Windows is a git's first-tier platform now,
> 
> It is. Git for Windows is maintained by me, and I make as certain as I can
> that it works fine. 
> And yes, we have download numbers to support my claim.
> The latest release is less than 24h old, but I can point you to Git for
> Windows 2.8.1 whose 32-bit installer was downloaded 397,273 times, and
> whose 64-bit installer was downloaded 3,780,079 times.

Number downloads does not make first-tier platform. You know that as well as everyone else.

First-tier support is the decision made by the maintainers that the entire features of the software must be available on those first tier platforms. So if Windows is indeed first-tier platform for git, it means any features that don't work on git version of Windows must not be used/developed or even castrated. That's a scary thought.

Being a maintainer for "Git for Windows" does not make one automatically as the maintainer for "git", although that can happen.

So this decision that "Windows is now a first-tier platform for git" - is your own opinion, or is this the collective opinion of *all* the git maintainers? 


> 
> > and Linux/POSIX second.
> 
> This is not at all what I said, so please be careful of what you accuse
> me.

Yes, you did not say that. I said that. And I will say more. Git has Linux/POSIX roots. Attempting to "not use common POSIX features because they're not available on Windows" *does* make Linux/POSIX feels like second class platform for git. The way I see it, it should be *the other way around*.

It's a very sad day for a tool that was developed originally to maintain Linux kernel, by the Linux kernel author, now is restricted to avoid use/optimise on Linux/POSIX features *because* it has to run on another Windows ...

> 
> What I said is that we never exploited the full POSIX standard, but that
> we made certain to use a subset of POSIX in Git which would be relatively
> easy to emulate using Windows' API.

All this just proves my point above.

And - I notice you use the pronoun "we" - is that a "royal we" (which means the entire point is your own or your cohorts position), or is it the official position of all git maintainers?

> 
> > Are we talking about the same git that was originally written in Linus
> > Torvalds, and is used to manage Linux kernel?
> 
> It was originally written by (not in) Linus Torvalds, and yes, the Linux
> kernel is one of its many users.

That was a rhetorical question.

> 
> > Are you by any chance employed by Redmond, directly or indirectly?
> 
> I am not exactly employed by Redmond, but by Microsoft (this is what you
> meant, I guess).
> 
> I maintained Git for Windows in my spare time, next to a very demanding
> position in science, for eight years. In 2015, I joined Microsoft and part
> of my role is to maintain Git for Windows, allowing me to do a much better
> job at it.


Well thank you for being honest. I can see now why you responded the way you did (and still do). By being employed by Microsoft, and especially paid to work on Git for Windows, you have all the incentives to make it work best on Windows, and to make it as its first-tier platform within the limitation of Windows.

That in itself is not a problem - it only starts to become a problem when you try to cut down support for other platforms or stifle improvements on other platforms because "hey it makes it too hard to do those things in Windows".

> 
> Of course, I do not only improve Git's Windows support, but contribute
> other patches, too. You might also appreciate the fact that some of my
> colleagues started contributing patches to Git that benefit all Git users.
> 

By "colleagues" I assume other Microsoft employees? 
I don't have a problem with that - thank you and your colleagues for making git better.

But that does not give the right to you to control that "if it doesn't work on Windows we shouldn't do it". If you do, and at the same time claim that musl-libc (which is Linux-only) support is unimportant compared to your 4 million Git-for-Windows downloads really don't do well for your or your employer's image. I don't have to say why - everyone outside Microsoft knows why.

In conclusion, I certainly hope that your view is not shared by the other git maintainers.

PS: Rich, sorry for the distraction. I have said what I want to say, so I'll bow out from this thread.

cheers,
James


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Re: Regression: git no longer works with musl libc's regex impl
  2016-10-04 22:33         ` Rich Felker
@ 2016-10-04 22:48           ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2016-10-04 22:48 UTC (permalink / raw)
  To: Rich Felker; +Cc: James B, musl, Johannes Schindelin, Jeff King, git

Rich Felker <dalias@libc.org> writes:

> This is especially unfriendly when the semantics of the switch come
> across, at least to some users, as "your system regex is incomplete"
> rather than "git can't use it because git depends on nonstandard
> extensions".

The latter is exactly what Makefile patch that brought this change
says, I think.

    # Define NO_REGEX if your C library lacks regex support with REG_STARTEND
    # feature.

Before the series updated the message to the above, we used to say:

    # Define NO_REGEX if you have no or inferior regex support in your C library.

which _was_ unfair to those who needed to set NO_REGEX for whatever
reason.  It was totally unclear "inferior" relative to what standard
the message was passing its harsh judgement on your C library.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Re: Regression: git no longer works with musl libc's regex impl
  2016-10-04 16:08     ` Johannes Schindelin
  2016-10-04 17:39       ` [musl] " Rich Felker
@ 2016-10-04 22:06       ` James B
  2016-10-04 22:33         ` Rich Felker
  2016-10-05 10:41         ` Johannes Schindelin
  1 sibling, 2 replies; 6+ messages in thread
From: James B @ 2016-10-04 22:06 UTC (permalink / raw)
  To: musl; +Cc: Johannes Schindelin, Rich Felker, Jeff King, git

On Tue, 4 Oct 2016 18:08:33 +0200 (CEST)
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> 
> No, it is not. You quote POSIX, but the matter of the fact is that we use
> a subset of POSIX in order to be able to keep things running on Windows.
> 
> And quite honestly, there are lots of reasons to keep things running on
> Windows, and even to favor Windows support over musl support. Over four
> million reasons: the Git for Windows users.
> 

Wow, I don't know that Windows is a git's first-tier platform now, and Linux/POSIX second. Are we talking about the same git that was originally written in Linus Torvalds, and is used to manage Linux kernel? Are you by any chance employed by Redmond, directly or indirectly?

Sorry - can't help it.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Re: Regression: git no longer works with musl libc's regex impl
  2016-10-04 16:08     ` Johannes Schindelin
@ 2016-10-04 17:39       ` Rich Felker
  2016-10-04 22:06       ` James B
  1 sibling, 0 replies; 6+ messages in thread
From: Rich Felker @ 2016-10-04 17:39 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, git, musl

On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
> Hi Rich,
> 
> On Tue, 4 Oct 2016, Rich Felker wrote:
> 
> > On Tue, Oct 04, 2016 at 11:27:22AM -0400, Jeff King wrote:
> > > On Tue, Oct 04, 2016 at 11:08:48AM -0400, Rich Felker wrote:
> > > 
> > > > 1. is nonzero mod page size, it just works; the remainder of the last
> > > >    page reads as zero bytes when mmapped.
> > > 
> > > Is that a portable assumption?
> > 
> > Yes.
> 
> No, it is not. You quote POSIX, but the matter of the fact is that we use
> a subset of POSIX in order to be able to keep things running on Windows.
> 
> And quite honestly, there are lots of reasons to keep things running on
> Windows, and even to favor Windows support over musl support. Over four
> million reasons: the Git for Windows users.
> 
> So rather than getting into an ideological discussion about "broken"
> systems, it would be good to keep things practical, realizing that those
> users make up a very real chunk of all of Git's users.
> 
> As to making NO_REGEX conditional on REG_STARTEND: you are talking about
> apples and oranges here. NO_REGEX is a Makefile flag, while REG_STARTEND
> is a C preprocessor macro.

It seems like you could just always compile the source file, and just
have it all inside #if defined(NO_REGEX) || !defined(REG_STARTEND) or
similar.

> And lastly, the best alternative would be to teach musl about
> REG_STARTEND, as it is rather useful a feature.

Maybe, but it seems fundamentally costly to support -- it's extra
state in the inner loops that imposes costly spill/reload on archs
with too few registers (x86). I'll look at doing this when we
overhaul/replace the regex implementation, and I'm happy to do some
performance-regression tests for adding it now if someone has a simple
patch (as was mentioned on the musl list).

Rich


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-10-05 16:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-05 16:37 [musl] Re: Regression: git no longer works with musl libc's regex impl writeonce
  -- strict thread matches above, loose matches on Subject: below --
2016-10-04 15:08 Rich Felker
2016-10-04 15:27 ` Jeff King
2016-10-04 15:40   ` Rich Felker
2016-10-04 16:08     ` Johannes Schindelin
2016-10-04 17:39       ` [musl] " Rich Felker
2016-10-04 22:06       ` James B
2016-10-04 22:33         ` Rich Felker
2016-10-04 22:48           ` [musl] " Junio C Hamano
2016-10-05 10:41         ` Johannes Schindelin
2016-10-05 11:59           ` [musl] " James B
2016-10-05 16:11             ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).