zsh-workers
* Possible ZSH bug with IO direction
@ 2016-04-23 18:30 Roger Qiu
  2016-04-23 22:14 ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Roger Qiu @ 2016-04-23 18:30 UTC (permalink / raw)
  To: zsh-workers

Hi,

I'm running ZSH on Cygwin. But earlier I noticed this error:

```
 > gm convert -compress JPEG - - < input.jpg > output.jpg
gm convert: Corrupt JPEG data: 873 extraneous bytes before marker 0xd9 
(/tmp/gmo1fx92).
```

These 2 work fine in ZSH:

```
 > cat input.jpg | gm convert -compress JPEG - - > output.jpg
 > gm convert -compress JPEG input.jpg output.jpg
```

All 3 commands above work in Bash without problems.

I'm reporting it here to see if anybody else is seeing the same 
problem, and if so, perhaps it is a ZSH bug.

I had asked this earlier in: 
http://stackoverflow.com/questions/36495113/in-zsh-redirecting-image-file-into-graphic-magick-results-in-corrupt-jpeg-data

Thanks,
Roger

-- 
Founder of Matrix AI
https://matrix.ai/
+61420925975



* Re: Possible ZSH bug with IO direction
  2016-04-23 18:30 Possible ZSH bug with IO direction Roger Qiu
@ 2016-04-23 22:14 ` Bart Schaefer
  2016-04-24 12:17   ` Peter Stephenson
  2016-04-24 17:08   ` Roger Qiu
  0 siblings, 2 replies; 18+ messages in thread
From: Bart Schaefer @ 2016-04-23 22:14 UTC (permalink / raw)
  To: roger.qiu, zsh-workers

On Apr 24,  4:30am, Roger Qiu wrote:
}
}  > gm convert -compress JPEG - - < input.jpg > output.jpg
} gm convert: Corrupt JPEG data: 873 extraneous bytes before marker 0xd9 
} (/tmp/gmo1fx92).

I suspect you are encountering the issue that "gm" wants input.jpg in
binary mode, but zsh's input redirection operator has forced it to text
mode.  A lengthy comment in Src/main.c (copypasted below) explains this.  

There's [intended to be] an easy workaround, which is to open the file
for both read and write even though you're only going to read it:

    gm convert -compress JPEG - - <> input.jpg > output.jpg

However, it's been years since I had a Cygwin system or the time/patience
to care to set one up, so I haven't tested the workaround.
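
A minimal sketch of the two redirection forms, assuming a scratch file in
place of input.jpg and `cat` in place of `gm` (both are stand-ins, not part
of the original report). Outside Cygwin both forms deliver identical bytes;
the point is only that `<>` hands the program the same data on descriptor 0
without the file being opened read-only:

```shell
# Hypothetical scratch file standing in for input.jpg.
workdir=$(mktemp -d)
printf 'JPEG-ish\r\npayload' > "$workdir/input.bin"

# Read-only redirection: on Cygwin zsh this is the open that gets O_TEXT.
cat < "$workdir/input.bin" > "$workdir/via_ro.bin"

# Read/write redirection: same data on stdin, but not opened RDONLY.
cat <> "$workdir/input.bin" > "$workdir/via_rw.bin"

# Compare each copy with the source, byte for byte.
cmp "$workdir/input.bin" "$workdir/via_ro.bin" && ro_ok=yes
cmp "$workdir/input.bin" "$workdir/via_rw.bin" && rw_ok=yes
echo "read-only copy intact: ${ro_ok:-no}; read/write copy intact: ${rw_ok:-no}"
```

On an affected Cygwin zsh one would expect only the read-only copy to
differ, but that is an inference from the main.c comment, not something
this sketch verifies.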

Aside:  The zsh/system module "sysopen" doesn't recognize O_BINARY or
O_TEXT modes; that should probably be corrected if someone can compile
on a system that has them, but of course the comment below indicates
that O_BINARY would be overridden anyway.


 * Peter A. Castro <doctor@fruitbat.org>
 *
 * Cygwin supports the notion of binary or text mode access to files
 * based on the mount attributes of the filesystem.  If a file is on
 * a binary mounted filesystem, you get exactly what's in the file, CRLF's
 * and all.  If it's on a text mounted filesystem, Cygwin will strip out
 * the CRs.  This presents a problem because zsh code doesn't allow for
 * CRLF's as line terminators.  So, we must force all open files to be
 * in text mode regardless of the underlying filesystem attributes.
 * However, we only want to do this for reading, not writing as we still
 * want to write files in the mode of the filesystem.  To do this,
 * we have two options: augment all {f}open() calls to have O_TEXT added to
 * the list of file mode options, or have the Cygwin runtime do it for us.
 * I choose the latter. :)
 *
 * Cygwin's runtime provides pre-execution hooks which allow you to set
 * various attributes for the process which affect how the process functions.
 * One of these attributes controls how files are opened.  I've set
 * it up so that all files opened RDONLY will have the O_TEXT option set,
 * thus forcing line termination manipulation.  This seems to solve the
 * problem (at least the Test suite runs clean :).
 *
 * Note: this may not work in later implementations.  This will override
 * all mode options passed into open().  Cygwin (really Windows) doesn't
 * support all that much in options, so for now this is OK, but later on
 * it may not, in which case O_TEXT will have to be added to all open calls
 * appropriately.
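
To see why CR stripping matters for binary data, here is a rough
simulation. It assumes that deleting CR bytes with `tr` is a fair stand-in
for text-mode reading; the real translation happens inside the Cygwin
runtime and may differ in detail. The six bytes are made up for the
example:

```shell
# Simulate text-mode reading by deleting CR bytes; this only approximates
# what the Cygwin runtime does and is not its actual code path.
workdir=$(mktemp -d)

# Six "binary" bytes that happen to contain a CR LF pair (0x0d 0x0a).
printf '\377\330\015\012\377\331' > "$workdir/raw.bin"

# LC_ALL=C keeps tr byte-oriented even in a multibyte locale.
LC_ALL=C tr -d '\r' < "$workdir/raw.bin" > "$workdir/text_mode.bin"

# Arithmetic expansion strips any padding some wc implementations add.
raw_size=$(( $(wc -c < "$workdir/raw.bin") ))
text_size=$(( $(wc -c < "$workdir/text_mode.bin") ))
echo "raw: $raw_size bytes, after CR stripping: $text_size bytes"
```

Any format with checksums or length fields (JPEG, gzip) will reject a
stream that has lost bytes this way.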



* Re: Possible ZSH bug with IO direction
  2016-04-23 22:14 ` Bart Schaefer
@ 2016-04-24 12:17   ` Peter Stephenson
  2016-04-24 18:36     ` Bart Schaefer
  2016-04-24 17:08   ` Roger Qiu
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Stephenson @ 2016-04-24 12:17 UTC (permalink / raw)
  To: zsh-workers

On Sat, 23 Apr 2016 15:14:36 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Apr 24,  4:30am, Roger Qiu wrote:
> }
> }  > gm convert -compress JPEG - - < input.jpg > output.jpg
> } gm convert: Corrupt JPEG data: 873 extraneous bytes before marker 0xd9 
> } (/tmp/gmo1fx92).
> 
> I suspect you are encountering the issue that "gm" wants input.jpg in
> binary mode, but zsh's input redirection operator has forced it to text
> mode.  A lengthy comment in Src/main.c (copypasted below) explains this.  

Mounting the filesystem within Cygwin as binary works here.  Typically
that's the right thing to do if you're dealing with Unix-y type files,
but if you want reasonably transparent handling of Windows text files
it's not the right thing to do.

pws



* Re: Possible ZSH bug with IO direction
  2016-04-23 22:14 ` Bart Schaefer
  2016-04-24 12:17   ` Peter Stephenson
@ 2016-04-24 17:08   ` Roger Qiu
  2016-04-24 18:35     ` Bart Schaefer
  1 sibling, 1 reply; 18+ messages in thread
From: Roger Qiu @ 2016-04-24 17:08 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

Thanks, the workaround did work. But I've never seen read & write 
redirection at the interactive prompt. Does that make the STDIN for the 
program I'm running readable and writable? How does that work? Can the 
program then rewind that descriptor, and write to `input.jpg`?

Also since Bash doesn't suffer from this problem, will this be fixed 
eventually?


-- 
Founder of Matrix AI
https://matrix.ai/
+61420925975



* Re: Possible ZSH bug with IO direction
  2016-04-24 17:08   ` Roger Qiu
@ 2016-04-24 18:35     ` Bart Schaefer
  2016-04-25 17:31       ` Peter A. Castro
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2016-04-24 18:35 UTC (permalink / raw)
  To: zsh-workers

On Apr 25,  3:08am, Roger Qiu wrote:
} Subject: Re: Possible ZSH bug with IO direction
}
} Thanks, the workaround did work. But I've never seen read & write 
} redirection at the interactive prompt. Does that make the STDIN for the 
} program I'm running readable and writable?

Yes.

} How does that work?

Stdin/out/err are just file descriptors 0/1/2.  They can be open in
whatever mode the parent process likes.  It wouldn't make much sense
to have stdin open for write only, but nothing prevents that.

Consider that /dev/tty is all of stdin/out/err for most interactive
programs.  It's open read/write even though programs don't usually treat
the individual descriptors that way.

} Can the program then rewind that descriptor, and write to `input.jpg`?

If the program were using system-call interfaces on descriptor 0, yes.
However the STDIO library object STDIN will still be initialized for
reading only, so in most cases the program won't notice.
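
A small sketch of that last point, assuming a throwaway file and a child
`sh` that writes to descriptor 0 directly (the file name and the two bytes
written are arbitrary):

```shell
workdir=$(mktemp -d)
printf 'original\n' > "$workdir/data.txt"

# The child writes two bytes to descriptor 0 via the shell's >&0
# duplication, which ends up as a plain write on the inherited descriptor.
sh -c 'printf XY >&0' <> "$workdir/data.txt"

# The descriptor started at offset 0 and the file was not truncated, so
# only the first two bytes change.
result=$(cat "$workdir/data.txt")
echo "$result"
```

A program that only reads through stdio's stdin never takes this path,
which is why the read/write mode normally goes unnoticed.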

} Also since Bash doesn't suffer from this problem, will this be fixed 
} eventually?

As the comment from main.c explains, it's a question of either breaking
a few external programs that care about raw file data, or breaking the
entire internal string-processing in e.g. zsh's parameter substitution
any time those strings are read from a file.  So I would expect the
answer is that this won't change unless Cygwin changes something.  Also,
we're woefully short on any volunteers for Windows-specific issues.



* Re: Possible ZSH bug with IO direction
  2016-04-24 12:17   ` Peter Stephenson
@ 2016-04-24 18:36     ` Bart Schaefer
  2016-04-24 19:01       ` Peter Stephenson
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2016-04-24 18:36 UTC (permalink / raw)
  To: zsh-workers

On Apr 24,  1:17pm, Peter Stephenson wrote:
}
} Mounting the filesystem within Cygwin as binary works here.

Really?  That contradicts what the main.c comment asserts.



* Re: Possible ZSH bug with IO direction
  2016-04-24 18:36     ` Bart Schaefer
@ 2016-04-24 19:01       ` Peter Stephenson
  2016-04-24 21:23         ` m0viefreak
                           ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Peter Stephenson @ 2016-04-24 19:01 UTC (permalink / raw)
  To: zsh-workers

On Sun, 24 Apr 2016 11:36:44 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Apr 24,  1:17pm, Peter Stephenson wrote:
> }
> } Mounting the filesystem within Cygwin as binary works here.
> 
> Really?  That contradicts what the main.c comment asserts.

So it seems like "binary" isn't quite what it's cracked up to be.

The alternative might be to do something similar in the lower levels of
zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
it's no worse than doing it in the OS abstraction, and doesn't affect
fd's used by other programmes.

pws



* Re: Possible ZSH bug with IO direction
  2016-04-24 19:01       ` Peter Stephenson
@ 2016-04-24 21:23         ` m0viefreak
  2016-04-25  9:12           ` Peter Stephenson
  2016-04-25 17:25           ` Peter A. Castro
  2016-04-24 21:53         ` Bart Schaefer
  2016-04-25 17:16         ` Peter A. Castro
  2 siblings, 2 replies; 18+ messages in thread
From: m0viefreak @ 2016-04-24 21:23 UTC (permalink / raw)
  To: zsh-workers

On 24.04.2016 21:01, Peter Stephenson wrote:
> On Sun, 24 Apr 2016 11:36:44 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
>> On Apr 24,  1:17pm, Peter Stephenson wrote:
>> }
>> } Mounting the filesystem within Cygwin as binary works here.
>>
>> Really?  That contradicts what the main.c comment asserts.
> 
> So it seems like "binary" isn't quite what it's cracked up to be.
> 
> The alternative might be to do something similar in the lower levels of
> zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
> it's no worse than doing it in the OS abstraction, and doesn't affect
> fd's used by other programmes.

I think the cygwin_premain0 in main.c is very dangerous.

In fact, bash had the same problem years ago and the hook was removed.
See http://cygwin.com/ml/cygwin/2006-10/msg00989.html for details.


Another example I found:

$ zcat < mc.1.gz > /dev/null
gzip: stdin: invalid compressed data--crc error
gzip: stdin: invalid compressed data--length error

zcat is forced to use O_TEXT and fails to decompress the file, while using
$ zcat mc.1.gz
directly works just fine.



IMHO cygwin_premain0 should be removed completely. As a result, of
course, scripts created in windows editors such as notepad, that contain
CRLF line endings, could lead to syntax errors, but surely that's less
of a problem and easily fixable using a proper editor.
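
For the CRLF-script case, a conversion can also be done from the shell
rather than an editor. This is a sketch under assumptions: the script
contents and file names are invented, and `sed` with a literal CR in the
pattern is used here in place of a dedicated tool such as dos2unix:

```shell
workdir=$(mktemp -d)
cr=$(printf '\r')   # a literal carriage return, portable across sh variants

# A two-line script saved with CRLF endings, as Notepad would write it.
printf 'greeting=hello\r\necho "$greeting"\r\n' > "$workdir/crlf.sh"

# Drop one CR at the end of each line, leaving any other bytes alone.
sed "s/$cr\$//" "$workdir/crlf.sh" > "$workdir/lf.sh"

# The converted script now runs without stray CRs in its values.
output=$(sh "$workdir/lf.sh")
echo "$output"
```

Anchoring the match at end of line keeps CRs that appear elsewhere in a
line, which a blanket `tr -d '\r'` would not.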



* Re: Possible ZSH bug with IO direction
  2016-04-24 19:01       ` Peter Stephenson
  2016-04-24 21:23         ` m0viefreak
@ 2016-04-24 21:53         ` Bart Schaefer
  2016-04-25  8:46           ` Peter Stephenson
  2016-04-25 21:12           ` Mikael Magnusson
  2016-04-25 17:16         ` Peter A. Castro
  2 siblings, 2 replies; 18+ messages in thread
From: Bart Schaefer @ 2016-04-24 21:53 UTC (permalink / raw)
  To: zsh-workers

On Apr 24,  8:01pm, Peter Stephenson wrote:
}
} The alternative might be to do something similar in the lower levels of
} zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
} it's no worse than doing it in the OS abstraction, and doesn't affect
} fd's used by other programmes.

Will input.c cover the $(...) construct, "read" command, etc.?  All of
those have to get CRLF translation or things like ${(f)...} don't work;
stray \r end up in parameter values, and so on.



* Re: Possible ZSH bug with IO direction
  2016-04-24 21:53         ` Bart Schaefer
@ 2016-04-25  8:46           ` Peter Stephenson
  2016-04-25 21:12           ` Mikael Magnusson
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Stephenson @ 2016-04-25  8:46 UTC (permalink / raw)
  To: zsh-workers

On Sun, 24 Apr 2016 14:53:00 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Apr 24,  8:01pm, Peter Stephenson wrote:
> }
> } The alternative might be to do something similar in the lower levels of
> } zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
> } it's no worse than doing it in the OS abstraction, and doesn't affect
> } fd's used by other programmes.
> 
> Will input.c cover the $(...) construct, "read" command, etc.?  All of
> those have to get CRLF translation or things like ${(f)...} don't work;
> stray \r end up in parameter values, and so on.

Commands within $(...) are processed as part of normal shell input, but
if you're executing an external command inside and processing the output
as text that's another matter.  "read" and implicit reads such as
"$(<...)" are similar.  In general you don't know whether you expect the
output to be text or binary; there's nothing to stop you capturing
raw data using a $(...).  There are partial workarounds for read
and word splitting by adding $'\r' to IFS.
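
A sketch of the IFS workaround for `read`, written in portable sh so the
CR is produced with `printf` rather than the `$'\r'` spelling. The input
line and variable names are illustrative only:

```shell
workdir=$(mktemp -d)
cr=$(printf '\r')
printf 'alpha beta\r\n' > "$workdir/line.txt"

# Default IFS: the CR stays glued to the last word.
read -r first second < "$workdir/line.txt"
plain_len=${#second}

# CR added to IFS for this one read: it now acts as a field delimiter.
# The third variable receives whatever follows the CR (nothing here).
IFS=" $cr" read -r first second rest < "$workdir/line.txt"
fixed_len=${#second}

echo "length of second word: $plain_len without the IFS tweak, $fixed_len with it"
```

This is only a partial fix, as noted above: it helps word splitting but
does nothing for data that is captured and used without splitting.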

pws



* Re: Possible ZSH bug with IO direction
  2016-04-24 21:23         ` m0viefreak
@ 2016-04-25  9:12           ` Peter Stephenson
  2016-04-25 17:25           ` Peter A. Castro
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Stephenson @ 2016-04-25  9:12 UTC (permalink / raw)
  To: zsh-workers

On Sun, 24 Apr 2016 23:23:04 +0200
m0viefreak <m0viefreak.cm@googlemail.com> wrote:
> I think the cygwin_premain0 in main.c is very dangerous.

What's particularly annoying is it's run too early to be made optional
at run time (it looks like you could parse the command line arguments
but that's not the most useful way of setting shell options).  Unless
you can stick hooks in the hooks...?

pws



* Re: Possible ZSH bug with IO direction
  2016-04-24 19:01       ` Peter Stephenson
  2016-04-24 21:23         ` m0viefreak
  2016-04-24 21:53         ` Bart Schaefer
@ 2016-04-25 17:16         ` Peter A. Castro
  2 siblings, 0 replies; 18+ messages in thread
From: Peter A. Castro @ 2016-04-25 17:16 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

On Sun, 24 Apr 2016, Peter Stephenson wrote:

> Date: Sun, 24 Apr 2016 20:01:32 +0100
> From: Peter Stephenson <p.w.stephenson@ntlworld.com>
> To: zsh-workers@zsh.org
> Subject: Re: Possible ZSH bug with IO direction

Greetings, Peter S.
   I'm the author of the offending Cygwin pre-main hook, so you can
squarely level any complaints about it at me.  I'm also the maintainer of
the Cygwin port, for whatever that's worth.  :)

> On Sun, 24 Apr 2016 11:36:44 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
>> On Apr 24,  1:17pm, Peter Stephenson wrote:
>> }
>> } Mounting the filesystem within Cygwin as binary works here.
>>
>> Really?  That contradicts what the main.c comment asserts.
>
> So it seems like "binary" isn't quite what it's cracked up to be.

No, it's not.  Neither is "text" mode. :)

> The alternative might be to do something similar in the lower levels of
> zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
> it's no worse than doing it in the OS abstraction, and doesn't affect
> fd's used by other programmes.

   I tried to put this kind of specific control (O_TEXT) only in input.c 
(and a few other places), but there are too many places in the code that 
just "know" line-termination is 1 character and either loop trying to read 
more input or simply choke on the difference in size read vs. size 
expected.  It quickly became dozens of source modules affected and still I 
wasn't catching them all.
   I never really got it to work fully and just gave up, putting in the 
pre-main hook as a band-aid until some better solution presented itself. 
Really, the code needs to understand CR+LF.
   I lacked the time to fully understand all of the Zsh code so never went 
back to try and create a proper fix.  Some guidance on this would be 
welcome.

> pws

-- 
--=> Peter A. Castro
Email: doctor at fruitbat dot org / Peter dot Castro at oracle dot com
 	"Cats are just autistic Dogs" -- Dr. Tony Attwood



* Re: Possible ZSH bug with IO direction
  2016-04-24 21:23         ` m0viefreak
  2016-04-25  9:12           ` Peter Stephenson
@ 2016-04-25 17:25           ` Peter A. Castro
  1 sibling, 0 replies; 18+ messages in thread
From: Peter A. Castro @ 2016-04-25 17:25 UTC (permalink / raw)
  To: m0viefreak; +Cc: zsh-workers

On Sun, 24 Apr 2016, m0viefreak wrote:

> Date: Sun, 24 Apr 2016 23:23:04 +0200
> From: m0viefreak <m0viefreak.cm@googlemail.com>
> To: zsh-workers@zsh.org
> Subject: Re: Possible ZSH bug with IO direction

Greetings, m0viefreak (great name, btw).

> On 24.04.2016 21:01, Peter Stephenson wrote:
>> On Sun, 24 Apr 2016 11:36:44 -0700
>> Bart Schaefer <schaefer@brasslantern.com> wrote:
>>> On Apr 24,  1:17pm, Peter Stephenson wrote:
>>> }
>>> } Mounting the filesystem within Cygwin as binary works here.
>>>
>>> Really?  That contradicts what the main.c comment asserts.
>>
>> So it seems like "binary" isn't quite what it's cracked up to be.
>>
>> The alternative might be to do something similar in the lower levels of
>> zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
>> it's no worse than doing it in the OS abstraction, and doesn't affect
>> fd's used by other programmes.
>
> I think the cygwin_premain0 in main.c is very dangerous.
>
> In fact, bash had the same problem years ago and the hook was removed.
> See http://cygwin.com/ml/cygwin/2006-10/msg00989.html for details.

The hook was removed and replaced with an external control: igncr

This control either must be set via envvar SHELLOPTS before starting bash, 
or the script must set it explicitly at the top of any script being 
run/sourced.  Not a very transparent or convenient fix in my opinion.

> Another example I found:
>
> $ zcat < mc.1.gz > /dev/null
> gzip: stdin: invalid compressed data--crc error
> gzip: stdin: invalid compressed data--length error
>
> zcat is forced to use O_TEXT and fails to decompress the file, while using
> $ zcat mc.1.gz
> directly works just fine.
>
> IMHO cygwin_premain0 should be removed completely. As a result, of
> course, scripts created in windows editors such as notepad, that contain
> CRLF line endings, could lead to syntax errors, but surely that's less
> of a problem and easily fixable using a proper editor.

You might think it would be, but I've had people complain about it, hence 
the hook I introduced.  You wouldn't believe the tongue-lashing I received 
for suggesting people use "binary" mode mounts and just run dos2unix on 
*all* of their scripts (and input files).  Oh yes, "less of a problem" 
indeed.  :-/

-- 
--=> Peter A. Castro
Email: doctor at fruitbat dot org / Peter dot Castro at oracle dot com
 	"Cats are just autistic Dogs" -- Dr. Tony Attwood



* Re: Possible ZSH bug with IO direction
  2016-04-24 18:35     ` Bart Schaefer
@ 2016-04-25 17:31       ` Peter A. Castro
  2016-04-25 17:47         ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Peter A. Castro @ 2016-04-25 17:31 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On Sun, 24 Apr 2016, Bart Schaefer wrote:

> Date: Sun, 24 Apr 2016 11:35:58 -0700
> From: Bart Schaefer <schaefer@brasslantern.com>
> To: zsh-workers@zsh.org
> Subject: Re: Possible ZSH bug with IO direction

Greetings, Bart,

> On Apr 25,  3:08am, Roger Qiu wrote:
> } Subject: Re: Possible ZSH bug with IO direction
> }
> } Thanks, the workaround did work. But I've never seen read & write
> } redirection at the interactive prompt. Does that make the STDIN for the
> } program I'm running readable and writable?
>
> Yes.
>
> } How does that work?
>
> Stdin/out/err are just file descriptors 0/1/2.  They can be open in
> whatever mode the parent process likes.  It wouldn't make much sense
> to have stdin open for write only, but nothing prevents that.
>
> Consider that /dev/tty is all of stdin/out/err for most interactive
> programs.  It's open read/write even though programs don't usually treat
> the individual descriptors that way.
>
> } Can the program then rewind that descriptor, and write to `input.jpg`?
>
> If the program were using system-call interfaces on descriptor 0, yes.
> However the STDIO library object STDIN will still be initialized for
> reading only, so in most cases the program won't notice.

That was kind of the idea.  The program wouldn't know and can happily read 
input as if it were a proper-unix text file.  That the file isn't really 
"text" is beyond scope.  :)

> } Also since Bash doesn't suffer from this problem, will this be fixed
> } eventually?
>
> As the comment from main.c explains, it's a question of either breaking
> a few external programs that care about raw file data, or breaking the
> entire internal string-processing in e.g. zsh's parameter substitution
> any time those strings are read from a file.  So I would expect the
> answer is that this won't change unless Cygwin changes something.  Also,
> we're woefully short on any volunteers for Windows-specific issues.

Really you need to fix Windows, not Cygwin.  Or you need to fix every 
program that's ported to Cygwin.  Cygwin is trying to bridge the two 
worlds the best it can, with as little code change as possible.  That it's 
imperfect is implied.

I believe the best solution is to teach Zsh code to handle CR+LF properly, 
but I lack knowledge of the internal workings to adequately do this.

-- 
--=> Peter A. Castro
Email: doctor at fruitbat dot org / Peter dot Castro at oracle dot com
 	"Cats are just autistic Dogs" -- Dr. Tony Attwood



* Re: Possible ZSH bug with IO direction
  2016-04-25 17:31       ` Peter A. Castro
@ 2016-04-25 17:47         ` Bart Schaefer
  2016-04-25 18:14           ` Peter A. Castro
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2016-04-25 17:47 UTC (permalink / raw)
  To: Peter A. Castro; +Cc: Zsh hackers list

On Mon, Apr 25, 2016 at 10:31 AM, Peter A. Castro <doctor@fruitbat.org> wrote:
>
> I believe the best solution is to teach Zsh code to handle CR+LF properly,

Which means we're back to the "dozens of source modules" you mentioned.
There's no central place for this; because of vagaries of other I/O systems
(e.g. STREAMS modules), there's not even a consistent use of stdio vs.
low-level read/write (though at least they're usually not mixed on the same
descriptors).

I haven't looked closely at this (and have no idea whether it's possible with
the pre-main hook in place) but perhaps put the descriptors back into
binary mode somewhere in exec.c between whatever passes for zfork()
on cygwin and the actual execve() of any external process?  We mostly
know when a zsh builtin vs. an external command is being run.

Either way the external commands are on their own -- the question is
whether they're better off getting binary when they expect text or the
other way around (as currently).



* Re: Possible ZSH bug with IO direction
  2016-04-25 17:47         ` Bart Schaefer
@ 2016-04-25 18:14           ` Peter A. Castro
  2016-04-26  9:16             ` Peter Stephenson
  0 siblings, 1 reply; 18+ messages in thread
From: Peter A. Castro @ 2016-04-25 18:14 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Mon, 25 Apr 2016, Bart Schaefer wrote:

> Date: Mon, 25 Apr 2016 10:47:49 -0700
> From: Bart Schaefer <schaefer@brasslantern.com>
> To: Peter A. Castro <doctor@fruitbat.org>
> Cc: Zsh hackers list <zsh-workers@zsh.org>
> Subject: Re: Possible ZSH bug with IO direction
> 
> On Mon, Apr 25, 2016 at 10:31 AM, Peter A. Castro <doctor@fruitbat.org> wrote:
>>
>> I believe the best solution is to teach Zsh code to handle CR+LF properly,
>
> Which means we're back to the "dozens of source modules" you mentioned.
> There's no central place for this; because of vagaries of other I/O systems
> (e.g. STREAMS modules), there's not even a consistent use of stdio vs.
> low-level read/write (though at least they're usually not mixed on the same
> descriptors).

Yes...I had noticed that in my first few attempts at "fixing" the code.
I quickly became overwhelmed at the scope of things.

> I haven't looked closely at this (and have no idea whether it's possible with
> the pre-main hook in place) but perhaps put the descriptors back into
> binary mode somewhere in exec.c between whatever passes for zfork()
> on cygwin and the actual execve() of any external process?  We mostly
> know when a zsh builtin vs. an external command is being run.

I did consider this, but there's something about using the pre-main hooks 
that prevents resetting the attributes.  I'll ask Corinna about it, but I 
suspect there isn't a way of doing it once the pre-main has been invoked.

In any case, this also leads to your next comment:

> Either way the external commands are on their own -- the question is
> whether they're better off getting binary when they expect text or the
> other way around (as currently).

That is its own can of worms.  The latter is what mostly-works for 95% of 
the use-cases.  There is no good solution here, I'm afraid.

So, to sum things up:
   pre-main hook is bad.
   CR+LF is evil.
   Teaching the code to grok CR+LF is hard.
   Users don't like change and expect things to "just work".

Am I missing anything?  :-/

-- 
--=> Peter A. Castro
Email: doctor at fruitbat dot org / Peter dot Castro at oracle dot com
 	"Cats are just autistic Dogs" -- Dr. Tony Attwood



* Re: Possible ZSH bug with IO direction
  2016-04-24 21:53         ` Bart Schaefer
  2016-04-25  8:46           ` Peter Stephenson
@ 2016-04-25 21:12           ` Mikael Magnusson
  1 sibling, 0 replies; 18+ messages in thread
From: Mikael Magnusson @ 2016-04-25 21:12 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh workers

On Sun, Apr 24, 2016 at 11:53 PM, Bart Schaefer
<schaefer@brasslantern.com> wrote:
> On Apr 24,  8:01pm, Peter Stephenson wrote:
> }
> } The alternative might be to do something similar in the lower levels of
> } zsh, i.e. map \r\n to \n when reading shell input.  If done in input.c
> } it's no worse than doing it in the OS abstraction, and doesn't affect
> } fd's used by other programmes.
>
> Will input.c cover the $(...) construct, "read" command, etc.?  All of
> those have to get CRLF translation or things like ${(f)...} don't work;
> stray \r end up in parameter values, and so on.

It's perhaps worth noting that (f) is explicitly documented as being
short for ps:\n: and nothing else.

-- 
Mikael Magnusson



* Re: Possible ZSH bug with IO direction
  2016-04-25 18:14           ` Peter A. Castro
@ 2016-04-26  9:16             ` Peter Stephenson
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Stephenson @ 2016-04-26  9:16 UTC (permalink / raw)
  To: Zsh hackers list

On Mon, 25 Apr 2016 11:14:06 -0700
"Peter A. Castro" <doctor@fruitbat.org> wrote:
> So, to sum things up:
>    pre-main hook is bad.
>    CR+LF is evil.
>    Teaching the code to grok CR+LF is hard.
>    Users don't like change and expect things to "just work".
> 
> Am I missing anything?  :-/

Don't think so.

I'd be happy to maintain a git branch to attempt to get some fixes into the
code for CRNL line ending handling as a long-term project, though I'm a
bit busy to contribute right now.  The worst that can happen is we
abandon it and simply waste hours of time...

pws


