From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 12747 invoked from network); 8 Jun 2000 09:52:01 -0000
Received: from sunsite.auc.dk (130.225.51.30)
  by ns1.primenet.com.au with SMTP; 8 Jun 2000 09:52:01 -0000
Received: (qmail 19926 invoked by alias); 8 Jun 2000 09:51:44 -0000
Mailing-List: contact zsh-workers-help@sunsite.auc.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 11820
Received: (qmail 19919 invoked from network); 8 Jun 2000 09:51:43 -0000
Date: Thu, 08 Jun 2000 10:51:15 +0100
From: Peter Stephenson
Subject: Re: zsh changing cp1250 letters beyond recognition
In-reply-to: "Your message of Wed, 07 Jun 2000 23:16:10 +0200."
 <20000607231610.A6504@tornado.sh.cvut.cz>
To: Jan Fedak, zsh-workers@sunsite.auc.dk (Zsh hackers list)
Message-id: <0FVT001HKY1FC6@la-la.cambridgesiliconradio.com>
Content-transfer-encoding: 7BIT

> I have written simple script (attached) that gets filenames (in $*).
> The filenames possibly contain letters in cp1250 (windows encoding for
> middle and eastern Europe) or iso-8859-2.
>
> Now, if I use bash to run the script, everything works just fine. But
> when I replace #!/bin/bash with #!/bin/zsh, all those strange cp1250
> characters (\232\235\236\212\215\216) get replaced with other (and
> equally strange) ones. Renaming and tr don't work correctly then.

There's a bug that characters in this range present on the command line
don't get turned properly into zsh's internal representation.  Thanks for
spotting this --- it's very hard to see these things if you're using
iso-8859-1, as most of us presumably are.

You didn't say what version you're using, but the problem is there in both
3.0.8 and 3.1.9.  The patch below is for 3.1.9, I'll send a separate one
for 3.0.8 (the fix is the same, but the context for the patch is
different).

Details for those interested:  the type table doesn't get set up until
well on into initialisation, while the command line arguments are passed
to metafy() straight away, and metafy() uses the type table to decide what
needs to be metafied.  I don't dare reorder all the stuff at the start,
since there always turn out to be unpleasant dependencies, so I've done
the least invasive surgery I can think of, which is simply to set up the
IMETA bit of the type table right at the start and then initialise the
whole thing properly as before.

I think the ideal solution would be something like this:  set up the bits
of the type table which never change at this point, then for the rest of
the life of the shell only meddle with the bits which can change according
to options.  Contributions invited.

This needs to be in one of the tests, too.  Probably we need a special
metafication test.

Index: Src/main.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/main.c,v
retrieving revision 1.1.1.6
diff -u -r1.1.1.6 main.c
--- Src/main.c  2000/02/23 15:18:44     1.1.1.6
+++ Src/main.c  2000/06/08 09:37:31
@@ -35,11 +35,23 @@
 main(int argc, char **argv)
 {
     char **t;
+    int t0;
 #ifdef USE_LOCALE
     setlocale(LC_ALL, "");
 #endif
 
     init_hackzero(argv, environ);
+
+    /*
+     * Provisionally set up the type table to allow metafication.
+     * This will be done properly when we have decided if we are
+     * interactive
+     */
+    typtab['\0'] |= IMETA;
+    typtab[STOUC(Meta)  ] |= IMETA;
+    typtab[STOUC(Marker)] |= IMETA;
+    for (t0 = (int)STOUC(Pound); t0 <= (int)STOUC(Nularg); t0++)
+        typtab[t0] |= ITOK | IMETA;
 
     for (t = argv; *t; *t = metafy(*t, -1, META_ALLOC), t++);
 
-- 
Peter Stephenson
Cambridge Silicon Radio, Unit 300, Science Park, Milton Road,
Cambridge, CB4 0XL, UK                          Tel: +44 (0)1223 392070