From mboxrd@z Thu Jan  1 00:00:00 1970
List-Id: Zsh Workers List
X-Seq: 32054
Date: Mon, 25 Nov 2013 17:37:01 +0000
From: Peter Stephenson
To: zsh-workers@zsh.org
Subject: Re: Helpfiles again (was Re: modify functions hierarchy (was: etc.))
Message-id: <20131125173701.18ae6e5b@pwslap01u.europe.root.pri>
In-reply-to: <131125085631.ZM17483@torch.brasslantern.com>
References: <20131113112112.1b080b79@pwslap01u.europe.root.pri>
 <131113080606.ZM11640@torch.brasslantern.com>
 <131117103047.ZM30518@torch.brasslantern.com>
 <131117130118.ZM1041@torch.brasslantern.com>
 <20131120192608.3af3b92c@pws-pc.ntlworld.com>
 <131120220100.ZM12300@torch.brasslantern.com>
 <20131123174827.249f9678@pws-pc.ntlworld.com>
 <131123114714.ZM18477@torch.brasslantern.com>
 <131123210612.ZM31978@torch.brasslantern.com>
 <20131124175649.27c2559a@pws!> <-pc.ntlworld.com@samsung.com>
 <131125001818.ZM26494@torch.brasslantern.com>
 <691AC9C6-D832-42FC-B983-60C682AA5515@kba.biglobe.ne.jp>
 <20131125154954.14283de2@pwslap01u.europe.root.pri>
 <131125085631.ZM17483@torch.brasslantern.com>
Organization: Samsung Cambridge Solution Centre

On Mon, 25 Nov 2013 08:56:31 -0800
Bart Schaefer wrote:
> On Nov 25,  3:49pm, Peter Stephenson wrote:
> }
> } > (1) Completion/Unix/Command/_systemd: line 3, name of the author.
> } > The character is an 'e' with accent (in latin1 encoding, i.e., ISO8859-1).
> }
> } Hmmm.... might be safer to fudge it as "e'".  There's not really any
> } need for non-ASCII characters in functions.  (Or simply convert to
> } UTF-8 but I think the simpler the better.)
>
> Output of "file **/*(.) | egrep 'UTF|ISO' :
>
> ChangeLog:                             UTF-8 Unicode English text

I think that's OK.  If we don't need to process a file using a tool
that's worried about locales we get away with it.  I don't see any point
in forbidding UTF-8 in text files.

> Completion/BSD/Command/_portaudit:     ISO-8859 English text
> Completion/Unix/Command/_cdrdao:       UTF-8 Unicode English text
> Completion/Unix/Command/_git:          UTF-8 Unicode English text
> Completion/Unix/Command/_growisofs:    UTF-8 Unicode English text

Probably safest to make these ASCII; none of the non-ASCII characters
are actually needed.  The big one here is _git which has a lot of
non-ASCII quotes and "...", but they're not actually necessary, even for
readability (they're all quotes / ellipses with an ASCII lookalike), so
I don't think that's a real loss.

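For illustration only, the quote and ellipsis cleanup described above could be sketched roughly as follows. This is not a script from the zsh tree: the function name `asciify` and the replacement table are assumptions, and the table covers only the typographic quotes and ellipsis mentioned here. Anything else non-ASCII is reported rather than changed, so it can be handled by hand.

```python
# Hypothetical helper (not part of zsh): replace typographic quotes and
# ellipses with ASCII lookalikes and report whatever non-ASCII is left.
ASCII_LOOKALIKES = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
}


def asciify(text):
    """Return (converted, leftovers).

    converted is text with the characters in ASCII_LOOKALIKES replaced;
    leftovers is a sorted list of the distinct non-ASCII characters that
    remain because the table has no lookalike for them.
    """
    converted = text.translate(str.maketrans(ASCII_LOOKALIKES))
    leftovers = sorted({ch for ch in converted if ord(ch) > 127})
    return converted, leftovers
```

The input is assumed to be already decoded text, so a file such as _git would be read as UTF-8 first; an ISO-8859 file would need to be decoded as Latin-1 instead.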
> Etc/ChangeLog-3.0:                     ISO-8859 English text
> LICENCE:                               ISO-8859 English text
> Src/module.c:                          ISO-8859 C program text
> Src/Modules/clone.c:                   ISO-8859 C program text
> Src/Modules/example.c:                 ISO-8859 C program text

Should be UTF-8 for consistency, I think.  We'd know by now if a
compiler cared about non-ASCII characters.

> Test/A05execution.ztst:                ISO-8859 text
> Test/D02glob.ztst:                     ISO-8859 text
> Test/D07multibyte.ztst:                UTF-8 Unicode English text
> Test/V07pcre.ztst:                     UTF-8 Unicode English text

These are all deliberate: either multibyte tests, or single bytes with
the top bit set, so I think they're OK.  Certainly they're being
explicitly tested.

Shout if you're unhappy or I'll do it tomorrow (UK time).

pws