X-Seq: 24798
To: zsh-workers@sunsite.dk (Zsh hackers list)
Subject: Support for combining characters
Date: Mon, 07 Apr 2008 16:48:06 +0100
Message-ID: <1114.1207583286@csr.com>
From: Peter Stephenson

I thought a bit more about how to support combining characters in ZLE,
and since I'm still trying to avoid having to understand phrases like
[1], here are a few tentative conclusions[2].

- Not all terminals[3] support combining characters, and we may not be
  able to rely on full support for those that do.  So I think we need
  an option like ZLE_COMBINING_CHARS.  Possibly we can probe for this
  eventually:
  - read the cursor position from the terminal
  - output a base character
  - output a zero-width accent character
  - read the cursor position again
  - see if it's moved by only the original character width
  but that's for the future.

- As far as I understand it, any Unicode character that claims to be
  both printable and zero-width is to be treated as a combining
  character.  It needs to follow a real character for that to happen,
  so I would propose to continue handling any that don't in the
  current fashion (highlighted etc.) for safety.

- Within zle_refresh.c, it would be best to continue with the
  one-entry-per-screen-cell line format (unless anyone is volunteering
  to do a wholesale rewrite).  This causes difficulties since we need
  multiple characters in the same entry, implying some form of
  indirection.  On 64-bit systems, using a real pointer for this will
  double the size even for wide characters.  I think we can make use
  of[4] the extra flags I added for highlighting.  We could add a flag
  that indicates the character is actually an index into an auxiliary
  array.  This is a bit like how option arguments for builtins are
  handled.
  It's not particularly efficient, but I hope it won't be grotesquely
  slow with some optimisation of reallocation.

- Outside zle_refresh.c I think this scheme would be too messy.  In
  that case we will need to handle moving and deleting by carefully
  taking account of combining characters: moving left until we reach a
  non-zero-width character or the start of line, or moving right over
  any trailing zero-width characters.

I'm not sure what to do in the main shell.  The most important thing
here is to be able to handle combining characters in editor widgets.
${(m)#...} will help with this.  I don't know whether we need any more
support than that.

I will try and look at this[5] over the next few weeks.

pws

[1] A beamformer shall set the response type format indicated in the
    CSI/steering field of the HT Control field of any sounding frame
    excluding the NDP and of any PPDU with the NDP sounding
    announcement field set to 1 to one of the non-zero values (CSI,
    Compressed Beamforming or Non-compressed Beamforming) that
    corresponds to a type that is supported by the beamformee.
[2] "creating a context for change" for all you corporate types out
    there
[3] (and certainly not Terminal 5)
[4] "leverage" for all you corporate types
[5] "implement change within that context"
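
To make the moving rule for code outside zle_refresh.c concrete, here
is a rough sketch.  It is not ZLE code: move_left(), move_right() and
the width[] array (one display width per character, 0 for a combining
character, as wcwidth() might report) are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helpers, not existing ZLE functions.
 * width[i] is the display width of character i in the line;
 * 0 marks a combining (zero-width) character.
 */

/* Move left: step back one character, then keep going while we are on
 * a zero-width character and have not reached the start of the line. */
static size_t move_left(const int *width, size_t pos)
{
    if (pos == 0)
        return 0;
    pos--;
    while (pos > 0 && width[pos] == 0)
        pos--;
    return pos;
}

/* Move right: step over the current character, then over any
 * zero-width characters trailing it. */
static size_t move_right(const int *width, size_t len, size_t pos)
{
    assert(pos <= len);
    if (pos == len)
        return len;
    pos++;
    while (pos < len && width[pos] == 0)
        pos++;
    return pos;
}
```

Deleting could presumably be built on the same two functions, removing
everything between the cursor and the position move_right() or
move_left() returns, so a base character never loses its accents.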
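
For the zle_refresh.c idea of a flag marking the character as an index
into an auxiliary array, something along these lines might do.  This
is only a sketch under assumptions: struct cell, CELL_AUX, struct
aux_table and both functions are invented names, not the real refresh
structures, and the doubling growth is just one guess at "some
optimisation of reallocation".

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

#define CELL_AUX 0x1u       /* hypothetical flag: chr is an index */

struct cell {               /* one entry per screen cell */
    wchar_t chr;            /* a character, or an index if CELL_AUX set */
    unsigned flags;         /* stand-in for the highlighting flags */
};

struct aux_table {          /* entries stored as: count, then characters */
    wchar_t *chars;
    size_t used, alloc;
};

/* Store a base character plus combining characters (n in total) in the
 * table and make *out refer to them.  Returns 0, or -1 if out of memory. */
static int aux_store(struct aux_table *t, const wchar_t *s, size_t n,
                     struct cell *out)
{
    size_t i;

    if (t->used + n + 1 > t->alloc) {
        /* grow geometrically so we don't realloc for every cell */
        size_t want = t->alloc ? t->alloc * 2 : 16;
        wchar_t *p;
        while (want < t->used + n + 1)
            want *= 2;
        p = realloc(t->chars, want * sizeof *p);
        if (!p)
            return -1;
        t->chars = p;
        t->alloc = want;
    }
    out->chr = (wchar_t)t->used;
    out->flags |= CELL_AUX;
    t->chars[t->used++] = (wchar_t)n;
    for (i = 0; i < n; i++)
        t->chars[t->used++] = s[i];
    return 0;
}

/* Return the characters to output for a cell and set *n to their count. */
static const wchar_t *cell_chars(const struct aux_table *t,
                                 const struct cell *c, size_t *n)
{
    if (c->flags & CELL_AUX) {
        *n = (size_t)t->chars[c->chr];
        return &t->chars[c->chr + 1];
    }
    *n = 1;
    return &c->chr;
}
```

The cell stays the same size as before, which is the point of using an
index rather than a real pointer; the cost is the extra lookup and the
need to reset the table whenever the line is rebuilt.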