From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 18 May 2011 10:21:37 +0400
From: Yuri Pankov
To: discuss@mdocml.bsd.lv
Subject: Re: Full mandoc locale support committed.
Message-ID: <20110518062137.GD1321@procyon.xvoid.org>
References: <4DD2FFB0.2070303@bsd.lv>
X-Mailinglist: mdocml-discuss
Reply-To: discuss@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4DD2FFB0.2070303@bsd.lv>
User-Agent: Mutt/1.5.21 (2010-09-15)

On Wed, May 18, 2011 at 01:07:28AM +0200, Kristaps Dzonsons wrote:
> Hi,
>
> With this last commit, initial [full] locale support has been fitted
> into mandoc! Attached is eye-candy: a manual full of random Unicode
> input (\[uNNNN]) first with -Tascii, then with -Tlocale.

Unicode escapes work for me, but does that mean that existing localized
manpages can't be used? Tried several ja and ru from debian 6, all of
them make mandoc die with "FATAL: line scope broken, syntax violated"
(-Tlint talks a lot about "ERROR: skipping bad character: ignoring
byte").

> From the manual:
>
>   Locale Output
>     Locale-depending output encoding is triggered with -Tlocale.
>     This option is not available on all systems: systems without
>     locale support, or those whose internal representation is not
>     natively UCS-4, will fall back to -Tascii. See ASCII Output
>     for font style specification and available command-line
>     arguments.
>
> The check-ins:
>
> (1) http://mdocml.bsd.lv/archives/source/0920.html

A small typo here in Makefile's comment - USE_CHAR.

> (2) http://mdocml.bsd.lv/archives/source/0919.html
>
> This support is /very/ fast, and any overhead occurs if and only if
> -Tlocale is selected AND supported. If this doesn't hold, then mandoc
> runs at -Tascii "native" speed.
>
> The bad news: it comes at a price. -Tlocale only works if Unicode
> code-point values (defined in the UCS-4 (-2?) standards) can be
> transformed directly into wide-character values usable by the system. I
> check this with an optional C99 feature, __STDC_ISO_10646__. See
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/iso2022-wc.html
>
> for details. Unfortunately, this seems only to be exported on glibc.
> I'm told the conditions hold on OpenBSD and FreeBSD. NetBSD?
>
> If your system abides by these rules and doesn't export this symbol,
> please let me know and we can special-case the macro test for this
> feature. There is a way to convert from Unicode to a system's
> wide-character support without this feature, but it isn't pretty. I'll
> probably have to implement this anyway for portability. For now, if a
> system doesn't do __STDC_ISO_10646__, -Tlocale is a synonym for -Tascii.

Checked on Solaris 11 and Illumos - both do not export the symbol,
though commenting out #undef USE_WCHAR in term_ascii.c makes -Tlocale
work.

> Note that if this method is unilaterally hated, it's easy to switch to
> another method. This was simply the fastest, simplest, and most
> transparent to implement. All of the logic either way is in one file,
> and easy to manipulate:
>
> http://mdocml.bsd.lv/cgi-bin/cvsweb/term_ascii.c?cvsroot=mdocml
>
> Thoughts?

Yuri
--
To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv
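
For anyone following the thread, the compile-time test on
__STDC_ISO_10646__ described in the quoted text can be sketched roughly
as below. This is only an illustration and not the code in
term_ascii.c: the helper names wchar_is_ucs and ucs_encode are
hypothetical, and falling back to plain ASCII (with '?' for anything
else) is an assumption standing in for the "-Tlocale is a synonym for
-Tascii" behaviour.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from mandoc): returns 1 if the implementation
 * promises that wchar_t values are ISO 10646 (Unicode) code points,
 * 0 if it makes no such promise.
 */
static int
wchar_is_ucs(void)
{
#ifdef __STDC_ISO_10646__
	return 1;
#else
	return 0;
#endif
}

/*
 * Hypothetical helper (not from mandoc): write one Unicode code point
 * into buf using the current locale's multibyte encoding.
 * Returns the number of bytes written, or 0 if bufsz is 0.
 * When the code point cannot be encoded, a single '?' is written.
 */
static size_t
ucs_encode(unsigned long cp, char *buf, size_t bufsz)
{
	char		tmp[MB_LEN_MAX];
	mbstate_t	st;
	size_t		n;

	if (bufsz == 0)
		return 0;

	if (wchar_is_ucs() && cp <= 0x10FFFFUL) {
		/* Code point and wchar_t value coincide: cast directly. */
		memset(&st, 0, sizeof(st));
		n = wcrtomb(tmp, (wchar_t)cp, &st);
		if (n != (size_t)-1 && n > 0 && n <= bufsz) {
			memcpy(buf, tmp, n);
			return n;
		}
	} else if (cp < 0x80) {
		/* Assumed fallback: pass 7-bit ASCII through unchanged. */
		buf[0] = (char)cp;
		return 1;
	}

	/* Unsupported or unencodable: substitute a placeholder. */
	buf[0] = '?';
	return 1;
}
```

A caller would normally run setlocale(LC_CTYPE, "") once at startup so
that wcrtomb() follows the user's locale; without that call the program
stays in the "C" locale and only ASCII is guaranteed to encode. A
system such as the Solaris 11 and Illumos cases above, where the
conversion works but the macro is not exported, would take the fallback
branch in this sketch.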