From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp-1.sys.kth.se (smtp-1.sys.kth.se [130.237.32.175]) by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id o679X3pu007320 for ; Wed, 7 Jul 2010 05:33:03 -0400 (EDT)
Received: from smtp-1.sys.kth.se (localhost [127.0.0.1]) by smtp-1.sys.kth.se (Postfix) with ESMTP id C3D8A156FF4 for ; Wed, 7 Jul 2010 11:32:57 +0200 (CEST)
X-Virus-Scanned: by amavisd-new at kth.se
Received: from smtp-1.sys.kth.se ([127.0.0.1]) by smtp-1.sys.kth.se (smtp-1.sys.kth.se [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 1scO3ZgZMB5N for ; Wed, 7 Jul 2010 11:32:56 +0200 (CEST)
X-KTH-Auth: kristaps [85.8.61.208]
X-KTH-mail-from: kristaps@bsd.lv
X-KTH-rcpt-to: discuss@mdocml.bsd.lv
Received: from lappy.bsd.lv (h85-8-61-208.dynamic.se.alltele.net [85.8.61.208]) by smtp-1.sys.kth.se (Postfix) with ESMTP id E8978156F14 for ; Wed, 7 Jul 2010 11:32:55 +0200 (CEST)
Message-ID: <4C3449D5.7020106@bsd.lv>
Date: Wed, 07 Jul 2010 11:33:09 +0200
From: Kristaps Dzonsons
User-Agent: Thunderbird 2.0.0.16 (X11/20080812)
X-Mailinglist: mdocml-discuss
Reply-To: discuss@mdocml.bsd.lv
MIME-Version: 1.0
To: discuss@mdocml.bsd.lv
Subject: Re: Raw UTF-8?
References: <4c33f0f0.0c87970a.3458.fffff43f@mx.google.com>
In-Reply-To: <4c33f0f0.0c87970a.3458.fffff43f@mx.google.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

> When using special characters in manpages, I use plain UTF-8 instead of
> the escapes documented in mandoc_char(7), for a couple reasons. I'm just
> wondering, is this practice discouraged in any way? Is there a chance
> of this _not_ working in future versions of mandoc?

This is being discussed on tech@ right now. Currently, once you use any
non-ASCII encoding, the manual is no longer accessible to all terminals.
This is bad. Furthermore, -Tps will throw away your input. This is more
bad. In fact, only -Thtml will be ok with what you do, which is only by
dint of it using the same output encoding.

groff promises Unicode support in "the next major version". According to
their mailing lists, they plan on using \[uNNN] for a Unicode escape and
translating input UTF-8 into Unicode on the fly (effectively using "int"
instead of "char" for characters).

http://www.mail-archive.com/groff@gnu.org/msg01378.html

I think it's best for the time being to lift the input warnings and
document that non-ASCII characters will Balkanise the manual. I'm
flapping between warning about it and not warning.

What, by the way, are the reasons you have against using the mandoc_char
escapes?
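
For concreteness, the on-the-fly translation groff describes could look roughly like the sketch below. It is written in Python only for illustration and is not taken from groff or mandoc. The function name utf8_to_roff_escapes is hypothetical, and the exact digit format (uppercase hex, zero-padded to at least four digits) is an assumption, since the message above only gives the escape as \[uNNN].

```python
def utf8_to_roff_escapes(data: bytes) -> str:
    # Hypothetical helper: rewrite non-ASCII input as \[uXXXX] escapes.
    out = []
    # Decoding turns the byte stream ("char") into integer code points ("int").
    for ch in data.decode("utf-8"):
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII passes through untouched.
            out.append(ch)
        else:
            # Assumed format: uppercase hex, padded to at least four digits.
            out.append("\\[u%04X]" % cp)
    return "".join(out)


if __name__ == "__main__":
    # U+00E9 (e with acute accent) becomes an escape; the rest is unchanged.
    print(utf8_to_roff_escapes("caf\u00e9".encode("utf-8")))  # prints caf\[u00E9]
```

The output of such a filter is pure ASCII, so the manual source stays readable on every terminal, and how to render each escape is left to the output mode.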