From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp-2.sys.kth.se (smtp-2.sys.kth.se [130.237.32.160])
    by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id p4FFVblA016991
    for ; Sun, 15 May 2011 11:31:38 -0400 (EDT)
Received: from mailscan-1.sys.kth.se (mailscan-1.sys.kth.se [130.237.32.91])
    by smtp-2.sys.kth.se (Postfix) with ESMTP id 3388514D7CC;
    Sun, 15 May 2011 17:31:32 +0200 (CEST)
X-Virus-Scanned: by amavisd-new at kth.se
Received: from smtp-2.sys.kth.se ([130.237.32.160])
    by mailscan-1.sys.kth.se (mailscan-1.sys.kth.se [130.237.32.91])
    (amavisd-new, port 10024) with LMTP id JebzBTIatktz;
    Sun, 15 May 2011 17:31:30 +0200 (CEST)
X-KTH-Auth: kristaps [192.75.139.248]
X-KTH-mail-from: kristaps@bsd.lv
Received: from macky.local (unknown [192.75.139.248])
    by smtp-2.sys.kth.se (Postfix) with ESMTP id 32B9314C12F;
    Sun, 15 May 2011 17:31:26 +0200 (CEST)
Message-ID: <4DCFF1CD.7000402@bsd.lv>
Date: Sun, 15 May 2011 11:31:25 -0400
From: Kristaps Dzonsons
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US;
    rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10
X-Mailinglist: mdocml-discuss
Reply-To: discuss@mdocml.bsd.lv
MIME-Version: 1.0
To: Hiroki Sato
CC: discuss@mdocml.bsd.lv
Subject: Re: mandoc and UTF-8 support.
References: <4DCAF4D3.5000207@bsd.lv>
    <20110515.194157.759326979137963596.hrs@ec.ss.titech.ac.jp>
In-Reply-To: <20110515.194157.759326979137963596.hrs@ec.ss.titech.ac.jp>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

> kr> Hiroki (note CC'd to the mdocml mailing list),
> kr>
> kr> This regards today's conversation about mdocml and Japanese characters
> kr> at the FreeBSD summit.
> kr>
> kr> Enclosed is a screenshot of a locally-modified mandoc producing
> kr> on-terminal UTF-8 glyphs for kanji (I have NO IDEA what these
> kr> characters mean, I just picked them from the Unicode reference). I
> kr> hacked this in as a demonstrandum that it's possible to have UTF-8
> kr> output without much effort.
>
> Great, it seems to work.
>
> kr> I used the groff \U'xxxx' input escape sequence to specify Unicode
> kr> input. Unfortunately, this doesn't seem to be officially supported by
> kr> groff.
> kr>
> kr> http://lists.gnu.org/archive/html/groff/2000-04/msg00036.html
> kr>
> kr> My question is this: do you know of the most reliable to feed groff
> kr> Unicode codepoints? I'm not sure when a -Tutf8 will exist for mandoc,
> kr> but the screenshot demonstrates that it's in principle possible.
>
> I tried UTF-8 characters directly only but it was a long time ago so
> I don't remember the details. I will check the stock version of
> groff again and give mandoc a try, then contact a manual page
> maintainer in FreeBSD project about how he feels migration from groff
> to mandoc, and get back to you.
>
> And, what do I do to subscribe this mailing-list? I could not find
> information on in at bsd.lv page.

Hiroki,

Send mail to discuss+subscribe@mdocml.bsd.lv to subscribe.

Regarding groff, its underlying Unicode input method is detailed here:

http://mdocml.bsd.lv/archives/tech/0368.html

groff also supports arbitrary encodings when its input is first piped
through preconv (a groff preprocessor that translates multi-byte
characters, such as UTF-8, into groff escapes):

http://manpages.ubuntu.com/manpages/maverick/man1/preconv.1.html

This part I'll consider later. I'd like to have wide-character support
by the next version of mandoc; however, I'll post versions here for
testing when they're available.

Thanks again,

Kristaps
--
To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv
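
For readers who want a concrete picture of the preconv step mentioned in the message, here is a rough sketch. It is not preconv's actual implementation, and the function name to_groff_escapes is hypothetical. It assumes the input is already decoded UTF-8 text and rewrites every non-ASCII character as a \[uXXXX] escape, the form described in the preconv(1) manual; encoding detection and everything else preconv does is left out.

```python
import sys


def to_groff_escapes(text: str) -> str:
    """Rewrite each non-ASCII character as a groff \\[uXXXX] escape.

    Hypothetical helper for illustration only; ASCII passes through
    unchanged, everything else becomes an uppercase-hex code point
    padded to at least four digits.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        else:
            out.append("\\[u%04X]" % cp)
    return "".join(out)


if __name__ == "__main__":
    # Read UTF-8 from stdin and write ASCII-only groff input to stdout.
    data = sys.stdin.buffer.read().decode("utf-8")
    sys.stdout.write(to_groff_escapes(data))
```

Saved under an assumed name such as to_groff.py, this could be used as a filter ahead of a formatter, much as preconv output is piped into groff.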