From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 20559 invoked from network); 22 Jun 2020 18:01:10 -0000 Received: from bsd.lv (HELO mandoc.bsd.lv) (66.111.2.12) by inbox.vuxu.org with ESMTPUTF8; 22 Jun 2020 18:01:10 -0000 Received: from fantadrom.bsd.lv (localhost [127.0.0.1]) by mandoc.bsd.lv (OpenSMTPD) with ESMTP id f5100bdb for ; Mon, 22 Jun 2020 13:01:01 -0500 (EST) Received: from localhost (mandoc.bsd.lv [local]) by mandoc.bsd.lv (OpenSMTPD) with ESMTPA id ad8c93cd for ; Mon, 22 Jun 2020 13:01:01 -0500 (EST) Date: Mon, 22 Jun 2020 13:01:01 -0500 (EST) X-Mailinglist: mandoc-source Reply-To: source@mandoc.bsd.lv MIME-Version: 1.0 From: schwarze@mandoc.bsd.lv To: source@mandoc.bsd.lv Subject: mandoc: John Gardner: handling of ASCII control characters during input X-Mailer: activitymail 1.26, http://search.cpan.org/dist/activitymail/ Content-Type: text/plain; charset=utf-8 Message-ID: <490de98f1006a256@mandoc.bsd.lv> Log Message: ----------- John Gardner: handling of ASCII control characters during input Modified Files: -------------- mandoc: TODO Revision Data ------------- Index: TODO =================================================================== RCS file: /home/cvs/mandoc/mandoc/TODO,v retrieving revision 1.302 retrieving revision 1.303 diff -LTODO -LTODO -u -p -r1.302 -r1.303 --- TODO +++ TODO @@ -83,6 +83,20 @@ are mere guesses, and some may be wrong. Jan Stary 20 Apr 2019 20:16:54 +0200 loc * exist *** algo *** size ** imp * +- mandoc replaces all ASCII control characters except tab and line feed + with '?' during input. It would be better to replace them with + Unicode escapes in preconv_encode() or somewhere in the vicinity, + such that the already existing better replacement strings show + up in the output. Emulating groff is not desirable: groff replaces + 0x00, 0x0b, and 0x0d to 0x1f with the empty string (bad because + that's easy to overlook for the document author), 0x01 with '.' + (very confusing), and passes through 0x02 to 0x08, 0x0c, and 0x7f + raw (bad because that is insecure output). Remember that 0x07 may + need special handling because it is sometimes used for certain + delimiters, so it may need handling *after* roff.c rather than before. + reminded by John Gardner 16 Jun 2020 14:26:28 +1000 + loc ** exist ** algo ** size ** imp * + --- missing mdoc features ---------------------------------------------- - .Sh and .Ss should be parsed and partially callable, see groff_mdoc(7) -- To unsubscribe send an email to source+unsubscribe@mandoc.bsd.lv