source@mandoc.bsd.lv
* mandoc: Various updates: * document several missing ESCAPE_* constants *
@ 2023-10-23 14:46 schwarze
From: schwarze @ 2023-10-23 14:46 UTC (permalink / raw)
  To: source

Log Message:
-----------
Various updates:
* document several missing ESCAPE_* constants
* some sequences are no longer ignored
* more information about what this function is used for
* better mark up output arguments
* improve some ordering
* drop the BUGS section, all that is almost completely fixed now

Modified Files:
--------------
    mandoc:
        mandoc_escape.3

Revision Data
-------------
Index: mandoc_escape.3
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc_escape.3,v
retrieving revision 1.5
retrieving revision 1.6
diff -Lmandoc_escape.3 -Lmandoc_escape.3 -u -p -r1.5 -r1.6
--- mandoc_escape.3
+++ mandoc_escape.3
@@ -80,12 +80,12 @@ that can be used as quoting characters.
 .El
 .Pp
 Upon function entry,
-.Fa end
+.Pf * Fa end
 is expected to point to the escape sequence identifier.
 The values passed in as
-.Fa start
+.Pf * Fa start
 and
-.Fa sz
+.Pf * Fa sz
 are ignored and overwritten.
 .Pp
 By design, this function cannot handle those
@@ -102,7 +102,9 @@ and numerical expression control
 These are handled by
 .Fn roff_expand ,
 a private preprocessor function called from
-.Fn roff_parseln ,
+.Fn roff_parseln
+and
+.Fn roff_getarg ,
 see the file
 .Pa roff.c .
 .Pp
@@ -114,13 +116,22 @@ is used
 recursively by itself, because some escape sequence arguments can
 in turn contain other escape sequences,
 .It
-for error detection internally by the
+for parsing and error detection internally by the
 .Xr roff 7
 parser part of the
 .Xr mandoc 3
 library, see the file
 .Pa roff.c ,
 .It
+occasionally by high-level parser and validation modules when they
+need to skip escape sequences while scanning the input, see the files
+.Pa mdoc.c ,
+.Pa man.c ,
+.Pa man_validate.c ,
+.Pa eqn.c ,
+and
+.Pa tbl_data.c
+.It
 above all externally by the
 .Xr mandoc 1
 formatting modules, in particular
@@ -139,19 +150,19 @@ to purge escape sequences from text.
 .El
 .Sh RETURN VALUES
 Upon function return, the pointer
-.Fa end
+.Pf * Fa end
 is set to the character after the end of the escape sequence,
 such that the calling higher-level parser can easily continue.
 .Pp
 For escape sequences taking an argument, the pointer
-.Fa start
+.Pf * Fa start
 is set to the beginning of the argument and
-.Fa sz
+.Pf * Fa sz
 is set to the length of the argument.
 For escape sequences not taking an argument,
-.Fa start
+.Pf * Fa start
 is set to the character after the end of the sequence and
-.Fa sz
+.Pf * Fa sz
 is set to 0.
 Both
 .Fa start
@@ -165,6 +176,11 @@ For sequences taking an argument, the fu
 .Fn mandoc_escape
 returns one of the following values:
 .Bl -tag -width 2n
+.It Dv ESCAPE_DEVICE
+The escape sequence
+.Ic \e*(.T
+or
+.Ic \e*[.T] .
 .It Dv ESCAPE_FONT
 The escape sequence
 .Ic \ef
@@ -183,6 +199,33 @@ More specific values are returned for th
 .It Cm P Ta Dv ESCAPE_FONTPREV
 .It Cm BI Ta Dv ESCAPE_FONTBI
 .El
+.It Dv ESCAPE_HLINE
+The escape sequence
+.Ic \eh
+followed by an argument delimited by an arbitrary character.
+.It Dv ESCAPE_HORIZ
+The escape sequence
+.Ic \el
+followed by an argument delimited by an arbitrary character.
+.It Dv ESCAPE_NUMBERED
+The escape sequence
+.Ic \eN
+followed by a delimited argument.
+The delimiter character is arbitrary except that digits cannot be used.
+If a digit is encountered instead of the opening delimiter, that
+digit is considered to be the argument and the end of the sequence, and
+.Dv ESCAPE_IGNORE
+is returned.
+.Pp
+Such ASCII character escape sequences can be rendered using the function
+.Fn mchars_num2char
+described in the
+.Xr mchars_alloc 3
+manual.
+.It Dv ESCAPE_OVERSTRIKE
+The escape sequence
+.Ic \eo
+followed by an argument delimited by an arbitrary character.
 .It Dv ESCAPE_SPECIAL
 The escape sequence
 .Ic \eC
@@ -225,11 +268,11 @@ are hexadecimal digits and
 is not zero:
 .Ic \eC'u , \e[u .
 As a special exception,
-.Fa start
+.Pf * Fa start
 is set to the character after the
 .Ic u ,
 and the
-.Fa sz
+.Pf * Fa sz
 return value does not include the
 .Ic u
 either.
@@ -239,26 +282,10 @@ Such Unicode character escape sequences 
 described in the
 .Xr mchars_alloc 3
 manual.
-.It Dv ESCAPE_NUMBERED
-The escape sequence
-.Ic \eN
-followed by a delimited argument.
-The delimiter character is arbitrary except that digits cannot be used.
-If a digit is encountered instead of the opening delimiter, that
-digit is considered to be the argument and the end of the sequence, and
-.Dv ESCAPE_IGNORE
-is returned.
-.Pp
-Such ASCII character escape sequences can be rendered using the function
-.Fn mchars_num2char
-described in the
-.Xr mchars_alloc 3
-manual.
-.It Dv ESCAPE_OVERSTRIKE
-The escape sequence
-.Ic \eo
-followed by an argument delimited by an arbitrary character.
 .It Dv ESCAPE_IGNORE
+Many escape sequences that
+.Xr mandoc 1
+intends to ignore, in particular:
 .Bl -bullet -width 2n
 .It
 The escape sequence
@@ -276,18 +303,15 @@ for all forms.
 .It
 The escape sequences
 .Ic \eF ,
-.Ic \eg ,
 .Ic \ek ,
 .Ic \eM ,
 .Ic \em ,
-.Ic \en ,
-.Ic \eV ,
+.Ic \eO ,
 and
 .Ic \eY
 followed by an argument in standard form.
 .It
 The escape sequences
-.Ic \eA ,
 .Ic \eb ,
 .Ic \eD ,
 .Ic \eR ,
@@ -298,9 +322,7 @@ followed by an argument delimited by an 
 .It
 The escape sequences
 .Ic \eH ,
-.Ic \eh ,
 .Ic \eL ,
-.Ic \el ,
 .Ic \eS ,
 .Ic \ev ,
 and
@@ -312,9 +334,21 @@ is found instead of a delimiter, the seq
 with that character, and
 .Dv ESCAPE_ERROR
 is returned.
+.It
+The escape sequence
+.Ic \eO
+with a single-digit argument in the range from 1 to 4 inclusive.
 .El
+.It Dv ESCAPE_UNSUPP
+An escape sequence that
+.Xr mandoc 1
+can parse, but for which formatting is unsupported, in particular
+.Qq \eO0
+and
+.Qq \eO5 .
 .It Dv ESCAPE_ERROR
-Escape sequences taking an argument but not matching any of the above patterns.
+Escape sequences taking an argument
+where the actual argument contains a syntax error.
 In particular, that happens if the end of the logical input line
 is reached before the end of the argument.
 .El
@@ -323,17 +357,45 @@ For sequences that do not take an argume
 .Fn mandoc_escape
 returns one of the following values:
 .Bl -tag -width 2n
-.It Dv ESCAPE_SKIPCHAR
+.It Dv ESCAPE_BREAK
 The escape sequence
-.Qq \ez .
+.Qq \ep .
+.It Dv ESCAPE_IGNORE
+Many escape sequences including
+.Qq \e% ,
+.Qq \e& ,
+.Qq \e| ,
+.Qq \ed ,
+and
+.Qq \eu .
 .It Dv ESCAPE_NOSPACE
 The escape sequence
 .Qq \ec .
-.It Dv ESCAPE_IGNORE
+.It Dv ESCAPE_SKIPCHAR
+The escape sequence
+.Qq \ez .
+.It Dv ESCAPE_UNSUPP
 The escape sequences
-.Qq \ed
+.Qq \e! ,
+.Qq \e? ,
 and
-.Qq \eu .
+.Qq \er .
+.It Dv ESCAPE_UNDEF
+Many escape sequences that other
+.Xr roff 7
+implementations do not define either, for example
+.Qq \eG ,
+.Qq \eI ,
+.Qq \ei ,
+.Qq \eJ ,
+.Qq \ej ,
+.Qq \eK ,
+.Qq \eP ,
+.Qq \eT ,
+.Qq \eU ,
+.Qq \eW ,
+and
+.Qq \ey .
 .El
 .Sh FILES
 This function is implemented in
@@ -347,21 +409,3 @@ This function has been available since m
 .Sh AUTHORS
 .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
 .An Ingo Schwarze Aq Mt schwarze@openbsd.org
-.Sh BUGS
-The function doesn't cleanly distinguish between sequences that are
-valid and supported, valid and ignored, valid and unsupported,
-syntactically invalid, or undefined.
-For sequences that are ignored or unsupported, it doesn't tell
-whether that deficiency is likely to cause major formatting problems
-and/or loss of document content.
-The function is already rather complicated and still parses some
-sequences incorrectly.
-.
-.ig
-For these sequences, the list given below specifies a starting string
-and either the length of the argument or an ending character.
-The argument starts after the starting string.
-In the former case, the sequence ends with the end of the argument.
-In the latter case, the argument ends before the ending character,
-and the sequence ends with the ending character.
-..
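Illustration (not part of the commit): the output-argument contract that the
revised manual describes for *end, *start, and *sz can be sketched with a toy
stand-in. Everything below is hypothetical: toy_escape() and the TOY_*
codes are invented for this sketch and are not the mandoc implementation or
its ESCAPE_* constants. The sketch handles only \c, \z, and \f, and it
assumes that "standard form" means a single character, "(" plus two
characters, or a bracketed "[...]" argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; NOT the real ESCAPE_* values. */
enum toy_esc { TOY_ERROR, TOY_FONT, TOY_NOSPACE, TOY_SKIPCHAR };

/* Report a failure: empty argument, *end moved to the end of the input. */
enum toy_esc
toy_fail(const char **end, const char **start, int *sz, const char *cp)
{
	*start = cp;
	*sz = 0;
	*end = cp + strlen(cp);
	return TOY_ERROR;
}

/*
 * Toy model of the calling convention only.
 * On entry, *end points to the identifier after the backslash;
 * whatever is passed in *start and *sz is ignored and overwritten.
 */
enum toy_esc
toy_escape(const char **end, const char **start, int *sz)
{
	const char *cp, *close;

	assert(end != NULL && *end != NULL && start != NULL && sz != NULL);
	cp = *end;

	switch (*cp) {
	case 'c':
	case 'z':
		/* No argument: *start is the character after the sequence. */
		*start = *end = cp + 1;
		*sz = 0;
		return *cp == 'c' ? TOY_NOSPACE : TOY_SKIPCHAR;
	case 'f':
		break;
	default:
		/* Every other identifier is out of scope for this sketch. */
		return toy_fail(end, start, sz, cp);
	}

	cp++;	/* Skip the 'f'; cp now points to the argument syntax. */
	switch (*cp) {
	case '\0':
		return toy_fail(end, start, sz, cp);
	case '(':
		/* Two-character argument, for example \f(BI. */
		if (cp[1] == '\0' || cp[2] == '\0')
			return toy_fail(end, start, sz, cp + 1);
		*start = cp + 1;
		*sz = 2;
		*end = cp + 3;
		return TOY_FONT;
	case '[':
		/* Bracketed argument, for example \f[BI]. */
		close = strchr(cp + 1, ']');
		if (close == NULL)
			return toy_fail(end, start, sz, cp + 1);
		*start = cp + 1;
		*sz = (int)(close - *start);
		*end = close + 1;
		return TOY_FONT;
	default:
		/* One-character argument, for example \fB. */
		*start = cp;
		*sz = 1;
		*end = cp + 1;
		return TOY_FONT;
	}
}
```

A caller would typically advance past the backslash, call the function, and
then resume scanning at *end, using *start and *sz to read the argument
without copying it.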