discuss@mandoc.bsd.lv
 help / color / mirror / Atom feed
* Patching Mandoc for IRIX
@ 2020-06-02 23:29 Kazuo Kuroi
  2020-06-22 21:44 ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Kazuo Kuroi @ 2020-06-02 23:29 UTC (permalink / raw)
  To: discuss

[-- Attachment #1: Type: text/plain, Size: 1735 bytes --]

Hi there,

I have patched mandoc to work with IRIX, but I have a feeling that the 
fixes made will require some changes. Let me first explain my goals here:

I am patching mandoc to work with MIPSPro, the native compiler of IRIX, 
so I'll have to explain the changes I made to get it to build:

http://gitea.irixce.org/Raion/Xenopatches/raw/branch/master/mandoc/mandoc.patch 


Here's the patch file.

First change is because it doesn't reliably detect the c99 driver, and 
as the code uses non ANSI-related things, it needs to detect c99. This 
can probably be disregarded, as can the CFLAGS reference.

Next one is in mandoc.h, and it's because MIPSPro doesn't support the 
__attribute__ block. This could be fixed with a guard for non-GCC 
compilers, like this:
#ifdef __GNUC__
__attribute__((__format__ (__printf__, 4, 5)));

#endif

Or something. I would hope that you won't lock it out to GCC or clang, 
because I'm sure there's other compilers this thing chokes on.

The rest are to fix the IRIX printf() implementation, which doesn't 
allow for %zu as mandoc currently does. You can see my discussion with a 
colleague on the topic here:

https://forums.irixnet.org/thread-1946-post-14522.html

I understand if you do not want to upstream all of these changes, but I 
would hope that we can take some action to prevent someone else from 
having to go through this. Surely, this is not the only UNIX that would 
cause this.

Yes, GCC works, but we don't have clang and often times using a GPL 
compiler isn't only against my own principles when we have perfectly 
good alternatives, and MIPSPro performs better on IRIX than GCC.

If I can be of any assistance or questions in regards to this patch, let 
me know!

-Kazuo Kuroi


[-- Attachment #2: Type: text/html, Size: 2749 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Patching Mandoc for IRIX
  2020-06-02 23:29 Patching Mandoc for IRIX Kazuo Kuroi
@ 2020-06-22 21:44 ` Ingo Schwarze
  2020-06-22 22:09   ` Kazuo Kuroi
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2020-06-22 21:44 UTC (permalink / raw)
  To: Kazuo Kuroi; +Cc: discuss

Hello,

sorry for taking so long to respond, i was somewhat distracted by other
matters and now found your message while systematically looking for
unanswered messages regarding mandoc.


Kazuo Kuroi wrote on Tue, Jun 02, 2020 at 07:29:52PM -0400:

> I have patched mandoc to work with IRIX, but I have a feeling that the 
> fixes made will require some changes. Let me first explain my goals here:
> 
> I am patching mandoc to work with MIPSPro, the native compiler of IRIX, 

That sounds good and interesting.  I like it when people try to make
software work with native compilers that is intended to be portable,
rather than following a mindless reflex like "oh let's just require
gcc or clang to compile this".

> so I'll have to explain the changes I made to get it to build:
> 
> http://gitea.irixce.org/Raion/Xenopatches/raw/branch/master/mandoc/mandoc.patch 
> 
> Here's the patch file.
> 
> First change is because it doesn't reliably detect the c99 driver, and 
> as the code uses non ANSI-related things, it needs to detect c99. This 
> can probably be disregarded,

Possibly.  I tried to include code in ./configure in the past to
automatically detect the system compiler, but that attempt caused
more grief than benefit.  Few platforms needed it, and more ended
up having trouble with the attempted test.

So for the next release, i decided to simplify things by just
statically setting CC=cc in ./configure and inviting users to
set CC in configure.local to whatever is appropriate on their
platform - but of course only if "cc" is not.

Arguably, given that POSIX defines c99 but not cc, it might be better
to make CC=c99 the default.  Probably, the only reason that i didn't
is somewhat weak:  My main development platform, OpenBSD, provides
a cc(1) command but does not provide a c99(1) command.  Maybe it
should, but oh well.

For now, i suggest you keep CC=c99 in the configure.local file for IRIX.

> as can the CFLAGS reference.

I certainly won't put -O3 into CFLAGS by default, or any other
optimization flag for that matter.  I think operating systems should
have defaults that are appropriate for compiling general purpose
application software that has no special needs, and i consider it
annoying when upstream maintainers tweak compiler flags like that.

However, if there is some compelling reason why IRIX needs -O3,
then you can put that into the configure.local file for IRIX, though
i'm a bit at a loss trying to guess why that could possibly be
needed.

If IRIX needs -mips3, that definitely needs to go into configure.local.
I have no idea right now how i could possibly test for that.

> Next one is in mandoc.h, and it's because MIPSPro doesn't support the 
> __attribute__ block. This could be fixed with a guard for non-GCC 
> compilers, like this:
> #ifdef __GNUC__
> __attribute__((__format__ (__printf__, 4, 5)));
> #endif

Actually, i was already doing that, ./configure wrote something
like that into config.h.  However, some of the *.c files used private
headers using __attribute__ without including config.h first.  I
just fixed that with the first one of the two commits below.
Thanks for reporting the issue!

While there, i also replaced the horrible __GNUC__ version number
test by a proper feature test.  No idea why we didn't have that
in the first place, it really wasn't difficult to write.

> Or something. I would hope that you won't lock it out to GCC or clang, 

Absolutely not, that's not my intention at all.

> because I'm sure there's other compilers this thing chokes on.

Right.  Only it's sometimes hard to get reports from other compilers,
so thanks for what you are reporting.

> The rest are to fix the IRIX printf() implementation, which doesn't 
> allow for %zu as mandoc currently does.

I would strongly recommend that you fix your IRIX libc to support %zu
because that has been required by POSIX for more than a decade

  https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html

and because not having it, which often results in using format
specifiers for the wrong length, can cause security issues.
Maybe not in mandoc, but in general: i don't doubt that you also
run software on IRIX that is more prone to develop security
vulnerabilities.  Besides, i would be surprised if closing that
particular feature gap in the IRIX libc would cause particularly
large amounts of work.

Given that apart from the the notoriously quirky and extremely
outdated SunOS 5.9 (2002 version), this is the first time i'm hearing
about a system that does not support %zu - even SunOS 5.10 (2009
version) supports that - i'm a bit hesitant to write a feature test
for that, also because using the result of the feature test in the
code would be somewhat intrusive.

So until you come around to fixing your libc, is think carrying
local patches for %zu is the best you can do.  But i suggest you
use the following idiom for these patches:

  mandoc_asprintf(arg, "%llun", (unsigned long long)width);

That should be perfectly safe and perfectly portable, whereas the 

  size_t width = ...;
  mandoc_asprintf(arg, "%un", width);

that you are currently using can potentially be dangerous.
At the very least, depending on the platform, on endianness,
and maybe depending on other aspects of the ABI, it might
silently produce incorrect results.

> You can see my discussion with a 
> colleague on the topic here:
> 
> https://forums.irixnet.org/thread-1946-post-14522.html

Very interesting, please keep me posted on how things work out.
In particular, i would be interested to learn if any of your
manual pages format poorly with mandoc.  But it's also
interesting to follow how your work progresses in general.

Regarding the topics discussed in your forum:

 * Heirloom (not "Heritage") Troff has very high typesetting
   quality in some respects, in some even better than groff,
   for example it has paragraph-at-once filling.  It is also
   somewhat maintained, but AFAIK more or less by a one-man
   team (Carsten Kunze) with a handful of commits in 2019
   and none so far in 2020.  It is not at all an obvious
   choice as a manual page formatter.  I'm not aware of a
   single operating system using it for that purpose.  Carsten
   has invested some work to make it better for manual pages,
   but you should still expect at least occasional compatibility
   issues when using it for manual pages, much more frquently
   than with mandoc or groff.

 * With mandoc, all three options are workable:  installing source
   manuals or preformatted manuals, and in the former case,
   using mandoc as both the formatter and the man(1) implementation,
   as for example OpenBSD and Alpine Linux do it, or using mandoc
   as the formatter but a different man(1) implementation, as for
   example FreeBSD and Illumos do it.
   In OpenBSD, we switched in three steps: first we switched the
   formatter from groff to mandoc in 2010, then we switched from
   installing preformatted to installing source manuals in 2011,
   and finally to the mandoc implementation of man(1) in 2014.
   Alpine Linux has done all that in one single step in 2014.

> I understand if you do not want to upstream all of these changes, but I 
> would hope that we can take some action to prevent someone else from 
> having to go through this. Surely, this is not the only UNIX that would 
> cause this.

In general, i try to upstream patches when it is possible without
causing unreasonable complication.  Often, that work progressed in
multiple steps, slowly and carefully improving support for that
platform, sometimes over years.  For example, in the beginning,
building on Oracle Solaris 11 needed lots of handholding and various
configure.local tweaks.  By now, even the older SunOS 5.10 just
works out of the box.

> Yes, GCC works, but we don't have clang and often times using a GPL 
> compiler isn't only against my own principles when we have perfectly 
> good alternatives, and MIPSPro performs better on IRIX than GCC.

That sounds reasonable.

> If I can be of any assistance or questions in regards to this patch, let 
> me know!

Sure, see above, and keep me posted about how your work is progressing!

Do you have a porting or packaging system in IRIX, in any way similar
to BSD ports, or Linux packages, or MacPorts, or Homebrew, or AUR,
or SlackBuilds, or Cashew?  If so, is there a distinction between
official or unofficial ports/packages?  I'm wondering whether at some
point, it might make sense to list IRIX on

  https://mandoc.bsd.lv/ports.html

but i'm a bit at a loss as to how...

Could you re-test whether compiling mandoc from CVS HEAD on IRIX
now already works a bit better for you, with smaller patches?

Yours,
  Ingo


Log Message:
-----------
Because mandoc_aux.h and mandoc.h use __attribute__, all files that
include mandoc_aux.h or mandoc.h need to include config.h, too.
It is suspected that for example IRIX needs this, or it is likely
to throw errors in these files because the system compiler doesn't
understand __attribute__.
Issue reported by Kazuo Kuroi <kazuo at irixnet dot org>.

Modified Files:
--------------
    mandoc:
        Makefile.depend
        dba_array.c
        dba_read.c
        mandoc_ohash.c
        mandoc_xr.c
        mdoc_markdown.c
        mdoc_state.c
        roff_html.c
        roff_term.c
        roff_validate.c
        term_tab.c

Revision Data
-------------
Index: mdoc_state.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc_state.c,v
retrieving revision 1.16
retrieving revision 1.17
diff -Lmdoc_state.c -Lmdoc_state.c -u -p -r1.16 -r1.17
--- mdoc_state.c
+++ mdoc_state.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2014, 2015, 2017 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -14,6 +14,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <assert.h>
Index: term_tab.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/term_tab.c,v
retrieving revision 1.5
retrieving revision 1.6
diff -Lterm_tab.c -Lterm_tab.c -u -p -r1.5 -r1.6
--- term_tab.c
+++ term_tab.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2017 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -14,6 +14,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <stddef.h>
Index: roff_html.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff_html.c,v
retrieving revision 1.20
retrieving revision 1.21
diff -Lroff_html.c -Lroff_html.c -u -p -r1.20 -r1.21
--- roff_html.c
+++ roff_html.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2010 Kristaps Dzonsons <kristaps@bsd.lv>
  * Copyright (c) 2014, 2017, 2018, 2019 Ingo Schwarze <schwarze@openbsd.org>
@@ -15,6 +15,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <assert.h>
Index: mandoc_ohash.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc_ohash.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lmandoc_ohash.c -Lmandoc_ohash.c -u -p -r1.2 -r1.3
--- mandoc_ohash.c
+++ mandoc_ohash.c
@@ -1,4 +1,4 @@
-/*	$Id$	*/
+/* $Id$	*/
 /*
  * Copyright (c) 2014, 2015 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -14,6 +14,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 #include <stddef.h>
 #include <stdint.h>
Index: mdoc_markdown.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc_markdown.c,v
retrieving revision 1.35
retrieving revision 1.36
diff -Lmdoc_markdown.c -Lmdoc_markdown.c -u -p -r1.35 -r1.36
--- mdoc_markdown.c
+++ mdoc_markdown.c
@@ -16,6 +16,8 @@
  *
  * Markdown formatter for mdoc(7) used by mandoc(1).
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <assert.h>
Index: dba_read.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/dba_read.c,v
retrieving revision 1.4
retrieving revision 1.5
diff -Ldba_read.c -Ldba_read.c -u -p -r1.4 -r1.5
--- dba_read.c
+++ dba_read.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2016 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -19,6 +19,8 @@
  * The interface is defined in "dba.h".
  * This file is seperate from dba.c because this also uses "dbm.h".
  */
+#include "config.h"
+
 #include <regex.h>
 #include <stdint.h>
 #include <stdlib.h>
Index: Makefile.depend
===================================================================
RCS file: /home/cvs/mandoc/mandoc/Makefile.depend,v
retrieving revision 1.48
retrieving revision 1.49
diff -LMakefile.depend -LMakefile.depend -u -p -r1.48 -r1.49
--- Makefile.depend
+++ Makefile.depend
@@ -22,8 +22,8 @@ compat_strsep.o: compat_strsep.c config.
 compat_strtonum.o: compat_strtonum.c config.h
 compat_vasprintf.o: compat_vasprintf.c config.h
 dba.o: dba.c config.h mandoc_aux.h mandoc_ohash.h compat_ohash.h mansearch.h dba_write.h dba_array.h dba.h
-dba_array.o: dba_array.c mandoc_aux.h dba_write.h dba_array.h
-dba_read.o: dba_read.c mandoc_aux.h mansearch.h dba_array.h dba.h dbm.h
+dba_array.o: dba_array.c config.h mandoc_aux.h dba_write.h dba_array.h
+dba_read.o: dba_read.c config.h mandoc_aux.h mansearch.h dba_array.h dba.h dbm.h
 dba_write.o: dba_write.c config.h dba_write.h
 dbm.o: dbm.c config.h mansearch.h dbm_map.h dbm.h
 dbm_map.o: dbm_map.c config.h mansearch.h dbm_map.h dbm.h
@@ -42,8 +42,8 @@ man_validate.o: man_validate.c config.h 
 mandoc.o: mandoc.c config.h mandoc_aux.h mandoc.h roff.h libmandoc.h roff_int.h
 mandoc_aux.o: mandoc_aux.c config.h mandoc.h mandoc_aux.h
 mandoc_msg.o: mandoc_msg.c config.h mandoc.h
-mandoc_ohash.o: mandoc_ohash.c mandoc_aux.h mandoc_ohash.h compat_ohash.h
-mandoc_xr.o: mandoc_xr.c mandoc_aux.h mandoc_ohash.h compat_ohash.h mandoc_xr.h
+mandoc_ohash.o: mandoc_ohash.c config.h mandoc_aux.h mandoc_ohash.h compat_ohash.h
+mandoc_xr.o: mandoc_xr.c config.h mandoc_aux.h mandoc_ohash.h compat_ohash.h mandoc_xr.h
 mandocd.o: mandocd.c config.h mandoc.h roff.h mdoc.h man.h mandoc_parse.h main.h manconf.h
 mandocdb.o: mandocdb.c config.h compat_fts.h mandoc_aux.h mandoc_ohash.h compat_ohash.h mandoc.h roff.h mdoc.h man.h mandoc_parse.h manconf.h mansearch.h dba_array.h dba.h
 manpath.o: manpath.c config.h mandoc_aux.h mandoc.h manconf.h
@@ -53,8 +53,8 @@ mdoc_argv.o: mdoc_argv.c config.h mandoc
 mdoc_html.o: mdoc_html.c config.h mandoc_aux.h mandoc.h roff.h mdoc.h out.h html.h main.h
 mdoc_macro.o: mdoc_macro.c config.h mandoc.h roff.h mdoc.h libmandoc.h roff_int.h libmdoc.h
 mdoc_man.o: mdoc_man.c config.h mandoc_aux.h mandoc.h roff.h mdoc.h man.h out.h main.h
-mdoc_markdown.o: mdoc_markdown.c mandoc_aux.h mandoc.h roff.h mdoc.h main.h
-mdoc_state.o: mdoc_state.c mandoc.h roff.h mdoc.h libmandoc.h roff_int.h libmdoc.h
+mdoc_markdown.o: mdoc_markdown.c config.h mandoc_aux.h mandoc.h roff.h mdoc.h main.h
+mdoc_state.o: mdoc_state.c config.h mandoc.h roff.h mdoc.h libmandoc.h roff_int.h libmdoc.h
 mdoc_term.o: mdoc_term.c config.h mandoc_aux.h roff.h mdoc.h out.h term.h term_tag.h main.h
 mdoc_validate.o: mdoc_validate.c config.h mandoc_aux.h mandoc.h mandoc_xr.h roff.h mdoc.h libmandoc.h roff_int.h libmdoc.h tag.h
 msec.o: msec.c config.h mandoc.h libmandoc.h msec.in
@@ -62,9 +62,9 @@ out.o: out.c config.h mandoc_aux.h tbl.h
 preconv.o: preconv.c config.h mandoc.h roff.h mandoc_parse.h libmandoc.h
 read.o: read.c config.h mandoc_aux.h mandoc.h roff.h mdoc.h man.h mandoc_parse.h libmandoc.h roff_int.h tag.h
 roff.o: roff.c config.h mandoc_aux.h mandoc_ohash.h compat_ohash.h mandoc.h roff.h mandoc_parse.h libmandoc.h roff_int.h tbl_parse.h eqn_parse.h predefs.in
-roff_html.o: roff_html.c mandoc.h roff.h out.h html.h
-roff_term.o: roff_term.c mandoc.h roff.h out.h term.h
-roff_validate.o: roff_validate.c mandoc.h roff.h libmandoc.h roff_int.h
+roff_html.o: roff_html.c config.h mandoc.h roff.h out.h html.h
+roff_term.o: roff_term.c config.h mandoc.h roff.h out.h term.h
+roff_validate.o: roff_validate.c config.h mandoc.h roff.h libmandoc.h roff_int.h
 soelim.o: soelim.c config.h compat_stringlist.h
 st.o: st.c config.h mandoc.h roff.h libmdoc.h
 tag.o: tag.c config.h mandoc_aux.h mandoc_ohash.h compat_ohash.h roff.h mdoc.h roff_int.h tag.h
@@ -77,6 +77,6 @@ tbl_term.o: tbl_term.c config.h mandoc.h
 term.o: term.c config.h mandoc.h mandoc_aux.h out.h term.h main.h
 term_ascii.o: term_ascii.c config.h mandoc.h mandoc_aux.h out.h term.h manconf.h main.h
 term_ps.o: term_ps.c config.h mandoc_aux.h out.h term.h manconf.h main.h
-term_tab.o: term_tab.c mandoc_aux.h out.h term.h
+term_tab.o: term_tab.c config.h mandoc_aux.h out.h term.h
 term_tag.o: term_tag.c config.h mandoc.h roff.h roff_int.h tag.h term_tag.h
 tree.o: tree.c config.h mandoc.h roff.h mdoc.h man.h tbl.h eqn.h main.h
Index: dba_array.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/dba_array.c,v
retrieving revision 1.1
retrieving revision 1.2
diff -Ldba_array.c -Ldba_array.c -u -p -r1.1 -r1.2
--- dba_array.c
+++ dba_array.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2016 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -17,6 +17,8 @@
  * Allocation-based arrays for the mandoc database, for read-write access.
  * The interface is defined in "dba_array.h".
  */
+#include "config.h"
+
 #include <assert.h>
 #include <stdint.h>
 #include <stdlib.h>
Index: mandoc_xr.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc_xr.c,v
retrieving revision 1.3
retrieving revision 1.4
diff -Lmandoc_xr.c -Lmandoc_xr.c -u -p -r1.3 -r1.4
--- mandoc_xr.c
+++ mandoc_xr.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2017 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -14,6 +14,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <assert.h>
Index: roff_validate.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff_validate.c,v
retrieving revision 1.19
retrieving revision 1.20
diff -Lroff_validate.c -Lroff_validate.c -u -p -r1.19 -r1.20
--- roff_validate.c
+++ roff_validate.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2010, 2017, 2018, 2020 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -14,6 +14,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <assert.h>
Index: roff_term.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff_term.c,v
retrieving revision 1.19
retrieving revision 1.20
diff -Lroff_term.c -Lroff_term.c -u -p -r1.19 -r1.20
--- roff_term.c
+++ roff_term.c
@@ -1,4 +1,4 @@
-/*	$Id$ */
+/* $Id$ */
 /*
  * Copyright (c) 2010,2014,2015,2017-2019 Ingo Schwarze <schwarze@openbsd.org>
  *
@@ -14,6 +14,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
+#include "config.h"
+
 #include <sys/types.h>
 
 #include <assert.h>


Log Message:
-----------
Provide a real feature test for __attribute__().
Looking at version numbers like __GNUC__ is always a bad idea.
Believe it or not, this even makes ./configure shorter by one line.

Modified Files:
--------------
    mandoc:
        Makefile
        configure
        configure.local.example

Added Files:
-----------
    mandoc:
        test-attribute.c

Revision Data
-------------
Index: Makefile
===================================================================
RCS file: /home/cvs/mandoc/mandoc/Makefile,v
retrieving revision 1.534
retrieving revision 1.535
diff -LMakefile -LMakefile -u -p -r1.534 -r1.535
--- Makefile
+++ Makefile
@@ -19,7 +19,8 @@ VERSION = 1.14.5
 
 # === LIST OF FILES ====================================================
 
-TESTSRCS	 = test-be32toh.c \
+TESTSRCS	 = test-attribute.c \
+		   test-be32toh.c \
 		   test-cmsg.c \
 		   test-dirent-namlen.c \
 		   test-EFTYPE.c \
--- /dev/null
+++ test-attribute.c
@@ -0,0 +1,48 @@
+/* $Id: test-attribute.c,v 1.1 2020/06/22 20:00:38 schwarze Exp $ */
+/*
+ * Copyright (c) 2020 Ingo Schwarze <schwarze@openbsd.org>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+void	 var_arg(const char *, ...)
+		__attribute__((__format__ (__printf__, 1, 2)));
+void	 no_ret(int)
+		__attribute__((__noreturn__));
+
+void
+var_arg(const char *fmt, ...)
+{
+	va_list	 ap;
+
+	va_start(ap, fmt);
+	vprintf(fmt, ap);
+	va_end(ap);
+}
+
+void
+no_ret(int i)
+{
+	exit(i);
+}
+
+int
+main(void)
+{
+	var_arg("Test output: %d\n", 42);
+	no_ret(0);
+}
Index: configure.local.example
===================================================================
RCS file: /home/cvs/mandoc/mandoc/configure.local.example,v
retrieving revision 1.37
retrieving revision 1.38
diff -Lconfigure.local.example -Lconfigure.local.example -u -p -r1.37 -r1.38
--- configure.local.example
+++ configure.local.example
@@ -288,6 +288,7 @@ CFLAGS="-g"
 # and will be regarded as failed) or 1 (test will not be run and will
 # be regarded as successful).
 
+HAVE_ATTRIBUTE=0
 HAVE_DIRENT_NAMLEN=0
 HAVE_ENDIAN=0
 HAVE_EFTYPE=0
Index: configure
===================================================================
RCS file: /home/cvs/mandoc/mandoc/configure,v
retrieving revision 1.75
retrieving revision 1.76
diff -Lconfigure -Lconfigure -u -p -r1.75 -r1.76
--- configure
+++ configure
@@ -55,6 +55,7 @@ BUILD_CGI=0
 BUILD_CATMAN=0
 INSTALL_LIBMANDOC=0
 
+HAVE_ATTRIBUTE=
 HAVE_CMSG=
 HAVE_DIRENT_NAMLEN=
 HAVE_EFTYPE=
@@ -294,6 +295,7 @@ fi
 # --- tests for config.h  ----------------------------------------------
 
 # --- library functions ---
+runtest attribute	ATTRIBUTE	|| true
 runtest cmsg		CMSG		"" "-D_XPG4_2" || true
 runtest dirent-namlen	DIRENT_NAMLEN	|| true
 runtest be32toh		ENDIAN		|| true
@@ -422,10 +424,6 @@ cat << __HEREDOC__
 #error "Do not use C++.  See the INSTALL file."
 #endif
 
-#if !defined(__GNUC__) || (__GNUC__ < 4)
-#define __attribute__(x)
-#endif
-
 __HEREDOC__
 
 [ ${NEED_GNU_SOURCE} -eq 0 ] || echo "#define _GNU_SOURCE"
@@ -447,6 +445,7 @@ echo "#define OSENUM ${OSENUM}"
 [ -n "${OSNAME}" ] && echo "#define OSNAME \"${OSNAME}\""
 [ -n "${UTF8_LOCALE}" ] && echo "#define UTF8_LOCALE \"${UTF8_LOCALE}\""
 [ -n "${HOMEBREWDIR}" ] && echo "#define HOMEBREWDIR \"${HOMEBREWDIR}\""
+[ ${HAVE_ATTRIBUTE} -eq 0 ] && echo "#define __attribute__(x)"
 [ ${HAVE_EFTYPE} -eq 0 ] && echo "#define EFTYPE EINVAL"
 [ ${HAVE_O_DIRECTORY} -eq 0 ] && echo "#define O_DIRECTORY 0"
 [ ${HAVE_PATH_MAX} -eq 0 ] && echo "#define PATH_MAX 4096"
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Patching Mandoc for IRIX
  2020-06-22 21:44 ` Ingo Schwarze
@ 2020-06-22 22:09   ` Kazuo Kuroi
  2020-08-27 18:09     ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Kazuo Kuroi @ 2020-06-22 22:09 UTC (permalink / raw)
  To: discuss

Hi Ingo!

Thank you for getting back to me, I appreciate the professionalism! 
Don't worry about the specific CFLAGS, -mips4 should work as well and I 
was just demonstrating what my particular use case requested. -O3 was 
just a test, and it got roped into the patch, my mistake!

I'm glad to hear you're taking my feedback seriously, let me speak on 
one point though:

IRIX is not free/open source, and we do not have access to the IRIX libc 
to change the printf() implementation. I've considered adding an 
external override printf() replacement to my library I use to improve 
portability (libxg) but let me quote a dev on my forums on the topic:

"The "%zun" format specifier to the printf family of functions is 
problematic for us.  Newer libc implementations (GNU, BSD, ...) use the 
"z" character as a length modifier to indicate the following "u" 
argument is of type size_t.  IRIX libc doesn't support the "z" length 
modifier at all.  So that's got to go, for starters.  I'm assuming 
you're compiling as N32 code, so an unmodified "%u" should suffice since 
size_t is defined as an unsigned int in /usr/include/sys/types.h.  (Use 
%lu for 64-bit code where size_t is an unsigned long instead.  Which 
clearly shows why the z modifier is needed for portable code!)

Also, the trailing "n" is potentially problematic.  I'm not 100% sure, 
but I think the code is intended to print a size_t followed by a literal 
"n".  But IRIX libc seems to be interpreting it as the "n" "conversion 
character" which requests the printf family of functions to put the 
length of the printed string into a variable.  But the next argument to 
the printf call (from mdoc_validate.c) is an int rather than a pointer 
to an int.  That would explain the segmentation fault:  printf treating 
that integer as a pointer means it is trying to write to memory the 
process doesn't own."

In this case, the trailing n didn't cause any issues. I understand for 
portability reasons you wouldn't want to change it, and that's totally 
understandable. One way you could accommodate IRIX would be to use the 
__sgi definition in configure, and then you could use perl or sed to 
change the code, or you could throw a warning out in configure regarding 
%zu in the code and we could work out a more conservative patch that 
fixes just the %zu for those who stumble upon this. This mostly affects 
the makewhatis commands, the actual mandoc binary appears to work fine.

I would think that some other platforms like older AIX, Solaris, HP-UX 
etc. may have this issue too, but I've not worked on those extensively.

Pragmatically, after I got a copy of the DWB, I think the best option 
for our usecase is to use mandoc because the implementation is just 
"Better" than the others here, and it doesn't preclude groff or nroff 
for others. The native awf used in the base IRIX has its manpages packed 
in .z pack files, I still need to test if we can use the "man" but if 
not, I am gonna make a script with a cronjob that allows for the man 
files to be packed up easily.

On your question of whether or not we have ports or anything, not 
currently. There's two main projects looking to "modernize" IRIX, each 
with its own way of doing things:

Nekoware II - Named after the old forum nekochan.net's premiere project, 
which was Nekoware. We follow the same philosophy of the base of the 
ports should be compiled with the native compiler (MIPSPro) while 
projects that are too "new" can use GCC as needed. We package stuff in 
the .tardist native format, basically a tarball with the 
idb/spec/distfile format used by IRIX since the 4D1 68k era. It's a bit 
clunky, but we prefer to kind of keep things historical.

I'm a project head for Nekoware II - I don't do development that 
intensively, I'm basically a guy who knows how to beat on other people's 
code to make it work. I work more as a documenter, packager and 
occasional supervisor.

SGUG RSE - A competing project by the sgi.sh forum. They have ported RPM 
to IRIX and are using an SRPM approach with GCC and distcc. I do not 
know but so much about this, but they are not currently packaging mandoc 
as they use GNU GROFF.

I usually, for now, keep patches and references in "Xenopatches" here:
http://gitea.irixce.org/Raion/Xenopatches/src/branch/master/mandoc

So pragmatically, once I figure out how to get it all together, you can 
link here and I'll include a build instructions file. Once Nekoware II 
is packaging tardists again, you can link back to nekoware II's homepage.


Thank you very much.

On 6/22/2020 5:44 PM, Ingo Schwarze wrote:
> Hello,
>
> sorry for taking so long to respond, i was somewhat distracted by other
> matters and now found your message while systematically looking for
> unanswered messages regarding mandoc.
>
>
> Kazuo Kuroi wrote on Tue, Jun 02, 2020 at 07:29:52PM -0400:
>
>> I have patched mandoc to work with IRIX, but I have a feeling that the
>> fixes made will require some changes. Let me first explain my goals here:
>>
>> I am patching mandoc to work with MIPSPro, the native compiler of IRIX,
> That sounds good and interesting.  I like it when people try to make
> software work with native compilers that is intended to be portable,
> rather than following a mindless reflex like "oh let's just require
> gcc or clang to compile this".
>
>> so I'll have to explain the changes I made to get it to build:
>>
>> http://gitea.irixce.org/Raion/Xenopatches/raw/branch/master/mandoc/mandoc.patch
>>
>> Here's the patch file.
>>
>> First change is because it doesn't reliably detect the c99 driver, and
>> as the code uses non ANSI-related things, it needs to detect c99. This
>> can probably be disregarded,
> Possibly.  I tried to include code in ./configure in the past to
> automatically detect the system compiler, but that attempt caused
> more grief than benefit.  Few platforms needed it, and more ended
> up having trouble with the attempted test.
>
> So for the next release, i decided to simplify things by just
> statically setting CC=cc in ./configure and inviting users to
> set CC in configure.local to whatever is appropriate on their
> platform - but of course only if "cc" is not.
>
> Arguably, given that POSIX defines c99 but not cc, it might be better
> to make CC=c99 the default.  Probably, the only reason that i didn't
> is somewhat weak:  My main development platform, OpenBSD, provides
> a cc(1) command but does not provide a c99(1) command.  Maybe it
> should, but oh well.
>
> For now, i suggest you keep CC=c99 in the configure.local file for IRIX.
>
>> as can the CFLAGS reference.
> I certainly won't put -O3 into CFLAGS by default, or any other
> optimization flag for that matter.  I think operating systems should
> have defaults that are appropriate for compiling general purpose
> application software that has no special needs, and i consider it
> annoying when upstream maintainers tweak compiler flags like that.
>
> However, if there is some compelling reason why IRIX needs -O3,
> then you can put that into the configure.local file for IRIX, though
> i'm a bit at a loss trying to guess why that could possibly be
> needed.
>
> If IRIX needs -mips3, that definitely needs to go into configure.local.
> I have no idea right now how i could possibly test for that.
>
>> Next one is in mandoc.h, and it's because MIPSPro doesn't support the
>> __attribute__ block. This could be fixed with a guard for non-GCC
>> compilers, like this:
>> #ifdef __GNUC__
>> __attribute__((__format__ (__printf__, 4, 5)));
>> #endif
> Actually, i was already doing that, ./configure wrote something
> like that into config.h.  However, some of the *.c files used private
> headers using __attribute__ without including config.h first.  I
> just fixed that with the first one of the two commits below.
> Thanks for reporting the issue!
>
> While there, i also replaced the horrible __GNUC__ version number
> test by a proper feature test.  No idea why we didn't have that
> in the first place, it really wasn't difficult to write.
>
>> Or something. I would hope that you won't lock it out to GCC or clang,
> Absolutely not, that's not my intention at all.
>
>> because I'm sure there's other compilers this thing chokes on.
> Right.  Only it's sometimes hard to get reports from other compilers,
> so thanks for what you are reporting.
>
>> The rest are to fix the IRIX printf() implementation, which doesn't
>> allow for %zu as mandoc currently does.
> I would strongly recommend that you fix your IRIX libc to support %zu
> because that has been required by POSIX for more than a decade
>
>    https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html
>
> and because not having it, which often results in using format
> specifiers for the wrong length, can cause security issues.
> Maybe not in mandoc, but in general: i don't doubt that you also
> run software on IRIX that is more prone to develop security
> vulnerabilities.  Besides, i would be surprised if closing that
> particular feature gap in the IRIX libc would cause particularly
> large amounts of work.
>
> Given that apart from the the notoriously quirky and extremely
> outdated SunOS 5.9 (2002 version), this is the first time i'm hearing
> about a system that does not support %zu - even SunOS 5.10 (2009
> version) supports that - i'm a bit hesitant to write a feature test
> for that, also because using the result of the feature test in the
> code would be somewhat intrusive.
>
> So until you come around to fixing your libc, is think carrying
> local patches for %zu is the best you can do.  But i suggest you
> use the following idiom for these patches:
>
>    mandoc_asprintf(arg, "%llun", (unsigned long long)width);
>
> That should be perfectly safe and perfectly portable, whereas the
>
>    size_t width = ...;
>    mandoc_asprintf(arg, "%un", width);
>
> that you are currently using can potentially be dangerous.
> At the very least, depending on the platform, on endianness,
> and maybe depending on other aspects of the ABI, it might
> silently produce incorrect results.
>
>> You can see my discussion with a
>> colleague on the topic here:
>>
>> https://forums.irixnet.org/thread-1946-post-14522.html
> Very interesting, please keep me posted on how things work out.
> In particular, i would be interested to learn if any of your
> manual pages format poorly with mandoc.  But it's also
> interesting to follow how your work progresses in general.
>
> Regarding the topics discussed in your forum:
>
>   * Heirloom (not "Heritage") Troff has very high typesetting
>     quality in some respects, in some even better than groff,
>     for example it has paragraph-at-once filling.  It is also
>     somewhat maintained, but AFAIK more or less by a one-man
>     team (Carsten Kunze) with a handful of commits in 2019
>     and none so far in 2020.  It is not at all an obvious
>     choice as a manual page formatter.  I'm not aware of a
>     single operating system using it for that purpose.  Carsten
>     has invested some work to make it better for manual pages,
>     but you should still expect at least occasional compatibility
>     issues when using it for manual pages, much more frquently
>     than with mandoc or groff.
>
>   * With mandoc, all three options are workable:  installing source
>     manuals or preformatted manuals, and in the former case,
>     using mandoc as both the formatter and the man(1) implementation,
>     as for example OpenBSD and Alpine Linux do it, or using mandoc
>     as the formatter but a different man(1) implementation, as for
>     example FreeBSD and Illumos do it.
>     In OpenBSD, we switched in three steps: first we switched the
>     formatter from groff to mandoc in 2010, then we switched from
>     installing preformatted to installing source manuals in 2011,
>     and finally to the mandoc implementation of man(1) in 2014.
>     Alpine Linux has done all that in one single step in 2014.
> In general, i try to upstream patches when it is possible without
> causing unreasonable complication.  Often, that work progressed in
> multiple steps, slowly and carefully improving support for that
> platform, sometimes over years.  For example, in the beginning,
> building on Oracle Solaris 11 needed lots of handholding and various
> configure.local tweaks.  By now, even the older SunOS 5.10 just
> works out of the box.
>
> That sounds reasonable.
> Sure, see above, and keep me posted about how your work is progressing!
>
> Do you have a porting or packaging system in IRIX, in any way similar
> to BSD ports, or Linux packages, or MacPorts, or Homebrew, or AUR,
> or SlackBuilds, or Cashew?  If so, is there a distinction between
> official or unofficial ports/packages?  I'm wondering whether at some
> point, it might make sense to list IRIX on
>
>    https://mandoc.bsd.lv/ports.html
>
> but i'm a bit at a loss as to how...
>
> Could you re-test whether compiling mandoc from CVS HEAD on IRIX
> now already works a bit better for you, with smaller patches?
>
> Yours,
>    Ingo

--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Patching Mandoc for IRIX
  2020-06-22 22:09   ` Kazuo Kuroi
@ 2020-08-27 18:09     ` Ingo Schwarze
  2020-08-28 19:40       ` Kazuo Kuroi
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2020-08-27 18:09 UTC (permalink / raw)
  To: Kazuo Kuroi; +Cc: discuss

Hi Kazuo,

oh no, this thread got buried under other matters *again*.
This is getting really embarassing...  :-(

Kazuo Kuroi wrote on Mon, Jun 22, 2020 at 06:09:09PM -0400:

> IRIX is not free/open source, and we do not have access
> to the IRIX libc to change the printf() implementation.

Now i'm somewhat confused - how do you diagnose and fix security
vulnerabilities in libc in that case?  The system is no longer
maintained by the original vendor, right?

> I've considered adding an external override printf() replacement
> to my library I use to improve portability (libxg)

Yes.  My general philosophy is to write code according to current
standards, and if an older system lacks some function, provide
a replacement function if that can be done without undue effort.
I do not want to pollute the portable code with wrapper functions
or #ifdefs or even macro expansion.

Unfortunately, writing a portability wrapper for printf(3) that
modifies the format string to change %zu into whatever is appropriate
for the current platform is tricky and potentially dangerous, so
maybe it is better that you manually maintain patches for that,
unless or until we have a better idea.

> but let me quote a dev on my forums on the topic:
> 
> "The "%zun" format specifier to the printf family of functions is 
> problematic for us.  Newer libc implementations (GNU, BSD, ...) use the 
> "z" character as a length modifier to indicate the following "u" 
> argument is of type size_t.  IRIX libc doesn't support the "z" length 
> modifier at all.  So that's got to go, for starters.  I'm assuming 
> you're compiling as N32 code, so an unmodified "%u" should suffice since 
> size_t is defined as an unsigned int in /usr/include/sys/types.h.  (Use 
> %lu for 64-bit code where size_t is an unsigned long instead.  Which 
> clearly shows why the z modifier is needed for portable code!)

So far, i see what the problem is here.

Apart from the fact that fixing this in compat_* code would be tricky
and would be needed for few platforms (IRIX the only one known so far),
the detection is also tricky.  Besically, the test_* code would have
to compare sizeof(size_t) to native types like sizeof(unsigned)
and sizeof(unsigned long) and sizeof(unsigned long long) to make
the decision.  That's all somewhat ugly...

> Also, the trailing "n" is potentially problematic.  I'm not 100% sure, 
> but I think the code is intended to print a size_t followed by a literal 
> "n".

Correct.

> But IRIX libc seems to be interpreting it as the "n" "conversion 
> character" which requests the printf family of functions to put the 
> length of the printed string into a variable.

That sounds like a very dangerous security vulnerability in IRIX
libc.  The "%n" conversion specifier is supposed to cause writing
into a variable, and even that is rather risky even when implemented
and used correctly, but the literal letter "n" is absolutely not
supposed to do any such thing.  Also, each conversion specification
ends with the conversion specifier letter, in this case the 'u',
so whatever follows the 'u' is no longer part of the conversion
specification but just literal text.

You really need to get that bug fixed.  I sounds extremely dangerous
and and i can think of no way to work around it.  It is likely to
have dire consequences in any software you compile.

> In this case, the trailing n didn't cause any issues. I understand for 
> portability reasons you wouldn't want to change it, and that's totally 
> understandable.

Indeed, printing the literal 'n' right after the number is required,
roff(7) syntax dictates that it must be there.  Doing it in some
different way would cause substantial complication of the code,
and i don't think working around such a serious libc bug would be
reasonable.

> One way you could accommodate IRIX would be to use the 
> __sgi definition in configure,

I strongly dislike testing for platform IDs or version numbers in
autoconfiguration; it's fragile, never complete, and easily gets
outdated when platforms improve, which most platforms do all the
time.  Besides, the OpenBSD mips64 port also defines __sgi__, so
it idenfifies the CPU architecture rather than the operating system.
I guess NetBSD is likely to do even more of that kind because NetBSD
is famous for its wide range of hardware support.

> and then you could use perl or sed to 
> change the code, or you could throw a warning out in configure regarding 
> %zu in the code and we could work out a more conservative patch that 
> fixes just the %zu for those who stumble upon this. This mostly affects 
> the makewhatis commands, the actual mandoc binary appears to work fine.

I doubt that all else works fine without %zu.  That sequence is used
for

 * reporting of configuration errors in manpath.c
 * ctags(1) support for terminal output in term_tag.c
 * abstract syntax tree dumping in tree.c
 * PostScript and PDF generation in term_ps.c
 * mdoc(7) syntax validation in mdoc_validate.c

All that is potentially broken unless correctly patched.

> I would think that some other platforms like older AIX, Solaris, HP-UX 
> etc. may have this issue too, but I've not worked on those extensively.

F
Solaris 11 and Solaris 10 are definitely fine.  Solaris 9 is now
so old that i'm not very diligent about supporting it any longer.
There was a mandoc port for AIX many years ago, and people have
occasionally done light testing on AIX in recent years, but no
such issue came up.  I'm not aware that anyone ever tested on HP-UX,
that system doesn't appear to be used a lot.

> On your question of whether or not we have ports or anything, not 
> currently.

No problem, so i'll just link to these build instructions:

> I usually, for now, keep patches and references in "Xenopatches" here:
> http://gitea.irixce.org/Raion/Xenopatches/src/branch/master/mandoc
> 
> So pragmatically, once I figure out how to get it all together, you can 
> link here and I'll include a build instructions file. Once Nekoware II 
> is packaging tardists again, you can link back to nekoware II's homepage.

Thanks for your information about Nekoware II.

For now, i have added links to ports.html and porthistory.html
as shown below.  Speak up if you think there is a better way.

Yours,
  Ingo


Index: porthistory.html
===================================================================
RCS file: /home/cvs/mandoc/www/porthistory.html,v
retrieving revision 1.52
diff -u -r1.52 porthistory.html
--- porthistory.html	4 Mar 2020 03:19:07 -0000	1.52
+++ porthistory.html	27 Aug 2020 17:58:28 -0000
@@ -36,6 +36,7 @@
   illumos (<a href="https://github.com/illumos/illumos-gate/commit/cec8643b41ebefad6c677010fc784dc4bb0550f3#diff-46d91f95b4440f9432e65c9b3e674271">2019 May 30</a>, Michal Nowak)
   Alpine Linux (<a href="https://git.alpinelinux.org/cgit/aports/commit?id=a33e421da04f54e4a9398da416982720bbae84eb">2019 Aug 25</a>)
   Fedora (<a href="https://src.fedoraproject.org/rpms/mandoc/c/8e2f011858c0b699122ad5ceec4afba9564a5c4c">2019 Oct 16</a>, David Cantrell)
+  <strong>IRIX</strong> Nekoware II (<a href="http://gitea.irixce.org/Raion/Xenopatches/commit/f29efd3d02b4d336f1d2ab6682e51ca52ee636ee">2020 June 2</a>, Kazuo Kuroi)
   </li>
 <li>1.14.4 (<a href="/cgi-bin/cvsweb/NEWS#rev1.32">2018 Aug 8</a>):
   Void Linux (<a href="https://github.com/void-linux/void-packages/commit/9a366969487696e4d8743cd198fef084924814b4">2018 Aug 8</a>, Leah Neukirchen)
Index: ports.html
===================================================================
RCS file: /home/cvs/mandoc/www/ports.html,v
retrieving revision 1.78
diff -u -r1.78 ports.html
--- ports.html	4 Mar 2020 03:20:27 -0000	1.78
+++ ports.html	27 Aug 2020 17:54:42 -0000
@@ -245,6 +245,17 @@
     <td>&mdash;</td>
   </tr>
   <tr>
+    <td><a class="external" href="https://irixnet.org/">IRIX</a></td>
+    <td>1.14.5</td>
+    <td>&mdash;</td>
+    <td>&mdash;</td>
+    <td>2020 June 2</td>
+    <td>&mdash;</td>
+    <td>&mdash;</td>
+    <td>&mdash;</td>
+    <td>&mdash;</td>
+  </tr>
+  <tr>
     <td><a class="external" href="https://crux.nu/">Crux Linux</a></td>
     <td>1.14.3</td>
     <td>&mdash;</td>
@@ -559,6 +569,15 @@
       >macports/mandoc</a></td>
     <td>dito</td>
     <td>&mdash;</td>
+  </tr>
+  <tr>
+    <td><a class="external" href="https://irixnet.org/">IRIX</a></td>
+    <td><a class="external"
+      href="http://gitea.irixce.org/Raion/Xenopatches/commits/branch/master/mandoc"
+      >Xenopatches/mandoc</a></td>
+    <td>work in progress</td>
+    <td><a class="external" href="http://gitea.irixce.org/Raion"
+      >Kazuo Kuroi</a>, Nekoware II</td>
   </tr>
   <tr>
     <td><a class="external" href="https://crux.nu/">Crux Linux</a></td>
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Patching Mandoc for IRIX
  2020-08-27 18:09     ` Ingo Schwarze
@ 2020-08-28 19:40       ` Kazuo Kuroi
  2020-08-31 14:12         ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Kazuo Kuroi @ 2020-08-28 19:40 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: discuss

Hi Ingo,

This has gotten piled up and I've not had a chance to apply any of your 
fixes.

1. Libc on IRIX:

We're not looking to replace libc with say GNUlibc (oh gods no) and I 
understand the viewpoint that you don't want to pollute the code with 
workarounds. What I am suggesting possibly would be to have your 
configure script detect IRIX either by the preprocessor macro __sgi or 
by uname -a output and automatically advise them to check your ports 
page for a proper patch or something like that. Give me some time to see 
what I can do. . There's literally no chance of fixing the libc unless 
someone comes along and writes a 100% FOSS reimplementation, which I'm 
sure we both know is wishful thinking.

Similar to say HP-UX 11iv1 or Tru64, IRIX is 100% unsupported by its 
current IP holder (HP Enterprise) and hasn't received any patches since 
2006, it's also not FOSS and there's thus no real option but to create a 
wrapper or something around it. Sure, it's an issue, and this is not the 
last BSD-related project I've had these issues with, I have been 
researching byacc, bsd awk, openrsync (From OpenBSD), openm4 (also from 
BSD) etc. to figure out good FOSS replacements for IRIX tools.

Ironically, I don't have issues with %zu with GNU ports for the most 
part, and I've been developing a little code scanner to figure out how 
to go in and catch these issues before I try compiling so I can be on 
the lookout. It's not a bug; more like a lack of functionality. The 
trailing n, apparently, doesn't cause issues when I remove the z format 
specifier. I'm going to confirm that all of your %zun are for size_t, 
and I'll go in and double check that we've not introduced any nasty bugs 
in our downstream patches. The postscript functionality I've not tested, 
primarily because the main goal here is to avoid having to use GNU groff.

2. __sgi vs __sgi__

__sgi is different. IRIX doesn't define __sgi__ as to my knowledge. But 
IRIX is pretty much static at this point, so uname output and platform 
IDs aren't going to change. AFAIK, OpenBSD no longer supports SGI 
hardware, so why their other MIPS ports would define that is probably a 
bug unless it's using a straight port of IRIX n32/n64 ABIs as Linux MIPS 
does (There's so many ABI similarities other than syscalls/libc 
differences it's kinda uncanny).

Fair point either way, but in that case what alternatives do you 
suggest? I think it's fair to just point us downstream if that's all you 
can do it.

3. Conclusions

To end my message for now I want to explain that the reason people still 
use IRIX is because of its uniqueness and historical value. I certainly 
am understanding that I can't expect you to make all kinds of special 
accomodations for us, all the same I do appreciate your help. For me, 
and most of the community, Linux and BSD on SGI hardware misses the 
point. Without the original environment, it's just below-average 
underpowered hardware. if I wanted a high performance RISC system 
without that baggage, I'd use POWER64el. So we're kinda stuck with IRIX 
since it's a system of interest.

For now, I'll check the latest CVS revisions, see if that solves some of 
the issues mentioned, and I'll do some more extensive testing just to 
verify if I have any more issues. Will take me a bit of time before you 
hear from me again!

Best Regards,

Kazuo


On 8/27/2020 2:09 PM, Ingo Schwarze wrote:
> Hi Kazuo,
>
> oh no, this thread got buried under other matters *again*.
> This is getting really embarassing...  :-(
>
> Kazuo Kuroi wrote on Mon, Jun 22, 2020 at 06:09:09PM -0400:
>
>> IRIX is not free/open source, and we do not have access
>> to the IRIX libc to change the printf() implementation.
> Now i'm somewhat confused - how do you diagnose and fix security
> vulnerabilities in libc in that case?  The system is no longer
> maintained by the original vendor, right?
>
>> I've considered adding an external override printf() replacement
>> to my library I use to improve portability (libxg)
> Yes.  My general philosophy is to write code according to current
> standards, and if an older system lacks some function, provide
> a replacement function if that can be done without undue effort.
> I do not want to pollute the portable code with wrapper functions
> or #ifdefs or even macro expansion.
>
> Unfortunately, writing a portability wrapper for printf(3) that
> modifies the format string to change %zu into whatever is appropriate
> for the current platform is tricky and potentially dangerous, so
> maybe it is better that you manually maintain patches for that,
> unless or until we have a better idea.
>
>> but let me quote a dev on my forums on the topic:
>>
>> "The "%zun" format specifier to the printf family of functions is
>> problematic for us.  Newer libc implementations (GNU, BSD, ...) use the
>> "z" character as a length modifier to indicate the following "u"
>> argument is of type size_t.  IRIX libc doesn't support the "z" length
>> modifier at all.  So that's got to go, for starters.  I'm assuming
>> you're compiling as N32 code, so an unmodified "%u" should suffice since
>> size_t is defined as an unsigned int in /usr/include/sys/types.h.  (Use
>> %lu for 64-bit code where size_t is an unsigned long instead.  Which
>> clearly shows why the z modifier is needed for portable code!)
> So far, i see what the problem is here.
>
> Apart from the fact that fixing this in compat_* code would be tricky
> and would be needed for few platforms (IRIX the only one known so far),
> the detection is also tricky.  Besically, the test_* code would have
> to compare sizeof(size_t) to native types like sizeof(unsigned)
> and sizeof(unsigned long) and sizeof(unsigned long long) to make
> the decision.  That's all somewhat ugly...
>
>> Also, the trailing "n" is potentially problematic.  I'm not 100% sure,
>> but I think the code is intended to print a size_t followed by a literal
>> "n".
> Correct.
>
>> But IRIX libc seems to be interpreting it as the "n" "conversion
>> character" which requests the printf family of functions to put the
>> length of the printed string into a variable.
> That sounds like a very dangerous security vulnerability in IRIX
> libc.  The "%n" conversion specifier is supposed to cause writing
> into a variable, and even that is rather risky even when implemented
> and used correctly, but the literal letter "n" is absolutely not
> supposed to do any such thing.  Also, each conversion specification
> ends with the conversion specifier letter, in this case the 'u',
> so whatever follows the 'u' is no longer part of the conversion
> specification but just literal text.
>
> You really need to get that bug fixed.  I sounds extremely dangerous
> and and i can think of no way to work around it.  It is likely to
> have dire consequences in any software you compile.
>
>> In this case, the trailing n didn't cause any issues. I understand for
>> portability reasons you wouldn't want to change it, and that's totally
>> understandable.
> Indeed, printing the literal 'n' right after the number is required,
> roff(7) syntax dictates that it must be there.  Doing it in some
> different way would cause substantial complication of the code,
> and i don't think working around such a serious libc bug would be
> reasonable.
>
>> One way you could accommodate IRIX would be to use the
>> __sgi definition in configure,
> I strongly dislike testing for platform IDs or version numbers in
> autoconfiguration; it's fragile, never complete, and easily gets
> outdated when platforms improve, which most platforms do all the
> time.  Besides, the OpenBSD mips64 port also defines __sgi__, so
> it idenfifies the CPU architecture rather than the operating system.
> I guess NetBSD is likely to do even more of that kind because NetBSD
> is famous for its wide range of hardware support.
>
>> and then you could use perl or sed to
>> change the code, or you could throw a warning out in configure regarding
>> %zu in the code and we could work out a more conservative patch that
>> fixes just the %zu for those who stumble upon this. This mostly affects
>> the makewhatis commands, the actual mandoc binary appears to work fine.
> I doubt that all else works fine without %zu.  That sequence is used
> for
>
>   * reporting of configuration errors in manpath.c
>   * ctags(1) support for terminal output in term_tag.c
>   * abstract syntax tree dumping in tree.c
>   * PostScript and PDF generation in term_ps.c
>   * mdoc(7) syntax validation in mdoc_validate.c
>
> All that is potentially broken unless correctly patched.
>
>> I would think that some other platforms like older AIX, Solaris, HP-UX
>> etc. may have this issue too, but I've not worked on those extensively.
> F
> Solaris 11 and Solaris 10 are definitely fine.  Solaris 9 is now
> so old that i'm not very diligent about supporting it any longer.
> There was a mandoc port for AIX many years ago, and people have
> occasionally done light testing on AIX in recent years, but no
> such issue came up.  I'm not aware that anyone ever tested on HP-UX,
> that system doesn't appear to be used a lot.
>
>> On your question of whether or not we have ports or anything, not
>> currently.
> No problem, so i'll just link to these build instructions:
>
>> I usually, for now, keep patches and references in "Xenopatches" here:
>> http://gitea.irixce.org/Raion/Xenopatches/src/branch/master/mandoc
>>
>> So pragmatically, once I figure out how to get it all together, you can
>> link here and I'll include a build instructions file. Once Nekoware II
>> is packaging tardists again, you can link back to nekoware II's homepage.
> Thanks for your information about Nekoware II.
>
> For now, i have added links to ports.html and porthistory.html
> as shown below.  Speak up if you think there is a better way.
>
> Yours,
>    Ingo
>
>
> Index: porthistory.html
> ===================================================================
> RCS file: /home/cvs/mandoc/www/porthistory.html,v
> retrieving revision 1.52
> diff -u -r1.52 porthistory.html
> --- porthistory.html	4 Mar 2020 03:19:07 -0000	1.52
> +++ porthistory.html	27 Aug 2020 17:58:28 -0000
> @@ -36,6 +36,7 @@
>     illumos (<a href="https://github.com/illumos/illumos-gate/commit/cec8643b41ebefad6c677010fc784dc4bb0550f3#diff-46d91f95b4440f9432e65c9b3e674271">2019 May 30</a>, Michal Nowak)
>     Alpine Linux (<a href="https://git.alpinelinux.org/cgit/aports/commit?id=a33e421da04f54e4a9398da416982720bbae84eb">2019 Aug 25</a>)
>     Fedora (<a href="https://src.fedoraproject.org/rpms/mandoc/c/8e2f011858c0b699122ad5ceec4afba9564a5c4c">2019 Oct 16</a>, David Cantrell)
> +  <strong>IRIX</strong> Nekoware II (<a href="http://gitea.irixce.org/Raion/Xenopatches/commit/f29efd3d02b4d336f1d2ab6682e51ca52ee636ee">2020 June 2</a>, Kazuo Kuroi)
>     </li>
>   <li>1.14.4 (<a href="/cgi-bin/cvsweb/NEWS#rev1.32">2018 Aug 8</a>):
>     Void Linux (<a href="https://github.com/void-linux/void-packages/commit/9a366969487696e4d8743cd198fef084924814b4">2018 Aug 8</a>, Leah Neukirchen)
> Index: ports.html
> ===================================================================
> RCS file: /home/cvs/mandoc/www/ports.html,v
> retrieving revision 1.78
> diff -u -r1.78 ports.html
> --- ports.html	4 Mar 2020 03:20:27 -0000	1.78
> +++ ports.html	27 Aug 2020 17:54:42 -0000
> @@ -245,6 +245,17 @@
>       <td>&mdash;</td>
>     </tr>
>     <tr>
> +    <td><a class="external" href="https://irixnet.org/">IRIX</a></td>
> +    <td>1.14.5</td>
> +    <td>&mdash;</td>
> +    <td>&mdash;</td>
> +    <td>2020 June 2</td>
> +    <td>&mdash;</td>
> +    <td>&mdash;</td>
> +    <td>&mdash;</td>
> +    <td>&mdash;</td>
> +  </tr>
> +  <tr>
>       <td><a class="external" href="https://crux.nu/">Crux Linux</a></td>
>       <td>1.14.3</td>
>       <td>&mdash;</td>
> @@ -559,6 +569,15 @@
>         >macports/mandoc</a></td>
>       <td>dito</td>
>       <td>&mdash;</td>
> +  </tr>
> +  <tr>
> +    <td><a class="external" href="https://irixnet.org/">IRIX</a></td>
> +    <td><a class="external"
> +      href="http://gitea.irixce.org/Raion/Xenopatches/commits/branch/master/mandoc"
> +      >Xenopatches/mandoc</a></td>
> +    <td>work in progress</td>
> +    <td><a class="external" href="http://gitea.irixce.org/Raion"
> +      >Kazuo Kuroi</a>, Nekoware II</td>
>     </tr>
>     <tr>
>       <td><a class="external" href="https://crux.nu/">Crux Linux</a></td>
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Patching Mandoc for IRIX
  2020-08-28 19:40       ` Kazuo Kuroi
@ 2020-08-31 14:12         ` Ingo Schwarze
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Schwarze @ 2020-08-31 14:12 UTC (permalink / raw)
  To: discuss; +Cc: Kazuo Kuroi

Hi Kazuo,

Kazuo Kuroi wrote on Fri, Aug 28, 2020 at 03:40:26PM -0400:

> We're not looking to replace libc with say GNUlibc (oh gods no)

That makes sense to me.  Replacing the complete libc would be a very
drastic change in any operating system.  It can easily be contrary
to project goals.

> and I 
> understand the viewpoint that you don't want to pollute the code with 
> workarounds. What I am suggesting possibly would be to have your 
> configure script detect IRIX either by the preprocessor macro __sgi or 
> by uname -a output and automatically advise them to check your ports 
> page for a proper patch or something like that.

Again, i don't like testing for platform names like that.  Imagine
one day, you find a way to really fix the problem in IRIX - that could
happen even if today, you don't know yet how.  Then such a test would
deliver totally bogus advice to users, and i'm not likely to even
notice because i don't use IRIX daily.  I hate it when application
software makes libellious statements about operating systems merely
because they had a problem at some point in the past.

Also imagine somebody else has another system that also chokes on %zu.
Testing for __sgi doesn't help with that at all.

But now your input gave me a somewhat similar idea.  Maybe i should
write an automated test that detects whether printf("%zu...", ...)
works as expected, and if it doesn't, print a diagnostic message
advising the user what exactly the problem is, why it can't be fixed
automatically, and how they can fix it manually (for example, using
a patch you provide on your website).

I don't think i will do that for the upcoming mandoc realease yet...

> Give me some time to see what I can do.

 ... to give you a chance to think it through: whether that approach
makes sense to you, too.

> There's literally no chance of fixing the libc unless 
> someone comes along and writes a 100% FOSS reimplementation,
> which I'm sure we both know is wishful thinking.

It seems to me you need some way to fix at least security bugs one
way or another, and the misinterpretion of "%zun" as "something + %n"
does indeed look like a security vulnerability to me.

I'm not convinced fixing a bad bug in a single function requiires
replacing the whole library.  Even without decompiling, it is likely
possible to replace an individual buggy function in the library.
Something like

   $ ld --shared -o libc.so.new printf.o libc.so.old

might be all that is needed, maybe with a few more options, if
necessary replacing printf.o with something like vfprintf.o if it
turns out the actual bug is in that function.  Of course, you need
to write your own printf.c or vfprintf.c or whatever or grab one
from a freely licensed C library (for example a BSD one). If that
isn't possible, using a feature like LD_PRELOAD might be another
option, again only overriding functions that are actually broken,
not rewriting everything from scratch.

Either way, problems like this require solutions on the operating
system level, not on the application software level.

> Ironically, I don't have issues with %zu with GNU ports for the most 
> part,

GNU projects using autoconf are notorious for not using the printf(3)
implementation from the native libc.  Autoconf does tests for some
subfeatures of printf(3), is extremely picky about the results,
usually concludes the native libc is no good and proceeds to use
some bundled version of printf(3) instead.  That of course hides
potential issues and instead creates others, if the bundled (GNU)
implementation isn't fully compatible with what users expect on the
operating system in question.

I hate it, though, when application programs test for operating
system bugs (rather than whether features are supported) and then
replace components of the operating system merely because they
don't like what they see, so i'm not going to do that.

If the OS provides a function, i will use it and rely on the operating
system maintainers to make sure it works.  It is not my job as an
application developer to second-guess operating system maintainers.
This dog also bites the other way: imagine a libc function has a
common bug in all operating systems.  It is reported and all systems
fix it, but most application software (in particular programs
endulging in autoconf orgies) still suffer from it because they
considered it smart to bundle and use their own copy.  That's a
bummer, isn't it?

> and I've been developing a little code scanner to figure out how 
> to go in and catch these issues before I try compiling so I can be on 
> the lookout. It's not a bug; more like a lack of functionality. The 
> trailing n, apparently, doesn't cause issues when I remove the z format 
> specifier. I'm going to confirm that all of your %zun are for size_t, 

If you find any that aren't, please tell me, because that might indicate
a bug in mandoc...

> and I'll go in and double check that we've not introduced any nasty bugs 
> in our downstream patches. The postscript functionality I've not tested, 
> primarily because the main goal here is to avoid having to use GNU groff.
> 
> 2. __sgi vs __sgi__
> 
> __sgi is different. IRIX doesn't define __sgi__ as to my knowledge. But 
> IRIX is pretty much static at this point, so uname output and platform 
> IDs aren't going to change. AFAIK, OpenBSD no longer supports SGI 
> hardware,

Oh, you are right, the sgi port was dropped recently:

  https://www.openbsd.org/sgi.html

> so why their other MIPS ports would define that is probably a 
> bug unless it's using a straight port of IRIX n32/n64 ABIs as Linux MIPS 
> does (There's so many ABI similarities other than syscalls/libc 
> differences it's kinda uncanny).
> 
> Fair point either way, but in that case what alternatives do you 
> suggest? I think it's fair to just point us downstream if that's all you 
> can do it.

See above for another idea.

> 3. Conclusions
> 
> To end my message for now I want to explain that the reason people still 
> use IRIX is because of its uniqueness and historical value. I certainly 
> am understanding that I can't expect you to make all kinds of special 
> accomodations for us, all the same I do appreciate your help. For me, 
> and most of the community, Linux and BSD on SGI hardware misses the 
> point. Without the original environment, it's just below-average 
> underpowered hardware. if I wanted a high performance RISC system 
> without that baggage, I'd use POWER64el. So we're kinda stuck with IRIX 
> since it's a system of interest.

Oh, i see.  Yes, that makes sense to me.

> For now, I'll check the latest CVS revisions, see if that solves some of 
> the issues mentioned, and I'll do some more extensive testing just to 
> verify if I have any more issues. Will take me a bit of time before you 
> hear from me again!

Sure, there is no hurry.

I appreciate that you consider and investigate all this seriously.

We can probably iterate a bit, both sides moving closer to the
other in baby steps (a bit of detection on my side, smaller and
cleaner patches on your side, even more bits of detection here,
even smaller and cleaner patches there, and so on), but it need
not be finished next week.

Yours,
  Ingo
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-08-31 14:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-02 23:29 Patching Mandoc for IRIX Kazuo Kuroi
2020-06-22 21:44 ` Ingo Schwarze
2020-06-22 22:09   ` Kazuo Kuroi
2020-08-27 18:09     ` Ingo Schwarze
2020-08-28 19:40       ` Kazuo Kuroi
2020-08-31 14:12         ` Ingo Schwarze

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).