discuss@mandoc.bsd.lv
 help / color / mirror / Atom feed
From: Kazuo Kuroi <kazuo@irixnet.org>
To: discuss@mandoc.bsd.lv
Subject: Re: Patching Mandoc for IRIX
Date: Mon, 22 Jun 2020 18:09:09 -0400	[thread overview]
Message-ID: <adc90b4b-4092-b514-dca7-d4e8d257ce26@irixnet.org> (raw)
In-Reply-To: <20200622214406.GD93760@athene.usta.de>

Hi Ingo!

Thank you for getting back to me, I appreciate the professionalism! 
Don't worry about the specific CFLAGS, -mips4 should work as well and I 
was just demonstrating what my particular use case requested. -O3 was 
just a test, and it got roped into the patch, my mistake!

I'm glad to hear you're taking my feedback seriously, let me speak on 
one point though:

IRIX is not free/open source, and we do not have access to the IRIX libc 
to change the printf() implementation. I've considered adding an 
external override printf() replacement to my library I use to improve 
portability (libxg) but let me quote a dev on my forums on the topic:

"The "%zun" format specifier to the printf family of functions is 
problematic for us.  Newer libc implementations (GNU, BSD, ...) use the 
"z" character as a length modifier to indicate the following "u" 
argument is of type size_t.  IRIX libc doesn't support the "z" length 
modifier at all.  So that's got to go, for starters.  I'm assuming 
you're compiling as N32 code, so an unmodified "%u" should suffice since 
size_t is defined as an unsigned int in /usr/include/sys/types.h.  (Use 
%lu for 64-bit code where size_t is an unsigned long instead.  Which 
clearly shows why the z modifier is needed for portable code!)

Also, the trailing "n" is potentially problematic.  I'm not 100% sure, 
but I think the code is intended to print a size_t followed by a literal 
"n".  But IRIX libc seems to be interpreting it as the "n" "conversion 
character" which requests the printf family of functions to put the 
length of the printed string into a variable.  But the next argument to 
the printf call (from mdoc_validate.c) is an int rather than a pointer 
to an int.  That would explain the segmentation fault:  printf treating 
that integer as a pointer means it is trying to write to memory the 
process doesn't own."

In this case, the trailing n didn't cause any issues. I understand for 
portability reasons you wouldn't want to change it, and that's totally 
understandable. One way you could accommodate IRIX would be to use the 
__sgi definition in configure, and then you could use perl or sed to 
change the code, or you could throw a warning out in configure regarding 
%zu in the code and we could work out a more conservative patch that 
fixes just the %zu for those who stumble upon this. This mostly affects 
the makewhatis commands, the actual mandoc binary appears to work fine.

I would think that some other platforms like older AIX, Solaris, HP-UX 
etc. may have this issue too, but I've not worked on those extensively.

Pragmatically, after I got a copy of the DWB, I think the best option 
for our usecase is to use mandoc because the implementation is just 
"Better" than the others here, and it doesn't preclude groff or nroff 
for others. The native awf used in the base IRIX has its manpages packed 
in .z pack files, I still need to test if we can use the "man" but if 
not, I am gonna make a script with a cronjob that allows for the man 
files to be packed up easily.

On your question of whether or not we have ports or anything, not 
currently. There's two main projects looking to "modernize" IRIX, each 
with its own way of doing things:

Nekoware II - Named after the old forum nekochan.net's premiere project, 
which was Nekoware. We follow the same philosophy of the base of the 
ports should be compiled with the native compiler (MIPSPro) while 
projects that are too "new" can use GCC as needed. We package stuff in 
the .tardist native format, basically a tarball with the 
idb/spec/distfile format used by IRIX since the 4D1 68k era. It's a bit 
clunky, but we prefer to kind of keep things historical.

I'm a project head for Nekoware II - I don't do development that 
intensively, I'm basically a guy who knows how to beat on other people's 
code to make it work. I work more as a documenter, packager and 
occasional supervisor.

SGUG RSE - A competing project by the sgi.sh forum. They have ported RPM 
to IRIX and are using an SRPM approach with GCC and distcc. I do not 
know but so much about this, but they are not currently packaging mandoc 
as they use GNU GROFF.

I usually, for now, keep patches and references in "Xenopatches" here:
http://gitea.irixce.org/Raion/Xenopatches/src/branch/master/mandoc

So pragmatically, once I figure out how to get it all together, you can 
link here and I'll include a build instructions file. Once Nekoware II 
is packaging tardists again, you can link back to nekoware II's homepage.


Thank you very much.

On 6/22/2020 5:44 PM, Ingo Schwarze wrote:
> Hello,
>
> sorry for taking so long to respond, i was somewhat distracted by other
> matters and now found your message while systematically looking for
> unanswered messages regarding mandoc.
>
>
> Kazuo Kuroi wrote on Tue, Jun 02, 2020 at 07:29:52PM -0400:
>
>> I have patched mandoc to work with IRIX, but I have a feeling that the
>> fixes made will require some changes. Let me first explain my goals here:
>>
>> I am patching mandoc to work with MIPSPro, the native compiler of IRIX,
> That sounds good and interesting.  I like it when people try to make
> software work with native compilers that is intended to be portable,
> rather than following a mindless reflex like "oh let's just require
> gcc or clang to compile this".
>
>> so I'll have to explain the changes I made to get it to build:
>>
>> http://gitea.irixce.org/Raion/Xenopatches/raw/branch/master/mandoc/mandoc.patch
>>
>> Here's the patch file.
>>
>> First change is because it doesn't reliably detect the c99 driver, and
>> as the code uses non ANSI-related things, it needs to detect c99. This
>> can probably be disregarded,
> Possibly.  I tried to include code in ./configure in the past to
> automatically detect the system compiler, but that attempt caused
> more grief than benefit.  Few platforms needed it, and more ended
> up having trouble with the attempted test.
>
> So for the next release, i decided to simplify things by just
> statically setting CC=cc in ./configure and inviting users to
> set CC in configure.local to whatever is appropriate on their
> platform - but of course only if "cc" is not.
>
> Arguably, given that POSIX defines c99 but not cc, it might be better
> to make CC=c99 the default.  Probably, the only reason that i didn't
> is somewhat weak:  My main development platform, OpenBSD, provides
> a cc(1) command but does not provide a c99(1) command.  Maybe it
> should, but oh well.
>
> For now, i suggest you keep CC=c99 in the configure.local file for IRIX.
>
>> as can the CFLAGS reference.
> I certainly won't put -O3 into CFLAGS by default, or any other
> optimization flag for that matter.  I think operating systems should
> have defaults that are appropriate for compiling general purpose
> application software that has no special needs, and i consider it
> annoying when upstream maintainers tweak compiler flags like that.
>
> However, if there is some compelling reason why IRIX needs -O3,
> then you can put that into the configure.local file for IRIX, though
> i'm a bit at a loss trying to guess why that could possibly be
> needed.
>
> If IRIX needs -mips3, that definitely needs to go into configure.local.
> I have no idea right now how i could possibly test for that.
>
>> Next one is in mandoc.h, and it's because MIPSPro doesn't support the
>> __attribute__ block. This could be fixed with a guard for non-GCC
>> compilers, like this:
>> #ifdef __GNUC__
>> __attribute__((__format__ (__printf__, 4, 5)));
>> #endif
> Actually, i was already doing that, ./configure wrote something
> like that into config.h.  However, some of the *.c files used private
> headers using __attribute__ without including config.h first.  I
> just fixed that with the first one of the two commits below.
> Thanks for reporting the issue!
>
> While there, i also replaced the horrible __GNUC__ version number
> test by a proper feature test.  No idea why we didn't have that
> in the first place, it really wasn't difficult to write.
>
>> Or something. I would hope that you won't lock it out to GCC or clang,
> Absolutely not, that's not my intention at all.
>
>> because I'm sure there's other compilers this thing chokes on.
> Right.  Only it's sometimes hard to get reports from other compilers,
> so thanks for what you are reporting.
>
>> The rest are to fix the IRIX printf() implementation, which doesn't
>> allow for %zu as mandoc currently does.
> I would strongly recommend that you fix your IRIX libc to support %zu
> because that has been required by POSIX for more than a decade
>
>    https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html
>
> and because not having it, which often results in using format
> specifiers for the wrong length, can cause security issues.
> Maybe not in mandoc, but in general: i don't doubt that you also
> run software on IRIX that is more prone to develop security
> vulnerabilities.  Besides, i would be surprised if closing that
> particular feature gap in the IRIX libc would cause particularly
> large amounts of work.
>
> Given that apart from the the notoriously quirky and extremely
> outdated SunOS 5.9 (2002 version), this is the first time i'm hearing
> about a system that does not support %zu - even SunOS 5.10 (2009
> version) supports that - i'm a bit hesitant to write a feature test
> for that, also because using the result of the feature test in the
> code would be somewhat intrusive.
>
> So until you come around to fixing your libc, is think carrying
> local patches for %zu is the best you can do.  But i suggest you
> use the following idiom for these patches:
>
>    mandoc_asprintf(arg, "%llun", (unsigned long long)width);
>
> That should be perfectly safe and perfectly portable, whereas the
>
>    size_t width = ...;
>    mandoc_asprintf(arg, "%un", width);
>
> that you are currently using can potentially be dangerous.
> At the very least, depending on the platform, on endianness,
> and maybe depending on other aspects of the ABI, it might
> silently produce incorrect results.
>
>> You can see my discussion with a
>> colleague on the topic here:
>>
>> https://forums.irixnet.org/thread-1946-post-14522.html
> Very interesting, please keep me posted on how things work out.
> In particular, i would be interested to learn if any of your
> manual pages format poorly with mandoc.  But it's also
> interesting to follow how your work progresses in general.
>
> Regarding the topics discussed in your forum:
>
>   * Heirloom (not "Heritage") Troff has very high typesetting
>     quality in some respects, in some even better than groff,
>     for example it has paragraph-at-once filling.  It is also
>     somewhat maintained, but AFAIK more or less by a one-man
>     team (Carsten Kunze) with a handful of commits in 2019
>     and none so far in 2020.  It is not at all an obvious
>     choice as a manual page formatter.  I'm not aware of a
>     single operating system using it for that purpose.  Carsten
>     has invested some work to make it better for manual pages,
>     but you should still expect at least occasional compatibility
>     issues when using it for manual pages, much more frquently
>     than with mandoc or groff.
>
>   * With mandoc, all three options are workable:  installing source
>     manuals or preformatted manuals, and in the former case,
>     using mandoc as both the formatter and the man(1) implementation,
>     as for example OpenBSD and Alpine Linux do it, or using mandoc
>     as the formatter but a different man(1) implementation, as for
>     example FreeBSD and Illumos do it.
>     In OpenBSD, we switched in three steps: first we switched the
>     formatter from groff to mandoc in 2010, then we switched from
>     installing preformatted to installing source manuals in 2011,
>     and finally to the mandoc implementation of man(1) in 2014.
>     Alpine Linux has done all that in one single step in 2014.
> In general, i try to upstream patches when it is possible without
> causing unreasonable complication.  Often, that work progressed in
> multiple steps, slowly and carefully improving support for that
> platform, sometimes over years.  For example, in the beginning,
> building on Oracle Solaris 11 needed lots of handholding and various
> configure.local tweaks.  By now, even the older SunOS 5.10 just
> works out of the box.
>
> That sounds reasonable.
> Sure, see above, and keep me posted about how your work is progressing!
>
> Do you have a porting or packaging system in IRIX, in any way similar
> to BSD ports, or Linux packages, or MacPorts, or Homebrew, or AUR,
> or SlackBuilds, or Cashew?  If so, is there a distinction between
> official or unofficial ports/packages?  I'm wondering whether at some
> point, it might make sense to list IRIX on
>
>    https://mandoc.bsd.lv/ports.html
>
> but i'm a bit at a loss as to how...
>
> Could you re-test whether compiling mandoc from CVS HEAD on IRIX
> now already works a bit better for you, with smaller patches?
>
> Yours,
>    Ingo

--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


  reply	other threads:[~2020-06-22 22:09 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-02 23:29 Kazuo Kuroi
2020-06-22 21:44 ` Ingo Schwarze
2020-06-22 22:09   ` Kazuo Kuroi [this message]
2020-08-27 18:09     ` Ingo Schwarze
2020-08-28 19:40       ` Kazuo Kuroi
2020-08-31 14:12         ` Ingo Schwarze

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adc90b4b-4092-b514-dca7-d4e8d257ce26@irixnet.org \
    --to=kazuo@irixnet.org \
    --cc=discuss@mandoc.bsd.lv \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).