From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED autolearn=no autolearn_force=no version=3.4.4 Received: (qmail 14253 invoked from network); 22 Jun 2020 22:09:27 -0000 Received: from bsd.lv (HELO mandoc.bsd.lv) (66.111.2.12) by inbox.vuxu.org with ESMTPUTF8; 22 Jun 2020 22:09:27 -0000 Received: from fantadrom.bsd.lv (localhost [127.0.0.1]) by mandoc.bsd.lv (OpenSMTPD) with ESMTP id b45cdfb8 for ; Mon, 22 Jun 2020 17:09:25 -0500 (EST) Received: from sender4-of-o52.zoho.com (sender4-of-o52.zoho.com [136.143.188.52]) by mandoc.bsd.lv (OpenSMTPD) with ESMTP id ab13c41f for ; Mon, 22 Jun 2020 17:09:15 -0500 (EST) ARC-Seal: i=1; a=rsa-sha256; t=1592863754; cv=none; d=zohomail.com; s=zohoarc; b=LiAHzFvj4wZcxXNicqgc7PzjC8b+RYFv3idiaJ8SdqtOlLyVb5v9CjMo+TRtdLLyE9xEdplPCTgehkhBCMi31CY2vh5qw9VyqD5c6r4lfT+sUuT7XWZblZUCuDWfCoL+KM1oRqsOgXXgl6k++tseTdtjW0dOZKBzTkUWWT1aXRk= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1592863754; h=Content-Type:Content-Transfer-Encoding:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To; bh=JZ1PB66tYc2uEa5b9K1AZaj0dwYc4PKhHD9wNZ0290M=; b=WkwF3C7k3jOpQ7UiwdP/xUfAn9xHMMnI5dzJThdMIQBaBFM2eoHcPoh32W8UglpOOKJWXcTiQ0GntL6DgVwoG5iAr1eFkMOdCJRijZyLojbG8nKxHbqkK7yeaQQoBEPx+UgGoK1GlKqBJCHMCo+Qj+EsyIVCp6oafKbAGWA+dm8= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=irixnet.org; spf=pass smtp.mailfrom=kazuo@irixnet.org; dmarc=pass header.from= header.from= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1592863754; s=default; d=irixnet.org; i=kazuo@irixnet.org; h=Subject:To:References:From:Message-ID:Date:MIME-Version:In-Reply-To:Content-Type:Content-Transfer-Encoding; bh=JZ1PB66tYc2uEa5b9K1AZaj0dwYc4PKhHD9wNZ0290M=; b=XN3sDGsdr2ArBvSUV3PhwsKAk6/rSiZO1nw6SWCAx+Spe4+CpplByRFgNitQTXrr mHWsAXr9yEqydhT0papnsuKBW5xCCai+PM9luyXCsTnweZ9bdX5DGHehovtR9l8CW8+ L/3ydSMdh4fyBtlD40yZoywx3Fdtof+4YNIrX1V8= Received: from [10.8.253.173] (172.93.226.174 [172.93.226.174]) by mx.zohomail.com with SMTPS id 1592863752675287.7250116034577; Mon, 22 Jun 2020 15:09:12 -0700 (PDT) Subject: Re: Patching Mandoc for IRIX To: discuss@mandoc.bsd.lv References: <20200622214406.GD93760@athene.usta.de> From: Kazuo Kuroi Message-ID: Date: Mon, 22 Jun 2020 18:09:09 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0 X-Mailinglist: mandoc-discuss Reply-To: discuss@mandoc.bsd.lv MIME-Version: 1.0 In-Reply-To: <20200622214406.GD93760@athene.usta.de> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable Content-Language: en-US X-ZohoMailClient: External Hi Ingo! Thank you for getting back to me, I appreciate the professionalism!=20 Don't worry about the specific CFLAGS, -mips4 should work as well and I=20 was just demonstrating what my particular use case requested. -O3 was=20 just a test, and it got roped into the patch, my mistake! I'm glad to hear you're taking my feedback seriously, let me speak on=20 one point though: IRIX is not free/open source, and we do not have access to the IRIX libc=20 to change the printf() implementation. I've considered adding an=20 external override printf() replacement to my library I use to improve=20 portability (libxg) but let me quote a dev on my forums on the topic: "The "%zun" format specifier to the printf family of functions is=20 problematic for us.=C2=A0 Newer libc implementations (GNU, BSD, ...) use th= e=20 "z" character as a length modifier to indicate the following "u"=20 argument is of type size_t.=C2=A0 IRIX libc doesn't support the "z" length= =20 modifier at all.=C2=A0 So that's got to go, for starters.=C2=A0 I'm assumin= g=20 you're compiling as N32 code, so an unmodified "%u" should suffice since=20 size_t is defined as an unsigned int in /usr/include/sys/types.h.=C2=A0 (Us= e=20 %lu for 64-bit code where size_t is an unsigned long instead.=C2=A0 Which= =20 clearly shows why the z modifier is needed for portable code!) Also, the trailing "n" is potentially problematic.=C2=A0 I'm not 100% sure,= =20 but I think the code is intended to print a size_t followed by a literal=20 "n".=C2=A0 But IRIX libc seems to be interpreting it as the "n" "conversion= =20 character" which requests the printf family of functions to put the=20 length of the printed string into a variable.=C2=A0 But the next argument t= o=20 the printf call (from mdoc_validate.c) is an int rather than a pointer=20 to an int.=C2=A0 That would explain the segmentation fault:=C2=A0 printf tr= eating=20 that integer as a pointer means it is trying to write to memory the=20 process doesn't own." In this case, the trailing n didn't cause any issues. I understand for=20 portability reasons you wouldn't want to change it, and that's totally=20 understandable. One way you could accommodate IRIX would be to use the=20 __sgi definition in configure, and then you could use perl or sed to=20 change the code, or you could throw a warning out in configure regarding=20 %zu in the code and we could work out a more conservative patch that=20 fixes just the %zu for those who stumble upon this. This mostly affects=20 the makewhatis commands, the actual mandoc binary appears to work fine. I would think that some other platforms like older AIX, Solaris, HP-UX=20 etc. may have this issue too, but I've not worked on those extensively. Pragmatically, after I got a copy of the DWB, I think the best option=20 for our usecase is to use mandoc because the implementation is just=20 "Better" than the others here, and it doesn't preclude groff or nroff=20 for others. The native awf used in the base IRIX has its manpages packed=20 in .z pack files, I still need to test if we can use the "man" but if=20 not, I am gonna make a script with a cronjob that allows for the man=20 files to be packed up easily. On your question of whether or not we have ports or anything, not=20 currently. There's two main projects looking to "modernize" IRIX, each=20 with its own way of doing things: Nekoware II - Named after the old forum nekochan.net's premiere project,=20 which was Nekoware. We follow the same philosophy of the base of the=20 ports should be compiled with the native compiler (MIPSPro) while=20 projects that are too "new" can use GCC as needed. We package stuff in=20 the .tardist native format, basically a tarball with the=20 idb/spec/distfile format used by IRIX since the 4D1 68k era. It's a bit=20 clunky, but we prefer to kind of keep things historical. I'm a project head for Nekoware II - I don't do development that=20 intensively, I'm basically a guy who knows how to beat on other people's=20 code to make it work. I work more as a documenter, packager and=20 occasional supervisor. SGUG RSE - A competing project by the sgi.sh forum. They have ported RPM=20 to IRIX and are using an SRPM approach with GCC and distcc. I do not=20 know but so much about this, but they are not currently packaging mandoc=20 as they use GNU GROFF. I usually, for now, keep patches and references in "Xenopatches" here: http://gitea.irixce.org/Raion/Xenopatches/src/branch/master/mandoc So pragmatically, once I figure out how to get it all together, you can=20 link here and I'll include a build instructions file. Once Nekoware II=20 is packaging tardists again, you can link back to nekoware II's homepage. Thank you very much. On 6/22/2020 5:44 PM, Ingo Schwarze wrote: > Hello, > > sorry for taking so long to respond, i was somewhat distracted by other > matters and now found your message while systematically looking for > unanswered messages regarding mandoc. > > > Kazuo Kuroi wrote on Tue, Jun 02, 2020 at 07:29:52PM -0400: > >> I have patched mandoc to work with IRIX, but I have a feeling that the >> fixes made will require some changes. Let me first explain my goals here= : >> >> I am patching mandoc to work with MIPSPro, the native compiler of IRIX, > That sounds good and interesting. I like it when people try to make > software work with native compilers that is intended to be portable, > rather than following a mindless reflex like "oh let's just require > gcc or clang to compile this". > >> so I'll have to explain the changes I made to get it to build: >> >> http://gitea.irixce.org/Raion/Xenopatches/raw/branch/master/mandoc/mando= c.patch >> >> Here's the patch file. >> >> First change is because it doesn't reliably detect the c99 driver, and >> as the code uses non ANSI-related things, it needs to detect c99. This >> can probably be disregarded, > Possibly. I tried to include code in ./configure in the past to > automatically detect the system compiler, but that attempt caused > more grief than benefit. Few platforms needed it, and more ended > up having trouble with the attempted test. > > So for the next release, i decided to simplify things by just > statically setting CC=3Dcc in ./configure and inviting users to > set CC in configure.local to whatever is appropriate on their > platform - but of course only if "cc" is not. > > Arguably, given that POSIX defines c99 but not cc, it might be better > to make CC=3Dc99 the default. Probably, the only reason that i didn't > is somewhat weak: My main development platform, OpenBSD, provides > a cc(1) command but does not provide a c99(1) command. Maybe it > should, but oh well. > > For now, i suggest you keep CC=3Dc99 in the configure.local file for IRIX= . > >> as can the CFLAGS reference. > I certainly won't put -O3 into CFLAGS by default, or any other > optimization flag for that matter. I think operating systems should > have defaults that are appropriate for compiling general purpose > application software that has no special needs, and i consider it > annoying when upstream maintainers tweak compiler flags like that. > > However, if there is some compelling reason why IRIX needs -O3, > then you can put that into the configure.local file for IRIX, though > i'm a bit at a loss trying to guess why that could possibly be > needed. > > If IRIX needs -mips3, that definitely needs to go into configure.local. > I have no idea right now how i could possibly test for that. > >> Next one is in mandoc.h, and it's because MIPSPro doesn't support the >> __attribute__ block. This could be fixed with a guard for non-GCC >> compilers, like this: >> #ifdef __GNUC__ >> __attribute__((__format__ (__printf__, 4, 5))); >> #endif > Actually, i was already doing that, ./configure wrote something > like that into config.h. However, some of the *.c files used private > headers using __attribute__ without including config.h first. I > just fixed that with the first one of the two commits below. > Thanks for reporting the issue! > > While there, i also replaced the horrible __GNUC__ version number > test by a proper feature test. No idea why we didn't have that > in the first place, it really wasn't difficult to write. > >> Or something. I would hope that you won't lock it out to GCC or clang, > Absolutely not, that's not my intention at all. > >> because I'm sure there's other compilers this thing chokes on. > Right. Only it's sometimes hard to get reports from other compilers, > so thanks for what you are reporting. > >> The rest are to fix the IRIX printf() implementation, which doesn't >> allow for %zu as mandoc currently does. > I would strongly recommend that you fix your IRIX libc to support %zu > because that has been required by POSIX for more than a decade > > https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html > > and because not having it, which often results in using format > specifiers for the wrong length, can cause security issues. > Maybe not in mandoc, but in general: i don't doubt that you also > run software on IRIX that is more prone to develop security > vulnerabilities. Besides, i would be surprised if closing that > particular feature gap in the IRIX libc would cause particularly > large amounts of work. > > Given that apart from the the notoriously quirky and extremely > outdated SunOS 5.9 (2002 version), this is the first time i'm hearing > about a system that does not support %zu - even SunOS 5.10 (2009 > version) supports that - i'm a bit hesitant to write a feature test > for that, also because using the result of the feature test in the > code would be somewhat intrusive. > > So until you come around to fixing your libc, is think carrying > local patches for %zu is the best you can do. But i suggest you > use the following idiom for these patches: > > mandoc_asprintf(arg, "%llun", (unsigned long long)width); > > That should be perfectly safe and perfectly portable, whereas the > > size_t width =3D ...; > mandoc_asprintf(arg, "%un", width); > > that you are currently using can potentially be dangerous. > At the very least, depending on the platform, on endianness, > and maybe depending on other aspects of the ABI, it might > silently produce incorrect results. > >> You can see my discussion with a >> colleague on the topic here: >> >> https://forums.irixnet.org/thread-1946-post-14522.html > Very interesting, please keep me posted on how things work out. > In particular, i would be interested to learn if any of your > manual pages format poorly with mandoc. But it's also > interesting to follow how your work progresses in general. > > Regarding the topics discussed in your forum: > > * Heirloom (not "Heritage") Troff has very high typesetting > quality in some respects, in some even better than groff, > for example it has paragraph-at-once filling. It is also > somewhat maintained, but AFAIK more or less by a one-man > team (Carsten Kunze) with a handful of commits in 2019 > and none so far in 2020. It is not at all an obvious > choice as a manual page formatter. I'm not aware of a > single operating system using it for that purpose. Carsten > has invested some work to make it better for manual pages, > but you should still expect at least occasional compatibility > issues when using it for manual pages, much more frquently > than with mandoc or groff. > > * With mandoc, all three options are workable: installing source > manuals or preformatted manuals, and in the former case, > using mandoc as both the formatter and the man(1) implementation, > as for example OpenBSD and Alpine Linux do it, or using mandoc > as the formatter but a different man(1) implementation, as for > example FreeBSD and Illumos do it. > In OpenBSD, we switched in three steps: first we switched the > formatter from groff to mandoc in 2010, then we switched from > installing preformatted to installing source manuals in 2011, > and finally to the mandoc implementation of man(1) in 2014. > Alpine Linux has done all that in one single step in 2014. > In general, i try to upstream patches when it is possible without > causing unreasonable complication. Often, that work progressed in > multiple steps, slowly and carefully improving support for that > platform, sometimes over years. For example, in the beginning, > building on Oracle Solaris 11 needed lots of handholding and various > configure.local tweaks. By now, even the older SunOS 5.10 just > works out of the box. > > That sounds reasonable. > Sure, see above, and keep me posted about how your work is progressing! > > Do you have a porting or packaging system in IRIX, in any way similar > to BSD ports, or Linux packages, or MacPorts, or Homebrew, or AUR, > or SlackBuilds, or Cashew? If so, is there a distinction between > official or unofficial ports/packages? I'm wondering whether at some > point, it might make sense to list IRIX on > > https://mandoc.bsd.lv/ports.html > > but i'm a bit at a loss as to how... > > Could you re-test whether compiling mandoc from CVS HEAD on IRIX > now already works a bit better for you, with smaller patches? > > Yours, > Ingo -- To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv