Date: Wed, 1 Sep 2021 22:35:23 +0200
From: Ingo Schwarze
To: Soeren Tempel
Cc: tech@mandoc.bsd.lv
Subject: Re: Discrepancy in mansearch and fs_lookup behavior
Message-ID: <20210901203523.GB15706@athene.usta.de>
References: <1ZRQZL79I0W2O.3ILLDNAL4JXUP@8pit.net>
In-Reply-To: <1ZRQZL79I0W2O.3ILLDNAL4JXUP@8pit.net>
X-Mailinglist: mandoc-tech
Reply-To: tech@mandoc.bsd.lv
User-Agent:
Mutt/1.12.2 (2019-09-21)

Hi Soeren,

Soeren Tempel wrote on Mon, Aug 30, 2021 at 09:26:59PM +0200:

> I am currently working on rearranging the Alpine Linux POSIX and Perl
> man pages

Hold on a second; if there are mandoc bugs or misfeatures in this area,
uprooting major areas of manual page forests in a particular operating
system may not be the best option for dealing with that.

> to make them work properly without mandoc.

Do you mean, "with mandoc"?

> For some historic reason, Alpine presently install these as follows:
>
>   /usr/share/man/man3/open.3pm.gz
>   /usr/share/man/man3/open.3p.gz
>
> where 3pm is the perl open man page and 3p is the POSIX open man page.

Certainly not maximally robust, but not totally unreasonable
at first sight.

There are three places where section identifiers can be stored:

 1) at the end of the directory name
 2) at the end of the file name
 3) in the .Dt or .TH macro.

Needless to say, it is most robust to have all three agree, but that
is not always possible for a wide variety of reasons.  Operating
system conventions are just one such reason.  Some manual pages may
belong to more than one section.  For example, a section 8 manual
page may do double duty as a section 5 manual page for the associated
configuration file.  And third party manual pages may be hard to
control.  Packagers may be more willing to adjust file and directory
names at package build time and more hesitant to patch upstream file
content, in particular with respect to something that can be perceived
as mostly an aesthetic issue.  And there may be more reasons.

Hence, in general, mandoc tries to handle mismatching section
identifiers gracefully, putting the page in all sections mentioned.

Apart from the section stated inside the file, man3/open.3pm is
supposed to show up in both sections "3" and "3pm" and man3/open.3p
in both "3" and "3p".

In the following, let us keep the discussion of makewhatis(8),
mansearch() and fs_search() strictly separate.
This is the order of decreasing importance.

First, regarding makewhatis(8).  To test, i ran:

   rm -rf Test
   mkdir Test
   mkdir Test/man3
   cp /usr/ports/pobj/man-pages-posix-2017a/man-pages-posix-2017/man3p/open.3p\
     Test/man3/
   cp /usr/share/man/man3p/open.3p Test/man3/open.3pm
   makewhatis Test
   alias dbm_dump=/usr/obj/regress/usr.bin/mandoc/db/dbm_dump/dbm_dump
   dbm_dump Test/mandoc.db

which yielded:

   === PAGES ===

   page name # [fh1t]open # [t]openat
   page sect # 3 # 3P # 3p
   page desc # open file
   page file src # man3/open.3p

   page name # [fh1t]open
   page sect # 3 # 3p # 3pm
   page desc # perl pragma to set default PerlIO layers for input and output
   page file src # man3/open.3pm

   === END OF PAGES ===

That looks reasonable to me.  In particular, it has the "3" from
the directory name, the 3P/3p from the file content, and the 3p/3pm
from the file name.

Next up, regarding mansearch().

   man -M Test open
     -> Perl; random choice because bits, sec, name, and arch
        all compare equal
   man -M Test 3 open
     -> Perl; ditto, and both file extensions mismatch
   man -M Test 3p open
     -> POSIX; preferred due to exact file extension match
   man -M Test 3pm open
     -> Perl; because the POSIX page does not match at all

That all looks correct to me, too.

Now finally, fs_search().

   rm -f Test/mandoc.db
   man -M Test open
     -> POSIX; random
   man -M Test 3p open
     -> No entry; because directory does not exist
   man -M Test 3pm open
     -> No entry; because directory does not exist

I think the globbing in the second part of fs_lookup could be relaxed
to allow "manpath/man3*/name.[01-9]*" for sec == 3*, and multiple
paths could be added to *res.  Then the main program would do the
usual prioritization.

I think i'll start working on that but found it useful to provide
feedback without too much delay, and i would welcome feedback on
my other points and questions in turn.

> With this setup `man 3p open` will always open the Perl man page and
> there seems to be no way to open the POSIX man page.

I can't reproduce, see above.  Are you sure?
Is this with or without mandoc.db?
Are you using the latest mandoc from CVS?

> Looking at the fs_lookup implementation I believe this to be the
> case because mandoc expects each section to have its own subdirectory
> in MANDIR.

Yes.

> I am also aware that OpenBSD uses the 3p section for Perl man pages
> which is a bit confusing but probably Alpine's fault.

Using 3p for POSIX seems to be an upstream decision, not sure
whether by the Austin Group or by the kernel.org packager.
Anyway, it does not look like Alpine's fault.

> My present understanding is that this would have to be fixed on the
> Alpine side by moving these man pages to their own subdirectory. Please
> let me know if there is an alternative solution. While experimenting
> with moving these pages, I noticed a discrepancy in the man page lookup
> behavior of mansearch and fs_lookup. Assuming the above man pages are
> installed as follows:
>
>   /usr/share/man/man3/open.3pm.gz
>   /usr/share/man/man3p/open.3p.gz
>
> if a mandoc.db exists, `man 3p open` will display the Perl (open.3pm.gz)
> man page (mansearch).

I cannot reproduce, see above.

> If it doesn't (fs_lookup), it will display the
> POSIX man page (open.3p.gz).

Yes.

> I find this surprising as I would expect
> the two algorithms to be equivalent.

They cannot be equivalent: fs_search() needs to be *much* simpler
because we cannot possibly duplicate all the mandocdb.c logic in
main.c.  That would be too much code and too slow.  Besides, main.c
cannot look at file contents.

When it doesn't cause too much complexity, making the two more
similar to each other may be worthwhile for particular cases that
matter for practical purposes, though.

> I am reporting this here as I believe this to be a "minor bug" that
> doesn't interest the majority of mandoc users.

Good choice, thank you.

> See also: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12958

Yours,
  Ingo