discuss@mandoc.bsd.lv
* Arch Linux Improvements
@ 2018-12-30  1:25 John McKay
  2018-12-30  2:57 ` Ingo Schwarze
  0 siblings, 1 reply; 3+ messages in thread
From: John McKay @ 2018-12-30  1:25 UTC
  To: discuss

The man-pages package in Arch Linux has two versions of the manual
page for several topics. One is the version from the Linux man-pages
project and the other is from the POSIX reference. For example
/usr/share/man/man3/freeaddrinfo.3p.gz
/usr/share/man/man3/getaddrinfo.3.gz
both contain entries for freeaddrinfo(3).

This causes two difficulties when using mandoc on Arch Linux. The
first is that if you type
man freeaddrinfo
it pulls up the POSIX version. The POSIX versions of the pages are
typically not what you want as they have less detail and don't always
match the implementation described in the other page. The second
difficulty is that even if you specify
man -s 3 freeaddrinfo
it still brings up the POSIX version. While fixing the first
difficulty seems to be non-trivial (I have a patch that works for me,
but it's a bit of a hack and not portable to other systems), the
second difficulty seems much easier to fix. I have attached a patch
below.

The cause is twofold. In dbadd_mlink it adds the page both with the
section that is found in the containing folder and also the section
that is found by scanning the file name. Also, in lstmatch the section
names are compared using strcasestr. By only adding the page using the
file name's section and changing lstmatch to use strcasecmp the search
by section allows you to find the correct version of the page.

There are two side effects. If you have a manual page that's in the
correct folder but is named wrong you can no longer find it if you
search by section. You also cannot truncate the architecture name when
searching by architecture. None of the manual pages on my system have
names that don't match the folder they are in. I never search by
architecture so I don't know if it's common to shorten the name.

Any comments are greatly appreciated, especially concerning a fix for
the order that it uses to find the result shown.

Index: mandocdb.c
===================================================================
RCS file: /cvs/mandoc/mandocdb.c,v
retrieving revision 1.261
diff -c -u -r1.261 mandocdb.c
--- mandocdb.c	14 Dec 2018 01:18:26 -0000	1.261
+++ mandocdb.c	29 Dec 2018 23:15:01 -0000
@@ -2022,7 +2022,6 @@
 dbadd_mlink(const struct mlink *mlink)
 {
 	dba_page_alias(mlink->mpage->dba, mlink->name, NAME_FILE);
-	dba_page_add(mlink->mpage->dba, DBP_SECT, mlink->dsec);
 	dba_page_add(mlink->mpage->dba, DBP_SECT, mlink->fsec);
 	dba_page_add(mlink->mpage->dba, DBP_ARCH, mlink->arch);
 	dba_page_add(mlink->mpage->dba, DBP_FILE, mlink->file);
Index: mansearch.c
===================================================================
RCS file: /cvs/mandoc/mansearch.c,v
retrieving revision 1.80
diff -c -u -r1.80 mansearch.c
--- mansearch.c	13 Dec 2018 11:55:46 -0000	1.80
+++ mansearch.c	29 Dec 2018 23:15:01 -0000
@@ -534,7 +537,8 @@
         if (want == NULL || have == NULL || *have == '\0')
                 return 1;
         while (*have != '\0') {
-                if (strcasestr(have, want) != NULL)
+                if (strcasecmp(have, want) == 0)
                         return 1;
                 have = strchr(have, '\0') + 1;
         }
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


* Re: Arch Linux Improvements
  2018-12-30  1:25 Arch Linux Improvements John McKay
@ 2018-12-30  2:57 ` Ingo Schwarze
  2018-12-30 22:01   ` John McKay
  0 siblings, 1 reply; 3+ messages in thread
From: Ingo Schwarze @ 2018-12-30  2:57 UTC
  To: John McKay; +Cc: discuss

Hi John,

John McKay wrote on Sun, Dec 30, 2018 at 01:25:59AM +0000:

> The man-pages package in Arch Linux has two versions of the manual
> page for several topics. One is the version from the Linux man-pages
> project and the other is from the POSIX reference.

POSIX does not provide manual pages, even though the form is
superficially similar.  First, those are not manual pages because
they do not even attempt to explain things in a way that is easy
to understand for users; instead, they are standards documents,
aiming for maximum formalism, in a way which is typically quite
hard to understand even for very experienced programmers.  Those
are documents for lawyers, not for users.  And secondly, as you
say, they don't document the tools you have actually installed.

So do not install them in the manpath.  Install them somewhere else
where man(1) does not find them by default, for example below
/usr/share/doc/posix/man/ and instruct the few users who really
want to use man(1) for reading these pages to use

   $ man -M /usr/share/doc/posix/man

> For example
> /usr/share/man/man3/freeaddrinfo.3p.gz
> /usr/share/man/man3/getaddrinfo.3.gz
> both contain entries for freeaddrinfo(3).

Right, that is a totally broken installation.  The manual page
system - whether the traditional one or the mandoc implementation -
is not designed for mixing completely different things in the same
directory.

> This causes two difficulties when using mandoc on Arch Linux. The
> first is that if you type
> man freeaddrinfo
> it pulls up the POSIX version. The POSIX versions of the pages are
> typically not what you want as they have less detail and don't always
> match the implementation described in the other page. The second
> difficulty is that even if you specify
> man -s 3 freeaddrinfo
> it still brings up the POSIX version. While fixing the first
> difficulty seems to be non-trivial (I have a patch that works for me,
> but it's a bit of a hack and not portable to other systems), the
> second difficulty seems much easier to fix. I have attached a patch
> below.
> 
> The cause is twofold. In dbadd_mlink it adds the page both with the
> section that is found in the containing folder and also the section
> that is found by scanning the file name. Also, in lstmatch the section
> names are compared using strcasestr.

All that is done on purpose.  Several systems install pages with
suffixes in the filenames (like 3x for X11 library pages) and you
do want to find these with man -s 3, not only with man -s 3x.
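
To make the difference concrete, here is a minimal sketch in C.  It is
not the mandoc source.  It assumes the section list is stored the way
the lstmatch hunk in the quoted patch suggests, as consecutive
NUL-terminated strings ending with an empty string.  The names
match_substr and match_exact are made up for illustration.

```c
#define _GNU_SOURCE	/* strcasestr() is a BSD/GNU extension, not POSIX */
#include <string.h>
#include <strings.h>

/*
 * "have" is assumed to be a list of NUL-terminated strings that ends
 * with an empty string, for example "3\0" "3p\0" (the compiler adds
 * the final NUL).  A NULL "want" or an empty list matches everything.
 */

/* Substring comparison: "3" matches "3", "3p", "3x", "3socket". */
int
match_substr(const char *want, const char *have)
{
	if (want == NULL || have == NULL || *have == '\0')
		return 1;
	while (*have != '\0') {
		if (strcasestr(have, want) != NULL)
			return 1;
		have = strchr(have, '\0') + 1;	/* step to the next entry */
	}
	return 0;
}

/* Whole-string comparison: "3" matches only an entry that is "3". */
int
match_exact(const char *want, const char *have)
{
	if (want == NULL || have == NULL || *have == '\0')
		return 1;
	while (*have != '\0') {
		if (strcasecmp(have, want) == 0)
			return 1;
		have = strchr(have, '\0') + 1;	/* step to the next entry */
	}
	return 0;
}
```

For a page filed only under 3x, match_substr("3", "3x\0") returns 1
while match_exact("3", "3x\0") returns 0, so with the exact comparison
man -s 3 would no longer find that page.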

Also, when the section in the directory name, file name, and in the
file itself contradict each other - like 3, 3p, 3 in your broken
installation, which should be either 3, 3, 3 or 3p, 3p, 3p, but not
a mix of both - then it is intentional that all variants are found.
After all, you have all those section names in your tree for that
page, so man(1) had better find it under each of the names.

> By only adding the page using the file name's section and changing
> lstmatch to use strcasecmp the search by section allows you to find
> the correct version of the page.

No way.  You are deleting important functionality.

> There are two side effects. If you have a manual page that's in the
> correct folder but is named wrong you can no longer find it if you
> search by section. You also cannot truncate the architecture name when
> searching by architecture. None of the manual pages on my system have
> names that don't match the folder they are in.

You do have such pages, look at what you wrote above: man3/freeaddrinfo.3p

> I never search by architecture

Other users do, so we can't break that functionality.

> so I don't know if it's common to shorten the name.
> 
> Any comments are greatly appreciated, especially concerning a fix for
> the order that it uses to find the result shown.

If you have two pages that both match the name and the section
and that come from the same manpath, the selection is unspecified.
There is simply no way to select one or the other; they are identical
in every respect, according to the search criteria.

The solution is to not have contradictory manual pages in the same
tree.  Keep your tree consistent.

Manual pages have four properties: manpath (=tree), name, section,
architecture.  If you have a four-tuple that matches two pages, you
cannot tell the two of them apart.  At least one of the four properties
must be different to be able to select one but not the other.

If you have a broken tree containing conflicts, the best you can do
is use man -a and show all conflicting versions together.

> Index: mandocdb.c
> ===================================================================
> RCS file: /cvs/mandoc/mandocdb.c,v
> retrieving revision 1.261
> diff -c -u -r1.261 mandocdb.c
> --- mandocdb.c	14 Dec 2018 01:18:26 -0000	1.261
> +++ mandocdb.c	29 Dec 2018 23:15:01 -0000
> @@ -2022,7 +2022,7 @@
>  dbadd_mlink(const struct mlink *mlink)
>  {
>  	dba_page_alias(mlink->mpage->dba, mlink->name, NAME_FILE);
> -	dba_page_add(mlink->mpage->dba, DBP_SECT, mlink->dsec);

No.  That breaks various use cases, for example preformatted
pages like /usr/local/man/cat5/mwmrc.0, pages with unusual file
names, and so on.
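
For illustration, the two section names can disagree because they come
from different parts of the path.  The following C sketch shows the
general idea only and is not the mandocdb code; dir_section and
file_section are hypothetical helpers, paths are assumed to be relative
to the root of the manpath, and only a trailing ".gz" is stripped.

```c
#include <stddef.h>
#include <string.h>

/* Section implied by the directory: "man3/x.3p.gz" -> "3". */
int
dir_section(const char *path, char *buf, size_t bufsz)
{
	const char *slash;
	size_t len;

	if (strncmp(path, "man", 3) != 0 && strncmp(path, "cat", 3) != 0)
		return -1;
	path += 3;			/* skip the "man" or "cat" prefix */
	if ((slash = strchr(path, '/')) == NULL)
		return -1;
	len = (size_t)(slash - path);
	if (len == 0 || len >= bufsz)
		return -1;
	memcpy(buf, path, len);
	buf[len] = '\0';
	return 0;
}

/* Section implied by the file name: "man3/x.3p.gz" -> "3p". */
int
file_section(const char *path, char *buf, size_t bufsz)
{
	const char *base, *end, *dot;
	size_t len;

	base = strrchr(path, '/');
	base = base == NULL ? path : base + 1;
	end = base + strlen(base);
	if (end - base > 3 && strcmp(end - 3, ".gz") == 0)
		end -= 3;		/* ignore the compression suffix */
	/* Walk back to the character following the last '.'. */
	for (dot = end; dot > base && dot[-1] != '.'; dot--)
		continue;
	if (dot == base)
		return -1;		/* no suffix at all */
	len = (size_t)(end - dot);
	if (len == 0 || len >= bufsz)
		return -1;
	memcpy(buf, dot, len);
	buf[len] = '\0';
	return 0;
}
```

For cat5/mwmrc.0 this yields 5 from the directory and 0 from the file
name, so a database that only records the file-name section would list
that page under 0 and not under 5.  For man3/freeaddrinfo.3p.gz it
yields 3 and 3p.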

>  	dba_page_add(mlink->mpage->dba, DBP_SECT, mlink->fsec);
>  	dba_page_add(mlink->mpage->dba, DBP_ARCH, mlink->arch);
>  	dba_page_add(mlink->mpage->dba, DBP_FILE, mlink->file);
> Index: mansearch.c
> ===================================================================
> RCS file: /cvs/mandoc/mansearch.c,v
> retrieving revision 1.80
> diff -c -u -r1.80 mansearch.c
> --- mansearch.c	13 Dec 2018 11:55:46 -0000	1.80
> +++ mansearch.c	29 Dec 2018 23:15:01 -0000
> @@ -534,7 +537,8 @@
>          if (want == NULL || have == NULL || *have == '\0')
>                  return 1;
>          while (*have != '\0') {
> -                if (strcasestr(have, want) != NULL)
> +                if (strcasecmp(have, want) == 0)

No.  If you do that on Solaris, then basically nothing whatsoever
works any longer, just to give one drastic example of the several
ways in which that breaks:

schwarze@login [login]:~ > ls /usr/share/man/
entities        man3libucb      man3volmgt      sman3curses     sman3sasl
man.cf          man3m           man3wsreg       sman3dat        sman3scf
man1            man3mail        man3xcurses     sman3devid      sman3sched
man1b           man3malloc      man3xfn         sman3devinfo    sman3sec
man1c           man3mlib        man3xnet        sman3dl         sman3secdb
man1f           man3mp          man3xtsol       sman3dmi        sman3slp
man1m           man3mpapi       man4            sman3door       sman3smartcard
man1s           man3mvec        man4b           sman3elf        sman3snmp
man2            man3nsl         man5            sman3exacct     sman3socket
man3            man3nvpair      man6            sman3ext        sman3sysevent
man3aio         man3pam         man7            sman3gen        sman3tecla
man3bsm         man3papi        man7d           sman3gss        sman3thr
man3c           man3perl        man7fs          sman3hbaapi     sman3tiff
man3c_db        man3picl        man7i           sman3head       sman3tnf
man3cfgadm      man3picltree    man7ipp         sman3kstat      sman3ucb
man3commputil   man3plot        man7m           sman3kvm        sman3uuid
man3contract    man3pool        man7p           sman3layout     sman3volmgt
man3cpc         man3proc        man8            sman3ldap       sman3wsreg
man3curses      man3project     man9            sman3lgrp       sman3xcurses
man3dat         man3rac         man9e           sman3lib        sman3xfn
man3devid       man3resolv      man9f           sman3libucb     sman3xnet
man3devinfo     man3rpc         man9p           sman3m          sman4
man3dl          man3rsm         man9s           sman3mail       sman4b
man3dlpi        man3rt          manl            sman3malloc     sman5
man3dmi         man3sasl        mann            sman3mlib       sman6
man3door        man3scf         sman1           sman3mp         sman7
man3elf         man3sched       sman1as         sman3nsl        sman7d
man3exacct      man3sec         sman1b          sman3nvpair     sman7fs
man3ext         man3secdb       sman1c          sman3pam        sman7i
man3fm          man3sip         sman1f          sman3perl       sman7ipp
man3fontconfig  man3slp         sman1m          sman3picl       sman7m
man3gen         man3smartcard   sman1s          sman3picltree   sman7p
man3gss         man3snmp        sman2           sman3plot       sman9
man3hbaapi      man3socket      sman3           sman3pool       sman9e
man3head        man3sysevent    sman3aio        sman3proc       sman9f
man3kstat       man3tecla       sman3bsm        sman3project    sman9p
man3kvm         man3thr         sman3c          sman3rac        sman9s
man3layout      man3tnf         sman3c_db       sman3resolv     windex
man3ldap        man3tsol        sman3cfgadm     sman3rpc
man3lgrp        man3ucb         sman3contract   sman3rsm
man3lib         man3uuid        sman3cpc        sman3rt
schwarze@login [login]:~ > uname -a
SunOS login 5.10 Generic_150400-17 sun4v sparc SUNW,SPARC-Enterprise-T5220
schwarze@login [login]:~ > 

Yours,
  Ingo

* Re: Arch Linux Improvements
  2018-12-30  2:57 ` Ingo Schwarze
@ 2018-12-30 22:01   ` John McKay
  0 siblings, 0 replies; 3+ messages in thread
From: John McKay @ 2018-12-30 22:01 UTC
  To: Ingo Schwarze; +Cc: discuss

On Sun, 30 Dec 2018 03:57:48 +0100, Ingo Schwarze wrote:
> So do not install them in the manpath.  Install them somewhere else
> where man(1) does not find them by default, for example below
> /usr/share/doc/posix/man/ and instruct the few users who really
> want to use man(1) for reading these pages to use
>
>    $ man -M /usr/share/doc/posix/man
>
> > For example
> > /usr/share/man/man3/freeaddrinfo.3p.gz
> > /usr/share/man/man3/getaddrinfo.3.gz
> > both contain entries for freeaddrinfo(3).
>
> Right, that is a totally broken installation.  The manual page
> system - whether the traditional one or the mandoc implementation -
> is not designed for mixing completely different things in the same
> directory.

Broken or not, it's the way that Arch Linux has set up their manual
page package. It looks like man-db makes it work by defining 5(!)
different ways for it to look for the proper file, depending on which
layout it was configured for at compile time. Since their default man
utility specifically supports the layout they use, I don't think they
are going to be open to changing it.

> If you have two pages that both match the name and the section
> and that come from the same manpath, the selection is unspecified.
> There is simply no way to select one or the other; they are identical
> in every respect, according to the search criteria.
> 
> The solution is to not have contradictory manual pages in the same
> tree.  Keep your tree consistent.
> 
> Manual pages have four properties: manpath (=tree), name, section,
> architecture.  If you have a four-tuple that matches two pages, you
> cannot tell the two of them apart.  At least one of the four properties
> must be different to be able to select one but not the other.
> 
> If you have a broken tree containing conflicts, the best you can do
> is use man -a and show all conflicting versions together.

What about adding a new key that is just the section as derived from
the file name? That would add to the size of the database but if you
have a bad layout you can specify which page you actually want. It would
be useful for Linux users as it seems many distributions will install
manual pages under the same directory and expect the file name suffix to
be a sufficient distinguisher because man-db allows for it.
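
Roughly what I have in mind, as a sketch only: nothing like this exists
in mandoc today, and struct cand and pick_by_fsec are names I made up.
It assumes the search has already produced several candidates for the
same name and that each one carries the section taken from its file
name alone.

```c
#include <stddef.h>
#include <strings.h>

/* One search result; both fields are hypothetical. */
struct cand {
	const char *file;	/* path relative to the manpath */
	const char *fsec;	/* section taken from the file name only */
};

/*
 * Return the index of the only candidate whose file-name section is
 * exactly "want", or -1 if no candidate or more than one qualifies.
 */
int
pick_by_fsec(const struct cand *c, size_t n, const char *want)
{
	size_t i;
	int found = -1;

	for (i = 0; i < n; i++) {
		if (strcasecmp(c[i].fsec, want) != 0)
			continue;
		if (found != -1)
			return -1;	/* still ambiguous */
		found = (int)i;
	}
	return found;
}
```

With the two files from my first mail as candidates, asking for 3 would
pick getaddrinfo.3.gz and asking for 3p would pick freeaddrinfo.3p.gz,
while the existing section key would keep matching both.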

Thanks for pointing out the problems in my original plan. I was unaware
that my original suggested change would cause such a problem.