tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* makewhatis: segfault in dbadd when using -a
@ 2021-08-06 23:53 sternenseemann
  2021-08-11 15:09 ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: sternenseemann @ 2021-08-06 23:53 UTC (permalink / raw)
  To: tech

[-- Attachment #1: Type: text/plain, Size: 1096 bytes --]

Hi all,

while testing Ingo's latest patch I found a segfault in makewhatis
involving -a and symlinks which I can also reproduce on unpatched CVS.

To reproduce do something like this

mkdir test-manpath
mkdir -p test-manpath/de/man1
ln -sf $(realpath test-manpath/de) test-manpath/DE

So we have a base directory which has one child that is a symlink. This
symlink points to a directory below the base directory which contains
another directory. Running makewhatis -a test-manpath will then segfault
with the following backtrace:

#0  0x000000000041a9e1 in dbadd (dba=dba@entry=0x471050,
mpage=mpage@entry=0x473560) at mandocdb.c:2140
#1  0x000000000041b0ae in mpages_merge (dba=dba@entry=0x471050,
mp=mp@entry=0x46c2b0) at mandocdb.c:1294
#2  0x000000000041ca13 in mandocdb (argc=1, argc@entry=3,
argv=<optimized out>, argv@entry=0x7fffffffd678) at mandocdb.c:513
#3  0x000000000041dffa in main (argc=3, argv=0x7fffffffd678) at main.c:165

Full backtrace is attached as well. I've stared at mpages_merge for a
little bit, but haven't figured out how this is happening so far.

Cheers,

  Lukas

[-- Attachment #2: makewhatis-full-backtrace --]
[-- Type: text/plain, Size: 3304 bytes --]

#0  0x000000000041a9e1 in dbadd (dba=dba@entry=0x471050, mpage=mpage@entry=0x473560) at mandocdb.c:2140
        mlink = 0x472510
        key = <optimized out>
        cp = 0x476460 "DE"
        mask = <optimized out>
        i = 2
        slot = 0
        mustfree = 0
        __PRETTY_FUNCTION__ = "dbadd"
#1  0x000000000041b0ae in mpages_merge (dba=dba@entry=0x471050, mp=mp@entry=0x46c2b0) at mandocdb.c:1294
        mpage = 0x473560
        mpage_dest = <optimized out>
        mlink = 0x0
        mlink_dest = <optimized out>
        meta = 0x46ca00
        cp = 0x437301 <hash_calloc> "H\203\354\b\350\\\374\377\377H\203\304\b\303H\203\354\b\350p\374\377\377H\203\304\b\303H\203\354\070dH\213\004%("
        fd = <optimized out>
        __PRETTY_FUNCTION__ = "mpages_merge"
#2  0x000000000041ca13 in mandocdb (argc=1, argc@entry=3, argv=<optimized out>, argv@entry=0x7fffffffd678) at mandocdb.c:513
        conf = {output = {includes = 0x0, man = 0x0, outfilename = 0x0, paper = 0x0, style = 0x0, tag = 0x0, tagfilename = 0x0, indent = 0, width = 0, fragment = 0, mdoc = 0, noval = 0, synopsisonly = 0, tag_found = 0, toc = 0}, manpath = {paths = 0x46d0c0, sz = 1}}
        mp = 0x46c2b0
        dba = 0x471050
        path_arg = <optimized out>
        progname = <optimized out>
        j = 0
        sz = <optimized out>
        ch = <optimized out>
        i = <optimized out>
#3  0x000000000041dffa in main (argc=3, argv=0x7fffffffd678) at main.c:165
        conf = {output = {includes = 0x100000 <error: Cannot access memory at address 0x100000>, man = 0x0, outfilename = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>, paper = 0x8000 <error: Cannot access memory at address 0x8000>, style = 0x0, 
            tag = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>, tagfilename = 0x0, indent = 140737488347803, width = 140737354082303, fragment = 0, mdoc = 0, noval = 0, synopsisonly = 0, tag_found = 0, toc = 0}, manpath = {paths = 0x0, sz = 140737353989472}}
        outst = {tag_files = 0x1, outdata = 0xffffffffffffffff, use_pager = 4096, wstop = 0, had_output = 256, outtype = OUTT_ASCII}
        ws = {ws_row = 2048, ws_col = 0, ws_xpixel = 0, ws_ypixel = 0}
        search = {arch = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>, sec = 0xc0000 <error: Cannot access memory at address 0xc0000>, outkey = 0x0, argmode = 4294967295, firstmatch = -1}
        res = 0x40
        resn = 0x38000000380
        mp = <optimized out>
        conf_file = <optimized out>
        os_s = <optimized out>
        progname = 0x7fffffffda72 "makewhatis"
        sec = <optimized out>
        defpaths = <optimized out>
        auxpaths = <optimized out>
        oarg = 0x38000000380 <error: Cannot access memory at address 0x38000000380>
        tagarg = <optimized out>
        uc = <optimized out>
        ressz = 17179870080
        resnsz = 1048576
        i = <optimized out>
        ib = <optimized out>
        ssz = <optimized out>
        options = <optimized out>
        show_usage = <optimized out>
        prio = <optimized out>
        best_prio = <optimized out>
        startdir = <optimized out>
        c = <optimized out>
        os_e = MANDOC_OS_OTHER
        outmode = <optimized out>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: makewhatis: segfault in dbadd when using -a
  2021-08-06 23:53 makewhatis: segfault in dbadd when using -a sternenseemann
@ 2021-08-11 15:09 ` Ingo Schwarze
  2021-08-18 22:02   ` sternenseemann
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2021-08-11 15:09 UTC (permalink / raw)
  To: sternenseemann; +Cc: tech

Hi Lukas,

sorry for the delay in addressing a bug as serious as a segfault.
I got distracted by bugfixing in date(1), editline(3), less(1),
sftp(1), and tbl(7).

A clear case of silly season: everyone is on holiday and no one
reports any bugs, right?  Right.  Or perhaps not...  :)

sternenseemann wrote on Sat, Aug 07, 2021 at 01:53:18AM +0200:

> while testing Ingo's latest patch I found a segfault in makewhatis
> involving -a and symlinks which I can also reproduce on unpatched CVS.
> 
> To reproduce do something like this
> 
> mkdir test-manpath
> mkdir -p test-manpath/de/man1
> ln -sf $(realpath test-manpath/de) test-manpath/DE

Whoa.  A symlink inside a manpath that points to a directory?
I must admit it never crossed my mind that people might do
something like that.  No excuse for crashing, of course.

> So we have a base directory which has one child that is a symlink. This
> symlink points to a directory below the base directory which contains
> another directory. Running makewhatis -a test-manpath will then segfault
[...]
> I've stared at mpages_merge for a little bit, but haven't figured out

  "Nothing to see here, move on!"

You stared at the wrong place.  That may be where the car finally
exploded, but not where it ran off the road.

Note that i don't feel like actually *following* such a link unless
people show real-world use cases that require it.  If the link target
is below the same manpath, it will be reached by direct traversal, too.
Nobody claimed so far that having such a link point to the cellar or
store might be useful, and if i understand sufficiently well what the
point of a cellar or a store is, indeed it could hardly be useful.
If it points somewhere else, it certainly should not be followed.

Does the following patch make sense to you and work for you?

Thanks for the report,
  Ingo


Index: mandocdb.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandocdb.c,v
retrieving revision 1.216
diff -u -p -U7 -r1.216 mandocdb.c
--- mandocdb.c	3 Apr 2020 11:34:19 -0000	1.216
+++ mandocdb.c	11 Aug 2021 15:05:58 -0000
@@ -588,14 +588,16 @@ treescan(void)
 			}
 			/* Use logical inode to avoid mpages dupe. */
 			if (stat(path, ff->fts_statp) == -1) {
 				if (warnings)
 					say(path, "&stat");
 				continue;
 			}
+			if ((ff->fts_statp->st_mode & S_IFMT) != S_IFREG)
+				continue;
 			/* FALLTHROUGH */
 
 		/*
 		 * If we're a regular file, add an mlink by using the
 		 * stored directory data and handling the filename.
 		 */
 		case FTS_F:
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: makewhatis: segfault in dbadd when using -a
  2021-08-11 15:09 ` Ingo Schwarze
@ 2021-08-18 22:02   ` sternenseemann
  2021-08-19 17:11     ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: sternenseemann @ 2021-08-18 22:02 UTC (permalink / raw)
  To: tech

Hi Ingo,

On 8/11/21 5:09 PM, Ingo Schwarze wrote:
> Does the following patch make sense to you and work for you?

Yes, tested it with my real world case and it doesn't crash anymore!
Sorry for the delay with testing.

> You stared at the wrong place.  That may be where the car finally
> exploded, but not where it ran off the road.

Indeed, I later realized that treescan was to blame when I investigated
something else which you won't like to hear…

> Note that i don't feel like actually *following* such a link unless
> people show real-world use cases that require it.  If the link target
> is below the same manpath, it will be reached by direct traversal, too.
> Nobody claimed so far that having such a link point to the cellar or
> store might be useful, and if i understand sufficiently well what the
> point of a cellar or a store is, indeed it could hardly be useful.
> If it points somewhere else, it certainly should not be followed.

So, about that: NixOS creates its man directories exclusively by
symlinking man pages into a common directory from a lot of sysroot-esque
directories in the store. Of course the starting point is not individual
man pages, but lots of share/man directories in the “little sysroots”.
When building the common man path, naturally the script tasked with
doing this tries to create as little symlinks as possible: If a
directory of the common manpath is only present in one store-sysroot,
the script will symlink the directory instead of creating one and
symlinking every file in it.

Unfortunately, there is a few man sections which basically come
exclusively from one package, so from one store path: man1p and man3p
from Linux' man-pages-posix package. So for me
$GLOBAL_MANPATH/share/man3p is a symlink to
/nix/store/2fn4bf6xj9bxq3dpa6mznqbzqqanfdiv-man-pages-posix-2017a/share/man/man3p
and makewhatis(8) can't index it.

This is, however, a bit of an edge case and there is a workaround
(running a script which converts it into a proper directory), but still
a bit annoying. For the standard man sections this will probably never
happen.

I also fear that traversing such symlinks is not easy to patch into the
portable version since it would mean changing treescan quite
significantly (which currently assumes that symlinks if they are “good”
point to regular files) and you'd be hesitant to touch the in-tree one
for something like this?!

Cheers,

Lukas
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: makewhatis: segfault in dbadd when using -a
  2021-08-18 22:02   ` sternenseemann
@ 2021-08-19 17:11     ` Ingo Schwarze
  2021-09-06 14:50       ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2021-08-19 17:11 UTC (permalink / raw)
  To: sternenseemann; +Cc: tech

Hi Lukas,

sternenseemann wrote on Thu, Aug 19, 2021 at 12:02:13AM +0200:
> On 8/11/21 5:09 PM, Ingo Schwarze wrote:

>> Does the following patch make sense to you and work for you?

> Yes, tested it with my real world case and it doesn't crash anymore!
> Sorry for the delay with testing.

No problem, thanks for testing.

The patch is now committed to OpenBSD-current and to bsd.lv HEAD.


> If a directory of the common manpath is only present in one
> store-sysroot, the script will symlink the directory instead
> of creating one and symlinking every file in it.
> 
> Unfortunately, there is a few man sections which basically come
> exclusively from one package, so from one store path: man1p and man3p
> from Linux' man-pages-posix package. So for me
> $GLOBAL_MANPATH/share/man3p is a symlink to
> /nix/store/2fn4bf6xj9bxq3dpa6mznqbzqqanfdiv-man-pages-posix-2017a/share/man/man3p
> and makewhatis(8) can't index it.

Uh oh.  So there *is* a real-world use case for symlinks to
directories.

It doesn't matter that on first sight, this whole approach feels quite
fragile to me.  Isn't that going to blow up in horrible ways as soon as
somebody creates a package that also uses "man3p"?  For example, on
OpenBSD, "man3p" is used for manuals of Perl library functions.

  schwarze@isnote $ uname -a
  OpenBSD isnote.usta.de 6.9 GENERIC.MP#101 amd64
  schwarze@isnote $ man -f 3p warnings 
  encoding::warnings(3p) - Warn on implicit encoding conversions
  warnings(3p) - Perl pragma to control optional warnings
  warnings::register(3p) - warnings import function

Anyway, if systems use that, considering whether it can be supported
with resonable effort makes sense.  No guarantee, of course.
I'll think about it for a bit.

> This is, however, a bit of an edge case and there is a workaround
> (running a script which converts it into a proper directory), but still
> a bit annoying. For the standard man sections this will probably never
> happen.
> 
> I also fear that traversing such symlinks is not easy to patch into the
> portable version since it would mean changing treescan quite
> significantly (which currently assumes that symlinks if they are "good"
> point to regular files)

Yes.

> and you'd be hesitant to touch the in-tree one
> for something like this?!

Can't say without looking at the code in detail, so bear with me.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: makewhatis: segfault in dbadd when using -a
  2021-08-19 17:11     ` Ingo Schwarze
@ 2021-09-06 14:50       ` Ingo Schwarze
  2021-09-06 15:52         ` sternenseemann
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2021-09-06 14:50 UTC (permalink / raw)
  To: sternenseemann; +Cc: tech

Hi Lukas,

Ingo Schwarze wrote on Thu, Aug 19, 2021 at 07:11:50PM +0200:

> Uh oh.  So there *is* a real-world use case for symlinks to
> directories.
[...]
> Anyway, if systems use that, considering whether it can be supported
> with resonable effort makes sense.  No guarantee, of course.
> I'll think about it for a bit.

So, an acceptable way to support this probably exists, using the
FTS_FOLLOW option of the function fts_set(3).  That isn't completely
trivial in this code which is already somewhat convoluted, but it
should not be excessively complicated either.  Then again, there
is a slight risk of breaking stuff while implementing it, and maybe
an even bigger risk that FTS_FOLLOW may not be completely portable -
i don't really know yet.

Besides, i need to limit the difficulty of changes i'm doing right
now or i will never get the next release out of the door.  So i
decided to postpone this particular feature until after release,
see the commit below.

Yours,
  Ingo


Log Message:
-----------
TODO: let makewhatis(8) follow symbolic links to dirs

Modified Files:
--------------
    mandoc:
        TODO

Revision Data
-------------
Index: TODO
===================================================================
RCS file: /home/cvs/mandoc/mandoc/TODO,v
retrieving revision 1.316
retrieving revision 1.317
diff -LTODO -LTODO -u -p -r1.316 -r1.317
--- TODO
+++ TODO
@@ -286,6 +286,11 @@ are mere guesses, and some may be wrong.
 
 --- missing misc features ----------------------------------------------
 
+- let makewhatis(8) follow symbolic links to dirs below READ_ALLOWED_PATH
+  this may be feasible using fts_set(FTS_FOLLOW)
+  mail to sternenseemann 19 Aug 2021 19:11:50 +0200
+  loc *  exist **  algo **  size *  imp **
+
 - -T man does not handle eqn(7) and tbl(7)
   Stephen Gregoratto 16 Feb 2020 01:28:07 +1100
   also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=901636
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: makewhatis: segfault in dbadd when using -a
  2021-09-06 14:50       ` Ingo Schwarze
@ 2021-09-06 15:52         ` sternenseemann
  2021-09-06 17:12           ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: sternenseemann @ 2021-09-06 15:52 UTC (permalink / raw)
  To: tech

Hi Ingo,

On 9/6/21 16:50, Ingo Schwarze wrote:
> So, an acceptable way to support this probably exists, using the
> FTS_FOLLOW option of the function fts_set(3).  That isn't completely
> trivial in this code which is already somewhat convoluted, but it
> should not be excessively complicated either.

That's good to hear!

> Besides, i need to limit the difficulty of changes i'm doing right
> now or i will never get the next release out of the door.  So i
> decided to postpone this particular feature until after release,
> see the commit below.

That's no problem and I agree that it should probably be prioritize in 
this way. It only affects a few, albeit annoying cases and there is a 
reasonable workaround for it. Also I'm still (as far as I can tell) the 
only user of mandoc's whatis db on NixOS since I haven't yet gotten 
around to upstreaming my module / been waiting for a release.

Also one small clarification:

 > It doesn't matter that on first sight, this whole approach feels quite
 > fragile to me.  Isn't that going to blow up in horrible ways as soon as
 > somebody creates a package that also uses "man3p"?  For example, on
 > OpenBSD, "man3p" is used for manuals of Perl library functions.

The symlink manpath is immutable in NixOS, so it will be recreated from 
scratch if the list of installed packages changes. Thus this is not an 
issue.

Cheers, Lukas
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: makewhatis: segfault in dbadd when using -a
  2021-09-06 15:52         ` sternenseemann
@ 2021-09-06 17:12           ` Ingo Schwarze
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2021-09-06 17:12 UTC (permalink / raw)
  To: sternenseemann; +Cc: tech

Hi Lukas,

sternenseemann wrote on Mon, Sep 06, 2021 at 05:52:48PM +0200:

> The symlink manpath is immutable in NixOS, so it will be recreated from 
> scratch if the list of installed packages changes. Thus this is not an 
> issue.

I see.
Sufficiently advanced technology is indistinguishable from magic.  :)

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-09-06 17:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-06 23:53 makewhatis: segfault in dbadd when using -a sternenseemann
2021-08-11 15:09 ` Ingo Schwarze
2021-08-18 22:02   ` sternenseemann
2021-08-19 17:11     ` Ingo Schwarze
2021-09-06 14:50       ` Ingo Schwarze
2021-09-06 15:52         ` sternenseemann
2021-09-06 17:12           ` Ingo Schwarze

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).