* Discrepancy in mansearch and fs_lookup behavior @ 2021-08-30 19:26 Sören Tempel 2021-09-01 20:35 ` Ingo Schwarze 0 siblings, 1 reply; 8+ messages in thread From: Sören Tempel @ 2021-08-30 19:26 UTC (permalink / raw) To: tech Hello! I am currently working on rearranging the Alpine Linux POSIX and Perl man pages to make them work properly without mandoc. For some historic reason, Alpine presently install these as follows: /usr/share/man/man3/open.3pm.gz /usr/share/man/man3/open.3p.gz where 3pm is the perl open man page and 3p is the POSIX open man page. With this setup `man 3p open` will always open the Perl man page and there seems to be no way to open the POSIX man page. Looking at the fs_lookup implementation I believe this to be the case because mandoc expects each section to have its own subdirectory in MANDIR. I am also aware that OpenBSD uses the 3p section for Perl man pages which is a bit confusing but probably Alpine's fault. My present understanding is that this would have to be fixed on the Alpine side by moving these man pages to their own subdirectory. Please let me know if there is an alternative solution. While experimenting with moving these pages, I noticed a discrepancy in the man page lookup behavior of mansearch and fs_lookup. Assuming the above man pages are installed as follows: /usr/share/man/man3/open.3pm.gz /usr/share/man/man3p/open.3p.gz if a mandoc.db exists, `man 3p open` will display the Perl (open.3pm.gz) man page (mansearch). If it doesn't (fs_lookup), it will display the POSIX man page (open.3p.gz). I find this surprising as I would expect the two algorithms to be equivalent. I am reporting this here as I believe this to be a "minor bug" that doesn't interest the majority of mandoc users. See also: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12958 Greetings, Sören -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-08-30 19:26 Discrepancy in mansearch and fs_lookup behavior Sören Tempel @ 2021-09-01 20:35 ` Ingo Schwarze 2021-09-02 19:09 ` Ingo Schwarze 2021-09-04 10:12 ` Sören Tempel 0 siblings, 2 replies; 8+ messages in thread From: Ingo Schwarze @ 2021-09-01 20:35 UTC (permalink / raw) To: Soeren Tempel; +Cc: tech Hi Soeren, Soeren Tempel wrote on Mon, Aug 30, 2021 at 09:26:59PM +0200: > I am currently working on rearranging the Alpine Linux POSIX and Perl > man pages Hold on a second; if there are mandoc bugs or misfeatures in this area, uprooting major areas of manual page forests in a particular operating system may not be the best option for dealing with that. > to make them work properly without mandoc. Do you mean, "with mandoc"? > For some historic reason, Alpine presently install these as follows: > > /usr/share/man/man3/open.3pm.gz > /usr/share/man/man3/open.3p.gz > > where 3pm is the perl open man page and 3p is the POSIX open man page. Certainly not maximally robust, but not totally unreasonable on first sight. There are three places where section identifiers can be stored: 1) at the end of the directory name 2) at the end of the file name 3) in the .Dt or .TH macro. Needless to say, it is most robust to have all three agree, but that is not always possible for a wide variety of reasons. Operating system conventions are just one such reason. Some manual pages may belong to more than one section. For example, a section 8 manual page may do double duty as a section 5 manual page for the associated configuration file. And third party manual pages may be hard to control. Packagers may be more willing to adjust file and directory names at package build time and more hesitant to patch upstream file content, in particular with respect to something that can be perceived as mostly an aesthetic issue. And there may be more reasons. Hence, in general, mandoc tries to handle mismatching section identifiers gracefully, putting the page in all sections mentioned. Apart from the section stated inside the file, man3/open.3pm is supposed to show up in both sections "3" and "3pm" and man3/open.3p in both "3" and "3p". In the following, let us keep the discussion of makewhatis(8), mansearch() and fs_search() strictly separate. This is the order of decreasing importance. First, regarding makewhatis(8). To test, i ran: rm -rf Test mkdir Test mkdir Test/man3 cp /usr/ports/pobj/man-pages-posix-2017a/man-pages-posix-2017/man3p/open.3p\ Test/man3/ cp /usr/share/man/man3p/open.3p Test/man3/open.3pm makewhatis Test alias dbm_dump=/usr/obj/regress/usr.bin/mandoc/db/dbm_dump/dbm_dump dbm_dump Test/mandoc.db which yielded: === PAGES === page name # [fh1t]open # [t]openat page sect # 3 # 3P # 3p page desc # open file page file src # man3/open.3p page name # [fh1t]open page sect # 3 # 3p # 3pm page desc # perl pragma to set default PerlIO layers for input and output page file src # man3/open.3pm === END OF PAGES === That looks reasonable to me. In particular, it has the "3" from the directory name, the 3P/3p from the file content, and the 3p/3pm from the file name. Next up, regarding mansearch(). man -M Test open -> Perl; random choice because bits, sec, name, and arch all compare equal man -M Test 3 open -> Perl; dto. and both file extensions mismatch man -M Test 3p open -> POSIX; preferred due to exact file extension match man -M Test 3pm open -> Perl; because the POSIX page does not match at all That all looks correct to me, too. Now finally, fs_search(). rm -f Test/mandoc.db man -M Test open -> POSIX; random man -M Test 3p open -> No entry; because directory does not exist man -M Test 3pm open -> No entry; because directory does not exist I think the globbing in the second part of fs_lookup could be relaxed to allow "manpath/man3*/name.[01-9]*" for sec == 3*, and multiple paths could be added to *res. Then the main program would do the usual prioritization. I think i'll start working on that but found it useful to provide feedback without too much delay, and i would welcome feedback on my other points and questions in turn. > With this setup `man 3p open` will always open the Perl man page and > there seems to be no way to open the POSIX man page. I can't reproduce, see above. Are you sure? Is this with or without mandoc.db? Are you using the latest mandoc from CVS? > Looking at the fs_lookup implementation I believe this to be the > case because mandoc expects each section to have its own subdirectory > in MANDIR. Yes. > I am also aware that OpenBSD uses the 3p section for Perl man pages > which is a bit confusing but probably Alpine's fault. Using 3p for POSIX seems to be an upstream decision, not sure whether by the Austin group or by the kernel.org packager. Anyway, it does not look like Alpine's fault. > My present understanding is that this would have to be fixed on the > Alpine side by moving these man pages to their own subdirectory. Please > let me know if there is an alternative solution. While experimenting > with moving these pages, I noticed a discrepancy in the man page lookup > behavior of mansearch and fs_lookup. Assuming the above man pages are > installed as follows: > > /usr/share/man/man3/open.3pm.gz > /usr/share/man/man3p/open.3p.gz > > if a mandoc.db exists, `man 3p open` will display the Perl (open.3pm.gz) > man page (mansearch). I cannot reproduce, see above. > If it doesn't (fs_lookup), it will display the > POSIX man page (open.3p.gz). Yes. > I find this surprising as I would expect > the two algorithms to be equivalent. They cannot be equivalent: fs_search() needs to be *much* simpler because we cannot possibly duplicate all the mandocdb.c logic in main.c. That would be too much code and too slow. Besides, main.c cannot look at file contents. When it doesn't cause too much complexity, making the two more similar to each other may be be worthwhile for particular cases that matter for practical purposes, though. > I am reporting this here as I believe this to be a "minor bug" that > doesn't interest the majority of mandoc users. Good choice, thank you. > See also: https://gitlab.alpinelinux.org/alpine/aports/-/issues/12958 Yours, Ingo -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-09-01 20:35 ` Ingo Schwarze @ 2021-09-02 19:09 ` Ingo Schwarze 2021-09-04 10:12 ` Sören Tempel 1 sibling, 0 replies; 8+ messages in thread From: Ingo Schwarze @ 2021-09-02 19:09 UTC (permalink / raw) To: Soeren Tempel; +Cc: tech Hi Soeren, Ingo Schwarze wrote on Wed, Sep 01, 2021 at 10:35:23PM +0200: > Now finally, fs_search(). > > rm -f Test/mandoc.db > man -M Test open -> POSIX; random > man -M Test 3p open -> No entry; because directory does not exist > man -M Test 3pm open -> No entry; because directory does not exist > > I think the globbing in the second part of fs_lookup could be relaxed > to allow "manpath/man3*/name.[01-9]*" for sec == 3*, and multiple > paths could be added to *res. Then the main program would do the > usual prioritization. See below for a patch resulting in man -M Test open -> POSIX; random man -M Test 3p open -> POSIX because exact match of file name extension man -M Test 3pm open -> Perl because man3/open.3p does not match at all [...] > They cannot be equivalent: fs_search() needs to be *much* simpler > because we cannot possibly duplicate all the mandocdb.c logic in main.c. > That would be too much code and too slow. Besides, main.c cannot look > at file contents. > > When it doesn't cause too much complexity, making the two more similar > to each other may be be worthwhile for particular cases that matter > for practical purposes, though. This was trickier than anticipated. When given "man -M Test 3p open", my first try was allowing "Test/man3*/name.[01-9]*" which worked well enough for finding the desired files, but blew up quite badly elsewhere with lots of false positives and fallout all over the place. To keep false positives at bay, it is definitely required that (unless both match in the first place) either the directory name or the file name extension match the requested section number. What i'm proposing here is the following list of priorities, described using the example of "man 3p open": 1. man3p/open.3p # double exact match 2. cat3p/open.0 # preformatted exact match 3. man3p/<arch>/open.3p # architecture dependent page 4. man3p/open.[01-9]* # directory + name match 5. man3/open.3p* # name + extension match Does the patch make sense to you and work for your use case? Yours, Ingo Index: main.c =================================================================== RCS file: /home/cvs/mandoc/mandoc/main.c,v retrieving revision 1.356 diff -u -p -r1.356 main.c --- main.c 14 Aug 2021 13:53:08 -0000 1.356 +++ main.c 2 Sep 2021 18:36:49 -0000 @@ -94,9 +94,11 @@ struct outstate { int mandocdb(int, char *[]); static void check_xr(struct manpaths *); -static int fs_lookup(const struct manpaths *, - size_t ipath, const char *, - const char *, const char *, +static void fs_append(char **, size_t, int, + size_t, const char *, enum form, + struct manpage **, size_t *); +static int fs_lookup(const struct manpaths *, size_t, + const char *, const char *, const char *, struct manpage **, size_t *); static int fs_search(const struct mansearch *, const struct manpaths *, const char *, @@ -700,6 +702,30 @@ glob_esc(char **dst, const char *src, co *(*dst)++ = *suffix++; } +static void +fs_append(char **file, size_t filesz, int copy, size_t ipath, + const char *sec, enum form form, struct manpage **res, size_t *ressz) +{ + struct manpage *page; + + *res = mandoc_reallocarray(*res, *ressz + filesz, sizeof(**res)); + page = *res + *ressz; + *ressz += filesz; + for (;;) { + page->file = copy ? mandoc_strdup(*file) : *file; + page->names = NULL; + page->output = NULL; + page->bits = NAME_FILE & NAME_MASK; + page->ipath = ipath; + page->sec = (*sec >= '1' && *sec <= '9') ? *sec - '1' + 1 : 10; + page->form = form; + if (--filesz == 0) + break; + file++; + page++; + } +} + static int fs_lookup(const struct manpaths *paths, size_t ipath, const char *sec, const char *arch, const char *name, @@ -707,16 +733,19 @@ fs_lookup(const struct manpaths *paths, { struct stat sb; glob_t globinfo; - struct manpage *page; - char *file, *cp; + char *file, *cp, secnum[2]; int globres; enum form form; const char *const slman = "/man"; const char *const slash = "/"; const char *const sglob = ".[01-9]*"; + const char *const dot = "."; + const char *const aster = "*"; + memset(&globinfo, 0, sizeof(globinfo)); form = FORM_SRC; + mandoc_asprintf(&file, "%s/man%s/%s.%s", paths->paths[ipath], sec, name, sec); if (stat(file, &sb) != -1) @@ -751,14 +780,34 @@ fs_lookup(const struct manpaths *paths, mandoc_msg(MANDOCERR_GLOB, 0, 0, "%s: %s", file, strerror(errno)); free(file); + file = NULL; if (globres == 0) - file = mandoc_strdup(*globinfo.gl_pathv); + goto found; globfree(&globinfo); - if (globres == 0) { - if (stat(file, &sb) != -1) - goto found; + + if (sec[1] != '\0' && *ressz == 0) { + secnum[0] = sec[0]; + secnum[1] = '\0'; + cp = file = mandoc_malloc(strlen(paths->paths[ipath]) * 2 + + strlen(slman) + strlen(secnum) * 2 + strlen(slash) + + strlen(name) * 2 + strlen(dot) + + strlen(sec) * 2 + strlen(aster) + 1); + glob_esc(&cp, paths->paths[ipath], slman); + glob_esc(&cp, secnum, slash); + glob_esc(&cp, name, dot); + glob_esc(&cp, sec, aster); + *cp = '\0'; + globres = glob(file, 0, NULL, &globinfo); + if (globres != 0 && globres != GLOB_NOMATCH) + mandoc_msg(MANDOCERR_GLOB, 0, 0, + "%s: %s", file, strerror(errno)); free(file); + file = NULL; + if (globres == 0) + goto found; + globfree(&globinfo); } + if (res != NULL || ipath + 1 != paths->sz) return -1; @@ -770,19 +819,14 @@ fs_lookup(const struct manpaths *paths, found: warnx("outdated mandoc.db lacks %s(%s) entry, run %s %s", name, sec, BINM_MAKEWHATIS, paths->paths[ipath]); - if (res == NULL) { + if (res == NULL) free(file); - return 0; - } - *res = mandoc_reallocarray(*res, ++*ressz, sizeof(**res)); - page = *res + (*ressz - 1); - page->file = file; - page->names = NULL; - page->output = NULL; - page->bits = NAME_FILE & NAME_MASK; - page->ipath = ipath; - page->sec = (*sec >= '1' && *sec <= '9') ? *sec - '1' + 1 : 10; - page->form = form; + else if (file == NULL) + fs_append(globinfo.gl_pathv, globinfo.gl_pathc, 1, + ipath, sec, form, res, ressz); + else + fs_append(&file, 1, 0, ipath, sec, form, res, ressz); + globfree(&globinfo); return 0; } -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-09-01 20:35 ` Ingo Schwarze 2021-09-02 19:09 ` Ingo Schwarze @ 2021-09-04 10:12 ` Sören Tempel 2021-09-04 13:18 ` Ingo Schwarze 1 sibling, 1 reply; 8+ messages in thread From: Sören Tempel @ 2021-09-04 10:12 UTC (permalink / raw) To: tech Ingo Schwarze <schwarze@usta.de> wrote: > Hi Soeren, Hello Ingo, > Hold on a second; if there are mandoc bugs or misfeatures in this area, > uprooting major areas of manual page forests in a particular operating > system may not be the best option for dealing with that. Indeed, I would also prefer not move the Alpine man page. It was just unclear to me whether mandoc intended to support lookup for section which do not have their own subdirectory in MANDIR from reading the fs_lookup code it seemed to me that this was not supported, thus my initial idea was to move the pages. > Do you mean, "with mandoc"? Yes. > In the following, let us keep the discussion of makewhatis(8), > mansearch() and fs_search() strictly separate. This is the order > of decreasing importance. In the following I will perform all tests with the open(3p) and the open(3pm) man page installed as follows in MANDIR: /usr/share/man/man3/open.3pm.gz /usr/share/man/man3/open.3p.gz > === PAGES === > page name # [fh1t]open # [t]openat > page sect # 3 # 3P # 3p > page desc # open file > page file src # man3/open.3p > page name # [fh1t]open > page sect # 3 # 3p # 3pm > page desc # perl pragma to set default PerlIO layers for input and output > page file src # man3/open.3pm > === END OF PAGES === > > That looks reasonable to me. In particular, it has the "3" from > the directory name, the 3P/3p from the file content, and the 3p/3pm > from the file name. The dbm_dump output for these two pages looks as follows on a standard Alpine Linux Edge installation with a mandoc.db generated with makewhatis(8) from mandoc-1.14.5: page name # [fh1t]open # [t]openat page sect # 3 # 3P # 3p page desc # open file page file src # man3/open.3p.gz page name # [fh1t]open page sect # 3 # 3pm page desc # perl pragma to set default PerlIO layers for input and output page file src # man3/open.3pm.gz This does look similar to your output. The only difference seems to be that my man pages are gzip compressed. > Next up, regarding mansearch(). > > man -M Test open -> Perl; random choice because bits, sec, name, > and arch all compare equal > man -M Test 3 open -> Perl; dto. and both file extensions mismatch > man -M Test 3p open -> POSIX; preferred due to exact file extension match If I try to duplicate your mansearch() tests I get the following results: man open -> open(2) Linux man page (/usr/share/man/man2/open.2.gz) man 3 open -> open(3pm) Perl man page man 3p open -> open(3pm) Perl man page I get these results with both mandoc-1.14.5 and mandoc from current CVS. Only for `man 3pm open` these two mandoc versions seem to return different results: man 3pm open: CVS version: open(3pm) Perl man page mandoc-1.14.5: open(2) Linux man page For `man 3p open` I briefly stepped through the mansearch() code with gdb on the mandoc CVS version. What happens here is that mansearch() returns the following results: (gdb) p resnsz $1 = 2 (gdb) p resn[0] $2 = { file = 0x7ffff7ffe8e0 "/usr/share/man/man3/open.3pm.gz", names = 0x7ffff7ffed20 "open(3, 3pm)", output = 0x7ffff7f60b00 "perl pragma to set default PerlIO layers for input and output", bits = 30, ipath = 0, sec = 3, form = FORM_SRC } (gdb) p resn[1] $3 = { file = 0x7ffff7ffe8b0 "/usr/share/man/man3/open.3p.gz", names = 0x7ffff7ffed00 "open, openat(3, 3P, 3p)", output = 0x7ffff7f608f0 "open file", bits = 30, ipath = 0, sec = 3, form = FORM_SRC } mandoc then searches for the best match according to priority. In this case both resn[0] and resn[1] get a priority of 35. However, since it iterates over resn[0] first and the comparison with best_prio is a greater than or equal comparison [1], resn[0] is the one that gets selected. Maybe this is a problem with the .gz file extension? if (search.sec != NULL && (strlen(sec) <= ssz + 3 || strcmp(sec + strlen(sec) - ssz, search.sec) != 0)) prio += 20; /* Wrong file ext. */ The above code is where I would expect the two man pages to differ since search.sec is 3p and as such should cause the open(3p) man page to get the better priority. Unfortunately, this does not work correctly since the file extension extracted here is gz, not 3p. This would also explain why we get different mansearch() results. Regarding fs_search(): I tested the patch from your other reply. This patch work entirely fine for me and does what it is supposed to do. I applied this patch on top of the CVS version. With mandoc.db removed, I get the following results: man 3 open -> open(3p) POSIX man page man 3p open -> open(3p) POSIX man page man 3pm open -> open(3pm) Perl man page So yes, your patch does seem to improve the fs_search() behaviour, thanks a lot! This is what I would expect the mansearch() to also return. > > With this setup `man 3p open` will always open the Perl man page and > > there seems to be no way to open the POSIX man page. > > I can't reproduce, see above. Are you sure? Is this with or without > mandoc.db? Are you using the latest mandoc from CVS? I tried to clarify this above. There presently does seem to be no way to access open(3p) on Alpine with mansearch() neither with the CVS version nor with 1.14.5. Greetings, Sören [1]: https://github.com/openbsd/src/blob/2207c4325726fdc5c4bcd0011af0fdf7d3dab137/usr.bin/mandoc/main.c#L519-L520 -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-09-04 10:12 ` Sören Tempel @ 2021-09-04 13:18 ` Ingo Schwarze 2021-09-04 16:16 ` Ingo Schwarze 0 siblings, 1 reply; 8+ messages in thread From: Ingo Schwarze @ 2021-09-04 13:18 UTC (permalink / raw) To: Sören Tempel; +Cc: tech Hi Soeren, Soeren Tempel wrote on Sat, Sep 04, 2021 at 12:12:50PM +0200: > Ingo Schwarze <schwarze@usta.de> wrote: >> Hold on a second; if there are mandoc bugs or misfeatures in this area, >> uprooting major areas of manual page forests in a particular operating >> system may not be the best option for dealing with that. > Indeed, I would also prefer not move the Alpine man page. It was just > unclear to me whether mandoc intended to support lookup for section > which do not have their own subdirectory in MANDIR I did not intend to in the past - but merely because i was unaware of the need to support such a configuration. This is a typical case of software needing real-world use and testing to mature. > from reading the fs_lookup code it seemed to me that this was not > supported, thus my initial idea was to move the pages. I hope we get this sorted out and you don't need to - hopefully without breaking the setup in other operating systems either. Illumos may be particularly fragile, so i hope people will test the next release over there. And i'd like to see tests on FreeBSD too, but those will almost certainly happen anyway. [...] >> === PAGES === >> page name # [fh1t]open >> page sect # 3 # 3p # 3pm [...] > page name # [fh1t]open > page sect # 3 # 3pm Looks like your Perl pages contain ".TH open 3" or ".TH open 3pm" whereas OpenBSD has ".TH open 3p", but i don't expect that to cause trouble. Actually, not having "3p" in there is good for you. It lessens the risk of mixups. [...] >> Next up, regarding mansearch(). >> >> man -M Test open -> Perl; random choice because bits, sec, name, >> and arch all compare equal >> man -M Test 3 open -> Perl; dto. and both file extensions mismatch >> man -M Test 3p open -> POSIX; preferred due to exact file extension match > If I try to duplicate your mansearch() tests I get the following results: > > man open -> open(2) Linux man page (/usr/share/man/man2/open.2.gz) > man 3 open -> open(3pm) Perl man page > man 3p open -> open(3pm) Perl man page > > I get these results with both mandoc-1.14.5 and mandoc from current CVS. Let's talk about CVS HEAD only, mixing in statements about old releases can cause nothing but confusion. [...] > For `man 3p open` I briefly stepped through the mansearch() code with > gdb on the mandoc CVS version. What happens here is that mansearch() > returns the following results: [... snip correct results ...] > mandoc then searches for the best match according to priority. In this > case both resn[0] and resn[1] get a priority of 35. However, since it > iterates over resn[0] first and the comparison with best_prio is a > greater than or equal comparison [1], resn[0] is the one that gets > selected. Maybe this is a problem with the .gz file extension? Yes, most likely. I shall have a look separately and try to fix that. [...] > Regarding fs_search(): I tested the patch from your other reply. This > patch work entirely fine for me and does what it is supposed to do. I > applied this patch on top of the CVS version. With mandoc.db removed, I > get the following results: > > man 3 open -> open(3p) POSIX man page > man 3p open -> open(3p) POSIX man page > man 3pm open -> open(3pm) Perl man page > > So yes, your patch does seem to improve the fs_search() behaviour, > thanks a lot! Thank you very much for testing! It is really good to have a real-world test before commit rather only what i did in my Test/ sandbox. The patch appended below is now committed. Hang on a moment for the remaining issue with the .gz extension probably breaking the prio code, i'll try to figure that out and send a patch for that, too. Yours, Ingo Log Message: ----------- In the fallback code to look for manual pages without using mandoc.db(5), accept files "man<one-digit-section>/<name>.<full-section>" in addition to the already supported "man<full-section>/name.[01-9]*". Needed for example on Alpine Linux which puts its Perl manuals into "man3/<name>.3pm" and the POSIX manuals into "man3/<name>.3p". While here, allow the glob(3) at the end of fs_lookup() to add multiple matches to the result set. This improves man -w output and may also help some cases of plain man(1), allowing main() to prioritize properly rather than fs_lookup() picking a random match. Issue reported and patch tested by Soeren Tempel <soeren at soeren hyphen tempel dot net>. Modified Files: -------------- mandoc: main.c Revision Data ------------- Index: main.c =================================================================== RCS file: /home/cvs/mandoc/mandoc/main.c,v retrieving revision 1.356 retrieving revision 1.357 diff -Lmain.c -Lmain.c -u -p -r1.356 -r1.357 --- main.c +++ main.c @@ -94,9 +94,11 @@ struct outstate { int mandocdb(int, char *[]); static void check_xr(struct manpaths *); -static int fs_lookup(const struct manpaths *, - size_t ipath, const char *, - const char *, const char *, +static void fs_append(char **, size_t, int, + size_t, const char *, enum form, + struct manpage **, size_t *); +static int fs_lookup(const struct manpaths *, size_t, + const char *, const char *, const char *, struct manpage **, size_t *); static int fs_search(const struct mansearch *, const struct manpaths *, const char *, @@ -700,6 +702,30 @@ glob_esc(char **dst, const char *src, co *(*dst)++ = *suffix++; } +static void +fs_append(char **file, size_t filesz, int copy, size_t ipath, + const char *sec, enum form form, struct manpage **res, size_t *ressz) +{ + struct manpage *page; + + *res = mandoc_reallocarray(*res, *ressz + filesz, sizeof(**res)); + page = *res + *ressz; + *ressz += filesz; + for (;;) { + page->file = copy ? mandoc_strdup(*file) : *file; + page->names = NULL; + page->output = NULL; + page->bits = NAME_FILE & NAME_MASK; + page->ipath = ipath; + page->sec = (*sec >= '1' && *sec <= '9') ? *sec - '1' + 1 : 10; + page->form = form; + if (--filesz == 0) + break; + file++; + page++; + } +} + static int fs_lookup(const struct manpaths *paths, size_t ipath, const char *sec, const char *arch, const char *name, @@ -707,16 +733,19 @@ fs_lookup(const struct manpaths *paths, { struct stat sb; glob_t globinfo; - struct manpage *page; - char *file, *cp; + char *file, *cp, secnum[2]; int globres; enum form form; const char *const slman = "/man"; const char *const slash = "/"; const char *const sglob = ".[01-9]*"; + const char *const dot = "."; + const char *const aster = "*"; + memset(&globinfo, 0, sizeof(globinfo)); form = FORM_SRC; + mandoc_asprintf(&file, "%s/man%s/%s.%s", paths->paths[ipath], sec, name, sec); if (stat(file, &sb) != -1) @@ -751,14 +780,34 @@ fs_lookup(const struct manpaths *paths, mandoc_msg(MANDOCERR_GLOB, 0, 0, "%s: %s", file, strerror(errno)); free(file); + file = NULL; if (globres == 0) - file = mandoc_strdup(*globinfo.gl_pathv); + goto found; globfree(&globinfo); - if (globres == 0) { - if (stat(file, &sb) != -1) - goto found; + + if (sec[1] != '\0' && *ressz == 0) { + secnum[0] = sec[0]; + secnum[1] = '\0'; + cp = file = mandoc_malloc(strlen(paths->paths[ipath]) * 2 + + strlen(slman) + strlen(secnum) * 2 + strlen(slash) + + strlen(name) * 2 + strlen(dot) + + strlen(sec) * 2 + strlen(aster) + 1); + glob_esc(&cp, paths->paths[ipath], slman); + glob_esc(&cp, secnum, slash); + glob_esc(&cp, name, dot); + glob_esc(&cp, sec, aster); + *cp = '\0'; + globres = glob(file, 0, NULL, &globinfo); + if (globres != 0 && globres != GLOB_NOMATCH) + mandoc_msg(MANDOCERR_GLOB, 0, 0, + "%s: %s", file, strerror(errno)); free(file); + file = NULL; + if (globres == 0) + goto found; + globfree(&globinfo); } + if (res != NULL || ipath + 1 != paths->sz) return -1; @@ -770,19 +819,14 @@ fs_lookup(const struct manpaths *paths, found: warnx("outdated mandoc.db lacks %s(%s) entry, run %s %s", name, sec, BINM_MAKEWHATIS, paths->paths[ipath]); - if (res == NULL) { + if (res == NULL) free(file); - return 0; - } - *res = mandoc_reallocarray(*res, ++*ressz, sizeof(**res)); - page = *res + (*ressz - 1); - page->file = file; - page->names = NULL; - page->output = NULL; - page->bits = NAME_FILE & NAME_MASK; - page->ipath = ipath; - page->sec = (*sec >= '1' && *sec <= '9') ? *sec - '1' + 1 : 10; - page->form = form; + else if (file == NULL) + fs_append(globinfo.gl_pathv, globinfo.gl_pathc, 1, + ipath, sec, form, res, ressz); + else + fs_append(&file, 1, 0, ipath, sec, form, res, ressz); + globfree(&globinfo); return 0; } -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-09-04 13:18 ` Ingo Schwarze @ 2021-09-04 16:16 ` Ingo Schwarze 2021-09-04 17:51 ` Sören Tempel 0 siblings, 1 reply; 8+ messages in thread From: Ingo Schwarze @ 2021-09-04 16:16 UTC (permalink / raw) To: Soeren Tempel; +Cc: tech Hi Soeren, Ingo Schwarze wrote on Sat, Sep 04, 2021 at 03:18:32PM +0200: > Hang on a moment for the remaining issue with the .gz extension > probably breaking the prio code, i'll try to figure that out and > send a patch for that, too. I believe the patch below fixes the issue that main() does not properly prioritize gzipped files according to their file name extensions. Does it make sense to you, and does it work for you? The main() program is growing to become a bit unwieldy, but i'm not going to do any major refactoring right now. Yours, Ingo Index: main.c =================================================================== RCS file: /cvs/src/usr.bin/mandoc/main.c,v retrieving revision 1.259 diff -u -r1.259 main.c --- main.c 4 Sep 2021 12:47:04 -0000 1.259 +++ main.c 4 Sep 2021 16:04:44 -0000 @@ -131,7 +131,7 @@ struct mparse *mp; /* Opaque parser object. */ const char *conf_file; /* -C: alternate config file. */ const char *os_s; /* -I: Operating system for display. */ - const char *progname, *sec; + const char *progname, *sec, *ep; char *defpaths; /* -M: override manpaths. */ char *auxpaths; /* -m: additional manpaths. */ char *oarg; /* -O: output option string. */ @@ -514,11 +514,16 @@ sec++; /* Prefer without suffix. */ if (*sec != '/') prio += 10; /* Wrong dir name. */ - if (search.sec != NULL && - (strlen(sec) <= ssz + 3 || - strcmp(sec + strlen(sec) - ssz, - search.sec) != 0)) - prio += 20; /* Wrong file ext. */ + if (search.sec != NULL) { + ep = strchr(sec, '\0'); + if (ep - sec > 3 && + strncmp(ep - 3, ".gz", 3) == 0) + ep -= 3; + if ((size_t)(ep - sec) < ssz + 3 || + strncmp(ep - ssz, search.sec, + ssz) != 0) /* Wrong file */ + prio += 20; /* extension. */ + } if (prio >= best_prio) continue; best_prio = prio; -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-09-04 16:16 ` Ingo Schwarze @ 2021-09-04 17:51 ` Sören Tempel 2021-09-05 12:47 ` Ingo Schwarze 0 siblings, 1 reply; 8+ messages in thread From: Sören Tempel @ 2021-09-04 17:51 UTC (permalink / raw) To: tech Ingo Schwarze <schwarze@usta.de> wrote: > Hi Soeren, Hello Ingo, > I believe the patch below fixes the issue that main() does not > properly prioritize gzipped files according to their file name > extensions. > > Does it make sense to you, and does it work for you? Yes, the patch does make sense and it does work well for me: man 3 open -> Perl open(3pm) man page man 3p open -> POSIX open(3p) man page man 3pm open -> Perl open(3pm) man page Thanks a lot for fixing this so quickly! BTW: Is there any chance that you could make a new release soonish? Has been a while since the last release and CVS has accumulated quite a few useful fixes since 1.14.5. Greetings, Sören -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Discrepancy in mansearch and fs_lookup behavior 2021-09-04 17:51 ` Sören Tempel @ 2021-09-05 12:47 ` Ingo Schwarze 0 siblings, 0 replies; 8+ messages in thread From: Ingo Schwarze @ 2021-09-05 12:47 UTC (permalink / raw) To: Soeren Tempel; +Cc: tech Hi Soeren, Soeren Tempel wrote on Sat, Sep 04, 2021 at 07:51:58PM +0200: > Ingo Schwarze <schwarze@usta.de> wrote: >> I believe the patch below fixes the issue that main() does not >> properly prioritize gzipped files according to their file name >> extensions. >> >> Does it make sense to you, and does it work for you? > Yes, the patch does make sense and it does work well for me: > > man 3 open -> Perl open(3pm) man page > man 3p open -> POSIX open(3p) man page > man 3pm open -> Perl open(3pm) man page > > Thanks a lot for fixing this so quickly! Thank you for reporting and testing! The patch is now committed, see the commit below. > BTW: Is there any chance that you could make a new release soonish? Has > been a while since the last release and CVS has accumulated quite a few > useful fixes since 1.14.5. Yes, the next release is long overdue. I have been paying attention to not do major reorganizations that might destabilize the code base for quite some time now. I'm trying to get those bugs fixed that are easy to fix with a low risk, then draft the release notes, then send out a beta to downstream maintainers for testing without too much additional delay. My list of pending patches is already free from mandoc entries. My list of urgent bug reports still contains one or two entries related to tbl(7). After that, a final pass over the TODO file makes sense, to see whether anything trivial remains in there - but the bulk of the TODO entries will certainly not be addressed before the upcoming release. Yours, Ingo Log Message: ----------- during prioritization for man(1), correctly extract the section name from the file name extension of gzipped manual page files; bug found on Alpine Linux by Soeren Tempel <soeren at soeren hyphen tempel dot net>, who also tested this patch Modified Files: -------------- mandoc: main.c Revision Data ------------- Index: main.c =================================================================== RCS file: /home/cvs/mandoc/mandoc/main.c,v retrieving revision 1.357 retrieving revision 1.358 diff -Lmain.c -Lmain.c -u -p -r1.357 -r1.358 --- main.c +++ main.c @@ -132,7 +132,7 @@ main(int argc, char *argv[]) struct mparse *mp; /* Opaque parser object. */ const char *conf_file; /* -C: alternate config file. */ const char *os_s; /* -I: Operating system for display. */ - const char *progname, *sec; + const char *progname, *sec, *ep; char *defpaths; /* -M: override manpaths. */ char *auxpaths; /* -m: additional manpaths. */ char *oarg; /* -O: output option string. */ @@ -536,11 +536,16 @@ main(int argc, char *argv[]) sec++; /* Prefer without suffix. */ if (*sec != '/') prio += 10; /* Wrong dir name. */ - if (search.sec != NULL && - (strlen(sec) <= ssz + 3 || - strcmp(sec + strlen(sec) - ssz, - search.sec) != 0)) - prio += 20; /* Wrong file ext. */ + if (search.sec != NULL) { + ep = strchr(sec, '\0'); + if (ep - sec > 3 && + strncmp(ep - 3, ".gz", 3) == 0) + ep -= 3; + if ((size_t)(ep - sec) < ssz + 3 || + strncmp(ep - ssz, search.sec, + ssz) != 0) /* Wrong file */ + prio += 20; /* extension. */ + } if (prio >= best_prio) continue; best_prio = prio; -- To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-09-05 12:47 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-30 19:26 Discrepancy in mansearch and fs_lookup behavior Sören Tempel 2021-09-01 20:35 ` Ingo Schwarze 2021-09-02 19:09 ` Ingo Schwarze 2021-09-04 10:12 ` Sören Tempel 2021-09-04 13:18 ` Ingo Schwarze 2021-09-04 16:16 ` Ingo Schwarze 2021-09-04 17:51 ` Sören Tempel 2021-09-05 12:47 ` Ingo Schwarze
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).