tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* Allow gzipped .so and search .so in manpath
@ 2014-11-24 14:36 Baptiste Daroussin
  2014-11-26 20:10 ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: Baptiste Daroussin @ 2014-11-24 14:36 UTC (permalink / raw)
  To: tech


[-- Attachment #1.1: Type: text/plain, Size: 307 bytes --]

Hi,

Here is a new patch that allows to open gzipped .so files as well as looking
inside mpath for the files requested.

This allows zshall manpage which contains: .so man1/zshmodules.1 to work with
mandoc, tested on freebsd this .so find as expected
/usr/local/man/man1/zshmodules.1.gz

Best regards,
Bapt

[-- Attachment #1.2: mandoc-so_gzipped.diff --]
[-- Type: text/x-diff, Size: 2936 bytes --]

Index: Makefile
===================================================================
RCS file: /cvs/mdocml/Makefile,v
retrieving revision 1.445
diff -u -r1.445 Makefile
--- Makefile	25 Oct 2014 01:03:52 -0000	1.445
+++ Makefile	24 Nov 2014 14:32:46 -0000
@@ -236,7 +236,7 @@
 
 MANPAGE_OBJS	 = manpage.o mansearch.o mansearch_const.o manpath.o
 
-DEMANDOC_OBJS	 = demandoc.o
+DEMANDOC_OBJS	 = demandoc.o manpath.o
 
 WWW_MANS	 = apropos.1.html \
 		   demandoc.1.html \
Index: read.c
===================================================================
RCS file: /cvs/mdocml/read.c,v
retrieving revision 1.96
diff -u -r1.96 read.c
--- read.c	1 Nov 2014 06:03:13 -0000	1.96
+++ read.c	24 Nov 2014 14:32:46 -0000
@@ -19,6 +19,7 @@
 #include "config.h"
 
 #include <sys/types.h>
+#include <sys/param.h>
 #if HAVE_MMAP
 #include <sys/mman.h>
 #include <sys/stat.h>
@@ -41,6 +42,7 @@
 #include "libmandoc.h"
 #include "mdoc.h"
 #include "man.h"
+#include "manpath.h"
 #include "main.h"
 
 #define	REPARSE_LIMIT	1000
@@ -783,10 +785,61 @@
 mparse_readfd(struct mparse *curp, int fd, const char *file)
 {
 	struct buf	 blk;
+	char		 buf[PATH_MAX];
 	int		 with_mmap;
 	int		 save_filenc;
+	size_t		 i;
+	struct manpaths	 paths;
+	struct stat	 st;
+	pid_t		 child_pid;
+
+	child_pid = 0;
+
+	if (-1 == fd && *file != '/') {
+		do {
+			/* First try locally */
+			if ((fd = open(file, O_RDONLY, 0)) != -1)
+				break;
+
+			/* Try the gzip version */
+			(void)snprintf(buf, sizeof(buf), "%s.gz", file);
+			if (stat(buf, &st) != -1)
+				(void)mparse_open(curp, &fd, buf, &child_pid);
+			if (-1 != fd) {
+				curp->file = buf;
+				break;
+			}
+
+			/* Lookup in the manpath */
+			manpath_parse(&paths, NULL, NULL, NULL);
+			for (i = 0; i < paths.sz; i++) {
+				(void)snprintf(buf, sizeof(buf), "%s/%s",
+				    paths.paths[i], file);
+				if (-1 != (fd = open(buf, O_RDONLY, 0))) {
+					curp->file = buf;
+					break;
+				}
+				(void)snprintf(buf, sizeof(buf), "%s/%s.gz",
+				    paths.paths[i], file);
+				if (stat(buf, &st) != -1)
+					(void)mparse_open(curp, &fd, buf, &child_pid);
+				if (-1 != fd) {
+					curp->file = buf;
+					break;
+				}
+			}
+			manpath_free(&paths);
+			if (-1 != fd)
+				break;
 
-	if (-1 == fd && -1 == (fd = open(file, O_RDONLY, 0))) {
+			curp->file_status = MANDOCLEVEL_SYSERR;
+			if (curp->mmsg)
+				(*curp->mmsg)(MANDOCERR_SYSOPEN,
+				    curp->file_status,
+				    file, 0, 0, strerror(errno));
+			return(curp->file_status);
+		} while (0);
+	} else if (-1 == fd && -1 == (fd = open(file, O_RDONLY, 0))) {
 		curp->file_status = MANDOCLEVEL_SYSERR;
 		if (curp->mmsg)
 			(*curp->mmsg)(MANDOCERR_SYSOPEN,
@@ -818,6 +871,8 @@
 
 	if (STDIN_FILENO != fd && -1 == close(fd))
 		perror(file);
+	if (child_pid)
+		mparse_wait(curp, child_pid);
 
 	return(curp->file_status);
 }

[-- Attachment #2: Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Allow gzipped .so and search .so in manpath
  2014-11-24 14:36 Allow gzipped .so and search .so in manpath Baptiste Daroussin
@ 2014-11-26 20:10 ` Ingo Schwarze
  2014-11-26 20:24   ` Baptiste Daroussin
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2014-11-26 20:10 UTC (permalink / raw)
  To: Baptiste Daroussin; +Cc: tech

Hi Baptiste,

Baptiste Daroussin wrote on Mon, Nov 24, 2014 at 03:36:29PM +0100:

> Here is a new patch that allows to open gzipped .so files
> as well as looking inside mpath for the files requested.

This is a terrible layering violation for three (!) reasons:

 1. The mandoc toolbox consists of code in two major categories:
     a) code that is handling files irrespective of context
     b) code that is aware of file context / file systems
    While b) code is allowed to use a) code, the reverse
    is not true.
    The file read.c is part of the file handling code a), while
    manpath.h is part of the b) interface; so read.c cannot
    include manpath.h.
    It is true that the implementation of .so in roff.c (which
    is also part of a) slightly strains the "irrespective of
    context" paradigm by accessing another file.  But at least
    it retrieves the file in exactly one well-defined place
    near the original file, and this strain is unavoidable
    if we want to support .so at all.
    Widening the gap by making read.c search for files along
    the path is neither acceptable nor needed.

 2. As proposed, the patch is horribly inefficient.
    For some purposes, the function mparse_readfd() is called
    in inner loops, so it is not acceptable to do such expensive
    operations in there.
    Explained differently, we must not recalculate the manpath
    for each and every file we parse.  Think about makewhatis(8),
    for example.
    Calculating the manpath belongs on the level of the main
    program, not into some inner loop in a library.

 3. The patch utterly breaks the library hierarchy.
    The function mparse_readfd() is a libmandoc interface
    as documented in mandoc(3), while manpath.h is not.
    While manpath.c can (and does) use libmandoc, the reverse
    cannot happen.
    You see part of the havoc wrought in that you suddenly
    need manpath.o to link demandoc(1) even though that
    program does not use the manpath.h interface.

I have to think about a better way to solve this.
Maybe i should make mparse_readfd() use mparse_open()
and extend mparse_open a bit to use file.gz if file
is not found.  Or something like that.
It shouldn't take too long to figure out.

> This allows zshall manpage which contains: .so man1/zshmodules.1
> to work with mandoc, tested on freebsd this .so find as expected
> /usr/local/man/man1/zshmodules.1.gz

The .so links are a bad idea in the first place, so it would be
good to get rid of them completely.  But admittedly, that's a
separate matter.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Allow gzipped .so and search .so in manpath
  2014-11-26 20:10 ` Ingo Schwarze
@ 2014-11-26 20:24   ` Baptiste Daroussin
  2014-11-27  0:12     ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: Baptiste Daroussin @ 2014-11-26 20:24 UTC (permalink / raw)
  To: tech

[-- Attachment #1: Type: text/plain, Size: 3109 bytes --]

On Wed, Nov 26, 2014 at 09:10:33PM +0100, Ingo Schwarze wrote:
> Hi Baptiste,
> 
> Baptiste Daroussin wrote on Mon, Nov 24, 2014 at 03:36:29PM +0100:
> 
> > Here is a new patch that allows to open gzipped .so files
> > as well as looking inside mpath for the files requested.
> 
> This is a terrible layering violation for three (!) reasons:
> 
>  1. The mandoc toolbox consists of code in two major categories:
>      a) code that is handling files irrespective of context
>      b) code that is aware of file context / file systems
>     While b) code is allowed to use a) code, the reverse
>     is not true.
>     The file read.c is part of the file handling code a), while
>     manpath.h is part of the b) interface; so read.c cannot
>     include manpath.h.
>     It is true that the implementation of .so in roff.c (which
>     is also part of a) slightly strains the "irrespective of
>     context" paradigm by accessing another file.  But at least
>     it retrieves the file in exactly one well-defined place
>     near the original file, and this strain is unavoidable
>     if we want to support .so at all.
>     Widening the gap by making read.c search for files along
>     the path is neither acceptable nor needed.
> 
>  2. As proposed, the patch is horribly inefficient.
>     For some purposes, the function mparse_readfd() is called
>     in inner loops, so it is not acceptable to do such expensive
>     operations in there.
>     Explained differently, we must not recalculate the manpath
>     for each and every file we parse.  Think about makewhatis(8),
>     for example.
>     Calculating the manpath belongs on the level of the main
>     program, not into some inner loop in a library.
> 
>  3. The patch utterly breaks the library hierarchy.
>     The function mparse_readfd() is a libmandoc interface
>     as documented in mandoc(3), while manpath.h is not.
>     While manpath.c can (and does) use libmandoc, the reverse
>     cannot happen.
>     You see part of the havoc wrought in that you suddenly
>     need manpath.o to link demandoc(1) even though that
>     program does not use the manpath.h interface.
> 
> I have to think about a better way to solve this.
> Maybe i should make mparse_readfd() use mparse_open()
> and extend mparse_open a bit to use file.gz if file
> is not found.  Or something like that.
> It shouldn't take too long to figure out.

I was expecting the above :)
What about using zlib directly, that could simplify a bunch of things.
> 
> > This allows zshall manpage which contains: .so man1/zshmodules.1
> > to work with mandoc, tested on freebsd this .so find as expected
> > /usr/local/man/man1/zshmodules.1.gz
> 
> The .so links are a bad idea in the first place, so it would be
> good to get rid of them completely.  But admittedly, that's a
> separate matter.

Right now in the ports tree I'm ripping them at package creation using soelim
from groff and I wrote minimalistic version in base to be able to continue using
that when groff will be removed.

regards,
Bapt

[-- Attachment #2: Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Allow gzipped .so and search .so in manpath
  2014-11-26 20:24   ` Baptiste Daroussin
@ 2014-11-27  0:12     ` Ingo Schwarze
  2014-11-27  2:03       ` Ingo Schwarze
  2014-11-29 17:54       ` Baptiste Daroussin
  0 siblings, 2 replies; 7+ messages in thread
From: Ingo Schwarze @ 2014-11-27  0:12 UTC (permalink / raw)
  To: Baptiste Daroussin; +Cc: tech

Hi Baptiste,

Baptiste Daroussin wrote on Wed, Nov 26, 2014 at 09:24:13PM +0100:

> I was expecting the above :)

I just committed the two following patches, and i believe they
resolve the issue.

There is still a minor problem in makewhatis(8) with database
generation when .so to .gz is involved, but i'll get round to that.

Note that -current is at

  http://fantadrom.bsd.lv/cgi-bin/cvsweb/
  CVSROOT=anoncvs@fantadrom.bsd.lv:/cvs

today, mdocl.bsd.lv is a few days old.  I'm right in the middle of
switching the server to new hardware.  It will be back at mdocml
tomorrow, or in a few days at the latest.

> What about using zlib directly, that could simplify a bunch of things.

That idea makes sense to me.  However, i want to do the 1.13.2
release without adding any new dependencies to help the downstream
projects to stabilize and/or migrate to the 1.13 series.  Granted,
libz is all but heavyweight, and it's hard to imagine there might
be anybody out there who doesn't have it, but anyway...
I'd rather return to that idea after the 1.13.2 release.

> Right now in the ports tree I'm ripping them at package creation
> using soelim from groff and I wrote minimalistic version in base
> to be able to continue using that when groff will be removed.

Sounds reasonable.

When dealing with pure .so files, i.e. those containing a single .so
request and nothing else, it would even be better to replace them with
hard links instead of having two files with the same content - not so
much because that saves space, but rather because it allows combining
information in mandoc.db(5)/apropos(1).

For the user, the end result looks like this:

With copies of files:

   $ apropos getuid
  geteuid(2) - get user identification
  getuid(2) - get user identification

With hard (or soft or .so) links:

   $ apropos getuid
  geteuid, getuid(2) - get user identification

The latter not only saves screen real estate, but also helps to
realize both are documented in the same manual page.

Yours,
  Ingo


Log Message:
-----------
Simplify the mparse_open()/mparse_wait() interface.
Don't bother the user with the PID of the child process,
store it inside the opaque mparse handle.

Modified Files:
--------------
    mdocml:
        TODO
        main.c
        mandoc.3
        mandoc.h
        mandocdb.c
        read.c

Revision Data
-------------
Index: mandocdb.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandocdb.c,v
retrieving revision 1.169
retrieving revision 1.170
diff -Lmandocdb.c -Lmandocdb.c -u -p -r1.169 -r1.170
--- mandocdb.c
+++ mandocdb.c
@@ -1084,7 +1084,6 @@ mpages_merge(struct mchars *mc, struct m
 	struct man		*man;
 	char			*sodest;
 	char			*cp;
-	pid_t			 child_pid;
 	int			 fd;
 	unsigned int		 pslot;
 	enum mandoclevel	 lvl;
@@ -1112,9 +1111,8 @@ mpages_merge(struct mchars *mc, struct m
 		mdoc = NULL;
 		man = NULL;
 		sodest = NULL;
-		child_pid = 0;
 
-		mparse_open(mp, &fd, mpage->mlinks->file, &child_pid);
+		mparse_open(mp, &fd, mpage->mlinks->file);
 		if (fd == -1) {
 			say(mpage->mlinks->file, "&open");
 			goto nextpage;
@@ -1231,8 +1229,7 @@ mpages_merge(struct mchars *mc, struct m
 		dbadd(mpage, mc);
 
 nextpage:
-		if (child_pid &&
-		    mparse_wait(mp, child_pid) != MANDOCLEVEL_OK) {
+		if (mparse_wait(mp) != MANDOCLEVEL_OK) {
 			exitcode = (int)MANDOCLEVEL_SYSERR;
 			say(mpage->mlinks->file, "&wait gunzip");
 		}
Index: mandoc.h
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandoc.h,v
retrieving revision 1.168
retrieving revision 1.169
diff -Lmandoc.h -Lmandoc.h -u -p -r1.168 -r1.169
--- mandoc.h
+++ mandoc.h
@@ -436,8 +436,7 @@ struct mparse	 *mparse_alloc(int, enum m
 			const struct mchars *, const char *);
 void		  mparse_free(struct mparse *);
 void		  mparse_keep(struct mparse *);
-enum mandoclevel  mparse_open(struct mparse *, int *, const char *,
-			pid_t *);
+enum mandoclevel  mparse_open(struct mparse *, int *, const char *);
 enum mandoclevel  mparse_readfd(struct mparse *, int, const char *);
 enum mandoclevel  mparse_readmem(struct mparse *, const void *, size_t,
 			const char *);
@@ -447,7 +446,7 @@ void		  mparse_result(struct mparse *,
 const char	 *mparse_getkeep(const struct mparse *);
 const char	 *mparse_strerror(enum mandocerr);
 const char	 *mparse_strlevel(enum mandoclevel);
-enum mandoclevel  mparse_wait(struct mparse *, pid_t);
+enum mandoclevel  mparse_wait(struct mparse *);
 
 __END_DECLS
 
Index: TODO
===================================================================
RCS file: /home/cvs/mdocml/mdocml/TODO,v
retrieving revision 1.188
retrieving revision 1.189
diff -LTODO -LTODO -u -p -r1.188 -r1.189
--- TODO
+++ TODO
@@ -564,6 +564,9 @@ Several areas can be cleaned up to make 
 * structural issues
 ************************************************************************
 
+- Use libz directly instead of forking gunzip(1).
+  Suggested by bapt at FreeBSD among others.
+
 - We use the input line number at several places to distinguish
   same-line from different-line input.  That plainly doesn't work
   with user-defined macros, leading to random breakage.
Index: read.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/read.c,v
retrieving revision 1.96
retrieving revision 1.97
diff -Lread.c -Lread.c -u -p -r1.96 -r1.97
--- read.c
+++ read.c
@@ -64,6 +64,7 @@ struct	mparse {
 	int		  filenc; /* encoding of the current file */
 	int		  reparse_count; /* finite interp. stack */
 	int		  line; /* line number in the file */
+	pid_t		  child; /* the gunzip(1) process */
 };
 
 static	void	  choose_parser(struct mparse *);
@@ -823,8 +824,7 @@ mparse_readfd(struct mparse *curp, int f
 }
 
 enum mandoclevel
-mparse_open(struct mparse *curp, int *fd, const char *file,
-	pid_t *child_pid)
+mparse_open(struct mparse *curp, int *fd, const char *file)
 {
 	int		  pfd[2];
 	char		 *cp;
@@ -834,7 +834,7 @@ mparse_open(struct mparse *curp, int *fd
 	curp->file = file;
 	if ((cp = strrchr(file, '.')) == NULL ||
 	    strcmp(cp + 1, "gz")) {
-		*child_pid = 0;
+		curp->child = 0;
 		if ((*fd = open(file, O_RDONLY)) == -1) {
 			err = MANDOCERR_SYSOPEN;
 			goto out;
@@ -847,7 +847,7 @@ mparse_open(struct mparse *curp, int *fd
 		goto out;
 	}
 
-	switch (*child_pid = fork()) {
+	switch (curp->child = fork()) {
 	case -1:
 		err = MANDOCERR_SYSFORK;
 		close(pfd[0]);
@@ -871,7 +871,7 @@ mparse_open(struct mparse *curp, int *fd
 
 out:
 	*fd = -1;
-	*child_pid = 0;
+	curp->child = 0;
 	curp->file_status = MANDOCLEVEL_SYSERR;
 	if (curp->mmsg)
 		(*curp->mmsg)(err, curp->file_status, file,
@@ -882,11 +882,14 @@ out:
 }
 
 enum mandoclevel
-mparse_wait(struct mparse *curp, pid_t child_pid)
+mparse_wait(struct mparse *curp)
 {
 	int	  status;
 
-	if (waitpid(child_pid, &status, 0) == -1) {
+	if (curp->child == 0)
+		return(MANDOCLEVEL_OK);
+
+	if (waitpid(curp->child, &status, 0) == -1) {
 		mandoc_msg(MANDOCERR_SYSWAIT, curp, 0, 0,
 		    strerror(errno));
 		curp->file_status = MANDOCLEVEL_SYSERR;
Index: mandoc.3
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandoc.3,v
retrieving revision 1.27
retrieving revision 1.28
diff -Lmandoc.3 -Lmandoc.3 -u -p -r1.27 -r1.28
--- mandoc.3
+++ mandoc.3
@@ -81,7 +81,6 @@
 .Fa "struct mparse *parse"
 .Fa "int *fd"
 .Fa "const char *fname"
-.Fa "pid_t *child_pid"
 .Fc
 .Ft "enum mandoclevel"
 .Fo mparse_readfd
@@ -111,7 +110,6 @@
 .Ft "enum mandoclevel"
 .Fo mparse_wait
 .Fa "struct mparse *parse"
-.Fa "pid_t child_pid"
 .Fc
 .In sys/types.h
 .In mandoc.h
@@ -404,14 +402,6 @@ or -1 on failure.
 It can be passed to
 .Fn mparse_readfd
 or used directly.
-If applicable, return the
-.Xr gunzip 1
-child process ID in
-.Fa child_pid ,
-or otherwise 0.
-If non-zero, it should be passed to
-.Fn mparse_wait
-after completing the parse sequence.
 Declared in
 .In mandoc.h ,
 implemented in
Index: main.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/main.c,v
retrieving revision 1.199
retrieving revision 1.200
diff -Lmain.c -Lmain.c -u -p -r1.199 -r1.200
--- main.c
+++ main.c
@@ -115,7 +115,6 @@ main(int argc, char *argv[])
 #endif
 	enum mandoclevel rc;
 	enum outmode	 outmode;
-	pid_t		 child_pid;
 	int		 fd;
 	int		 show_usage;
 	int		 use_pager;
@@ -388,8 +387,7 @@ main(int argc, char *argv[])
 	while (argc) {
 #if HAVE_SQLITE3
 		if (resp != NULL) {
-			rc = mparse_open(curp.mp, &fd, resp->file,
-			    &child_pid);
+			rc = mparse_open(curp.mp, &fd, resp->file);
 			if (fd == -1)
 				/* nothing */;
 			else if (resp->form & FORM_SRC) {
@@ -403,14 +401,12 @@ main(int argc, char *argv[])
 		} else
 #endif
 		{
-			rc = mparse_open(curp.mp, &fd, *argv++,
-			    &child_pid);
+			rc = mparse_open(curp.mp, &fd, *argv++);
 			if (fd != -1)
 				parse(&curp, fd, argv[-1], &rc);
 		}
 
-		if (child_pid &&
-		    mparse_wait(curp.mp, child_pid) != MANDOCLEVEL_OK)
+		if (mparse_wait(curp.mp) != MANDOCLEVEL_OK)
 			rc = MANDOCLEVEL_SYSERR;
 
 		if (MANDOCLEVEL_OK != rc && curp.wstop)


Log Message:
-----------
Let mparse_readfd() use mparse_open() and mparse_wait()
and let mparse_open() fall back to .gz files
such that .so works even when the target is zipped,
requested by and in part using ideas from <bapt at FreeBSD>.
While here, make sure files are readable before forking,
both for efficiency and for better error reporting.

Modified Files:
--------------
    mdocml:
        mandoc.3
        read.c

Revision Data
-------------
Index: read.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/read.c,v
retrieving revision 1.97
retrieving revision 1.98
diff -Lread.c -Lread.c -u -p -r1.97 -r1.98
--- read.c
+++ read.c
@@ -780,28 +780,25 @@ mparse_readmem(struct mparse *curp, cons
 	return(curp->file_status);
 }
 
+/*
+ * If a file descriptor is given, use it and assume it points
+ * to the named file.  Otherwise, open the named file.
+ * Read the whole file into memory and call the parsers.
+ * Called recursively when an .so request is encountered.
+ */
 enum mandoclevel
 mparse_readfd(struct mparse *curp, int fd, const char *file)
 {
 	struct buf	 blk;
 	int		 with_mmap;
 	int		 save_filenc;
+	pid_t		 save_child;
 
-	if (-1 == fd && -1 == (fd = open(file, O_RDONLY, 0))) {
-		curp->file_status = MANDOCLEVEL_SYSERR;
-		if (curp->mmsg)
-			(*curp->mmsg)(MANDOCERR_SYSOPEN,
-			    curp->file_status,
-			    file, 0, 0, strerror(errno));
-		return(curp->file_status);
-	}
-
-	/*
-	 * Run for each opened file; may be called more than once for
-	 * each full parse sequence if the opened file is nested (i.e.,
-	 * from `so').  Simply sucks in the whole file and moves into
-	 * the parse phase for the file.
-	 */
+	save_child = curp->child;
+	if (fd != -1)
+		curp->child = 0;
+	else if (mparse_open(curp, &fd, file) >= MANDOCLEVEL_SYSERR)
+		goto out;
 
 	if (read_whole_file(curp, file, fd, &blk, &with_mmap)) {
 		save_filenc = curp->filenc;
@@ -817,9 +814,12 @@ mparse_readfd(struct mparse *curp, int f
 			free(blk.buf);
 	}
 
-	if (STDIN_FILENO != fd && -1 == close(fd))
+	if (fd != STDIN_FILENO && close(fd) == -1)
 		perror(file);
 
+	mparse_wait(curp);
+out:
+	curp->child = save_child;
 	return(curp->file_status);
 }
 
@@ -827,21 +827,40 @@ enum mandoclevel
 mparse_open(struct mparse *curp, int *fd, const char *file)
 {
 	int		  pfd[2];
+	int		  save_errno;
 	char		 *cp;
 	enum mandocerr	  err;
 
 	pfd[1] = -1;
 	curp->file = file;
+
+	/* Unless zipped, try to just open the file. */
+
 	if ((cp = strrchr(file, '.')) == NULL ||
 	    strcmp(cp + 1, "gz")) {
 		curp->child = 0;
-		if ((*fd = open(file, O_RDONLY)) == -1) {
-			err = MANDOCERR_SYSOPEN;
-			goto out;
-		}
-		return(MANDOCLEVEL_OK);
+		if ((*fd = open(file, O_RDONLY)) != -1)
+			return(MANDOCLEVEL_OK);
+
+		/* Open failed; try to append ".gz". */
+
+		mandoc_asprintf(&cp, "%s.gz", file);
+		file = cp;
+	} else
+		cp = NULL;
+
+	/* Before forking, make sure the file can be read. */
+
+	save_errno = errno;
+	if (access(file, R_OK) == -1) {
+		if (cp != NULL)
+			errno = save_errno;
+		err = MANDOCERR_SYSOPEN;
+		goto out;
 	}
 
+	/* Run gunzip(1). */
+
 	if (pipe(pfd) == -1) {
 		err = MANDOCERR_SYSPIPE;
 		goto out;
@@ -870,11 +889,12 @@ mparse_open(struct mparse *curp, int *fd
 	}
 
 out:
+	free(cp);
 	*fd = -1;
 	curp->child = 0;
 	curp->file_status = MANDOCLEVEL_SYSERR;
 	if (curp->mmsg)
-		(*curp->mmsg)(err, curp->file_status, file,
+		(*curp->mmsg)(err, curp->file_status, curp->file,
 		    0, 0, strerror(errno));
 	if (pfd[1] != -1)
 		exit(1);
Index: mandoc.3
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandoc.3,v
retrieving revision 1.28
retrieving revision 1.29
diff -Lmandoc.3 -Lmandoc.3 -u -p -r1.28 -r1.29
--- mandoc.3
+++ mandoc.3
@@ -396,6 +396,12 @@ open with
 .Xr gunzip 1 ;
 otherwise, with
 .Xr open 2 .
+If
+.Xr open 2
+fails, append
+.Pa .gz
+and try with
+.Xr gunzip 1 .
 Return a file descriptor open for reading in
 .Fa fd ,
 or -1 on failure.
@@ -410,14 +416,18 @@ implemented in
 Parse a file or file descriptor.
 If
 .Va fd
-is -1,
+is -1, open
 .Va fname
-is opened for reading.
+with
+.Fn mparse_open .
 Otherwise,
 .Va fname
 is assumed to be the name associated with
 .Va fd .
-This may be called multiple times with different parameters; however,
+Calls
+.Fn mparse_wait
+before returning.
+This function may be called multiple times with different parameters; however,
 .Fn mparse_reset
 should be invoked between parses.
 Declared in
@@ -461,11 +471,12 @@ implemented in
 .It Fn mparse_wait
 Bury a
 .Xr gunzip 1
-child process
-.Fa child_pid
-that was spawned with
+child process that was spawned with
 .Fn mparse_open .
 To be called after the parse sequence is complete.
+Not needed after
+.Fn mparse_readfd ,
+but does no harm in that case, either.
 Returns
 .Dv MANDOCLEVEL_OK
 on success and
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Allow gzipped .so and search .so in manpath
  2014-11-27  0:12     ` Ingo Schwarze
@ 2014-11-27  2:03       ` Ingo Schwarze
  2014-11-29 17:54       ` Baptiste Daroussin
  1 sibling, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2014-11-27  2:03 UTC (permalink / raw)
  To: Baptiste Daroussin; +Cc: tech

Hi Baptiste,

Ingo Schwarze wrote on Thu, Nov 27, 2014 at 01:12:42AM +0100:

> There is still a minor problem in makewhatis(8) with database
> generation when .so to .gz is involved, but i'll get round to that.

Fixed, see the commit below.

Yours,
  Ingo


Log Message:
-----------
Make makewhatis(8) understand .so links to .gz pages.
Drop the FORM_GZ annotation in the mpages table; it is conceptually wrong 
because it ought to be in the mlinks table: An uncompressed .so link file
can point to a compressed manual page file and vice versa.  
Besides, it is no longer needed because mparse_open() handles it all.
Sprinkle some KNF while here.

Modified Files:
--------------
    mdocml:
        mandocdb.c
        mansearch.c
        mansearch.h

Revision Data
-------------
Index: mansearch.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mansearch.c,v
retrieving revision 1.50
retrieving revision 1.51
diff -Lmansearch.c -Lmansearch.c -u -p -r1.50 -r1.51
--- mansearch.c
+++ mansearch.c
@@ -410,7 +410,6 @@ buildnames(struct manpage *mpage, sqlite
 {
 	char		*newnames, *prevsec, *prevarch;
 	const char	*oldnames, *sep1, *name, *sec, *sep2, *arch, *fsec;
-	const char	*gzip;
 	size_t		 i;
 	int		 c;
 
@@ -473,7 +472,7 @@ buildnames(struct manpage *mpage, sqlite
 
 		/* Also save the first file name encountered. */
 
-		if (NULL != mpage->file)
+		if (mpage->file != NULL)
 			continue;
 
 		if (form & FORM_SRC) {
@@ -483,22 +482,18 @@ buildnames(struct manpage *mpage, sqlite
 			sep1 = "cat";
 			fsec = "0";
 		}
-		if (form & FORM_GZ)
-			gzip = ".gz";
-		else
-			gzip = "";
-		sep2 = '\0' == *arch ? "" : "/";
-		mandoc_asprintf(&mpage->file, "%s/%s%s%s%s/%s.%s%s",
-		    path, sep1, sec, sep2, arch, name, fsec, gzip);
+		sep2 = *arch == '\0' ? "" : "/";
+		mandoc_asprintf(&mpage->file, "%s/%s%s%s%s/%s.%s",
+		    path, sep1, sec, sep2, arch, name, fsec);
 	}
-	if (SQLITE_DONE != c)
+	if (c != SQLITE_DONE)
 		fprintf(stderr, "%s\n", sqlite3_errmsg(db));
 	sqlite3_reset(s);
 
 	/* Append one final section to the names. */
 
-	if (NULL != prevsec) {
-		sep2 = '\0' == *prevarch ? "" : "/";
+	if (prevsec != NULL) {
+		sep2 = *prevarch == '\0' ? "" : "/";
 		mandoc_asprintf(&newnames, "%s(%s%s%s)",
 		    mpage->names, prevsec, sep2, prevarch);
 		free(mpage->names);
Index: mandocdb.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mandocdb.c,v
retrieving revision 1.170
retrieving revision 1.171
diff -Lmandocdb.c -Lmandocdb.c -u -p -r1.170 -r1.171
--- mandocdb.c
+++ mandocdb.c
@@ -1093,13 +1093,13 @@ mpages_merge(struct mchars *mc, struct m
 	str_info.free = hash_free;
 	str_info.key_offset = offsetof(struct str, key);
 
-	if (0 == nodb)
+	if ( ! nodb)
 		SQL_EXEC("BEGIN TRANSACTION");
 
 	mpage = ohash_first(&mpages, &pslot);
-	while (NULL != mpage) {
+	while (mpage != NULL) {
 		mlinks_undupe(mpage);
-		if (NULL == mpage->mlinks) {
+		if (mpage->mlinks == NULL) {
 			mpage = ohash_next(&mpages, &pslot);
 			continue;
 		}
@@ -1123,17 +1123,23 @@ mpages_merge(struct mchars *mc, struct m
 		 * source code, unless it is already known to be
 		 * formatted.  Fall back to formatted mode.
 		 */
-		if (FORM_CAT != mpage->mlinks->dform ||
-		    FORM_CAT != mpage->mlinks->fform) {
+		if (mpage->mlinks->dform != FORM_CAT ||
+		    mpage->mlinks->fform != FORM_CAT) {
 			lvl = mparse_readfd(mp, fd, mpage->mlinks->file);
 			if (lvl < MANDOCLEVEL_FATAL)
 				mparse_result(mp, &mdoc, &man, &sodest);
 		}
 
-		if (NULL != sodest) {
+		if (sodest != NULL) {
 			mlink_dest = ohash_find(&mlinks,
 			    ohash_qlookup(&mlinks, sodest));
-			if (NULL != mlink_dest) {
+			if (mlink_dest == NULL) {
+				mandoc_asprintf(&cp, "%s.gz", sodest);
+				mlink_dest = ohash_find(&mlinks,
+				    ohash_qlookup(&mlinks, cp));
+				free(cp);
+			}
+			if (mlink_dest != NULL) {
 
 				/* The .so target exists. */
 
@@ -1154,7 +1160,7 @@ mpages_merge(struct mchars *mc, struct m
 					if (mpage_dest->pageid)
 						dbadd_mlink_name(mlink);
 
-					if (NULL == mlink->next)
+					if (mlink->next == NULL)
 						break;
 					mlink = mlink->next;
 				}
@@ -1166,17 +1172,17 @@ mpages_merge(struct mchars *mc, struct m
 				mpage->mlinks = NULL;
 			}
 			goto nextpage;
-		} else if (NULL != mdoc) {
+		} else if (mdoc != NULL) {
 			mpage->form = FORM_SRC;
 			mpage->sec = mdoc_meta(mdoc)->msec;
 			mpage->sec = mandoc_strdup(
-			    NULL == mpage->sec ? "" : mpage->sec);
+			    mpage->sec == NULL ? "" : mpage->sec);
 			mpage->arch = mdoc_meta(mdoc)->arch;
 			mpage->arch = mandoc_strdup(
-			    NULL == mpage->arch ? "" : mpage->arch);
+			    mpage->arch == NULL ? "" : mpage->arch);
 			mpage->title =
 			    mandoc_strdup(mdoc_meta(mdoc)->title);
-		} else if (NULL != man) {
+		} else if (man != NULL) {
 			mpage->form = FORM_SRC;
 			mpage->sec =
 			    mandoc_strdup(man_meta(man)->msec);
@@ -1193,8 +1199,6 @@ mpages_merge(struct mchars *mc, struct m
 			mpage->title =
 			    mandoc_strdup(mpage->mlinks->name);
 		}
-		if (mpage->mlinks->gzip)
-			mpage->form |= FORM_GZ;
 		putkey(mpage, mpage->sec, TYPE_sec);
 		if (*mpage->arch != '\0')
 			putkey(mpage, mpage->arch, TYPE_arch);
Index: mansearch.h
===================================================================
RCS file: /home/cvs/mdocml/mdocml/mansearch.h,v
retrieving revision 1.20
retrieving revision 1.21
diff -Lmansearch.h -Lmansearch.h -u -p -r1.20 -r1.21
--- mansearch.h
+++ mansearch.h
@@ -70,7 +70,6 @@
 
 #define	FORM_CAT	 0  /* manual page is preformatted */
 #define	FORM_SRC	 1  /* format is mdoc(7) or man(7) */
-#define	FORM_GZ		 2  /* compressed with gzip(1) */
 #define	FORM_NONE	 4  /* format is unknown */
 
 enum	argmode {
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Allow gzipped .so and search .so in manpath
  2014-11-27  0:12     ` Ingo Schwarze
  2014-11-27  2:03       ` Ingo Schwarze
@ 2014-11-29 17:54       ` Baptiste Daroussin
  2014-11-29 18:14         ` Ingo Schwarze
  1 sibling, 1 reply; 7+ messages in thread
From: Baptiste Daroussin @ 2014-11-29 17:54 UTC (permalink / raw)
  To: tech

[-- Attachment #1: Type: text/plain, Size: 3030 bytes --]

On Thu, Nov 27, 2014 at 01:12:42AM +0100, Ingo Schwarze wrote:
> Hi Baptiste,
> 
> Baptiste Daroussin wrote on Wed, Nov 26, 2014 at 09:24:13PM +0100:
> 
> > I was expecting the above :)
> 
> I just committed the two following patches, and i believe they
> resolve the issue.
> 
> There is still a minor problem in makewhatis(8) with database
> generation when .so to .gz is involved, but i'll get round to that.
> 
> Note that -current is at
> 
>   http://fantadrom.bsd.lv/cgi-bin/cvsweb/
>   CVSROOT=anoncvs@fantadrom.bsd.lv:/cvs
> 
> today, mdocl.bsd.lv is a few days old.  I'm right in the middle of
> switching the server to new hardware.  It will be back at mdocml
> tomorrow, or in a few days at the latest.
> 
> > What about using zlib directly, that could simplify a bunch of things.
> 
> That idea makes sense to me.  However, i want to do the 1.13.2
> release without adding any new dependencies to help the downstream
> projects to stabilize and/or migrate to the 1.13 series.  Granted,
> libz is all but heavyweight, and it's hard to imagine there might
> be anybody out there who doesn't have it, but anyway...
> I'd rather return to that idea after the 1.13.2 release.
> 
> > Right now in the ports tree I'm ripping them at package creation
> > using soelim from groff and I wrote minimalistic version in base
> > to be able to continue using that when groff will be removed.
> 
> Sounds reasonable.
> 
> When dealing with pure .so files, i.e. those containing a single .so
> request and nothing else, it would even be better to replace them with
> hard links instead of having two files with the same content - not so
> much because that saves space, but rather because it allows combining
> information in mandoc.db(5)/apropos(1).
> 
> For the user, the end result looks like this:
> 
> With copies of files:
> 
>    $ apropos getuid
>   geteuid(2) - get user identification
>   getuid(2) - get user identification
> 
> With hard (or soft or .so) links:
> 
>    $ apropos getuid
>   geteuid, getuid(2) - get user identification
> 
> The latter not only saves screen real estate, but also helps to
> realize both are documented in the same manual page.

Yes that makes sense, I'll look in that direction, I ll see what I can cook for
ports.
> 
> Yours,
>   Ingo
> 
Shouldn't the look up for .so links be done in the manpath.

For example I have the WindowMaker manpage (which should really be a link in
that case :D) installed here:

/usr/local/man/man1/WindowMaker.1x.gz

The content is the following:
.so man1/wmaker.1x.gz

If I run:
./mandoc /usr/local/man/man1/WindowMaker.1x.gz

mandoc: man1/wmaker.1x: SYSERR: No such file or directory
mandoc: man1/wmaker.1x:1:19: FATAL: .so request failed: .so man1/wmaker.1x

If I trace it it tries to open man1/wmaker.1x then man1/wmaker.1x.gz but in the
directory where I run the command.

zshall(1) is a better example as in this it cannot be replaced by links.

Best regards,
Bapt

[-- Attachment #2: Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Allow gzipped .so and search .so in manpath
  2014-11-29 17:54       ` Baptiste Daroussin
@ 2014-11-29 18:14         ` Ingo Schwarze
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2014-11-29 18:14 UTC (permalink / raw)
  To: Baptiste Daroussin; +Cc: tech

Hi Baptiste,

Baptiste Daroussin wrote on Sat, Nov 29, 2014 at 06:54:08PM +0100:

> Shouldn't the look up for .so links be done in the manpath.

No.  That might end up including completely unrelated files
that happen to be in the manpath before the intended file.
It has to include the file in the *same* manpath directory.

> For example I have the WindowMaker manpage (which should really
> be a link in that case :D) installed here:
> 
> /usr/local/man/man1/WindowMaker.1x.gz
> 
> The content is the following:
> .so man1/wmaker.1x.gz
> 
> If I run:
> ./mandoc /usr/local/man/man1/WindowMaker.1x.gz

User error.

> mandoc: man1/wmaker.1x: SYSERR: No such file or directory
> mandoc: man1/wmaker.1x:1:19: FATAL: .so request failed: .so man1/wmaker.1x
> 
> If I trace it it tries to open man1/wmaker.1x then man1/wmaker.1x.gz
> but in the directory where I run the command.

Yes.  The man(1) command is responsible for changing the directory
before running the formatter.  If you run the formatter manually,
you need to specify the directory manually, too:

  $ (cd /usr/local/man && mandoc man1/WindowMaker.1x.gz)

That's one of the reasons .so links are so fragile.


Oh.  Now that i explain how it's supposed to work, i realize
that i could do something about it.  The makewhatis(8) utility
already contains (relatively simple) code to split absolute
filenames into a manpath (/usr/local/man) and a local file
path (man1/WindowMaker.1x[.gz]).  I might try to reuse that
code in mandoc(1) = man -l  to:

 * figure out the manpath for each absolute path given
 * try to chdir(2) there
    - if it fails, be silent about it
    - if it succeeds, remember the original working dir
 * open the file, run the parsers and formatters
 * return to the original working dir
 * proceed to the next file

I'll think about it.

Thanks for asking the question,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-11-29 18:14 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-24 14:36 Allow gzipped .so and search .so in manpath Baptiste Daroussin
2014-11-26 20:10 ` Ingo Schwarze
2014-11-26 20:24   ` Baptiste Daroussin
2014-11-27  0:12     ` Ingo Schwarze
2014-11-27  2:03       ` Ingo Schwarze
2014-11-29 17:54       ` Baptiste Daroussin
2014-11-29 18:14         ` Ingo Schwarze

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).