tech@mandoc.bsd.lv
* whatis(1)
@ 2011-11-27 12:16 Kristaps Dzonsons
  2011-11-27 18:13 ` whatis(1) Ingo Schwarze
  0 siblings, 1 reply; 5+ messages in thread
From: Kristaps Dzonsons @ 2011-11-27 12:16 UTC
  To: tech

Hi,

Enclosed is a simple implementation of whatis(1).  It's a mode of 
apropos(1) where arguments are re-written as

   foo  =>  Nm~^foo$

This follows OpenBSD's method; other systems, like my Mac, search for 
both `Nm' and `Nd', but I'm avoiding this for now because I can't figure 
out the proper regex for word boundaries in multi-word `Nd' strings.
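
As a standalone illustration of the rewrite above, here is a minimal
sketch.  The helper name whatis_term() is hypothetical and does not
appear in the attached patch, and snprintf(3) is swapped in for the
strlcpy(3)/strlcat(3) calls the patch uses so the sketch builds on
systems without those functions.  Regex metacharacters in the argument
are passed through unescaped, which is an assumption about the desired
behaviour rather than a decision made in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): return a freshly allocated
 * "Nm~^word$" expression for one whatis(1) argument, or NULL if the
 * allocation fails.  The caller frees the result.
 */
char *
whatis_term(const char *word)
{
	size_t	 sz;
	char	*buf;

	assert(word != NULL);

	/* "Nm~^" (4) + word + "$" (1) + terminating NUL (1). */
	sz = strlen(word) + 6;
	if ((buf = malloc(sz)) == NULL)
		return NULL;

	snprintf(buf, sz, "Nm~^%s$", word);
	return buf;
}
```

Calling whatis_term("foo") yields the string "Nm~^foo$", which can then
be handed to the expression compiler in place of the original argument.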

Thoughts?

Kristaps

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 5045 bytes --]

Index: apropos.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/apropos.c,v
retrieving revision 1.19
diff -u -r1.19 apropos.c
--- apropos.c	26 Nov 2011 22:38:11 -0000	1.19
+++ apropos.c	27 Nov 2011 12:15:23 -0000
@@ -38,12 +38,13 @@
 int
 main(int argc, char *argv[])
 {
-	int		 ch, rc;
+	int		 ch, rc, whatis, i;
 	struct manpaths	 paths;
-	size_t		 terms;
+	size_t		 terms, sz;
 	struct opts	 opts;
 	struct expr	*e;
 	char		*defpaths, *auxpaths;
+	char		*buf;
 	extern int	 optind;
 	extern char	*optarg;
 
@@ -53,12 +54,13 @@
 	else
 		++progname;
 
+	whatis = 0 == strcmp(progname, "whatis");
+
 	memset(&paths, 0, sizeof(struct manpaths));
 	memset(&opts, 0, sizeof(struct opts));
 
 	auxpaths = defpaths = NULL;
 	e = NULL;
-	rc = 0;
 
 	while (-1 != (ch = getopt(argc, argv, "M:m:S:s:")))
 		switch (ch) {
@@ -76,19 +78,34 @@
 			break;
 		default:
 			usage();
-			goto out;
+			return(EXIT_FAILURE);
 		}
 
 	argc -= optind;
 	argv += optind;
 
-	if (0 == argc) {
-		rc = 1;
-		goto out;
-	}
+	if (0 == argc) 
+		return(EXIT_SUCCESS);
+
+	rc = 0;
 
 	manpath_parse(&paths, defpaths, auxpaths);
 
+	/*
+	 * whatis(1) has a much simpler syntax than apropos(1): it
+	 * accepts standalone words and composes them into full-string
+	 * matches within the name.
+	 */
+	if (whatis) 
+		for (i = 0; i < argc; i++) {
+			sz = strlen(argv[i]) + 6;
+			buf = mandoc_malloc(sz);
+			strlcpy(buf, "Nm~^", sz);
+			strlcat(buf, argv[i], sz);
+			strlcat(buf, "$", sz);
+			argv[i] = buf;
+		}
+		
 	if (NULL == (e = exprcomp(argc, argv, &terms))) {
 		fprintf(stderr, "%s: Bad expression\n", progname);
 		goto out;
@@ -105,6 +122,10 @@
 out:
 	manpath_free(&paths);
 	exprfree(e);
+
+	if (whatis)
+		for (i = 0; i < argc; i++)
+			free(argv[i]);
 
 	return(rc ? EXIT_SUCCESS : EXIT_FAILURE);
 }
Index: whatis.1
===================================================================
RCS file: whatis.1
diff -N whatis.1
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ whatis.1	27 Nov 2011 12:15:23 -0000
@@ -0,0 +1,111 @@
+.\"	$Id: apropos.1,v 1.9 2011/11/26 22:38:11 schwarze Exp $
+.\"
+.\" Copyright (c) 2011 Kristaps Dzonsons <kristaps@bsd.lv>
+.\"
+.\" Permission to use, copy, modify, and distribute this software for any
+.\" purpose with or without fee is hereby granted, provided that the above
+.\" copyright notice and this permission notice appear in all copies.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+.\"
+.Dd $Mdocdate: November 26 2011 $
+.Dt WHATIS 1
+.Os
+.Sh NAME
+.Nm whatis
+.Nd search manual page databases
+.Sh SYNOPSIS
+.Nm
+.Op Fl M Ar manpath
+.Op Fl m Ar manpath
+.Op Fl S Ar arch
+.Op Fl s Ar section
+.Ar term...
+.Sh DESCRIPTION
+The
+.Nm
+utility searches for manuals named
+.Ar term
+in manual page databases generated by
+.Xr mandocdb 8 .
+Its arguments are as follows:
+.Bl -tag -width Ds
+.It Fl M Ar manpath
+Use the colon-separated path instead of the default list of paths
+searched for
+.Xr mandocdb 8
+databases.
+Invalid paths, or paths without manual databases, are ignored.
+.It Fl m Ar manpath
+Prepend the colon-separated paths to the list of paths searched
+for
+.Xr mandocdb 8
+databases.
+Invalid paths, or paths without manual databases, are ignored.
+.It Fl S Ar arch
+Search only for a particular architecture.
+.It Fl s Ar section
+Search only for a manual section.
+See
+.Xr man 1
+for a listing of manual sections.
+.El
+.Pp
+By default,
+.Nm
+searches for
+.Xr mandocdb 8
+databases in the default paths stipulated by
+.Xr man 1 .
+Results are sorted by manual title, with output formatted as
+.Pp
+.D1 title(sec) \- description
+.Pp
+Where
+.Qq title
+is the manual's title (note multiple manual names may exist for one
+title),
+.Qq sec
+is the manual section, and
+.Qq description
+is the manual's short description.
+If an architecture is specified for the manual, it is displayed as
+.Pp
+.D1 title(sec/arch) \- description
+.Pp
+Resulting manuals may be accessed as
+.Pp
+.Dl $ man \-s sec title
+.Pp
+If an architecture is specified in the output, use
+.Pp
+.Dl $ man \-s sec \-S arch title
+.Sh ENVIRONMENT
+.Bl -tag -width Ds
+.It Ev MANPATH
+Colon-separated paths overriding the default list of paths searched for
+manual databases.
+Invalid paths, or paths without manual databases, are ignored.
+Overridden by
+.Fl M .
+.El
+.\" .Sh FILES
+.Sh EXIT STATUS
+.Ex -std
+.Sh SEE ALSO
+.Xr apropos 1 ,
+.Xr man 1 ,
+.Xr mandoc 1 ,
+.Xr mandocdb 8
+.Sh AUTHORS
+The
+.Nm
+utility was written by
+.An Kristaps Dzonsons ,
+.Mt kristaps@bsd.lv .

* Re: whatis(1)
  2011-11-27 12:16 whatis(1) Kristaps Dzonsons
@ 2011-11-27 18:13 ` Ingo Schwarze
  2011-11-27 18:56   ` whatis(1) Kristaps Dzonsons
  0 siblings, 1 reply; 5+ messages in thread
From: Ingo Schwarze @ 2011-11-27 18:13 UTC
  To: tech

Hi,

Kristaps Dzonsons wrote on Sun, Nov 27, 2011 at 01:16:40PM +0100:

> Enclosed is a simple implementation of whatis(1).  It's a mode of
> apropos(1) where arguments are re-written as
> 
>   foo  =>  Nm~^foo$
> 
> This follows OpenBSD's method;

Not really.  A cursory look gives me the impression that OpenBSD
whatis(1) matches whole words in Nm, case-insensitively.  However,
i didn't really study the details yet.  For example:

  $ whatis man
  Pod::Man (3p) - Convert POD data to formatted *roff input
  man (1) - display manual pages
  man (7) - legacy formatting language for manual pages
  man.conf (5) - configuration file for man (1)

> other systems, like my Mac, search
> for both `Nm' and `Nd', but I'm avoiding this for now because I
> can't figure out the proper regex for word boundaries

  Nm~[[:<:]]man[[:>:]]

> in multi-word `Nd' strings.

  Nd~[[:<:]]man[[:>:]]  # matches Nd only
  ~[[:<:]]man[[:>:]]  # matches both Nm and Nd

> Thoughts?

Something like this will be needed, yes.
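
For experimenting with the whole-word semantics outside of apropos(1),
a rough sketch follows.  The [[:<:]] and [[:>:]] anchors are an
extension of the BSD regex library and are not accepted by every
regcomp(3) implementation, so this sketch approximates them with a
portable POSIX extended expression instead.  The function name
whole_word_match() is hypothetical, and the approximation treats
letters, digits and underscore as word characters.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `word' occurs as a whole word in
 * `name', ignoring case, 0 if it does not, and -1 on error.
 * `word' is inserted into the pattern unescaped.
 */
int
whole_word_match(const char *word, const char *name)
{
	static const char pre[] = "(^|[^[:alnum:]_])";
	static const char post[] = "([^[:alnum:]_]|$)";
	regex_t	 re;
	char	*pat;
	size_t	 sz;
	int	 rc;

	assert(word != NULL && name != NULL);

	/* Prefix + word + suffix + terminating NUL. */
	sz = strlen(pre) + strlen(word) + strlen(post) + 1;
	if ((pat = malloc(sz)) == NULL)
		return -1;
	snprintf(pat, sz, "%s%s%s", pre, word, post);

	if (regcomp(&re, pat, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0) {
		free(pat);
		return -1;
	}

	/* regexec() returns 0 on a match. */
	rc = (regexec(&re, name, 0, NULL, 0) == 0);

	regfree(&re);
	free(pat);
	return rc;
}
```

With this, "man" matches the names "man", "man.conf" and "Pod::Man" but
not "mandoc", which is consistent with the OpenBSD output shown above.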

Thanks,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

* Re: whatis(1)
  2011-11-27 18:13 ` whatis(1) Ingo Schwarze
@ 2011-11-27 18:56   ` Kristaps Dzonsons
  2011-11-28  0:35     ` whatis(1) Ingo Schwarze
  0 siblings, 1 reply; 5+ messages in thread
From: Kristaps Dzonsons @ 2011-11-27 18:56 UTC
  To: tech

On 27/11/2011 19:13, Ingo Schwarze wrote:
> Hi,
>
> Kristaps Dzonsons wrote on Sun, Nov 27, 2011 at 01:16:40PM +0100:
>
>> Enclosed is a simple implementation of whatis(1).  It's a mode of
>> apropos(1) where arguments are re-written as
>>
>>    foo  =>   Nm~^foo$
>>
>> This follows OpenBSD's method;
>
> Not really.  A cursory look gives me the impression that OpenBSD
> whatis(1) matches whole words in Nm, case-insensitively.  However,
> i didn't really study the details yet.  For example:
>
>    $ whatis man
>    Pod::Man (3p) - Convert POD data to formatted *roff input
>    man (1) - display manual pages
>    man (7) - legacy formatting language for manual pages
>    man.conf (5) - configuration file for man (1)
>
>> other systems, like my Mac, search
>> for both `Nm' and `Nd', but I'm avoiding this for now because I
>> can't figure out the proper regex for word boundaries
>
>    Nm~[[:<:]]man[[:>:]]
>
>> in multi-word `Nd' strings.
>
>    Nd~[[:<:]]man[[:>:]]  # matches Nd only
>    ~[[:<:]]man[[:>:]]  # matches both Nm and Nd
>
>> Thoughts?
>
> Something like this will be needed, yes.

Ingo,

To get the ball rolling, I just checked in whatis.1 and the code
modified to use your regexp.  I simplified the call into apropos_db.h
by using termcomp() instead of exprcomp(), as this mode might be useful
for man.cgi too.  The search expression is now case-insensitive and has
the form ~[[:<:]]term[[:>:]].  (Thanks!)

NOTE: this fixed a subtle bug in apropos_db.h in compiling
case-insensitive regular expressions.

Thanks,

Kristaps

* Re: whatis(1)
  2011-11-27 18:56   ` whatis(1) Kristaps Dzonsons
@ 2011-11-28  0:35     ` Ingo Schwarze
  2011-11-28  8:42       ` whatis(1) Kristaps Dzonsons
  0 siblings, 1 reply; 5+ messages in thread
From: Ingo Schwarze @ 2011-11-28  0:35 UTC
  To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Sun, Nov 27, 2011 at 07:56:30PM +0100:

> To get the ball rolling, I just checked in whatis.1 and the code
> modified to use your regexp.  I simplified the call into
> apropos_db.h with termcomp() instead of exprcomp(), as this mode
> might be useful for man.cgi too.  The search expression is now case
> insensitive and ~[[:<:]]term[[:>:]].  (Thanks!)

I have merged this to OpenBSD, with a few tweaks, see below:

 - One real bug: termcomp() returned the wrong end of the singly
   linked list it built, such that only the last word given on
   the command line was used.

 - Whatever MacOS may be doing, i tweaked the regex
   from "Nm,Nd~" to "Nm~", which is current OpenBSD behaviour.
   When we replace tools, we should not change behaviour at the
   same time, but propose changes separately - if we want them.
   In this case, i'm not even convinced the change makes sense.
   If you ask "whatis mandoc", you clearly want "mandoc(1)" as
   an answer; but why should you get "eqn(7), roff(7), tbl(7)"
   as well?  That's, i think, what plain apropos(1) is for.
   Maybe that's even a bug in MacOS?

 - I suggest checking only the beginning of the command name,
   such that you can e.g. install as "whatis.m" for testing
   purposes.

OK to commit these three tweaks to bsd.lv, too?
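
To make the list bug from the first item easier to see in isolation,
here is a reduced sketch.  It is not the apropos_db.c code: struct term
and all function names are hypothetical stand-ins for struct expr and
termcomp().  build_buggy() links each new node behind the previous one
but keeps moving the returned pointer forward, so the caller receives
the tail; build_fixed() walks the arguments backwards and prepends, so
the caller receives the head with the terms in command-line order.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct expr. */
struct term {
	const char	*word;
	struct term	*next;
};

static struct term *
term_new(const char *word)
{
	struct term	*t;

	assert(word != NULL);
	if ((t = calloc(1, sizeof(*t))) == NULL)
		abort();
	t->word = word;
	return t;
}

/* Buggy pattern: returns the last node, so earlier terms are lost. */
struct term *
build_buggy(int argc, char *argv[])
{
	struct term	*e = NULL, *next;
	int		 pos;

	for (pos = 0; pos < argc; pos++) {
		next = term_new(argv[pos]);
		if (e != NULL)
			e->next = next;
		e = next;	/* the head is forgotten here */
	}
	return e;
}

/* Fixed pattern: iterate backwards and prepend, then return the head. */
struct term *
build_fixed(int argc, char *argv[])
{
	struct term	*e = NULL, *next;
	int		 pos;

	for (pos = argc - 1; pos >= 0; pos--) {
		next = term_new(argv[pos]);
		next->next = e;
		e = next;
	}
	return e;
}

/* Count the nodes reachable from `e'. */
size_t
term_count(const struct term *e)
{
	size_t	 n = 0;

	for (; e != NULL; e = e->next)
		n++;
	return n;
}

/* Free every node reachable from `e'. */
void
term_free(struct term *e)
{
	struct term	*next;

	for (; e != NULL; e = next) {
		next = e->next;
		free(e);
	}
}
```

For three arguments, the list returned by build_buggy() has one
reachable node (the last word) and the other two nodes are leaked,
while build_fixed() returns all three starting with the first word.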

> NOTE: this fixed a subtle bug in apropos_db.h in compiling case
> insensitive regular expression.

Indeed, thanks!
  Ingo

 ----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----

CVSROOT:	/cvs
Module name:	src
Changes by:	schwarze@cvs.openbsd.org	2011/11/27 17:16:38

Modified files:
	usr.bin/mandoc : apropos.c apropos_db.c apropos_db.h main.c 

Log message:
Implement whatis(1) as a special apropos(1) mode as a part of
the mandoc(1) binary; not yet enabled for the general public.
Intended to replace src/usr.bin/whatis at a later time.
Coded by kristaps@, with a few tweaks by me.

To test this:
$ mandocdb  # unless you have already done so earlier
$ sudo ln -s /usr/bin/mandoc /usr/bin/whatis.m
$ whatis.m mandoc apropos whatis
$ whatis.m man

 ----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----

--- /co/mdocml/apropos.c	Sun Nov 27 23:51:01 2011
+++ ./apropos.c	Mon Nov 28 01:16:38 2011
@@ -53,7 +49,7 @@ main(int argc, char *argv[])
 	else
 		++progname;
 
-	whatis = 0 == strcmp(progname, "whatis");
+	whatis = 0 == strncmp(progname, "whatis", 6);
 
 	memset(&paths, 0, sizeof(struct manpaths));
 	memset(&opts, 0, sizeof(struct opts));
--- /co/mdocml/apropos_db.c	Mon Nov 28 00:11:37 2011
+++ ./apropos_db.c	Mon Nov 28 01:16:38 2011
@@ -599,10 +599,10 @@ termcomp(int argc, char *argv[], size_t *tt)
 	e = NULL;
 	*tt = 0;
 
-	for (pos = 0; pos < argc; pos++) {
-		sz = strlen(argv[pos]) + 16;
+	for (pos = argc - 1; pos >= 0; pos--) {
+		sz = strlen(argv[pos]) + 18;
 		buf = mandoc_realloc(buf, sz);
-		strlcpy(buf, "~[[:<:]]", sz);
+		strlcpy(buf, "Nm~[[:<:]]", sz);
 		strlcat(buf, argv[pos], sz);
 		strlcat(buf, "[[:>:]]", sz);
 		if (NULL == (next = exprterm(buf, 0))) {
@@ -610,8 +610,7 @@ termcomp(int argc, char *argv[], size_t *tt)
 			exprfree(e);
 			return(NULL);
 		}
-		if (NULL != e)
-			e->next = next;
+		next->next = e;
 		e = next;
 		(*tt)++;
 	}

* Re: whatis(1)
  2011-11-28  0:35     ` whatis(1) Ingo Schwarze
@ 2011-11-28  8:42       ` Kristaps Dzonsons
  0 siblings, 0 replies; 5+ messages in thread
From: Kristaps Dzonsons @ 2011-11-28  8:42 UTC
  To: tech

On 28/11/2011 01:35, Ingo Schwarze wrote:
> Hi Kristaps,
>
> Kristaps Dzonsons wrote on Sun, Nov 27, 2011 at 07:56:30PM +0100:
>
>> To get the ball rolling, I just checked in whatis.1 and the code
>> modified to use your regexp.  I simplified the call into
>> apropos_db.h with termcomp() instead of exprcomp(), as this mode
>> might be useful for man.cgi too.  The search expression is now case
>> insensitive and ~[[:<:]]term[[:>:]].  (Thanks!)
>
> I have merged this to OpenBSD, with a few tweaks, see below:
>
>   - One real bug: termcomp() returned the wrong end of the singly
>     linked list it built, such that only the last word given on
>     the command line was used.
>
>   - Whatever MacOS may be doing, i tweaked the regex
>     from "Nm,Nd~" to "Nm~", which is current OpenBSD behaviour.
>     When we replace tools, we should not change behaviour at the
>     same time, but propose changes separately - if we want them.
>     In this case, i'm not even convinced the change makes sense.
>     If you ask "whatis mandoc", you clearly want "mandoc(1)" as
>     an answer; but why should you get "eqn(7), roff(7), tbl(7)"
>     as well?  That's, i think, what plain apropos(1) is for.
>     Maybe that's even a bug in MacOS?
>
>   - I suggest checking only the beginning of the command name,
>     such that you can e.g. install as "whatis.m" for testing
>     purposes.
>
> OK to commit these three tweaks to bsd.lv, too?

Great!  I prefer the `Nm' behaviour too; although I understand why Mac 
does it that way, I'm not convinced I like it yet.  Please update the 
whatis.1 manual, by the way, to note these changes.

Thanks,

Kristaps