tech@mandoc.bsd.lv
* mandocdb(8) fixes
From: Kristaps Dzonsons @ 2011-11-30 20:58 UTC (permalink / raw)
  To: tech

Hi,

While endian-neutralising mandocdb(8), I noticed that it broke with the 
last series of commits: the manual type (mdoc, man, cat) wasn't 
accounted for when pruning the database.

The enclosed patch accounts for this.  It also adds some in-line 
documentation and removes the "verb > 1" parts, which I find a 
little unnecessary (they can be added back in).
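
For illustration, the sanity check the patch applies to each index
value can be sketched as a stand-alone function.  This assumes, as the
patch comments describe, that a value starts with the manual type
string followed by the file name, each NUL-terminated.  The name
index_fname() is hypothetical; it is not a function in mandocdb.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandocdb.c.
 * Given an index value of `sz' bytes assumed to be laid out as
 * "type\0filename\0...", return a pointer to the filename, or NULL
 * if either string is not NUL-terminated inside the buffer.
 */
static const char *
index_fname(const void *data, size_t sz)
{
	const char	*cp = data;
	const char	*fn;

	/* Zero-sized values mark deleted records: nothing to parse. */
	if (0 == sz)
		return NULL;

	/* Find the end of the type string (mdoc, man, cat). */
	if (NULL == (fn = memchr(cp, '\0', sz)))
		return NULL;

	/* Step over the NUL; the filename must begin inside the buffer. */
	fn++;
	if ((size_t)(fn - cp) >= sz)
		return NULL;

	/* The filename must also be terminated inside the buffer. */
	if (NULL == memchr(fn, '\0', sz - (size_t)(fn - cp)))
		return NULL;

	return fn;
}
```

A caller would treat a NULL return for a non-empty value as a corrupt
record, which is roughly what the "break" statements in the patch do.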

Thoughts?

Kristaps

P.S., I noticed that mandoc.db grows a little each time `-d' is run over 
it with the same keys (mandoc.index does not).  I'll look carefully to 
see if we're letting anything stay in there.

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 3305 bytes --]

Index: mandocdb.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandocdb.c,v
retrieving revision 1.17
diff -u -r1.17 mandocdb.c
--- mandocdb.c	29 Nov 2011 00:34:50 -0000	1.17
+++ mandocdb.c	30 Nov 2011 20:43:32 -0000
@@ -635,9 +635,6 @@
 			val.size = sizeof(struct db_val);
 			val.data = &vbuf;
 
-			if (verb > 1)
-				printf("%s: Added keyword: %s\n", 
-						fn, (char *)key.data);
 			dbt_put(db, dbf, &key, &val);
 		}
 		if (ch < 0) {
@@ -661,6 +658,7 @@
 
 		if (verb)
 			printf("%s: Added index\n", fn);
+
 		dbt_put(idx, idxf, &key, &val);
 	}
 }
@@ -677,7 +675,7 @@
 		recno_t *maxrec, recno_t **recs, size_t *recsz)
 {
 	const struct of	*of;
-	const char	*fn;
+	const char	*fn, *cp;
 	struct db_val	*vbuf;
 	unsigned	 seq, sseq;
 	DBT		 key, val;
@@ -689,18 +687,32 @@
 	while (0 == (ch = (*idx->seq)(idx, &key, &val, seq))) {
 		seq = R_NEXT;
 		*maxrec = *(recno_t *)key.data;
-		if (0 == val.size) {
-			if (reccur >= *recsz) {
-				*recsz += MANDOC_SLOP;
-				*recs = mandoc_realloc(*recs, 
-					*recsz * sizeof(recno_t));
-			}
-			(*recs)[(int)reccur] = *maxrec;
-			reccur++;
-			continue;
-		}
+		cp = val.data;
+
+		/* Deleted records are zero-sized.  Skip them. */
+
+		if (0 == val.size)
+			goto cont;
+
+		/*
+		 * Make sure we're sane.
+		 * Read past our mdoc/man/cat type to the next string,
+		 * then make sure it's bounded by a NUL.
+		 * Failing any of these, we go into our error handler.
+		 */
+
+		if (NULL == (fn = memchr(cp, '\0', val.size)))
+			break;
+		if (++fn - cp >= (int)val.size)
+			break;
+		if (NULL == memchr(fn, '\0', val.size - (fn - cp)))
+			break;
+
+		/* 
+		 * Search for the file in those we care about.
+		 * XXX: build this into a tree.  Too slow.
+		 */
 
-		fn = (char *)val.data;
 		for (of = ofile; of; of = of->next)
 			if (0 == strcmp(fn, of->fname))
 				break;
@@ -708,23 +720,31 @@
 		if (NULL == of)
 			continue;
 
+		/*
+		 * Search through the keyword database, throwing out all
+		 * references to our file.
+		 */
+
 		sseq = R_FIRST;
 		while (0 == (ch = (*db->seq)(db, &key, &val, sseq))) {
 			sseq = R_NEXT;
-			assert(sizeof(struct db_val) == val.size);
+			if (sizeof(struct db_val) != val.size)
+				break;
+
 			vbuf = val.data;
 			if (*maxrec != vbuf->rec)
 				continue;
-			if (verb)
-				printf("%s: Deleted keyword: %s\n", 
-						fn, (char *)key.data);
-			ch = (*db->del)(db, &key, R_CURSOR);
-			if (ch < 0)
+
+			if ((ch = (*db->del)(db, &key, R_CURSOR)) < 0)
 				break;
 		}
+
 		if (ch < 0) {
 			perror(dbf);
 			exit((int)MANDOCLEVEL_SYSERR);
+		} else if (1 != ch) {
+			fprintf(stderr, "%s: Corrupt database\n", dbf);
+			exit((int)MANDOCLEVEL_SYSERR);
 		}
 
 		if (verb)
@@ -732,11 +752,10 @@
 
 		val.size = 0;
 		ch = (*idx->put)(idx, &key, &val, R_CURSOR);
-		if (ch < 0) {
-			perror(idxf);
-			exit((int)MANDOCLEVEL_SYSERR);
-		}
 
+		if (ch < 0)
+			break;
+cont:
 		if (reccur >= *recsz) {
 			*recsz += MANDOC_SLOP;
 			*recs = mandoc_realloc
@@ -746,6 +765,15 @@
 		(*recs)[(int)reccur] = *maxrec;
 		reccur++;
 	}
+
+	if (ch < 0) {
+		perror(idxf);
+		exit((int)MANDOCLEVEL_SYSERR);
+	} else if (1 != ch) {
+		fprintf(stderr, "%s: Corrupt database\n", idxf);
+		exit((int)MANDOCLEVEL_SYSERR);
+	}
+
 	(*maxrec)++;
 }
 

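The bookkeeping at the end of the loop in the patch, where a record
number is appended to *recs and the array grows by MANDOC_SLOP entries
when full, can be sketched in isolation as follows.  The struct, the
function name and the SLOP value are assumptions for illustration; the
real code uses mandoc_realloc() and recno_t.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed growth increment; stands in for MANDOC_SLOP. */
#define	SLOP	512

/* Hypothetical container for collected record numbers. */
struct recs {
	unsigned	*v;	/* record numbers */
	size_t		 cur;	/* entries in use */
	size_t		 sz;	/* entries allocated */
};

/* Append one record number, growing the array by SLOP when full. */
static void
recs_push(struct recs *r, unsigned rec)
{
	if (r->cur >= r->sz) {
		r->sz += SLOP;
		r->v = realloc(r->v, r->sz * sizeof(unsigned));
		if (NULL == r->v) {
			perror(NULL);
			exit(EXIT_FAILURE);
		}
	}
	r->v[r->cur++] = rec;
}
```

Growing by a fixed increment keeps the code short; for very large
indexes a doubling strategy would need fewer reallocations.
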
* Re: mandocdb(8) fixes
From: Kristaps Dzonsons @ 2011-11-30 21:09 UTC (permalink / raw)
  To: tech

> P.S., I noticed that mandoc.db grows a little each time `-d' is run over
> it with the same keys (mandoc.index does not). I'll look carefully to
> see if we're letting anything stay in there.

Being more specific (I meant mandoc.index, not .db):

% ./mandocdb -d . *.[1-9] && ls -l mandoc.index mandoc.db
-rw-r--r--  1 kristaps  staff  16384 30 Nov 22:08 mandoc.db
-rw-r--r--  1 kristaps  staff    887 30 Nov 22:08 mandoc.index
% ./mandocdb -d . *.[1-9] && ls -l mandoc.index mandoc.db
-rw-r--r--  1 kristaps  staff  16384 30 Nov 22:08 mandoc.db
-rw-r--r--  1 kristaps  staff    903 30 Nov 22:08 mandoc.index
% ./mandocdb -d . *.[1-9] && ls -l mandoc.index mandoc.db
-rw-r--r--  1 kristaps  staff  16384 30 Nov 22:08 mandoc.db
-rw-r--r--  1 kristaps  staff    919 30 Nov 22:08 mandoc.index
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

* Re: mandocdb(8) fixes
From: Ingo Schwarze @ 2011-12-01 23:24 UTC (permalink / raw)
  To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Wed, Nov 30, 2011 at 09:58:17PM +0100:

> While endian-neutralising mandocdb(8), I noticed that it broke with
> the last series of commits: the manual type (mdoc, man, cat) wasn't
> accounted for when pruning the database.

Oops, yes, sorry, I broke that.
I missed that function.

> The enclosed patch accounts for this.  It also adds some
> in-line documentation and removes the "verb > 1" parts, which
> I find a little unnecessary (they can be added back in).

Looks good and merged to OpenBSD.

Thanks,
  Ingo