From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp-2.sys.kth.se (smtp-2.sys.kth.se [130.237.32.160])
	by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id pAUKwQxB026752
	for ; Wed, 30 Nov 2011 15:58:27 -0500 (EST)
Received: from mailscan-1.sys.kth.se (mailscan-1.sys.kth.se [130.237.32.91])
	by smtp-2.sys.kth.se (Postfix) with ESMTP id 1350914E80D
	for ; Wed, 30 Nov 2011 21:58:21 +0100 (CET)
X-Virus-Scanned: by amavisd-new at kth.se
Received: from smtp-2.sys.kth.se ([130.237.32.160])
	by mailscan-1.sys.kth.se (mailscan-1.sys.kth.se [130.237.32.91]) (amavisd-new, port 10024)
	with LMTP id ow0N-rpDALiJ for ;
	Wed, 30 Nov 2011 21:58:19 +0100 (CET)
X-KTH-Auth: kristaps [83.250.6.251]
X-KTH-mail-from: kristaps@bsd.lv
X-KTH-rcpt-to: tech@mdocml.bsd.lv
Received: from macky.local (c83-250-6-251.bredband.comhem.se [83.250.6.251])
	by smtp-2.sys.kth.se (Postfix) with ESMTP id 62E0214E35E
	for ; Wed, 30 Nov 2011 21:58:17 +0100 (CET)
Message-ID: <4ED698E9.6090002@bsd.lv>
Date: Wed, 30 Nov 2011 21:58:17 +0100
From: Kristaps Dzonsons
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0) Gecko/20111105 Thunderbird/8.0
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
To: tech@mdocml.bsd.lv
Subject: mandocdb(8) fixes
Content-Type: multipart/mixed;
 boundary="------------000207050303090707010203"

This is a multi-part message in MIME format.
--------------000207050303090707010203
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

While endian-neutralising mandocdb(8), I noticed that it broke with the
last series of commits: the manual type (mdoc, man, cat) wasn't
accounted for when pruning the database.  The enclosed patch accounts
for this.  It also adds some in-line documentation.  It also removes the
"verb > 1" parts, which I find a little unnecessary (they can be added
back in).

Thoughts?
Kristaps

P.S., I noticed that mandoc.db grows a little each time `-d' is run over
it with the same keys (mandoc.index does not).  I'll look carefully to
see if we're letting anything stay in there.

--------------000207050303090707010203
Content-Type: text/plain;
 name="patch.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="patch.txt"

Index: mandocdb.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandocdb.c,v
retrieving revision 1.17
diff -u -r1.17 mandocdb.c
--- mandocdb.c	29 Nov 2011 00:34:50 -0000	1.17
+++ mandocdb.c	30 Nov 2011 20:43:32 -0000
@@ -635,9 +635,6 @@
 			val.size = sizeof(struct db_val);
 			val.data = &vbuf;
 
-			if (verb > 1)
-				printf("%s: Added keyword: %s\n",
-					fn, (char *)key.data);
 			dbt_put(db, dbf, &key, &val);
 		}
 		if (ch < 0) {
@@ -661,6 +658,7 @@
 
 		if (verb)
 			printf("%s: Added index\n", fn);
+
 		dbt_put(idx, idxf, &key, &val);
 	}
 }
@@ -677,7 +675,7 @@
 		recno_t *maxrec, recno_t **recs, size_t *recsz)
 {
 	const struct of *of;
-	const char *fn;
+	const char *fn, *cp;
 	struct db_val *vbuf;
 	unsigned seq, sseq;
 	DBT key, val;
@@ -689,18 +687,32 @@
 	while (0 == (ch = (*idx->seq)(idx, &key, &val, seq))) {
 		seq = R_NEXT;
 		*maxrec = *(recno_t *)key.data;
-		if (0 == val.size) {
-			if (reccur >= *recsz) {
-				*recsz += MANDOC_SLOP;
-				*recs = mandoc_realloc(*recs,
-					*recsz * sizeof(recno_t));
-			}
-			(*recs)[(int)reccur] = *maxrec;
-			reccur++;
-			continue;
-		}
+		cp = val.data;
+
+		/* Deleted records are zero-sized. Skip them. */
+
+		if (0 == val.size)
+			goto cont;
+
+		/*
+		 * Make sure we're sane.
+		 * Read past our mdoc/man/cat type to the next string,
+		 * then make sure it's bounded by a NUL.
+		 * Failing any of these, we go into our error handler.
+		 */
+
+		if (NULL == (fn = memchr(cp, '\0', val.size)))
+			break;
+		if (++fn - cp >= (int)val.size)
+			break;
+		if (NULL == memchr(fn, '\0', val.size - (fn - cp)))
+			break;
+
+		/*
+		 * Search for the file in those we care about.
+		 * XXX: build this into a tree. Too slow.
+		 */
 
-		fn = (char *)val.data;
 		for (of = ofile; of; of = of->next)
 			if (0 == strcmp(fn, of->fname))
 				break;
@@ -708,23 +720,31 @@
 		if (NULL == of)
 			continue;
 
+		/*
+		 * Search through the keyword database, throwing out all
+		 * references to our file.
+		 */
+
 		sseq = R_FIRST;
 		while (0 == (ch = (*db->seq)(db, &key, &val, sseq))) {
 			sseq = R_NEXT;
-			assert(sizeof(struct db_val) == val.size);
+			if (sizeof(struct db_val) != val.size)
+				break;
+
 			vbuf = val.data;
 			if (*maxrec != vbuf->rec)
 				continue;
-			if (verb)
-				printf("%s: Deleted keyword: %s\n",
-					fn, (char *)key.data);
-			ch = (*db->del)(db, &key, R_CURSOR);
-			if (ch < 0)
+
+			if ((ch = (*db->del)(db, &key, R_CURSOR)) < 0)
 				break;
 		}
+
 		if (ch < 0) {
 			perror(dbf);
 			exit((int)MANDOCLEVEL_SYSERR);
+		} else if (1 != ch) {
+			fprintf(stderr, "%s: Corrupt database\n", dbf);
+			exit((int)MANDOCLEVEL_SYSERR);
 		}
 
 		if (verb)
@@ -732,11 +752,10 @@
 
 		val.size = 0;
 		ch = (*idx->put)(idx, &key, &val, R_CURSOR);
-		if (ch < 0) {
-			perror(idxf);
-			exit((int)MANDOCLEVEL_SYSERR);
-		}
 
+		if (ch < 0)
+			break;
+cont:
 		if (reccur >= *recsz) {
 			*recsz += MANDOC_SLOP;
 			*recs = mandoc_realloc
@@ -746,6 +765,15 @@
 		(*recs)[(int)reccur] = *maxrec;
 		reccur++;
 	}
 
+
+	if (ch < 0) {
+		perror(idxf);
+		exit((int)MANDOCLEVEL_SYSERR);
+	} else if (1 != ch) {
+		fprintf(stderr, "%s: Corrupt database\n", idxf);
+		exit((int)MANDOCLEVEL_SYSERR);
+	}
+
 	(*maxrec)++;
 }

--------------000207050303090707010203--
--
To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
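
For readers following the patch, the record sanity check it adds to the pruning loop can be tried in isolation. What follows is only a minimal sketch, not code from mandocdb.c: it assumes an index value is laid out as a NUL-terminated type string (mdoc, man or cat) followed by a NUL-terminated filename, and the helper name index_fname() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandocdb.c.
 * Assumes an index value laid out as "type\0filename\0...".
 * Returns a pointer to the filename inside buf, or NULL if the
 * record is empty or either string lacks a terminating NUL
 * within the first sz bytes.
 */
static const char *
index_fname(const char *buf, size_t sz)
{
	const char *fn;

	/* Zero-sized values are deleted records: nothing to parse. */
	if (0 == sz)
		return NULL;

	/* Find the NUL that ends the type string. */
	if (NULL == (fn = memchr(buf, '\0', sz)))
		return NULL;

	/* Step past it; the filename must start inside the buffer. */
	fn++;
	if ((size_t)(fn - buf) >= sz)
		return NULL;

	/* The filename must itself be NUL-terminated in bounds. */
	if (NULL == memchr(fn, '\0', sz - (size_t)(fn - buf)))
		return NULL;

	return fn;
}
```

Unlike the loop in the patch, which skips zero-sized records with "goto cont" and breaks out to the error handler on a malformed one, this sketch folds both cases into a NULL return; a caller that needs to tell them apart would test the size first.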